Apache Atlas integration
Lumada Data Catalog can integrate with external data sources to share metadata using the Apache® Atlas™ connector. The Atlas connector supports the following two job types:
- Export: You can export business terms and associations from Data Catalog to Atlas.
- Import: You can import lineage information from Atlas into Data Catalog.
You can create an Atlas connector by adding an external data source. For information on creating a connector, see Create an Atlas connector. For information on configuring an Atlas connector, see Configure an Atlas connector.
After you create the connector, the external data source is added to the Data Sources tile on the Management page. In addition, a tab is added to the Tools page where you can import from or export to Atlas. For more information on importing lineage information and exporting term associations, see the following tasks:
- Import Atlas HIVE_DB lineages to Data Catalog
- Export Data Catalog HIVE_DB level terms to Atlas
- Export HDFS_DB level terms to Atlas
Create an Atlas connector
Procedure
Navigate to the Management page and locate the Data Sources tile.
Click External Data Sources.
Click Add External Source.
Fill in the mandatory values as follows and click Create Data Source.
Field Name | Value |
External data source name | Atlas |
External data source type | Atlas |
Atlas URL | Provide the Atlas URL |
Atlas user name | Provide the Atlas user name |
Atlas password | Provide the Atlas password |
Atlas cluster name | Provide the Atlas cluster name |
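As a pre-flight aid, the mandatory fields can be collected and checked for blanks before you open the form. The following is a minimal sketch only: the `AtlasSourceForm` class, its attribute names, and the sample values are hypothetical stand-ins for the form fields and are not part of any Data Catalog API.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class AtlasSourceForm:
    """Hypothetical stand-in for the Add External Source form fields."""
    name: str
    source_type: str
    atlas_url: str
    atlas_user: str
    atlas_password: str
    atlas_cluster: str


def missing_fields(form: AtlasSourceForm) -> List[str]:
    """Return the names of mandatory fields that are empty or whitespace only."""
    return [f.name for f in fields(form) if not str(getattr(form, f.name)).strip()]


# Sample values are placeholders, not real endpoints or credentials.
form = AtlasSourceForm("Atlas", "Atlas", "http://atlas.example.com:21000", "admin", "", "primary")
print(missing_fields(form))
```

An empty result means every mandatory value is present; the sketch does not verify that the URL or credentials are valid.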
Results
Configure an Atlas connector
Procedure
Navigate to the Management page and locate the Configurations tile.
Click Configuration, then select Agent.
Select MISC.
In the Atlas connector export business term associations with status setting, enter a value of ACCEPTED, SUGGESTED, or REJECTED. To set multiple values, separate them with commas. Example: ACCEPTED, SUGGESTED.
Configuration setting | Export outcome |
ACCEPTED | Only accepted business term associations are exported. |
ACCEPTED, SUGGESTED | Both accepted and suggested business term associations are exported. |
ACCEPTED, SUGGESTED, REJECTED | All accepted, suggested, and rejected business term associations are exported. |
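The effect of the setting can be illustrated with a short sketch. This is not the connector's implementation: the function names and the sample associations are hypothetical, and the sketch assumes the setting is a comma-separated list of the three status values described above.

```python
from typing import Iterable, List, Set, Tuple

VALID_STATUSES = {"ACCEPTED", "SUGGESTED", "REJECTED"}


def parse_status_setting(value: str) -> Set[str]:
    """Split a comma-separated setting such as 'ACCEPTED, SUGGESTED' into a set."""
    statuses = {part.strip() for part in value.split(",") if part.strip()}
    unknown = statuses - VALID_STATUSES
    if unknown:
        raise ValueError(f"Unsupported status value(s): {sorted(unknown)}")
    return statuses


def select_for_export(associations: Iterable[Tuple[str, str]], setting: str) -> List[Tuple[str, str]]:
    """Keep only the (term, status) pairs whose status is enabled by the setting."""
    allowed = parse_status_setting(setting)
    return [(term, status) for term, status in associations if status in allowed]


# Hypothetical term associations used only for illustration.
associations = [("Email", "ACCEPTED"), ("Phone", "SUGGESTED"), ("SSN", "REJECTED")]
print(select_for_export(associations, "ACCEPTED, SUGGESTED"))
```

With the setting ACCEPTED, SUGGESTED, the rejected association is left out, which matches the second row of the table.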
Results
Import Atlas HIVE_DB lineages to Data Catalog
Procedure
Navigate to the Tools page in the left-hand navigation pane.
Click External Data Source.
Select Atlas as the external data source type.
Enter one of the following parameters and click Submit.
Parameter | Description |
-virtualFolder <hive_virtual_folder> -path <path> | The import starts from the given path. |
-virtualFolder <hive_virtual_folder> | The import starts from the root path. |
To complete the job, navigate to the Management page. In the Jobs tile, you can monitor the progress of the job.
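The two parameter forms differ only in the optional -path argument. The sketch below composes the string you would type into the parameter field; the helper name `build_job_parameters` and the sample folder and path values are hypothetical, and the same pattern applies to the export jobs.

```python
from typing import Optional


def build_job_parameters(virtual_folder: str, path: Optional[str] = None) -> str:
    """Compose the parameter string for an Atlas import or export job.

    With a path, the job starts from that path; without one, from the root path.
    """
    if not virtual_folder.strip():
        raise ValueError("virtual_folder is required")
    params = f"-virtualFolder {virtual_folder}"
    if path:
        params += f" -path {path}"
    return params


# Placeholder values; substitute your own virtual folder and path.
print(build_job_parameters("hive_vf", "/default/sales"))
print(build_job_parameters("hive_vf"))
```

Omitting the path yields the shorter form, so the job covers everything under the virtual folder's root.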
Results
Export Data Catalog HIVE_DB level terms to Atlas
Procedure
Navigate to the Tools page in the left-hand navigation pane.
Click External Data Source.
Select Atlas as the external data source type.
Click Export.
Select Atlas as the external data source.
Enter one of the following parameters and click Submit.
Parameter | Description |
-virtualFolder <hive_virtual_folder> -path <path> | The export starts from the given path. |
-virtualFolder <hive_virtual_folder> | The export starts from the root path. |
To complete the job, navigate to the Management page.
In the Jobs tile, you can monitor the progress of the job.
Results
Configuration setting | Export outcome |
ACCEPTED | Only accepted business term associations are exported. |
ACCEPTED, SUGGESTED | Both accepted and suggested business term associations are exported. |
ACCEPTED, SUGGESTED, REJECTED | All accepted, suggested, and rejected business term associations are exported. |
If the exported business term is a built-in Data Catalog term, the term appears in Atlas as LDC_BITS_<Business_term>.
If the exported business term is a custom Data Catalog term, the term appears in Atlas as LDC_<GLOSSARY>_<Business_term>.
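When you look up exported terms in Atlas, the naming rule above can be expressed as a small helper. This is a sketch under assumptions: the function name is hypothetical, a term with no glossary is treated as built-in, and the term and glossary names are inserted unchanged, which may not match how Data Catalog handles spaces or letter case.

```python
from typing import Optional


def atlas_term_name(term: str, glossary: Optional[str] = None) -> str:
    """Return the name under which an exported term is expected to appear in Atlas.

    Built-in terms (no glossary given) map to LDC_BITS_<term>;
    custom terms map to LDC_<glossary>_<term>.
    """
    if glossary is None:
        return f"LDC_BITS_{term}"
    return f"LDC_{glossary}_{term}"


# Hypothetical term and glossary names.
print(atlas_term_name("Email"))
print(atlas_term_name("Revenue", glossary="Finance"))
```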
Export HDFS_DB level terms to Atlas
Procedure
Navigate to the Tools page in the left-hand navigation pane.
Click External Data Source.
Select Atlas as the external data source type.
Click Export.
Select Atlas as the external data source.
Enter one of the following parameters and click Submit.
Parameter | Description |
-virtualFolder <hdfs_virtual_folder> -path <path> | The export starts from the given path. |
-virtualFolder <hdfs_virtual_folder> | The export starts from the root path. |
Complete the job by navigating to the Management page.
In the Jobs tile, you can monitor the progress of the job.
Results
Configuration setting | Export outcome |
ACCEPTED | Only accepted business term associations are exported. |
ACCEPTED, SUGGESTED | Both accepted and suggested business term associations are exported. |
ACCEPTED, SUGGESTED, REJECTED | All accepted, suggested, and rejected business term associations are exported. |
If the exported business term is a built-in Data Catalog term, the term appears in Atlas as LDC_BITS_<Business_term>.
If the exported business term is a custom Data Catalog term, the term appears in Atlas as LDC_<GLOSSARY>_<Business_term>.