Hitachi Vantara Lumada and Pentaho Documentation

Apache Atlas integration

Lumada Data Catalog can integrate with external data sources to share metadata using the Apache® Atlas connector. The Atlas connector supports the following two job types:

  • Export: You can export business terms and associations from Data Catalog to Atlas.
  • Import: You can import lineage information in Atlas into Data Catalog.

You can create an Atlas connector by adding an external data source. For information on creating a connector, see Create an Atlas connector. For information on configuring an Atlas connector, see Configure an Atlas connector.

After you create the connector, the external data source appears in the Data Sources tile on the Management page. In addition, a tab is added to the Tools page where you can import from or export to Atlas. For more information on importing lineage information and exporting term associations, see the import and export tasks later in this article.

Create an Atlas connector

Perform the following steps to create an Atlas connector for your environment:

Procedure

  1. Click Management in the left navigation menu. On the Data Sources card, click Add New, and then select Add External Data Source.

  2. Fill in the mandatory values as follows, and then click Create Data Source.

    Field name                   Value
    External data source name    Provide the name of the data source
    External data source type    Provide the Atlas URL, including the host and port for the Atlas service
    Atlas user name              Provide the Atlas user name
    Atlas password               Provide the Atlas password
    Atlas cluster name           Provide the Atlas cluster name
  3. Test the connection.

Results

The external data source is created, and the count in the Data Sources tile is updated.
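
The form asks for an Atlas URL that includes both the host and the port. The following is a minimal sketch of a pre-check you could run on a URL before pasting it into the form. It is not part of Data Catalog: the function name check_atlas_url, the host atlas.example.com, and the port 21000 are illustrative assumptions, and the check only inspects the URL text without contacting Atlas.

```python
from typing import Tuple
from urllib.parse import urlsplit


def check_atlas_url(url: str) -> Tuple[str, int]:
    """Return (host, port) if the URL names both; otherwise raise ValueError.

    Hypothetical helper for illustration only; it does not connect to Atlas.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"expected an http or https URL, got: {url!r}")
    if not parts.hostname:
        raise ValueError(f"no host found in: {url!r}")
    # parts.port is None when the URL does not state a port explicitly.
    if parts.port is None:
        raise ValueError(f"no explicit port found in: {url!r}")
    return parts.hostname, parts.port


# Example value only; substitute the URL of your own Atlas service.
print(check_atlas_url("http://atlas.example.com:21000"))
```

A URL that omits the port, such as http://atlas.example.com, raises ValueError in this sketch. The Test the connection step in the procedure remains the real check that the service is reachable.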

Configure an Atlas connector

Follow the steps below to configure the Apache Atlas connector for exporting business terms.

Procedure

  1. Click Management in the left navigation menu.

    The Manage Your Environment page opens.
  2. On the Configuration card, click View Configuration.

  3. Under local-agent Categories, click View Details (down arrow at the end of the row) for MISC.

  4. Click the down arrow to expand the setting named Atlas connector export business term associations with status.

  5. Enter a value of ACCEPTED, SUGGESTED, or REJECTED. To set multiple values, separate them with commas. Example: ACCEPTED, SUGGESTED.

    Configuration setting            Export outcome
    ACCEPTED                         Only accepted business term associations are exported.
    ACCEPTED, SUGGESTED              Both accepted and suggested business term associations are exported.
    ACCEPTED, SUGGESTED, REJECTED    All accepted, suggested, and rejected business term associations are exported.
  6. Click Save Changes.

Results

After you save the changes, the configuration is updated, and export jobs include only the business term associations that have the chosen statuses.
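
The setting accepts a comma-separated list drawn from three status names. As a rough illustration of that format, the sketch below parses such a value and rejects anything outside the three documented statuses. The function name parse_status_setting is hypothetical, and treating the names as case-sensitive and ignoring duplicates are assumptions; the source does not say how Data Catalog itself validates the value.

```python
from typing import List

# The three statuses named in the procedure above.
ALLOWED_STATUSES = ("ACCEPTED", "SUGGESTED", "REJECTED")


def parse_status_setting(value: str) -> List[str]:
    """Split a comma-separated setting value into a list of valid statuses.

    Hypothetical helper: case-sensitive matching and duplicate removal
    are assumptions, not documented Data Catalog behavior.
    """
    statuses: List[str] = []
    for item in value.split(","):
        name = item.strip()  # tolerate spaces after commas
        if name not in ALLOWED_STATUSES:
            raise ValueError(f"unknown status: {name!r}")
        if name not in statuses:
            statuses.append(name)
    return statuses


print(parse_status_setting("ACCEPTED, SUGGESTED"))
```

An empty value or a misspelled status raises ValueError in this sketch, which makes typing mistakes visible before you save the setting.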

Import Atlas HIVE_DB lineages to Data Catalog

This task describes how to import lineage information for an Atlas Hive database (HIVE_DB) into Data Catalog. Perform the following steps to run an import job:

Procedure

  1. Click Tools in the left navigation menu.

  2. Click the External Data Source tab.

    Atlas is the default external data source type.
  3. If Import is not already selected, click Import.

  4. Select the external data source from the drop-down list.

  5. Enter one of the following parameters and click Submit.

    Parameter                                            Description
    -virtualFolder <hive_virtual_folder> -path <path>    The import starts from the given path.
    -virtualFolder <hive_virtual_folder>                 The import starts from the root path.

    You can monitor the progress of the job on the Job Activity page.

Results

After the job is complete, the lineages from Atlas are imported to Data Catalog.
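
The two parameter forms differ only in whether -path is present. The sketch below assembles the text you would type into the parameter field. The function name build_job_parameters and the sample values hive_vf and /sales/orders are hypothetical, and rejecting values that contain whitespace is an assumption, because the source does not describe how such values should be quoted.

```python
from typing import Optional


def build_job_parameters(virtual_folder: str, path: Optional[str] = None) -> str:
    """Build the parameter text for an Atlas import or export job.

    Hypothetical helper: without a path, the job starts from the root path;
    with a path, it starts from that path.
    """
    for value in (virtual_folder, path):
        if value is None:
            continue
        # Assumption: values with whitespace are rejected rather than quoted.
        if not value or any(ch.isspace() for ch in value):
            raise ValueError(f"value must be non-empty without whitespace: {value!r}")
    params = ["-virtualFolder", virtual_folder]
    if path is not None:
        params += ["-path", path]
    return " ".join(params)


print(build_job_parameters("hive_vf"))
print(build_job_parameters("hive_vf", "/sales/orders"))
```

The same two forms apply to the export jobs, so a helper like this could be reused for them with a different virtual folder name.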

Export Data Catalog HIVE_DB level terms to Atlas

This task describes how to export Data Catalog HIVE_DB level terms to Atlas. Perform the following steps to run an export job:

Procedure

  1. Click Tools in the left navigation menu.

  2. Click the External Data Source tab.

    Atlas is the default external data source type.
  3. Click Export.

  4. Select the external data source from the drop-down list.

  5. Enter one of the following parameters and click Submit.

    Parameter                                            Description
    -virtualFolder <hive_virtual_folder> -path <path>    The export starts from the given path.
    -virtualFolder <hive_virtual_folder>                 The export starts from the root path.

    You can monitor the progress of the job on the Job Activity page.

Results

After the job is complete, business terms associated with resources in Data Catalog are exported to Atlas according to the status setting configured for the Atlas connector (see Configure an Atlas connector). The following table describes the possible export outcomes:

Configuration setting            Export outcome
ACCEPTED                         Only accepted business term associations are exported.
ACCEPTED, SUGGESTED              Both accepted and suggested business term associations are exported.
ACCEPTED, SUGGESTED, REJECTED    All accepted, suggested, and rejected business term associations are exported.

If the exported business term is a built-in Data Catalog term, it appears in Atlas as LDC_BITS_<Business_term>.

If the exported business term is a custom Data Catalog term, it appears in Atlas as LDC_<GLOSSARY>_<Business_term>.
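
The two naming patterns can be expressed as a small function, which may help when you search Atlas for an exported term. This is only a sketch of the patterns stated above: the function name atlas_term_name and the sample names Email and Finance are hypothetical, and it assumes the glossary and term names are inserted unchanged, since the source does not say whether case or spaces are altered during export.

```python
from typing import Optional


def atlas_term_name(business_term: str, glossary: Optional[str] = None) -> str:
    """Return the name an exported term is expected to have in Atlas.

    Hypothetical helper: pass no glossary for a built-in term, or the
    glossary name for a custom term. Names are inserted unchanged, which
    is an assumption about the export.
    """
    if glossary is None:
        return f"LDC_BITS_{business_term}"
    return f"LDC_{glossary}_{business_term}"


print(atlas_term_name("Email"))
print(atlas_term_name("Email", "Finance"))
```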

Export HDFS_DB level terms to Atlas

This task describes how to export Data Catalog HDFS_DB level terms to Atlas. Perform the following steps to run an export job:

Procedure

  1. Click Tools in the left navigation menu.

  2. Click the External Data Source tab.

    Atlas is the default external data source type.
  3. Click Export.

  4. Select the external data source from the drop-down list.

  5. Enter one of the following parameters and click Submit.

    Parameter                                            Description
    -virtualFolder <hdfs_virtual_folder> -path <path>    The export starts from the given path.
    -virtualFolder <hdfs_virtual_folder>                 The export starts from the root path.

    You can monitor the progress of the job on the Job Activity page.

Results

After the job is complete, business terms associated with resources in Data Catalog are exported to Atlas according to the status setting configured for the Atlas connector (see Configure an Atlas connector). The following table describes the possible export outcomes:

Configuration setting            Export outcome
ACCEPTED                         Only accepted business term associations are exported.
ACCEPTED, SUGGESTED              Both accepted and suggested business term associations are exported.
ACCEPTED, SUGGESTED, REJECTED    All accepted, suggested, and rejected business term associations are exported.

If the exported business term is a built-in Data Catalog term, it appears in Atlas as LDC_BITS_<Business_term>.

If the exported business term is a custom Data Catalog term, it appears in Atlas as LDC_<GLOSSARY>_<Business_term>.
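
The export outcomes table can be read as a filter over term associations: an association is exported only if its status appears in the configured list. The sketch below models that reading with plain dictionaries. The function name select_for_export, the record fields resource, term, and status, and the sample data are all hypothetical; they illustrate the rule and do not reflect how Data Catalog stores associations.

```python
from typing import Dict, Iterable, List


def select_for_export(
    associations: Iterable[Dict[str, str]],
    configured_statuses: Iterable[str],
) -> List[Dict[str, str]]:
    """Keep only the associations whose status is in the configured list.

    Hypothetical model of the export outcomes table, for illustration only.
    """
    allowed = set(configured_statuses)
    return [a for a in associations if a["status"] in allowed]


# Made-up sample records; the field names are assumptions.
sample = [
    {"resource": "/data/customers.csv", "term": "Email", "status": "ACCEPTED"},
    {"resource": "/data/customers.csv", "term": "Phone", "status": "SUGGESTED"},
    {"resource": "/data/orders.csv", "term": "Email", "status": "REJECTED"},
]

for row in select_for_export(sample, ["ACCEPTED", "SUGGESTED"]):
    print(row["term"], row["status"])
```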