Hitachi Vantara Lumada and Pentaho Documentation

Managing configurations

To customize Lumada Data Catalog for your environment, you can change specific configuration settings. Scripts use these settings when gathering catalog metadata, business terms, and term associations, and when performing other Data Catalog functions.

The available configuration categories are shown in the following example:

[Image: Configurations page]

For the Lumada Data Catalog Application Server, you can update configuration settings in the Security and Tools configuration groups. The following table shows an example of an available configuration setting in each configuration group.

Group name | Example
Security | Set the supported characters for validating job template arguments, to prevent injection attacks
Tools | Whether to enable or disable the lineage import-export feature

For the local Lumada Data Catalog Agent, you can update configuration settings in the Discovery, Discovery_Profiler, Misc, and MetadataService configuration groups. The following table shows an example of an available configuration setting in each configuration group.

Group name | Example
Discovery | Whether to enable Hive support when initializing a Spark session
Discovery_Profiler | Set characters in a file name prefix that should be skipped during discovery
Misc | Set the URI for the discovery cache metadata files
MetadataService | Set the Hive version

Searching configurations

You can search the Configurations page for a specific setting that you want to update. If you do not know the exact name of the setting you want to modify, enter a keyword in the search box, as in the following image.

[Image: Search example]

The results list the configuration groups that contain settings matching your search term. In this example, only one configuration group, Discovery Profiler, has settings that match the keyword prefix. Click the View Details (>) icon to view the list of matching settings.
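The search behavior described above can be pictured with a minimal Python sketch. This is only an illustration, not the way Data Catalog implements search: the `CONFIG_GROUPS` dictionary, the `search_settings` function, and every setting name other than LDC Selector Ignore Prefixes are hypothetical, and case-insensitive substring matching is an assumption.

```python
# Hypothetical in-memory model of configuration groups and their settings.
# Only "LDC Selector Ignore Prefixes" is a setting name taken from this
# article; the other names are placeholders for illustration.
CONFIG_GROUPS = {
    "Discovery": ["Enable Hive support"],
    "Discovery Profiler": ["LDC Selector Ignore Prefixes"],
    "Misc": ["URI for discovery cache metadata store"],
    "MetadataService": ["Hive version"],
}


def search_settings(groups, keyword):
    """Return {group: [matching settings]} for a case-insensitive keyword."""
    needle = keyword.strip().lower()
    if not needle:
        return {}
    matches = {}
    for group, settings in groups.items():
        hits = [name for name in settings if needle in name.lower()]
        if hits:
            # Only groups with at least one matching setting are listed.
            matches[group] = hits
    return matches


print(search_settings(CONFIG_GROUPS, "prefix"))
# {'Discovery Profiler': ['LDC Selector Ignore Prefixes']}
```

As in the example above, searching for the keyword prefix returns a single group, and expanding that group corresponds to reading the list of matching settings.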

Changing configuration settings

For example, to change the file name prefixes that are ignored during discovery, modify the LDC Selector Ignore Prefixes setting in the Discovery Profiler configuration group, as shown in the following image.

Note: If you have the administrator role, you can configure most settings on the Configurations page. Some settings are read-only, and you cannot expand them. For example, the Security settings for the LDC Application Server are read-only.

[Image: Asterisk indicating server restart is required]

The functions of the buttons are described below:
  • Reset

    Reset to the last saved value.

  • Set to default

    Set the value back to the default value.

  • Save change

    Save the value specified in the text box.

If you change the value of a setting, make sure you save the change. For some LDC Agent configuration settings, you need to restart the agent after changing the setting. If the Value text box for a configuration setting is displayed with an asterisk (*), an Agent restart is required. For more information, see Restart an agent.
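To make the difference between the three buttons concrete, here is a minimal Python sketch of one setting. The `Setting` class and its field names are hypothetical and do not represent a Data Catalog API. The sketch also assumes that Set to default only fills the Value text box and that the value is not persisted until you save; the article does not state this explicitly.

```python
from dataclasses import dataclass, field


@dataclass
class Setting:
    """Hypothetical model of one configuration setting on the page."""
    name: str
    default: str
    saved: str                       # last value persisted with Save Change
    restart_required: bool = False   # shown as an asterisk (*) in the UI
    draft: str = field(init=False)   # current contents of the Value text box

    def __post_init__(self):
        # The text box starts out showing the last saved value.
        self.draft = self.saved

    def reset(self):
        """Reset: discard edits and return to the last saved value."""
        self.draft = self.saved

    def set_to_default(self):
        """Set to default: put the default value in the text box (assumed
        not to be persisted until Save Change)."""
        self.draft = self.default

    def save_change(self):
        """Save Change: persist the text box value.

        Returns True when the value changed and an agent restart is needed.
        """
        changed = self.draft != self.saved
        self.saved = self.draft
        return changed and self.restart_required


# Placeholder name and values, for illustration only.
setting = Setting("Example setting", default="a", saved="b", restart_required=True)
setting.draft = "c"
setting.reset()                # text box is back to "b"
setting.set_to_default()       # text box now shows "a"
print(setting.save_change())   # True: value changed and a restart is required
```

The return value of `save_change` mirrors the asterisk rule: a restart matters only when a changed setting is one of those marked as requiring an agent restart.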

Restart an agent

After you change some configuration settings, you need to restart the LDC Agent for the new values to take effect. If the Value text box for a setting is displayed with an asterisk (*), an Agent restart is required.

  1. After you have updated your settings for the Agent, return to the Configurations page.
  2. Click Restart <agent>.
  3. Click Management, then click Agents to check the status of the LDC Agent.

    When the agent has restarted successfully, the Connected column for the LDC Agent displays a green checkmark icon.

Large properties configuration example

A common use case for managing configurations is to change the settings for large properties for an LDC Agent.

In Data Catalog, the large properties location is where the agent stores metadata from profile jobs. You can set up a large properties location in a file system like HDFS, or on object storage like AWS S3. This example shows how to configure the large properties settings to use an AWS S3 bucket.

Procedure

  1. Navigate to Management, then click Configuration.

  2. Locate the configuration section for the agent where you want to set up the large properties location. In that section, click the View Details (>) icon for the Misc group to view all miscellaneous settings for that agent.

  3. Expand the Attributes for discovery cache metadata store setting and provide the credentials and the endpoint for your AWS S3 bucket:

    fs.s3a.access.key=<AWS S3 access key>
    fs.s3a.secret.key=<AWS S3 secret key>
    fs.s3a.endpoint=<AWS S3 endpoint>
    fs.s3a.path.style.access=true
    fs.s3a.threads.max=40
    fs.s3a.connection.maximum=200

  4. Click Save Change.

  5. Open the Relative location for a large properties metadata store setting. In the Value text box, provide a folder location in your S3 bucket. For example, /lp1.

  6. Click Save Change.

  7. Open the URI for discovery cache metadata store setting and provide the URI for the S3 bucket. For an AWS S3 bucket, the URI takes one of two formats, depending on the agent for which you are configuring these settings:

    • For a remote agent running on EMR: s3://<Bucket Name>
    • For any other agent: s3a://<Bucket Name>

  8. Click Save Change.

  9. Navigate to the Configurations page and click the restart link of the applicable agent.
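The values entered in steps 3 and 7 can be prepared ahead of time with a small Python sketch. The function names `build_s3a_attributes` and `bucket_uri` are hypothetical helpers, not part of Data Catalog, and the credentials, endpoint, and bucket name shown are placeholders. The sketch only assembles text to paste into the Value text boxes; it does not contact AWS or Data Catalog.

```python
REQUIRED_KEYS = ("fs.s3a.access.key", "fs.s3a.secret.key", "fs.s3a.endpoint")


def build_s3a_attributes(access_key, secret_key, endpoint,
                         path_style=True, max_threads=40, max_connections=200):
    """Return the key=value lines for the attributes setting (step 3)."""
    props = {
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.path.style.access": str(path_style).lower(),
        "fs.s3a.threads.max": str(max_threads),
        "fs.s3a.connection.maximum": str(max_connections),
    }
    # Fail early if a credential or the endpoint was left empty.
    missing = [key for key in REQUIRED_KEYS if not props[key]]
    if missing:
        raise ValueError(f"missing values for: {', '.join(missing)}")
    return "\n".join(f"{key}={value}" for key, value in props.items())


def bucket_uri(bucket_name, remote_emr_agent=False):
    """Return the store URI (step 7): s3:// for a remote agent on EMR,
    s3a:// for any other agent."""
    scheme = "s3" if remote_emr_agent else "s3a"
    return f"{scheme}://{bucket_name}"


# Placeholder values, for illustration only.
print(build_s3a_attributes("<access key>", "<secret key>", "<endpoint>"))
print(bucket_uri("my-bucket"))                         # s3a://my-bucket
print(bucket_uri("my-bucket", remote_emr_agent=True))  # s3://my-bucket
```

Keeping the scheme choice in one function makes the EMR distinction from step 7 explicit, which is the detail most easily mistyped when entering the URI by hand.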