Hitachi Vantara Lumada and Pentaho Documentation

Write Metadata

You can use the Write Metadata step to add new metadata in the Data Catalog that is associated with specific data resources. You can add a description and tags to the data resource. You can also add metadata for any new data resources that your transformation adds to Data Catalog.

The Write Metadata step includes options to identify, locate, and append the metadata that is associated with resource IDs in the Data Catalog.

For more information about accessing Data Catalog in PDI, see PDI and Data Catalog.

Note: The Write Metadata step is supported on the PDI engine but not on the Spark engine. Only CSV text file formats are currently supported. You must have role permissions in Data Catalog to read the data resources.

Before you begin

Before you can use the Write Metadata step, you must establish a VFS connection to Data Catalog. For more information, see Access to Data Catalog. You must also have role permissions set in Data Catalog to create and write tag descriptions for data resources that are registered in Data Catalog.

General

The following options are general to the Write Metadata transformation step.

Option | Description
Step Name | Specify a unique name for the Write Metadata step. You can customize the name or use the default name.
Connection | Select the name of your connection to Data Catalog. See Connecting to Virtual File Systems for details.

Options

You can use the Write Metadata step to select the ID for an existing data resource in Data Catalog. The Write Metadata step appends new tags and descriptions of the selected tags to any existing tags for the resource.
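The append behavior described above can be pictured with a small Python sketch. This is only an illustrative model, not a Data Catalog or PDI API: the function name `append_business_terms` and the tuple layout are hypothetical, and the handling of a term that is already present (it is skipped here) is an assumption that the documentation does not confirm.

```python
# Illustrative model only: this is not a Data Catalog or PDI API.
def append_business_terms(existing, selected):
    """Return the existing (term, description) pairs followed by the new ones."""
    merged = list(existing)  # copy, so the caller's list is left untouched
    present = {term for term, _ in existing}
    for term, description in selected:
        # Assumption: a term already on the resource is kept as-is, not duplicated.
        if term not in present:
            merged.append((term, description))
            present.add(term)
    return merged


existing = [("PII", "Contains personal data")]
selected = [("Finance", "Quarterly revenue figures"), ("PII", "Duplicate selection")]
result = append_business_terms(existing, selected)
```

In this model the existing tags are never removed or overwritten; new selections are only added after them.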

Input tab

You can use the Input tab to specify how a transformation obtains resource IDs from Data Catalog to use as input.

If your transformation needs only a specific data resource, you can enter the ID for that data resource in the Resource ID field. The Write Metadata step adds new tags to only that specific data resource.

If the transformation is designed to work with multiple resource IDs, the resource IDs can be supplied as input from a previous step in the transformation, such as the Read Metadata step. You can select one of the resource ID options to specify how the resource IDs are supplied to the next step in a transformation.

Resource ID option | Description
Accept resource ids from previous step | Select this option if the exact resource IDs are the incoming data from a previous step in the transformation.
Pass through fields from previous step | Select this option if the resource IDs are in a specific field that is incoming from a previous step in the transformation.
Field in the input to use as resource id | If you selected the Pass through fields from previous step option, enter the name of the field that contains the resource IDs.
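As a rough mental model of the two ways to supply resource IDs (a single ID typed into the Resource ID field, or a named field in rows arriving from a previous step), consider the following Python sketch. The function `resolve_resource_ids`, its parameters, and the sample field name `resource_id` are hypothetical and are not part of PDI or Data Catalog. Skipping rows that have no value in the field is also an assumption.

```python
# Illustrative model only: names and behavior are assumptions, not the PDI implementation.
def resolve_resource_ids(fixed_id=None, rows=None, id_field=None):
    """Return the list of resource IDs that the step would act on."""
    if fixed_id is not None:
        # A single ID was entered directly, so only that resource is tagged.
        return [fixed_id]
    if rows is None or id_field is None:
        raise ValueError("Provide either fixed_id, or both rows and id_field")
    ids = []
    for row in rows:
        value = row.get(id_field)
        if value:  # Assumption: rows without an ID in the field are skipped.
            ids.append(value)
    return ids


incoming_rows = [
    {"resource_id": "abc123", "name": "sales.csv"},
    {"resource_id": "def456", "name": "returns.csv"},
    {"name": "no_id.csv"},
]
single = resolve_resource_ids(fixed_id="abc123")
many = resolve_resource_ids(rows=incoming_rows, id_field="resource_id")
```

The first call models a transformation that needs only one data resource. The second models IDs that arrive in a field from an earlier step, such as the Read Metadata step.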

Metadata tab

In the Metadata tab, you can specify the business terms that you want to associate with the data resources that are identified in the Input tab. The PDI option to add metadata tags matches the Data Catalog Add a tag feature.

For example, if your transformation uses the Catalog Input step to create a data resource in Data Catalog, you can use the Write Metadata step to inject resource IDs into your transformation and add business terms to those data resources.

In the Business Terms list, you can select one or more business terms that are available through your Data Catalog connection, and then enter or edit a description in the Description box.

Option | Description
Description | Enter or edit the description for the business terms that you selected from the Business Terms list.
Business Terms | Select the business terms that you want associated with the resource ID, and then click ADD.

Note: If missing or incomplete data is returned, you might need to change the default limit for returned results. See Data Catalog searches returning incomplete or missing data for information.