Using the Annotate Stream Step for SDR
The Annotate Stream step helps you refine your data for the Streamlined Data Refinery by creating measures, link dimensions, or attributes on the stream fields you specify. You can create multiple annotations on the same field. For example, you might create an average measure and a sum measure on the same field. You can also annotate multiple streams to modify the same data model, and you can create calculated measures and add them to the data model.
The Annotate Stream step modifies the default model produced from the Build Model job entry.
After you are done annotating your data model, you are ready to publish it.
Use the Annotate Stream Step
You can create annotations in different ways. The annotation type which you create determines which properties are shown in the dialog box to complete that annotation. This task assumes you are in the Transformation canvas in Spoon.
- In the Design tab, click the Flow folder, and then double-click the Annotate Stream step. Alternatively, you can drag the step icon on to the transformation canvas.
- Double-click the Annotate Stream icon to open the Annotate Stream dialog box.
- Enter a name for the step in the Step Name field.
- Select whether to save the annotations locally or share them.
- Local: The annotations will be saved locally into the transformation.
- Shared: Allows you to select, create, or rename a shared group of annotations for use by PDI users.
- Optionally, enter a description for the annotations in the Description field.
- Select available fields for annotation.
- Click the Select Fields button to open the Select Fields to Annotate dialog box.
- Double-click the fields in the Available Fields list to add them to the Selected Fields list. Optionally, you can use the arrows to move one or more fields to the Selected Fields list.
- When finished, click OK to close the dialog box. The selected fields now display in the Annotations table featuring the following columns:
- Field: Lists the names of the fields selected for annotation.
- Model Action: Specifies which model action is being taken: Create Measure, Create Attribute, or Link Dimension.
- Summary: Displays a summary of that specific annotation.
- Now you can annotate the selected fields.
- To annotate a field, double-click it in the Annotations section. The Annotate dialog box appears for the selected field. Optionally, you can select the field in the Annotations table and then click the Edit (pencil) icon in the upper-right corner.
- In the Actions field, click the drop-down arrow to select one of the following actions. (For help on selecting values, select the topic link beside each action.)
- Create Measure. See Creating Measures on Stream Fields.
- Create Attribute. See Creating Attributes.
- Link Dimension. See Creating Link Dimensions.
- You can use the Previous and Next buttons to navigate through the fields. When finished, click OK to continue or Cancel to close the dialog box without saving your annotations.
You can remove a field from the Annotations section by selecting it in the table and then clicking the Delete ('X') icon in the upper-right corner.
- Optionally, you can create a calculated measure to add to the model.
- Click the Add Calculated Measure button to open the Annotate dialog box.
- Fill in the following fields:
- Measure Name: Enter the name of the calculated measure you are creating.
- Format: Specify how you want your calculated measure to appear in a report, such as currency, general number, or percentage. Use the drop-down arrow to select a format from a system-defined list, or type in the field to enter a custom format. For example, to display the measure as a percentage, you might select '0.00 %'. See Format Field Options for more information on selecting the appropriate format string.
- Formula: Enter the formula of your calculated measure. This is an MDX statement. For more information, see Introduction to the MDX.
- When calculating subtotals use this formula (check box): Optionally, select this check box if you want this calculated measure to be used in calculations of subtotals in your reports.
- Hide this calculated measure in the model (check box): Optionally, select this check box if you want to hide this calculated measure in the data model. When selected, the calculated measure will be a part of the model, but will not be visible to users when the data source is opened in Analyzer. This check box is useful for calculated measures needed to build a proper data model, but not needed for analytic purposes.
- Use the Previous and Next buttons to navigate through the fields. When finished, click OK to add the calculated measure to the annotations list or Cancel to close the dialog box without saving your annotations. The calculated measure displays in the Annotations table with the Model Action 'Create Calculated Measure' and a summary detailing the measure name and formula.
The calculated measure is not validated until it is selected in Analyzer. Calculated measures will display as base measures in the Available Fields list in Analyzer.
- Click Apply to save your changes. You can continue to create or edit annotations. When finished, click OK to save your changes and close the dialog box, or Cancel to discard your changes and close the dialog box.
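To make the contents of the Annotations table more concrete, the following Python sketch models its rows as plain data. It is an illustration of the concepts only, not PDI's API: the `Annotation` class, the `calculated_measure` helper, the field names, the summary wording, and the MDX formula are all hypothetical, and leaving the field empty for a calculated measure is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Annotation:
    """Hypothetical stand-in for one row of the Annotations table."""
    field: str         # stream field being annotated ("" for a calculated measure)
    model_action: str  # e.g. "Create Measure", "Create Attribute", "Link Dimension"
    summary: str       # short text shown in the Summary column


def calculated_measure(name: str, formula: str) -> Annotation:
    """Build a row for a calculated measure; the formula is an MDX expression."""
    return Annotation("", "Create Calculated Measure", f"{name}: {formula}")


annotations = [
    # Two annotations on the same field: a sum measure and an average measure.
    Annotation("sales_amount", "Create Measure", "Sum of sales_amount"),
    Annotation("sales_amount", "Create Measure", "Average of sales_amount"),
    Annotation("region", "Create Attribute", "Region attribute"),
    # Hypothetical MDX formula that divides one measure by another.
    calculated_measure(
        "Average Price",
        "[Measures].[Sales Amount] / [Measures].[Units Sold]",
    ),
]

# Group the model actions by field to see which fields carry several annotations.
by_field = defaultdict(list)
for row in annotations:
    by_field[row.field].append(row.model_action)

for field, actions in by_field.items():
    print(field or "(calculated measure)", actions)
```

In this sketch, `sales_amount` ends up with two Create Measure entries, which mirrors creating a sum and an average measure on the same field.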
Create Annotation Groups
Annotation groups are useful when data sources, such as a weblog table, are reused in many transformations. Whenever this table is used, you can link to the shared annotation group to get model information on each table field. If the table were to ever change, then the annotations would only need to be updated in one place.
You can create multiple annotations based on the same annotation group by copying the group, and then saving it with a different name. You can do this as many times as you need to make a series of related annotation groups, such as annotations for time dimensions.
You must create a group to save your annotations. This group can be saved as just a local group or as a shared group. When the group is saved locally, it is saved to the transformation on your machine. When you select 'Shared', it will also be stored in the metastore and available to other users.
Create an Annotation Group for Sharing with Other Users
If you want to share your annotation group with other users, select the Shared button and ensure the annotation group has a unique name. Once you click Apply, the group is available to other users creating PDI jobs, who can select it from the Shared menu in the Annotate Stream step.
This task assumes you are in the Transformation canvas in Spoon.
- Complete steps 1-3 in Using the Annotate Stream Step.
- Select the Shared radio button, then click on the Add Annotation Group ("+" plus sign) icon next to the drop-down field.
- Enter a name for your annotation group in the Shared field, and then click Select Fields to begin creating annotations to populate the group. See steps 6-8 in Using the Annotate Stream Step for more information on selecting and annotating fields.
- When you are done, click Apply to save your annotations.
Create an Annotation Group Locally
- Complete steps 1-3 in Using the Annotate Stream Step.
- Select the Local radio button. Your annotation group will only be saved into your transformation. Any user running this transformation can see and use the annotation group.
- Click Select Fields to begin creating annotations to populate the group. See steps 6-8 in Using the Annotate Stream Step for more information on selecting and annotating fields.
- When you are done, click Apply to save your annotation group locally.
If you later decide that you want to share the annotation group, you can reopen it and select the Shared radio button, then click Apply. The group will then be shared to the metastore and be available to other users.
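The difference between a local and a shared group can be sketched in a few lines of Python. This is a conceptual model only, assuming that a shared group is stored in both the transformation and the metastore; the `save_group` function, the two dictionaries, and the `weblog_fields` group name are hypothetical and do not correspond to PDI classes or files.

```python
def save_group(name, annotations, transformation, metastore, shared=False):
    """Save an annotation group; shared groups are also written to the metastore."""
    if shared and name in metastore:
        # Shared groups need a unique name.
        raise ValueError(f"shared group name already in use: {name!r}")
    transformation[name] = list(annotations)
    if shared:
        metastore[name] = list(annotations)


transformation = {}  # stands in for groups saved inside one transformation
metastore = {}       # stands in for the metastore visible to other users

# Save locally first: only the transformation holds the group.
save_group("weblog_fields", ["Create Measure on bytes_sent"],
           transformation, metastore)
print("weblog_fields" in metastore)  # False

# Later, reopen the group and share it: it now also appears in the metastore.
save_group("weblog_fields", transformation["weblog_fields"],
           transformation, metastore, shared=True)
print("weblog_fields" in metastore)  # True
```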
Metadata Injection Support
All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.
When using metadata injection with the Annotate Stream step, you can reuse existing shared annotation groups, but you cannot create a new shared annotation group. If you inject a shared annotation group by providing a value for SHARED_ANNOTATION_GROUP, it is assumed that you are reusing an existing shared annotation group. As a result, any annotations defined in the ETL Metadata Injection step are ignored.
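The precedence rule above can be expressed as a short Python sketch. It is not PDI code: `resolve_annotations` and the dictionary standing in for the metastore are hypothetical, and raising an error when the named group does not exist is an assumption, since the source only states that injection cannot create a new shared group.

```python
def resolve_annotations(injected_annotations, shared_group_name, metastore):
    """Return the annotations the step would use after metadata injection."""
    if shared_group_name:
        if shared_group_name not in metastore:
            # Assumption: a missing group is treated as an error,
            # because injection cannot create a new shared group.
            raise KeyError(f"no shared annotation group named {shared_group_name!r}")
        # A value for SHARED_ANNOTATION_GROUP wins; injected annotations are ignored.
        return metastore[shared_group_name]
    # No shared group was injected, so the injected annotations are used.
    return injected_annotations


metastore = {"weblog_fields": ["Create Measure on bytes_sent"]}
injected = ["Create Attribute on user_agent"]

print(resolve_annotations(injected, "weblog_fields", metastore))
print(resolve_annotations(injected, None, metastore))
```

The first call returns the shared group's annotations and drops the injected ones; the second call, with no shared group name, falls back to the injected annotations.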