Using Merge rows (diff) on the Pentaho engine
If you are running your transformation on the Pentaho engine, use the following instructions to set up the Merge rows (diff) step.
ImportantWhen using the Pentaho transformation engine, the reference
rows and compare rows must be sorted on the specified keys. When using the Merge rows (diff) step within a
PDI
transformation, such as with the Sort
rows step, sorting works correctly. However, if the data is sorted outside
of PDI, such as in a
SQL query, you may run into issues with the internal case sensitive/insensitive flag or other
collations. If you are using the Merge rows (diff) step with
the Spark engine, see Using Merge rows (diff)
on the Spark engine.
General
Enter the following information for the step:
- Step name: Specify the unique name of the transformation on the canvas. You can customize the name or leave it as the default.
Options
The Merge rows (diff) step contains the following options.
Option | Description |
Reference rows origin | From the drop-down list, select the input source for the original reference rows you want to compare. The input source will be a previous step in the transformation. |
Compare rows origin | From the drop-down list, select the input source for the compare rows. The input source will be a previous step in the transformation. |
Flag fieldname | Specify a field name that will contain the flag indicating how the values were compared and merged in the output row. |
Keys to match | Specify the field names that contain the keys on which to generate a match. |
Values to compare | Specify the field names that contain the values on which to generate a comparison. |
Get key fields | Click the Get key fields button to populate the Keys to match table with all the fields in the Reference rows origin step. |
Get value fields | Click the Get value fields button to populate the Values to compare table with all the fields in the Compare rows origin step. |
Examples
The Merge rows – mergs 2 streams of data and add a flag.ktr transformation located in the data-integration/samples/transformations directory illustrates how to use the Merge rows (diff) step.
Metadata injection support
All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.