Memory Group By
The Memory Group By step groups rows in memory from a source step. The resulting rows are grouped based on a specified field or collection of fields. A new row is generated for each group. This step differs from the Group By step by processing all rows in memory, and is designed to handle non-sorted input. If the number of rows you want to group is too large to fit into memory, you must use a combination of the Sort Rows and Group By steps.
General
Enter the information in the transformation step name field as shown in the following table:
Option | Definition |
---|---|
Step Name | Specifiy the unique name of the Memory Group by step on the canvas. You can customize the name or leave it as the default. |
Always give back a result row | Select to return a result row, even when there is no input row. If no input rows exist, this option returns a count of zero (0). |
The Fields that Make up the Group Table
Use this table to specify the fields you want to group. Click Get Fields to add all fields from the PDI input stream. Right-click a row in the table to edit that single row or all rows in the table.
Leave this table blank to calculate aggregate functions over the entire dataset.
Aggregates Table
Use this table to specify the group you want to aggregate, the aggregation method, and the name of the resulting new field. Click Get lookup fields to specify the fields to include in a group. See the Group by step for more details.
Column | Description |
---|---|
Name | The name of the aggregate field. |
Subject | The subject on which you want to use an aggregation method. |
Type |
The type of aggregation method for grouping the rows. The following methods are available:
|
Value | The aggregate value. |
Metadata Injection Support
All fields of this step support metadata injection. You can use this step with ETL Metadata Injection to pass metadata to your transformation at runtime.