Memory Group By

Last updated
Save as PDF

The Memory Group By step groups rows in memory from a source step. The resulting rows are grouped based on a specified field or collection of fields. A new row is generated for each group. This step differs from the Group By step by processing all rows in memory, and is designed to handle non-sorted input. If the number of rows you want to group is too large to fit into memory, you must use a combination of the Sort rows and Group By steps.

General

Enter the information in the transformation step name field as shown in the following table:

Option	Definition
Step Name	Specify the unique name of the Memory Group by step on the canvas. You can customize the name or leave it as the default.
Always give back a result row	Select to return a result row, even when there is no input row. If no input rows exist, this option returns a count of zero (0).

Memory Group By dialog

The fields that make up the Group Table

Use this table to specify the fields you want to group. Click Get Fields to add all fields from the PDI input stream. Right-click a row in the table to edit that single row or all rows in the table.

NoteLeave this table blank to calculate aggregate functions over the entire dataset.

Aggregates table

Use this table to specify the group you want to aggregate, the aggregation method, and the name of the resulting new field. Click Get lookup fields to specify the fields to include in a group. See the Group By step for more details.

Column	Description
Name	The name of the aggregate field.
Subject	The subject on which you want to use an aggregation method.
Type	The type of aggregation method for grouping the rows. The following methods are available: Sum Average (Mean) Median Percentile Minimum Maximum Number of values (N) Concatenate strings separated by , (comma) First non-null value Last non-null value First value (including null) Last value (including null) Standard deviation Concatenate strings that are separated by the character specified in the Value column Number of distinct values Number of rows (without field argument).
Value	The aggregate value.

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.

Pentaho+ documentation has moved!

The new product documentation portal is here. Check it out now at docs.hitachivantara.com.

General

The fields that make up the Group Table

Aggregates table

Metadata injection support