Hitachi Vantara Lumada and Pentaho Documentation

String Operations


With the String Operations step, you can perform the following string operations on an incoming PDI field:

  • Trim (remove leading and/or trailing spaces).
  • Convert to uppercase or lowercase.
  • Pad (add leading or trailing extra characters).
  • Convert to initial capitalization.
  • Escape or unescape XML, HTML, or SQL content.
  • Remove or return only numeric digits.
  • Remove special characters.
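The operations listed above can be approximated outside PDI for quick experiments. The following Python sketch is illustrative only and is not PDI code: the function names and option strings are hypothetical, the pad length is assumed to be the target total length of the result, and initial capitalization is assumed to affect only the first character of the field. Check the behavior of your PDI version before relying on any of these assumptions.

```python
import re

# Hypothetical helpers that mimic the String Operations step; not PDI APIs.

def trim(value, trim_type="none"):
    """Remove leading and/or trailing whitespace ('left', 'right', 'both')."""
    if trim_type == "left":
        return value.lstrip()
    if trim_type == "right":
        return value.rstrip()
    if trim_type == "both":
        return value.strip()
    return value

def change_case(value, mode="none"):
    """Convert the whole value to 'lower' or 'upper' case."""
    if mode == "lower":
        return value.lower()
    if mode == "upper":
        return value.upper()
    return value

def pad(value, side="none", pad_char=" ", pad_length=0):
    """Pad on the 'left' or 'right' with a single character.

    Assumption: pad_length is the total length to pad to, so values that
    are already long enough are returned unchanged.
    """
    if side == "left":
        return value.rjust(pad_length, pad_char)
    if side == "right":
        return value.ljust(pad_length, pad_char)
    return value

def init_cap(value):
    """Capitalize the first character and leave the rest untouched (assumed)."""
    return value[:1].upper() + value[1:]

def digits(value, mode="none"):
    """Return 'only' the ASCII digits, or 'remove' them."""
    if mode == "only":
        return re.sub(r"[^0-9]", "", value)
    if mode == "remove":
        return re.sub(r"[0-9]", "", value)
    return value

# Characters removed for each hypothetical option name.
SPECIAL_CHARS = {
    "cr": ("\r",),
    "lf": ("\n",),
    "crlf": ("\r", "\n"),
    "tab": ("\t",),
    "space": (" ",),
}

def remove_special(value, which="none"):
    """Remove every occurrence of the selected special character(s)."""
    for char in SPECIAL_CHARS.get(which, ()):
        value = value.replace(char, "")
    return value
```

For example, `pad("7", "left", "0", 3)` left-pads a one-character value with zeros, and `digits("a1b2", "only")` keeps only the numeric characters.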



Enter the following information for the transformation step:

  • Step name

    Specify the unique name of the String Operations step on the canvas. You can customize the name or leave it as the default.

The fields to process


The fields to process table in the String Operations step

Use The fields to process table to specify which operations you want to apply to your input strings.

The table contains the following columns:


  • In stream field

    Name of the field containing the string to be processed. Use Get fields to populate the table with fields from the incoming PDI data stream.

  • Out stream field

    (Optional) A new outgoing PDI field containing the results of the specified string operations. If you do not specify a value for this field, the In stream field is replaced by the resulting string.

  • Trim type

    Specifies whether to remove extra spaces before or after a field. You can trim a field on its left side, right side, or both. The default value is none.

  • Lower/Upper

    Specifies whether to make all the characters in a field uppercase or lowercase. The default value is none.

  • Padding

    Specifies whether to add extra characters before or after a field. You can pad a field on its left side or its right side. The default value is none.

  • Pad char

    The extra character added to a field for padding.

  • Pad Length

    The number of extra characters used for padding.

  • InitCap

    Specifies whether to capitalize the initial character in a field. The default value is N.

  • Escape

    Specifies whether to use, escape, or unescape content in a field when it is in one of the following formats:

      • XML
      • HTML
      • SQL

    The default value is None.

  • Digits

    Specifies whether to return only the numeric characters (digits) in a field or to remove them. The default value is none.

  • Remove Special character

    Specifies whether to remove any of the following special characters from a field:

      • Carriage return (CR)
      • Line feed (LF)
      • Carriage return and line feed
      • Horizontal tab
      • Space

    The default value is none.
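The effect of the Out stream field and the escape options can be sketched in Python. This is a hedged illustration, not PDI code: `escape_value`, `apply_to_row`, and the mode names are hypothetical, XML and HTML handling is borrowed from the Python standard library, and SQL escaping is assumed to mean doubling single quotes. PDI may produce different output for the same input.

```python
import html
from xml.sax.saxutils import escape as xml_escape, unescape as xml_unescape

def escape_value(value, mode="none"):
    """Escape or unescape a string; mode names are hypothetical."""
    if mode == "escape_xml":
        return xml_escape(value)        # replaces &, <, >
    if mode == "unescape_xml":
        return xml_unescape(value)
    if mode == "escape_html":
        return html.escape(value)       # also escapes quotes by default
    if mode == "unescape_html":
        return html.unescape(value)
    if mode == "escape_sql":
        return value.replace("'", "''")  # assumption: double single quotes
    return value

def apply_to_row(row, in_field, operation, out_field=None):
    """Apply an operation to one field of a row (a dict).

    If out_field is omitted, the result replaces the input field,
    mirroring an empty Out stream field; otherwise a new field is added
    and the input field is kept unchanged.
    """
    result = dict(row)
    result[out_field or in_field] = operation(row[in_field])
    return result

row = {"name": "<x>"}
to_xml = lambda v: escape_value(v, "escape_xml")
replaced = apply_to_row(row, "name", to_xml)
added = apply_to_row(row, "name", to_xml, out_field="name_xml")
```

Here `replaced` holds only the overwritten `name` field, while `added` keeps the original `name` and carries the escaped value in the new `name_xml` field.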

Metadata injection support


All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.