Hitachi Vantara Lumada and Pentaho Documentation

String Operations

With the String Operations step, you can perform the following string operations on an incoming PDI field:

  • Trim (remove leading and/or trailing spaces).
  • Convert to uppercase or lowercase.
  • Pad (add leading or trailing extra characters).
  • Convert to initial capitalization.
  • Escape or unescape special characters for formats such as XML, HTML, and SQL.
  • Remove or return only numeric digits.
  • Remove special characters.
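As a rough illustration of what several of these operations do to a string, here is a minimal Python sketch. It is not the step's implementation: the function names and option values (`"left"`, `"only"`, and so on) are hypothetical, trimming is limited to the space character as described above, initial capitalization is assumed to affect only the first character of the field, and digits are assumed to mean ASCII 0 through 9.

```python
def trim(value: str, trim_type: str = "none") -> str:
    """Strip spaces from the left, right, or both sides of the value."""
    if trim_type == "left":
        return value.lstrip(" ")
    if trim_type == "right":
        return value.rstrip(" ")
    if trim_type == "both":
        return value.strip(" ")
    return value  # "none": leave the value untouched


def change_case(value: str, mode: str = "none") -> str:
    """Convert the whole value to lowercase or uppercase."""
    if mode == "lower":
        return value.lower()
    if mode == "upper":
        return value.upper()
    return value


def init_cap(value: str) -> str:
    """Capitalize only the first character (assumption), keep the rest."""
    return value[:1].upper() + value[1:]


def digits(value: str, mode: str = "none") -> str:
    """Return only the ASCII digits 0-9, or remove them."""
    if mode == "only":
        return "".join(ch for ch in value if ch in "0123456789")
    if mode == "remove":
        return "".join(ch for ch in value if ch not in "0123456789")
    return value
```

For example, `trim("  abc  ", "both")` gives `"abc"` and `digits("a1b2", "only")` gives `"12"`. The actual step may treat other whitespace or non-ASCII digits differently, so check its output on your own data.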

General

Enter the following information in the transformation step field:

  • Step name

    Specify the unique name of the String Operations step on the canvas. You can customize the name or leave it as the default.

The fields to process

Figure: The fields to process table in the String Operations step

In The fields to process table, specify which operations you want to apply to your input strings.

The table contains the following columns:

  • In stream field

    Name of the field containing the string to be processed. Use Get fields to populate the table with fields from the incoming PDI data stream.

  • Out stream field

    (Optional) A new outgoing PDI field containing the results of the specified string operations. If you do not specify a value for this field, the In stream field is replaced by the resulting string.

  • Trim type

    Specifies whether to remove extra spaces at the start or end of a field. You can trim a field on its left side, right side, or both. The default value is none.

  • Lower/Upper

    Specifies whether to make all the characters in a field uppercase or lowercase. The default value is none.

  • Padding

    Specifies whether to add extra characters at the start or end of a field. You can pad a field on its left side or its right side. The default value is none.

  • Pad char

    The extra character added to a field for padding.

  • Pad Length

    The number of extra characters used for padding.

  • InitCap

    Specifies whether to capitalize the initial character in a field. The default value is N.

  • Escape

    Specifies whether to use, escape, or unescape the following formats in a field:

      • XML
      • HTML
      • CDATA
      • SQL

    The default value is None.

  • Digits

    Specifies whether to return only the numeric characters (digits) in a field or to remove them. The default value is none.

  • Remove Special character

    Specifies whether to remove any of the following special characters from a field:

      • Carriage return (CR)
      • Line feed (LF)
      • Carriage return and line feed
      • Horizontal tab
      • Space

    The default value is none.
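To make the interaction between the In stream field, Out stream field, Escape, and Remove Special character columns more concrete, the following Python sketch models one row of the table applied to one record. Everything named here (`process_row`, `remove_special`, `escape`, the option keys) is hypothetical and not part of PDI. The sketch also assumes that the "carriage return and line feed" option removes every CR and every LF, that using CDATA means wrapping the value in a CDATA section, and that special characters are removed before escaping; none of these details are stated above.

```python
import html
from typing import Dict, Optional
from xml.sax.saxutils import escape as xml_escape

# Hypothetical option keys mapped to the characters they remove.
SPECIAL_CHARS = {
    "cr": "\r",
    "lf": "\n",
    "crlf": "\r\n",  # assumption: removes every CR and every LF
    "tab": "\t",
    "space": " ",
}


def remove_special(value: str, which: str = "none") -> str:
    """Drop every character belonging to the selected option."""
    chars = SPECIAL_CHARS.get(which, "")
    return "".join(ch for ch in value if ch not in chars)


def escape(value: str, mode: str = "none") -> str:
    """Escape, unescape, or wrap the value; "none" returns it unchanged."""
    if mode == "escape_xml":
        return xml_escape(value)
    if mode == "escape_html":
        return html.escape(value)
    if mode == "unescape_html":
        return html.unescape(value)
    if mode == "cdata":  # assumption: "use CDATA" wraps the value
        return f"<![CDATA[{value}]]>"
    return value


def process_row(
    row: Dict[str, str],
    in_field: str,
    out_field: Optional[str] = None,
    special: str = "none",
    escape_mode: str = "none",
) -> Dict[str, str]:
    """Apply the operations and write to out_field, or overwrite in_field."""
    result = dict(row)  # copy so the incoming row is not mutated
    value = escape(remove_special(row[in_field], special), escape_mode)
    result[out_field or in_field] = value
    return result
```

With `row = {"name": "a\tb"}`, calling `process_row(row, "name", special="tab")` overwrites `name` with `"ab"`, while passing `out_field="clean"` keeps `name` as it was and adds a `clean` field holding the result, which mirrors the optional Out stream field behavior described in the table.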

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.