Hitachi Vantara Lumada and Pentaho Documentation

ORC Input

The ORC Input step reads field data from an Apache ORC (Optimized Row Columnar) file into the PDI data stream.

Before using the ORC Input step, you must configure a named connection for your distribution, even if you set your Location to Local. For information on named connections, see Set up the Pentaho Server to connect to a Hadoop cluster.

Select an engine

You can run the ORC Input step on either the Pentaho engine or the Spark engine. The transformation runs differently depending on the engine you select. Select one of the following options to view how to set up the ORC Input step for that engine.

For instructions on selecting an engine for your transformation, see Run configurations.