The ORC Output step serializes data from the PDI data stream into the ORC format and writes it to a file. ORC (Optimized Row Columnar) is a data format for fast columnar storage.
The fields written to the ORC output file are defined by the input fields. Each input field can be left out of the output file, written under an alternate field name, or written with a default value.
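The field handling described above can be pictured with a small, self-contained sketch. It is not PDI code and does not write an ORC file; it only illustrates the idea of dropping, renaming, and defaulting fields before output. The names `OutputField` and `map_row` are hypothetical, and applying the default when the incoming value is missing or null is an assumption about the behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class OutputField:
    """Hypothetical description of one field in the output file."""
    source: str                    # field name in the incoming stream
    target: str                    # field name written to the output
    default: Optional[Any] = None  # assumed to apply when the value is missing or None


def map_row(row: Dict[str, Any], fields: List[OutputField]) -> Dict[str, Any]:
    """Build one output row; input fields not listed in `fields` are dropped."""
    out: Dict[str, Any] = {}
    for f in fields:
        value = row.get(f.source)
        out[f.target] = f.default if value is None else value
    return out


# Illustrative field definitions: one renamed field, one with a default value.
fields = [
    OutputField(source="id", target="customer_id"),
    OutputField(source="country", target="country", default="unknown"),
]

# Illustrative input rows; "internal_note" is not listed, so it is dropped.
rows = [
    {"id": 1, "country": "DE", "internal_note": "x"},
    {"id": 2, "country": None, "internal_note": "y"},
]

mapped = [map_row(r, fields) for r in rows]
```

In this sketch, `id` is written as `customer_id`, a null `country` is replaced with `unknown`, and `internal_note` does not appear in the output rows.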
Select an Engine
You can run the ORC Output step on the Pentaho engine or on the Spark engine. The transformation runs differently depending on the engine you select. Select one of the following options to view how to set up the ORC Output step for your engine.
- Using the ORC Output step on the Pentaho engine: Learn how to set up this step when using the Pentaho engine.
- Using the ORC Output step on the Spark engine: Learn how to set up this step when using the Spark engine.
For instructions on selecting an engine for your transformation, see Run configurations.