Hadoop File Output

The Hadoop File Output step exports data to text files stored on a Hadoop cluster. It is commonly used to generate comma-separated values (CSV) files that spreadsheet applications can read easily. You can also generate fixed-width files by setting lengths on the fields in the Fields tab.
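To make the difference between the two output formats concrete, the following is a minimal Python sketch that renders the same rows as CSV and as fixed-width text. It runs locally and does not use Pentaho or Hadoop. The sample rows, the `field_lengths` values, and the helper names `to_csv` and `to_fixed_width` are hypothetical, and the pad-and-truncate behavior is an assumption for illustration, not a description of how the step itself handles over-long values.

```python
import csv
import io

# Hypothetical sample data and field lengths, for illustration only.
rows = [("id", "name"), ("1", "Ada"), ("2", "Grace")]
field_lengths = [4, 8]


def to_csv(rows):
    """Render rows as comma-separated text, one record per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


def to_fixed_width(rows, lengths):
    """Render rows as fixed-width text.

    Each value is cut to its field length and right-padded with spaces
    (an assumed convention), so every column starts at a fixed offset.
    """
    lines = []
    for row in rows:
        cells = [str(value)[:n].ljust(n) for value, n in zip(row, lengths)]
        lines.append("".join(cells))
    return "\n".join(lines) + "\n"


csv_text = to_csv(rows)
fixed_text = to_fixed_width(rows, field_lengths)
```

In the CSV form a separator marks the field boundaries, while in the fixed-width form the boundaries are implied by the configured lengths, which is why those lengths must be set for every field.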

Select an Engine

You can run the Hadoop File Output step on either the Pentaho engine or the Spark engine. The transformation runs differently depending on the engine you select. Select one of the following options to view how to set up the Hadoop File Output step for that engine.

Using the Hadoop File Output step on the Pentaho engine: Learn how to set up this step when using the Pentaho engine.
Using the Hadoop File Output step on the Spark engine: Learn how to set up this step when using the Spark engine.

For instructions on selecting an engine for your transformation, see Run configurations.
