Hitachi Vantara Lumada and Pentaho Documentation

Avro Output

The Avro Output step serializes data from the PDI data stream into Avro binary or JSON format, then writes it to a file. Apache Avro is a data serialization system that relies on a schema to decode binary data and extract field values.
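
To illustrate why Avro needs a schema for decoding, the following is a minimal sketch of Avro's binary encoding for a two-field record, written in plain Python without an Avro library. The record name, field names, and sample values are assumptions made for illustration; they do not come from PDI. The sketch covers only the raw record encoding for `long` and `string` types, not the container file layout that an Avro data file wraps around its records.

```python
# Hypothetical schema for illustration only; the record and field names
# are assumptions, not taken from the PDI documentation.
SCHEMA = {
    "type": "record",
    "name": "Customer",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "name", "type": "string"},
    ],
}


def encode_long(n):
    """Encode an Avro long: zigzag first, then 7 bits per byte, low bits first."""
    z = (n << 1) ^ (n >> 63)  # zigzag maps small negatives to small positives
    out = bytearray()
    while z > 0x7F:
        out.append((z & 0x7F) | 0x80)  # high bit set means more bytes follow
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf, pos):
    """Decode an Avro long starting at pos; return (value, next position)."""
    z, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_record(schema, record):
    """Write field values back to back, in schema order, with no field names."""
    out = bytearray()
    for field in schema["fields"]:
        value = record[field["name"]]
        if field["type"] == "long":
            out += encode_long(value)
        elif field["type"] == "string":
            raw = value.encode("utf-8")
            out += encode_long(len(raw)) + raw  # length prefix, then UTF-8 bytes
        else:
            raise ValueError("type not covered by this sketch: %r" % field["type"])
    return bytes(out)


def decode_record(schema, buf):
    """Read values in schema order; the bytes alone do not say what they hold."""
    pos, record = 0, {}
    for field in schema["fields"]:
        if field["type"] == "long":
            value, pos = decode_long(buf, pos)
        elif field["type"] == "string":
            length, pos = decode_long(buf, pos)
            value = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("type not covered by this sketch: %r" % field["type"])
        record[field["name"]] = value
    return record


payload = encode_record(SCHEMA, {"id": 42, "name": "Ada"})
decoded = decode_record(SCHEMA, payload)
print(payload)
print(decoded)
```

The encoded payload carries no field names or type tags, so a reader that lacks the schema cannot tell where one value ends and the next begins. This is the reason a schema accompanies Avro data.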

This output step creates the following files:

  • A file containing output data in the Avro format
  • An Avro schema file defined by the fields in this step

Fields can be defined manually or extracted from incoming steps.
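
As a rough picture of how a list of field definitions maps to an Avro schema file, the sketch below builds a record schema from (name, type, nullable) tuples and renders it as JSON. The function `build_avro_schema`, the record name, and the field list are hypothetical; the schema that the step actually generates may differ in naming and layout.

```python
import json


def build_avro_schema(record_name, fields):
    """Build an Avro record schema from (name, avro_type, nullable) tuples.

    Hypothetical helper for illustration; not part of PDI.
    """
    schema_fields = []
    for name, avro_type, nullable in fields:
        if nullable:
            # A nullable field is a union with "null"; listing "null" first
            # allows a null default value.
            schema_fields.append(
                {"name": name, "type": ["null", avro_type], "default": None}
            )
        else:
            schema_fields.append({"name": name, "type": avro_type})
    return {"type": "record", "name": record_name, "fields": schema_fields}


# Assumed field definitions, as they might be entered manually.
fields = [
    ("id", "long", False),
    ("name", "string", True),
    ("balance", "double", True),
]

schema = build_avro_schema("Customer", fields)
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

Saving `schema_json` to a file with the conventional `.avsc` extension would give a schema file that Avro readers can use alongside the data file.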

Select an engine

You can run the Avro Output step on the Pentaho engine or on the Spark engine. The transformation runs differently depending on the engine you select. Select one of the following options to view how to set up the Avro Output step for your selected engine.

For instructions on selecting an engine for your transformation, see Run configurations.