Hitachi Vantara Lumada and Pentaho Documentation

SSTable Output

 


The SSTable Output step writes incoming rows to a filesystem directory as an Apache Cassandra SSTable, using CQL (Cassandra Query Language) version 3.x.

 

Note: This step supports Cassandra 2.2 and later.

AEL considerations

 

When using the SSTable Output step with the Adaptive Execution Layer, the following factor affects performance and results:

  • Spark processes null values differently than the Pentaho engine. Adjust your transformation so that it handles null values according to Spark's processing rules.

Options

 

The following options are available for the SSTable Output transformation step.

  • Step name: Specify the unique name of the SSTable Output step on the canvas. You can customize the name or leave it as the default.
  • Cassandra yaml file: Specify the location of the cassandra.yaml file, the main configuration file for Cassandra. It defines node and cluster configuration details.
  • Directory: Specify the directory to write the output to. This directory points to the target table to load and must match the Keyspace and Table fields.
  • Keyspace: Specify the keyspace (database) name of the target table to load. The name specified must match the Directory field.
  • Table: Specify the table (column family) to load. The step assumes the metadata for this table was previously defined in Cassandra. The table specified must match the Directory field.
  • Incoming fields to use as the key: Specify which incoming fields to use as the key. You can use Set Fields to select the key from the names of the incoming PDI transformation fields.
  • Set Fields: Select from a list of incoming PDI transformation fields to use in Incoming fields to use as the key.
  • Buffer (MB): Specify the buffer size to use, in megabytes. A new table file is written every time the buffer is full.
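
The requirement that the Directory, Keyspace, and Table options match can be checked before running the transformation. The following is a minimal sketch, not part of PDI. It assumes that "match" means the output path ends in <keyspace>/<table>, the layout commonly expected by Cassandra bulk-loading tools. The function name directory_matches and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def directory_matches(directory: str, keyspace: str, table: str) -> bool:
    """Return True if the last two path components are keyspace/table.

    Assumption: the step expects a layout such as .../<keyspace>/<table>.
    Verify this against your own environment before relying on it.
    """
    parts = PurePosixPath(directory).parts
    return len(parts) >= 2 and parts[-2] == keyspace and parts[-1] == table


if __name__ == "__main__":
    # Hypothetical values for the Directory, Keyspace, and Table options.
    print(directory_matches("/data/sstables/sales/orders", "sales", "orders"))
    print(directory_matches("/data/sstables/sales", "sales", "orders"))
```

A check like this could run in a job before the transformation starts, so that a mismatched path is caught before any table files are written.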