This step defines the key/value pairs for Hadoop input and marks the injection point where the transformation receives input from the MapReduce framework. The rest of the transformation operates on the fields that come from this step.
When using the MapReduce Input step with the Adaptive Execution Layer, the following factor affects performance and results:
- Spark processes null values differently from the Pentaho engine. You will need to adjust your transformation so that it processes null values according to Spark's rules.
Enter the following information in the transformation step fields.
|Option|Description|
|---|---|
|Step name|Specifies the unique name of the MapReduce Input step on the canvas. A MapReduce Input step can be placed on the canvas several times; however, it represents the same MapReduce Input step. You can customize the name or leave it as the default.|
|Key field|The Hadoop input field and data type that represents the key in MapReduce.|
|Value field|The Hadoop input field and data type that represents the value in MapReduce.|
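
To make the key and value concrete, the following is a minimal Python sketch of the kind of pairs a line-oriented Hadoop input typically hands to a mapper: the key is the byte offset where a line starts and the value is the line's text. It is an illustration only, not Pentaho or Hadoop code; the function name `text_input_records`, the sample data, and the `key`/`value` field names are assumptions made for this example.

```python
def text_input_records(data: bytes):
    """Yield (key, value) pairs for line-oriented input.

    key   = byte offset at which the line starts (a long in Hadoop terms)
    value = the line's text without its line terminator
    """
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)  # advance by the full line, terminator included


# Hypothetical input split: three comma-separated lines.
sample = b"alpha,1\nbeta,2\ngamma,3\n"
records = list(text_input_records(sample))

# Each pair becomes one row with two fields, which is the shape the rest
# of a transformation would operate on (field names are illustrative).
rows = [{"key": k, "value": v} for k, v in records]

for row in rows:
    print(row)
```

In this sketch the key is numeric and the value is a string, which is why the step asks for a data type alongside each field; other input formats can supply different key and value types.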