This step defines the key/value pairs for Hadoop input and marks the point in the transformation where input from the MapReduce framework is injected. The remaining steps in the transformation operate on the fields produced by this step.
When you run the MapReduce Input step on the Spark engine, the following factor affects performance and results:
- Spark processes null values differently than the Pentaho engine. You will need to adjust your transformation to successfully process null values according to Spark's processing rules.
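As an illustration of the kind of adjustment this can involve, the sketch below contrasts two null checks on the same rows. It is plain Python, not Pentaho or Spark code, and it assumes one specific difference: that the Pentaho engine is configured to treat empty strings as null while Spark keeps the two distinct. The function names and sample rows are hypothetical.

```python
from typing import Optional

def is_null_lenient(value: Optional[str]) -> bool:
    # Assumed Pentaho-style check: an empty string counts as null.
    return value is None or value == ""

def is_null_strict(value: Optional[str]) -> bool:
    # Assumed Spark-style check: only a real null counts as null.
    return value is None

def normalize(value: Optional[str]) -> Optional[str]:
    # Hypothetical adjustment: turn empty strings into explicit nulls
    # so that the strict check matches the lenient one.
    return None if value == "" else value

# Hypothetical key/value rows as they might arrive from the input step.
rows = [("k1", "a"), ("k2", ""), ("k3", None)]

lenient_nulls = [k for k, v in rows if is_null_lenient(v)]
strict_nulls = [k for k, v in rows if is_null_strict(v)]
normalized_nulls = [k for k, v in rows if is_null_strict(normalize(v))]
```

Under these assumptions the two checks disagree on the row with an empty value, and normalizing values before the strict check removes the disagreement. Confirm the actual null rules of your Spark and Pentaho versions before relying on this pattern.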
Enter the following information in the transformation step fields.
|Option|Description|
|---|---|
|Step name|Specifies the unique name of the MapReduce Input step on the canvas. A MapReduce Input step can be placed on the canvas several times; however, it represents the same MapReduce Input step. You can customize the name or leave it as the default.|
|Key field|The Hadoop input field and data type that represents the key in MapReduce.|
|Value field|The Hadoop input field and data type that represents the value in MapReduce.|
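To make the key and value fields concrete, the sketch below imitates in plain Python how a line-oriented Hadoop text input presents records to a mapper: the key is the byte offset at which a line starts, and the value is the line's text. It assumes text input of this kind and does not call Hadoop or Pentaho; `text_records` and the sample data are hypothetical.

```python
from typing import Iterator, Tuple

def text_records(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (byte offset, line text) pairs for line-oriented input."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        # Key: where the line starts. Value: the line without its terminator.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)

# Hypothetical input file contents.
sample = b"alpha\nbeta\ngamma\n"
records = list(text_records(sample))
```

For input shaped like this, the key field would carry an integer type and the value field a string type. Other input formats produce different key and value types, so set both fields to match what your job actually reads.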