Transforming Data within a MapR Cluster
These tutorials show how to use the massively parallel, fault-tolerant MapR processing engine to transform data that resides in the cluster.
- Using Pentaho MapReduce to Parse Weblog Data in MapR: How to use Pentaho MapReduce to convert raw weblog data into parsed, delimited records.
- Using Pentaho MapReduce to Generate an Aggregate Dataset in MapR: How to use Pentaho MapReduce to transform and summarize detailed data into an aggregate dataset.
- Transforming Data within Hive in MapR: How to read data from a Hive table, transform it, and write it to a Hive table within the workflow of a PDI job.
- Transforming Data with Pig in MapR: How to invoke a Pig script from a PDI job.