Hitachi Vantara Lumada and Pentaho Documentation

PDI job tutorial

Jobs are used to coordinate ETL activities such as:
  • Defining the flow and dependencies that determine the order in which transformations run.
  • Preparing for execution by checking conditions such as, "Is my source file available?" or "Does a table exist?"
  • Performing bulk load database operations.
  • File management such as posting or retrieving files using FTP, copying files and deleting files.
  • Sending success or failure notifications through email.

For this exercise, imagine that an external system is responsible for placing your sales_data.csv input in its source location every Saturday night at 9 p.m. You want to create a job that checks that the file has arrived and then runs your transformation to load the records into the database. In a subsequent exercise, you will schedule the job to run every Sunday morning at 9 a.m.

Before you begin this exercise, complete the exercises in the PDI Transformation Tutorial.

Procedure

  1. Go to File > New > Job.

    PDI Job Window
  2. Expand the General folder and drag a Start job entry onto the graphical workspace.

    The Start job entry defines where the execution will begin.
  3. Expand the Conditions folder and add a File Exists job entry.

  4. Draw a hop from the Start job entry to the File Exists job entry.

  5. Double-click the File Exists job entry to open its Edit Properties dialog box. Click Browse and set the filter near the bottom of the window to All Files. Select sales_data.csv from the following location: ...\design-tools\data-integration\samples\transformations\files.

  6. Click OK to exit from the Open File window.

  7. Click OK to exit from the Check if a file exists window.

  8. In Spoon, expand the General folder and add a Transformation job entry.

  9. Draw a hop between the File Exists and the Transformation job entries.

  10. Double-click the Transformation job entry to open its Edit Properties dialog box.

  11. Click Browse to open the Select repository object window, where you will locate the transformation you created in the PDI Transformation Tutorial.

  12. Expand the repository tree to find your sample transformation. Select it and click OK.

    Select repository object window
  13. Save your job as Sample Job.

  14. Click the Run icon in the toolbar. When the Run Options window appears, choose the Local environment type and click Run. The Execution Results panel should open, showing you the job metrics and log information for the job execution.

    Job Sample
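
The control flow that the procedure above builds (Start, then File Exists, then Transformation) can be pictured with a small Python sketch. This is only an illustration of the logic, not PDI code or a PDI API: the function name run_sample_job, the run_transformation callback, and the returned status strings are all hypothetical, and the sketch assumes the hop leaving File Exists is followed only when the check succeeds.

```python
from pathlib import Path
from typing import Callable


def run_sample_job(source_file: Path, run_transformation: Callable[[], bool]) -> str:
    """Illustrative model of the Start -> File Exists -> Transformation flow.

    All names here are hypothetical; PDI itself executes the real job.
    """
    # Start: execution begins here.

    # File Exists: continue only if the input file has arrived
    # (assumes the hop is followed only when the check succeeds).
    if not source_file.is_file():
        return "file_missing"

    # Transformation: run the load and report whether it succeeded.
    if run_transformation():
        return "success"
    return "transformation_failed"
```

For example, calling run_sample_job(Path("sales_data.csv"), lambda: True) returns "success" when the file is present in the working directory and "file_missing" when it is not. In the real job, the Execution Results panel reports the equivalent outcome for each job entry.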