Working with Transformations
Overview
This section explains how to create, save, and run a transformation. See Getting Started with PDI for a comprehensive, "real world" exercise for creating, running, and scheduling transformations and jobs.
Create a Transformation
Follow these instructions to create your transformation.
- Click File > New > Transformation or press CTRL+N.
- Go to the Design tab. Expand the folders or use the Steps field to search for a specific step.
- Either drag a step to the Spoon canvas or double-click it.
- Double-click the step to open its properties window. For help on filling out the window, click the Help button that is available in each step.
- To add another step, either drag the step to the Spoon canvas or double-click it.
- If you drag the step to the canvas, you can add a hop by pressing the SHIFT key and drawing a hop from one step to the other.
- If you double-click it, the step appears on the canvas with a hop already connected to your previous step.
- When finished, save the transformation. See Save a Transformation Locally or Save a Transformation Remotely for more details.
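The difference between dragging and double-clicking a step can be pictured with a small model. The sketch below is only an illustration in Python, not a PDI API: the `Transformation` class, its method names, and the step names are hypothetical and chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Transformation:
    """Toy model of a transformation: named steps joined by directed hops."""
    name: str
    steps: list = field(default_factory=list)
    hops: list = field(default_factory=list)  # (from_step, to_step) pairs

    def drag_step(self, step):
        """Mimics dragging a step onto the canvas: no hop is created."""
        self.steps.append(step)

    def add_hop(self, source, target):
        """Mimics drawing a hop by hand from one step to another."""
        if source not in self.steps or target not in self.steps:
            raise ValueError("both steps must already be on the canvas")
        self.hops.append((source, target))

    def double_click_step(self, step):
        """Mimics double-clicking a step: it is hopped from the previous step."""
        previous = self.steps[-1] if self.steps else None
        self.steps.append(step)
        if previous is not None:
            self.hops.append((previous, step))


demo = Transformation("demo")
demo.double_click_step("CSV file input")        # first step, nothing to connect to
demo.double_click_step("Select values")         # hop added from "CSV file input"
demo.drag_step("Table output")                  # placed on the canvas without a hop
demo.add_hop("Select values", "Table output")   # hop drawn by hand
print(demo.hops)
```

In this model, a dragged step stays disconnected until a hop is added explicitly, which mirrors the SHIFT-drag action described above.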
Adjust Transformation Properties
You can adjust the parameters, logging options, dates, dependencies, monitoring, settings, and data services for transformations. To view the transformation properties, press CTRL+T or right-click the canvas and select Transformation settings from the menu that appears.
Save a Transformation Locally
Follow these instructions to save a transformation locally on your file system.
- In Spoon, select File > Save.
- Enter the transformation name in the Save As window and select the location.
- Click OK. The transformation is saved.
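A locally saved transformation is an XML file (typically with a .ktr extension), so it can be inspected outside Spoon. The following Python sketch lists the steps and hops found in such a file. The element names (`info/name`, `step`, `order/hop`, `from`, `to`) are assumptions about the file layout, and the sample document is hand-written for this example; a real file contains many more elements.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a saved transformation file (assumed layout).
SAMPLE_KTR = """<transformation>
  <info><name>demo</name></info>
  <order>
    <hop><from>Read rows</from><to>Write rows</to><enabled>Y</enabled></hop>
  </order>
  <step><name>Read rows</name><type>CsvInput</type></step>
  <step><name>Write rows</name><type>TableOutput</type></step>
</transformation>"""


def summarize(ktr_xml):
    """Return the transformation name, its steps, and its hops."""
    root = ET.fromstring(ktr_xml)
    steps = [(s.findtext("name"), s.findtext("type")) for s in root.findall("step")]
    hops = [(h.findtext("from"), h.findtext("to")) for h in root.findall("order/hop")]
    return {"name": root.findtext("info/name"), "steps": steps, "hops": hops}


summary = summarize(SAMPLE_KTR)
print(summary)
```

To try this on a real file, read its contents into a string and pass it to `summarize`; adjust the element paths if your file is laid out differently.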
Save a Transformation Remotely
Follow these instructions to save a transformation to a repository.
- Connect to a repository.
- In Spoon, click File > Save As. The Transformation Properties window appears.
- In the Transformation Name field, enter the transformation name.
- In the Directory field, click the Folder Icon to select a repository folder where you will save your transformation.
- Click OK to exit the Transformation Properties dialog box. The Enter Comment dialog box appears.
- Enter a comment, then click OK. The transformation is saved.
Open a Transformation
Follow these instructions to open a transformation.
- In Spoon, select File > Open.
- If you are opening a file from the repository, select the file from the Select a Repository Object window, then click OK. Otherwise, select the file from the Open window, then click OK.
- The transformation appears on the canvas.
If you get a message indicating that a plugin is missing, see the Fix Transformation and Job Problems section for more details.
Using the Transformation Menu
Right-click the transformation canvas to view the transformation menu.
Menu Item | Description |
---|---|
New Hop | Creates a new hop. |
Open Referenced Object | Allows you to map a sub-transformation. Mapping a sub-transformation is covered in detail in the Reusing Transformation Mapping Flows Between Steps. |
Edit ... | Shows the configuration window for the step or transformation. |
Description ... | Allows you to add a description to the step. |
Data Movement | Describes the way data moves through the transformation when there is more than one hop. There are three options: |
Change Number of Copies to Start | Starts several instances of a step in parallel. |
Copy | Copies selected items to the clipboard. |
Duplicate | Makes a copy of the selected items, then pastes them to the canvas. |
Delete | Deletes selected items from the canvas. |
Hide | Hides the step or entry from the Spoon canvas. Caution: if you hide the step or entry, you will need to open the transformation or job XML file and edit it by hand to make the step or entry visible again. For more details, see the troubleshooting section. |
Detach | Detaches the step or entry from the transformation or job. |
Input Fields | Shows metadata, like the field name and type, for fields that come into the step. |
Output Fields | Shows metadata, like the field name and type, for fields that go out of the step. |
Sniff Test During Execution | The sniff test displays data as it travels from one step to another in the stream. To use this, right-click a step in the transformation as it runs and select Sniff Test During Execution. There are three options in this menu: For more information on how to use this tool, see the Sniff Test Tool article. |
Check Selected Step(s) | Checks transformation steps for problems that could interfere with successfully running the transformation. Right-click the transformation step that you want to check and click Check Selected Step(s). Warnings and errors appear in the Results of transformation checks window. |
Error Handling | Indicates how to apply error handling for a step. When this is selected, the Step error handling settings window appears. |
Preview | Allows you to preview the results of the transformation. Launches the Transformation Debug Dialog. |
Align/Distribute | Arranges steps or entries on the canvas so that they are aligned properly or distributed evenly. This helps create a transformation or job that is easier to read. Align refers to where the steps or entries are placed along the x (horizontal) or y (vertical) axis. Distribute makes the horizontal and vertical spacing between steps or entries consistent. Typically, you turn on the grid, then move the steps or entries on the canvas so that they form a pattern, such as a straight or branching line. Then you select steps or entries and apply the following options as needed. |
Data Services | Allows you to create, edit, delete, or test a Pentaho Data Service. A Pentaho Data Service allows others to obtain the results of a transformation, even if they do not have Spoon or the DI Server installed. The Pentaho Data Service is discussed in detail in Turn a Transformation into a Data Service. |
Mapping … | Provides a way for you to map target fields from the step to source columns in a database. When you click this option, the Mapping window appears, which contains these fields: When you click OK, the Mapping window closes and a Select / Rename Values step appears on the canvas. (It is usually named after the step that you right-clicked.) The Select / Rename Values window contains the mappings. If you weren't able to make mappings, the step still appears, but the properties are blank. |
Partitions… | Partitions split data into subsets according to a rule that is applied on a row of data. Partitions are discussed in detail in the Partitions article. |
Clusters … | Clusters allow you to create Carte Clusters. For more information, see Use Carte Clusters to Run Transformations and Jobs. |
Model | Generates a model of the data in your transformation. The data should have a dimension or a measure. Right-click on a step or entry that has data that can be modeled, such as the Table Output step. The data appears in Pentaho Metadata Editor. |
Visualize | Generates a visualization of the data in your transformation. Right-click on a step or entry that has data that can be visualized, such as the Table Output step. Two sub-options appear when this menu option is selected: |
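The Hide entry in the table above notes that a hidden step can only be restored by editing the transformation XML by hand. The Python sketch below shows one way that edit might be scripted. It assumes that each `step` element carries a `GUI/draw` flag set to `N` when the step is hidden; that layout is an assumption to verify against your own file, and the sample XML is hand-written for this example. It covers transformation steps only, not job entries.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the GUI/draw flag is an assumed part of the file layout.
SAMPLE = """<transformation>
  <step><name>Read rows</name><GUI><xloc>100</xloc><yloc>80</yloc><draw>N</draw></GUI></step>
  <step><name>Write rows</name><GUI><xloc>300</xloc><yloc>80</yloc><draw>Y</draw></GUI></step>
</transformation>"""


def unhide_steps(xml_text):
    """Return (updated_xml, names_of_steps_that_were_hidden)."""
    root = ET.fromstring(xml_text)
    restored = []
    for step in root.iter("step"):
        draw = step.find("GUI/draw")
        if draw is not None and (draw.text or "").strip() == "N":
            draw.text = "Y"  # make the step visible on the canvas again
            restored.append(step.findtext("name"))
    return ET.tostring(root, encoding="unicode"), restored


updated, restored = unhide_steps(SAMPLE)
print(restored)
```

Keep a backup copy of the file before writing any modified XML back to disk.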
Run a Transformation
When you are done modifying a transformation, you can run it by clicking the Run button from the main menu toolbar, or by pressing F9. There are three options that allow you to decide where you want your transformation to be executed:
- Local Execution: The transformation executes on the machine you are currently using.
- Execute remotely: Allows you to specify a remote server where you want the execution to take place. This feature requires either a running Data Integration Server or a remote machine that has Data Integration installed and is running the Carte service. To use remote execution, you must first set up a slave server (see Use Carte Clusters to Run Transformations and Jobs).
- Execute clustered: Allows you to execute a transformation in a clustered environment.
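Local execution can also be scripted outside Spoon, for example with the Pan command-line tool that ships with PDI. This section does not cover Pan, so treat the following Python sketch as an assumption-laden illustration: the `-file`, `-level`, and `-param:` options follow the commonly documented Pan syntax, while the script location, file path, and parameter name are hypothetical. The function only builds the command; it does not run it.

```python
import shlex


def build_pan_command(ktr_path, params=None, level="Basic", pan="./pan.sh"):
    """Build a Pan command line for running one transformation file locally."""
    command = [pan, f"-file={ktr_path}", f"-level={level}"]
    # Named parameters are passed one option at a time, sorted for stable output.
    for name, value in sorted((params or {}).items()):
        command.append(f"-param:{name}={value}")
    return command


# Hypothetical file path and parameter name, used only for illustration.
command = build_pan_command("/tmp/demo.ktr", {"INPUT_DIR": "/tmp/in"})
print(shlex.join(command))
```

To execute the command, pass the list to `subprocess.run(command, check=True)` from the directory that contains the Pan script. Check the option names against the Pan documentation for your PDI version first.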