Work with Jobs
In the PDI client (Spoon), you can develop jobs that orchestrate your ETL activities. The entries used in your jobs define the individual ETL elements (such as transformations). The jobs containing your entries are stored in .kjb files. You can access these .kjb files through the PDI client.
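Because a .kjb file is XML, it can also be inspected outside the PDI client. The following is a minimal Python sketch, not an official Pentaho tool. It assumes a layout in which a root `job` element holds an `entries` element with one `entry` per job entry, each carrying `name` and `type` child elements. The inline sample and the function name `list_entries` are hypothetical, so compare them against a file saved by your own PDI version before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a saved job file.
# Assumption: <job>/<entries>/<entry> with <name> and <type> children.
SAMPLE_KJB = """<job>
  <name>sample_job</name>
  <entries>
    <entry><name>Start</name><type>SPECIAL</type></entry>
    <entry><name>Load sales</name><type>TRANS</type></entry>
  </entries>
</job>"""


def list_entries(kjb_xml):
    """Return (name, type) pairs for each entry found in the job XML."""
    root = ET.fromstring(kjb_xml)
    return [
        (entry.findtext("name", default=""), entry.findtext("type", default=""))
        for entry in root.findall("./entries/entry")
    ]


if __name__ == "__main__":
    for name, entry_type in list_entries(SAMPLE_KJB):
        print(f"{entry_type}: {name}")
```

To try it on a real job, read the file contents into a string and pass that string to `list_entries` in place of the sample.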
Create a Job
Follow these instructions to create your job.
- Perform one of the following actions:
  - Click File > New > Job.
  - Click the New file icon in the toolbar and select Job.
  - Press CTRL+ALT+N.
- Go to the Design tab. Expand the folders or use the Entries field to search for specific entries.
- Either drag or double-click an entry to place it on the PDI client canvas.
- Double-click the entry to open its properties window. For help on filling out the window, click the Help button that is available in each entry.
- To add another entry, either drag or double-click the entry to place it on the PDI client canvas.
  - If you dragged the entry to the canvas, you can add a hop by pressing the SHIFT key and drawing a hop from one entry to the other.
  - If you double-clicked the entry, it appears on the canvas with a hop already connected to your previous entry.
- When finished, save the job.
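The entries and hops created in the steps above end up in the saved job file, so the connections can be checked without opening the canvas. The sketch below is an illustration under stated assumptions: it assumes hops are stored as `hop` elements with `from` and `to` children under a `hops` element, and it walks them breadth-first from a start entry. The sample XML, the entry names, and the function name `reachable_entries` are hypothetical, and the walk shows only which entries are connected, not the order in which PDI would run them.

```python
import xml.etree.ElementTree as ET
from collections import deque

# Hypothetical sample. Assumption: <job>/<hops>/<hop> with <from> and <to>.
SAMPLE_HOPS = """<job>
  <hops>
    <hop><from>Start</from><to>Load sales</to></hop>
    <hop><from>Load sales</from><to>Send mail</to></hop>
  </hops>
</job>"""


def reachable_entries(kjb_xml, start="Start"):
    """List entry names reachable from `start` by following hops."""
    root = ET.fromstring(kjb_xml)

    # Map each entry name to the entries its outgoing hops point at.
    successors = {}
    for hop in root.findall("./hops/hop"):
        successors.setdefault(hop.findtext("from"), []).append(hop.findtext("to"))

    # Breadth-first walk; `seen` guards against loops in the job.
    order, seen, queue = [], {start}, deque([start])
    while queue:
        current = queue.popleft()
        order.append(current)
        for nxt in successors.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


if __name__ == "__main__":
    print(reachable_entries(SAMPLE_HOPS))
```

An entry that is missing from the result has no hop path leading to it from the start entry, which is a quick way to spot an entry that was dragged onto the canvas but never connected.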
Open a Job
How you open an existing job depends on whether you are using PDI locally on your machine or are connected to a repository. If you are connected to a repository, you are accessing your file remotely on the Pentaho Server. Another option is to open a job over HTTP with the Virtual File System (VFS) browser.
If you get a message indicating that a plugin is missing, see the Troubleshooting Transformation Steps and Job Entries section for more details.
If you recently had a file open, you can also use File > Open Recent.
On Your Local Machine
Follow these instructions to open a job on your local machine.
- In the PDI client, perform one of the following actions:
  - Select File > Open.
  - Click the Open file icon in the toolbar.
  - Press CTRL+O.
- Select the file from the Open window, then click Open.
The Open window closes when your job appears in the canvas.
In the Pentaho Repository
Follow these instructions to access a job in the Pentaho Repository.
- Make sure you are connected to a repository.
- In the PDI client, perform one of the following actions to access the Open repository browser window:
  - Select File > Open.
  - Click the Open file icon in the toolbar.
  - Press CTRL+O.
- If you recently opened a file, use Recents to navigate to your job.
- Either use the search box to find your job, or use the left panel to navigate to the repository folder containing your job.
- Perform one of the following actions:
  - Double-click your job.
  - Select it and press the Enter key.
  - Select it and click Open.
The Open window closes when your job appears in the canvas.
If you select a folder or file in the Open window, you can click on it again to rename it.
With the VFS Browser
Select File > Open URL to access files using HTTP with the VFS browser. The URL you specify identifies the protocol to use in the browser.
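Since the protocol is taken from the URL itself, it can help to check which scheme a URL carries before pasting it into the dialog. This is a small sketch using Python's standard `urllib.parse`; the helper name `url_protocol` and the example URL are hypothetical, and the sketch does not reproduce how the VFS browser itself resolves URLs.

```python
from urllib.parse import urlparse


def url_protocol(url):
    """Return the scheme (protocol) part of a URL in lowercase."""
    return urlparse(url).scheme.lower()


if __name__ == "__main__":
    # Hypothetical location of a job file served over HTTP.
    print(url_protocol("http://example.com/jobs/sample_job.kjb"))
```

An empty result means the text has no scheme prefix, in which case the protocol cannot be inferred from it.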
Save a Job
How you save a job depends on whether you are using PDI locally on your machine or are connected to a repository. If you are connected to a repository, you are saving your file remotely on the Pentaho Server.
On Your Local Machine
Follow these instructions to save a job on your local machine.
- In the PDI client, perform one of the following actions:
  - Select File > Save.
  - Click the Save current file icon in the toolbar.
  - Press CTRL+S.
If you are saving your job for the first time, the Save As window appears.
- Specify the job's name in the Save As window and select the location.
- Either press the Enter key or click Save.
The Save As window closes when your job is saved.
In the Pentaho Repository
Follow these instructions to save a job to the Pentaho Repository.
- Make sure you are connected to a repository.
- In the PDI client, perform one of the following actions:
  - Select File > Save As.
  - Click the Save current file icon in the toolbar.
  - Press CTRL+S.
If you are saving your job for the first time, the Save repository browser window appears.
- Navigate to the repository folder where you want to save your job.
- Specify the job's name in the File name field.
- Either press the Enter key or click Save.
The Save window closes when your job is saved.
Adjust Job Properties
You can adjust the parameters, logging options, settings, and transactions for jobs. To view the job properties, press CTRL+J or right-click the canvas and select Properties from the menu that appears.
Use the Job Menu
Right-click any entry in the job canvas to view the job menu.
Menu Item | Description |
---|---|
New Hop | Creates a new hop. |
Edit | Shows the configuration window for the entry. |
Description | Allows you to add a description to the entry. |
Open Referenced Object | Opens referenced transformations. |
Copy | Copies selected items to the clipboard. |
Duplicate | Makes a copy of the selected items, then pastes them to the canvas. |
Delete | Deletes selected items from the canvas. |
Hide | Hides the entry from the PDI client canvas. Caution: if you hide the entry, you will need to open the job XML file and edit it by hand to make the entry visible again. For more details, see the troubleshooting section. |
Detach | Detaches the entry from the job. |
Align/Distribute | Arranges entries on the canvas so that they are aligned or evenly distributed, which makes the job easier to read. Align sets where the selected entries are positioned along the x (horizontal) or y (vertical) axis. Distribute makes the horizontal and vertical spacing between entries consistent. Typically, you turn on the grid, then move the entries on the canvas so that they form a pattern, such as a straight or branching line. You then select the entries and apply the align or distribute options as needed. |
Restartable Checkpoint | Restarts a failed job at specific checkpoints, instead of rerunning the entire job from the beginning. You add checkpoints at hops that connect one job entry to another. Checkpoints are addressed in detail in the Use Checkpoints to Restart Jobs topic. |
Run Next Entries in Parallel | Allows you to launch job entries in parallel (on the same machine or remotely). |
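The Hide caution above means a hidden entry can only be brought back by editing the job XML. The sketch below shows one way that hand edit might be scripted. It rests on an assumption that should be verified against your own .kjb file and the troubleshooting section: that each `entry` element has a `draw` child whose value is `N` when the entry is hidden and `Y` when it is shown. The sample XML and the function name `unhide_entries` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Assumption: <draw>N</draw> marks a hidden entry.
SAMPLE_HIDDEN = """<job>
  <entries>
    <entry><name>Start</name><type>SPECIAL</type><draw>Y</draw></entry>
    <entry><name>Load sales</name><type>TRANS</type><draw>N</draw></entry>
  </entries>
</job>"""


def unhide_entries(kjb_xml):
    """Set <draw> to Y on hidden entries; return (new_xml, restored_names)."""
    root = ET.fromstring(kjb_xml)
    restored = []
    for entry in root.findall("./entries/entry"):
        draw = entry.find("draw")
        # Only touch entries that carry an explicit <draw>N</draw>.
        if draw is not None and (draw.text or "").strip().upper() == "N":
            draw.text = "Y"
            restored.append(entry.findtext("name", default=""))
    return ET.tostring(root, encoding="unicode"), restored


if __name__ == "__main__":
    new_xml, restored = unhide_entries(SAMPLE_HIDDEN)
    print("Restored entries:", restored)
```

If you adapt this to a real job, run it against a backup copy of the file and reopen the result in the PDI client to confirm the entry is visible again.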