Inspecting Your Data
When working with your transformation, you can gain valuable insights by visualizing and interacting with your data in many ways. The ability to quickly inspect step data reduces the amount of iterative work needed while building your transformation and enables you to rapidly publish a data source to share with your team or across your organization.
Depending on your operating system, you may need to upgrade your Web browser for the full experience. See our list of supported components here.
Begin Inspecting
Begin inspecting your transformation by clicking on a step. This displays the fly-out inspection bar at the top of the canvas area. The bar displays the name of the step selected and offers two options:
- Inspect Data - Lets you inspect the data of a step once the transformation has run.
Note: This option is not available until after you run your transformation.
- Run and Inspect Data - Runs the transformation up to the selected step, then lets you inspect your data.
Additionally, you can begin inspecting in the following ways:
- Step Context Menu - Right-click on a step and choose either Inspect Data or Run and Inspect Data.
- Preview Data Panel - Select the Preview Data tab. Click the Inspect Data button located at the top right of the Preview Data bar.
- Actions Menu - Select a step. From the Menu bar, click Action>Inspect Data or Action>Run and Inspect Data.
- Keyboard Shortcuts - Select a step. Then using your keyboard:
  - In Windows, press either Shift+Ctrl+F9 (Inspect Data) or Ctrl+F9 (Run and Inspect Data).
  - In OS X, press either Shift+Command+F9 (Inspect Data) or Command+F9 (Run and Inspect Data).
Tour the Environment
When you decide to inspect your data, the transformation presents options to visualize your data.
By default, table data is displayed with all available fields selected in Stream View.
The following sample screen shows a visualization using data field values from the default Stream View for a step.
Use the number locators in the sample screen to reference the sections of the inspection environment.
Explore with Visualizations
When you begin inspecting your data, you are presented with the Stream View with all available data fields selected. The selected data fields are represented in the Canvas area by a flat table. To reduce the number of data fields selected, click anywhere on a data field name. The blue dot to the left of the data field name disappears, indicating that the field is no longer selected. In some cases, it may be faster to deselect all data fields first by clicking the Clear All action, then select only the data fields you want to inspect. Your selections are listed in the order in which you select them.
Once you have the desired data fields selected, you can change the table to a different visualization type by using the Visualization Selector. Alternatively, you can create a new visualization by clicking the plus symbol button located to the right of the current tab. Once you have created a new visualization, switch to Model View to display a multidimensional representation of your selected fields. If you select a visualization that requires a multidimensional model, the view automatically switches to Model View.
You can customize your model by adding, moving, and deselecting fields in the Layout panel, or by drilling down into fields in the visualization itself. When you double-click on drill down fields in your visualization, these fields display in the Applied Filters panel. The Layout panel automatically updates based on the selected filters. To remove a filter, hover over the field in the Applied Filters panel and click the Close button.
You can keep tabs open between sessions and return to the inspection canvas at any time to fine-tune your transformation until you are satisfied with the results. When you exit the inspection canvas, the step is displayed with the Inspection icon in the transformation canvas so you know it contains a remembered inspection session.
Note that when you reopen a remembered inspection session where some of the selected fields were removed from the transformation or step, the tabs using those fields are now marked as ‘invalid’. To validate those tabs, you can deselect the fields from the visualization in the inspection canvas, or exit your session and add the fields back to the transformation or step itself. The only exception is the flat table, where all invalid fields are removed automatically.
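The validation rule described above can be pictured with a small model. The following Python sketch is purely illustrative and is not Pentaho code: the `Tab` class, the `revalidate` function, and the field names are hypothetical, and the sketch assumes that a tab is invalid exactly when it uses at least one field that is no longer available.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tab:
    """Hypothetical stand-in for one visualization tab in an inspection session."""
    name: str
    fields: List[str]
    is_flat_table: bool = False
    invalid: bool = False


def revalidate(tabs: List[Tab], available_fields: List[str]) -> None:
    """Re-check each tab against the fields the step still provides."""
    available = set(available_fields)
    for tab in tabs:
        if tab.is_flat_table:
            # The flat table is the exception: missing fields are dropped
            # automatically, so the tab never becomes invalid.
            tab.fields = [f for f in tab.fields if f in available]
            tab.invalid = False
        else:
            # Any other tab is marked invalid while it still uses a missing field.
            tab.invalid = any(f not in available for f in tab.fields)


# A remembered session with two tabs; "sales" was later removed from the step.
table = Tab("Table", ["region", "sales", "year"], is_flat_table=True)
chart = Tab("Bar chart", ["region", "sales"])
revalidate([table, chart], ["region", "year"])
state_after_reopen = (list(table.fields), table.invalid, chart.invalid)

# Deselecting the missing field from the chart makes the tab valid again.
chart.fields.remove("sales")
revalidate([table, chart], ["region", "year"])
state_after_deselect = (list(chart.fields), chart.invalid)
```

In this model, adding the field back to the step (passing it again in `available_fields`) would validate the chart tab just as deselecting it does, which mirrors the two options described above.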
Once you are satisfied with your step data, you can make the content available for further collaboration by publishing a data source.
Publish for Collaboration
When you are ready to make your content available for others, you can publish it as a data source. The data source uses a data service that is automatically created on the step, and is available to other tools. You must be connected to your repository to publish the data source.
Perform the following steps to publish your content:
- Click the Publish button at the top right of the Header bar. The Publish Data Source window opens.
- Click Get Started to open the Publish Details window.
Enter the data source information in the following fields:
Fields | Description |
---|---|
Data Source Name | The name used by other Pentaho applications when accessing your data source. |
Server | The default value for this field is your current repository. You can select other repository connections, if you have created them, through the Repository Manager. |
URL | The base URL string used to connect to the server. |
User Name | The user name required to access the server. The user must also have publishing permissions. |
Password | The password associated with the provided user name. |
- When you are done, click Finish.
- Once your data source is created, a confirmation appears. The data source should now be available on the server. Click Close to continue inspecting your data, or click View this in User Console to open a new browser window and work with the data source in Analyzer.