Skip to main content

Pentaho+ documentation has moved!

The new product documentation portal is here. Check it out now at docs.hitachivantara.com

 

Hitachi Vantara Lumada and Pentaho Documentation

Pentaho Schema Workbench

Parent article

With a physical multidimensional data model in place, you must create a logical model that maps to it. A Mondrian schema is essentially an XML file that performs this mapping, thereby defining a multidimensional database structure. You can create Mondrian schemas using the Pentaho Schema Workbench.

In a very basic scenario, you will create a Mondrian schema with one cube that consists of a single fact table and a few dimensions, each with a single hierarchy consisting of a handful of levels. More complex schemas may involve multiple virtual cubes, and instead of mapping directly to the single fact table at the center of a star schema, they might map to views or inline tables instead.

All of the Mondrian XML elements are documented in this section in both a single quick reference list and a full individual reference piece for each element. Primarily you will be using Pentaho Schema Workbench to create Mondrian schemas graphically, though you can do advanced schema design through the Data Source Wizard in the User Console or through Schema Workbench later on.

Get started with the Schema Workbench

Before you start using Schema Workbench, you should be aware of the following points:

  • You start Schema Workbench by executing the /pentaho/design-tools/schema-workbench/workbench script. On Linux and OS X, this is a SH file; on Windows this is a BAT file.
  • You must be familiar with your physical data model before you use Schema Workbench. If you don't know which are your fact tables and how your dimensions relate to them, you will not be able to make significant progress in developing a Mondrian schema.
  • When you make a change to any field in Schema Workbench, the change will not be applied until you click out of that field such that it loses the cursor focus.
  • Schema Workbench is designed to accommodate multiple sub-windows. By default they are arranged in a cascading fashion. However, you may find more value in a tiled format, especially if you put the JDBC Explorer window next to your Schema window so that you can see the database structure at a glance. Simply resize and move the sub-windows until they are in agreeable positions.

Add a data source

Your data source must be available, its database driver JAR must be present in the /pentaho/design-tools/schema-workbench/drivers/ directory, and you should know or be able to obtain the database connection information and user account credentials for it.

Follow the below process to connect to a data source in Schema Workbench.

Procedure

  1. Establish a connection to your data source by going to the Options menu and selecting Connection.

    The Database Connection dialog box appears.
  2. Select your database type, then enter in the necessary database connection information, then click Test. When you've verified that the connection settings work, click OK.

    The database connection information includes the database name, port number, and user credentials. If you don't know what to type into any of these fields, consult your database administrator or database vendor's documentation.
    NoteThe Require Schema check box, when selected in the Options menu, puts Schema Workbench into a mode where unpopulated elements appear in the schema.
    NoteIf you are using an Oracle data source, selecting Require Schema will dramatically improve your analysis schema load time.
  3. If you required a database schema in the previous step, you must now define it by going to the Options section of the database dialog box, and creating a parameter called FILTER_SCHEMA_LIST with a value of the schema name you want to use.

Results

Your data is now available to Schema Workbench, and you can proceed with creating a Mondrian schema.

Remove Mondrian data sources

As you phase out old analysis schemas, you will have to manually remove their data source entries in the Data Source Wizard in the User Console.

Procedure

  1. Log in to the User Console with administrator credentials.

  2. On the Home page of the User Console, click Manage Data Sources.

    The Data Source Wizard appears.
  3. Click to highlight the data source to be deleted, and click Remove.

Results

The data source is removed and is no longer available for use.

Create a Mondrian schema

In order to complete this process, you should have already connected to your data source in Schema Workbench.

This section explains the basic procedure for creating a barebones Mondrian schema using Schema Workbench. If you are confused about the definition or application of any of the elements discussed below, refer to the Mondrian Schema Element Quick Reference and the individual reference pieces it links to.

Procedure

  1. To create a new Mondrian schema, click the New button, or go to the File New Schema.

    A new schema sub-window will appear. Resize it to fit your preference.
  2. It's easier to visualize your physical data model if you have it in front of you. Turn on the JDBC Explorer from the New section of the File menu and position it according to your preference. If you have a third-party database visualization tool that you are more familiar with, use that instead.

    The JDBC Explorer is not interactive; it only shows the table structure of your data source so that you can see at a glance what the names of the columns and rows in it.
  3. Typically your first action when creating a schema is to add a cube. Right-click the Schema icon in the schema window, then select Add cube from the context menu. Alternatively you can click the New Cube button in the toolbar.

    A new default cube will show up in your schema.
  4. Give your cube a name.

  5. Add a table by clicking the New Table button, or by right-clicking your cube, then selecting Add Table. This will be your fact table. Alternatively, you can select View or Inline Table if these are the data types you need for your fact table.

  6. Click the Table entry in the name field of your new table, and select or type in the name of the table in your physical model that you want to use for this cube's fact table.

  7. Add a dimension by right-clicking the cube, then selecting Add Dimension, or by clicking the New Dimension button.

    When you add a dimension, a new hierarchy is automatically created for it.
  8. Type in a friendly name for this dimension in the name field.

  9. Select a foreign key for this dimension from the foreignKey drop-down box, or just type it into the field.

  10. To configure the hierarchy, expand the dimension by clicking the lever icon on the left side of the dimension's tree entry, then click on New Hierarchy 0. Choose a primaryKey or primaryKey Table.

  11. Add a table to the hierarchy by right-clicking the hierarchy, then selecting Add Table from the context menu.

  12. Choose a column for the name attribute.

  13. Add a level to the hierarchy by right-clicking the hierarchy, then selecting Add Level from the context menu.

  14. Give the level a name and choose a column for it.

  15. Add a member property to the level by right-clicking the level, then selecting Add Property from the context menu.

  16. Give the property a name and choose a column for it.

  17. Add a measure to the cube by right-clicking the cube and selecting Add Measure from the context menu.

  18. Choose a column that you want to provide values for, then select an aggregator to determine how the values should be calculated.

Results

These instructions have shown you how to use Schema Workbench's interface to add and configure basic Mondrian schema elements.

Next steps

When your schema is finished, you should test it with a basic MDX query such as:
select {[Dim1].[All Dim1s]} on rows, {[Measures].[Meas1]} on columns from [CubeName]

In order to use your schema as a data source in any Pentaho Business Analytics client tools, you must publish it to the Pentaho Server. To do this, select Publish from the File menu, then enter in your Pentaho Server connection information and credentials when requested.

Edit a schema

There are two advanced tools in Schema Workbench that enable you to work with raw MDX and XML. The first is the MDX query editor, which can query your logical data model in real time. To open this view, go to the File menu, select New, then click MDX Query.

The second is XML viewing mode, which you can get to by clicking the rightmost icon (the pencil) in the toolbar. This replaces the name/value fields with the resultant XML for each selected element. To see the entire schema, select the top-level schema entry in the element list on the left of the Schema Workbench interface. Unfortunately you won't be able to edit the XML in this view; if you want to edit it by hand, you'll have to open the schema in an XML-aware text editor.

Add business groups

The available fields list in Analyzer organizes fields in folders according to the AnalyzerBusinessGroup annotation. To implement business groups, add these annotations to your member definitions appropriately. If no annotation is specified, then the group defaults to Measures for measures and the hierarchy name/caption for attributes.

Below is an example that puts Years, Quarters and Months into a Time Periods business group:

...
<Level name="Years" ... >
   <Annotations><Annotation name="AnalyzerBusinessGroup">Time Periods</Annotation></Annotations>
</Level>
<Level name="Quarters" ... >
   <Annotations><Annotation name="AnalyzerBusinessGroup">Time Periods</Annotation></Annotations>
</Level>
<Level name="Months" ... >
   <Annotations><Annotation name="AnalyzerBusinessGroup">Time Periods</Annotation></Annotations>
</Level>
...

The AnalyzerBusinessGroup annotation is supported on the following schema elements:

  • Level
  • Measure
  • CalculatedMember
  • VirtualCubeMeasure

Add field descriptions

By adding description attributes to your Mondrian schema elements, you can enable tooltip (mouse-over) field descriptions in Analyzer reports.

<Level name="Store Country" column="store_country" uniqueMembers="true" caption="%{foodmart.dimension.store.country.caption}" 
            description="%{foodmart.dimension.store.country.description}"/>

CautionRemove the line-wrap or this may not work. These variables will not work unless you localize schemas.

This attribute can be set on the following schema elements:

  • Level
  • Measure
  • CalculatedMember

Build a schema and detect errors

Analysis schemas are built by publishing them to the Pentaho Server. Each schema is validated to make sure that there are no errors before it is built; if there are any, they'll be shown to you and the schema will fail to publish. If you want to see the errors marked in Schema Workbench before you publish, go to the Options menu and select Require Schema. When this option is checked, schema validation will happen as new elements are added, and any errors will show as a Red X next to the offending element.

Adapt Mondrian schemas to work with Analyzer

A few Mondrian features are not yet functional in Analyzer. You must adapt your schemas to adjust for these limitations and enable some Analyzer functions to work properly.

Learn more

Localization and internationalization of analysis schemas

You can create internationalized message bundles for your analysis schemas and deploy them with your Pentaho web applications.

Learn more