Hitachi Vantara Lumada and Pentaho Documentation

Data Integration Operations Mart

The PDI Operations Mart is a centralized data mart that stores job or transformation log data for auditing, reporting, and analysis. The PDI Operations Mart enables you to collect and query Data Integration log data and then use the Pentaho Server tools to examine the log data in reports, charts, and dashboards. The data mart is a collection of tables organized as a data warehouse using a star schema. Together, the dimension tables and a fact table represent the logging data. These tables must be created in the PDI Operations Mart database. Pentaho provides SQL scripts to create these tables for the PostgreSQL database. A Data Integration job populates the time and date dimensions.

Note: For optimal performance, be sure to clean the operations mart periodically.

Getting Started

Installation of DI Operations Mart depends on the following conditions and prerequisites:

Database Requirement

Before proceeding with the DI Operations Mart installation steps below, ensure that your Pentaho Server and Repository are configured with one of the following database types:

  • PostgreSQL
  • MySQL
  • Oracle
  • MS SQL Server

If you need to review the Pentaho Server installation method, see Pentaho Installation.

Existing 8.2 Installation

If you have an existing 8.2 installation of the PDI client (Spoon) and the Pentaho Server, you must configure them to use the DI Operations Mart using the Installation Steps below.

Installation Steps

To install the DI Operations Mart, you will perform the following steps:

  • Step 1: Get the DI Operations Mart Files
  • Step 2: Run the Setup Script
  • Step 3: Set the Global Kettle Logging Variables
  • Step 4: Add Logging and Operations Mart Connections
  • Step 5: Add a JNDI Connection for the Pentaho Server
  • Step 6: Add the DI Operations Mart ETL Solution and Sample Reports to the Repository
  • Step 7: Initialize the DI Operations Mart
  • Step 8: Verify the DI Operations Mart is Working

Step 1: Get the DI Operations Mart Files 

The DI Operations Mart files are available for download from the Pentaho Customer Support Portal.

  1. On the Customer Portal home page, sign in using the Pentaho support user name and password provided in your Pentaho Welcome Packet. 
  2. Click Downloads, then click Pentaho 8.2 GA Release in the 8.x list. 
  3. At the bottom of the Pentaho 8.2 GA Release page, browse the folders in the Box widget to find the following file in the Operations Mart folder:
  • pentaho-operations-mart-5.0.0-dist.zip
  4. Unzip the Pentaho Operations Mart file. It contains the packaged Operations Mart installation file.
  5. Unpack the installation file by running the installer file for your environment.
  6. In the IZPack window, read the license agreement, select I accept the terms of this license agreement, and then click Next.
  7. In the Select the installation path text box, browse to or enter the directory location where you want to unpack the files, then click Next.
  8. If you chose an existing directory, a warning message appears stating that the directory already exists. Click Yes. Any existing files in the directory will be retained.
  9. When the installation progress is complete, click Quit. Your directory will contain the setup scripts and files used to create the default content in the following steps.

Step 2: Run the Setup Script

Depending on your database repository type, run the following scripts to create the tables which will capture the activity of transformations and jobs.

The pentaho-operations-mart-ddl-5.0.0.zip file contains a folder for each database type listed below, with the scripts needed for that type.

The script names and the directory where they are located are listed by database type:

PostgreSQL (located in /pentaho-server/data/postgresql)
  • pentaho_logging_postgresql.sql
  • pentaho_mart_postgresql.sql **
  • pentaho_mart_upgrade_postgresql.sql

MySQL (located in /pentaho-server/data/mysql5)
  • pentaho_logging_mysql.sql
  • pentaho_mart_mysql.sql
  • pentaho_mart_upgrade_mysql.sql

Oracle (located in /pentaho-server/data/oracle10g)
  • pentaho_logging_oracle.sql
  • pentaho_mart_oracle.sql
  • pentaho_mart_upgrade_oracle.sql

Microsoft SQL Server (located in /pentaho-server/data/sqlserver)
  • pentaho_logging_server.sql
  • pentaho_mart_sqlserver.sql
  • pentaho_mart_upgrade_sqlserver.sql

** This script is an optional installation during the Pentaho Server installation (either Windows or Linux).
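
Before running the scripts, it can help to confirm that all of them are present for your database type. The following Python sketch is one possible check, not part of the Pentaho tooling. The script and folder names are copied as listed above; the function name missing_scripts and the idea of passing a root directory that contains the per-database folders are assumptions made for illustration.

```python
from pathlib import Path

# Folder and script names copied as listed in this step.
# The dictionary keys ("postgresql", "mysql", ...) are illustrative labels only.
SCRIPTS = {
    "postgresql": ("postgresql", [
        "pentaho_logging_postgresql.sql",
        "pentaho_mart_postgresql.sql",
        "pentaho_mart_upgrade_postgresql.sql",
    ]),
    "mysql": ("mysql5", [
        "pentaho_logging_mysql.sql",
        "pentaho_mart_mysql.sql",
        "pentaho_mart_upgrade_mysql.sql",
    ]),
    "oracle": ("oracle10g", [
        "pentaho_logging_oracle.sql",
        "pentaho_mart_oracle.sql",
        "pentaho_mart_upgrade_oracle.sql",
    ]),
    "sqlserver": ("sqlserver", [
        "pentaho_logging_server.sql",
        "pentaho_mart_sqlserver.sql",
        "pentaho_mart_upgrade_sqlserver.sql",
    ]),
}


def missing_scripts(root, db_type):
    """Return the script names for db_type that are not found under root."""
    folder, names = SCRIPTS[db_type]
    base = Path(root) / folder
    return [name for name in names if not (base / name).is_file()]
```

For example, missing_scripts("/pentaho-server/data", "postgresql") returns an empty list when all three PostgreSQL scripts are in place.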

Step 3: Set the Global Kettle Logging Variables

Perform this step on the computer where you have installed your Pentaho Data Integration (PDI) client and Pentaho Server.

When you run PDI for the first time, the kettle.properties file is created and stored in the $USER_HOME/.kettle directory.

  1. In the PDI client, choose Edit > Edit the kettle.properties file.
  2. Add or edit the variables so that they match the values shown in the following table:

For Oracle and Microsoft SQL Server, leave the value blank for variables that contain SCHEMA in the name.

Variable Value
KETTLE_CHANNEL_LOG_DB live_logging_info
KETTLE_CHANNEL_LOG_TABLE channel_logs
KETTLE_CHANNEL_LOG_SCHEMA pentaho_dilogs
KETTLE_JOBENTRY_LOG_DB live_logging_info
KETTLE_JOBENTRY_LOG_TABLE jobentry_logs
KETTLE_JOBENTRY_LOG_SCHEMA pentaho_dilogs
KETTLE_JOB_LOG_DB live_logging_info
KETTLE_JOB_LOG_TABLE job_logs
KETTLE_JOB_LOG_SCHEMA pentaho_dilogs
KETTLE_METRICS_LOG_DB live_logging_info
KETTLE_METRICS_LOG_TABLE metrics_logs
KETTLE_METRICS_LOG_SCHEMA pentaho_dilogs
KETTLE_STEP_LOG_DB live_logging_info
KETTLE_STEP_LOG_TABLE step_logs
KETTLE_STEP_LOG_SCHEMA pentaho_dilogs
KETTLE_TRANS_LOG_DB live_logging_info
KETTLE_TRANS_LOG_TABLE trans_logs
KETTLE_TRANS_LOG_SCHEMA pentaho_dilogs
KETTLE_TRANS_PERFORMANCE_LOG_DB live_logging_info
KETTLE_TRANS_PERFORMANCE_LOG_TABLE transperf_logs
KETTLE_TRANS_PERFORMANCE_LOG_SCHEMA pentaho_dilogs
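
The table follows a regular pattern: each log type has a _DB, _TABLE, and _SCHEMA variable. As a minimal sketch, the Python below generates the same entries in KEY=VALUE form so they can be pasted into kettle.properties. The function name logging_properties is an assumption made for illustration; the variable names and values come from the table above.

```python
# Log types and table names as listed in the table above.
LOG_TABLES = {
    "CHANNEL": "channel_logs",
    "JOBENTRY": "jobentry_logs",
    "JOB": "job_logs",
    "METRICS": "metrics_logs",
    "STEP": "step_logs",
    "TRANS": "trans_logs",
    "TRANS_PERFORMANCE": "transperf_logs",
}


def logging_properties(connection="live_logging_info", schema="pentaho_dilogs"):
    """Build the KEY=VALUE lines for the global Kettle logging variables.

    Pass schema="" for Oracle and Microsoft SQL Server, where the SCHEMA
    variables are left blank.
    """
    lines = []
    for log_type, table in LOG_TABLES.items():
        prefix = f"KETTLE_{log_type}_LOG"
        lines.append(f"{prefix}_DB={connection}")
        lines.append(f"{prefix}_TABLE={table}")
        lines.append(f"{prefix}_SCHEMA={schema}")
    return lines


if __name__ == "__main__":
    print("\n".join(logging_properties()))
```

Calling logging_properties(schema="") produces the variant with blank SCHEMA values.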

Step 4: Add Logging and Operations Mart Connections

This section explains how to add the logging (live_logging_info) and Operations Mart (PDI_Operations_Mart) connections for a PDI client.

  1. Navigate to the pentaho/design-tools/data-integration/simple-jndi directory. 
  2. Open the jdbc.properties file with a text editor.
  3. Depending on your repository database type, update the values accordingly (URL, users, password) as shown in the samples:

The URLs, users, and passwords may need to be obtained from your system administrator.

PostgreSQL:

PDI_Operations_Mart/type=javax.sql.DataSource
PDI_Operations_Mart/driver=org.postgresql.Driver
PDI_Operations_Mart/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_operations_mart
PDI_Operations_Mart/user=hibuser
PDI_Operations_Mart/password=password
live_logging_info/type=javax.sql.DataSource
live_logging_info/driver=org.postgresql.Driver
live_logging_info/url=jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs
live_logging_info/user=hibuser
live_logging_info/password=password

MySQL:

PDI_Operations_Mart/type=javax.sql.DataSource
PDI_Operations_Mart/driver=com.mysql.jdbc.Driver
PDI_Operations_Mart/url=jdbc:mysql://localhost:3306/pentaho_operations_mart
PDI_Operations_Mart/user=hibuser
PDI_Operations_Mart/password=password
live_logging_info/type=javax.sql.DataSource
live_logging_info/driver=com.mysql.jdbc.Driver
live_logging_info/url=jdbc:mysql://localhost:3306/pentaho_dilogs
live_logging_info/user=hibuser
live_logging_info/password=password

Oracle:

PDI_Operations_Mart/type=javax.sql.DataSource
PDI_Operations_Mart/driver=oracle.jdbc.OracleDriver
PDI_Operations_Mart/url=jdbc:oracle:thin:@localhost:1521/XE
PDI_Operations_Mart/user=pentaho_operations_mart
PDI_Operations_Mart/password=password
live_logging_info/type=javax.sql.DataSource
live_logging_info/driver=oracle.jdbc.OracleDriver
live_logging_info/url=jdbc:oracle:thin:@localhost:1521/XE
live_logging_info/user=pentaho_dilogs
live_logging_info/password=password

Microsoft SQL Server:

PDI_Operations_Mart/type=javax.sql.DataSource
PDI_Operations_Mart/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
PDI_Operations_Mart/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart
PDI_Operations_Mart/user=pentaho_operations_mart
PDI_Operations_Mart/password=password
live_logging_info/type=javax.sql.DataSource
live_logging_info/driver=com.microsoft.sqlserver.jdbc.SQLServerDriver
live_logging_info/url=jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_dilogs
live_logging_info/user=dilogs_user
live_logging_info/password=password
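
After editing jdbc.properties, a quick check that both connections define all five settings can catch typos early. The sketch below is one way to do that in Python, assuming the name/key=value layout shown in the samples above. The function names parse_jndi and missing_settings are hypothetical and are not part of PDI.

```python
REQUIRED_KEYS = {"type", "driver", "url", "user", "password"}
CONNECTIONS = ("PDI_Operations_Mart", "live_logging_info")


def parse_jndi(text):
    """Group 'name/key=value' lines by connection name."""
    connections = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first '=' only, because URLs may contain '=' themselves.
        left, value = line.split("=", 1)
        if "/" not in left:
            continue
        name, key = left.split("/", 1)
        connections.setdefault(name.strip(), {})[key.strip()] = value.strip()
    return connections


def missing_settings(text):
    """Return {connection: [missing keys]} for the two expected connections."""
    found = parse_jndi(text)
    report = {}
    for name in CONNECTIONS:
        missing = sorted(REQUIRED_KEYS - set(found.get(name, {})))
        if missing:
            report[name] = missing
    return report
```

An empty dictionary from missing_settings means both connections have a type, driver, url, user, and password entry. It does not verify that the values are correct for your database.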

Step 5: Add a JNDI Connection for the Pentaho Server

This section explains how to add a JNDI connection for the Pentaho Server. Perform this task on the computer where you have installed the Pentaho Server.

  1. Navigate to the pentaho/server/pentaho-server/tomcat/webapps/pentaho/META-INF/ folder.
  2. Open the context.xml file with a text editor. 
  3. Depending on your database type, edit the file to reflect the values in the applicable example:

PostgreSQL:

 <Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="org.postgresql.Driver" 
            url="jdbc:postgresql://localhost:5432/hibernate"
            validationQuery="select 1"/>
           
 <Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="org.postgresql.Driver" 
            url="jdbc:postgresql://localhost:5432/hibernate"
            validationQuery="select 1"/>
           
 <Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" minIdle="0" maxIdle="5" initialSize="0"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="org.postgresql.Driver" 
            url="jdbc:postgresql://localhost:5432/hibernate?searchpath=pentaho_dilogs"            
            validationQuery="select 1"/>

MySQL:

   <Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart"
            jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>
           
  <Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_operations_mart"
            jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>
            
  <Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
            factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxActive="20" maxIdle="5"
            maxWait="10000" username="hibuser" password="password"
            driverClassName="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost:3306/pentaho_dilogs"            
            jdbcInterceptors="ConnectionState" defaultAutoCommit="true" validationQuery="select 1"/>

Oracle:

<Resource 
    validationQuery="select 1 from dual"
    url="jdbc:oracle:thin:@localhost:1521/orcl"
    driverClassName="oracle.jdbc.OracleDriver"
    password="password"
    username="pentaho_operations_mart"
    initialSize="0"
    maxActive="20"
    maxIdle="10"
    maxWait="10000"
    factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    type="javax.sql.DataSource"
    auth="Container"
    connectionProperties="oracle.jdbc.J2EE13Compliant=true"
    name="jdbc/pentaho_operations_mart"/>

<Resource 
    validationQuery="select 1 from dual"
    url="jdbc:oracle:thin:@localhost:1521/orcl"
    driverClassName="oracle.jdbc.OracleDriver"
    password="password"
    username="pentaho_operations_mart"
    initialSize="0"
    maxActive="20"
    maxIdle="10"
    maxWait="10000"
    factory="org.apache.tomcat.jdbc.pool.DataSourceFactory"
    type="javax.sql.DataSource"
    auth="Container"
    connectionProperties="oracle.jdbc.J2EE13Compliant=true"
    name="jdbc/PDI_Operations_Mart"/>

<Resource validationQuery="select 1 from dual" url="jdbc:oracle:thin:@localhost:1521/XE" 
     driverClassName="oracle.jdbc.OracleDriver" password="password" 
     username="pentaho_dilogs" maxWaitMillis="10000" maxIdle="5" maxTotal="20" 
     jdbcInterceptors="ConnectionState" defaultAutoCommit="true" 
     factory="org.apache.commons.dbcp.BasicDataSourceFactory" type="javax.sql.DataSource" 
     auth="Container" name="jdbc/live_logging_info"/>

Microsoft SQL Server:

<Resource name="jdbc/PDI_Operations_Mart" auth="Container" type="javax.sql.DataSource"
      factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
      maxWaitMillis="10000" username="pentaho_operations_mart" password="password" 
      jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
      driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
      url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart"
      validationQuery="select 1"/>
            
<Resource name="jdbc/pentaho_operations_mart" auth="Container" type="javax.sql.DataSource"
       factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
       maxWaitMillis="10000" username="pentaho_operations_mart" password="password" 
       jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
       driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
       url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_operations_mart"
       validationQuery="select 1"/>
 
<Resource name="jdbc/live_logging_info" auth="Container" type="javax.sql.DataSource"
        factory="org.apache.tomcat.jdbc.pool.DataSourceFactory" maxTotal="20" maxIdle="5"
        maxWaitMillis="10000" username="dilogs_user" password="password" 
        jdbcInterceptors="ConnectionState" defaultAutoCommit="true"
        driverClassName="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
        url="jdbc:sqlserver://10.0.2.15:1433;DatabaseName=pentaho_dilogs"            
        validationQuery="select 1"/>
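
Each example above declares three resources: jdbc/PDI_Operations_Mart, jdbc/pentaho_operations_mart, and jdbc/live_logging_info. As a rough sketch, the following Python checks that an edited context.xml still declares all three and is well-formed XML. The function name missing_resources is an assumption made for illustration, and the check does not validate URLs, drivers, or credentials.

```python
import xml.etree.ElementTree as ET

# Resource names taken from the examples above.
EXPECTED = {
    "jdbc/PDI_Operations_Mart",
    "jdbc/pentaho_operations_mart",
    "jdbc/live_logging_info",
}


def missing_resources(context_xml):
    """Return the expected Resource names that context_xml does not declare.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(context_xml)
    declared = {res.get("name") for res in root.iter("Resource")}
    return sorted(EXPECTED - declared)
```

To use it, read the file contents (for example with open(path).read()) and pass the text to missing_resources. An empty list means all three resources are declared.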

Step 6: Add the DI Operations Mart ETL Solution and Sample Reports to the Repository

  1. Stop the Pentaho Server.
  2. Depending on your repository database type, copy the following ETL solution and sample reports (downloaded in Step 1: Get the DI Operations Mart Files) to: $PENTAHO_HOME/pentaho-server/pentaho-solution/default-content

pentaho-operations-mart-etl-5.0.0-dist.zip may already be in this directory. If you are using a repository database type other than PostgreSQL, remove it.

  • PostgreSQL: pentaho-operations-mart-etl-5.0.0-dist.zip

  • MySQL: pentaho-operations-mart-etl-mysql5-5.0.0-dist.zip

  • Oracle: pentaho-operations-mart-etl-oracle10g-5.0.0-dist.zip

  • Microsoft SQL Server: pentaho-operations-mart-etl-mssql-5.0.0-dist.zip

  3. Place these two files in the directory as well:
  • DI Operations Mart sample reports: pentaho-operations-mart-operations-di-5.0.0-dist.zip
  • BA Operations Mart sample reports: pentaho-operations-mart-operations-bi-5.0.0-dist.zip
  4. Start the Pentaho Server.

Step 7: Initialize the DI Operations Mart

  1. Launch the PDI client (Spoon).
  2. Connect to the Pentaho Repository via the Pentaho Server.
  3. At the Main Menu, select File > Open.
  4. Select Browse Files > Public > Pentaho Operations Mart > DI Ops Mart ETL > Fill_in_DIM_DATE_and_DIM_TIME job file and run it.
  5. At the Main Menu, select File > Open.
  6. Select Public > Pentaho Operations Mart > DI Ops Mart ETL > Update_Dimensions_then_Logging_Datamart job file and run it.

Step 8: Verify the DI Operations Mart is Working

  1. From the Pentaho User Console, select Browse Files > Public > Pentaho Operations Mart > DI Audit Reports > Last_Run and open it.
  2. You should see the jobs and transformations that were run in Step 7.

Give Users Access to the PDI Operations Mart

By default, only users who have the Admin role can access the Pentaho Operations Mart. The Admin role has access to all capabilities within all Pentaho products, including the Pentaho Operations Mart. If you want to allow users to view and run the Pentaho Operations Mart only, you can assign them the Pentaho Operations role. For example, a user who has been assigned the Pentaho Operations user role is able to open and view a report within the PDI Operations Mart, but does not have the ability to delete it.

To give users access to view the PDI Operations Mart, assign the Pentaho Operations role to those users as follows:

  1. From within the Pentaho User Console, select the Administration tab.
  2. From the left panel, select Security Users/Roles.
  3. Select the Roles tab.
  4. Add the new role called Pentaho Operations by following the instructions in Adding Roles.
  5. Assign the appropriate users to the new role, as described in Adding Users to Roles.
  6. Advise these users to log in to the Pentaho User Console, create a Pentaho Analyzer or Pentaho Interactive Report, and ensure that they can view the Pentaho Operations Mart in the Select a Data Source dialog. 

Charts, Reports, and Dashboards Using PDI Operations Mart Data

Once you have created and populated your Data Integration Operations Mart with log data, the features of the User Console enable you to examine this data and create reports, charts, and dashboards. We provide many pre-built reports, charts, and dashboards that you can modify.

To help understand the contents of the log, see DI Operations Mart Reference.

Clean Up Operations Mart Tables

Cleaning the PDI Operations Mart consists of running either a job or transformation that deletes data older than a specified maximum age. The transformation and job for cleaning up the PDI Operations Mart can be found in the etl folder.

Perform the following steps to clean up the PDI Operations Mart:

  1. Using the PDI client (Spoon), open either Clean_up_PDI_Operations_Mart.kjb for jobs or Clean_up_PDI_Operations_Mart_fact_table.ktr for transformations.
  2. Set the following parameters:
  • max.age.days (required): the maximum age, in days, of the data to keep.
  • schema.prefix (optional): for PostgreSQL databases, enter the schema name followed by a period (.). This prefix is applied to the SQL statements. For other databases, leave the value blank.
  3. Run the job or transformation. This deletes job and transformation data older than the maximum age from the data mart.
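
To make the two parameters concrete, the Python sketch below shows how a maximum age in days translates into a cutoff timestamp and how a schema prefix ending in a period combines with a table name. It only illustrates the parameter semantics described above and is not the logic of the shipped job or transformation. The function names and the table name fact_execution used in the usage note are assumptions.

```python
from datetime import datetime, timedelta


def cleanup_cutoff(max_age_days, now):
    """Return the timestamp before which data counts as older than the maximum age."""
    if max_age_days < 0:
        raise ValueError("max.age.days must not be negative")
    return now - timedelta(days=max_age_days)


def qualified_table(table, schema_prefix=""):
    """Prepend the schema prefix (schema name plus trailing period) to a table name."""
    return f"{schema_prefix}{table}"
```

With max.age.days set to 30 and a run on 31 March 2024, cleanup_cutoff(30, datetime(2024, 3, 31)) gives 1 March 2024, so rows dated before then would be candidates for deletion. With schema.prefix set to pentaho_operations_mart. on PostgreSQL, qualified_table("fact_execution", "pentaho_operations_mart.") gives pentaho_operations_mart.fact_execution. With a blank prefix, the table name is used unchanged.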

To schedule regular clean up of the PDI Operations Mart, see Schedule Perspective in the PDI Client.