Skip to main content

Pentaho+ documentation has moved!

The new product documentation portal is here. Check it out now at docs.hitachivantara.com

 

Hitachi Vantara Lumada and Pentaho Documentation

Data Integration Operations Mart Reference

Parent article

This article provides the reference for Pentaho Data Integration logging table status, dimensions, and metrics.

Logging tables status

The following tables are contained in the PDI Operations Mart:

Transformation log tables

The status values and their descriptions are in the following table:
Status DisplayDescription
startIndicates the transformation was started and retains this status until the transformation ends when no logging interval is set.
endTransformation ended successfully.
stopIndicates the user stopped the transformation.
errorIndicates an error occurred when attempting to run the transformation.
runningA transformation is only in this status directly after starting and does not appear without a logging interval.
pausedIndicates the user paused the transformation and does not appear without a logging interval.

Jobs logs tables

The status values and their descriptions are in the following table:
Status DisplayDescription
startIndicates the job was started and retains this status until the job ends, and when no logging interval is set.
endJob ended successfully.
stopIndicates the user stopped the job.
errorIndicates an error occurred when attempting to run the job.
runningA job is only in this status directly after starting and does not appear without a logging interval.
pausedIndicates the user paused the job and it does not appear without a logging interval.

Logging dimensions and metrics

These tables identify the various dimensions and metrics that can be used to create new ETL log charts and reports:

Fact table

(fact_execution)
Field NameDescription
execution_date_tkA technical key (TK) linking the fact to the date when the transformation/job was executed.
execution_time_tkA technical key (TK) linking the fact to the time-of-day when the transformation/job was executed.
batch_tkA technical key (TK) linking the fact to batch information for the transformation/job.
execution_tkA technical key (TK) linking the fact to execution information about the transformation/job.
executor_tkA technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tkA technical key (TK) linking the fact to information about the parent transformation/job).
root_executor_tkA technical key (TK) linking the fact to information about the root transformation/job.
execution_timestampThe date and time when the transformation/job was executed.
durationThe length of time (in seconds) between when the transformation was logged (LOGDATE) and the maximum dependency date (DEPDATE).
rows_inputThe number of lines read from disk or the network by the specified step. Can be input from files, databases, etc.
rows_outputThe number of rows output during the execution of the transformation/job.
rows_readThe number of rows read in from the input stream of the specified step.
rows_writtenThe number of rows written during the execution of the transformation/job.
rows_rejectedThe number of rows rejected during the execution of the transformation/job.
errorsThe number of errors that occurred during the execution of the transformation/job.

Batch dimension

(dim_batch)
Field NameDescription
batch_tkA technical key (TK) for linking facts to batch information.
batch_idThe ID number for the batch.
logchannel_idA string representing the identifier for the logging channel used by the batch.
parent_logchannel_idA string representing the identifier for the parent logging channel used by the batch.

Date dimension

(dim_date)
Field NameDescription
date_tkA technical key (TK) for linking facts to date information.
date_fieldA Date object representing a particular day (year, month, day).
ymdA string representing the date value in year-month-day format.
ymA string representing the date value in year-month format.
yearAn integer representing the year value.
quarterAn integer representing the number of the quarter (1-4) to which this date belongs.
quarter_codeA 2-character string representing the quarter (Q1-Q4) to which this date belongs.
monthAn integer representing the number of the month (1-12) to which this date belongs.
month_descA string representing the month (e.g., January or December) to which this date belongs.
month_codeA string representing the shortened month code (e.g., JAN or DEC) to which this date belongs.
dayAn integer representing the day of the month (1-31) to which this date belongs.
day_of_yearAn integer representing the day of the year (1-366) to which this date belongs.
day_of_weekAn integer representing the day of the week (1-7) to which this date belongs.
day_of_week_descA string representing the day of the week (e.g., Sunday or Saturday) to which this date belongs.
day_of_week_codeA string representing the shortened day-of-week code (e.g., SUN or SAT) to which this date belongs.
weekAn integer representing the week of the year (1-53) to which this date belongs.

Execution dimension

(dim_execution)
Field NameDescription
execution_tkA technical key (TK) for linking facts to execution information.
execution_idA unique string identifier for the execution.
server_nameThe name of the server associated with the execution.
server_hostThe name of the server associated with the execution.
executing_userThe name of the user who initiated the execution.
execution_statusThe status of the execution (start, stop, end, error).

Executor dimension

The following table contains information about an executor that is a job or transformation (dim_executor):
Field NameDescription
executor_tkA technical key (TK) for linking facts to executor information.
versionAn integer corresponding to the version of the executor.
date_fromA date representing the minimum date for which the executor is valid.
date_toA date representing the maximum date for which the executor is valid.
executor_idA string identifier for the executor.
executor_sourceThe source location (either file- or repository-relative) for the executor.
executor_environment *File server, repository name, related to the executor_source.
executor_typeThe executor type (e.g., job or transformation).
executor_nameThe name of the executor (e.g., transformation name).
executor_descA string description of the executor (e.g., job description).
executor_revisionA string representing the revision of the executor (e.g., 1.3).
executor_version_labelA string representing a description of the revision (i.e., change comments).
exec_enabled_table_loggingWhether table logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_detailed_loggingWhether detailed (step or job entry) logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_perf_loggingWhether performance logging is enabled for this executor. Values are Y if enabled; N otherwise.
exec_enabled_history_loggingWhether historical logging is enabled for this executor. Values are Y if enabled; N otherwise.
last_updated_dateThe date the executor was last updated.
last_updated_userThe name of the user who last updated the executor.
* Reserved for future use.

Log table

The log table (dim_log_table) contains data used by the Pentaho Operations Mart.
CautionDo not modify the Log table. The following table is for reference only:
Field NameDescription
log_table_tkA technical key (TK) for linking.
object_typeThe type of PDI object being logged (e.g., job or transformation or step).
table_connection_nameThe name of the database connection corresponding to the location of the transformation/job logging table.
table_nameThe name of the table containing the transformation/job logging information.
schema_nameThe name of the database schema corresponding to the location of the transformation/job logging table.
step_entry_table_conn_nameThe name of the database connection corresponding to the location of the step/entry logging table.
step_entry_table_nameThe name of the table containing the step/entry logging information.
step_entry_schema_nameThe name of the database schema corresponding to the location of the step/entry logging table.
perf_table_conn_nameThe name of the database connection corresponding to the location of the performance logging table.
perf_table_nameThe name of the table containing the performance logging information.
perf_schema_nameThe name of the database schema corresponding to the location of the performance logging table.

Time-of-day dimension

This dimension contains entries for every second of a day from midnight to midnight (dim_time).
Field NameDescription
time_tkA technical key (TK) for linking facts to time-of-day information.
hmsA string representing the time of day as hours-minutes-seconds (e.g., 00:01:35).
hmA string representing the time of day as hours-minutes (e.g., 23:59).
ampmA string representing whether the time-of-day is AM or PM. Values are: am or pm
hourThe integer number corresponding to the hour of the day (0-23).
hour12The integer number corresponding to the hour of the day with respect to AM/PM (1-12).
minuteThe integer number corresponding to the minute of the hour (0-59).
secondThe integer number corresponding to the second of the minute (0-59).

Step fact table

This fact table contains facts about individual step executions (fact_step_execution).
Field NameDescription
execution_date_tkA technical key (TK) linking the fact to the date when the step was executed.
execution_time_tkA technical key (TK) linking the fact to the time-of-day when the step was executed.
batch_tkA technical key (TK) linking the fact to batch information for the step.
executor_tkA technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tkA technical key (TK) linking the fact to information about the parent transformation.
root_executor_tkA technical key (TK) linking the fact to information about the root transformation/job.
execution_timestampThe date and time when the step was executed.
step_tkA technical key (TK) linking the fact to information about the step.
step_copyThe step copy number. This is zero if there is only one copy of the step, or (0 to N-1) if N copies of the step are executed.
rows_inputThe number of lines read from disk or the network by the step. Can be input from files, databases, etc.
rows_outputThe number of lines written to disk or the network by the step. Can be output to files, databases, etc.
rows_read The number of rows read in from the input stream of the step.
rows_writtenThe number of rows written to the output stream of the step.
rows_rejectedThe number of rows rejected during the execution of the step.
errorsThe number of errors that occurred during the execution of the step.

Step dimension

This dimension contains information about individual steps and job entries (dim_step) .
Field NameDescription
step_tkA technical key (TK) for linking facts to step/entry information
step_idThe string name of the step/entry
original_step_name *The name of the step/entry template used to create this step/entry (e.g., Table Input).
* Reserved for future use.

Job entry fact table

This fact table contains facts about individual job entry executions (fact_jobentry_execution).

This fact table contains facts about individual job entry executions (fact_jobentry_execution).

Field NameDescription
execution_date_tkA technical key (TK) linking the fact to the date when the job entry was executed.
execution_time_tkA technical key (TK) linking the fact to the time-of-day when the job entry was executed.
batch_tkA technical key (TK) linking the fact to batch information for the job entry.
executor_tkA technical key (TK) linking the fact to information about the executor (transformation or job).
parent_executor_tkA technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tkA technical key (TK) linking the fact to information about the root transformation/job.
step_tkA technical key (TK) linking the fact to information about the job entry.
execution_timestampThe date and time when the job entry was executed.
rows_inputThe number of lines read from disk or the network by the job entry. Can be input from files, databases, etc.
rows_outputThe number of lines written to disk or the network by the job entry. Can be output to files, databases, etc.
rows_readThe number of rows read in from the input stream of the job entry.
rows_writtenThe number of rows written to the output stream of the job entry.
rows_rejectedThe number of rows rejected during the execution of the job entry.
errorsThe number of errors that occurred during the execution of the job entry.
resultWhether the job entry finished successfully or not. Values are Y (successful) or N (otherwise).
nr_result_rowsThe number of result rows after execution.
nr_result_filesThe number of result files after execution.

Execution performance fact table

This fact table contains facts about the performance of steps in transformation executions (fact_perf_execution).
Field NameDescription
execution_date_tkA technical key (TK) linking the fact to the date when the transformation was executed.
execution_time_tkA technical key (TK) linking the fact to the time-of-day when the transformation was executed.
batch_tkA technical key (TK) linking the fact to batch information for the transformation.
executor_tkA technical key (TK) linking the fact to information about the executor (transformation).
parent_executor_tkA technical key (TK) linking the fact to information about the parent transformation/job.
root_executor_tkA technical key (TK) linking the fact to information about the root transformation/job.
step_tkA technical key (TK) linking the fact to information about the transformation/job.
seq_nrThe sequence number. This is an identifier differentiating performance snapshots for a single execution.
step_copyThe step copy number. This is zero if there is only one copy of the step, or (0 to N-1) if N copies of the step are executed.
execution_timestampThe date and time when the transformation was executed.
rows_inputThe number of rows read from the input file, database, or network during the interval.
rows_outputThe number of rows written to output file, database, or network during the interval.
rows_readThe number of rows read from previous steps during the interval.
rows_writtenThe number of rows written to following steps during the interval.
rows_rejectedThe number of rows rejected by the steps error handling during the interval.
errorsThe number of errors that occurred during the execution of the transformation/job.
input_buffer_rowsThe size of the step’s input buffer in rows at the time of the snapshot.
output_buffer_rowsThe size of the output buffer in rows at the time of the snapshot.