Configuring the Mondrian Engine for Large Data Warehouses
There are several mondrian.properties options that control how the Analysis engine interacts with large data warehouse volumes in conjunction with Pentaho Analyzer:
- mondrian.result.limit=5000000
- mondrian.rolap.iterationLimit=5000000
- mondrian.rolap.queryTimeout=300
- mondrian.native.crossjoin.enable=true
- mondrian.native.topcount.enable=true
- mondrian.native.filter.enable=true
- mondrian.native.nonempty.enable=true
- mondrian.rolap.maxConstraints=1000
- mondrian.native.ExpandNonNative=true
- mondrian.expCache.enable=true
Below are explanations for each property.
mondrian.result.limit
Controls the largest cross join size that Mondrian will handle in-memory. Ideally, no queries should involve large cross joins in-memory; instead, they should be handled by the database.
mondrian.rolap.iterationLimit
This is similar to mondrian.result.limit, except this applies to calculating aggregates in-memory such as SUM, MAX, AGGREGATE, etc. This should be set to the same value as mondrian.result.limit.
mondrian.rolap.queryTimeout
If any query runs past this number of seconds, then the query is immediately cancelled. The total sum of all SQL statements to process a single MDX statement must be less than this timeout. Setting this to zero disables query timeout, which is not recommended because runaway queries can deprive system resources from other necessary processes.
mondrian.native.crossjoin.enable
If this is set to true, when Mondrian needs to cross join multiple dimensions in a report, if the cross join is non-empty, meaning a fact table relationship has been defined, then the join operation is done using SQL. The resultant SQL query returns only combined dimension members that actually have fact data. This reduces the amount of tuples that need to be processed and is critical for performance on high-cardinality dimensions.
mondrian.native.topcount.enable
If set to true, when fetching the first set of records for the filter dialog, Mondrian will only read that set of records into memory. If set to false, all records from the dimension level will be read into memory.
mondrian.native.nonempty.enable
If set to true, Mondrian will validate each member in the MDX via SQL. If set to false, Mondrian will traverse from parent to child tokens in the member. For high-cardinality dimensions, this must be enabled to avoid reading all members into cache.
mondrian.rolap.maxConstraints
This should be set to the largest number of values that the data warehouse database supports in an IN list.
mondrian.native.ExpandNonNative
Works in conjunction with native evaluation of cross joins. If set to true, Mondrian will expand cross join inputs to simple member lists that are candidates for pushdown.