Pentaho 9.1
- What's new in Pentaho 9.1
- The Pentaho 9.1 Enterprise Edition delivers a variety of features and enhancements, including access to Google Dataproc and Lumada Data Catalog in PDI, along with the Pentaho Upgrade Installer. Pentaho 9.1 also continues to improve the overall Pentaho platform experience.
- Products
- Pentaho products form a comprehensive platform for accessing, integrating, manipulating, visualizing, and analyzing your data. Whether your data is stored in flat files, relational databases, Hadoop clusters, NoSQL databases, analytic databases, social media streams, operational stores, or the cloud, Pentaho products can help you discover, analyze, and visualize it to find the answers you need, even if you have no coding experience. Advanced users with programming experience can use our extensive API to customize reports, queries, and transformations to extend functionality.
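- As a rough illustration of that API, the minimal Java sketch below initializes the Kettle environment, loads a transformation definition, and runs it. The file name sample.ktr and the error handling are placeholder assumptions, not taken from this documentation; see Embed Pentaho Data Integration in the Developer center for the supported embedding approach.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
  public static void main(String[] args) throws Exception {
    // Initialize the Kettle environment (registers core plugins and logging).
    KettleEnvironment.init();

    // Load a transformation definition; "sample.ktr" is a placeholder path.
    TransMeta transMeta = new TransMeta("sample.ktr");

    // Run the transformation and wait for it to finish.
    Trans trans = new Trans(transMeta);
    trans.execute(null); // no extra command-line arguments
    trans.waitUntilFinished();

    if (trans.getErrors() > 0) {
      throw new IllegalStateException("Transformation finished with errors");
    }
  }
}
```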
- Abort
- Activate CDE
- Adaptive Execution Layer
- Adapt Mondrian schemas to work with Analyzer
- Add a chart
- Add a checksum
- Add a JNDI data source
- Add a MongoDB data source
- Add notes to transformations and jobs
- Add query parameters to Analyzer reports
- Add report elements
- Add sequence
- AEL logging
- Amazon EMR Job Executor
- Amazon Hive Job Executor
- AMQP Consumer
- AMQP Producer
- Analyze your transformation results
- Applying conditional formatting to measures
- Apply formatting to report elements
- Apply metadata properties and concepts
- Attributes reference
- Avro Input
- Avro Output
- Build a business view
- Bulk load into Amazon Redshift
- Bulk load into Snowflake
- Calculator
- Cassandra Input
- Cassandra Output
- Catalog Input
- Catalog Output
- CDE advanced solutions
- CDE dashboard overview
- CDE quick start guide
- Chart Options for Analyzer reports
- Common Formats
- Connecting to Virtual File Systems
- Connect to a data source
- Contribute additional step and job entry analyzers to the Pentaho Metaverse
- Copybook Input
- Copybook steps in PDI
- CouchDB Input
- Create advanced filters in Interactive Reports
- Create a chart
- Create a comparison filter on a numeric level
- Create a CSV data source
- Create a dashboard that uses a streaming service as a data source
- Create a database table source
- Create a domain
- Create a filter on measure values
- Create a SQL query data source
- Create a string filter on a level
- Create date range filters
- Create Pentaho Dashboard Designer templates
- Create queries
- Create Report Design Wizard templates
- Create Snowflake warehouse
- Creating a business model
- CSV File Input
- CTools
- Customize an Interactive Report
- Data Integration perspective in the PDI client
- Data lineage
- Data Source Model Editor
- Data Source Wizard
- Data types
- Defining hyperlinks
- Delete
- Delete Snowflake warehouse
- Edit multidimensional data source models
- ElasticSearch Bulk Insert
- ETL metadata injection
- Execute Row SQL Script
- Execute SQL Script
- Export an Analyzer report through a URL
- File Exists (Job Entry)
- File exists (Step)
- Filter functions
- Formulas and functions
- Function reference
- Get records from stream
- Get rows from result
- Get System Info
- Google BigQuery Loader
- Group and Filter Data in Interactive Reports
- Group By
- Hadoop Copy Files
- Hadoop File Input
- Hadoop File Output
- HBase Input
- HBase Output
- HBase row decoder
- HBase setup for Spark
- Hide and unhide fields
- Inspect your data
- Java filter
- JMS Consumer
- JMS Producer
- Job (job entry)
- Job entry reference
- Job Executor
- JSON Input
- Kafka Consumer
- Kafka Producer
- Kinesis Consumer
- Kinesis Producer
- Learn about the PDI client
- Link an Analyzer report
- Link a report from Report Designer
- Link columns in a data table to other dashboard panels
- Localization and internationalization of analysis schemas
- Localize a report
- Logging and performance monitoring
- Manage international locales
- Manage Users and Roles in PUC
- Mapping
- MapReduce Input
- MapReduce Output
- Memory Group By
- Merge rows (diff)
- Metadata properties reference
- Metadata security
- Microsoft Excel Input
- Microsoft Excel Output
- Microsoft Excel Writer
- Modified Java Script Value
- Modify charts
- Modify Snowflake warehouse
- Mondrian Input
- MongoDB Input
- MongoDB Output
- MQL formula syntax
- MQTT Consumer
- MQTT Producer
- Optimize a Pentaho Data Service
- ORC Input
- ORC Output
- Other prompt types
- Output parameterization
- Parquet Input
- Parquet Output
- Partitioning data
- PDI and Hitachi Content Platform (HCP)
- PDI and Lumada Data Catalog
- PDI and Snowflake
- PDI run modifiers
- Pentaho Aggregation Designer
- Pentaho Analyzer
- Pentaho Dashboard Designer
- Pentaho Data Integration
- Pentaho Data Services
- Pentaho Data Service SQL support reference and other development considerations
- Pentaho Interactive Reports
- Pentaho MapReduce
- Pentaho Metadata Editor
- Pentaho metadata formulas
- Pentaho Reporting Output
- Pentaho Report Designer
- Pentaho Schema Workbench
- Pentaho User Console
- Perform calculations
- Publish a domain to the Pentaho Server
- Publish a report
- Python Executor
- Quartz cron attributes
- Query HCP
- Read Metadata
- Read metadata from Copybook
- Read metadata from HCP
- Recommended PDI steps to use with Spark on AEL
- Regex Evaluation
- Replace in String
- Report Designer configuration files
- REST Client
- Row Denormaliser
- Row Flattener
- Row Normaliser
- Run files in background
- S3 CSV Input
- S3 File Output
- Salesforce Delete
- Salesforce Input
- Salesforce Insert
- Salesforce Update
- Salesforce Upsert
- Schedule perspective in the PDI client
- Schedule Reports
- Secure SQL filter function access
- Select Values
- Set Analyzer report options
- Set dashboard parameters
- Set Field Value
- Set Field Value to a Constant
- Set Up a Carte Cluster
- Simple Mapping (sub-transformation)
- Single Threader
- Sort rows
- Spark Submit
- Split Fields
- Splunk Input
- Splunk Output
- SSTable Output
- Start Snowflake warehouse
- Steps supporting metadata injection
- Stop Snowflake warehouse
- Streaming analytics
- Strings cut
- String Operations
- Style properties reference
- Supported functions and operators
- Switch-Case
- Table Input
- Table Output
- Text File Input
- Text File Output
- Tour the Report Designer interface
- Transactional databases and job rollback
- Transformation (job entry)
- Transformation Executor
- Transformation step reference
- Understanding PDI data types and field metadata
- Unique Rows
- Unique Rows (HashSet)
- User Defined Java Class
- Use a Pentaho Repository in PDI
- Use calculated measures in Analyzer reports
- Use Carte Clusters
- Use checkpoints to restart jobs
- Use Command Line Tools to Run Transformations and Jobs
- Use content linking to create interactive dashboards
- Use data tables in a dashboard
- Use filters to explore your data
- Use Pentaho Repository access control
- Use prompts on dashboards
- Use the Database Explorer
- Use the Job menu
- Use the Pentaho Marketplace to manage plugins
- Use the Repository Explorer
- Use the SQL Editor
- Use the Transformation menu
- Use version history
- Using Merge rows (diff) on the Pentaho engine
- Using Merge rows (diff) on the Spark engine
- Using Pan and Kitchen with a Hadoop cluster
- Using Parquet Input on the Pentaho engine
- Using Parquet Input on the Spark engine
- Using Table input to Table output steps with AEL for managed tables in Hive
- Using the Avro Input step on the Pentaho engine
- Using the Avro Input step on the Spark engine
- Using the Avro Output step on the Pentaho engine
- Using the Avro Output step on the Spark engine
- Using the Group By step on the Pentaho engine
- Using the Group By step on the Spark engine
- Using the Hadoop File Input step on the Pentaho engine
- Using the Hadoop File Input step on the Spark engine
- Using the Hadoop File Output step on the Pentaho engine
- Using the Hadoop File Output step on the Spark engine
- Using the HBase Input step on the Pentaho engine
- Using the HBase Input step on the Spark engine
- Using the HBase Output step on the Pentaho engine
- Using the HBase Output step on the Spark engine
- Using the MQTT Consumer step on the Pentaho engine
- Using the MQTT Consumer step on the Spark engine
- Using the ORC Input step on the Pentaho engine
- Using the ORC Input step on the Spark engine
- Using the ORC Output step on the Pentaho engine
- Using the ORC Output step on the Spark engine
- Using the Text File Input step on the Pentaho engine
- Using the Text File Input step on the Spark engine
- Using the Text File Output step on the Pentaho engine
- Using the Text File Output step on the Spark engine
- Using the Unique Rows step on the Pentaho engine
- Using the Unique Rows step on the Spark engine
- Variables
- VFS properties
- Visualizations for Analyzer
- Visualization types
- Web services steps
- Working with Analyzer fields
- Working with Analyzer measures
- Work with jobs
- Work with transformations
- Write Metadata
- Write metadata to HCP
- XML Input Stream (StAX)
- Setup
- Setting up Pentaho products includes installation, configuration, administration, and, if necessary, upgrading to the current version of Pentaho. In addition, we list the components and technical requirements for installing Pentaho.
- About Hadoop
- About Pentaho business analytics tools
- About Pentaho Report Designer
- About Pentaho workflows
- About Spark tuning in PDI
- Adding JBoss logging
- Add a chart to your report
- Add parameters to your report
- Advanced settings for connecting to an Amazon EMR cluster
- Advanced settings for connecting to a Cloudera cluster
- Advanced settings for connecting to a Hortonworks cluster
- Advanced settings for connecting to Cloudera Data Platform
- Advanced settings for connecting to Google Dataproc
- AES security
- Analysis issues
- Archive installation
- Assign permissions to use or manage database connections
- Backup and restore Pentaho repositories
- Big data issues
- Big data resources
- Big data security
- Business Analytics Operations Mart
- Change the Java VM memory limits
- Command line arguments reference
- Commonly-used PDI steps and entries
- Components Reference
- Configure and start the Pentaho Server after manual installation
- Configure the design tools and utilities
- Configure the Pentaho Server
- Configuring AEL with Spark in a secure cluster
- Configuring application tuning parameters for Spark
- Connect to the Pentaho Repository from the PDI client
- Create a report with Report Designer
- Customize the Pentaho Server
- Data integration issues
- Data Integration Operations Mart
- Data Integration Operations Mart Reference
- Define data connections
- Define JDBC or OCI connections for BA design tools
- Define JNDI connections for Report Designer and Metadata Editor
- Define security for the Pentaho Server
- Design your report
- Determining Spark resource requirements
- Develop your BA environment
- Develop your PDI solution
- Evaluate and learn Pentaho Business Analytics
- Evaluate and learn Pentaho Data Integration (PDI)
- General issues
- Getting Started with Analyzer, Interactive Reports, and Dashboard Designer
- Getting Started with PDI
- Getting started with PDI and Hadoop
- Getting started with Report Designer
- Get started with Analyzer Reports
- Get started with Dashboard Designer
- Get started with Interactive Reports
- Get started with Pentaho Reporting tools
- Google BigQuery
- Go live for production - BA
- Go live for production - DI
- Hiding user folders in PUC and PDI
- Import and export PDI content
- Increase the PDI client memory limit
- Increase the Pentaho Server memory limit
- Installation and upgrade issues
- Installation of the Pentaho design tools
- Install drivers with the JDBC distribution tool
- Install the BA design tools
- Install the PDI tools and plugins
- Jackrabbit repository performance tuning
- JDBC drivers reference
- JDBC security
- Karaf performance tuning
- LDAP security
- Localize Folders and Reports
- Maintain logging
- Manage Pentaho licenses
- Manage the Pentaho Repository
- Manage the Pentaho Server
- Manage users and roles in the PDI client
- Manual and advanced secure impersonation configuration
- Manual installation
- Metadata issues
- Mondrian performance tips
- Monitoring system performance
- More about row banding, data formatting, and alignment
- MSAD security
- Next steps
- PDI job tutorial
- PDI logging
- Pentaho, big data, and Hadoop
- Pentaho administration
- Pentaho Business Analytics workflow
- Pentaho configuration
- Pentaho Data Integration (PDI) tutorial
- Pentaho Data Integration performance tips
- Pentaho Data Integration workflows
- Pentaho data mining (Weka) performance tips
- Pentaho evaluation
- Pentaho installation
- Pentaho Reporting performance tips
- Pentaho Repository issues
- Pentaho Server issues
- Pentaho Server performance tips
- Pentaho Server security
- Pentaho upgrade
- Performance tuning
- Post-upgrade tasks
- Prepare JBoss connections and web app servers
- Prepare your Linux environment for an archive install
- Prepare your Linux environment for a manual installation
- Prepare your Windows environment for an archive install
- Prepare your Windows environment for a manual installation
- Publish your report
- Purge transformations, jobs, and shared objects from the Pentaho Repository
- Quick tour of the Pentaho User Console (PUC)
- Refine your report
- Report Designer and Reporting engine issues
- Restoring a Pentaho Upgrade Installer backup
- SAML security
- Security issues
- Setting up DI Operations Mart with an archive installation
- Setting up password encryption after upgrading
- Setting up the DI Operations Mart with a manual installation
- Set PDI version control and comment tracking options
- Set up a cluster
- Set up JNDI connections for the Pentaho Server
- Set Up Kerberos for Pentaho
- Set up native (JDBC) or OCI data connections for the Pentaho Server
- Set up the Adaptive Execution Layer (AEL)
- Set up the Pentaho Server to connect to a Hadoop cluster
- Spark Tuning
- Specify data connections for BA design tools
- Specify data connections for the Pentaho Server
- Spring security
- SSL Security
- SSO security
- Starting the Pentaho Server after an archive installation
- Start and stop BA design tools
- Start and stop PDI design tools and utilities
- Start and stop the Pentaho Server for configuration
- Steps using Dataset tuning options
- Support statement for Analyzer on Impala
- Third-party monitoring with SNMP
- Troubleshooting
- Troubleshooting AEL
- Tutorials
- Upload and download from the Pentaho Repository
- User security
- Use Kerberos with MongoDB
- Use Kerberos with Spark Submit
- Use Knox to access Hortonworks
- Use MS SQL Server as your repository database (Archive installation)
- Use MS SQL Server as your repository database (Manual installation)
- Use MySQL as your repository database (Archive installation)
- Use MySQL as your repository database (Manual installation)
- Use Oracle as your repository database (Archive installation)
- Use Oracle as your repository database (Manual installation)
- Use password encryption with Pentaho
- Use PostgreSQL as your repository database (Archive installation)
- Use PostgreSQL as your repository database (Manual installation)
- Using Oozie
- Using the Pentaho Upgrade Installer in silent mode
- Verification checklist for JBoss connection tasks
- Work with data
- You can refine your Pentaho relational metadata and multidimensional Mondrian data models. You can also learn how to work with big data.
- About Multidimensional Expression Language
- Adding a new driver
- AggExclude
- AggFactCount
- AggForeignKey
- AggIgnoreColumn
- AggLevel
- AggMeasure
- AggName
- AggPattern
- AggTable
- Analysis schema security
- App Builder and Community Dashboard Editor
- App endpoints for SDR forms
- Big data resources
- Building blocks for the SDR
- Cache Configuration Files
- CalculatedMember
- CalculatedMemberProperty
- CaptionExpression
- Clean up the All Requests Processed list
- Closure
- ColumnDef
- ColumnDefs
- Configure KTR files for your environment
- Configure Mondrian engine
- Connecting to a Hadoop cluster with the PDI client
- Copy files to a Hadoop YARN cluster
- Creating attributes
- Creating link dimensions
- Creating measures on stream fields
- Cube
- CubeGrant
- CubeUsage
- CubeUsages
- Dimension
- DimensionGrant
- DimensionUsage
- Formula
- Hadoop connection and access information list
- Hierarchy
- HierarchyGrant
- How to use the SDR sample form
- InlineTable
- Installing and configuring the SDR sample
- Install and configure the Streamlined Data Refinery
- Install the Vertica JDBC driver
- Join
- KeyExpression
- Level
- Manage Hadoop configurations through PDI
- Measure
- MeasureExpression
- MemberGrant
- Memcached Configuration Options
- Modify the JGroups configuration
- Mondrian cache control
- Mondrian role mapping in the Pentaho Server
- Mondrian Schema Element Reference
- Multidimensional Data Modeling in Pentaho
- NamedSet
- NameExpression
- OLAP Log Output
- OrdinalExpression
- Parameter
- ParentExpression
- PDI big data job entries
- PDI big data transformation steps
- Property
- PropertyExpression
- Relational Data Modeling in Pentaho
- Restrict Access to Specific Members
- Restrict OLAP schemas per user
- Role
- RoleUsage
- Row
- Rows
- Schema
- SchemaGrant
- Segment cache architecture
- SQL
- Switch to another cache framework
- Switch to Memcached
- Switch to Pentaho Platform Delegating Cache
- Table
- Union
- UserDefinedFunction
- Use a Custom SegmentCache SPI
- Use Hadoop with Pentaho
- Use Hadoop with the SDR
- Use PDI outside and inside the Hadoop cluster
- Use the Build Model job entry for SDR
- Use the Streamlined Data Refinery
- Using Spark Submit
- Using the Annotate Stream step
- Using the Publish Model job entry for SDR
- Using the Shared Dimension step for SDR
- Value
- View
- VirtualCube
- VirtualCubeDimension
- VirtualCubeMeasure
- Work with the Streamlined Data Refinery
- Developer center
- Integrate and customize Pentaho products, and perform highly advanced tasks. These sections are intended for software engineers who are familiar with programming concepts and have extensive programming experience.
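- For example, the Embed the reporting engine into a Java application topic covers driving Pentaho Reporting from your own code. The sketch below shows one common pattern: boot the reporting engine, load a report definition created in Report Designer, and export it to PDF. The file names sample.prpt and sample.pdf are placeholder assumptions, not taken from this documentation.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfReportUtil;
import org.pentaho.reporting.libraries.resourceloader.Resource;
import org.pentaho.reporting.libraries.resourceloader.ResourceManager;

public class ExportReportToPdf {
  public static void main(String[] args) throws Exception {
    // Boot the reporting engine once per JVM before processing any report.
    ClassicEngineBoot.getInstance().start();

    // Load a .prpt report definition; "sample.prpt" is a placeholder path.
    ResourceManager resourceManager = new ResourceManager();
    Resource resource = resourceManager.createDirectly(new File("sample.prpt"), MasterReport.class);
    MasterReport report = (MasterReport) resource.getResource();

    // Export the report to PDF; "sample.pdf" is a placeholder output path.
    try (OutputStream out = new FileOutputStream("sample.pdf")) {
      PdfReportUtil.createPDF(report, out);
    }
  }
}
```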
- Additional resources
- Assign Analyzer chart colors
- Configuration API
- Configuring a visualization
- Create a Pentaho Server plug-in
- Create database plugins
- Create job entry plugins
- Create partitioner plugins
- Create PDI icons
- Create step plugins
- Create the Pentaho web package
- Customize PDI Data Explorer
- Customize Pentaho Analyzer
- Customize the Pentaho User Console
- Custom Analyzer action links to JavaScript functions
- Custom visualizations
- Define the custom visualization
- Deploying a visualization
- Develop a visualization in a sandbox
- Embed and extend PDI functionality
- Embed Pentaho Data Integration
- Embed Pentaho Server functionality into web applications
- Embed reporting functionality
- Embed the reporting engine into a Java application
- Extend Pentaho Analyzer with custom visualizations
- Extend Pentaho Data Integration
- JAR reference
- Moving to Visualization API 3.0
- Moving to Visualization API 3.0 in Analyzer
- Multi-tenancy
- OSGi artifacts deployment
- Other embedding scenarios
- PDF and Excel export customizations
- Pentaho web package
- Platform JavaScript APIs
- Register the created JavaScript files with Pentaho Analyzer
- Register the visualization with Pentaho Analyzer
- Register the visualization with Pentaho Visualization API
- Restart the Pentaho Server and test the visualization
- Sample 0: The base class
- Sample 1: Static report definition, JDBC input, PDF output
- Sample 2: Static report definition, JDBC input, HTML output
- Sample 3: Dynamically generated, JDBC input, swing output
- Sample 4: Dynamically generated, JDBC input, Java servlet output
- Source code links
- Stock color palettes identifiers
- Stock visualizations identifiers
- Visualization API