Data Integration and Analytics 9.4
- What's new in Pentaho 9.4
- The Pentaho 9.4 Enterprise Edition delivers a variety of new features and enhancements, including the Thin Kettle engine, Pentaho User Experience improvements, Pentaho Analyzer visualization configurations, business analytics server performance improvements, and MongoDB plugin improvements. Pentaho 9.4 also continues to enhance the Pentaho business analytics experience.
- Products
- Pentaho products are a comprehensive platform used to access, integrate, manipulate, visualize, and analyze your data. Whether data is stored in a flat file, relational database, Hadoop cluster, NoSQL database, analytic database, social media stream, operational store, or in the cloud, Pentaho products can help you discover, analyze, and visualize data to find the answers you need, even if you have no coding experience. Advanced users with programming experience can use our extensive API to customize reports, queries, and transformations to extend functionality; a brief report-export sketch appears after this section's topic list.
- Abort
- Activate CDE
- Adapt Mondrian schemas to work with Analyzer
- Add a chart
- Add a Checksum
- Add a JNDI data source
- Add a MongoDB data source
- Add notes to transformations and jobs
- Add query parameters to Analyzer reports
- Add report elements
- Add sequence
- Amazon EMR Job Executor
- Amazon Hive Job Executor
- AMQP Consumer
- AMQP Producer
- Analyze your transformation results
- Applying conditional formatting to measures
- Apply formatting to report elements
- Apply metadata properties and concepts
- Attributes reference
- Avro Input
- Avro Output
- Build a business view
- Bulk load into Amazon Redshift
- Bulk load into Azure SQL DB
- Bulk load into Snowflake
- Calculator
- Cassandra Input
- Cassandra Output
- CDE advanced solutions
- CDE dashboard overview
- CDE quick start guide
- Chart Options for Analyzer reports
- Common Formats
- Connecting to Virtual File Systems
- Connect to a data source
- Contribute additional step and job entry analyzers to the Pentaho Metaverse
- Copybook Input
- Copybook steps in PDI
- CouchDB Input
- Create advanced filters in Interactive Reports
- Create a chart
- Create a comparison filter on a numeric level
- Create a CSV data source
- Create a dashboard that uses a streaming service as a data source
- Create a database table source
- Create a domain
- Create a filter on measure values
- Create a SQL query data source
- Create a string filter on a level
- Create date range filters
- Create Pentaho Dashboard Designer templates
- Create queries
- Create Report Design Wizard templates
- Create Snowflake warehouse
- Creating a business model
- CSV File Input
- CTools
- Customize an Interactive Report
- Data Integration perspective in the PDI client
- Data lineage
- Data Source Model Editor
- Data Source Wizard
- Data types
- Defining hyperlinks
- Delete
- Delete Snowflake warehouse
- Edit multidimensional data source models
- ElasticSearch Bulk Insert (deprecated)
- Elasticsearch REST Bulk Insert
- ETL metadata injection
- Execute Row SQL Script
- Execute SQL Script
- Export an Analyzer report through a URL
- File Exists (Job Entry)
- File exists (Step)
- Filter functions
- Formulas and functions
- Function reference
- Get records from stream
- Get rows from result
- Get System Info
- Google BigQuery Loader
- Group and Filter Data in Interactive Reports
- Group By
- Hadoop Copy Files
- Hadoop File Input
- Hadoop File Output
- HBase Input
- HBase Output
- HBase row decoder
- Hide and unhide fields
- Inspect your data
- Java filter
- JMS Consumer
- JMS Producer
- Job (job entry)
- Job entry reference
- Job Executor
- JSON Input
- Kafka Consumer
- Kafka Producer
- Kinesis Consumer
- Kinesis Producer
- Learn about the PDI client
- Link an Analyzer report
- Link a report from Report Designer
- Link columns in a data table to other dashboard panels
- Localization and internationalization of analysis schemas
- Localize a report
- Logging and performance monitoring
- Manage international locales
- Manage Users and Roles in PUC
- Mapping
- MapReduce Input
- MapReduce Output
- Memory Group By
- Merge rows (diff)
- Metadata discovery
- Metadata properties reference
- Metadata security
- Microsoft Excel Input
- Microsoft Excel Output
- Microsoft Excel Writer
- Modified Java Script Value
- Modify charts
- Modify Snowflake warehouse
- Mondrian Input
- MongoDB Execute
- MongoDB Input
- MongoDB Output
- MQL formula syntax
- MQTT Consumer
- MQTT Producer
- Optimize a Pentaho Data Service
- ORC Input
- ORC Output
- Other prompt types
- Output parameterization
- Parquet Input
- Parquet Output
- Partitioning data
- PDI and Hitachi Content Platform (HCP)
- PDI and Snowflake
- PDI run modifiers
- Pentaho Aggregation Designer
- Pentaho Analyzer
- Pentaho Dashboard Designer
- Pentaho Data Integration
- Pentaho Data Services
- Pentaho Data Service SQL support reference and other development considerations
- Pentaho Interactive Reports
- Pentaho MapReduce
- Pentaho Metadata Editor
- Pentaho metadata formulas
- Pentaho Reporting Output
- Pentaho Report Designer
- Pentaho Schema Workbench
- Pentaho User Console
- Perform calculations
- Publish a domain to the Pentaho Server
- Publish a report
- Python Executor
- Quartz cron attributes
- Query HCP
- Query metadata from a database
- Read metadata from Copybook
- Read metadata from HCP
- Regex Evaluation
- Replace in String
- Report Designer configuration files
- REST Client
- Row Denormaliser
- Row Flattener
- Row Normaliser
- Run files in background
- S3 CSV Input
- S3 File Output
- Salesforce Delete
- Salesforce Input
- Salesforce Insert
- Salesforce Update
- Salesforce Upsert
- Schedule perspective in the PDI client
- Schedule Reports
- Secure SQL filter function access
- Select Values
- Set Analyzer report options
- Set dashboard parameters
- Set Field Value
- Set Field Value to a Constant
- Set Up a Carte Cluster
- Simple Mapping (sub-transformation)
- Single Threader
- Sort rows
- Spark Submit
- Split Fields
- Splunk Input
- Splunk Output
- SSTable Output
- Start Snowflake warehouse
- Steps supporting metadata injection
- Stop Snowflake warehouse
- Streaming analytics
- Strings cut
- String Operations
- Style properties reference
- Supported functions and operators
- Switch-Case
- Table Input
- Table Output
- Text File Input
- Text File Output
- Tour the Report Designer interface
- Transactional databases and job rollback
- Transformation (job entry)
- Transformation Executor
- Transformation step reference
- Understanding PDI data types and field metadata
- Unique Rows
- Unique Rows (HashSet)
- User Defined Java Class
- Use a Pentaho Repository in PDI
- Use calculated measures in Analyzer reports
- Use Carte Clusters
- Use checkpoints to restart jobs
- Use Command Line Tools to Run Transformations and Jobs
- Use content linking to create interactive dashboards
- Use data tables in a dashboard
- Use filters to explore your data
- Use Pentaho Repository access control
- Use prompts on dashboards
- Use the Database Explorer
- Use the Job menu
- Use the Pentaho Marketplace to manage plugins
- Use the Repository Explorer
- Use the SQL Editor
- Use the Transformation menu
- Use version history
- Using Pan and Kitchen with a Hadoop cluster
- Variables
- VFS properties
- Visualizations for Analyzer
- Visualization types
- Web services steps
- Working with Analyzer fields
- Working with Analyzer measures
- Work with jobs
- Work with transformations
- Write metadata to HCP
- XML Input Stream (StAX)
- XML Output
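As a taste of the API access mentioned in the Products overview above, here is a minimal sketch of exporting a Report Designer report (.prpt) to PDF with the Pentaho Reporting classic engine. It is an illustration only: the file paths are placeholders, the reporting engine libraries are assumed to be on the classpath, and it assumes a recent engine version that registers its default resource loaders when the ResourceManager is constructed.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.pentaho.reporting.engine.classic.core.ClassicEngineBoot;
import org.pentaho.reporting.engine.classic.core.MasterReport;
import org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.PdfReportUtil;
import org.pentaho.reporting.libraries.resourceloader.Resource;
import org.pentaho.reporting.libraries.resourceloader.ResourceManager;

public class ExportReportToPdf {
  public static void main(String[] args) throws Exception {
    // Boot the reporting engine once per JVM before using any engine classes.
    ClassicEngineBoot.getInstance().start();

    // Load a report definition created in Report Designer (path is a placeholder).
    ResourceManager resourceManager = new ResourceManager();
    Resource resource = resourceManager.createDirectly(
        new File("/path/to/report.prpt"), MasterReport.class);
    MasterReport report = (MasterReport) resource.getResource();

    // Render the loaded report to a PDF file.
    try (OutputStream out = new FileOutputStream("report.pdf")) {
      PdfReportUtil.createPdf(report, out);
    }
  }
}
```

Other output formats (for example, HTML or Excel) follow the same load-then-render pattern with their respective output utilities.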
- Setup
- Setting up Pentaho products includes installation, configuration, administration, and, if needed, upgrading to a current version of Pentaho. This section also lists the components and technical requirements for installing Pentaho.
- About Hadoop
- About Pentaho business analytics tools
- About Pentaho Report Designer
- About Pentaho workflows
- Adding JBoss logging
- Add a chart to your report
- Add parameters to your report
- Advanced settings for connecting to an Amazon EMR cluster
- Advanced settings for connecting to Azure HDInsight
- Advanced settings for connecting to a Cloudera cluster
- Advanced settings for connecting to a Hortonworks cluster
- Advanced settings for connecting to Cloudera Data Platform
- Advanced settings for connecting to Google Dataproc
- AES security
- Analysis issues
- Archive installation
- Assign permissions to use or manage database connections
- Backup and restore Pentaho repositories
- Big Data issues
- Big data resources
- Big data security
- Business Analytics Operations Mart
- Change the Java VM memory limits
- Command line arguments reference
- Commonly-used PDI steps and entries
- Compatibility issues running Pentaho on Java 11 with your Hadoop cluster
- Components Reference
- Configure and start the Pentaho Server after manual installation
- Configure the design tools and utilities
- Configure the Pentaho Server
- Connect to an Azure SQL database
- Connect to the Pentaho Repository from the PDI client
- Create a report with Report Designer
- Customize the Pentaho Server
- Data integration issues
- Data Integration Operations Mart
- Data Integration Operations Mart Reference
- Define data connections
- Define JDBC or OCI connections for BA design tools
- Define JNDI connections for Report Designer and Metadata Editor
- Define security for the Pentaho Server
- Design your report
- Develop your BA environment
- Develop your PDI solution
- Docker command tool property and registry files
- Docker container deployment of Pentaho
- Evaluate and learn Pentaho Business Analytics
- Evaluate and learn Pentaho Data Integration (PDI)
- General issues
- Getting Started with Analyzer, Interactive Reports, and Dashboard Designer
- Getting Started with PDI
- Getting started with PDI and Hadoop
- Getting started with Report Designer
- Get started with Analyzer Reports
- Get started with Dashboard Designer
- Get started with Interactive Reports
- Get started with Pentaho Reporting tools
- Google BigQuery
- Go live for production - BA
- Go Live for production - DI
- Hiding user folders in PUC and PDI
- Import and export PDI content
- Increase the PDI client memory limit
- Increase the Pentaho Server memory limit
- Installation and upgrade issues
- Installation of the Pentaho design tools
- Install drivers with the JDBC distribution tool
- Install the BA design tools
- Install the PDI tools and plugins
- Jackrabbit repository performance tuning
- JDBC drivers reference
- JDBC security
- Karaf performance tuning
- LDAP security
- Localize Folders and Reports
- Maintain logging
- Manage Pentaho licenses
- Manage the Pentaho Repository
- Manage the Pentaho Server
- Manage users and roles in the PDI client
- Manual and advanced secure impersonation configuration
- Manual installation
- Metadata issues
- Mondrian performance tips
- Monitoring system performance
- More about row banding, data formatting, and alignment
- MSAD security
- Next steps
- PDI job tutorial
- PDI logging
- Pentaho, big data, and Hadoop
- Pentaho administration
- Pentaho Business Analytics workflow
- Pentaho configuration
- Pentaho Data Integration (PDI) tutorial
- Pentaho Data Integration performance tips
- Pentaho Data Integration workflows
- Pentaho data mining (Weka) performance tips
- Pentaho evaluation
- Pentaho installation
- Pentaho Reporting performance tips
- Pentaho Repository issues
- Pentaho Server issues
- Pentaho Server performance tips
- Pentaho Server security
- Pentaho upgrade
- Performance tuning
- Post-upgrade tasks
- Prepare JBoss connections and web app servers
- Prepare your Linux environment for an archive install
- Prepare your Linux environment for a manual installation
- Prepare your Windows environment for an archive install
- Prepare your Windows environment for a manual installation
- Publish your report
- Purge transformations, jobs, and shared objects from the Pentaho Repository
- Quick tour of the Pentaho User Console (PUC)
- Refine your report
- Report Designer and Reporting engine issues
- Restoring a Pentaho Upgrade Installer backup
- SAML security
- Security Issues
- Setting up DI Operations Mart with an archive installation
- Setting up password encryption after upgrading
- Setting up the DI Operations Mart with a manual installation
- Set PDI version control and comment tracking options
- Set up a cluster
- Set up JNDI connections for the Pentaho Server
- Set Up Kerberos for Pentaho
- Set up native (JDBC) or OCI data connections for the Pentaho Server
- Set up the Pentaho Server to connect to a Hadoop cluster
- Specify data connections for BA design tools
- Specify data connections for the Pentaho Server
- Spring security
- SSL Security
- SSO security
- Starting the Pentaho Server after an archive installation
- Start and stop BA design tools
- Start and stop PDI design tools and utilities
- Start and stop the Pentaho Server for configuration
- Support statement for Analyzer on Impala
- Third-party monitoring with SNMP
- Tracking access to sensitive data using Pentaho logging tools
- Troubleshooting
- Tutorials
- Upload and download from the Pentaho Repository
- User security
- Use Kerberos with MongoDB
- Use Kerberos with Spark Submit
- Use Knox to access Hortonworks
- Use MS SQL Server as your repository database (Archive installation)
- Use MS SQL Server as your repository database (Manual installation)
- Use MySQL as your repository database (Archive installation)
- Use MySQL as your repository database (Manual installation)
- Use Oracle as Your Repository Database (Archive installation)
- Use Oracle as your repository database (Manual installation)
- Use password encryption with Pentaho
- Use PostgreSQL as Your Repository Database (Archive installation)
- Use PostgreSQL as your repository database (Manual installation)
- Using Oozie
- Using the Pentaho Upgrade Installer in console or silent mode
- Using the Pentaho Upgrade Installer in silent mode
- Verification checklist for JBoss connection tasks
- Work with data
- You can refine your Pentaho relational metadata and multidimensional Mondrian data models. You can also learn how to work with big data. A brief MDX query sketch appears after this section's topic list.
- About Multidimensional Expression Language
- Adding a new driver
- AggExclude
- AggFactCount
- AggForeignKey
- AggIgnoreColumn
- AggLevel
- AggMeasure
- AggName
- AggPattern
- AggTable
- Analysis schema security
- App Builder and Community Dashboard Editor
- App endpoints for SDR forms
- Big data resources
- Building blocks for the SDR
- Cache Configuration Files
- CalculatedMember
- CalculatedMemberProperty
- CaptionExpression
- Clean up the All Requests Processed list
- Closure
- ColumnDef
- ColumnDefs
- Configure KTR files for your environment
- Configure Mondrian engine
- Connecting to a Hadoop cluster with the PDI client
- Copy files to a Hadoop YARN cluster
- Creating attributes
- Creating link dimensions
- Creating measures on stream fields
- Cube
- CubeGrant
- CubeUsage
- CubeUsages
- Dimension
- DimensionGrant
- DimensionUsage
- Formula
- Hadoop connection and access information list
- Hierarchy
- HierarchyGrant
- How to use the SDR sample form
- InlineTable
- Installing and configuring the SDR sample
- Install and configure the Streamlined Data Refinery
- Install the Vertica JDBC driver
- Join
- KeyExpression
- Level
- Manage Hadoop configurations through PDI
- Measure
- MeasureExpression
- MemberGrant
- Memcached Configuration Options
- Modify the JGroups configuration
- Mondrian cache control
- Mondrian role mapping in the Pentaho Server
- Mondrian Schema Element Reference
- Multidimensional Data Modeling in Pentaho
- NamedSet
- NameExpression
- OLAP Log Output
- OrdinalExpression
- Parameter
- ParentExpression
- PDI big data job entries
- PDI big data transformation steps
- Property
- PropertyExpression
- Relational Data Modeling in Pentaho
- Restrict Access to Specific Members
- Restrict OLAP schemas per user
- Role
- RoleUsage
- Row
- Rows
- Schema
- SchemaGrant
- Segment cache architecture
- SQL
- Switch to another cache framework
- Switch to Memcached
- Switch to Pentaho Platform Delegating Cache
- Table
- Union
- UserDefinedFunction
- Use a Custom SegmentCache SPI
- Use Hadoop with Pentaho
- Use Hadoop with the SDR
- Use PDI outside and inside the Hadoop cluster
- Use the Build Model job entry for SDR
- Use the Streamlined Data Refinery
- Using Spark Submit
- Using the Annotate Stream step
- Using the Publish Model job entry for SDR
- Using the Shared Dimension step for SDR
- Value
- View
- VirtualCube
- VirtualCubeDimension
- VirtualCubeMeasure
- Work with the Streamlined Data Refinery
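To complement the Mondrian schema elements listed above, here is a minimal sketch of issuing an MDX query against a Mondrian schema through olap4j. It is an illustration only: the JDBC URL, catalog path, cube, measure, and dimension names are placeholders, and it assumes the Mondrian and olap4j jars plus the underlying JDBC driver are on the classpath.

```java
import java.io.PrintWriter;
import java.sql.Connection;
import java.sql.DriverManager;

import org.olap4j.CellSet;
import org.olap4j.OlapConnection;
import org.olap4j.OlapStatement;
import org.olap4j.layout.RectangularCellSetFormatter;

public class MondrianMdxQuery {
  public static void main(String[] args) throws Exception {
    // Register the Mondrian olap4j driver (older versions may not auto-register).
    Class.forName("mondrian.olap4j.MondrianOlap4jDriver");

    // Placeholder connect string: points Mondrian at a relational database
    // and at the schema XML file that defines the cubes.
    String url = "jdbc:mondrian:"
        + "Jdbc=jdbc:postgresql://localhost/foodmart;"
        + "JdbcDrivers=org.postgresql.Driver;"
        + "Catalog=file:/path/to/FoodMart.xml";

    try (Connection connection = DriverManager.getConnection(url)) {
      OlapConnection olapConnection = connection.unwrap(OlapConnection.class);
      OlapStatement statement = olapConnection.createStatement();

      // A simple MDX query; cube, measure, and member names are placeholders.
      CellSet cellSet = statement.executeOlapQuery(
          "SELECT {[Measures].[Unit Sales]} ON COLUMNS, "
          + "{[Product].Children} ON ROWS "
          + "FROM [Sales]");

      // Print the result grid to the console.
      PrintWriter pw = new PrintWriter(System.out);
      new RectangularCellSetFormatter(true).format(cellSet, pw);
      pw.flush();
    }
  }
}
```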
- Developer center
- Integrate and customize Pentaho products, and perform advanced development tasks. These sections are intended for software engineers who are familiar with programming concepts and have extensive programming experience. A brief PDI embedding sketch appears after this section's topic list.
- Additional resources
- Assign Analyzer chart colors
- Configuration API
- Configuring a visualization
- Create database plugins
- Create job entry plugins
- Create partitioner plugins
- Create PDI icons
- Create step plugins
- Create the Pentaho web package
- Customize PDI Data Explorer
- Customize Pentaho Analyzer
- Customize the Pentaho User Console
- Custom Analyzer action links to JavaScript functions
- Deploying a visualization
- Develop a visualization in a sandbox
- Embed and extend PDI functionality
- Embed Pentaho Data Integration
- Embed Pentaho Server functionality into web applications
- Embed reporting functionality
- Embed the reporting engine into a Java application
- Extend Pentaho Data Integration
- JAR reference
- Moving to Visualization API 3.0
- Moving to Visualization API 3.0 in Analyzer
- Multi-tenancy
- OSGi artifacts deployment
- Other embedding scenarios
- PDF and Excel export customizations
- Pentaho web package
- Platform JavaScript APIs
- Sample 0: The base class
- Sample 1: Static report definition, JDBC input, PDF output
- Sample 2: Static report definition, JDBC input, HTML output
- Sample 3: Dynamically generated, JDBC input, swing output
- Sample 4: Dynamically generated, JDBC input, Java servlet output
- Source code links
- Stock color palettes identifiers
- Stock visualizations identifiers
- Visualization API
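As a companion to the embedding topics above (for example, Embed Pentaho Data Integration), here is a minimal sketch of running an existing transformation from Java with the PDI (Kettle) API. It assumes the PDI core libraries are on the classpath; the .ktr path is a placeholder, and production code would normally add logging, parameters, and error handling.

```java
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransMeta;

public class RunTransformation {
  public static void main(String[] args) throws Exception {
    // Initialize the Kettle environment (plugin registry, logging) once per JVM.
    KettleEnvironment.init();

    // Load a transformation definition saved from the PDI client (path is a placeholder).
    TransMeta transMeta = new TransMeta("/path/to/sample.ktr");

    // Execute the transformation and wait for all step threads to finish.
    Trans trans = new Trans(transMeta);
    trans.execute(null);
    trans.waitUntilFinished();

    if (trans.getErrors() > 0) {
      throw new IllegalStateException("Transformation finished with errors");
    }
  }
}
```

The same pattern applies to jobs via JobMeta and Job; transformations stored in a Pentaho Repository are loaded through a repository connection instead of a file path.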