Pentaho Configuration
This guide presents basic configuration tasks for the Pentaho Server, data connections, the Pentaho design tools, and Hadoop cluster connections so you can get started creating ETL solutions and data analytics. This guide assumes you have installed the Pentaho software.
Tools: You can perform these configuration tasks through the Pentaho User Console (PUC), through the Pentaho Data Integration (PDI) client, or by editing shell scripts and property files.
Login Credentials: A Pentaho administrator user name and password are required to perform configuration tasks through the user console.
These tasks are for IT and Pentaho administrators as described in the following definitions:
- An IT administrator installs, configures, and upgrades the Pentaho Server. An IT administrator knows where the data is stored, how to connect to it, details about the computing environment, and how to use the command line on Microsoft Windows or Linux.
- A Pentaho administrator creates and manages users and roles, and manages workstations so that ETL specialists and business analysts can create, publish, and share content.
Tasks to be Performed by an IT Administrator
As an IT administrator, you need to configure the Pentaho Server and define what security to use. If your team is working with Big Data, you will also need to set up a connection to a Hadoop cluster.
Configure the Pentaho Server
Basic server configuration tasks include starting and stopping the Pentaho Server, increasing the server's memory limit, and specifying data connections.
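To illustrate the memory-limit task, the following sketch rewrites a JVM maximum heap setting (`-Xmx`) in the text of a startup script. It assumes the server's Java heap is set by an `-Xmx` flag inside a script variable. The variable name `CATALINA_OPTS`, the sample values, and the function name `set_max_heap` are illustrative assumptions and do not come from this guide, so compare them against the script in your own installation first.

```python
import re


def set_max_heap(script_text: str, new_max: str) -> str:
    """Return script_text with every -Xmx value replaced by new_max (for example '8g')."""
    # Accept only digits with an optional k/m/g unit suffix, as the JVM does.
    if not re.fullmatch(r"\d+[kKmMgG]?", new_max):
        raise ValueError(f"not a valid JVM heap size: {new_max!r}")
    # Only -Xmx is touched; -Xms (the initial heap size) is left unchanged.
    return re.sub(r"-Xmx\d+[kKmMgG]?", f"-Xmx{new_max}", script_text)


# Hypothetical script line, used here only as sample input.
sample = 'CATALINA_OPTS="-Xms2048m -Xmx6144m -Dfile.encoding=utf8"'
print(set_max_heap(sample, "8g"))
# prints: CATALINA_OPTS="-Xms2048m -Xmx8g -Dfile.encoding=utf8"
```

Working on the script text rather than the file keeps the change easy to review before anything is written to disk. A change of this kind generally takes effect only after the server is stopped and started again.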
Define Security for the Pentaho Server
You also need to establish a security plan for your Pentaho system.
Set Up Pentaho to Connect to a Hadoop Cluster
If you are an IT administrator working with Big Data, you need to configure Pentaho to connect to a Hadoop cluster.
Set Up the Adaptive Execution Layer (AEL)
You can use AEL to run transformations on different execution engines, such as Spark.
Tasks to be Performed by a Pentaho Administrator
As a Pentaho administrator, you need to configure data connections, manage the Pentaho Server, and set up the Business Analytics (BA) or PDI design tools.
Define Data Connections
Use the Database Connection dialog box in PUC and PDI to define database connections.
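A database connection usually comes down to a few settings (host, port, and database name) that together identify the database. As a rough sketch of how such settings combine, the example below builds a JDBC URL for a PostgreSQL database. The class `ConnectionSettings`, its field names, and the sample values are hypothetical and are not the labels used in the dialog box. Other databases use different URL formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionSettings:
    """Hypothetical container for the basic settings of one database connection."""
    host: str
    port: int
    database: str


def postgresql_jdbc_url(settings: ConnectionSettings) -> str:
    """Build a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    return f"jdbc:postgresql://{settings.host}:{settings.port}/{settings.database}"


# Sample values for illustration only.
settings = ConnectionSettings(host="db.example.com", port=5432, database="sales")
print(postgresql_jdbc_url(settings))
# prints: jdbc:postgresql://db.example.com:5432/sales
```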
Assign Permissions to Use or Manage Database Connections
Specify which data to make visible to selected users and roles.
Manage Users and Roles
If you are using basic Pentaho Security, you may be responsible for creating and managing users and roles, including assigning the permissions that give users access to the content they need.
With PUC
Switch between user and role settings to add, delete, and edit users and roles in PUC.
With the PDI Client
Control users and roles in the Pentaho Repository with the PDI client.
Configure the Design Tools and Utilities
Before using the design tools and utilities, you need to configure each workstation that runs them.