Create the Pentaho User
Create a pentaho user account that has administrative privileges. You will use this account to complete the rest of the installation instructions.
- Create an administrative user on the computer that will host the DI Server and name it pentaho.
- Verify that you have the appropriate permissions to read, write, and execute commands in the pentaho user's home directory.
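On a Linux host, the two steps above might look like the following sketch. The account-creation commands are shown as comments because they need root privileges and differ between distributions; the sudo group name is an assumption (some systems use wheel instead). The check_rwx function is a hypothetical helper written for this example, not a Pentaho tool.

```shell
# Run once as root to create the account (group name is an assumption):
#   sudo useradd -m -s /bin/bash pentaho
#   sudo usermod -aG sudo pentaho

# Hypothetical helper: confirm that the current user can read, write,
# and enter (execute) the given directory.
check_rwx() {
  dir="$1"
  if [ -r "$dir" ] && [ -w "$dir" ] && [ -x "$dir" ]; then
    echo "ok: $dir"
  else
    echo "insufficient permissions: $dir" >&2
    return 1
  fi
}
```

When you are logged in as the pentaho user, running check_rwx "$HOME" reports whether the home directory has the permissions this guide relies on.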
Install the DI Repository Host Database
The DI Repository houses data needed for Pentaho tools to provide scheduling and security functions. The repository also stores metadata and models for reports that you create. You can choose to host the DI Repository on these databases.
- MS SQL Server
To install the DI Repository's host database, complete the following steps.
- Check the Supported Technologies section to determine which versions of the databases Pentaho supports.
- Download and install the database of your choice.
- Verify that the DI Repository database is installed correctly.
Download and Unpack Installation Files
The Pentaho DI Server software, data files, and examples are stored in pre-packaged .zip files. You will need to manually copy these files to the correct directories.
There are two types of Pentaho DI installations:
- The DI-Server Only installation contains only the DI server and its supporting file structure.
- The Archive Build contains the DI Server along with the PDI design tools, plugins, and utilities. For more information, see Install DI Tools and Plugins.
Make sure your users can write to the directory where you install the Pentaho suite.
- Download either the DI-Server Only or Archive Build installation file from the Pentaho Customer Support Portal.
- On the Pentaho Customer Support Portal home page, sign in using the Pentaho support user name and password provided in your Pentaho Welcome Packet.
- Click Downloads, then click Pentaho Data Integration 5.4.1 GA in the 5.x list.
- On the bottom of the Pentaho 5.4.1 GA Release page, click the Data-Integration-Server folder in the Box widget.
- Click one of the following folders and then select the file that corresponds to your chosen installation method:
- DI Server Only > pdi-ee-server-5.0.0-dist.zip
- Archive Build > pdi-ee-5.0.0-dist.zip
- Unzip the DI Server Installation file.
- To unpack the file, run install.sh.
If you are unpacking the file in a non-graphical environment, open a Terminal or Command Prompt window, type the following command, and follow the instructions presented in the window:
java -jar installer.jar -console
- In the IzPack window, read the license agreement, select I accept the terms of this license agreement, and then click Next.
- In the Select the installation path text box, you have a choice of:
- Entering a new directory file path
- Browsing to an existing directory file path
- Accepting the default directory file path where you want to create or have already created the pentaho directory
If the target directory path includes a pentaho directory that does not exist yet, a message appears indicating that the target directory will be created. Click OK.
After entering your choice, click Next.
When the installation is complete, click Quit.
- Navigate to the target directory (created in Step 5) and create a server subdirectory.
- Move the data-integration-server directory into the server directory. In an Archive Build installation, the data-integration-server directory is located under the pdi-ee directory.
Optionally, if other design tools have been installed and a design-tools directory already exists, move the data-integration directory into the design-tools directory.
- When you are finished, ensure that the directory structures are as follows:
- pentaho/data-integration or pentaho/design-tools/data-integration (if design tools are already installed)
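The directory moves described above can be sketched as a small shell function. This is an illustration under stated assumptions, not an official script: move_into and arrange_di_dirs are hypothetical helper names, and looking for data-integration under pdi-ee in an Archive Build is a guess based on where data-integration-server resides.

```shell
# Hypothetical helper: move directory $2 into $1/$3. The directory is
# looked for directly under $1 and then under $1/pdi-ee (Archive Build).
move_into() {
  base="$1"; name="$2"; dest="$3"
  for src in "$base/$name" "$base/pdi-ee/$name"; do
    if [ -d "$src" ]; then
      mkdir -p "$base/$dest"
      mv "$src" "$base/$dest/"
      return 0
    fi
  done
  return 1
}

# Arrange the unpacked files under the target directory given as $1.
arrange_di_dirs() {
  base_dir="$1"
  # Create server/ and move data-integration-server into it.
  move_into "$base_dir" data-integration-server server || return 1
  # Optional step: only when a design-tools directory already exists.
  if [ -d "$base_dir/design-tools" ]; then
    move_into "$base_dir" data-integration design-tools
  fi
}
```

For example, if you accepted /home/pentaho/pentaho as the installation path, you would call arrange_di_dirs /home/pentaho/pentaho and then inspect the result with ls.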
Set Environment Variables
Set the PENTAHO_JAVA_HOME and PENTAHO_INSTALLED_LICENSE_PATH environment variables. If you do not set these variables, Pentaho will not start correctly.
If you are using a JRE, set the JRE_HOME environment variable as well.
- Set the PENTAHO_JAVA_HOME variable to the path of your Java installation.
- Set the PENTAHO_INSTALLED_LICENSE_PATH variable to the path of the installed licenses.
- Log out and log back in, then verify that the variables have been set properly.
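As an illustration of the steps above, the following lines could be added to a file that is read at login, such as the pentaho user's ~/.profile (the choice of file is an assumption). Both paths are examples only; substitute the actual locations of your Java installation and license file.

```shell
# Example paths only; replace them with the locations on your system.
export PENTAHO_JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PENTAHO_INSTALLED_LICENSE_PATH=/home/pentaho/.pentaho/.installedLicenses.xml

# After logging out and back in, print both variables to confirm
# that they are set.
echo "PENTAHO_JAVA_HOME=$PENTAHO_JAVA_HOME"
echo "PENTAHO_INSTALLED_LICENSE_PATH=$PENTAHO_INSTALLED_LICENSE_PATH"
```

If either echo line prints an empty value after you log back in, the file you edited is probably not being read by your login shell.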
Advanced Linux and Mac Topics
Prepare a Headless Linux or Solaris Server
There are two headless server scenarios that require special procedures on Linux and Solaris systems. One is for a system that has no video card; the other is for a system that has a video card but does not have an X server installed. In some situations, particularly if your server doesn't have a video card, you will have to perform both procedures to properly generate reports with the DI Server.
Systems without video cards
The java.awt.headless option enables systems without video output and/or human input hardware to execute operations that require them. To set this application server option when the DI Server starts, you will need to modify the startup scripts for either the DI Server or your Java application server. You do not need to do this now, but you will near the end of these instructions when you perform the Start DI Server step. For now, add the following item to the list of CATALINA_OPTS parameters: -Djava.awt.headless=true.
The entire line should look something like this:
export CATALINA_OPTS="-Djava.awt.headless=true -Xms4096m -Xmx6144m -XX:MaxPermSize=256m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000"
If you intend to create a DI Server service control script, you must add this parameter to that script's CATALINA_OPTS line.
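If you maintain your own startup or service control script, one way to make sure the flag is present without adding it twice is a small guard like the sketch below. The function name add_headless_opt is hypothetical; it is not part of the DI Server or Tomcat scripts.

```shell
# Hypothetical helper: prepend -Djava.awt.headless=true to CATALINA_OPTS
# unless the flag is already present, then export the result.
add_headless_opt() {
  case " ${CATALINA_OPTS:-} " in
    *" -Djava.awt.headless=true "*)
      ;;  # flag already set, nothing to do
    *)
      CATALINA_OPTS="-Djava.awt.headless=true${CATALINA_OPTS:+ $CATALINA_OPTS}"
      ;;
  esac
  export CATALINA_OPTS
}
```

Calling add_headless_opt after the script sets its memory options leaves those options in place and adds the headless flag in front of them.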
Systems without X11
To generate charts, the Pentaho Reporting engine requires functionality found in X11. If you are unwilling or unable to install an X server, you can install the xvfb package instead. xvfb provides X11 framebuffer emulation, which performs all graphical operations in memory instead of sending them to the screen.
Use your operating system's package manager to properly install xvfb.
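The following sketch shows what installing and using xvfb commonly looks like. The package names are those used by Debian-based and Red Hat-based distributions; starting the Xvfb process and exporting DISPLAY before the server starts is general X11 practice rather than a step taken from this guide. The helper name, the display number 99, and the screen geometry are arbitrary assumptions.

```shell
# Install (pick the line that matches your distribution):
#   sudo apt-get install xvfb                 # Debian, Ubuntu
#   sudo yum install xorg-x11-server-Xvfb     # RHEL, CentOS

# Hypothetical helper: start an in-memory X display and point DISPLAY
# at it. The display number defaults to 99.
start_virtual_display() {
  display_num="${1:-99}"
  if ! command -v Xvfb >/dev/null 2>&1; then
    echo "Xvfb is not installed" >&2
    return 1
  fi
  # 1024x768 at 24-bit colour depth is an arbitrary choice.
  Xvfb ":$display_num" -screen 0 1024x768x24 >/dev/null 2>&1 &
  export DISPLAY=":$display_num"
}
```

A startup script could call start_virtual_display 99 before launching the DI Server so that chart rendering finds an X display.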
Adjust Amount of Memory Mac OS Allocates for PostgreSQL
If you plan to install the software on Mac OS and you choose to use PostgreSQL, you need to increase the amount of memory that Mac OS allocates for PostgreSQL. You can skip these instructions if you plan to install the software on Windows or Linux.
PostgreSQL is the name of the default database that contains audit, schedule and other data that you create. PostgreSQL starts successfully only if your computer has allocated enough memory. Go to http://www.postgresql.org/docs/devel/static/kernel-resources.html and follow the instructions there on how to adjust the memory settings on your computer.
You've finished preparing your environment. Go to Configure Your Repository Database to continue.