Hitachi Vantara Lumada and Pentaho Documentation

Set Up Worker Nodes on the Pentaho Server

Now that you have installed the Pentaho Worker Nodes product on a single instance of HCI, you are ready to enable Pentaho Worker Nodes on your Pentaho Server and configure it to run work items. In addition, you can enable secure communication between the Pentaho Server and the worker nodes.

Configure the Pentaho Worker Nodes

After you enable Worker Nodes, you must perform several configuration tasks.

Step One: Edit the Karaf Feature Boot 

After starting HCI, you will need to configure the Pentaho Server to delegate PDI transformations and jobs to the Pentaho Worker Nodes within the HCI instance. 

Perform the following steps in the Karaf environment:

  1. If you have not done so already, stop the Pentaho Server.
  2. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the org.apache.karaf.features.cfg file with any text editor.
  3. Locate the featuresBoot property and add the pentaho-worker-nodes-ee feature.
# Comma separated list of features to install at startup 
#
featuresBoot=config,management,kar,cxf,camel,camel-blueprint,camel-stream,pentaho-camel-jms,pentaho-server,pentaho-monitoring-to-snmp,pentaho-metaverse,pdi-dataservice,pdi-data-refinery,pentaho-big-data-ee-plugin-osgi-obf,pdi-engine-configuration,pentaho-worker-nodes-ee
  4. Save and close the org.apache.karaf.features.cfg file.
  5. Start the Pentaho Server.
  6. Verify that the new configuration file pentaho.worker.nodes.cfg has been created in the following location: pentaho-server/pentaho-solutions/system/karaf/etc/

The Karaf OSGi framework applies configuration changes to the environment in real time, as soon as the configuration file is saved.
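The featuresBoot edit described in the steps above can be sketched in Python. This is a minimal illustration, not a Pentaho tool: the function name add_boot_feature is hypothetical, and it assumes the featuresBoot property sits on a single line, as in the sample shown earlier. A file that splits the list across lines with backslash continuations would need extra handling.

```python
def add_boot_feature(cfg_text: str, feature: str) -> str:
    """Return cfg_text with `feature` appended to featuresBoot, if missing.

    Assumption: featuresBoot is written on one line as key=value.
    """
    out = []
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        # Match only the exact, uncommented featuresBoot key.
        if sep and key.strip() == "featuresBoot":
            features = [f.strip() for f in value.split(",") if f.strip()]
            if feature not in features:
                features.append(feature)
            line = "featuresBoot=" + ",".join(features)
        out.append(line)
    return "\n".join(out) + "\n"


sample = "# Comma separated list of features\nfeaturesBoot=config,management,kar\n"
updated = add_boot_feature(sample, "pentaho-worker-nodes-ee")
print(updated)
```

Running the function a second time on its own output leaves the text unchanged, so the edit is safe to repeat.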

Step Two: Edit the web.xml File

To operate successfully, Pentaho Worker Nodes must gain access to the Pentaho repository through the Pentaho Server. To grant the worker nodes access to the repository, perform the following steps.

  1. Stop the Pentaho Server.
  2. Navigate to the pentaho-server/tomcat/webapps/pentaho/WEB-INF directory and open the web.xml file with a text editor.

Specify the IP addresses of your HCI instance as trusted IP addresses by the Pentaho Server.

  3. Navigate to the following location in the web.xml file:
<filter-name>Proxy Trusting Filter</filter-name>
   <filter-class>org.pentaho.platform.web.http.filters.ProxyTrustingFilter</filter-class>
  4. Directly below this location, find the comma-separated list of trusted IP addresses: TrustedIpAddrs
  5. Add the IP addresses of your HCI instances here.

Now any requests originating from those IP addresses will be "trusted" by the Pentaho Server for specific actions and will not require credentials to be passed along.

Specify which actions are allowed for requests originating from those IP addresses. 

  6. Navigate to the other Proxy Trusting Filters, for example:
<filter-mapping>
   <filter-name>Proxy Trusting Filter</filter-name>
   <url-pattern>/i18n</url-pattern>
</filter-mapping>
  7. Below this location, add new filter mappings for 'webservices' and 'authentication-provider', as follows:
<filter-mapping>
   <filter-name>Proxy Trusting Filter</filter-name>
   <url-pattern>/webservices/*</url-pattern>
</filter-mapping>

<filter-mapping>
   <filter-name>Proxy Trusting Filter</filter-name>
   <url-pattern>/api/system/authentication-provider</url-pattern>
</filter-mapping>
  8. Save and close the web.xml file.
  9. Start the Pentaho Server to implement your changes.
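One way to confirm that the two new filter mappings are in place is to parse the file and list any that are still missing. The Python sketch below is only an illustration under stated assumptions: the function name missing_trusted_patterns is hypothetical, and the sample is a simplified fragment without an XML namespace. If your web.xml declares a default namespace on its root element, the tag names in the lookups would need that namespace added.

```python
import xml.etree.ElementTree as ET

# URL patterns that the steps above add for the Proxy Trusting Filter.
REQUIRED_PATTERNS = ["/webservices/*", "/api/system/authentication-provider"]


def missing_trusted_patterns(web_xml: str,
                             filter_name: str = "Proxy Trusting Filter") -> list[str]:
    """Return the required URL patterns not yet mapped to filter_name."""
    root = ET.fromstring(web_xml)
    mapped = set()
    for mapping in root.iter("filter-mapping"):
        name = mapping.findtext("filter-name", default="").strip()
        pattern = mapping.findtext("url-pattern", default="").strip()
        if name == filter_name and pattern:
            mapped.add(pattern)
    return [p for p in REQUIRED_PATTERNS if p not in mapped]


SAMPLE = """<web-app>
  <filter-mapping>
    <filter-name>Proxy Trusting Filter</filter-name>
    <url-pattern>/i18n</url-pattern>
  </filter-mapping>
  <filter-mapping>
    <filter-name>Proxy Trusting Filter</filter-name>
    <url-pattern>/webservices/*</url-pattern>
  </filter-mapping>
</web-app>"""

print(missing_trusted_patterns(SAMPLE))
```

An empty result means both mappings are present for the named filter.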

Step Three: Edit the Pentaho Worker Nodes Configuration File

Once you have enabled Worker Nodes, you must configure the orchestration method, as well as the host and event logging connections for the Worker Nodes. This task assumes you are on the server or virtual machine for your Pentaho Server instance.

Perform the following steps in the configuration file:

  1. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the pentaho.worker.nodes.cfg file with any text editor.
  2. Perform the associated action for each property listed in the table below:

| Property | Action | Notes |
| --- | --- | --- |
| orchestrator | Change the value to 'pentaho scale'. | To return to standard Pentaho Server operation, set this property value to 'None'. |
| wn-hostname | Set the value to the IP address of the Content-Execution-Broker service running within your cluster. | To find the IP address of the Content-Execution-Broker service, browse to HCI Admin App > Monitoring > Services. Select the Content-Execution-Broker panel to view the Service Details window. |
| wn-port | Set the value to the port number of the Content-Execution-Broker service running within your cluster. The default value is '38080'. The recommended secure value is '38443'. | To find the port number of the Content-Execution-Broker service, browse to HCI Admin App > Monitoring > Services. Select the Content-Execution-Broker panel to view the Service Details window. |

  3. Save and close the pentaho.worker.nodes.cfg file.

After the Worker Nodes are configured, all transformations and jobs submitted to the Pentaho Server will be processed using Worker Nodes. To return to standard Pentaho Server operation, set the orchestrator property value to ‘None’.
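The property changes from the table can be applied with a small script instead of a text editor. The following Python sketch is an assumption-laden illustration: it presumes the file uses a plain key=value layout with unquoted values (typical for Karaf .cfg files, but not stated in this article), the function name set_properties is hypothetical, and 192.0.2.10 is a placeholder address, not a real Content-Execution-Broker host.

```python
def set_properties(cfg_text: str, updates: dict[str, str]) -> str:
    """Return cfg_text with each key in `updates` set to its new value.

    Existing uncommented keys are rewritten in place; keys that are not
    found are appended at the end. Comment lines are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in cfg_text.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and name in remaining:
            line = f"{name}={remaining.pop(name)}"
        out.append(line)
    for name, value in remaining.items():
        out.append(f"{name}={value}")
    return "\n".join(out) + "\n"


sample = "orchestrator=None\nwn-hostname=localhost\n"
print(set_properties(sample, {
    "orchestrator": "pentaho scale",
    "wn-hostname": "192.0.2.10",   # placeholder IP address
    "wn-port": "38080",            # default port named in the table
}))
```

Setting orchestrator back to None with the same function returns the server to standard operation, as described above.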

Configure Your PDI Job Parameters

Before you run work items on worker nodes, be sure that you have configured both the PDI job resources allocated to the Worker Nodes execution, and the PDI job parameters used in the execution of PDI jobs. 

Perform the following steps in the Pentaho Worker Nodes administration site:

  1. Open a web browser and enter https://<HCI-instance-IP-address>:8000.

HTTPS is the required protocol for site access, even for unsecured communications.

  2. In the Security Realm field, select the location where your user account is defined. To log on using the local admin account, select Local.
  3. In the Pentaho Worker Nodes > Home page, select System Configuration. From the main page that displays, select Jobs.

  4. In the Manage Jobs Wizard, select the PDI-Job panel and then click Next. The Configuration page displays.
  5. Click the Settings tab. Set the following parameters in the Container Options: Default section.
    • Container Memory: Set the value of the hard memory limit for the Docker container. Ships with a default value of '1024' MB.
    • CPU: Set the CPU usage for the worker node. Ships with a default value of '1.0'.
  6. Set the following parameters in the PDI Job Parameters section.

You must set the Pentaho Repository IP and the Pentaho Repository port parameters to enable communication between the Pentaho Server and the worker nodes.

| Parameter | Description | Value |
| --- | --- | --- |
| Pentaho Repository IP | Set the value to the IP address of the Pentaho repository. | You must set a valid IP address for this parameter to enable communication between the Pentaho Server and the worker nodes. |
| Pentaho Repository port | Set the value to the port number of the Pentaho repository. | You must set a valid port number for this parameter to enable communication between the Pentaho Server and the worker nodes. |
| JVM initial memory allocation (MB) | Set the value in megabytes of the PDI job's JVM memory allocation at startup. | Ships with a default of '256'. Since the JVM resides inside the container, this value should not exceed the value in the Container Memory parameter; otherwise, the JVM would try to allocate more memory than is available to it. |
| JVM maximum memory allocation (MB) | Set the value in megabytes of the PDI job's JVM maximum allowed memory allocation. | Ships with a default of '768'. Since the JVM resides inside the container, this value should not exceed the value in the Container Memory parameter; otherwise, the JVM would try to allocate more memory than is available to it. |

  7. Once all the values are set, select Update Job.
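The memory rule in the parameter descriptions (JVM values must not exceed Container Memory) can be checked before you enter the values. The Python sketch below is illustrative only; the function name check_jvm_memory is hypothetical. The extra check that the initial allocation does not exceed the maximum is not from this article but reflects general JVM behavior.

```python
def check_jvm_memory(container_mb: int, jvm_initial_mb: int,
                     jvm_max_mb: int) -> list[str]:
    """Return a list of problems with the proposed memory settings."""
    problems = []
    if jvm_initial_mb > container_mb:
        problems.append("JVM initial memory exceeds Container Memory")
    if jvm_max_mb > container_mb:
        problems.append("JVM maximum memory exceeds Container Memory")
    # General JVM rule (not stated in this article): initial <= maximum.
    if jvm_initial_mb > jvm_max_mb:
        problems.append("JVM initial memory exceeds JVM maximum memory")
    return problems


# Shipped defaults from this section: 1024 MB container, 256 MB initial, 768 MB maximum.
print(check_jvm_memory(1024, 256, 768))
# Lowering the container limit without lowering the JVM maximum is flagged.
print(check_jvm_memory(512, 256, 768))
```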

You should have set the Pentaho Repository parameters immediately after deploying the Pentaho Worker Nodes package. See Set Pentaho Repository Connection Information for instructions on setting the Pentaho Repository parameters.

Enable Secure Communication for Pentaho Worker Nodes

The following sections detail how to enable and establish a secure communication channel between the Pentaho Server and the Worker Nodes. 

Overview

SSL is enabled by default. For the Pentaho Server to communicate over SSL with the Content Execution Broker, you must first provide the required security certificate. The Content Execution Broker uses the certificate in the following location on the machine where HCI is installed:

 <install-path>/hci/data/com.pentaho.foundry.contentexecution.service/<guid>/ceb-truststore.pem

Note: If you want to use your own certificate instead of the provided ceb-truststore.pem certificate, copy your certificate to the above location and rename it as 'ceb-truststore.pem'.

To secure your Pentaho Worker Nodes configuration, you will need to perform the following tasks.

  1. Configure the Content Execution Broker
  2. Locate the HCI Security Certificate
  3. Add the Content Execution Broker's Certificate to a Truststore on the Pentaho Server
  4. Configure Pentaho Server Settings

Configure the Content Execution Broker

Before you run work items on worker nodes, you must configure the Content Execution Broker service running within your cluster. 

Perform the following steps in the Pentaho Worker Nodes (HCI) administration site:

  1. Open a web browser and enter https://<HCI-instance-IP-address>:8000.

HTTPS is the required protocol for site access, even for unsecured communications.

  2. In the Security Realm field, select the location where your user account is defined. To log on using the local admin account, select Local.
  3. In the Pentaho Worker Nodes > Home page, select System Configuration. From the main page that displays, select Services, then Manage Services.

  4. Select the Content-Execution-Broker panel and then click Next.
  5. In the Manage Services Wizard, select Configure and then click Next. The Configuration page displays.
  6. Click the Settings tab. Set the parameters in the Container Options: Default section.
    • Container Memory. Set the value of the hard memory limit for the Docker container. Ships with a default value of '512' MB.
    • CPU. Set the CPU usage for the worker node. Ships with a default value of '0.1'.
  7. Set the parameter in the Security Configuration section.
    • Enable Foundry Authentication. Turn on or off the requirement for Foundry authentication from external services seeking access to the Content Execution Broker. Options include:
      • Set to 'Yes' to turn on security and force external services to provide authentication credentials when accessing the Content Execution Broker.
      • Set to 'No' to turn off security such that external services can access the Content Execution Broker without providing credentials.

For secure communication between the Pentaho Server and the worker nodes, you must set the Enable Foundry Authentication parameter to a value of 'Yes'. When you turn on this parameter by setting it to 'Yes', external services are required to provide authentication to access the Content Execution Broker.

After the parameter is set and the service is restarted, any request sent to the Content Execution Broker through the defined REST API must include the Authentication-Header, accompanied by the credentials needed to authenticate the request. See Configure Your PDI Job Parameters.

  8. Set the parameter in the Job Configuration section.
    • Time Interval (minutes) between execution. Set the time interval, in minutes, to configure the cleaning scheduler to delete completed PDI jobs.
  9. Select Next and then click Update Service to save your changes.

Locate the HCI Security Certificate 

HCI comes with its own self-signed certificate, which is installed automatically during system installation. This certificate enables the authentication between the Pentaho Server and the Worker Nodes for SSL communication. 

Perform the following steps to confirm that the HCI system certificate is available before you add it to a truststore:

  1. Ensure that the HCI instance and the Pentaho Server are running.
    1. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory.
    2. Check that the pentaho.worker.nodes.cfg file exists in the directory. If the file is not found, see Configure Pentaho Server Settings.
  2. Ensure that the HCI installation contains the Content Execution Broker security certificate.
    1. Navigate to the <install-path>/hci/data/com.pentaho.foundry.contentexecution.service/<guid>/ directory.
    2. Check that the ceb-truststore.pem file exists in the directory.

Add the Content Execution Broker's Certificate to a Truststore on the Pentaho Server

The following is a suggested method for using a truststore to hold the security certificates for the system.

  1. Ensure that the HCI instance and the Pentaho Server are running. 
  2. Create a directory named 'Security' on the Pentaho Server host machine and change to this new 'Security' directory.
  3. Copy the provided ceb-truststore.pem certificate (or your own security certificate) from <install-path>/hci/data/com.pentaho.foundry.contentexecution.service/<guid>/ceb-truststore.pem and save it in the Security directory.
  4. Open a command line interface and enter the following command: 

<jre_home>/bin/keytool -importcert -file ceb-truststore.pem -alias <alias> -storepass <trust_store_password> -keystore <trust_store_name>

  5. When the command completes, verify that a file named "<trust_store_name>" was generated. Open this file with the "<trust_store_password>" to confirm that it contains the security certificate.
  6. Note the absolute path to the truststore file and its password. You will set both in the Pentaho Server's pentaho.worker.nodes.cfg file in the Configure Pentaho Server Settings section.
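If you script the import, the keytool call from step 4 can be assembled as an argument list and followed by a listing command to inspect the resulting truststore. The Python sketch below only builds and prints the commands; it does not run them. The function names and the example values (/opt/jre, ceb, truststore.jks, changeit) are hypothetical placeholders. The -list option is standard keytool usage and is not taken from this article.

```python
import shlex


def import_cert_command(jre_home: str, cert_file: str, alias: str,
                        store_path: str, store_password: str) -> list[str]:
    """Build the keytool import command shown in step 4."""
    return [
        f"{jre_home}/bin/keytool", "-importcert",
        "-file", cert_file,
        "-alias", alias,
        "-storepass", store_password,
        "-keystore", store_path,
    ]


def list_store_command(jre_home: str, store_path: str,
                       store_password: str) -> list[str]:
    """Build a keytool command that lists the truststore entries."""
    return [
        f"{jre_home}/bin/keytool", "-list",
        "-keystore", store_path,
        "-storepass", store_password,
    ]


# Placeholder values for illustration only.
cmd = import_cert_command("/opt/jre", "ceb-truststore.pem", "ceb",
                          "truststore.jks", "changeit")
print(shlex.join(cmd))
print(shlex.join(list_store_command("/opt/jre", "truststore.jks", "changeit")))
```

Passing either list to subprocess.run would execute it; keeping the arguments as a list avoids shell quoting problems with paths or passwords that contain spaces.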

Configure Pentaho Server Settings

These actions set the user name and password for Worker Nodes security and enable a secure communication channel.

Perform the following steps:

  1. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the pentaho.worker.nodes.cfg file with any text editor.
  2. Perform the associated action for each property listed in the table below:

| Property | Action | Notes |
| --- | --- | --- |
| wn-hostname | Set the value to the IP address used in the Network: External section of the Pentaho Worker Nodes Admin site > Home > Monitoring > Dashboard > Services > Service Details > Content-Execution-Broker page. | For more information, see Configure the Content Execution Broker. |
| wn-port | Set the value to the port number used in the PRIMARY_PORT field in the Network: External section of the Pentaho Worker Nodes Admin site > Home > Monitoring > Dashboard > Services > Service Details > Content-Execution-Broker page. The recommended value for secure communication is '38443'. | For more information, see Configure the Content Execution Broker. |
| wn-security-enabled | Set the value to 'true'. | |
| wn-username | Set the value to the user name you select when you log on to the Pentaho Worker Nodes administration site. | |
| wn-password | Set the value to the password you select when you log on to the Pentaho Worker Nodes administration site. | |
| wn-realm | Set the value to the security realm you select when you log on to the Pentaho Worker Nodes administration site. | |
| wn-use-https | Set the value to 'true'. | |
| wn-trust-store | Uncomment the setting (remove the leading # character) and set the value to the absolute path of the truststore file that contains the Content Execution Broker's certificate. | For more information, see Add the Content Execution Broker's Certificate to a Truststore on the Pentaho Server. |
| wn-trust-store-password | Uncomment the setting (remove the leading # character) and set the value to the truststore password. | For more information, see Add the Content Execution Broker's Certificate to a Truststore on the Pentaho Server. |

  3. Save and close the pentaho.worker.nodes.cfg file. Your file changes take effect immediately.
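Before relying on the secure channel, you may want to check that every property from the table is actually set. The Python sketch below is a rough illustration: the function names parse_cfg and unmet_secure_settings are hypothetical, the file is assumed to use unquoted key=value lines with # comments, and the sample values are placeholders.

```python
# Properties that the table requires to be 'true'.
REQUIRED_TRUE = ("wn-security-enabled", "wn-use-https")
# Properties that the table requires to have some value.
REQUIRED_SET = ("wn-hostname", "wn-port", "wn-username", "wn-password",
                "wn-realm", "wn-trust-store", "wn-trust-store-password")


def parse_cfg(text: str) -> dict[str, str]:
    """Parse key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def unmet_secure_settings(props: dict[str, str]) -> list[str]:
    """Return the property names that do not satisfy the table."""
    unmet = [k for k in REQUIRED_TRUE if props.get(k, "").lower() != "true"]
    unmet += [k for k in REQUIRED_SET if not props.get(k)]
    return unmet


sample = (
    "wn-hostname=192.0.2.10\n"      # placeholder IP address
    "wn-port=38443\n"
    "wn-security-enabled=true\n"
    "wn-use-https=false\n"
    "#wn-trust-store=/opt/security/truststore.jks\n"  # still commented out
    "wn-username=admin\n"
)
print(unmet_secure_settings(parse_cfg(sample)))
```

A commented-out wn-trust-store line is reported as unset, which matches the instruction to uncomment it.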

Run and Administer the Pentaho Worker Nodes Product

Use the following articles to assist you in running and administering Pentaho Worker Nodes: