Setting up Pentaho Worker Nodes
As part of setting up Pentaho Worker Nodes, consider how you want your system to operate. Pentaho Worker Nodes need access to Pentaho work items, such as PDI jobs and transformations, in order to execute them. You can configure Worker Nodes to access PDI jobs and transformations either through the Pentaho Repository or through the mounted volume.
Using the Pentaho Repository
In this configuration, Pentaho work items, such as PDI jobs and transformations, are accessed from the Pentaho Repository. You must have the Pentaho Server installed.
To connect to the repository, the repositories.xml file must be in the .kettle folder of the mounted volume. The repositories.xml file has the following format:
<repositories>
  <repository>
    <id>PentahoEnterpriseRepository</id>
    <name>Pentaho Repository Name</name>
    <description>Pentaho repository</description>
    <is_default>true</is_default>
    <repository_location_url><ip_pentaho_server>:8080/pentaho</repository_location_url>
    <version_comment_mandatory>N</version_comment_mandatory>
  </repository>
</repositories>
Pentaho Worker Nodes need read/write access to the Pentaho Repository. See Granting Worker Nodes access to Pentaho Repository for the required changes to the web.xml file in the Pentaho Server.
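The file format above can also be generated with a short script. The following is a minimal Python sketch, not part of the product: the function name build_repositories_xml is hypothetical, and the example URL (its IP address and the http scheme) is an assumption that you would replace with the address of your own Pentaho Server.

```python
import xml.etree.ElementTree as ET


def build_repositories_xml(name: str, location_url: str) -> str:
    """Return repositories.xml content with one default repository entry.

    Element names and fixed values follow the format shown above;
    `name` and `location_url` are supplied by the caller.
    """
    root = ET.Element("repositories")
    repo = ET.SubElement(root, "repository")
    fields = [
        ("id", "PentahoEnterpriseRepository"),
        ("name", name),
        ("description", "Pentaho repository"),
        ("is_default", "true"),
        ("repository_location_url", location_url),
        ("version_comment_mandatory", "N"),
    ]
    for tag, text in fields:
        ET.SubElement(repo, tag).text = text
    # ET.indent is only available in Python 3.9 and later; skip it otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Hypothetical values: replace with your repository name and server address.
xml_text = build_repositories_xml(
    "Pentaho Repository Name", "http://10.0.0.5:8080/pentaho"
)
print(xml_text)
```

To use the result, write xml_text to the repositories.xml file in the .kettle folder of your mounted volume.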
Using the mounted volume
In this configuration, place the Pentaho work items, such as PDI jobs and transformations, directly in the .kettle folder of the mounted volume.
Setting up the Pentaho Server to communicate with Pentaho Worker Nodes
After starting the Pentaho Worker Nodes cluster, configure the Pentaho Server to delegate PDI transformations and jobs (called work items) to the Content Execution Router service.
Enable Pentaho Worker Nodes
Procedure
If you have not done so already, stop the Pentaho Server.
Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the org.apache.karaf.features.cfg file with any text editor.
Locate the featuresBoot property and add the pentaho-worker-nodes-ee feature.
featuresBoot=config,management,kar,cxf,camel,camel-blueprint,camel-stream,pentaho-camel-jms,pentaho-server,pentaho-monitoring-to-snmp,pentaho-metaverse,pdi-dataservice,pdi-data-refinery,pentaho-big-data-ee-plugin-osgi-obf,pdi-engine-configuration,pentaho-worker-nodes-ee
Save and close the org.apache.karaf.features.cfg file.
Start the Pentaho Server.
Verify that the new configuration file pentaho.worker.nodes.cfg is created in the following location:
pentaho-server/pentaho-solutions/system/karaf/etc/
Note: The Karaf OSGi framework allows a user to make configuration changes, which are applied to the environment in real time when the configuration file is saved.
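If you prefer to script the featuresBoot change rather than edit the file by hand, the following Python sketch shows one way to do it. It is an illustration only: the function name add_boot_feature is hypothetical, and the code assumes that featuresBoot is written on a single line, as in the example above. If your file splits the value across lines with backslash continuations, edit it manually instead.

```python
def add_boot_feature(cfg_text: str, feature: str = "pentaho-worker-nodes-ee") -> str:
    """Append `feature` to the featuresBoot property unless it is already listed.

    Assumes featuresBoot occupies a single line (no backslash continuations).
    All other lines, including comments, are passed through unchanged.
    """
    out = []
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "featuresBoot":
            features = [f.strip() for f in value.split(",") if f.strip()]
            if feature not in features:
                features.append(feature)
            line = key + "=" + ",".join(features)
        out.append(line)
    return "\n".join(out) + "\n"


# Shortened sample content for illustration; a real file lists more features.
sample = "featuresBoot=config,management,kar\n"
print(add_boot_feature(sample), end="")
```

Running the function a second time on its own output leaves the text unchanged, so the script is safe to repeat. Stop the Pentaho Server before writing the result back to org.apache.karaf.features.cfg.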
Configure the Pentaho Worker Nodes
Complete the following editing steps in the configuration file:
Procedure
Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the pentaho.worker.nodes.cfg file with any text editor.
Perform the associated action for each property listed in the table below:
Property: orchestrator
Action: Change the value to pentaho-scale.
Notes: If you want to restore the default work item (WI) delegation scheme, set this property value to None.

Property: wn-hostname
Action: Set the value to the IP address of the Content Execution Router service running within your cluster.
Notes: To find the IP address of your Content Execution Router service, browse to .

Property: wn-port
Action: Set this property to the value of the PRIMARY_PORT of the Content Execution Router service.
Notes: To find the port number of the Content Execution Router service, browse to .

Save and close the pentaho.worker.nodes.cfg file.
Note: After the Worker Nodes are configured, all transformations and jobs submitted to the Pentaho Server are processed using Worker Nodes. To return to standard Pentaho Server operation, set the orchestrator property value to None.
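The three property changes described above can be sketched as a small Python helper. This is a hedged illustration: the function name set_properties is hypothetical, the sample file content is assumed rather than copied from a real pentaho.worker.nodes.cfg, and the host and port values are placeholders for the address and PRIMARY_PORT of your own Content Execution Router service.

```python
def set_properties(cfg_text: str, updates: dict) -> str:
    """Set key=value properties in .cfg text.

    Existing keys are replaced in place; keys that are not present are
    appended at the end. Comment lines and other keys are left untouched.
    """
    remaining = dict(updates)
    out = []
    for line in cfg_text.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and name in remaining:
            line = f"{name}={remaining.pop(name)}"
        out.append(line)
    for name, value in remaining.items():
        out.append(f"{name}={value}")
    return "\n".join(out) + "\n"


# Assumed starting content and placeholder values, for illustration only.
sample = "orchestrator=None\nwn-hostname=localhost\n"
updated = set_properties(
    sample,
    {
        "orchestrator": "pentaho-scale",
        "wn-hostname": "10.0.0.20",  # placeholder router IP address
        "wn-port": "8080",           # placeholder PRIMARY_PORT value
    },
)
print(updated, end="")
```

Calling the same helper with {"orchestrator": "None"} returns the file to standard Pentaho Server operation, as described in the note above.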
Granting Worker Nodes access to Pentaho Repository
The Pentaho Worker Nodes (PWN) product needs read/write access to the Pentaho Repository to process work items from the Pentaho Server. To grant the PWN cluster access to the Pentaho Repository:
Procedure
Edit <pentaho-server-install-dir>/pentaho-server/tomcat/webapps/pentaho/WEB-INF/web.xml and locate the following tag:
<filter>
  <filter-name>Proxy Trusting Filter</filter-name>
  <filter-class>org.pentaho.platform.web.http.filters.ProxyTrustingFilter</filter-class>
  <init-param>
    <param-name>TrustedIpAddrs</param-name>
    <param-value>127.0.0.1,0\:0\:0\:0\:0\:0\:0\:1(%.+)*$</param-value>
    <description>Comma separated list of IP addresses of a trusted hosts.</description>
  </init-param>
</filter>
In the <param-value> element that follows <param-name>TrustedIpAddrs</param-name>, add the IP addresses of the nodes in the PWN cluster to the comma-separated list. Any requests originating from these IP addresses will be trusted by the Pentaho Server for specific actions, eliminating the need for passing along credentials.
Specify which actions are allowed for requests originating from these IP addresses.
Locate the existing filter mappings for the Proxy Trusting Filter. For example:
<filter-mapping>
  <filter-name>Proxy Trusting Filter</filter-name>
  <url-pattern>/i18n</url-pattern>
</filter-mapping>
Below this code, add the following:
<filter-mapping>
  <filter-name>Proxy Trusting Filter</filter-name>
  <url-pattern>/webservices/*</url-pattern>
</filter-mapping>
<filter-mapping>
  <filter-name>Proxy Trusting Filter</filter-name>
  <url-pattern>/api/system/authentication-provider</url-pattern>
</filter-mapping>
Save and close the web.xml file.
Restart the Pentaho Server for the changes to take effect.
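After editing web.xml, a read-only check can confirm that nothing was missed. The following Python sketch is one possible approach, not a Pentaho tool: the function name check_web_xml, the sample XML, and the node IP addresses are all hypothetical. It compares IP addresses as exact comma-separated entries, which is a simplification because the default value shown above also contains a pattern-style entry.

```python
import xml.etree.ElementTree as ET

FILTER_NAME = "Proxy Trusting Filter"
REQUIRED_PATTERNS = {"/webservices/*", "/api/system/authentication-provider"}


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}filter' down to 'filter'."""
    return tag.rsplit("}", 1)[-1]


def child_text(elem, name):
    """Return the stripped text of the first direct child called `name`."""
    for child in elem:
        if local(child.tag) == name:
            return (child.text or "").strip()
    return None


def check_web_xml(xml_text: str, node_ips):
    """Return (missing_ips, missing_patterns) for the Proxy Trusting Filter."""
    root = ET.fromstring(xml_text)
    trusted = ""
    patterns = set()
    for elem in root.iter():
        tag = local(elem.tag)
        if tag == "filter" and child_text(elem, "filter-name") == FILTER_NAME:
            for param in elem:
                if (local(param.tag) == "init-param"
                        and child_text(param, "param-name") == "TrustedIpAddrs"):
                    trusted = child_text(param, "param-value") or ""
        elif tag == "filter-mapping" and child_text(elem, "filter-name") == FILTER_NAME:
            patterns.add(child_text(elem, "url-pattern"))
    entries = {entry.strip() for entry in trusted.split(",")}
    missing_ips = [ip for ip in node_ips if ip not in entries]
    missing_patterns = sorted(REQUIRED_PATTERNS - patterns)
    return missing_ips, missing_patterns


# Hypothetical, shortened web.xml content and node addresses for illustration.
SAMPLE = """<web-app>
  <filter>
    <filter-name>Proxy Trusting Filter</filter-name>
    <init-param>
      <param-name>TrustedIpAddrs</param-name>
      <param-value>127.0.0.1,10.0.0.21</param-value>
    </init-param>
  </filter>
  <filter-mapping>
    <filter-name>Proxy Trusting Filter</filter-name>
    <url-pattern>/webservices/*</url-pattern>
  </filter-mapping>
</web-app>"""

missing_ips, missing_patterns = check_web_xml(SAMPLE, ["10.0.0.21", "10.0.0.22"])
print("IP addresses not yet trusted:", missing_ips)
print("URL patterns not yet mapped:", missing_patterns)
```

To check a real installation, read the contents of your web.xml file into a string and pass it to check_web_xml together with the IP addresses of your PWN nodes. Two empty lists indicate that both parts of the procedure are in place.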