Setting up Pentaho Worker Nodes

As part of setting up Pentaho Worker Nodes, consider how you want your system to operate. Pentaho Worker Nodes must have access to Pentaho work items, such as PDI jobs and transformations, in order to execute them. You can configure Worker Nodes to access PDI jobs and transformations either through the Pentaho Repository or through the mounted volume.

Using the Pentaho Repository

In this configuration, Pentaho work items, such as PDI jobs and transformations, are accessed from the Pentaho Repository. You must have the Pentaho Server installed.

To connect to the repository, the repositories.xml file must be in the mounted volume under the .kettle folder. The repositories.xml file has the following format, where <ip_pentaho_server> is a placeholder for the IP address of your Pentaho Server:

<repositories>
   <repository>
     <id>PentahoEnterpriseRepository</id>
     <name>Pentaho Repository Name</name>
     <description>Pentaho repository</description>
     <is_default>true</is_default>
     <repository_location_url><ip_pentaho_server>:8080/pentaho</repository_location_url>
     <version_comment_mandatory>N</version_comment_mandatory>
   </repository>
</repositories>
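
Before copying repositories.xml to the .kettle folder, it can help to confirm that the file is well-formed and that the placeholder has been replaced. The following Python sketch is one possible way to do that; it is not part of the Pentaho tooling. The function name check_repositories, the choice of which elements to treat as required, the sample IP address 10.0.0.5, and the http:// scheme in the sample URL are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical file content. The address 10.0.0.5 and the http:// scheme
# are assumptions; use the values that apply to your Pentaho Server.
SAMPLE = """<repositories>
  <repository>
    <id>PentahoEnterpriseRepository</id>
    <name>Pentaho Repository Name</name>
    <description>Pentaho repository</description>
    <is_default>true</is_default>
    <repository_location_url>http://10.0.0.5:8080/pentaho</repository_location_url>
    <version_comment_mandatory>N</version_comment_mandatory>
  </repository>
</repositories>"""

# Elements this sketch insists on (an illustrative choice, not a Pentaho rule).
REQUIRED = ("id", "name", "repository_location_url")


def check_repositories(xml_text):
    """Return a list of problems; an empty list means all checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != "repositories":
        problems.append(f"unexpected root element <{root.tag}>")
    repos = root.findall("repository")
    if not repos:
        problems.append("no <repository> entries found")
    for index, repo in enumerate(repos, start=1):
        for tag in REQUIRED:
            value = repo.findtext(tag)
            if value is None or not value.strip():
                problems.append(f"repository {index}: missing or empty <{tag}>")
    return problems


if __name__ == "__main__":
    print(check_repositories(SAMPLE))
```

A file that still contains the literal <ip_pentaho_server> placeholder is not well-formed XML, so check_repositories reports it as a parse problem rather than as a missing value.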

Changes to the web.xml file

Pentaho Worker Nodes needs read/write access to the Pentaho Repository. See Granting Worker Nodes access to Pentaho Repository for changes to the web.xml file in the Pentaho Server.

Using the mounted volume

In this configuration, place the Pentaho work items, such as PDI jobs and transformations, under the .kettle folder in the mounted volume.

Setting up the Pentaho Server to communicate with Pentaho Worker Nodes

After starting the Pentaho Worker Nodes cluster, you need to configure the Pentaho Server to delegate PDI transformations and jobs, called work items, to the Content Execution Router service.

Enable Pentaho Worker Nodes

Perform the following steps in the Karaf environment to enable the Pentaho Worker Nodes.

Procedure

  1. If you have not done so already, stop the Pentaho Server.

  2. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the org.apache.karaf.features.cfg file with any text editor.

  3. Locate the featuresBoot property and append the pentaho-worker-nodes-ee feature to the end of its comma-separated list. For example:

    featuresBoot=config,management,kar,cxf,camel,camel-blueprint,camel-stream,pentaho-camel-jms,pentaho-server,pentaho-monitoring-to-snmp,pentaho-metaverse,pdi-dataservice,pdi-data-refinery,pentaho-big-data-ee-plugin-osgi-obf,pdi-engine-configuration,pentaho-worker-nodes-ee

  4. Save and close the org.apache.karaf.features.cfg file.

  5. Start the Pentaho Server.

  6. Verify that the new configuration file pentaho.worker.nodes.cfg is created in the following location:

    pentaho-server/pentaho-solutions/system/karaf/etc/

    Note: The Karaf OSGi framework lets you make configuration changes that are applied to the environment in real time when the configuration file is saved.
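
Step 3 of this procedure can also be scripted. The Python sketch below assumes that featuresBoot is written on a single line, as in the example in step 3, and appends the feature only when it is not already listed. The helper name add_boot_feature is hypothetical; it is not a Pentaho or Karaf utility.

```python
def add_boot_feature(cfg_text, feature="pentaho-worker-nodes-ee"):
    """Return cfg_text with `feature` appended to the featuresBoot list.

    Assumes the property sits on one line (no backslash continuations).
    The text is returned unchanged in content if the feature is present.
    """
    lines = cfg_text.splitlines()
    for i, line in enumerate(lines):
        key, sep, value = line.partition("=")
        # Commented-out lines start with '#', so their key never matches.
        if sep and key.strip() == "featuresBoot":
            features = [f.strip() for f in value.split(",") if f.strip()]
            if feature not in features:
                features.append(feature)
            lines[i] = key + "=" + ",".join(features)
            return "\n".join(lines) + "\n"
    raise ValueError("featuresBoot property not found")


if __name__ == "__main__":
    sample = "featuresBoot=config,management,kar\n"
    # prints: featuresBoot=config,management,kar,pentaho-worker-nodes-ee
    print(add_boot_feature(sample), end="")
```

Because the function is idempotent, running it a second time against an already updated file leaves the feature list as it is. Stop the Pentaho Server before writing the result back to org.apache.karaf.features.cfg, as described in step 1.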

Configure the Pentaho Worker Nodes

Once you have enabled Worker Nodes, you must configure the orchestration method, as well as the IP address and port of the Content Execution Router service in the Pentaho Worker Nodes (PWN) cluster.

Complete the following editing steps in the configuration file:

Procedure

  1. Navigate to the pentaho-server/pentaho-solutions/system/karaf/etc/ directory and open the pentaho.worker.nodes.cfg file with any text editor.

  2. Perform the associated action for each property listed in the table below:

    | Property | Action | Notes |
    |---|---|---|
    | orchestrator | Change the value to pentaho-scale. | To restore the default work item (WI) delegation scheme, set this property value to None. |
    | wn-hostname | Set the value to the IP address of the Content Execution Router service running within your cluster. | To find the IP address of your Content Execution Router service, browse to Admin App > Services > Content Execution Router > INSTANCES. |
    | wn-port | Set this property to the value of the PRIMARY_PORT of the Content Execution Router service. | To find the port number of the Content Execution Router service, browse to Admin App > Services > Content Execution Router > NETWORK. |

  3. Save and close the pentaho.worker.nodes.cfg file.

    Note: After the Worker Nodes are configured, all transformations and jobs submitted to the Pentaho Server are processed using Worker Nodes. To return to standard Pentaho Server operation, set the orchestrator property value to None.
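
As an illustration of the edits in step 2, the following Python sketch updates the three properties in the text of pentaho.worker.nodes.cfg. It assumes the file uses plain key=value lines; the helper name set_properties, the address 10.0.0.20, and the port 31400 are hypothetical values, so substitute the ones shown in the Admin App for your cluster.

```python
def set_properties(cfg_text, updates):
    """Return cfg_text with each key in `updates` set to its new value.

    Existing key=value lines are rewritten in place; keys that do not
    appear in the file are appended at the end. Comments are kept as-is.
    """
    remaining = dict(updates)
    out = []
    for line in cfg_text.splitlines():
        key, sep, _ = line.partition("=")
        name = key.strip()
        if sep and name in remaining:
            out.append(f"{name}={remaining.pop(name)}")
        else:
            out.append(line)
    # Append any properties that were not already present.
    out.extend(f"{name}={value}" for name, value in remaining.items())
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    current = "orchestrator=None\n"
    print(set_properties(current, {
        "orchestrator": "pentaho-scale",
        "wn-hostname": "10.0.0.20",   # hypothetical router IP address
        "wn-port": "31400",           # hypothetical PRIMARY_PORT value
    }), end="")
```

Calling the same helper with {"orchestrator": "None"} corresponds to returning to standard Pentaho Server operation, as described in the note above.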

Granting Worker Nodes access to Pentaho Repository

The Pentaho Worker Nodes product needs read/write access to the Pentaho Repository to process work items from the Pentaho Server. To grant Pentaho Repository access to the PWN cluster:

Procedure

  1. Edit <pentaho-server-install-dir>/pentaho-server/tomcat/webapps/pentaho/WEB-INF/web.xml and locate the following tag:

    <filter>
      <filter-name>Proxy Trusting Filter</filter-name>
      <filter-class>org.pentaho.platform.web.http.filters.ProxyTrustingFilter</filter-class>
      <init-param>
        <param-name>TrustedIpAddrs</param-name>
        <param-value>127.0.0.1,0\:0\:0\:0\:0\:0\:0\:1(%.+)*$</param-value>
        <description>Comma separated list of IP addresses of a trusted hosts.</description>
      </init-param>
    </filter>
    
  2. In the <param-value> element of the TrustedIpAddrs parameter, add the IP addresses of the nodes in the PWN cluster to the comma-separated list.

    Any requests originating from these IP addresses are trusted by the Pentaho Server for specific actions, which eliminates the need to pass along credentials.

  3. Specify which actions are allowed for requests originating from these IP addresses.

    1. Locate the existing filter mappings for the Proxy Trusting Filter. For example:

      <filter-mapping>
         <filter-name>Proxy Trusting Filter</filter-name>
         <url-pattern>/i18n</url-pattern>
      </filter-mapping>
      
    2. Below this code, add the following:

      <filter-mapping>
         <filter-name>Proxy Trusting Filter</filter-name>
         <url-pattern>/webservices/*</url-pattern>
      </filter-mapping>
      
      <filter-mapping>
         <filter-name>Proxy Trusting Filter</filter-name>
         <url-pattern>/api/system/authentication-provider</url-pattern>
      </filter-mapping>
      
  4. Save and close the web.xml file.

  5. Restart the Pentaho Server to apply the changes.
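
To illustrate step 2 of this procedure, the Python sketch below builds the new TrustedIpAddrs value from the existing one. It keeps the existing entries verbatim, validates each new IPv4 address, and skips duplicates. The helper name extend_trusted_ips and the addresses 10.0.0.11 and 10.0.0.12 are hypothetical. The sketch handles only IPv4 addresses; IPv6 entries may need the escaped form used in the default value and are outside its scope.

```python
import ipaddress


def extend_trusted_ips(current_value, new_ips):
    """Return the comma-separated TrustedIpAddrs value with new_ips added.

    Existing entries are preserved exactly, because the default value
    contains an escaped IPv6 pattern that is not a plain address.
    """
    entries = [e.strip() for e in current_value.split(",") if e.strip()]
    for ip in new_ips:
        # Raises ValueError (AddressValueError) for a malformed IPv4 address.
        ipaddress.IPv4Address(ip)
        if ip not in entries:
            entries.append(ip)
    return ",".join(entries)


if __name__ == "__main__":
    default = r"127.0.0.1,0\:0\:0\:0\:0\:0\:0\:1(%.+)*$"
    # Hypothetical PWN node addresses; replace with your cluster's nodes.
    print(extend_trusted_ips(default, ["10.0.0.11", "10.0.0.12"]))
```

The returned string is what would go between the <param-value> tags of the TrustedIpAddrs parameter. Editing the value as text, rather than rewriting web.xml with an XML library, leaves the comments and formatting in the rest of the file untouched.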