Hitachi Vantara Lumada and Pentaho Documentation

Use Carte Clusters

Carte is a simple web server that allows you to run transformations and jobs remotely. It receives XML (using a small servlet) that contains the transformation to run and the execution configuration. It allows you to remotely monitor, start and stop the transformations and jobs that run on the Carte server.

About Carte Clusters

You can set up an individual instance of Carte to operate as a standalone execution engine for a job or transformation. In the PDI client (Spoon), you can define one or more Carte servers and send jobs and transformations to them. If you want to improve PDI performance for resource-intensive transformations and jobs, use a Carte cluster.

Note: You can cluster the Pentaho Server to provide failover support. If you decide to use the Pentaho Server, you must enable the proxy trusting filter as explained in Schedule Jobs to Run on a Remote Carte Server, then set up your dynamic Carte slaves and define the Pentaho Server as the master.

There are two types of Carte clusters: static and dynamic. A static Carte cluster has a fixed schema that specifies one master node and two or more slave nodes. In a static cluster, you specify the nodes at design time, before you run the transformation or job.

Static clusters are a good choice for smaller environments where you don't have a lot of machines (virtual or real) to use for PDI transformations. Dynamic clusters work well if nodes are added or removed often, such as in a cloud computing environment. Dynamic clustering is also more appropriate in environments where transformation performance is extremely important, or if there can potentially be multiple concurrent transformation executions.

A dynamic Carte cluster has a schema that specifies one master node and a varying number of slave nodes. Unlike in a static cluster, the slave nodes are not known until runtime. Instead, you register the slave nodes, and at runtime PDI checks each registered slave node every 30 seconds to see whether it is available to perform transformation and job processing tasks.

Set Up a Carte Cluster

You can configure either a static or a dynamic Carte cluster.

Schedule Jobs to Run on a Remote Carte Server

Follow these instructions to schedule a job to run on a remote Carte server. Without these configuration changes, you cannot run scheduled jobs remotely.
Note: This process is also required for using the Pentaho Server as a load balancer in a dynamic Carte cluster.

Procedure

  1. Stop the Pentaho Server and remote Carte server.

  2. Copy the repositories.xml file from the .kettle directory on your workstation to the same location on your Carte slave. Without this file, the Carte slave will be unable to connect to the Pentaho Repository to retrieve PDI content.

  3. Open the /pentaho/server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/web.xml file with a text editor.

  4. Find the Proxy Trusting Filter section, and add your Carte server's IP address to the comma-separated list in the param-value element for TrustedIpAddrs.

    <filter>
        <filter-name>Proxy Trusting Filter</filter-name>
        <filter-class>org.pentaho.platform.web.http.filters.ProxyTrustingFilter</filter-class>
        <init-param>
          <param-name>TrustedIpAddrs</param-name>
          <param-value>127.0.0.1,192.168.0.1</param-value>
          <description>Comma separated list of IP addresses of a trusted hosts.</description>
        </init-param>
        <init-param>
          <param-name>NewSessionPerRequest</param-name>
          <param-value>true</param-value>
          <description>true to never re-use an existing IPentahoSession in the HTTP session; needs to be true to work around code put in for BISERVER-2639</description>
        </init-param>
    </filter>
  5. Uncomment the proxy trusting filter-mappings between the <!-- begin trust --> and <!-- end trust --> markers.

    <!-- begin trust --> 
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/authorizationPolicy</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/roleBindingDao</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/userRoleListService</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/unifiedRepository</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/userRoleService</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/Scheduler</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/webservices/repositorySync</url-pattern>
      </filter-mapping>
      
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/api/system/authentication-provider</url-pattern>
      </filter-mapping>
    
      <filter-mapping>
        <filter-name>Proxy Trusting Filter</filter-name>
        <url-pattern>/api/session/userName</url-pattern>
      </filter-mapping>
      <!-- end trust -->
  6. Save and close the file, then edit the carte.sh or Carte.bat startup script on the machine that runs your Carte server.

  7. Add -Dpentaho.repository.client.attemptTrust=true to the java line at the bottom of the file.

    java $OPT -Dpentaho.repository.client.attemptTrust=true org.pentaho.di.www.Carte "${1+$@}"
  8. Save and close the file.

  9. Start your Carte server and Pentaho Server.

Results

You can now schedule a job to run on a remote Carte instance.
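
Before restarting the servers, it can help to confirm that both edits from the procedure are in place. The following is a minimal sketch, not part of the product: the function names `ip_is_trusted` and `carte_trust_enabled` are hypothetical, and the check assumes the param-value element sits on the line directly after the TrustedIpAddrs param-name, as in the sample above.

```shell
#!/bin/sh
# Sketch only: sanity checks for the edits described in the procedure above.
# Function names are hypothetical; pass the paths used by your installation.

# Succeeds if the IP address ($2) is listed in the TrustedIpAddrs
# param-value of the given web.xml ($1).
# Assumes <param-value> is on the line right after the param-name line.
ip_is_trusted() {
    web_xml=$1
    ip=$2
    # Read the line after the TrustedIpAddrs param-name, strip the tags,
    # split the list on commas, and look for an exact, whole-line match.
    sed -n '/<param-name>TrustedIpAddrs<\/param-name>/{n;s:.*<param-value>\(.*\)</param-value>.*:\1:p;}' "$web_xml" \
        | tr ',' '\n' \
        | tr -d ' ' \
        | grep -Fxq "$ip"
}

# Succeeds if the startup script ($1) contains the attemptTrust property.
carte_trust_enabled() {
    grep -Fq -- '-Dpentaho.repository.client.attemptTrust=true' "$1"
}
```

For example, `ip_is_trusted /pentaho/server/pentaho-server/tomcat/webapps/pentaho/WEB-INF/web.xml 192.168.0.1 && echo trusted` prints `trusted` only when that address is in the list. The exact-match comparison keeps 192.168.0.1 from matching 192.168.0.10.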

Stop Carte from the Command Line Interface or URL

Perform the following steps to stop Carte either from the CLI or URL:

Procedure

  1. Open the command line interface by clicking Start and typing cmd. Press Enter.

  2. In the command line interface, enter the location of the Carte server.

  3. Enter a space, then type the arguments for stopping the server.

  4. Press Enter after the arguments are typed.

    Arguments:

    Carte <Interface address> <Port> [-s] [-p <arg>] [-u <arg>]

    Example:

    Carte 127.0.0.1 8080 -s -p amidala4ever -u dvader

    You can also now use a URL to stop Carte:

    http://localhost:8080/kettle/stopCarte
    Parameters

    Command Option | Description | Type
    -h, --help | Help text. | n/a
    -p, --password <arg> | The administrator password. Required only if stopping the Carte server. | Alphanumeric
    -s, --stop | Stop the running Carte server. This is only allowed when using the hostname/port form of the command. | Alphanumeric
    -u, --username <arg> | The administrator user name. Required only if stopping the Carte server. | Alphanumeric
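
The URL form can also be scripted. The sketch below builds the stop URL from a host and port and requests it with curl. The function names `carte_stop_url` and `stop_carte` are hypothetical, and the sketch assumes the server accepts HTTP basic authentication with the same administrator credentials that the -u and -p options take; verify this against your own Carte configuration.

```shell
#!/bin/sh
# Sketch only: stop a Carte server through its stopCarte URL.
# Function names are hypothetical helpers, not part of Carte.

# Print the stop URL for the Carte server at host ($1) and port ($2).
carte_stop_url() {
    printf 'http://%s:%s/kettle/stopCarte\n' "$1" "$2"
}

# Request the stop URL.
# Assumption: HTTP basic authentication with the administrator
# user name ($3) and password ($4).
stop_carte() {
    host=$1
    port=$2
    user=$3
    password=$4
    curl --fail --silent --show-error \
        --user "$user:$password" \
        "$(carte_stop_url "$host" "$port")"
}
```

With the values from the example above, the call would be `stop_carte 127.0.0.1 8080 dvader amidala4ever`. Keeping the URL builder separate makes it easy to check the target address before sending the request.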

Run Transformations and Jobs from the Repository on the Carte Server

To run a job or transformation remotely on a Carte server, you first need to copy the local repositories.xml from the user's .kettle directory to the Carte server's $HOME/.kettle directory. The Carte service also looks for the repositories.xml file in the directory from which Carte was started.

For more information about locating or changing the .kettle home directory, see Changing the Pentaho Data Integration Home Directory Location (.kettle folder).
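
The copy step can be sketched as a small shell function to run on the Carte server once the workstation's file has been transferred there. The function name `install_repositories_xml` and the example staging path are hypothetical; the sketch assumes the default $HOME/.kettle location and does not cover a relocated Kettle home directory.

```shell
#!/bin/sh
# Sketch only: place repositories.xml where the Carte service looks for it.
# The function name is hypothetical.

# Copy repositories.xml from a source .kettle directory ($1) into the
# .kettle directory under the target home directory ($2), creating the
# target directory if it does not exist yet.
install_repositories_xml() {
    src_kettle=$1
    target_home=$2
    mkdir -p "$target_home/.kettle" &&
        cp "$src_kettle/repositories.xml" "$target_home/.kettle/repositories.xml"
}
```

For example, if the workstation's .kettle directory was transferred to a staging directory such as /tmp/kettle-from-workstation, run `install_repositories_xml /tmp/kettle-from-workstation "$HOME"` as the user that starts Carte.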