Configure Static and Dynamic Carte Clusters
If you want to speed up the processing of your transformations, consider setting up a Carte cluster. A Carte cluster consists of two or more Carte slave servers and a Carte master server. When you run a transformation, its parts are distributed across the Carte slave server nodes for processing, while the Carte master server node tracks the progress.
Configure a Static Carte Cluster
Follow the directions below to set up static Carte slave servers.
- Copy over any required JDBC drivers and PDI plugins from your development instances of PDI to the Carte instances.
- Run the Carte script with the IP address, hostname, or domain name of this server, and the port number you want it to be available on.
./carte.sh 127.0.0.1 8081
- If you will be executing content stored in a DI Repository, copy the repositories.xml file from the .kettle directory on your workstation to the same location on your Carte slave. Without this file, the Carte slave will be unable to connect to the DI Repository to retrieve content.
- Ensure that the Carte service is running as intended, accessible from your primary PDI development machines, and that it can run your jobs and transformations.
- To start this slave server every time the operating system boots, create a startup or init script to run Carte at boot time with the same options you tested with.
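
One way to confirm that a Carte instance is reachable from your development machine is to request its status page over HTTP. The sketch below assumes the instance was started as in the example above (127.0.0.1, port 8081) and that it still uses the cluster/cluster credentials shown in the configuration examples in this document. The function names and the CARTE_USER and CARTE_PASSWORD variables are hypothetical helpers, not part of PDI.

```shell
#!/bin/sh
# Build the status URL for a Carte instance from the host and port
# that were passed to carte.sh. xml=Y asks for XML instead of HTML.
carte_status_url() {
    printf 'http://%s:%s/kettle/status/?xml=Y\n' "$1" "$2"
}

# Request the status page. -f makes curl exit non-zero on HTTP errors
# (for example 401 when the credentials are wrong).
# Assumption: credentials default to cluster/cluster; override them with
# the hypothetical CARTE_USER and CARTE_PASSWORD variables.
check_carte() {
    curl -fsS -u "${CARTE_USER:-cluster}:${CARTE_PASSWORD:-cluster}" \
        "$(carte_status_url "$1" "$2")"
}

# Example: check_carte 127.0.0.1 8081
```

A non-zero exit status from check_carte suggests that the service is not running, is not reachable from this machine, or rejected the credentials.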
Configure a Dynamic Carte Cluster
The following instructions explain how to create carte-master-config.xml and carte-slave-config.xml files. You can rename these files if you want, but you must specify the content in the files as per the instructions.
Configure Carte Master Server
Follow the process below to configure the Carte Master Server.
- Copy over any required JDBC drivers from your development instances of PDI to the Carte instances.
- Create a carte-master-config.xml configuration file using the following example as a template:
<slave_config>
  <!-- on a master server, the slaveserver node contains information about this Carte instance -->
  <slaveserver>
    <name>Master</name>
    <hostname>yourhostname</hostname>
    <port>9000</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>Y</master>
  </slaveserver>
</slave_config>
The <name> of the Master server must be unique among all Carte instances in the cluster.
- Run the Carte script with the carte-master-config.xml parameter. Note that if you placed the carte-master-config.xml file in a different directory than the Carte script, you will need to add the path to the file to the command.
./carte.sh carte-master-config.xml
- Ensure that the Carte service is running as intended.
- To start this master server every time the operating system boots, create a startup or init script to run Carte at boot time.
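
The startup script depends on your operating system. As a minimal sketch, assuming a Linux host that uses systemd, the helper below prints a unit file that runs Carte with a given configuration file. The function name, the installation path, the configuration path, and the pentaho service user are all hypothetical; adjust them to your environment.

```shell
#!/bin/sh
# Print a systemd unit that starts Carte with a given config file.
# Arguments: 1 = directory containing carte.sh
#            2 = path to the Carte config file
#            3 = user account that should run Carte
carte_unit() {
    cat <<EOF
[Unit]
Description=Carte server ($2)
After=network.target

[Service]
Type=simple
User=$3
WorkingDirectory=$1
ExecStart=$1/carte.sh $2
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Example (hypothetical paths and user):
# carte_unit /opt/pentaho/data-integration /opt/pentaho/carte-master-config.xml pentaho \
#     | sudo tee /etc/systemd/system/carte-master.service
# sudo systemctl daemon-reload
# sudo systemctl enable --now carte-master.service
```

Passing the full path of the configuration file keeps the unit independent of the working directory, which matches the note above about files stored outside the Carte script directory.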
Tuning Options
The table below shows the three configurable settings for schedule and remote execution logging in the slave-server-config.xml file.
To make modifications to slave-server-config.xml, you must stop the DI Server.
Property | Values | Description |
---|---|---|
max_log_lines | Any value of 0 (zero) or greater. 0 indicates that there is no limit. | Truncates the execution log when it goes beyond this many lines. |
max_log_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes lines from each log entry if they are older than this many minutes. |
object_timeout_minutes | Any value of 0 (zero) or greater. 0 indicates that there is no timeout. | Removes entries from the list if they are older than this many minutes. |
The following code block is an example of the slave-server-config.xml file:
<slave_config>
  <max_log_lines>0</max_log_lines>
  <max_log_timeout_minutes>0</max_log_timeout_minutes>
  <object_timeout_minutes>0</object_timeout_minutes>
</slave_config>
Configure Carte Slave Servers
Follow the directions below to set up dynamic Carte slave servers.
- Follow the process to configure the Carte Master Server.
- Make sure the Master server is running.
- Copy over any required JDBC drivers from your development instances of PDI to the Carte instances.
- In the /pentaho/design-tools/ directory, create a carte-slave-config.xml configuration file using the following example as a template:
<slave_config>
  <!-- the masters node defines one or more load balancing Carte instances that will manage this slave -->
  <masters>
    <slaveserver>
      <name>Master</name>
      <hostname>yourhostname</hostname>
      <port>9000</port>
      <!-- uncomment the next line if you want the DI Server to act as the load balancer -->
      <!-- <webAppName>pentaho-di</webAppName> -->
      <username>cluster</username>
      <password>cluster</password>
      <master>Y</master>
    </slaveserver>
  </masters>
  <report_to_masters>Y</report_to_masters>
  <!-- the slaveserver node contains information about this Carte slave instance -->
  <slaveserver>
    <name>SlaveOne</name>
    <hostname>yourhostname</hostname>
    <port>9001</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
  </slaveserver>
</slave_config>
The slaveserver <name> must be unique among all Carte instances in the cluster.
- If you want a slave server to use the same Kettle properties as the master server, add the <get_properties_from_master> and <override_existing_properties> tags between the <slaveserver> and </slaveserver> tags for the slave server. Put the name of the master server between the <get_properties_from_master> and </get_properties_from_master> tags. Here is an example:
<!-- the slaveserver node contains information about this Carte slave instance -->
<slaveserver>
  <name>SlaveOne</name>
  <hostname>yourhostname</hostname>
  <port>9001</port>
  <username>cluster</username>
  <password>cluster</password>
  <master>N</master>
  <get_properties_from_master>Master</get_properties_from_master>
  <override_existing_properties>Y</override_existing_properties>
</slaveserver>
- Save and close the file.
- Run the Carte script with the carte-slave-config.xml parameter. Note that if you placed the carte-slave-config.xml file in a different directory than the Carte script, you will need to add the path to the file to the command.
./carte.sh carte-slave-config.xml
- If you will be executing content stored in a DI Repository, copy the repositories.xml file from the .kettle directory on your workstation to the same location on your Carte slave. Without this file, the Carte slave will be unable to connect to the DI Repository to retrieve PDI content.
- Stop, then start the master and slave servers.
- Stop, then start the DI Server.
- Ensure that the Carte service is running as intended. If you want to start this slave server every time the operating system boots, create a startup or init script to run Carte at boot time.
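
When you run several slaves, each one needs its own configuration file with a unique name and, if the slaves share a host, a unique port. The following sketch generates such files from the template above. The function names, the Slave1, Slave2 naming scheme, and the carte-slaveN-config.xml file names are hypothetical, and the cluster/cluster credentials are copied from the example and should be replaced with your own.

```shell
#!/bin/sh
# Write a slave config file for one Carte slave.
# Arguments: 1 = slave name, 2 = slave hostname, 3 = slave port,
#            4 = master hostname, 5 = master port, 6 = output file.
write_slave_config() {
    cat > "$6" <<EOF
<slave_config>
  <masters>
    <slaveserver>
      <name>Master</name>
      <hostname>$4</hostname>
      <port>$5</port>
      <username>cluster</username>
      <password>cluster</password>
      <master>Y</master>
    </slaveserver>
  </masters>
  <report_to_masters>Y</report_to_masters>
  <slaveserver>
    <name>$1</name>
    <hostname>$2</hostname>
    <port>$3</port>
    <username>cluster</username>
    <password>cluster</password>
    <master>N</master>
  </slaveserver>
</slave_config>
EOF
}

# Generate configs for several slaves on one host, giving each a
# unique name (Slave1, Slave2, ...) and consecutive ports.
# Arguments: 1 = number of slaves, 2 = slave hostname, 3 = first slave port,
#            4 = master hostname, 5 = master port, 6 = output directory.
generate_slave_configs() {
    i=1
    while [ "$i" -le "$1" ]; do
        port=$(( $3 + i - 1 ))
        write_slave_config "Slave$i" "$2" "$port" "$4" "$5" \
            "$6/carte-slave$i-config.xml"
        i=$(( i + 1 ))
    done
}

# Example: generate_slave_configs 2 yourhostname 9001 yourhostname 9000 .
```

Each generated file can then be passed to the Carte script in the same way as carte-slave-config.xml.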
Changing Jetty Server Parameters
Jetty Server Parameters | Definition |
---|---|
acceptors | The number of threads dedicated to accepting incoming connections. The number of acceptors should be less than or equal to the number of CPUs. |
acceptQueueSize | Number of connection requests that can be queued up before the operating system starts to send rejections. |
lowResourcesMaxIdleTime | This allows the server to rapidly close idle connections in order to gracefully handle high load situations. |
To learn more about these options, see the Jetty documentation: http://wiki.eclipse.org/Jetty/Howto/Configure_Connectors#Configuration_Options. For more information about a high load setup, read this article: https://wiki.eclipse.org/Jetty/Howto/High_Load.
Setting the Jetty Server Parameters in the carte-slave-config.xml file
- In the /pentaho/design-tools/ directory, open the carte-slave-config.xml file and add these lines between the <slave_config> and </slave_config> tags.
<slave_config>
  ...
  <!-- Carte uses an embedded Jetty server. Include this next section only if you want to change the default Jetty configuration options. -->
  <jetty_options>
    <acceptors>2</acceptors>
    <acceptQueueSize>2</acceptQueueSize>
    <lowResourcesMaxIdleTime>2</lowResourcesMaxIdleTime>
  </jetty_options>
</slave_config>
- Adjust the values for the parameters as necessary, then save and close the file.
Setting the Jetty Server Parameters in the kettle.properties file
Kettle Variable in kettle.properties | Jetty Server Parameter |
---|---|
KETTLE_CARTE_JETTY_ACCEPTORS | acceptors |
KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE | acceptQueueSize |
KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME | lowResourcesMaxIdleTime |
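
As an illustration of the mapping above, the sketch below prints the three kettle.properties entries and caps the acceptors value at the CPU count, following the guidance in the parameter table. The function name and the numeric values are placeholders, not recommendations, and the example assumes kettle.properties lives in the .kettle directory under your home directory.

```shell
#!/bin/sh
# Print kettle.properties entries for the Jetty server parameters.
# Arguments: 1 = requested acceptors, 2 = acceptQueueSize,
#            3 = lowResourcesMaxIdleTime, 4 = number of CPUs.
jetty_properties() {
    acceptors=$1
    # Keep acceptors less than or equal to the number of CPUs.
    if [ "$acceptors" -gt "$4" ]; then
        acceptors=$4
    fi
    printf 'KETTLE_CARTE_JETTY_ACCEPTORS=%s\n' "$acceptors"
    printf 'KETTLE_CARTE_JETTY_ACCEPT_QUEUE_SIZE=%s\n' "$2"
    printf 'KETTLE_CARTE_JETTY_RES_MAX_IDLE_TIME=%s\n' "$3"
}

# Example (placeholder values; getconf reports the online CPU count
# on Linux and macOS):
# jetty_properties 4 5000 5000 "$(getconf _NPROCESSORS_ONLN)" \
#     >> "$HOME/.kettle/kettle.properties"
```

Appending to the file more than once leaves duplicate keys behind, so check kettle.properties for existing entries before running the example.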