
Hitachi Vantara Lumada and Pentaho Documentation

Installing the Carte Server on GCP

These instructions provide the steps necessary to deploy Docker images of the Carte Server on GCP.

Prerequisites for installing the Carte Server on GCP

The following software must be installed on your workstation before installing the Carte Server:

  • PDI (Pentaho Data Integration) is needed to connect to the Carte Server for testing.
  • A stable version of Docker must be installed on your workstation. See Docker documentation.
  • The Kubernetes command-line tool, kubectl, must be installed.
  • (Optional) Use a tool such as the Kubernetes Dashboard or Lens to manage your Kubernetes cluster.
  • (Optional) Use the kubectl bash-completion package.
  • The Google Cloud CLI (gcloud) must be installed and authenticated.

Process overview for running the Carte Server on GCP

Use the following steps to deploy the Carte Server on the GCP cloud platform.

  1. Download and extract the Carte Server for GCP.
  2. Create a Docker registry in GCP.
  3. Push the Carte Server Docker image to GCP.
  4. Create and populate a Google Cloud Storage bucket.
  5. Create a Google Cloud SQL PostgreSQL instance.
  6. Set up a GKE cluster on GCP.
  7. Deploy the Carte Server on GCP.

Download and extract Pentaho for GCP

Download and open the package files that contain the files needed to install Pentaho.

Procedure

  1. Navigate to the Support Portal and download the GCP version of the Docker image with the corresponding license file for the applications you want to install on your workstation.

    Note: Make note of the image name for later.
  2. Extract the image into your local Docker registry.

    The image package file (<package-name>.tar.gz) contains the following:

    image: Directory containing all the Pentaho source images.
    yaml: Directory containing YAML configuration files and various utility files.
    README.md: File containing a link to detailed information about what we are providing for this release.
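As an illustration of step 2, the commands below simulate a package with the layout described above and extract it. The package name and files here are stand-ins created only for the demonstration; in practice you would start from the real <package-name>.tar.gz downloaded from the Support Portal.

```shell
# Create a stand-in package with the documented layout (image/, yaml/, README.md).
mkdir -p pkg-demo/image pkg-demo/yaml
touch pkg-demo/README.md
tar -czf pkg-demo.tar.gz -C pkg-demo image yaml README.md

# List the archive contents, then extract them, as you would for the real package.
tar -tzf pkg-demo.tar.gz
mkdir -p extracted
tar -xzf pkg-demo.tar.gz -C extracted
```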

Create a Docker registry in GCP

Before pushing the Pentaho image to GCP, you need to create a Docker registry in GCP.

Procedure

  1. Create a Docker registry in GCP.

    For instructions, see Store Docker container images in Artifact Registry.
  2. Connect to the Docker registry using the following command:

    gcloud auth configure-docker <YOUR_REGION>-docker.pkg.dev
  3. To verify that the registry has been added correctly, run this command:

    cat ~/.docker/config.json
  4. Record the name of the registry that you have created in the Worksheet for GCP hyperscaler.
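As a sketch of steps 2 and 3, the snippet below composes the Artifact Registry host name that gcloud registers as a Docker credential helper. The region is an example value; substitute your own.

```shell
# Example region; the Artifact Registry Docker host always has the
# form <region>-docker.pkg.dev.
REGION="us-central1"
REGISTRY_HOST="${REGION}-docker.pkg.dev"
echo "$REGISTRY_HOST"

# With gcloud installed and authenticated, this host is what you pass to:
#   gcloud auth configure-docker "$REGISTRY_HOST"
# Afterwards it appears under "credHelpers" in ~/.docker/config.json.
```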

Load and push the Pentaho Docker image to the GCP registry

Perform the following steps to load and push the Pentaho Docker image to GCP:

Procedure

  1. Navigate to the image directory containing the Pentaho tar.gz files.

  2. Select and load the tar.gz file into the local registry by running the following command:

    docker load -i <pentaho-image>.tar.gz
  3. Record the name of the source image that was loaded into the registry by using the following command:

    docker images
  4. Tag the source image so it can be pushed to the cloud platform by using the following command:

    docker tag <source-image>:<tag> <target-repository>:<tag>
  5. Push the image to the GCP registry using the following command:

    docker push <IMAGE_NAME>
  6. Verify that the image has been properly loaded using the Google Cloud Console.
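The tagging and pushing in steps 4 and 5 hinge on the target repository name. As a sketch, the snippet below assembles the full Artifact Registry image URI; the project, repository, image name, and tag are illustrative placeholders.

```shell
# All values below are illustrative placeholders.
REGION="us-central1"
PROJECT_ID="my-gcp-project"
REPOSITORY="pentaho"
IMAGE_NAME="pentaho-carte"
TAG="1.0"

# Artifact Registry image URIs have the form:
#   <region>-docker.pkg.dev/<project>/<repository>/<image>:<tag>
IMAGE_URI="${REGION}-docker.pkg.dev/${PROJECT_ID}/${REPOSITORY}/${IMAGE_NAME}:${TAG}"
echo "$IMAGE_URI"

# With the source image loaded locally (step 2), you would then run:
#   docker tag <source-image>:<tag> "$IMAGE_URI"
#   docker push "$IMAGE_URI"
```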

Create a Google Cloud Storage bucket

Create a Google Cloud Storage bucket and place your configuration files in any directory path in the bucket.

In these instructions, the following path is used as an example: gs://pentaho-project/my-bucket.

Perform the following steps to create and populate a Google Cloud Storage bucket:

Procedure

  1. Create a Cloud Storage bucket as explained in the GCP documentation.

  2. Add the Kettle transformation (KTR) and job (KJB) files that you want to use to the bucket.

  3. If any of your jobs or transformations use VFS connections to the Google Storage buckets, perform the following steps:

    1. Upload a copy of your GCS credentials file to the Google Storage bucket.

      For example, gs://pentaho-project/my-bucket/<credentials-file>.json
    2. Update any VFS connections that use this credentials file to point to the following path: /home/pentaho/data-integration/data/<credentials-file>.json

  4. Copy your local .pentaho/metastore folder to the Google Storage bucket.

    The .pentaho/ folder is located in the user home directory by default.
    Note: You must edit your GCS VFS connections before copying the .pentaho/ folder. If you need to change the VFS connections, upload the GCS credentials file and update any associated GCS connections again.
  5. Copy any license files (*.lic) needed for the product(s) you will be using to the location specified by PROJECT_GCP_LOCATION.
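Step 3 above maps an uploaded credentials object to a fixed path inside the container. A small sketch of that mapping, using the example bucket from above and a hypothetical credentials file name:

```shell
# Hypothetical credentials file name; the bucket path follows the example above.
CRED_FILE="gcs-service-account.json"
BUCKET_OBJECT="gs://pentaho-project/my-bucket/${CRED_FILE}"
CONTAINER_PATH="/home/pentaho/data-integration/data/${CRED_FILE}"

# Your VFS connections should reference CONTAINER_PATH, not BUCKET_OBJECT.
echo "${BUCKET_OBJECT} -> ${CONTAINER_PATH}"
```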

Set up a GKE cluster on GCP

Use the following steps to configure a GKE cluster on Google Cloud Platform (GCP).

Procedure

  1. Create a GKE cluster on GCP.

    Configure the cluster to meet your requirements. For a simple example, see Create a GKE cluster.
  2. Authenticate with the cluster using the following command:

    gcloud container clusters get-credentials <CLUSTER_NAME> --region=<YOUR_REGION>
  3. Check the connection to the cluster using the following command:

    kubectl cluster-info
  4. (Optional) Use a tool such as the Kubernetes Dashboard or Lens to verify that you are connected.

Deploy the Carte Server on GCP

Prepare to launch your service by performing some setup steps and then deploy.

Note: This example serves as a starting point. More elaborate deployments may require more extensive changes depending on the use case and cluster setup, including modifying the nodeport specification.

Procedure

  1. Navigate to the file carte-gcp-gke.yaml in your distribution and open it in a text editor.

  2. Configure the YAML file as follows:

    IMAGE_URI: The name of the image that has been loaded into the GCP Docker registry.
    PROJECT_CLOUD_STORAGE_LOCATION: The path to the bucket. Using the sample bucket name above, this would be pentaho-project/my-bucket.
    METASTORE_LOCATION: The relative path to the parent folder of the .pentaho/ folder. In the example above (with the metastore inside pentaho-project/my-bucket/.pentaho/), the value is simply "." to indicate that the parent folder of .pentaho/ is the same as the root of the bucket.
    CARTE_CONFIG_FILE: If using a custom carte-config.xml, the relative path to the custom configuration file. If not using a custom carte-config.xml, comment out or remove this section from the YAML file.

  3. If you want to run the pod in a namespace other than carte, replace the name carte in the namespace section at the top of the YAML file with the desired name.

    In addition, replace all other instances of the namespace carte with the new name.
    Note: The namespace carte is used in the rest of this procedure.
  4. Save the YAML file.

  5. Once all the configuration changes have been made, deploy the file to the cluster:

    kubectl apply -f carte-gcp-gke.yaml
  6. Verify that the service is running by executing the following command:

    kubectl get pods --namespace carte
    After a few moments, you should see a pod come up. (It can take a minute to pull the image the first time you start.)
  7. Connect to the Carte Server from the Spoon client and try running a KTR or KJB.

    The default port number is 8081 but can be different if the networking setup requires it.
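Beyond the Spoon test in step 7, a running Carte Server also answers HTTP requests on its status page. The sketch below only assembles that URL; the host is a placeholder you would obtain from the service's external address (for example, via kubectl get svc --namespace carte).

```shell
# Placeholder external address; retrieve the real one from your cluster, e.g.:
#   kubectl get svc --namespace carte
CARTE_HOST="203.0.113.10"
CARTE_PORT="8081"   # default port, per the note above

STATUS_URL="http://${CARTE_HOST}:${CARTE_PORT}/kettle/status"
echo "$STATUS_URL"  # open or curl this URL (with your Carte credentials)
```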

Worksheet for GCP hyperscaler

To access the common worksheet for the GCP hyperscaler, go to Worksheet for GCP hyperscaler.