Installing the Carte Server on Azure
These instructions provide the steps necessary to deploy Docker images of the Carte Server on Azure.
Prerequisites for installing Pentaho on Azure
Observe the following prerequisites before installing Pentaho:
- A stable version of Docker must be installed on your workstation.
- You must have an Azure account and subscription to complete this installation.
- The following software versions are required:
Application | Supported version |
Docker | v20.10.21 or a later stable version |
Azure CLI | v2.x |
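A quick way to compare the installed versions against the table above is sketched below. The `version_ge` helper is a hypothetical name introduced for illustration, and the comparison assumes `sort -V` is available (it is part of GNU coreutils).

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Hypothetical helper; relies on `sort -V` for version-aware ordering.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

REQUIRED_DOCKER="20.10.21"

if command -v docker >/dev/null 2>&1; then
  # Only the client version is read, so a stopped daemon does not matter here.
  DOCKER_VERSION="$(docker version --format '{{.Client.Version}}' 2>/dev/null || true)"
  if version_ge "$DOCKER_VERSION" "$REQUIRED_DOCKER"; then
    echo "Docker $DOCKER_VERSION meets the minimum ($REQUIRED_DOCKER)."
  else
    echo "Docker '$DOCKER_VERSION' is older than $REQUIRED_DOCKER or could not be read."
  fi
else
  echo "Docker is not installed or not on PATH."
fi

if command -v az >/dev/null 2>&1; then
  # Prints the Azure CLI version; a 2.x value is expected.
  az version --query '"azure-cli"' --output tsv
else
  echo "Azure CLI is not installed or not on PATH."
fi
```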
Process overview for installing the Carte Server on Azure
Use the following instructions to deploy the Carte Server on the Azure cloud platform:
- Download and unpack Pentaho for Azure.
- Create an Azure ACR.
- Push the Pentaho Docker image to ACR.
- Create an Azure storage account.
- Choose one of the following deployment methods for Azure hyperscaler:
  - Use Azure Container Instances (ACI)
  - Use Azure Kubernetes Service (AKS)
You can also perform the following operation:
- Update a license when stored in a storage account
Download and extract Pentaho for Azure
Procedure
Navigate to the Support Portal and download the Azure version of the Docker image with the corresponding license file for the applications you want to install on your workstation.
Extract the image to view the directories and the readme file.
The image package file (<package-name>.tar.gz) contains the following:
Name | Content description |
image | Directory containing all the Pentaho source images. |
templates | Directory containing templates for various operations. |
yaml | Directory containing YAML configuration files and various utility files. |
README.md | File containing a link to detailed information about what we are providing for this release. |
In the image directory, unpack the tar.gz file that contains the Pentaho Docker image layers.
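On a Linux or macOS workstation, the extraction step might look like the following sketch. The package name is the placeholder used above, and `run` is a hypothetical convenience helper that only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Placeholder; use the file name you downloaded from the Support Portal.
PACKAGE="<package-name>.tar.gz"

run tar -tzf "$PACKAGE"   # list the archive contents before extracting
run tar -xzf "$PACKAGE"   # extract into the current directory
```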
Create an Azure ACR
Before pushing the Pentaho image to Azure, you need to create an Azure Container Registry (ACR).
Procedure
Create an ACR repository to load Pentaho.
For information on how to create an Azure ACR, see Create an ACR repository.
Record the name of the ACR repository that you have created in the Worksheet for Azure hyperscaler.
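As an alternative to the portal, the registry can be created with the Azure CLI. This is a minimal sketch: the resource group and registry names are hypothetical, the Basic SKU is an assumption, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

RESOURCE_GROUP="pentaho-rg"   # hypothetical; an existing resource group
ACR_NAME="pentahoacr"         # hypothetical; must be globally unique and alphanumeric

# Create the registry (the SKU is an assumption; pick the tier you need).
run az acr create --resource-group "$RESOURCE_GROUP" --name "$ACR_NAME" --sku Basic

# Show the login server name to record in the worksheet.
run az acr show --name "$ACR_NAME" --query loginServer --output tsv
```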
Push the Pentaho Docker image to ACR
Select and tag the Pentaho Docker image and then push it to the ACR registry.
Procedure
Navigate to the image directory containing the Pentaho tar.gz files.
Select and load the tar.gz file into the local registry by running the following command:
docker load -i <pentaho-image>.tar.gz
Record the name of the source image that was loaded into the registry by using the following command:
docker images
Tag the source image so it can be pushed to the cloud platform by using the following command:
docker tag <source-image>:<tag> <target-repository>:<tag>
Push the image file into the ACR registry by using the following Docker command:
docker push <target-repository>:<tag>
Note: For general Azure instructions on how to push an image to Azure, see Pushing a Docker image.
The Azure portal displays the uploaded image URI.
Record the newly created ACR repository URI in the Worksheet for Azure hyperscaler.
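The tag and push steps above can be sketched end to end as follows. The registry, image, and tag names are hypothetical, and the sketch assumes the registry is in the Azure public cloud, where the login server has the form `<registry-name>.azurecr.io`. The `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

ACR_NAME="pentahoacr"          # hypothetical registry name
SOURCE_IMAGE="pentaho/carte"   # hypothetical; use the name shown by `docker images`
TAG="latest"                   # hypothetical; use the tag shown by `docker images`

# Fully qualified target reference in the registry.
TARGET="${ACR_NAME}.azurecr.io/${SOURCE_IMAGE}:${TAG}"

run az acr login --name "$ACR_NAME"                # authenticate Docker against the registry
run docker tag "${SOURCE_IMAGE}:${TAG}" "$TARGET"  # tag the loaded image for the registry
run docker push "$TARGET"                          # upload the image
```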
Create an Azure storage account for Carte
Create a storage account only if you want to do one or more of the following. Otherwise, proceed to create the AKS cluster or ACI instance:
- Store the Pentaho license. (Alternatively, you can store the license in the Kubernetes Secret. In that case, do not store the license in the storage account as storing a license in both places is not supported by Pentaho.)
- Add transformations or jobs that you want to run.
- Add third-party JAR files, such as JDBC drivers or custom JAR files, for the Carte Server to use.
- Customize the default Carte Server configuration.
- Replace the Carte files.
- Upload or update the metastore.
- Add files to the Carte Server's .kettle directory.
Procedure
To create a storage account, see Creating a storage account. To upload a file to the storage account’s file share, see Creating and using Azure file shares.
Record the newly created storage account name and the corresponding file share name in the Worksheet for Azure hyperscaler.
Upload files into the storage account’s file share.
After the storage account is created, upload the relevant files to the appropriate directory in the file share using the Azure Portal. The relevant Pentaho Carte directories are explained below:
Directory | Actions |
/root | All the files in the root directory of the file share are copied to the Carte Server's .kettle directory. If you need to copy a file to the Carte Server's .kettle directory, drop the file in the root directory of the file share. |
custom-lib | If your Carte Server needs custom JAR libraries, add the custom-lib directory to your storage account and place the libraries there. Any files within this directory are copied to the Carte Server's lib directory. |
jdbc-drivers | If your Carte Server needs JDBC drivers, add the jdbc-drivers directory to your storage account and place the drivers in this directory. Any files within this directory are copied to the Carte Server's lib directory. |
plugins | If your Pentaho installation needs additional plugins installed, add the plugins directory to your file share. Any files within this directory are copied to the Carte Server's plugins directory. For this reason, organize the plugins in their own directories as expected by the Carte Server. |
drivers | If your Carte Server needs big data drivers installed, add the drivers directory to your file share and place the big data drivers in this directory. Any files within this directory are copied to the Carte Server's drivers directory. |
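The storage account, file share, and uploads described above can also be scripted with the Azure CLI instead of the portal. This is a sketch under stated assumptions: all names and the driver file are hypothetical, the SKU is an assumption, and the upload commands expect credentials to be available (for example through the `AZURE_STORAGE_KEY` environment variable). The `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

RESOURCE_GROUP="pentaho-rg"        # hypothetical
STORAGE_ACCOUNT="pentahostorage"   # hypothetical; lowercase letters and digits only
SHARE_NAME="carte-share"           # hypothetical
DRIVER_JAR="./my-jdbc-driver.jar"  # hypothetical local file

# Create the storage account and a file share inside it.
run az storage account create --name "$STORAGE_ACCOUNT" --resource-group "$RESOURCE_GROUP" --sku Standard_LRS
run az storage share-rm create --resource-group "$RESOURCE_GROUP" --storage-account "$STORAGE_ACCOUNT" --name "$SHARE_NAME"

# Create the jdbc-drivers directory and upload a driver into it.
run az storage directory create --account-name "$STORAGE_ACCOUNT" --share-name "$SHARE_NAME" --name jdbc-drivers
run az storage file upload --account-name "$STORAGE_ACCOUNT" --share-name "$SHARE_NAME" \
  --source "$DRIVER_JAR" --path "jdbc-drivers/my-jdbc-driver.jar"
```

The same pattern applies to the custom-lib, plugins, and drivers directories.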
The relevant Carte Server files are explained below:
File | Actions |
content-config.properties | Used by the Carte Server's Docker image to determine which storage account files to copy over and where to place them. The instructions are populated as multiple lines in the following format: ${KETTLE_HOME_DIR}/<some-dir-or-file>=${APP_DIR}/<some-dir> A template for this file can be found in the templates project directory. |
content-config.sh | A bash script that can be used to configure files, change file and directory ownership, move files around, install missing apps, and so on. You can add it to the storage account's file share. It is executed in the Docker image after the other files are processed. |
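To illustrate the line format of content-config.properties, the sketch below writes a two-line file into a temporary directory. Both entries are hypothetical examples of the documented pattern, not values taken from the shipped template, so check the template in the templates directory before using them.

```shell
# Write a sample content-config.properties into a scratch directory.
WORK_DIR="$(mktemp -d)"
CONFIG_FILE="$WORK_DIR/content-config.properties"

# Single quotes keep ${KETTLE_HOME_DIR} and ${APP_DIR} literal, so the
# Docker image (not this shell) resolves them. Entries are hypothetical.
printf '%s\n' \
  '${KETTLE_HOME_DIR}/shared.xml=${APP_DIR}/config' \
  '${KETTLE_HOME_DIR}/custom-settings=${APP_DIR}/custom-settings' \
  > "$CONFIG_FILE"

cat "$CONFIG_FILE"
```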
Deployment methods for Azure hyperscaler (Carte)
There are two deployment methods, Azure Container Instances (ACI) and Azure Kubernetes Service (AKS). Choose the one that fits your use case.
The following table lists a few differences between these methods:
Factors | ACI | AKS |
Scalability | Limited scalability: With ACI, you can only run one single Carte instance. Multiple Carte instances and load-balancing cannot be achieved with ACI. | Scalability and high availability: AKS provides automatic scaling and self-healing, which make it ideal for running large and complex workloads that require scalability and high availability. |
Flexibility | Limited flexibility: ACI is a managed service, which means you have limited control over the underlying infrastructure. | Flexibility: AKS provides more control over the underlying infrastructure and allows for greater customization and flexibility. |
Cost | Cost-effective: ACI is a pay-per-second model, which means you only pay for the time your container is running. | Cost: AKS can be more expensive than ACI, especially for small workloads that do not require scaling. |
Maintenance | Minimal maintenance required: ACI is a managed service, so most maintenance tasks are handled by Microsoft. | Maintenance required: AKS requires ongoing maintenance and management, including updates and patches. |
Feature Set | Limited feature set: ACI lacks some of the advanced features available in AKS, such as automatic scaling, self-healing, and service discovery. | Advanced features: AKS provides advanced features such as service discovery, load balancing, and container orchestration, which make it a powerful tool for managing containerized applications. |
Complexity | Simple setup: ACI provides a simple and fast way to run containers without the need to manage a cluster. | Complexity: AKS can be more complex to set up and manage than ACI, especially for users who are not familiar with Kubernetes. |
Use Azure Container Instances (ACI) with the Carte Server
With ACI, you can run only a single server instance. Multiple server instances and load balancing cannot be achieved with ACI. See ACI context creation and prerequisites for more information.
Perform the following steps to deploy Pentaho on an Azure Container Instance:
Procedure
Create a Docker ACI context by entering the following command:
docker context create aci <context-name>
When creating the context, choose the existing resource group that contains your ACR.
Use the created ACI context by entering the following command:
docker context use <context-name>
Open the file docker-compose-server-aci.yml and replace the following values:
Value | Setting |
<image_uri> | Image URI from the ACR in the format name:tag |
<fileshare-name> | The file share name created in the storage account |
<your-storageaccount-name> | Your storage account name |
Replace the following values (first column) in the docker-compose-server-aci.yml file with the setting in the second column:
Value | Setting |
<STORAGE> | STORAGE property from the Pentaho Worksheet for Azure hyperscaler. |
<CARTE_CONFIG_FILE> | CARTE_CONFIG_FILE property from the Pentaho Worksheet for Azure hyperscaler. This contains the file path of the config file. |
<METASTORE_LOCATION> | METASTORE_LOCATION property from the Pentaho Worksheet for Azure hyperscaler. Note: This folder contains the license file. |
Run the following command to apply the updated YAML file:
docker-compose -f docker-compose-server-aci.yml up
Check the overview on the container instance to find the IP address. Use the port number mentioned in the carte-config file.
Example: 8081
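The ACI procedure above might be scripted roughly as follows. This is a sketch, not a verified deployment script: the context, resource group, container group, IP address, and credential values are all placeholders, and the `/kettle/status/` path is assumed to be the Carte status page exposed by your image. The `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

CONTEXT_NAME="carte-aci"       # hypothetical context name
RESOURCE_GROUP="pentaho-rg"    # hypothetical; the group that contains your ACR

# Create and select the ACI context without the interactive prompt.
run docker login azure
run docker context create aci "$CONTEXT_NAME" --resource-group "$RESOURCE_GROUP"
run docker context use "$CONTEXT_NAME"

# Deploy using the edited compose file.
run docker-compose -f docker-compose-server-aci.yml up

# Look up the public IP of the container group (name is a placeholder).
CONTAINER_GROUP="<container-group-name>"
run az container show --resource-group "$RESOURCE_GROUP" --name "$CONTAINER_GROUP" \
  --query ipAddress.ip --output tsv

# Build the status URL from the IP and the port in the carte-config file.
CARTE_IP="203.0.113.10"        # placeholder; replace with the IP found above
CARTE_PORT="8081"
STATUS_URL="http://${CARTE_IP}:${CARTE_PORT}/kettle/status/"
run curl --user "<carte-user>:<carte-password>" "$STATUS_URL"
```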
Use Azure Kubernetes Service (AKS) (Carte)
Use Azure Kubernetes Service (AKS) to create a cluster for running the Carte Server.
Perform the following steps to deploy Pentaho on the Azure Kubernetes Service (AKS):
Procedure
To create an AKS cluster, you can follow the steps outlined in the Azure documentation: https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough-portal
Note: You need to have “contributor” built-in role level permissions to work with all the services. Check the specific permissions associated with your custom role by going to the Access Control (IAM) section in the subscription that you have.
Install kubectl using this command:
az aks install-cli
Note: See Kubectl Install and connect to cluster for more information.
Use the following steps to create a namespace, for example docker-namespace (this can also be created when applying the yaml files):
Go to the newly created AKS cluster and go to namespaces.
Click +Create and specify the name.
Record the name of the newly created namespace in the Pentaho Hyperscaler Installation Worksheet.
Use the az aks get-credentials command to retrieve the kubeconfig from an AKS cluster in Azure. This command retrieves and merges the AKS cluster's credentials into your local kubeconfig file. Here is a sample command:
az aks get-credentials --admin --name docker-aks --resource-group docker-rover
To configure the Carte Server's YAML file, open the file pentaho-carte-azure-aks.yaml in the yaml project directory.
In the secrets.yml file, replace the following variables from the worksheet:
<your-namespace-name>: Specify your AKS namespace name.
<your-secret-name>: Specify your AKS Secret name.
<your-storage-account-name-base64encoding>: Run the command echo -n "<your-storage-account-name>" | base64 and specify the output of this command as the value.
<your-storage-account-key-base64encoding>: Run the command echo -n "<your-storage-account-key>" | base64 and specify the output of this command as the value.
<your-storageaccount-name>: Specify your storage account name.
Note: Your storage account details can be found in the Access keys section.
To apply this secrets.yml file, enter the following command at the command prompt in the directory where the file exists:
kubectl apply -f secrets.yml
For the yaml files, replace the following variables from the Worksheet for Azure hyperscaler:
<image_uri>
<your-namespace-name>
<fileshare-name>
<your-secret-name>
Add the Pentaho license in the storage account within the .pentaho folder of your metastore folder.
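The base64-encoded values that secrets.yml requires earlier in this procedure can be produced as sketched below. The account name and key are hypothetical sample values. One detail worth noting: GNU `base64` wraps long output at 76 characters, and a real storage account key is long enough to trigger that, so the sketch strips newlines from the result.

```shell
# Hypothetical sample values; never commit a real account key.
STORAGE_ACCOUNT_NAME="mystorageaccount"
STORAGE_ACCOUNT_KEY="example-key-not-a-real-secret"

# printf '%s' emits no trailing newline (like echo -n) and is portable.
# tr -d '\n' removes the line wrapping that GNU base64 adds to long values.
NAME_B64="$(printf '%s' "$STORAGE_ACCOUNT_NAME" | base64 | tr -d '\n')"
KEY_B64="$(printf '%s' "$STORAGE_ACCOUNT_KEY" | base64 | tr -d '\n')"

echo "<your-storage-account-name-base64encoding>: $NAME_B64"
echo "<your-storage-account-key-base64encoding>: $KEY_B64"
```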
Deploy the Carte Server using a Pentaho license file stored on the storage account with the following command:
kubectl apply -f <PATH TO PENTAHO DEPLOYMENT YAML>
Use the following steps to test the Pentaho/Carte Server:
Retrieve the Service URI by running the following command in your workstation console:
echo $( kubectl get ingress -n <your-namespace-name> -o jsonpath='{.items..hostname}' )
Note: The port number for this Carte Server is 8081, which is specified in the yaml file.
Open the URI that you retrieved in the previous step in a Pentaho-supported browser. Alternatively, go to the Azure portal and find the URI in the Services and Ingress section of your AKS cluster. The Pentaho/Carte Server login screen appears.
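For reference, the command-line portion of the AKS procedure can be collected into one sketch. The cluster, resource group, and namespace names reuse the sample values from this section. Creating the namespace with `kubectl` instead of the portal is an assumption about your workflow, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

AKS_CLUSTER="docker-aks"         # sample cluster name used in this section
RESOURCE_GROUP="docker-rover"    # sample resource group used in this section
NAMESPACE="docker-namespace"     # sample namespace name used in this section

# Merge the cluster credentials into the local kubeconfig.
run az aks get-credentials --admin --name "$AKS_CLUSTER" --resource-group "$RESOURCE_GROUP"

# Create the namespace from the command line (alternative to the portal).
run kubectl create namespace "$NAMESPACE"

# Apply the secret, then the Carte Server deployment.
run kubectl apply -f secrets.yml
run kubectl apply -f pentaho-carte-azure-aks.yaml

# Check that the pod is running and that the ingress has an address.
run kubectl get pods --namespace "$NAMESPACE"
run kubectl get ingress --namespace "$NAMESPACE"
```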
Update a license when stored in a storage account
Perform the following steps to update a license that is stored in a storage account:
Procedure
Navigate to the home/pentaho directory.
Run the load-data.sh script.
Run the installlicenses.sh script.
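On an AKS deployment, one way to run these scripts is through `kubectl exec`, as sketched below. This assumes the scripts are meant to run inside the Carte Server container and that the directory is `/home/pentaho` on the container file system; neither is confirmed by this section. The namespace is the sample value used earlier, the pod name is a placeholder, and the `run` helper only prints each command unless `DRY_RUN=0` is set.

```shell
# Print each command by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

NAMESPACE="docker-namespace"   # sample namespace name; use your own
POD="<carte-pod-name>"         # placeholder; take it from the pod listing below

# Find the Carte Server pod.
run kubectl get pods --namespace "$NAMESPACE"

# Run both scripts in order; the second runs only if the first succeeds.
run kubectl exec --namespace "$NAMESPACE" "$POD" -- \
  sh -c 'cd /home/pentaho && ./load-data.sh && ./installlicenses.sh'
```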
Worksheet for Azure hyperscaler
To access the common worksheet for the Azure hyperscaler, go to Worksheet for Azure hyperscaler.