Hitachi Vantara Lumada and Pentaho Documentation

Installing Data Catalog on Kubernetes with Helm

Use this installation method if all your data sources are on networks with low-latency connections to the Kubernetes cluster and you do not have any Hadoop-based data sources (HDFS or Hive). In this installation, all necessary Data Catalog components are deployed entirely in the Kubernetes cluster.

Before you begin

Before you begin to install Data Catalog on Kubernetes, you must obtain the relevant Helm chart and Docker images.

Contact Hitachi Vantara Lumada Data Catalog support to obtain access to the following artifacts:

  • Helm chart

    ldc-7.0.0.tgz

  • Docker images

    By default, the Hitachi Vantara-owned Docker images are not publicly available. The following Docker image files will be provided by Hitachi Vantara:

    • lumada-catalog/app-server:7.0.0
    • lumada-catalog/agent:7.0.0
    • lumada-catalog/mongodb-migration-tool:7.0.0
    • lumada-catalog/spark:3.1.1-hadoop-2.10.1-1.0.0
    • lumada-catalog/mongodb:5.0.6-220222.21-master-ee
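
As a quick checklist, the following shell sketch prints the full reference each image would have once it is in a private registry. It assumes the images keep their lumada-catalog/... path under the registry host. The function name ldc_image_refs is hypothetical and not part of the product.

```shell
#!/bin/sh
# Print the full image reference for each Data Catalog image listed above.
# Assumption: images keep their lumada-catalog/... path under the registry host.
ldc_image_refs() {
  registry="$1"
  for image in \
    lumada-catalog/app-server:7.0.0 \
    lumada-catalog/agent:7.0.0 \
    lumada-catalog/mongodb-migration-tool:7.0.0 \
    lumada-catalog/spark:3.1.1-hadoop-2.10.1-1.0.0 \
    lumada-catalog/mongodb:5.0.6-220222.21-master-ee
  do
    printf '%s/%s\n' "$registry" "$image"
  done
}

# Example registry host taken from the commands in this article.
ldc_image_refs myregisty.azurecr.io
```

The output can be compared against the images actually present in your registry after loading.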

Installing on Kubernetes

Perform the following tasks to install Data Catalog on Kubernetes with Helm:

  1. Load Docker images and Helm charts
  2. Create Kubernetes secrets
  3. Customize Helm chart values
  4. Deploy the Helm chart

Load Docker images and Helm charts

Kubernetes can create, manage, and run Data Catalog containers only after the Docker images are loaded. To load them from the archive into the private registry, use the helper script.

Procedure

  1. Log in to the registry, and make sure you have the following software, which the script requires:

    • jq
    • docker
    • tar
  2. Load the Docker images on each node of your cluster with the following example command:

    ./ldc-load-images.sh -r myregisty.azurecr.io --images /tmp/ldc-images-7.0.0.tar.gz
  3. (Optional) Re-tag the loaded images and push them to your organization's Docker registry for improved manageability.
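
The prerequisite check and the optional re-tagging step might be scripted along the following lines. This is a minimal sketch: the function names missing_tools and retag_commands and the target registry registry.example.com are hypothetical, and it assumes the loaded images are tagged under the registry passed to the helper script. The docker commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Print each tool from the argument list that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Print the docker commands that would re-tag one image and push it.
# Arguments: source registry, target registry, image path with tag.
retag_commands() {
  src_registry="$1"
  dst_registry="$2"
  image="$3"
  printf 'docker tag %s/%s %s/%s\n' "$src_registry" "$image" "$dst_registry" "$image"
  printf 'docker push %s/%s\n' "$dst_registry" "$image"
}

# Warn about missing prerequisites before running the helper script.
missing="$(missing_tools jq docker tar)"
if [ -n "$missing" ]; then
  printf 'Missing required tools: %s\n' "$(echo $missing)" >&2
fi

# registry.example.com is a placeholder for your organization's registry.
retag_commands myregisty.azurecr.io registry.example.com lumada-catalog/app-server:7.0.0
```

Piping the printed commands to sh would execute them; repeating the call for each image covers the full set.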

Create Kubernetes secrets

You create Kubernetes secrets to store and maintain small amounts of sensitive data, such as passwords, tokens, and keys.

Perform the following steps to create secrets for the configuration information in the core-site.xml file and for your license files:

Procedure

  1. Use the following example command to create a secret for your configurations:

    kubectl create secret generic ldc-custom-core-site --from-file="core-site.xml"
    Note: The file must specifically be named core-site.xml.
  2. Use the following example command to create secrets for your ldc-license-public-keystore.p12 and license-features.yaml license files:

    kubectl create secret generic ldc-license --from-file=license-features.yaml --from-file=ldc-license-public-keystore.p12
    Note: These files must specifically be named ldc-license-public-keystore.p12 and license-features.yaml.
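
Because the file names are fixed, it can help to confirm they exist before creating the secrets. The sketch below is one way to do that; the function names missing_secret_files and secret_commands are hypothetical, and the script assumes all three files sit in a single directory. It prints the kubectl commands instead of running them.

```shell
#!/bin/sh
# Print the name of each required file that is missing from directory $1.
missing_secret_files() {
  dir="$1"
  for name in core-site.xml license-features.yaml ldc-license-public-keystore.p12; do
    [ -f "$dir/$name" ] || printf '%s\n' "$name"
  done
}

# Print the kubectl commands from the procedure, using files in directory $1.
# kubectl uses the base file name as the key inside the secret.
secret_commands() {
  dir="$1"
  printf 'kubectl create secret generic ldc-custom-core-site --from-file=%s/core-site.xml\n' "$dir"
  printf 'kubectl create secret generic ldc-license --from-file=%s/license-features.yaml --from-file=%s/ldc-license-public-keystore.p12\n' "$dir" "$dir"
}

# Check the current directory, then print the commands if nothing is missing.
missing="$(missing_secret_files .)"
if [ -n "$missing" ]; then
  printf 'Missing files: %s\n' "$(echo $missing)" >&2
else
  secret_commands .
fi
```

After the secrets are created, kubectl get secret ldc-custom-core-site ldc-license lists them if both exist.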

Customize Helm chart values

You will need to customize certain values that you provide to the Helm chart. Create a custom-values.yaml file with contents similar to the following examples.

Perform the following steps to customize your custom-values.yaml Helm deployment file:

Procedure

  1. Create an empty local custom-values.yaml file with the following example command:

    touch custom-values.yaml
  2. Customize the local version of your YAML file similar to one of the following examples:

    • Minimum required configuration for services exposed via NodePort:

      keycloak:
        service:
          type: NodePort
          nodePort: 31111
      app-server:
        service:
          type: NodePort
          httpsNodePort: 31112
        keycloak:
          authServerUrl: "http://<k8s node host name>:31111/auth"
      global:
        registry: myregisty.azurecr.io
    • Minimum required configuration for services exposed via the Ingress controller:

      keycloak:
        ingress:
          enabled: true
          hosts:
          - host: keycloak-dev1.hv.com
            paths:
              - path: /
                pathType: Prefix
          tls:
            - hosts:
              - "keycloak-dev1.hv.com"
              secretName: keycloak-ingress-certs
      app-server:
        ingress:
          enabled: true
          hosts:
          - host: app-server-dev1.hv.com
            paths:
              - path: /
                pathType: Prefix
          tls:
            - hosts:
              - "app-server-dev1.hv.com"
              secretName: app-server-ingress-certs
        keycloak:
          authServerUrl: "https://keycloak-dev1.hv.com/auth"
      global:
        registry: myregisty.azurecr.io
  3. Apply further customizations as needed based on the tables in Chart parameters.
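
The NodePort example can also be generated from a script, which keeps the node host name and registry in one place. The following is a sketch under stated assumptions: the function name nodeport_values and the host name k8s-node-1.example.com are hypothetical, and the port numbers are simply those from the example above.

```shell
#!/bin/sh
# Print a NodePort-style values file to standard output.
# Arguments: Kubernetes node host name, image registry host.
nodeport_values() {
  node_host="$1"
  registry="$2"
  cat <<EOF
keycloak:
  service:
    type: NodePort
    nodePort: 31111
app-server:
  service:
    type: NodePort
    httpsNodePort: 31112
  keycloak:
    authServerUrl: "http://${node_host}:31111/auth"
global:
  registry: ${registry}
EOF
}

# k8s-node-1.example.com is a placeholder host name.
nodeport_values k8s-node-1.example.com myregisty.azurecr.io
```

Redirecting the output, for example nodeport_values k8s-node-1.example.com myregisty.azurecr.io > custom-values.yaml, writes the file used for deployment.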

Deploy the Helm chart

With the Helm chart customized for your system, you can deploy it to start running Data Catalog.

Perform the following steps to deploy the chart and start running Data Catalog:

Procedure

  1. Use the following example command to deploy the chart:

    helm install --wait ldc ldc-7.0.0.tgz -f custom-values.yaml
  2. Use the following example command to temporarily access Data Catalog for a deployment test:

    kubectl port-forward svc/ldc-app-server 8082:8082
  3. Open http://localhost:8082 in your local browser to test the deployment.
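
The deployment steps could be wrapped in a small script that renders the chart locally before installing it, so that YAML or template errors surface early. This is a sketch, not the product's own tooling: the run and deploy_ldc helpers and the DRY_RUN variable are hypothetical, and the values file is assumed to be the custom-values.yaml created earlier. By default the script only prints the commands.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1 (the default in this sketch).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Render, install, and list pods for one release.
# Arguments: release name, chart archive, values file.
deploy_ldc() {
  release="$1"
  chart="$2"
  values="$3"
  # Render the chart locally first; this fails on template or values errors.
  run helm template "$release" "$chart" -f "$values" || return 1
  # Install and wait until the resources report ready.
  run helm install --wait "$release" "$chart" -f "$values" || return 1
  # List pods in the current namespace to confirm they are running.
  run kubectl get pods
}

deploy_ldc ldc ldc-7.0.0.tgz custom-values.yaml
```

Setting DRY_RUN=0 in the environment executes the commands instead of printing them.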

Next steps

To make Data Catalog generally available, you can expose the ldc-app-server service using your cluster's Ingress gateway or load balancer, depending on your system.

Chart parameters

You can customize your Helm chart for Data Catalog with the following parameters:

  • Agent

    Parameter | Type | Description | Default value
    agent.appServerGraphQLUrl | string | | None (leave unspecified)
    agent.appServerWSUrl | string | An empty value means the value will be generated using the template function | None (leave unspecified)
    agent.enabled | boolean | | true
    agent.isDefault | boolean | | true
    agent.serviceAccount.create | string | | true
    agent.serviceAccount.name | string | | None (leave unspecified)
    agent.spark.historyServer.address | string | | "http://{{ .Release.Name }}-spark-history-server:18080"
    agent.spark.historyServer.storageLocation | string | | "s3a://spark-history/events/"
    agent.spark.jarUpload.accessKey | string | | "minioadmin"
    agent.spark.jarUpload.endpoint | string | | "http://{{ .Release.Name }}-minio-bundled:9000"
    agent.spark.jarUpload.location | string | | "s3a://ldc/cluster_jars"
    agent.spark.jarUpload.secretKey | string | | "minioadmin"
    agent.spark.jarUpload.secretToken | string | | None (leave unspecified)
    agent.spark.k8sMasterEnabled | boolean | | true
    agent.spark.secure | boolean | | true
    agent.spark.serviceAccount | boolean | | "{{ .Release.Name }}-spark"
  • Application server

    Parameter | Type | Description | Default value
    app-server.configurationOverridesExtraEnv[0].name | string | | "MINIO_SECRET_KEY"
    app-server.configurationOverridesExtraEnv[0].value | string | | "minioadmin"
    app-server.configurationOverridesExtraEnv[1].name | string | | "MINIO_ACCESS_KEY"
    app-server.configurationOverridesExtraEnv[1].value | string | | "minioadmin"
    app-server.configurationOverridesExtraEnv[2].name | string | | "MINIO_ENDPOINT"
    app-server.configurationOverridesExtraEnv[2].value | string | | "http://{{ .Release.Name }}-minio-bundled:9000"
    app-server.configurationOverrides[0].component | string | | "__template_agent"
    app-server.configurationOverrides[0].propertyKey | string | | "ldc.metadata.hdfs.large_properties.attributes"
    app-server.configurationOverrides[0].value[0] | string | | "fs.s3a.access.key=${MINIO_ACCESS_KEY}"
    app-server.configurationOverrides[0].value[1] | string | | "fs.s3a.secret.key=${MINIO_SECRET_KEY}"
    app-server.configurationOverrides[0].value[2] | string | | "fs.s3a.endpoint=${MINIO_ENDPOINT}"
    app-server.configurationOverrides[0].value[3] | string | | "fs.s3a.path.style.access=true"
    app-server.configurationOverrides[0].value[4] | string | | "fs.s3a.threads.max=40"
    app-server.configurationOverrides[0].value[5] | string | | "fs.s3a.connection.maximum=200"
    app-server.debug | boolean | Print debug messages in the log | false
    app-server.enabled | boolean | | true
    app-server.keycloak.authPass | string | Password for role syncing | "admin"
    app-server.keycloak.authServerUrl | string | Base URL for your realm authorization endpoint. Must be accessible from the client's browser | "http://localhost:8080/auth"
    app-server.keycloak.authUser | string | User name for role syncing | "admin"
    app-server.keycloak.callbackUrl | string | URL to which Keycloak redirects the user after granting authentication. By default it is relative, but it can be an absolute URL | "/callback"
    app-server.keycloak.clientID | string | This should match your Application Name, resource, or OAuth Client Name | "ldc-client"
    app-server.keycloak.realm | string | Name of your Keycloak realm | "ldc-realm"
    app-server.keycloak.resource | string | This should match your Application Name, resource, or OAuth Client Name | "ldc-client"
    app-server.mongodbURI | string | | nil
  • Miscellaneous

    Parameter | Type | Description | Default value
    global.registry | string | Override registry for Hitachi Vantara-managed images. By default, they are in ldmp-docker.repo.orl.eng.hitachivantara.com | nil
    keycloak.enabled | boolean | Whether to deploy the dev/demo Keycloak instance | true
    minio-bundled | object | MinIO Helm chart configuration | For reference, see the MinIO Helm chart
    minio-bundled.enabled | boolean | Whether to deploy the dev/demo MinIO instance | true
    mongodb.enabled | boolean | Whether to deploy the dev/demo MongoDB instance | true
    spark-history-server.enabled | boolean | | true
    tekton-hooks.enabled | boolean | | false
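
Individual parameters from these tables can also be overridden on the command line with Helm's --set flag instead of editing the values file. The sketch below builds such flags from key=value pairs and prints a helm upgrade command. The function name set_flags is hypothetical, and the two overrides shown are only illustrative choices, not recommended settings.

```shell
#!/bin/sh
# Turn key=value arguments into repeated --set flags for helm.
set_flags() {
  flags=""
  for pair in "$@"; do
    flags="$flags --set $pair"
  done
  # Strip the leading space left by the first concatenation.
  printf '%s\n' "${flags# }"
}

# Print an upgrade command for the release deployed in this article.
overrides="$(set_flags app-server.debug=true spark-history-server.enabled=false)"
printf 'helm upgrade ldc ldc-7.0.0.tgz -f custom-values.yaml %s\n' "$overrides"
```

Values passed with --set take precedence over those in the file given with -f, so this suits short-lived changes such as enabling debug logging.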