
Prerequisites and follow-up actions for upgrading the Lumada Data Catalog solution in LDOS

Before you upgrade Lumada Data Catalog to the latest version in Lumada DataOps Suite, you first need to back up the LDC repository. After upgrading, you need to restore the repository.

You can upgrade your Lumada Data Catalog solution in the Solution management window. See Upgrade solutions for instructions on upgrading previously installed solutions.

Backing up the LDC repository is a three-step process. Be sure to back up the following three components in this order:

  1. Back up the Postgres server.
  2. Back up the Solr platform.
  3. Back up the discovery cache.

After you upgrade your LDC solution, restore the repository that you backed up. Restoring is also a three-step process performed in the following order:

  1. Restore the Postgres server.
  2. Restore the Solr platform.
  3. Restore the discovery cache.

Repository backup and restore is typically performed by the system administrators or IT administrators who installed LDC.

Back up the Postgres server

Postgres credentials are stored as Secrets in the Kubernetes cluster. This task assumes you have access to the Kubernetes cluster and that the kubectl command-line tool is configured to communicate with the cluster. Perform the following steps to back up the Postgres server:

Procedure

  1. Retrieve the Postgres user secret from the cluster by running the following command:

    kubectl get secrets -n <namespace>

    The Postgres user secret appears in the output.

    Note: The format of the secret name is postgres.t0-catalog-pg-cluster.credentials.postgresql.acid.zalan.do, where t0-catalog-pg-cluster is the name of the cluster.
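    For example, you can decode the credentials stored in the secret (values are base64-encoded). The secret name below is illustrative; substitute the name returned by the previous command:

    kubectl get secret postgres.t0-catalog-pg-cluster.credentials.postgresql.acid.zalan.do \
      -n <namespace> -o jsonpath='{.data.username}' | base64 -d
    kubectl get secret postgres.t0-catalog-pg-cluster.credentials.postgresql.acid.zalan.do \
      -n <namespace> -o jsonpath='{.data.password}' | base64 -d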
  2. Create a persistent volume claim (PVC) file in the hitachi-solutions namespace to back up the data. Save it as postgres-backup-restore-pvc.yaml as in the following example:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: postgres-backup-restore-volume
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
  3. Create a database backup job and define it as postgres-backup-job.yaml. Replace the variables <postgres-master-pod-svc-name>, <postgres-user-secret>, and <ldc-data-base> with the applicable values for your system:

    Note: Be sure to use the Postgres user secret, not the ldcuser secret.

    The following is an example of the postgres-backup-job.yaml file:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: postgres-backup
    spec:
      template:
        metadata:
          name: backup-pg-data
        spec:
          containers:
            - name: postgres-backup
              image: postgres:12
              imagePullPolicy: Always
              env:
                - name: POSTGRES_HOST
                  value: <postgres-master-pod-svc-name>
                - name: POSTGRES_USER
                  valueFrom:
                    secretKeyRef:
                      name: <postgres-user-secret>
                      key: username
                - name: POSTGRES_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: <postgres-user-secret>
                      key: password
                - name: POSTGRES_DB
                  value: <ldc-data-base>
                - name: BACKUP_FILE
                  value: /var/backups/postgres-data.tar
              volumeMounts:
                - mountPath: /var/backups
                  name: postgres-backup-pvc
              command:
                - sh
                - -c
                - |
                  #!/usr/bin/env bash
                  export PGPASSWORD=$POSTGRES_PASSWORD
                  
                  echo "Postgres data backup: $BACKUP_FILE"
                  until pg_isready -h $POSTGRES_HOST -U $POSTGRES_USER -d $POSTGRES_DB; do sleep 2; done;
                  pg_dump -h $POSTGRES_HOST -U $POSTGRES_USER -Ft $POSTGRES_DB > $BACKUP_FILE
                  echo 'Backup finished!'
          restartPolicy: Never
          volumes:
            - name: postgres-backup-pvc
              persistentVolumeClaim: 
                claimName: postgres-backup-restore-volume
  4. Deploy the PVC file using the following command:

    kubectl apply -f postgres-backup-restore-pvc.yaml -n hitachi-solutions
  5. Run the backup job using the following command:

    kubectl apply -f postgres-backup-job.yaml -n hitachi-solutions

Results

The Postgres server is backed up to the PVC.
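To verify that the backup completed, you can check the job status and logs (assuming the job was created in the hitachi-solutions namespace):

    kubectl get job postgres-backup -n hitachi-solutions
    kubectl logs job/postgres-backup -n hitachi-solutions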

Back up the Solr platform

This task assumes that you have access to the Kubernetes cluster, that the kubectl command-line tool is configured to communicate with the cluster, and that you have completed the Postgres server backup. Perform the following steps to back up the Solr platform:

Before you begin

Verify that a storage provisioner is enabled in the cluster so that you can create persistent volume claims that are shared between multiple pods.
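One quick way to check, assuming dynamic provisioning is provided through a StorageClass, is to list the storage classes in the cluster:

    kubectl get storageclass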

Procedure

  1. Enable the backup functions in LDC by adding the following code to the custom-values.yaml file in the LDOS user interface:

    solr:
      cloud:
        backupRestoreData:
          enabled: true
  2. Create a persistent volume claim file in the hitachi-solutions namespace to back up the Solr data. Save it as solr-backup-pvc.yaml using the following code:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: solr-backup-persistence
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
  3. Create the custom resource (CR) file to back up the Solr data into a tgz file, as in the following sample:

    apiVersion: solr.bloomberg.com/v1beta1
    kind: SolrBackup
    metadata:
      name: solrbackup
    spec:
      persistence:
        volume:
          source:
            persistentVolumeClaim:
              claimName: solr-backup-persistence
      solrCloud: {{ .Release.Name }}
      collections:
        - {{ collection-name }}
    1. Replace {{ .Release.Name }} with the name of the release where your SolrCloud is running.

    2. Replace {{ collection-name }} with the name of the Solr collection that you want to back up.

  4. Save the CR file as solr-backup.yaml.

  5. Apply the CR using the following command:

    kubectl apply -f solr-backup.yaml -n <namespace>

Results

The Solr data is backed up and stored in the solr-backup-persistence PVC.
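You can monitor the backup's progress by inspecting the status of the SolrBackup custom resource, for example:

    kubectl get solrbackup solrbackup -n <namespace> -o yaml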

Back up the discovery cache

The discovery cache data is stored in a shared object storage service at the location defined by the following configuration:

    config:
      fingerprintStorage:
        uri: "s3a://ldc"
        path: "/home/ldcuser"

Perform the following steps to back up the discovery cache:

Procedure

  1. Navigate to the /home/ldcuser path in the ldc bucket (s3a://ldc/home/ldcuser).

  2. Download the ldc_hdfs_metadata.zip file and store it in a location where you can access it to restore the backup.
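    For example, if the object store exposes an S3-compatible API, you can download the file with the AWS CLI. The endpoint URL below is hypothetical; substitute your object store's endpoint:

    aws s3 cp s3://ldc/home/ldcuser/ldc_hdfs_metadata.zip ./ldc_hdfs_metadata.zip --endpoint-url https://<object-store-endpoint>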

Restore the Postgres server

Perform the following steps to restore the Postgres server using a database restore job.

Procedure

  1. Create a database restore job using the following code and save it as postgres-restore-job.yaml.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: postgres-restore
    spec:
      template:
        metadata:
          name: restore-pg-data
        spec:
          containers:
            - name: postgres-restore
              image: postgres:12
              imagePullPolicy: Always
              env:
                - name: POSTGRES_HOST
                  value: <postgres-master-pod-svc-name>
                - name: POSTGRES_DB
                  value: <ldc-data-base>
                - name: POSTGRES_USER
                  valueFrom:
                    secretKeyRef:
                      name: <postgres-user-secret>
                      key: username
                - name: POSTGRES_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: <postgres-user-secret>
                      key: password
                - name: BACKUP_FILE
                  value: /var/backups/postgres-data.tar
              volumeMounts:
                - mountPath: /var/backups
                  name: postgres-restore-pvc
              command:
                - sh
                - -c
                - |
                  #!/usr/bin/env bash
                  export PGPASSWORD=$POSTGRES_PASSWORD
    
                  echo "Restoring the postgres data from file $BACKUP_FILE"
                  until pg_isready -h $POSTGRES_HOST -p 5432 -U $POSTGRES_USER -d $POSTGRES_DB; do sleep 2; done;
                  pg_restore -U $POSTGRES_USER -h $POSTGRES_HOST -Ft --clean -d $POSTGRES_DB < $BACKUP_FILE
                  echo 'Postgres data restored!'
          restartPolicy: Never
          volumes:
            - name: postgres-restore-pvc
              persistentVolumeClaim:
                claimName: postgres-backup-restore-volume
  2. In the code, replace the following variables with the values for your environment:

    • <postgres-master-pod-svc-name>
    • <postgres-user-secret>
    • <ldc-data-base>

    Note: You must use the Postgres user secret credentials that you retrieved in the backup process. If you use the ldcuser secret, the data restoration fails.
  3. Run the database restore job.

    Note: If the Postgres database has been deleted, you can re-create the database with your restore job by using the --create option with the pg_restore command. For example: pg_restore -U $POSTGRES_USER -h $POSTGRES_HOST -Ft --clean --create -d $POSTGRES_DB < $BACKUP_FILE
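    A minimal sketch of running the restore job and checking its output, assuming the job file is applied in the hitachi-solutions namespace like the backup job:

    kubectl apply -f postgres-restore-job.yaml -n hitachi-solutions
    kubectl logs job/postgres-restore -n hitachi-solutions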

Results

The Postgres database is restored to the cluster.

Restore the Solr platform

Perform the following steps to restore the data to the SolrCloud:

Procedure

  1. Create a restore job using the following code.

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: solr-collection-restore
    spec:
      template:
        metadata:
          name: collection-restore
        spec:
          restartPolicy: Never
          containers:
            - name: restore-collection-job
              image: registry_name/lumada-data-catalog/solr-collection-manage-hook:8.4.1-6.1.1
              volumeMounts:
                - mountPath: /opt/solr/backup-mount
                  name: solr-backup-persistence-volume
                - mountPath: /var/solr/data/backup-restore
                  name: solr-backup-restore-volume
                  subPath: cloud/t0
              env:
                - name: "SOLR_COLLECTION"
                  value: {{ collection-name }}
                - name: "SOLR_HTTP_URL"
                  value: {{ solr-http-url }}
                - name: "JOB_MODE"
                  value: "RESTORE_COLLECTION"
                - name: "SOLR_BACKUP_NAME"
                  value: "solrbackup"
                - name: "LDC_COLLECTION_HOOK_DEBUG"
                  value: "true"
          volumes:
            - name: solr-backup-persistence-volume
              persistentVolumeClaim:
                claimName: {{ backup-persistence-name }}
            - name: solr-backup-restore-volume
              persistentVolumeClaim:
                claimName: {{ restore-pvc-name }}
  2. In the code, replace the following placeholders with the values for your environment:

    • registry_name

    • {{ collection-name }}

      The collection name where you want to restore Solr

    • {{ solr-http-url }}

      The SolrCloud URL. You can use the SolrCloud common service name here. For example: http://t0-solrcloud-common

    • {{ backup-persistence-name }}

      The name of the PVC created in the Solr platform backup procedure (solr-backup-persistence, defined in solr-backup-pvc.yaml).

    • {{ restore-pvc-name }}

      The name of the PVC enabled in the Solr platform backup step (custom-values.yaml). This PVC was created during the catalog installation. Its name has the format {{ .Release.Name }}--solr-backup-restore-volume-claim. You can list the claims in the namespace to find the exact names, as shown below.
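    For example, assuming the catalog runs in the hitachi-solutions namespace, you can list the persistent volume claims to find the exact names for both placeholders:

    kubectl get pvc -n hitachi-solutions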

  3. Save the edited code as the job file solr-collection-restore.yaml.

  4. Run the solr-collection-restore.yaml job by entering the following command:

    kubectl apply -f solr-collection-restore.yaml -n hitachi-solutions
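    To confirm that the restore completed, you can check the job status and logs, for example:

    kubectl get job solr-collection-restore -n hitachi-solutions
    kubectl logs job/solr-collection-restore -n hitachi-solutions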

Restore the discovery cache

Perform the following steps to restore the discovery cache.

Procedure

  1. Remove the data in this path: s3a://ldc/home/ldcuser.

  2. Retrieve the ldc_hdfs_metadata.zip file from the location where you stored the discovery cache backup and extract it.

  3. Upload the extracted data to s3a://ldc/home/ldcuser.
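    A sketch of these steps using the AWS CLI, assuming the object store exposes an S3-compatible API (the endpoint URL is hypothetical; substitute your object store's endpoint):

    aws s3 rm s3://ldc/home/ldcuser --recursive --endpoint-url https://<object-store-endpoint>
    unzip ldc_hdfs_metadata.zip -d ldc_hdfs_metadata
    aws s3 cp ldc_hdfs_metadata s3://ldc/home/ldcuser --recursive --endpoint-url https://<object-store-endpoint>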

Results

The discovery cache is restored to the cluster.