
Backup, failover, and recovery


Lumada Data Catalog takes advantage of the failover and redundancy capabilities of HDFS and Solr.

Failover

To configure Data Catalog and Solr so that you can recover quickly should a server fail, we recommend the following configuration:

  • SolrCloud with at least three ZooKeeper nodes in the ensemble and at least two Solr servers controlled by ZooKeeper.

    If a Solr server fails, ZooKeeper routes requests to the active replica. ZooKeeper handles its own failover.

  • Data Catalog installed in two locations where only one web server is active at a time.

    If the primary Data Catalog node fails, you can start Data Catalog on the secondary node, as shown in the sketch after this list. The secondary system should be configured to access the same Solr repository, and its web server should be accessible at the same URL.
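
For example, if the primary node goes down, you can bring up the standby as follows. This is a minimal sketch; it assumes the secondary node uses the same installation layout (/opt/ldc) and service user (waterlinesvc) shown in the procedures later in this article.

    # On the secondary node, start the Data Catalog services as the service user.
    $ sudo su waterlinesvc
    $ /opt/ldc/app-server/bin/app-server start
    $ /opt/ldc/agent/bin/agent start
    $ /opt/ldc/metadata-server/bin/metadata-server start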

Backup

To back up all Data Catalog metadata so that you can restore the catalog to a point in time, make regular backups of the following:

  • Solr data and indexes.
  • Postgres database that contains audit information for Data Catalog functions.
  • The Keystore file that contains the encrypted passwords for the current installation.
Caution: Back up the keystore before any upgrade.

The upgrade process generally takes care of backing up the keystore, but it is good practice to have a separate backup in case the upgrade is unable to restore the keystore.

Depending on your requirements, you could also back up the following:

  • Data Catalog discovery metadata — This material supports Data Catalog discovery operations; if it is not backed up, it can be reproduced as needed. You set the location during the Data Catalog installation; check the ldc.metadata.hdfs.MetadataServiceHdfsPath property in the configuration.json file in the conf directory.
  • Data Catalog UI logs — Web server logging information may include access attempts. Back up the ldc-ui.log files to retain a complete record of login attempts. (See the sketch after this list.)
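
For example, you can confirm the configured discovery-metadata location and capture the UI logs as follows. This is a sketch; the installation directory /opt/ldc and the log directory are assumptions and may differ in your environment.

    # Check where discovery metadata is stored (property from configuration.json).
    $ grep ldc.metadata.hdfs.MetadataServiceHdfsPath /opt/ldc/conf/configuration.json

    # Archive the web server logs for login auditing (log path is an assumption).
    $ tar czf /tmp/wd-backups/ldc-ui-logs-$(date +%Y%m%d).tar.gz /opt/ldc/app-server/logs/ldc-ui.log*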

Options for backing up Solr data and indexes

You have the following options for backing up Solr data and indexes:

  • Use Solr's API for backups.

    See the steps in Detailed instructions for backup and restore.

  • Use third-party backup utilities and storage.

    The location to back up is the dataDir location shown on the Solr admin page. You can also query this value through Solr's CoreAdmin API, as sketched after this list.

  • Use NAS for the data.

    When creating the Data Catalog collection, choose local storage rather than HDFS and point the local storage location to the NAS.
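
If you prefer not to use the admin UI, you can read the same dataDir value from Solr's CoreAdmin API. This is a sketch; the host, port, and core name follow the examples used later in this article.

    # Ask Solr for core status; the response includes the dataDir of the core.
    $ curl 'http://solrnode1:8983/solr/admin/cores?action=STATUS&core=wdcollection_shard1_replica1&wt=json'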

Detailed instructions for backup and restore

These instructions use the Solr APIs for backup and restore. They do not work with CDH versions earlier than 5.9.

Backup

Perform the following steps to back up Data Catalog:

Procedure

  1. Make sure there are no Data Catalog jobs running.

  2. As the Data Catalog service user, shut down the Data Catalog web service.

    $ sudo su waterlinesvc  
    $ /opt/ldc/app-server/bin/app-server stop
  3. Create backup locations with appropriate permissions for the service users, as listed below. (Example commands for creating these locations follow the list.)

    The following locations must be writable by the listed service user:

    • Solr service user: a directory on the computer where a Solr instance is running, for example /tmp/wd-backups.
    • Data Catalog service user: a directory either on the same computer where Solr is running or on the computer where Data Catalog is running, for example /tmp/wd-backups.
    • Data Catalog service user: a directory in HDFS, for example /backups/ldc/backup_discovery_metadata/20170901.
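
    A minimal sketch of creating these locations, assuming the service users solr and waterlinesvc and the example paths above:

    # Local backup directory on the Solr node, writable by the Solr service user.
    $ sudo mkdir -p /tmp/wd-backups
    $ sudo chown solr:solr /tmp/wd-backups

    # HDFS backup directory, writable by the Data Catalog service user.
    $ sudo -u hdfs hadoop fs -mkdir -p /backups/ldc/backup_discovery_metadata/20170901
    $ sudo -u hdfs hadoop fs -chown waterlinesvc /backups/ldc/backup_discovery_metadata/20170901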
  4. Back up the Solr collection.

    1. For CDH, EMR, and HDP (local or HDFS storage), run this command for one of the replicas:

      $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=backup&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'

      The above command backs up shard1. If you have more than one shard, run the same command for each shard; a loop that covers multiple shards is sketched below.

      For instance, for a two-shard collection, also run the following command in addition to the one above. Enter the command as a single line:

      $ curl 'http://solrnode1:8983/solr/wdcollection_shard2_replica1/replication?command=backup&name=wdcollection_shard2_replica1_backup&location=/tmp/wd-backups'

      For example, if your backup location is /tmp/wd-backups, the backup is created under /tmp/wd-backups in the local file system.

      If SSL is enabled, use the following command with a reference to the SSL certificate:

      $ curl --cacert /certs/cert1.pem 'https://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=backup&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'

      If Kerberos is enabled, use the following command to activate authentication (the user specified here is not actually used):

      $ curl --negotiate -u solradmin 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=backup&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'

      If Kerberos and SSL are enabled, use the following command:

      $ curl --negotiate --cacert /certs/cert1.pem -u solradmin 'https://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=backup&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
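
      If the collection has several shards, a small loop can issue the backup call for each of them. This is a sketch, assuming a two-shard collection with replica1 of each shard hosted on solrnode1 and no SSL or Kerberos; adjust the URL as in the variants above.

      # Back up each shard by calling the replication handler on one replica per shard.
      $ for SHARD in shard1 shard2; do \
          curl "http://solrnode1:8983/solr/wdcollection_${SHARD}_replica1/replication?command=backup&name=wdcollection_${SHARD}_replica1_backup&location=/tmp/wd-backups"; \
        done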
    2. Back up the ZooKeeper configuration (if you are using SolrCloud).

      Your values may be different for the ZooKeeper port number and the Data Catalog configuration name. Use the same path for the backup as in the previous step (/tmp/wd-backups), and name the configuration directory so you can identify it later, such as backup_zk_config.

      CDH:

      $ solrctl --zk localhost:2181/solr instancedir --get wdconfig /tmp/wd-backups/backup_zk_config

      HDP:

      $ /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost.localdomain:2181 -cmd downconfig -confname wdconfig -confdir /tmp/wd-backups/backup_zk_config

      EMR:

      $ /opt/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost zkhost:zkport -cmd downconfig -confname wdconfig -confdir /tmp/wd-backups/backup_zk_config
  5. Back up Postgres.

    1. Log in to the node where Postgres is installed.

      $ ssh <ssh-user>@<postgres-host> 
    2. Navigate to the Postgres backup directory, typically /data/postgres_backups.

    3. Back up the database using pg_dump as follows:

      <pg-bkup>$ pgsql/bin/pg_dump -U ldc -f 20190929_postgres.bak -d waterlinedb
    4. Gzip the backup file:

      <pg-bkup>$ gzip 20190929_postgres.bak
    5. Upload the Postgres backup file to an archive store, for example as sketched below.
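
      For example, you might copy the compressed dump to an archive host over SSH. This is a sketch; the archive user, host, and path are hypothetical placeholders.

      # Copy the compressed dump to an archive location (destination is a placeholder).
      <pg-bkup>$ scp 20190929_postgres.bak.gz <archive-user>@<archive-host>:/archives/ldc/postgres/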

  6. Back up the Data Catalog discovery metadata.

    You can verify the location of the discovery metadata in waterlinedata/conf/configuration.json, in the value of the ldc.metadata.hdfs.large_properties.path property. Use a backup path such as /tmp/ldc-backups, and name the discovery metadata directory so you can identify it later, such as backup_discovery_metadata.
    $ hdfs dfs -get /user/waterlinesvc/.ldc_hdfs_metadata/* /tmp/ldc-backups/backup_discovery_metadata
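
    To confirm the copy, list the backup directory:

    $ ls -l /tmp/ldc-backups/backup_discovery_metadata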
  7. Back up the keystore.

    The Data Catalog keystore stores, in encrypted form, the passwords used for communication between the various components. The keystore is generated at installation time and added to the Application Server keystore. The Metadata Server and Agent retrieve the secret key from the Application Server and save it to their respective keystores.

    Each installation creates a unique secret key that is essential for the Data Catalog functions of that installation, so back up this keystore before any product upgrade.

    Note: Keystore backup only applies to version 2019.3 and later.

    Back up the keystore files located at the following paths (an example follows the list):

    • App-server: <LDC Install location>/app-server/jetty-distribution-9.4.18.v20190429/ldc-base/etc/keystore
    • Metadata server: <LDC Install location>/metadata-server/conf/keystore
    • Agent server: <LDC Install location>/agent/conf/keystore
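
    A minimal sketch of copying these files, assuming the /opt/ldc installation directory used elsewhere in this article and a hypothetical backup directory /tmp/wd-backups/keystore-backup:

    $ mkdir -p /tmp/wd-backups/keystore-backup
    $ cp /opt/ldc/app-server/jetty-distribution-9.4.18.v20190429/ldc-base/etc/keystore /tmp/wd-backups/keystore-backup/app-server.keystore
    $ cp /opt/ldc/metadata-server/conf/keystore /tmp/wd-backups/keystore-backup/metadata-server.keystore
    $ cp /opt/ldc/agent/conf/keystore /tmp/wd-backups/keystore-backup/agent.keystore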
  8. Restart the Data Catalog web service.

    $ /opt/ldc/app-server/bin/app-server start

Restore the backup

Perform the following steps to restore a backup:

Procedure

  1. Make sure there are no Data Catalog jobs running.

  2. As the Data Catalog service user, shut down the Data Catalog web service.

    $ sudo su waterlinesvc
    $ /opt/ldc/app-server/bin/app-server stop
    $ /opt/ldc/agent/bin/agent stop
    $ /opt/ldc/metadata-server/bin/metadata-server stop
  3. Restore the Solr Collection:

    The steps required to restore a backup depend on your Solr version, which typically corresponds to the Hadoop distribution you are running. Perform the following steps for CDH and Solr 4.10:
    1. Drop the old Data Catalog Solr collection

      Your values may be different for the ZooKeeper ensemble string (including the port number), the Data Catalog collection name, the Data Catalog configuration name, and the location of the Solr collection on HDFS.
      $ solrctl --zk localhost:2181/solr collection --delete wdcollection
      $ sudo -u hdfs hadoop fs -rm -r /solr/wdcollection
    2. Restore the Data Catalog collection configuration in ZooKeeper

      The backup saves the required schema as managed-schema; to restore the configuration, rename the managed-schema file to schema.xml. Your values may be different for the ZooKeeper port number, the Data Catalog collection name, the backup file name, and the location of the backup.
      $ cd /tmp/wd-backups/backup_zk_config/conf
      $ mv schema.xml.bak schema.xml.bak2
      $ mv managed-schema schema.xml
      $ solrctl --zk localhost:2181/solr instancedir --update wdconfig /tmp/wd-backups/backup_zk_config
    3. Recreate the Data Catalog Solr collection with the same shard count and replication factor as the collection that was backed up.

      Your values may be different for the ZooKeeper ensemble string (including the port number), the Data Catalog collection name, the Data Catalog configuration name, and the location of the Solr collection on HDFS.
      $ solrctl --zk localhost:2181/solr collection --create wdcollection -c wdconfig  -s 1 -r 2 -m 2
    4. As the Data Catalog service user, rebuild the schema of the collection

      $ sudo su waterlinesvc
      $ /opt/ldc/bin/ldc schemaAdmin -create true
    5. Restart Solr through Cloudera Manager.

    6. Restore the Solr collection index files.

      The method for restoring the collection depends on whether the collection data is stored on HDFS or on the local file system.
      Note: When running in a Kerberized environment, make sure to run these commands as the Solr service user.
  4. Run the following commands for Solr data on local or HDFS storage:

    $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'

    If you have multiple shards, run the command for each shard and replica. For instance, for a two-shard, two-replica collection, the commands are as follows:

    $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
    $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica2/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
    $ curl 'http://solrnode1:8983/solr/wdcollection_shard2_replica1/replication?command=restore&name=wdcollection_shard2_replica1_backup&location=/tmp/wd-backups'
    $ curl 'http://solrnode1:8983/solr/wdcollection_shard2_replica2/replication?command=restore&name=wdcollection_shard2_replica1_backup&location=/tmp/wd-backups'
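
    To confirm that a restore has finished, you can query the replication handler on the same core. This is a hedged check; use command=details, or command=restorestatus on Solr versions that support it, with the same host, port, and core names as above:

    $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=details'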
    1. Restore the Data Catalog discovery metadata

      If the target directory does not already exist, you need permission to create it. You can verify the location of the discovery metadata in ldc/conf/configuration.json, in the value of the ldc.metadata.hdfs.large_properties.path property.
      $ sudo -u hdfs hadoop fs -mkdir /user/waterlinesvc/.ldc_hdfs_metadata
      $ sudo -u hdfs hadoop fs -put /PATH/TO/BACKUP/LOCATION/backup_discovery_metadata/* /user/waterlinesvc/.ldc_hdfs_metadata
    2. Restart Solr through Cloudera Manager.

  5. Run the following commands for HDP and EMR (Lucidworks Solr and Apache Solr 5.5.4):

    1. Drop the old Solr collection: as a Solr user, delete the collection.

      Typically, the Solr location is /opt/lucidworks-hdpsearch/solr. Use the correct Solr port number and Data Catalog collection name.
      $ sudo su solr
      $ cd /opt/lucidworks-hdpsearch/solr
      $ bin/solr delete -c wdcollection -p 8983
      $ sudo -u hdfs hadoop dfs -rm -r /user/solr/wdcollection
      $ exit
    2. Restore the Data Catalog collection configuration in ZooKeeper

      $ /opt/lucidworks-hdpsearch/solr/server/scripts/cloud-scripts/zkcli.sh -zkhost zkhost:zkport -cmd upconfig -confname wdconfig -confdir /tmp/wd-backups/backup_zk_config
    3. Recreate the old Solr collection.

      • As the Solr user, recreate the collection using the command you used for the original install. For example:
        $ bin/solr create -c wdcollection -p 8983 -d /opt/lucidworks-hdpsearch/solr/server/solr/configsets/wdconfig  -n wdconfig -s 1 -rf 2
        $ exit
      • As the Data Catalog service user, rebuild the schema of the collection:
      $ sudo su waterlinesvc
      $ /opt/ldc/agent/bin/agent schemaAdmin -create true
    4. Restore the Data Catalog Solr collection from the backup.

      The method for restoring the collection depends on how many Solr instances were running when the collection was created. Run a command for each replica that needs to be updated.
      • Two (or more) Solr instances were running when the collection was originally created

        From the computer where each Solr instance is installed, run the following command to restore the Solr data and indexes from the backup data. Your values may be different for the Solr port number, the Data Catalog collection name, the backup file name, and the location of the backup. Each command should be entered as a single line:

        $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
        $ curl 'http://solrnode2:8984/solr/wdcollection_shard1_replica2/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'

        Repeat the command for each Solr instance, replacing the Solr host and port with those of that instance (for example, port 8984 for the second instance, as shown in the second command above).

      • One Solr instance was running when the collection was originally created

        From the computer where the Solr instance is installed, run the following commands (one for each replica) to restore the Solr data and indexes from the backup data. Your values may be different for the Solr port number, the Data Catalog collection name, the backup file name, and the location of the backup. These commands should be entered as single lines:

        $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
        $ curl 'http://solrnode2:8983/solr/wdcollection_shard1_replica2/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'
      • In an EMR environment, all replicas on all nodes need to be restored.
      • If you have multiple shards, run the command for each shard and replica. For instance, for a two-shard, two-replica collection, the commands are as follows:
        $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica1/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'  
        $ curl 'http://solrnode1:8983/solr/wdcollection_shard1_replica2/replication?command=restore&name=wdcollection_shard1_replica1_backup&location=/tmp/wd-backups'  
        $ curl 'http://solrnode1:8983/solr/wdcollection_shard2_replica1/replication?command=restore&name=wdcollection_shard2_replica1_backup&location=/tmp/wd-backups'  
        $ curl 'http://solrnode1:8983/solr/wdcollection_shard2_replica2/replication?command=restore&name=wdcollection_shard2_replica1_backup&location=/tmp/wd-backups'
    5. Restore the Data Catalog discovery metadata

      You can verify the location of the discovery metadata in ldc/conf/configuration.json, the value of the ldc.metadata.hdfs.large_properties.path property.
      $ sudo -u hdfs hadoop fs -mkdir /user/waterlinesvc/.ldc_hdfs_metadata  
      $ sudo -u waterlinesvc hadoop fs -put /PATH/TO/HDFS/BACKUP/LOCATION/backup_discovery_metadata/* /user/waterlinesvc/.ldc_hdfs_metadata
  6. Restore the keystore by copying the backed-up keystore to the following locations:

    <App-Server Dir>$ cp <back-up location>/keystore jetty-distribution-9.4.18.v20190429/ldc-base/etc/keystore
    <Meta_Server Dir>$ cp <back-up location>/keystore conf/keystore
    <Agent Dir>$ cp <back-up location>/keystore conf/keystore
    Note: Keystore restoration only applies to versions 2019.3 and later.
  7. Restore Postgres

    1. Log in to the node where Postgres is installed.
      $ ssh <ssh-user>@<postgres-host> 
    2. Navigate to the Postgres backup directory, typically /data/postgres_backups.
    3. Restore the database. Make sure the .bak file has been decompressed (a decompression example follows this step). Because the backup in the previous procedure was created in pg_dump's default plain-text format, restore it with psql rather than pg_restore:
      <pg-bkup>$ pgsql/bin/psql -U ldc -d waterlinedb -f 20190929_postgres.bak
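
      If the dump was compressed with gzip during the backup procedure, decompress it before running the restore, for example:

      <pg-bkup>$ gunzip 20190929_postgres.bak.gz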
  8. Restart the Data Catalog services.

    $ /opt/ldc/app-server/bin/app-server start
    $ /opt/ldc/agent/bin/agent start
    $ /opt/ldc/metadata-server/bin/metadata-server start
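
    As a quick sanity check after the restart, verify that the three services are running. This is a sketch; the log location is an assumption and may differ in your installation.

    # Confirm the service processes are up.
    $ ps -ef | grep -E 'app-server|metadata-server|agent' | grep -v grep

    # Review recent web server activity (log path is an assumption).
    $ tail -n 50 /opt/ldc/app-server/logs/ldc-ui.log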