Before you install Lumada Data Catalog, you must prepare the external components it depends on. This preparation includes setting up the Data Catalog service user and collecting essential information about your cluster configuration. These prerequisites apply to most Hadoop distributions and include steps to help you prepare your Hadoop environment for the Data Catalog installation.
Basic pre-install preparation includes the following tasks:
- Review system requirements
  - Review the external components and applications that Data Catalog supports. See System requirements.
  - Verify the proper functioning of the various Hadoop components that Data Catalog interacts with. See Component validations.
- Configure the service user
  - Set up a dedicated service user to own the installation directory and to run Data Catalog jobs and services. See Configure the Data Catalog service user.
- Determine the correct Solr setup for your environment. Data Catalog supports various versions of Solr for different Hadoop distributions. See Apache Solr configuration.
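As a rough illustration of the service user task, the following Python sketch checks whether a given account exists on the local node and owns a given directory. The user name `ldcuser` and the path `/opt/ldc` are hypothetical placeholders, not values defined by Data Catalog; substitute the names chosen for your environment. The sketch covers only local ownership and does not replace the steps in Configure the Data Catalog service user.

```python
import os
import pwd  # POSIX-only: looks up accounts in the local password database


def owns_directory(username: str, directory: str) -> bool:
    """Return True if `username` exists on this node and owns `directory`."""
    try:
        user = pwd.getpwnam(username)
    except KeyError:
        return False  # the account does not exist on this node
    try:
        return os.stat(directory).st_uid == user.pw_uid
    except FileNotFoundError:
        return False  # the directory has not been created yet


if __name__ == "__main__":
    # Hypothetical values for illustration only (assumptions, not product defaults).
    service_user = "ldcuser"
    install_dir = "/opt/ldc"
    if owns_directory(service_user, install_dir):
        print(f"{service_user} owns {install_dir}")
    else:
        print(f"{service_user} is missing or does not own {install_dir}")
```

Running a check like this on each node where the software will be installed can surface a missing account or a wrongly owned directory before the installer does.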
You can find distribution-specific external component configurations and instructions in the following articles:
- Installing Lumada Data Catalog on Amazon EMR
- Installing Lumada Data Catalog on MapR
- Installing Lumada Data Catalog on Azure HDInsight
You must also make the following changes outside of the Lumada Data Catalog environment or on other cluster nodes:
- Validate component functioning
- Configure components for Data Catalog
If you do not control or have access to these components, plan ahead for these changes or contact your system administrator.
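A first-pass check before the full component validation can be sketched in Python as follows. It only reports which command-line clients are missing from the `PATH` of the current node. The command names in the example (`hdfs`, `hive`, `spark-submit`) are an illustrative assumption, not the list of components that Data Catalog requires; see System requirements for the supported components. Finding a client on the `PATH` does not show that the service behind it works, so this does not replace the steps in Component validations.

```python
import shutil


def find_missing_commands(commands):
    """Return the entries of `commands` that cannot be found on PATH."""
    return [cmd for cmd in commands if shutil.which(cmd) is None]


if __name__ == "__main__":
    # Illustrative list only (an assumption); adjust it to your distribution.
    candidate_commands = ["hdfs", "hive", "spark-submit"]
    missing = find_missing_commands(candidate_commands)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed commands were found on PATH.")
```

If a client is reported missing on a node you do not administer, that is a good point to involve your system administrator.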