Install
This article covers installing Lumada Data Catalog on a Kubernetes cluster using Helm and Hitachi Vantara-owned Docker images.
Before installing, you must have:
- Reviewed the system requirements for Data Catalog.
- Knowledge of your organization’s networking environment.
- Root permissions on your designated server.
- The ability to connect to your organization’s data sources.
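As a sketch, some of these prerequisites can be verified from the designated server before you begin. This assumes `kubectl` and `helm` are already on the PATH; check the Data Catalog system requirements for the exact supported versions.

```shell
#!/bin/sh
# Illustrative pre-flight checks only; adjust to match the official
# Data Catalog system requirements for your release.

# Confirm root permissions on the designated server.
if [ "$(id -u)" -ne 0 ]; then
  echo "ERROR: run this script as root" >&2
  exit 1
fi

# Confirm the Kubernetes cluster is reachable from this server.
kubectl cluster-info >/dev/null 2>&1 || {
  echo "ERROR: cannot reach the Kubernetes cluster" >&2
  exit 1
}

# Confirm Helm is installed.
helm version >/dev/null 2>&1 || {
  echo "ERROR: helm not found on PATH" >&2
  exit 1
}

echo "Pre-flight checks passed."
```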
Depending on the deployment pattern your organization uses, you can also set up standalone remote agents. A remote agent serves the same purpose as a local agent but is installed on an edge node of a separate Hadoop/Spark cluster.
After you complete the installation steps, the following Data Catalog components are set up on your designated Kubernetes cluster:
| Component | Description |
|---|---|
| LDC Application Server | Runs the Data Catalog application |
| Agent (local agent) | Runs jobs executed by the application server |
| MongoDB | Database backing the application |
| Keycloak | Open-source identity provider (IdP) |
| MinIO | Object storage (used for debugging purposes only) |
| Spark History Server | Stores Spark job logs in a MinIO or S3 bucket (by default, the MinIO component included in this setup) |
| REST Server | Provides API calls that customers can use to interact with Data Catalog, along with documentation for these supported calls |
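As a minimal sketch of how these components are deployed together, a single Helm release typically installs them into one namespace. The repository URL, chart name, release name, and values file below are placeholders; use the chart location and values provided by Hitachi Vantara.

```shell
# Hypothetical example: repository URL, chart name, and values file
# are placeholders, not the official Hitachi Vantara artifacts.
helm repo add ldc https://example.com/ldc-charts   # placeholder URL
helm repo update

# Install all Data Catalog components into a dedicated namespace.
helm install ldc ldc/lumada-data-catalog \
  --namespace ldc --create-namespace \
  --values my-values.yaml   # site-specific overrides (registry credentials, ingress, etc.)

# Verify that the components listed above come up.
kubectl get pods --namespace ldc
```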
Data Catalog upgrade paths
You can upgrade Data Catalog from previous releases as shown in the table below.
| From release | To release | Procedure |
|---|---|---|
| 7.3 | 7.5 | Upgrade with Helm |
| 7.2 | 7.5 | Upgrade with Helm |
| 7.1.1 | 7.5 | Contact the Hitachi Vantara Lumada and Pentaho Support Portal |
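For the Helm-based upgrade paths above, the upgrade is generally a `helm upgrade` of the existing release. The release name, chart name, and values file below are placeholders; follow the release-specific upgrade procedure for exact steps.

```shell
# Hypothetical example: release and chart names are placeholders.
# Fetch the updated chart definitions first.
helm repo update

# Upgrade an existing 7.x release in place, reapplying prior overrides.
helm upgrade ldc ldc/lumada-data-catalog \
  --namespace ldc \
  --values my-values.yaml

# Confirm the state of the upgraded release.
helm status ldc --namespace ldc
```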