This article covers the installation of Lumada Data Catalog on a Kubernetes cluster, using Helm and Hitachi Vantara-owned Docker images.
Before installing, you must have:
- Reviewed the system requirements for Data Catalog.
- Knowledge of your organization’s networking environment.
- Root permissions on your designated server.
- The ability to connect to your organization’s data sources.
Based on the deployment pattern your organization will use, you can also set up standalone remote agents. A remote agent serves the same purpose as a local agent, and is set up on an edge node of a separate Hadoop/Spark cluster.
By the end of the installation process, the following Data Catalog components will be set up on your designated Kubernetes cluster:
| Component | Description |
|---|---|
| LDC Application Server | Runs the Data Catalog application |
| Agent (local agent) | Runs jobs executed by the application server |
| MongoDB | Database backing the application |
| Keycloak | Open-source identity provider (IdP) |
| MinIO | Object storage (used for debugging purposes only) |
| Spark History Server | Stores Spark job logs in a MinIO or S3 bucket (by default, the MinIO component included in this setup) |
| REST Server | Provides API calls that customers can use to interact with Data Catalog, along with documentation for the supported calls |
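A Helm-based installation of the components above typically follows the general shape below. The repository URL, chart name (`ldc`), release name, namespace, and values file are placeholders, not the exact identifiers from the Data Catalog distribution; substitute the values supplied with your installation package.

```shell
# Add the chart repository (URL is a placeholder; use the one supplied by Hitachi Vantara)
helm repo add ldc-repo https://example.com/charts
helm repo update

# Create a dedicated namespace and install the chart with your site-specific values
kubectl create namespace ldc
helm install ldc ldc-repo/ldc \
  --namespace ldc \
  --values my-values.yaml

# Verify that the Data Catalog pods come up
kubectl get pods --namespace ldc
```

Keeping site-specific settings in a separate values file (`my-values.yaml` here) makes later upgrades repeatable, since the same file can be passed to `helm upgrade`.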
Data Catalog upgrade paths
You can upgrade Data Catalog from previous releases as shown in the table below.
| From release | To release | Procedure |
|---|---|---|
| 7.2 | 7.3 | Upgrade with Helm |
| 7.1.1 | 7.3 | Upgrade with Helm |
| 6.1 | 7.3 | Contact the Hitachi Vantara Lumada and Pentaho Support Portal |
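For the Helm-based upgrade paths (7.2 to 7.3 and 7.1.1 to 7.3), the upgrade is typically a `helm upgrade` against the existing release. The release name, namespace, and chart reference below are placeholders matching the installation sketch conventions, not identifiers from the product documentation.

```shell
# Refresh the chart repository to pick up the 7.3 chart
helm repo update

# Upgrade the existing release in place, reusing your site-specific values
helm upgrade ldc ldc-repo/ldc \
  --namespace ldc \
  --values my-values.yaml

# If the upgrade misbehaves, roll back to the previous release revision
helm rollback ldc --namespace ldc
```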