Hitachi Vantara Lumada and Pentaho Documentation

Deployment patterns

To install Lumada Data Catalog v7.1, select the applicable deployment pattern:

  • Kubernetes only
  • Kubernetes with remote agent(s)

Kubernetes only

This minimal deployment pattern sets up all required Data Catalog components on an existing Kubernetes cluster.

This setup is sufficient if:

  • All your data sources are close to your Kubernetes cluster, that is, there is low network latency between the data sources and the cluster that this installation runs on.
  • None of your data sources are Hadoop-based, such as HDFS or Hive.

Kubernetes with remote agent(s)

In addition to all required Data Catalog components, this deployment pattern includes a remote agent, which is set up outside the Kubernetes cluster.

An agent runs jobs against your data sources, which requires high bandwidth and low latency between the agent and the data sources. Because data sources cannot always be located near the Kubernetes cluster where the local agent resides, you can set up a remote agent closer to your data sources. The remote agent can then run jobs efficiently and send the resulting metadata back to the Kubernetes cluster after each job run.

This setup is recommended when:

  • Your organization’s data sources are located far from your Kubernetes cluster.
  • Your organization’s data sources include Hadoop-based data sources such as HDFS or Hive.
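
The selection criteria for the two patterns can be summarized as a small decision helper. The following Python sketch is illustrative only and is not part of Data Catalog: the DataSource class, the choose_deployment_pattern function, the near_cluster flag, and the "jdbc" source type are hypothetical names introduced here. Whether a data source counts as near the cluster is left to your own assessment of latency and bandwidth, because this page does not define a threshold.

```python
from dataclasses import dataclass
from typing import Iterable

# Labels for the two deployment patterns described on this page.
KUBERNETES_ONLY = "Kubernetes only"
KUBERNETES_WITH_REMOTE_AGENTS = "Kubernetes with remote agent(s)"

# Source types that this page names as Hadoop-based.
HADOOP_BASED_TYPES = {"hdfs", "hive"}


@dataclass(frozen=True)
class DataSource:
    """Hypothetical description of one data source (not a Data Catalog type)."""
    name: str
    source_type: str    # for example "hdfs", "hive", or "jdbc" (illustrative values)
    near_cluster: bool  # your own judgment of latency/bandwidth to the cluster


def choose_deployment_pattern(sources: Iterable[DataSource]) -> str:
    """Return the pattern suggested by the criteria listed above."""
    for source in sources:
        # Any Hadoop-based source calls for a remote agent.
        if source.source_type.lower() in HADOOP_BASED_TYPES:
            return KUBERNETES_WITH_REMOTE_AGENTS
        # Any source located far from the cluster calls for a remote agent.
        if not source.near_cluster:
            return KUBERNETES_WITH_REMOTE_AGENTS
    # Every source is near the cluster and none is Hadoop-based.
    return KUBERNETES_ONLY
```

For example, a single nearby "jdbc" source yields the Kubernetes only pattern, while adding one HDFS source, or one source that is far from the cluster, yields the Kubernetes with remote agent(s) pattern.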