Remote Agent
For remote agents, use the following requirements lists and installation instructions.
Requirements
View the requirements for the remote agent installation, distributions, and Kerberos environments.
Category | Description |
Hardware |
|
Miscellaneous |
|
Remote agent setup supports several distributions, each with its own requirements. Follow the requirements for the distribution that best matches your Data Catalog setup.
Category | Description |
Amazon Elastic MapReduce (EMR) | Note: When prompted by the remote agent script, you must set up the remote agent using the Lumada Data Catalog service user hadoop. |
Cloudera Data Platform (CDP) | CDP version 7.1.3+ |
Hortonworks Data Platform (HDP) | HDP version 3.1.0+ |
Additionally, you can enable Kerberos on your remote agent’s server. Because a Kerberos-enabled environment adds extra security between your remote agent and the Data Catalog cluster, some extra configuration is required.
Category | Description |
Miscellaneous |
|
Remote agent installation
The binary agent file (RUN file) is included in the Data Catalog artifacts found on the Hitachi Vantara Lumada and Pentaho Support Portal.
Place the RUN file into your environment and execute it:
sudo sh ldc-agent-<version>.run
The RUN file guides you through the remote agent setup. The setup steps differ depending on whether your environment is Kerberos-enabled.
Non-Kerberos
In this example, the remote agent ldc_example is created where:
- The install location will be /opt/ldc_example.
- The remote agent will be managed by the service user ldcuser.
- The remote agent will connect to a Data Catalog cluster that is accessible via NodePort on http://ldc_cluster:31080, where ldc_cluster is the host name and 31080 is the HTTP port.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUMADA DATA CATALOG AGENT INSTALLER
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Express Install (Requires superuser access)
2. Custom Install (Runs with non-sudo access)
3. Upgrade
4. Exit
Enter your choice [1-4]: 1
Enter the name of the Lumada Data Catalog service user [ldcuser]: ldcuser
Enter install location [/opt/ldc]: /opt/ldc_example
Enter log location [/var/log/ldc]: /opt/ldc_example/logs
Enter Appserver endpoint [http://localhost:3000]: http://ldc_cluster:31080
Enter the name of the agent: ldc_example
Enter HIVE version [3.1.2]: 3.1.2
Is Kerberos enabled? [y/N]: N
~~~~~~~~~~~~~~~~~~~~~~~
SELECTION SUMMARY
~~~~~~~~~~~~~~~~~~~~~~~
Lumada Data Catalog service user : ldcuser
Install location : /opt/ldc_example/ldc (will be created)
Log location : /opt/ldc_example/logs/ldc (will be created)
Kerberos enabled : false
AppServer endpoint : http://ldc_cluster:31080
Agent ID : ldc_example
Proceed? [Y/n]: Y
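Before running the installer, it can help to confirm that the Appserver endpoint you plan to enter is reachable from the agent host. The following is a minimal sketch under stated assumptions, not part of the Data Catalog tooling: the function names endpoint_host_port and check_endpoint are hypothetical, and check_endpoint assumes that curl is installed on the agent host.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the Data Catalog distribution.

# Split an endpoint such as http://ldc_cluster:31080 into "host port".
endpoint_host_port() {
    ep_rest=${1#*://}          # drop the scheme (http:// or https://)
    ep_rest=${ep_rest%%/*}     # drop any path after host:port
    case $ep_rest in
        *:*) printf '%s %s\n' "${ep_rest%%:*}" "${ep_rest##*:}" ;;
        *)   printf '%s\n' "$ep_rest" ;;   # no explicit port in the URL
    esac
}

# Return 0 if the endpoint sends back any HTTP response within the
# connection timeout; return curl's non-zero exit status otherwise.
check_endpoint() {
    curl --silent --output /dev/null --connect-timeout 5 "$1"
}

# Example (not executed here):
#   check_endpoint http://ldc_cluster:31080 && echo "endpoint reachable"
```

Note that for an HTTPS endpoint whose certificate is not trusted by the agent host, curl exits with a non-zero status even though the port is reachable, so a failure here is a prompt to investigate rather than proof that the cluster is down.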
The script will then create the install and log locations. Remote agent configuration can be found in the install location you specified, under the ldc/agent folder.
- Switch to the Data Catalog service user that was specified in the remote agent setup, and navigate to your remote agent’s configuration:
sudo su - ldcuser
cd /opt/ldc_example/ldc/agent
- Once you are in the ldc/agent directory, start the agent:
bin/agent start
Kerberos
For setup in a Kerberos environment, the script requires two additional values: the path to an existing keytab file on the server and the service user principal.
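Because the installer expects the keytab to exist already, it may be worth checking the file and the principal before you start. The sketch below is an assumption-laden illustration rather than part of the installer: the function name check_keytab is hypothetical, EXAMPLE.COM is a placeholder realm, and it assumes the MIT Kerberos klist command is available on the server.

```shell
#!/bin/sh
# Hypothetical pre-check; not part of the Data Catalog installer.
# Usage: check_keytab /path/to/file.keytab principal@REALM
check_keytab() {
    kt_file=$1
    kt_principal=$2
    if [ ! -r "$kt_file" ]; then
        echo "keytab not readable: $kt_file" >&2
        return 1
    fi
    # klist -k lists the entries stored in a keytab; -t adds timestamps.
    # grep -F does a plain substring match on the principal name.
    if klist -k -t "$kt_file" 2>/dev/null | grep -F -q -- "$kt_principal"; then
        echo "ok"
    else
        echo "principal not found in keytab: $kt_principal" >&2
        return 1
    fi
}

# Example (not executed here; EXAMPLE.COM is a placeholder realm):
#   check_keytab /home/ldcuser/ldcuser.keytab ldcuser@EXAMPLE.COM
```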
In this example, the remote agent ldc_example is created where:
- The install location will be /opt/ldc_example.
- The remote agent will be managed by the service user ldcuser.
- The remote agent will connect to a Data Catalog cluster that is accessible via NodePort on https://ldc_cluster:31083, where ldc_cluster is the host name and 31083 is the HTTPS port.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUMADA DATA CATALOG AGENT INSTALLER
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1. Express Install (Requires superuser access)
2. Custom Install (Runs with non-sudo access)
3. Upgrade
4. Exit
Enter your choice [1-4]: 1
Enter the name of the Lumada Data Catalog service user [ldcuser]: ldcuser
Enter install location [/opt/ldc]: /opt/ldc_example
Enter log location [/var/log/ldc]: /opt/ldc_example/logs
Enter Appserver endpoint [http://localhost:3000]: https://ldc_cluster:31083
Enter the name of the agent: ldc_example
Enter HIVE version [3.1.2]: 3.1.2
Is Kerberos enabled? [y/N]: Y
Full path to Lumada Data Catalog service user keytab: /home/ldcuser/ldcuser.keytab
Lumada Data Catalog service user’s fully qualified principal: ldcuser@<your company>.com
~~~~~~~~~~~~~~~~~~~~~~~
SELECTION SUMMARY
~~~~~~~~~~~~~~~~~~~~~~~
Lumada Data Catalog service user : ldcuser
Install location : /opt/ldc_example/ldc (will be created)
Log location : /opt/ldc_example/logs/ldc (will be created)
Kerberos enabled : true
Kerberos keytab path : /home/ldcuser/ldcuser.keytab
Kerberos principal : ldcuser@<your company>.com
AppServer endpoint : https://ldc_cluster:31083
Agent ID : ldc_example
Proceed? [Y/n]: Y
The script will then create the install and log locations. Remote agent configuration can be found in the install location you specified, under the ldc/agent folder.
- Switch to the Data Catalog service user that was specified in the remote agent setup, and navigate to your remote agent’s configuration:
sudo su - ldcuser
cd /opt/ldc_example/ldc/agent
- During agent setup, a keytab folder was created alongside the rest of your agent-related files. This folder contains the keytab file that you will use to obtain a Kerberos ticket for the service user principal:
cd /opt/ldc_example/ldc/agent
kinit -kt keytab/ldcuser.keytab ldcuser@<your company>.com
- Using the openssl command, fetch certificate fingerprints, passing the hostname (and if applicable, the port or path) of the Data Catalog cluster. In this example, the certificate fingerprints will be retrieved from ldc_cluster, where the application is accessible on port 31083:
openssl s_client -connect ldc_cluster:31083 < /dev/null 2>/dev/null | openssl x509 -fingerprint -sha256 -noout -in /dev/stdin | cut -d'=' -f2 | tr -d : | tr '[:upper:]' '[:lower:]'
- If the command succeeds, it returns the certificate fingerprints as an alphanumeric string. Use the certificate fingerprints to register the remote agent to the Data Catalog cluster:
bin/agent register --agent-token null --endpoint wss://ldc_cluster:31083/wsagent --agent-id ldc_example --cert-fingerprint <certificate fingerprints from openssl command>
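If the openssl pipeline above cannot reach the cluster, it prints nothing, and an empty value would then be passed to --cert-fingerprint. The sketch below shows one way to guard against that, assuming a SHA-256 fingerprint is always 64 hexadecimal characters once the colons are removed. The function names normalize_fingerprint and is_sha256_hex are hypothetical and not part of the agent tooling.

```shell
#!/bin/sh
# Hypothetical helpers; not part of the Data Catalog distribution.

# Read one "<label> Fingerprint=AA:BB:..." line on stdin and print the
# digest as lowercase hex without colons (same steps as the pipeline above).
normalize_fingerprint() {
    cut -d'=' -f2 | tr -d ':' | tr '[:upper:]' '[:lower:]'
}

# Succeed only if the argument is exactly 64 lowercase hex characters,
# which is the length of a SHA-256 digest.
is_sha256_hex() {
    [ "${#1}" -eq 64 ] || return 1
    case $1 in
        *[!0-9a-f]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example (not executed here):
#   fp=$(openssl s_client -connect ldc_cluster:31083 < /dev/null 2>/dev/null \
#        | openssl x509 -fingerprint -sha256 -noout | normalize_fingerprint)
#   is_sha256_hex "$fp" || echo "unexpected fingerprint: '$fp'" >&2
```

Checking the value first means that a connection failure surfaces as a clear message on the agent host instead of as a registration error later.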
Authorize remote agents
Once the remote agent has been started or registered, check the remote agent logs to confirm that the agent is running:
cd /opt/ldc_example/ldc/agent
bin/agent log -f
If set up correctly, the logs will include an agent authorization error that will look something like the following:
[WebSocketClient-SecureIO-1] INFO com.hitachivantara.datacatalog.remoteagent.socket_services.WebsocketSocketService - Disconnected: CloseReason: code [3403], reason [agent not authorized]
To authorize the agent, open your browser and log into the Data Catalog UI, then navigate to Management and click Agents.
The configured remote agent should now appear in the list of available agents. Click the Authorize button for the remote agent.
Go back to your remote agent’s logs, where there should be logs confirming that the remote agent has been authorized. This will include a successful handshake between the agent and Data Catalog cluster, creation of configuration data and then a series of successful “pings” to the agent:
31 May 2022 14:08:43.784 [pool-10-thread-1] INFO com.hitachivantara.datacatalog.remoteagent.messagehandlers.TokenHandler - Processed handshake from server registered agent: ldc_example
31 May 2022 14:08:45.750 [pool-10-thread-2] INFO com.hitachivantara.datacatalog.remoteagent.messagehandlers.MetaHealthHandler - Agent: Hi there from mother ship, ping:1654006125726
[…]
31 May 2022 14:09:42.678 [pool-10-thread-5] INFO com.hitachivantara.datacatalog.remoteagent.messagehandlers.MetaHealthHandler - Agent: ping to ldc_example:1654006182675
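The two log messages shown in this section, the "agent not authorized" close reason and the "Processed handshake from server" line, can also be checked with a small filter instead of by eye. This is only a sketch: the function name agent_auth_state is hypothetical, the patterns are taken from the sample log lines above and may differ between Data Catalog versions, and agent.log stands for any saved copy of the agent log output.

```shell
#!/bin/sh
# Hypothetical helper; reads agent log text on stdin and prints one of:
#   authorized | not-authorized | unknown
# The most recent matching line decides the result.
agent_auth_state() {
    awk '
        /agent not authorized/            { state = "not-authorized" }
        /Processed handshake from server/ { state = "authorized" }
        END { print (state == "" ? "unknown" : state) }
    '
}

# Example (not executed here; agent.log is a hypothetical saved log file):
#   agent_auth_state < agent.log
```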