Manage worker processes
Pentaho Data Catalog (PDC) uses worker processes to implement virtually all of its data analytics functions. Most worker processes consist of a single primary worker process that Data Catalog launches in response to a user action or a scheduled action. Some processes might also initiate secondary worker processes.
Worker processes
The following table lists the worker processes:
Process | Description | Actions performed |
Test Connection | Returns detailed success or failure information for each step of the test. Data Catalog starts this worker process when you configure or update a data source connection. Data Catalog marks the data source “OFFLINE” until a successful test completes. | |
Metadata Ingest | Ingests the metadata for one or more schemas. | |
Data Profiling | Generates a variety of statistics and intermediate data in a single pass through the source data. Typically, this is the first process you run on your data. | |
Data Identification | Identifies and tags columns and tables using ontology information (dictionaries, aliases), along with underlying data and metadata. | |
Key Discovery | Performs a variety of key discovery actions. Foreign key discovery requires that Data Profiling of the data sources has completed. | |
Data Quality | Performs a full data quality (DQ) analysis on the underlying data, using regular expressions and other configurable business rules. | |
Sensitive Data Discovery (SDD) | Performs the SDD tasks that go beyond data identification. This process uses flows, lineage, foreign keys, and more to assemble the items that make up PI and PII. | |
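The table states two ordering rules: a data source stays marked OFFLINE until a connection test succeeds, and foreign key discovery requires that Data Profiling has completed. The following Python sketch models only those two rules as plain bookkeeping. It is an illustration, not a Data Catalog API: the class `DataSourceState`, its methods, and the data source name `sales_db` are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataSourceState:
    """Hypothetical bookkeeping for one data source; not a Data Catalog API."""

    name: str
    # The data source is marked OFFLINE until a connection test succeeds.
    offline: bool = True
    # Names of worker processes that have completed for this data source.
    completed: set = field(default_factory=set)

    def record_test_connection(self, succeeded: bool) -> None:
        # Only a successful test clears the OFFLINE mark.
        if succeeded:
            self.offline = False
            self.completed.add("Test Connection")

    def record_completed(self, process: str) -> None:
        # Remember that the named worker process finished.
        self.completed.add(process)

    def can_run_foreign_key_discovery(self) -> bool:
        # Foreign key discovery requires that Data Profiling has completed.
        return "Data Profiling" in self.completed


source = DataSourceState("sales_db")  # hypothetical data source name
source.record_test_connection(succeeded=False)
print(source.offline)                          # True
source.record_test_connection(succeeded=True)
print(source.offline)                          # False
print(source.can_run_foreign_key_discovery())  # False
source.record_completed("Data Profiling")
print(source.can_run_foreign_key_discovery())  # True
```

The sketch deliberately leaves out every rule the table does not state, such as any ordering among the other worker processes.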
Monitor worker status
From the Manage Your Environment page, you can see the number of completed worker processes and the number of worker alerts on the Workers card.
Use the following steps to monitor the status of a worker process:
Procedure
From the Manage Your Environment page, click View Workers to see the completed and in-progress worker processes.
The Status column shows the status of each worker process.
Click the up arrow at the beginning of the worker process row to expand the information.
View worker process details
Use the following steps to view details of a worker process:
Procedure
On the Workers page, locate the worker process you want more information about.
If an up arrow is visible at the beginning of the row for the worker process, click the arrow to expand the information.
Click the View Details icon (>) at the end of the row.
The View Worker Details window opens. If the process failed, an Exception tab might be available in addition to the Details tab.
Click Close to close the View Worker Details window.
Cancel a worker process
Use the following steps to cancel a worker process:
Procedure
While a worker process is running, go to the Workers page and locate the worker process you want to cancel.
Click Cancel at the end of the row.
Data Catalog cancels the worker process and displays Cancelling in the Job Status column.