Data Catalog
Welcome to Lumada Data Catalog 7.3. Data Catalog automates discovery, classification, and management of your enterprise data. Data Catalog services big data, data warehouses, cloud services, and databases across your enterprise. Its patented fingerprinting AI and machine learning capabilities surface data that matters, laying the foundation for efficient and successful analytics, data governance, and compliance.
- What's new in Lumada Data Catalog
- Lumada Data Catalog 7.3 provides the following feature updates:
- Product overview
- The Lumada Data Catalog software builds a metadata catalog from data assets residing in tools such as HDFS, Hive, MySQL, Oracle, Redshift, S3, and Teradata. It profiles the data assets to produce field-level data quality statistics and to identify representative data so users can efficiently analyze the content and quality of the data.
- Install
- This article covers installing Lumada Data Catalog on a Kubernetes cluster, using Helm and Docker images owned by Hitachi Vantara.
- Get started
- Now that Lumada Data Catalog has been installed, you are ready to start planning, building, and using your data catalog.
- Use Data Catalog
- Use these articles to understand how to perform essential tasks in Lumada Data Catalog, such as searching the catalog, viewing lineage information, and tagging resources. This section is intended for non-administrative users of the catalog, including data analysts and data stewards.
- Manage
- Administrators can use these articles to learn the tasks involved in managing Lumada Data Catalog, from setting up users and roles to monitoring jobs. These articles are intended for site administrators who handle post-installation and maintenance tasks, such as editing configuration properties for scripts or creating virtual folders. Some of these tasks, such as managing jobs, may be performed by the owner of a data node or resource who knows the data well. In general, these tasks require an administrator (or, in some cases, a data steward) who knows where the data is stored, how to connect to it, the details of the computing environment, and how to issue Linux commands from the command line.
- Data Catalog reserved names
- Manage agents
- Manage collections
- Manage custom properties
- Manage data sources
- Manage users
- Manage virtual folders
- Manage workflows
- Managing associations
- Managing business glossaries
- Managing configurations
- Managing job templates
- Managing roles
- Managing rules
- MongoDB onboarding and profiling example video
- Monitoring job activity
- Role-based access control (RBAC)
- Develop and deploy
- Support your system infrastructure and integrate with other systems. These sections are best used by catalog administrators, developers, and data scientists who are familiar with programming concepts and have extensive metadata experience.
- Advanced configuration
- Use Advanced configuration to customize your Lumada Data Catalog environment as necessary, such as to set up an external MongoDB or Keycloak, or configure SSL.
- Utilities
- Use the Utility jobs to support your system infrastructure and effectively maintain Data Catalog in your environment.
- Integrations
- Use Apache Atlas integration to push business terms to Apache Atlas, and pull lineage information from Atlas.
- Backup and Restore
- Use Back up and restore to back up and restore Data Catalog components.
- REST API documentation
- Lumada Data Catalog provides REST APIs to access metadata held in the catalog.