Lumada Data Catalog uses Keycloak as the Identity Management Provider (IDP) for user authentication.
When a user logs in, Keycloak passes Data Catalog the list of roles to which the user is entitled. Authorization for Data Catalog features and resources is provided by role-based access control (RBAC), which secures data while keeping workflows efficient. With RBAC, you can manage who has access to resources, what actions they can perform, and which areas they can access, providing fine-grained control over access management. In addition to defining roles and permissions on a granular level, you can modify the default (predefined) roles, create your own custom roles, and assign permissions as needed. For instance, you can create a custom role that can search metadata and create Hive tables from HDFS files, and then set this role as a default for all new users.
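As a rough illustration of the login flow described above, the following Python sketch reads role names from a decoded token payload and resolves them to permissions. Keycloak access tokens normally list realm roles under the `realm_access.roles` claim; everything else here is an assumption made for the example, including the `ROLE_PERMISSIONS` mapping, the role names, and the lowercase permission names. It is not how Data Catalog itself is implemented.

```python
import base64
import json

# Hypothetical role-to-permission mapping; role and permission names are illustrative only.
ROLE_PERMISSIONS = {
    "metadata_hive_creator": {"search_metadata", "create_hive_table_from_hdfs"},
    "guest_browser": {"View Business Terms", "View Business Glossaries"},
}


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature.

    For illustration only; a real service must verify the token first.
    """
    payload_b64 = token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def roles_from_payload(payload: dict) -> list[str]:
    """Return the realm roles from a Keycloak-style payload, or an empty list."""
    return payload.get("realm_access", {}).get("roles", [])


def permissions_for(roles: list[str]) -> set[str]:
    """Union the permissions of every known role; unknown roles grant nothing."""
    granted: set[str] = set()
    for role in roles:
        granted |= ROLE_PERMISSIONS.get(role, set())
    return granted


def make_demo_token(payload: dict) -> str:
    """Build an unsigned demo token so the sketch runs without a Keycloak server."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    return f"{b64({'alg': 'none'})}.{b64(payload)}."
```

For example, a demo token carrying the roles `metadata_hive_creator` and an unrecognized role resolves to just the two permissions of the first role, because unknown roles contribute nothing.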
You can access the RBAC feature by clicking Management in the left navigation menu and then clicking User Roles.
User groups allow you to manage a large number of users by organizing them efficiently so you can assign them easily to features and roles. Data Catalog organizes user roles and permissions into three groups: Administrator, Steward, and Business. You can generally organize your existing users into these groups based on their job type, such as system administrator, data steward, or business analyst, and then further define their permissions on a granular level. However, most roles generally include a combination of permissions from different groups because of dependencies across the groups. Using RBAC, you can assign all the Data Catalog features to users in your organization by dividing permissions between the different roles within the system while restricting permissions to perform critical or sensitive activities to only selected users. RBAC also provides control over search dimensions and the visibility of facets in the search results. See Search dimensions and custom facets for more information.
Data Catalog offers the following user groups:
- Administrator: Permissions include Data Catalog settings and management functions often performed by the Data Catalog administrator.
- Steward: Permissions include those tasks performed to create and build the catalog, including permissions that control updates for resource metadata properties.
- Business: Permissions include the daily activities of a data analyst. This group also includes generic guest privileges.
  - Analyst: The Analyst subset includes managing glossaries, running jobs, and tasks for Data Catalog maintenance.
  - Guest: The Guest subset includes basic permissions for browsing the catalog.
Data Catalog installs with one predefined role: global_administrator. After installation, you should create the additional roles that your organization needs so that all the activities in Data Catalog can be performed.
Administrator role permissions
The Administrator group includes permissions for managing Data Catalog, including roles and users as well as system settings. Additionally, when you assign a user to the Administrator group, you can also choose to include the permissions from the Steward and Business groups for that user. The following table lists the permissions contained within the Administrator group.
| Permission | Description | Notes |
|---|---|---|
| Manage Business Glossary | Create, read, update, and delete glossaries. | |
| | Create, read, update, and delete agents. | |
| Manage Configuration Settings | Create, read, update, and delete system configurations. | |
| Manage Custom Properties | Create, read, update, and delete custom properties. | Requires the Manage External Sources permission to update custom properties. |
| Manage Job Templates | Create, read, update, and delete job templates. | Requires the Manage Virtual Folders and the Run Jobs permissions to select assets. |
| View Job Activity | View system-wide job activity. | Requires the View Job Logs permission to view and download logs. |
| Manage User Roles | Create, read, update, and delete user roles. | |
| Manage External Sources | Create, read, update, and delete external sources through the command line and API. | Provides the ability to integrate Apache Atlas functions. |
| Associate Roles with VFs | Create role associations to virtual folders. | Requires the Manage User Roles permission. Only the virtual folders visible to the current Administrator role can be assigned. |
| Associate Roles with Business Glossaries | Create role associations to glossaries. | Requires the Manage User Roles permission. Only the glossaries visible to the current Administrator role can be assigned. |
| | Create, read, update, and delete workflows. | |
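
Several Administrator permissions in the table above depend on other permissions. When planning a custom role, it can help to check a candidate permission set for gaps. The Python sketch below is one way to do that, under a simplifying assumption: it treats each "Requires ..." note as a hard dependency, even though the table says some are needed only for specific actions (for example, updating custom properties). The `REQUIRES` mapping and `missing_dependencies` function are hypothetical helpers, not part of Data Catalog.

```python
# Dependencies transcribed from the notes in the Administrator permissions table.
# Assumption: every "Requires ..." note is treated as a hard dependency.
REQUIRES = {
    "Manage Custom Properties": {"Manage External Sources"},
    "Manage Job Templates": {"Manage Virtual Folders", "Run Jobs"},
    "View Job Activity": {"View Job Logs"},
    "Associate Roles with VFs": {"Manage User Roles"},
    "Associate Roles with Business Glossaries": {"Manage User Roles"},
}


def missing_dependencies(granted: set[str]) -> dict[str, set[str]]:
    """Map each granted permission to the required permissions the role lacks."""
    gaps: dict[str, set[str]] = {}
    for permission in granted:
        lacking = REQUIRES.get(permission, set()) - granted
        if lacking:
            gaps[permission] = lacking
    return gaps


if __name__ == "__main__":
    candidate_role = {
        "Manage Job Templates",
        "Run Jobs",
        "Manage User Roles",
        "Associate Roles with VFs",
    }
    # Reports that Manage Job Templates still needs Manage Virtual Folders.
    print(missing_dependencies(candidate_role))
```

An empty result means the candidate role satisfies every dependency listed in the mapping.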
Steward role permissions
The Steward group includes permissions for creating, building, and curating the catalog, including permissions that control updates for resource metadata properties. Additionally, when you assign a user to the Steward group, you can also choose to include the permissions from the Business group for that user. The following table lists the permissions contained within the Steward group.
| Permission | Description | Notes |
|---|---|---|
| Run Business Rules | Execute rules such as for labeling, data quality, and associating glossary terms. | |
| | Curate data lineages. | Includes suggested inferred lineages on nodes and edges and importing factual lineages. |
| Manage Data Resource Fields | Create, read, update, and delete fields on custom field comments and custom field labels. | |
| | Create resource metadata. | Controls the update feature on resource properties. |
| | Create resource content. | Controls the update feature on data resource properties like |
| Manage Business Rules | Create, read, update, and delete rules such as for labeling, data quality, and associating glossary terms. | |
| Manage Virtual Folders | Create, read, update, and delete virtual folders. | |
| Manage Data Sources | Create, read, update, and delete data sources. | If you delete a data source, the corresponding (root/child) virtual folders are also deleted. |
Business role permissions
The Business group includes permissions for curating terms, running jobs, and browsing the catalog. This group is divided into two subsets, Analyst and Guest:
The Analyst subset of the group includes the daily tasks for a data analyst, including permissions for term curation and catalog maintenance.
| Permission | Description | Notes |
|---|---|---|
| View Business Rules | View rules such as for labeling, data quality, and associating glossary terms. | |
| Manage Business Terms | Create, read, update, and delete glossary terms. | Requires the View Business Terms permission. |
| Associate Business Terms | You can only curate tags from assigned glossaries. Allows you to accept and reject business term associations. | Requires the View Business Terms permission to see the terms from assigned glossaries. |
| | Job execution (sequence and template) for available resources. | |
| View Rationalization Dashboard | View the Rationalization dashboard. | |
| Run Term Discovery | Perform term discovery. | |
|Review business term
|Approve business term
The Guest subset of the group includes the minimum permissions for accessing and browsing the catalog.
| Permission | Description | Notes |
|---|---|---|
| View Business Terms | Browse glossary terms. | Requires View Business Glossaries permission to see terms in the user interface. Only terms from assigned glossaries can be viewed. |
| View Business Glossaries | Browse business glossaries. | |
Data Catalog installs with a predefined global_administrator role. This admin role has benefits and limitations and is included as a starting point. You are encouraged to create the roles your organization needs following installation, including administrators, data stewards, business analysts, and basic guests.
The global_administrator role ships with preselected Administrator permissions. You can use this role to create additional roles with the required permissions for building your instance of Data Catalog. Additionally, you can use this role to perform the following post-installation tasks for setting up Data Catalog:
- Create and register data source agents that will serve the data sources in different local and remote clusters, such as onPremAgent, azureAgent, AWS-CloudAgent, and EMEA-Agent. See Manage agents.
- Create the secondary administrative roles, such as FinanceAdmin, MarketingAdmin, and SalesAdmin, with the applicable Administrator and Steward permissions. To be effective, these secondary admin roles must be given proper permissions while respecting all permission dependencies. See Managing roles.
- Add users to Data Catalog (or delegate this activity to the secondary administrators). Refer to Add a user.
- Assign roles to users (or delegate this activity to the secondary administrators). See Assign a user to a role.
- Create custom properties (or delegate this activity to the secondary administrators). See Adding custom properties.
Resource read access control
As a Data Catalog administrator, you can enable Sample Data Access for a role:
- If Yes is selected, then the user can see sample data from the Data Canvas and Search.
- If No is selected, then the user cannot see sample data from the Data Canvas or Search.
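The Yes/No setting above can be pictured as a boolean flag on a role. The Python sketch below is a minimal model of that idea. The `Role` class and its `sample_data_access` field are hypothetical names, and the rule for users with several roles (access is granted if any assigned role has the setting enabled) is an assumption, since the text does not say how multiple roles combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Hypothetical role record; sample_data_access mirrors the Yes/No setting."""
    name: str
    sample_data_access: bool = False


def can_view_sample_data(roles: list[Role]) -> bool:
    """Assumption: any one role with the setting enabled grants access."""
    return any(role.sample_data_access for role in roles)


if __name__ == "__main__":
    analyst = Role("analyst", sample_data_access=True)  # setting is Yes
    guest = Role("guest")                               # setting is No
    print(can_view_sample_data([guest]))
    print(can_view_sample_data([guest, analyst]))
```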
RBAC and security
- Login. User A logs in through the browser, and the browser sends a request to Keycloak over HTTPS.
- Authentication. Keycloak sends a response with the username and password to the authentication server. After a unique response is retrieved, User A can log in to Data Catalog.
- Authorization. Data Catalog honors the defined access policies.
- The Data Catalog service user is used to impersonate the logged-in user while browsing HDFS resources. Also, the Data Catalog service user is used as a proxy user to browse Hive resources.
- Data Catalog roles and RBAC models apply, and users are only allowed to access virtual folders and glossaries according to the role assigned to them.
- Each user is required to enter a user name and password to access the Data Catalog portal.
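The authorization points above can be sketched in a few lines of Python: a user sees only the virtual folders and glossaries associated with at least one of their roles. This is an illustrative model rather than Data Catalog's implementation. The `Role` class, its fields, and the asset names are hypothetical; the role names reuse the secondary administrator examples mentioned earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """Hypothetical role record listing the assets associated with the role."""
    name: str
    virtual_folders: set[str] = field(default_factory=set)
    glossaries: set[str] = field(default_factory=set)


def visible_virtual_folders(roles: list[Role], all_folders: list[str]) -> list[str]:
    """Keep only the folders associated with at least one of the user's roles."""
    allowed = set().union(*(role.virtual_folders for role in roles))
    return [folder for folder in all_folders if folder in allowed]


def visible_glossaries(roles: list[Role], all_glossaries: list[str]) -> list[str]:
    """Keep only the glossaries associated with at least one of the user's roles."""
    allowed = set().union(*(role.glossaries for role in roles))
    return [glossary for glossary in all_glossaries if glossary in allowed]


if __name__ == "__main__":
    finance = Role("FinanceAdmin", virtual_folders={"finance_vf"}, glossaries={"finance_terms"})
    catalog_folders = ["finance_vf", "sales_vf"]
    # A user holding only FinanceAdmin sees finance_vf but not sales_vf.
    print(visible_virtual_folders([finance], catalog_folders))
```

A user with no roles sees nothing in this model, which matches the idea that access is granted only through role assignments.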