Lumada Data Catalog features role-based access control (RBAC) to secure data while providing workflow efficiency. With RBAC, you can manage who has access to resources, what actions they can perform, and which areas they can access, giving you fine-grained control over access management. In addition to defining roles and permissions at a granular level, you can modify the default (predefined) roles, create your own custom roles, and assign permissions as needed.
You can access the RBAC feature by navigating to Manage and then clicking Roles.
User groups allow you to manage a large number of users by organizing them efficiently so you can assign them easily to features and roles. Data Catalog organizes user roles and permissions into three groups: Administrator, Steward, and Business. You can generally organize your existing users into these groups based on their job type, such as system administrator, data steward, or business analyst, and then further define their permissions at a granular level. However, most roles include a combination of permissions from different groups because of dependencies across the groups. Using RBAC, you can assign all the Data Catalog features to users in your organization by dividing permissions between the different roles within the system while restricting permissions to perform critical or sensitive activities to only selected users. RBAC also provides control over search dimensions and the visibility of facets in the search results. See Search Dimensions and Custom Facets for more information.
Data Catalog offers the following user groups:
Administrator
Permissions include Data Catalog settings and management functions often performed by the Data Catalog administrator.
- Settings: This subset of permissions controls Data Catalog settings.
Steward
Permissions include those tasks performed to create and build the catalog, including permissions that control updates for resource metadata properties.
- Resource: This subset of permissions controls the READ/WRITE access on specified resource properties.
Business
Permissions include the daily activities of a data analyst. This group also includes generic guest privileges.
- Analyst: The Analyst subset includes tag curation tasks for Data Catalog maintenance.
- Guest: The Guest subset consists of basic permissions for browsing the catalog.
Data Catalog installs with two predefined roles: Guest and System Admin. After installation, you should create the additional roles that your organization needs so that all the activities in Data Catalog can be performed.
Administrator role permissions
The Administrator group includes permissions for managing Data Catalog, including roles and users as well as system settings. Additionally, when you assign a user to the Administrator group, you can also select to include the permissions from the Steward and Business groups for that user. The following table lists the permissions contained within the Administrator group.
|Manage Roles||Create, read, update, and delete roles.|
|Manage Users||Create, read, update, and delete users.|
|Associate Users with Roles||Assign roles to users.|
|Associate Roles with Virtual Folders||Assign virtual folders to roles.||Requires the Manage Roles permission. Only the virtual folders that are visible to the current Administrator role can be assigned.|
|Associate Roles with Tag Domains||Assign tag domains to roles.||Requires the Manage Roles permission. Only the tag domains that are visible to the current Administrator role can be assigned.|
|Manage Job Templates||Create, read, update, and delete job templates.||Requires Run Job and Manage Virtual Folders permissions to select assets.|
|Manage Rules||Create, read, update, and delete rules.|
|Manage Notifications||View and delete notifications of other users.||All users can View and Delete their own notifications.|
|Manage Collaboration||Delete reviews, comments, posts, and topics authored by another user.|
|Manage Data Resources||Controls the creation of data resources through an API.||Provides the ability to create Registered (re)sources.|
|Manage External Sources||Controls create, read, update, and delete of external sources through the command line and API.||Provides the ability to integrate Apache Atlas - Cloudera Navigator functions.|
|Manage Tools||Can run utility jobs.||Refer to Utilities.|
The following permissions provide finer control of Data Catalog features that only an Administrator can use.
|Manage Application Configuration||Update configurations.|
|Manage Agents||Create, read, update, and delete agents.|
|Manage Tokens||Create, read, update, and delete tokens.|
|Manage Properties||Create, read, update, and delete custom properties.|
|Job Activity||View system-wide job activity.||Requires the View Job Logs Analyst permission to view and download logs.|
|System Activity||View system activity.|
|System Logs||View and download logs for setup, user interface activity, and job execution.|
|Monitor Network Health Status||Monitor network health for Metadata server and Agents.|
Steward role permissions
The Steward group includes permissions for creating, building, and curating the catalog, including permissions that control updates for resource metadata properties. Additionally, when you assign a user to the Steward group, you can also select to include the permissions from the Business group for that user. The following table lists the permissions contained within the Steward group.
|Manage Data Sources||Create, read, update, and delete data sources.||If you delete a data source, the corresponding (Root/Child) Virtual Folders are also deleted.|
|Manage Virtual Folders||Create, read, update, and delete Virtual Folders.|
|Manage Data Object||Create, read, update, and delete Data Objects.|
|Manage Dataset||Create, read, update, and delete Datasets.|
|Manage Tag Domain||Create, read, update, and delete Tag Domains.|
|Export Tag Domain||Export Tag Domains from Glossary.||You must have Manage Tag Domain permissions.|
|Import Tag Domain||Import Tag Domains from Glossary.||You must have Manage Tag Domain permissions.|
|Lineage Curation||Curate data lineages.||Includes suggested inferred lineages on nodes and edges, and importing factual lineages.|
|Export Lineage||Export lineages with tools.|
|Import Lineage||Import lineages with tools.|
|Manage DataResource Fields||Create, read, update, and delete custom field comments and custom field labels.|
|Generate Hive Table or View||Generate Hive tables and views from the single resource view.||You must have access to at least one Hive database and Native Write access to the containing HDFS folder.|
These permissions provide for finer control of Data Catalog's single resource view (SRV) features that only a Steward or Administrator can use.
Controls the update feature on resource properties like
|Content||Controls the update feature on data resource properties like
|Metadata||Controls the update feature on resource properties like
|Social||Controls the Read feature on Experts list, average rating and sensitivity.|
Business User role permissions
The Business group includes permissions for curating tags, running jobs, and browsing the catalog. This group is divided into Analyst and Guest.
The Analyst subset includes the daily tasks for a data analyst, including permissions for tag curation and catalog maintenance.
|Associate Tags||Curate tags.||You can only curate tags from assigned tag domains. Allows you to accept and reject tag associations. Requires the View Tag Domain permission to see the tags from assigned tag domains.|
|Manage Tags||Create, read, update, and delete tags.||Requires the View Tag Domain permission.|
|Export Resource Metadata||Export metadata with Export CSV from the single resource view.||The kind of metadata exported depends on the role's metadata access mode.|
|Run Jobs||Job execution (sequence and template) for available resources.|
|View Job Logs||View and download job logs.|
|View Dashboard||View DataOps dashboard.|
The Guest subset includes the minimum permissions for accessing and browsing the catalog.
|Collaboration||Create, read, and delete for self-authored posts, topics, comments, and reviews.|
|View Tags||Makes tag APIs available for users.||Requires View Tag Domain permission to see tags in the user interface. Only tags from assigned tag domains can be viewed.|
|View Tag Domains||View the tag domains.|
Lumada Data Catalog installs with two predefined roles, Guest and System Admin. Each has its benefits and limitations and is merely included as a starting point. You are encouraged to create the roles your organization needs following installation, including administrators, data stewards, business analysts, and basic guests.
While no permissions are set, the predefined Guest role can browse all virtual folders and perform searches. Guest is the default role. Any new user automatically added to Data Catalog is assigned this default role unless set otherwise by an administrator. See Set a role as default for setting details.
WARNING: The predefined Guest role can view all virtual folders and tag domains in Data Catalog. In accordance with your data security best practices, create an alternative 'guest' role with basic permissions and limited viewing access to virtual folders and tag domains to use for your organization's guest users.
The predefined System Admin role ships with preselected Administrator and Settings permissions. You can use this role to create additional roles with the required permissions for building your instance of Data Catalog. Additionally, you can use this role to perform the following post-installation tasks for setting up Data Catalog:
- Create and register data source agents that will serve the data sources in different local and remote clusters, such as onPremAgent, azureAgent, AWS-CloudAgent, and EMEA-Agent. See Create an agent and Manage agents.
- Create the secondary administrative roles, such as FinanceAdmin, MarketingAdmin, and SalesAdmin, with the applicable Administrator and Steward permissions. To be effective, these secondary admin roles must be given proper permissions while respecting all permission dependencies. See Managing roles.
- Add users to Data Catalog (or delegate this activity to the secondary administrators). Refer to Add a user.
- Assign roles to users (or delegate this activity to the secondary administrators). See Assign a role.
- Create custom properties (or delegate this activity to the secondary administrators). See Add a new custom property group.
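Because secondary roles only work when their permission dependencies are satisfied, it can help to check a planned role before creating it. The following is a minimal sketch, not a Data Catalog API: the `REQUIRES` map, the `missing_dependencies` function, and the sample role are hypothetical, and the map only encodes the "Requires ..." notes from the permission tables above. It assumes the permission names as they appear in the permission rows (`Run Jobs`, `View Tag Domains`), which differ slightly from the spellings used in some notes.

```python
# Hypothetical dependency map transcribed from the notes in the permission tables.
# Keys are permissions; values are the permissions the notes say they require.
REQUIRES = {
    "Associate Roles with Virtual Folders": {"Manage Roles"},
    "Associate Roles with Tag Domains": {"Manage Roles"},
    # Needed to select assets when working with job templates.
    "Manage Job Templates": {"Run Jobs", "Manage Virtual Folders"},
    "Export Tag Domain": {"Manage Tag Domain"},
    "Import Tag Domain": {"Manage Tag Domain"},
    "Associate Tags": {"View Tag Domains"},
    "Manage Tags": {"View Tag Domains"},
}


def missing_dependencies(granted):
    """Return {permission: [missing prerequisites]} for a planned permission set."""
    granted = set(granted)
    gaps = {}
    for permission in sorted(granted):
        missing = REQUIRES.get(permission, set()) - granted
        if missing:
            gaps[permission] = sorted(missing)
    return gaps


# Illustrative role; the permission selection is an assumption for this example.
finance_admin = {
    "Manage Users",
    "Associate Roles with Virtual Folders",
    "Manage Tags",
    "View Tag Domains",
}

print(missing_dependencies(finance_admin))
# {'Associate Roles with Virtual Folders': ['Manage Roles']}
```

An empty result means every dependency listed in the map is covered. Dependencies that are not spelled out in the tables are not checked.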
Resource read access control
As a Lumada Data Catalog Administrator, you can override metadata visibility that would otherwise be dictated by the underlying native system permissions.
Data Catalog provides three levels of read access control:
NATIVE
At this level, data and metadata visibility is controlled by the native system permissions: each resource's own system permissions determine its read and list access.
METADATA + NATIVE
At this level, Data Catalog can show the resource metadata to users, even if the native system resource permission settings deny data access to that user. For example, all users who have the SalesSteward role can have access to metadata of all the resources in the virtual folders assigned to this role regardless of the access permissions set in the native system.
At this level, even if the users have native access to the resource, Data Catalog skips permission checks for data access and only displays the metadata for such a role.
The following table shows the resource read access levels for the different metadata access and data access settings described above.
|Metadata access||Data access||Resource Read access||User with Native access||User without Native access|
|Native||Native||NATIVE||Data + Metadata||None|
|Yes||Native||METADATA + DATA||Data + Metadata||Metadata|
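The table can be read as a lookup from a role's metadata access and data access settings to what a user sees. The sketch below encodes only the two rows shown in the table and raises an error for any other combination. The names `ReadAccessRow`, `READ_ACCESS`, and `visible_content` are hypothetical and are not part of Data Catalog.

```python
from typing import NamedTuple


class ReadAccessRow(NamedTuple):
    level: str            # value of the "Resource Read access" column
    with_native: str      # what a user with native access sees
    without_native: str   # what a user without native access sees


# Keys are (metadata access, data access), copied from the table rows above.
READ_ACCESS = {
    ("Native", "Native"): ReadAccessRow("NATIVE", "Data + Metadata", "None"),
    ("Yes", "Native"): ReadAccessRow("METADATA + DATA", "Data + Metadata", "Metadata"),
}


def visible_content(metadata_access, data_access, has_native_access):
    """Look up what a user can see for the given role settings."""
    key = (metadata_access, data_access)
    if key not in READ_ACCESS:
        raise ValueError(f"combination not covered by the table: {key}")
    row = READ_ACCESS[key]
    return row.with_native if has_native_access else row.without_native


# A user without native access, in a role with metadata access set to "Yes".
print(visible_content("Yes", "Native", has_native_access=False))  # Metadata
```

The lookup makes the practical difference explicit: with both settings at Native, a user without native access sees nothing, while setting metadata access to Yes exposes the metadata to that user.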
RBAC and security
- Login. User A logs in through the browser, and the browser sends a request to the Jetty server over HTTPS.
- Authentication. The Jetty server sends the user name and password to AD/Kerberos or the OAuth server. After a unique response is returned, User A can log in to Data Catalog.
- Authorization. Data Catalog honors the access policies defined by Ranger (for HDP) and Sentry (for
- For Metadata mode, the Data Catalog service user impersonates the logged-in user while browsing HDFS resources, and acts as a proxy user to browse Hive resources. For more information about service users, see Configure the Data Catalog service user.
- For Native mode, the user logged in to the browser is used for browsing HDFS resources.
- In either mode, Data Catalog roles and RBAC models apply, and users are only allowed to access virtual folders and tag domains according to the role assigned to them.
- Each user is required to enter a username and password to access JDBC data sources after logging into the Data Catalog portal.
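The identity rules for Metadata and Native mode described above can be summarized as a small decision function. This is an illustrative sketch only: `BrowseIdentity`, `browse_identity`, the mode and resource strings, and the account names are hypothetical. Combinations that the text does not describe (for example, Hive in Native mode) raise an error rather than guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowseIdentity:
    account: str       # account that contacts the native system
    on_behalf_of: str  # logged-in user whose request is being served
    mechanism: str     # how the account relates to the logged-in user


def browse_identity(mode, resource, logged_in_user, service_user):
    """Pick the identity used to browse a resource, per the rules listed above."""
    if mode == "metadata" and resource == "hdfs":
        # The service user impersonates the logged-in user.
        return BrowseIdentity(service_user, logged_in_user, "impersonation")
    if mode == "metadata" and resource == "hive":
        # The service user acts as a proxy user.
        return BrowseIdentity(service_user, logged_in_user, "proxy user")
    if mode == "native" and resource == "hdfs":
        # The logged-in user browses directly.
        return BrowseIdentity(logged_in_user, logged_in_user, "direct")
    raise ValueError(f"not described in this section: mode={mode!r}, resource={resource!r}")


# Hypothetical account names.
print(browse_identity("metadata", "hive", "alice", "catalog_svc"))
```

Whichever identity is selected, role assignments still limit which virtual folders and tag domains the user can reach.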