Role-based access control (RBAC)

Lumada Data Catalog features role-based access control (RBAC) to secure data while keeping workflows efficient. With RBAC, you can manage who has access to resources, which actions they can perform, and which areas they can access, giving you fine-grained control over access management. In addition to defining roles and permissions at a granular level, you can modify the default (predefined) roles, create your own custom roles, and assign permissions as needed.

To access the RBAC feature, navigate to Manage and then click Roles.

User groups let you manage a large number of users by organizing them so that you can easily assign them to features and roles. Data Catalog organizes user roles and permissions into three groups: Administrator, Steward, and Business. You can generally sort your existing users into these groups by job type, such as system administrator, data steward, or business analyst, and then refine their permissions at the granular level. However, most roles combine permissions from different groups because of dependencies across the groups.

Using RBAC, you can cover all the Data Catalog features by dividing permissions among the roles in the system, while restricting critical or sensitive activities to selected users. RBAC also controls search dimensions and the visibility of facets in the search results. See Search dimensions and custom facets for more information.

Data Catalog offers the following user groups:

  • Administrator

    Permissions include Data Catalog settings and management functions often performed by the Data Catalog administrator.

    • Settings: This subset of permissions controls Data Catalog settings.
  • Steward

    Permissions include those tasks performed to create and build the catalog, including permissions that control updates for resource metadata properties.

    • Resource: This subset of permissions controls the READ/WRITE access on specified resource properties.
  • Business

    Permissions include the daily activities of a data analyst. This group also includes generic guest privileges.

    • Analyst: The Analyst subset includes tag curation tasks for Data Catalog maintenance.
    • Guest: The Guest subset consists of basic permissions for browsing the catalog.

Data Catalog installs with two predefined roles: Guest and System Admin. After installation, you should create the additional roles that your organization needs so that all the activities in Data Catalog can be performed.

Caution: All the RBAC permissions must be assigned to ensure that Data Catalog operates properly.
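One way to read this caution is that every permission should be granted by at least one role. The following Python snippet is a minimal sketch of that check, not a Data Catalog API: the function name, the role names FinanceSteward and FinanceAnalyst, and the shortened permission list are all assumptions made for illustration. The permission names themselves come from the tables later in this article.

```python
# Hypothetical illustration only: nothing here is part of a Data Catalog API.
# A small subset of permission names taken from the tables in this article.
ALL_PERMISSIONS = {
    "Manage Roles",
    "Manage Users",
    "Manage Data Sources",
    "Run Jobs",
    "View Tags",
}

# Hypothetical role definitions: role name -> permissions granted to that role.
roles = {
    "FinanceAdmin": {"Manage Roles", "Manage Users"},
    "FinanceSteward": {"Manage Data Sources"},
    "FinanceAnalyst": {"Run Jobs", "View Tags"},
}


def unassigned_permissions(all_permissions, roles):
    """Return the permissions that no role grants, sorted for stable output."""
    granted = set().union(*roles.values())
    return sorted(all_permissions - granted)


if __name__ == "__main__":
    # An empty list means every permission is covered by at least one role.
    print(unassigned_permissions(ALL_PERMISSIONS, roles))
```

If a role is removed, for example FinanceSteward, the function reports the permissions that are no longer covered, which is the situation the caution warns about.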

Administrator role permissions

The Administrator group includes permissions for managing Data Catalog, including roles, users, and system settings. When you assign a user to the Administrator group, you can also choose to include the permissions from the Steward and Business groups for that user. The following table lists the permissions contained within the Administrator group.

| Permission | Actions | Notes |
|---|---|---|
| Manage Roles | Create, read, update, and delete roles. | |
| Manage Users | Create, read, update, and delete users. | |
| Associate Users with Roles | Assign roles to users. | |
| Associate Roles with Virtual Folders | Assign virtual folders to roles. | Requires the Manage Roles permission. Only the virtual folders that are visible to the current Administrator role can be assigned. |
| Associate Roles with Tag Domains | Assign tag domains to roles. | Requires the Manage Roles permission. Only the tag domains that are visible to the current Administrator role can be assigned. |
| Manage Job Templates | Create, read, update, and delete job templates. | Requires Run Job and Manage Virtual Folders permissions to select assets. |
| Manage Rules | Create, read, update, and delete rules. | |
| Manage Notifications | View and delete notifications of other users. | All users can view and delete their own notifications. |
| Manage Collaboration | Delete reviews, comments, posts, and topics authored by another user. | |
| Manage DataResources | Controls the creation of data resources through an API. | Provides the ability to create Registered (re)sources. |
| Manage External Sources | Controls create, read, update, and delete of external sources through the command line and API. | Provides the ability to integrate Apache Atlas - Cloudera Navigator functions. |
| Manage Tools | Can run utility jobs. | Refer to Utilities. |

The following permissions provide finer control of Data Catalog features that only an Administrator can use.

| Setting | Permissions | Notes |
|---|---|---|
| Manage Application Configuration | Update configurations. | |
| Manage Agents | Create, read, update, and delete agents. | |
| Manage Tokens | Create, read, update, and delete tokens. | |
| Manage Properties | Create, read, update, and delete custom properties. | Requires the Manage DataResources permission to update custom properties. |
| Job Activity | View system-wide job activity. | Requires the View Job Logs Analyst permission to view and download logs. |
| System Activity | View system activity. | |
| System Logs | View and download logs for setup, user interface activity, and job execution. | |
| Monitor Network Health Status | Monitor network health for Metadata server and Agents. | |

Steward role permissions

The Steward group includes permissions for creating, building, and curating the catalog, including permissions that control updates for resource metadata properties. When you assign a user to the Steward group, you can also choose to include the permissions from the Business group for that user. The following table lists the permissions contained within the Steward group.

| Permission | Actions | Notes |
|---|---|---|
| Manage Data Sources | Create, read, update, and delete data sources. | If you delete a data source, the corresponding (Root/Child) Virtual Folders are also deleted. |
| Manage Virtual Folders | Create, read, update, and delete Virtual Folders. | |
| Manage Data Object | Create, read, update, and delete Data Objects. | |
| Manage Dataset | Create, read, update, and delete Datasets. | |
| Manage Tag Domain | Create, read, update, and delete Tag Domains. | |
| Export Tag Domain | Export Tag Domains from Glossary. | You must have Manage Tag Domain permissions. |
| Import Tag Domain | Import Tag Domains from Glossary. | You must have Manage Tag Domain permissions. |
| Lineage Curation | Curate data lineages. | Includes suggested inferred lineages on nodes and edges, and importing factual lineages. |
| Export Lineage | Export lineages with tools. | |
| Import Lineage | Import lineages with tools. | |
| Manage DataResource Fields | Create, read, update, and delete custom field comments and custom field labels. | |
| Generate Hive Table or View | Generate Hive tables and views from the single resource view. | You must have access to at least one Hive database and Native Write access to the containing HDFS folder. |

The following permissions provide finer control of the single resource view (SRV) features in Data Catalog that only a Steward or Administrator can use.

| SRV feature | Permissions |
|---|---|
| Configuration | Controls the update feature on resource properties like xmlRowTag, xmlRootTag, separator, parquetBinaryAsString. |
| Content | Controls the update feature on data resource properties like resource_field_tags and text. |
| Metadata | Controls the update feature on resource properties like header and headerRow on the Properties tab, and Description property on the Overview tab. |
| Social | Controls the Read feature on Experts list, average rating and sensitivity. |

Business User role permissions

The Business group includes permissions for curating tags, running jobs, and browsing the catalog. This group is divided into Analyst and Guest.

The Analyst subset includes the daily tasks for a data analyst, including permissions for tag curation and catalog maintenance.

| Permission | Actions | Notes |
|---|---|---|
| Associate Tags | Curate tags. | You can only curate tags from assigned tag domains. Allows you to accept and reject tag associations. Requires the View Tag Domain permission to see the tags from assigned tag domains. |
| Manage Tags | Create, read, update, and delete tags. | Requires the View Tag Domain permission. |
| Export Resource Metadata | Export metadata with Export CSV from the single resource view. | The kind of metadata exported depends on the role's metadata access mode. |
| Run Jobs | Job execution (sequence and template) for available resources. | |
| View Job Logs | View and download job logs. | |
| View Dashboard | View DataOps dashboard. | |

The Guest subset includes the minimum permissions for accessing and browsing the catalog.

| Permission | Actions | Notes |
|---|---|---|
| Collaboration | Create, read, and delete for self-authored posts, topics, comments, and reviews. | |
| View Tags | Makes tag APIs available for users. | Requires View Tag Domain permission to see tags in the user interface. Only tags from assigned tag domains can be viewed. |
| View Tag Domains | View the tag domains. | |
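Several Notes entries in the permission tables state that one permission requires another. The following Python snippet is a hedged sketch of how you might check a planned role against those dependencies before creating it. It is not a Data Catalog API: the REQUIRES mapping and the missing_dependencies function are hypothetical names, the mapping covers only the unconditional "requires" notes from the tables, and "View Tag Domain" is written as "View Tag Domains" to match the permission row name.

```python
# Hypothetical illustration only: not part of any Data Catalog API.
# Dependencies copied from the Notes columns of the permission tables.
# Conditional notes (for example, "to select assets") are left out.
REQUIRES = {
    "Associate Roles with Virtual Folders": {"Manage Roles"},
    "Associate Roles with Tag Domains": {"Manage Roles"},
    "Export Tag Domain": {"Manage Tag Domain"},
    "Import Tag Domain": {"Manage Tag Domain"},
    "Manage Tags": {"View Tag Domains"},
}


def missing_dependencies(granted):
    """Map each granted permission to the required permissions it lacks."""
    granted = set(granted)
    missing = {}
    for permission in granted:
        needed = REQUIRES.get(permission, set()) - granted
        if needed:
            missing[permission] = sorted(needed)
    return missing


if __name__ == "__main__":
    # A planned role that grants Manage Tags without View Tag Domains.
    planned_role = {"Manage Tags", "Run Jobs"}
    print(missing_dependencies(planned_role))
```

An empty result means the planned role satisfies every dependency listed in the mapping. Any other result names the permissions to add before the role can work as intended.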

Predefined roles

Lumada Data Catalog installs with two predefined roles, Guest and System Admin. Each has benefits and limitations and is included only as a starting point. After installation, create the roles your organization needs, including administrators, data stewards, business analysts, and basic guests.

  • Guest

    While no permissions are set, the predefined Guest role can browse all virtual folders and perform searches. Guest is the default role. Any new user automatically added to Data Catalog is assigned this default role unless set otherwise by an administrator. See Set a role as default for setting details.

    WARNING: The predefined Guest role can view all virtual folders and tag domains in Data Catalog. In accordance with your data security best practices, create an alternative 'guest' role with basic permissions and limited viewing access to virtual folders and tag domains to use for your organization's guest users.
  • System Admin

    The predefined System Admin role ships with preselected Administrator and Settings permissions. You can use this role to create additional roles with the required permissions for building your instance of Data Catalog. Additionally, you can use this role to perform the following post-installation tasks for setting up Data Catalog:

    • Create and register data source agents that will serve the data sources in different local and remote clusters, such as onPremAgent, azureAgent, AWS-CloudAgent, and EMEA-Agent. See Create an agent and Manage agents.
    • Create the secondary administrative roles, such as FinanceAdmin, MarketingAdmin, and SalesAdmin, with the applicable Administrator and Steward permissions. To be effective, these secondary admin roles must be given proper permissions while respecting all permission dependencies. See Managing roles.
    • Add users to Data Catalog (or delegate this activity to the secondary administrators). Refer to Add a user.
    • Assign roles to users (or delegate this activity to the secondary administrators). See Assign a role.
    • Create custom properties (or delegate this activity to the secondary administrators). See Add a new custom property group.

Resource read access control

As a Lumada Data Catalog Administrator, you can override metadata visibility that would otherwise be dictated by the underlying native system permissions.

Data Catalog provides three levels of read access control:

  • NATIVE

    At this level, the native system permissions control both data and metadata visibility. Each resource's own system permissions determine who can read and list it.

  • METADATA + NATIVE

    At this level, Data Catalog can show the resource metadata to users, even if the native system resource permission settings deny data access to that user. For example, all users who have the SalesSteward role can have access to metadata of all the resources in the virtual folders assigned to this role regardless of the access permissions set in the native system.

  • METADATA ONLY

    At this level, even if the users have native access to the resource, Data Catalog skips permission checks for data access and only displays the metadata for such a role.

The following table shows the resource read access levels for the different metadata access and data access settings described above.

| Metadata access | Data access | Resource read access | User with native access | User without native access |
|---|---|---|---|---|
| Native | Native | NATIVE | Data + Metadata | None |
| Yes | Native | METADATA + NATIVE | Data + Metadata | Metadata |
| Yes | No | METADATA ONLY | Metadata | Metadata |
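The three read access levels can be summarized as a small decision function. The following Python snippet is a minimal sketch of the table's logic under stated assumptions: the ReadAccess enum and the visible_content function are hypothetical names created for this illustration and do not correspond to a Data Catalog API.

```python
from enum import Enum


class ReadAccess(Enum):
    """Hypothetical names for the three read access levels described above."""
    NATIVE = "NATIVE"
    METADATA_PLUS_NATIVE = "METADATA + NATIVE"
    METADATA_ONLY = "METADATA ONLY"


def visible_content(level, has_native_access):
    """Return what a user can see ("data", "metadata") for a resource."""
    if level is ReadAccess.NATIVE:
        # Native permissions decide everything: no native access, no visibility.
        return {"data", "metadata"} if has_native_access else set()
    if level is ReadAccess.METADATA_PLUS_NATIVE:
        # Metadata is always shown; data still depends on native access.
        return {"data", "metadata"} if has_native_access else {"metadata"}
    if level is ReadAccess.METADATA_ONLY:
        # Data access checks are skipped; only metadata is shown.
        return {"metadata"}
    raise ValueError(f"Unknown read access level: {level!r}")


if __name__ == "__main__":
    for level in ReadAccess:
        for native in (True, False):
            print(level.value, native, sorted(visible_content(level, native)))
```

Returning a set keeps the two questions (can the user see data, can the user see metadata) independent, which mirrors the separate Metadata access and Data access columns in the table.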

RBAC and security

Data Catalog leverages RBAC to integrate with user authentication methods such as Active Directory (AD), Kerberos, and OAuth. The following diagram and process illustrate the flow of Data Catalog user authentication.

[Figure: Data Catalog Security Model]

  1. Login. User A logs in through the browser, and the browser sends a request to the Jetty server over HTTPS.
  2. Authentication. The Jetty server sends a request with the user name and password to AD/Kerberos or the OAuth server. After a unique response is returned, User A is logged in to Data Catalog.
  3. Authorization. Data Catalog honors the access policies defined by Ranger (for HDP) and Sentry (for CDH).
    • For Metadata mode, the Data Catalog service user impersonates the logged-in user when browsing HDFS resources and acts as a proxy user when browsing Hive resources. For more information about service users, see Configure the Data Catalog service user.
    • For Native mode, the user logged in to the browser is used for browsing HDFS resources.
    • In either mode, Data Catalog roles and RBAC models apply, and users are only allowed to access virtual folders and tag domains according to the role assigned to them.
    • Each user must enter a user name and password to access JDBC data sources after logging in to the Data Catalog portal.