Managing rules
With Lumada Data Catalog's rules framework you can define, execute, and manage tag-based rules. These rules can evaluate data and metadata properties to add tags, remove tags, modify custom properties on data assets, and generate reports. The rules can reference data identifiers, such as table.column_3.
To manage rules, click Manage on the menu bar, then click Rules. From here, you can author, run, track, and manage all rules in the catalog. All rules must be entered using the Data Catalog rules language. After creating a new rule or updating an existing rule, you must set the Current status to Enabled for the rule to take effect.
When you write rules using tags, they are translated automatically into concrete rules that are bound and executed on individual data resources. This translation occurs regardless of which format and platform the resource is in, such as JDBC tables, Hive tables, CSV, Avro, or JSON, as long as it is a format Data Catalog supports. A rule written using tags captures business logic explicitly and can express many concrete rules.
You can view the following sample applications of the rules framework:
Using a rule for sensitive resource tagging
This example identifies and tags all resources containing sensitive data or personal identifiers such as names, addresses, social security numbers, and account information. Using the Lumada Data Catalog tag discovery features, you can identify field metadata and tag data fields such as first name, last name, and address. The Data Catalog built-in tags can identify these fields. Then, you can use a rule to check for any resources that contain tagged sensitive data fields and tag the resources as "Restricted Access".
Although not shown in exact syntax, the rule illustrated below is the only rule you need to write. The rule is tag-based and does not depend on an actual field name, resource name, or resource type.
When the rule is processed, it is automatically bound to all qualifying resources and attaches the tag you specify when a resource contains the sensitive fields. If you have 100 CSV files, 200 JDBC tables, and 30 Avro files that are all sensitive, they all are labeled correctly after executing this rule.
Syntax part | Definition |
Scope | hasFieldTag(First Name) AND hasFieldTag(Last Name) AND hasFieldTag(Address) |
Rule | hasFieldTag(SSN) |
Action | tag SSN field as "Sensitive" |
Resource tagging based on data properties
This example attaches a resource tag to all resources where the data for a given field is within a certain range. With the Data Catalog rules, a simple data rule defining the condition can identify the resources to which the condition applies and take the corresponding action.
Syntax part | Definition |
Scope | hasTag (Employee) |
Rule | @Category between (100, 199) and @TAX_state = 6A |
Action | tag resource as "CA Employee" |
Rule syntax
A rule executes on all the resources in Lumada Data Catalog, applying and evaluating the rule against one resource at a time, and executing the specified rule action. All rules must be entered using the Data Catalog rules language.
The language of the Data Catalog rules provides constructs for expressing scope, conditions, and actions. A unique capability in Data Catalog is that you can express all these constructs based on tags as well as with actual resource and field names.
Given a rule with Scope S, Body B, and Action A, the semantics of the rule can be summarized as: "For any resource R that is within S, if B evaluates to true, then perform all actions listed in A on R."
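The semantics above can be illustrated with a small sketch. This is not the Data Catalog engine; the `Resource` class, the `run_rule` function, and the sample data are hypothetical names used only to show the "scope, then body, then actions" order of evaluation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    # Hypothetical stand-in for a catalog resource.
    name: str
    virtual_folder: str
    tags: List[str] = field(default_factory=list)


def run_rule(resources, in_scope, body, actions):
    """For each resource R within scope S, if body B is true, perform all actions A on R."""
    changed = []
    for resource in resources:
        if in_scope(resource) and body(resource):
            for action in actions:
                action(resource)
            changed.append(resource.name)
    return changed


resources = [
    Resource("emp.csv", "HDFS", ["Employee"]),
    Resource("logs.json", "HDFS"),
    Resource("emp_tbl", "Archive", ["Employee"]),
]

changed = run_rule(
    resources,
    in_scope=lambda r: r.virtual_folder == "HDFS",      # scope S
    body=lambda r: "Employee" in r.tags,                # body B
    actions=[lambda r: r.tags.append("CA Employee")],   # actions A
)
print(changed)
```

Only the resource that is both in scope and satisfies the body is modified; out-of-scope resources are never evaluated against the body.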
The rule syntax contains three parts:
Rule scope
Sets the scope of resources on which the rule is evaluated and applied.
Rule body
Defines the condition in a SQL predicate.
Rule action
Defines the action to be taken on resources that conform to the rule evaluation, such as resource tagging, tag removal, setting custom property values, and report generation.
Note: In rule syntax, a tagDomain.Tag needs the dot replaced with a forward slash. For example, enter Built-in_Tags/Last_Name instead of Built-in_Tags.Last_Name. This replacement should be made regardless of the part of the rule in which it is located.
Rule scope
The rule scope defines the resources for rule execution. You can define the rule scope by specifying scope types in the Scope window.
Virtual folders
At least one virtual folder is required for the rule to compile. When a single virtual folder is listed, rule execution is run against all the resources under the listed virtual folder. You can enter additional virtual folders as a comma-separated list. If a folder no longer exists when the rule is executed, it is ignored.
Source property filters
Comma-separated list of the source property filters key-value pairs.
Field tags
List of field tag filters.
Resource tags
List of resource tag filters.
Tag association states
List of tag states that the rules evaluate.
For example:
"ruleScope": {
  "virtualFolders": [ "HDFS" ],
  "sourcePropertyFilters": {},
  "fieldTags": [ "Built-in_Tags/Last_Name", "Built-in_Tags/US_Address" ],
  "resourceTags": [ "CA/Employee" ],
  "tagStates": [ "ACCEPTED", "SUGGESTED" ]
}
You can also define or update the rule scope using the Insert button. When using Insert, the placement of your cursor in the Scope window determines the insertion point for your selection. If the cursor is placed in the code or on a code line, then the item is inserted at the point of placement. If the cursor is placed outside the code, then the item is inserted on a new line. When entering your definition details, you can select from the system suggestions to help you complete the field entries.
The following selections are available using Insert:
Custom Property
Enter a custom property name that exists in your system and click Insert Custom Property.
Datasource
Enter the data source name, then click Insert Datasource.
Field
Enter the resource name, select a field name from that resource, and then click Insert Field.
Tag
Enter the tag domain name, select a tag name from that domain, and then click Insert Tag.
Virtual Folder
Enter the virtual folder name, then click Insert Virtual Folder.
You can return the rule scope to its default settings by clicking Reset.
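If you generate rule definitions programmatically, the scope structure described in this section could be assembled as in the following sketch. The `build_rule_scope` helper is hypothetical and not part of Data Catalog; the default tag states are an assumption chosen to mirror the example above, and the only constraint it enforces is the documented requirement of at least one virtual folder.

```python
import json


def build_rule_scope(virtual_folders, field_tags=(), resource_tags=(),
                     tag_states=("ACCEPTED", "SUGGESTED"),
                     source_property_filters=None):
    """Return a ruleScope dictionary; at least one virtual folder is required."""
    if not virtual_folders:
        raise ValueError("at least one virtual folder is required")
    return {
        "virtualFolders": list(virtual_folders),
        "sourcePropertyFilters": dict(source_property_filters or {}),
        "fieldTags": list(field_tags),
        "resourceTags": list(resource_tags),
        "tagStates": list(tag_states),
    }


scope = build_rule_scope(
    ["HDFS"],
    field_tags=["Built-in_Tags/Last_Name", "Built-in_Tags/US_Address"],
    resource_tags=["CA/Employee"],
)
print(json.dumps({"ruleScope": scope}, indent=2))
```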
Rule body
The rule body defines the rule that is translated and evaluated into a query to be executed against every qualifying resource as defined in the rule scope. You can define the rule body by specifying rule types in the Body window.
For example, you can insert a clause that determines what the rule body acts on. The query clause determines whether the rule acts on metadata or on the data itself.
Metadata query
A metadata query is compiled using resource metadata.
For example, the rule body hasFieldTag(Built-in_Tags/Social_Security_Number_Delimited) = 1 checks for the presence of the field tag Built-in_Tags.Social_Security_Number_Delimited.
Data query
A data query is compiled using the resource data.
Data query operating on field tags
For example, the rule body (@EMS/Category >= 100 and @EMS/Category <= 199) and @EMS/Tax_State = '6A' inspects the data in the field tagged with EMS.Category for values between 100 and 199, when the data in the field tagged with EMS.Tax_State has a value of "6A".
Data query operating on custom property
For example, the rule body @@business = 'MagnUX' operates on custom properties looking for specific values.
Note the following conventions for data queries:
- Prefixing a FieldTag with an "@" indicates the rule operates on the data tagged by the FieldTag.
- The presence of "@@" indicates the rule operates on a custom property and its values.
Depending on the rule type, the following ruleBody queries are possible, where FieldTag is a full tag name including the domain it is associated with:
Metadata query | Data query |
Evaluates against metadata. Rules query for metadata discovered by Lumada Data Catalog. | Inspects the data when evaluating rules. Rules query for the specific data value identified by the tag. |
Evaluating on field tag, ruleBody: hasFieldTag(A/x) AND hasFieldTag(B/y) | FieldName1 IN (val1, val2) AND FieldName2 = 'Some Value' |
Evaluating on resource tag or field tag, ruleBody: hasResourceTag(M/j) OR hasFieldTag(A/x) | (@Domain1/Tag1 + @Domain2/Tag1) < @Domain3/Tag1 |
Evaluating on resource tag, ruleBody: hasResourceTag(M/j) AND hasResourceTag(N/f) | @FieldTag1 = 'someValue' |
 | @FieldTag1 > 'someValue' |
 | (@fieldTag1 >= 100 and @fieldTag1 <= 199) and @fieldTag2 = 'some_value' |
 | @Built-in_Tags/US_City = 'Los Angeles' OR @Built-in_Tags/US_City IN ('Fresno', 'Los Angeles', 'San Francisco') OR @Built-in_Tags/US_City = 'Folsom' OR 'city.*' = 'Los Angeles' OR length('city.*') > 5 OR length(@Built-in_Tags/US_City) > 6 |
 | CASE statement support: @Built-in_Tags/US_Zip_Code in ('10003', '10019', '10036', '10014') and (case when hasFieldTag(@Built-in_Tags/US_City) =1 then @Built-in_Tags/US_City is null else true end) |
 | Evaluating on custom property: @@business = 'MagnUX', @@strike-count = '3' |
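The "@" and "@@" conventions can be checked mechanically before a rule is submitted. The sketch below is an illustration only, not the Data Catalog parser: the `references` function and both regular expressions are assumptions, and they would also pick up an "@" that appears inside a quoted string literal.

```python
import re

# "@@name" marks a custom property; "@Domain/Tag" marks data tagged by a field tag.
CUSTOM_PROPERTY = re.compile(r"@@([\w-]+)")
FIELD_TAG = re.compile(r"(?<!@)@([\w-]+(?:/[\w-]+)?)")


def references(rule_body):
    """List the custom properties and field tags that a rule body refers to."""
    return {
        "custom_properties": CUSTOM_PROPERTY.findall(rule_body),
        "field_tags": FIELD_TAG.findall(rule_body),
    }


data_rule = references(
    "(@EMS/Category >= 100 and @EMS/Category <= 199) and @EMS/Tax_State = '6A'"
)
property_rule = references("@@business = 'MagnUX'")
metadata_rule = references("hasFieldTag(A/x) AND hasFieldTag(B/y)")
print(data_rule, property_rule, metadata_rule)
```

A rule body that yields no field tags and no custom properties, such as the last one, contains only metadata predicates.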
You can also define or update the rule body using the Insert button. When using Insert, the placement of your cursor in the Body window determines the insertion point for your selection. If the cursor is placed in the code or on a code line, then the item is inserted at the point of placement. If the cursor is placed outside the code, then the item is inserted on a new line. When entering your definition details, you can select from the system suggestions to help you complete the field entries.
The following selections are available using Insert:
Custom Property
Enter a custom property name that exists in your system and click Insert Custom Property.
Datasource
Enter the data source name, then click Insert Datasource.
Field
Enter the resource name, select a field name from that resource, and then click Insert Field.
Tag
Enter the tag domain name, select a tag name from that domain, and then click Insert Tag.
Virtual Folder
Enter the virtual folder name, then click Insert Virtual Folder.
You can return the rule body to its default settings by clicking Reset.
Rule action
The rule action defines the action to be taken if the ruleBody evaluates to true (1).
A rule action is an array of actions and an action can apply only one tag. To apply multiple tags, you must submit a ruleAction for each tag.
In the actionAttributes body:
- The presence of the rule_action_field entry indicates field tagging. The field name specified is tagged with the tag provided in the rule_action_tag_name.
- The absence of the rule_action_field entry implies resource tagging. The resource is tagged with the tag provided in the rule_action_tag_name.
- The rule_action_threshold entry is used only with a data rule and defines the percentage of rows that should satisfy the rule before the rule action is applied.
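The three conventions above can be expressed as a short sketch. Both helpers are hypothetical and only mirror the description; treating a missing threshold as 0 and an empty resource as "not met" are assumptions, as is the "at least" comparison.

```python
def action_target(action):
    """Return 'field' when rule_action_field is present, otherwise 'resource'."""
    attributes = action.get("actionAttributes", {})
    return "field" if "rule_action_field" in attributes else "resource"


def threshold_met(matching_rows, total_rows, action):
    """Check whether the percentage of matching rows reaches rule_action_threshold."""
    # Assumption: a missing threshold counts as 0, and an empty resource never qualifies.
    threshold = float(action["actionAttributes"].get("rule_action_threshold", "0"))
    if total_rows == 0:
        return False
    return 100.0 * matching_rows / total_rows >= threshold


resource_action = {
    "actionType": "Tagging",
    "actionAttributes": {
        "rule_action_threshold": "60",
        "rule_action_tag_name": "SomeDomain/ResourceTagToBeApplied",
    },
}
field_action = {
    "actionType": "Tagging",
    "actionAttributes": {
        "rule_action_field": "FieldNameToBeTagged",
        "rule_action_threshold": "40",
        "rule_action_tag_name": "Field/TagToBeApplied",
    },
}
print(action_target(resource_action), action_target(field_action))
print(threshold_met(6, 10, resource_action), threshold_met(59, 100, resource_action))
```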
You can define the rule action by specifying action types in the Action window:
The following are the rule action types:
Tagging
When actionType is set to Tagging, the ruleAction makes tag associations based on rule evaluation. A tag suggestion can be applied on a specific field or on a qualifying resource. When applying a tag suggestion to a field, the field is identified in one of the following ways:
- Full field name.
- Wildcard field name with partial string match for field name. Wildcard strings should follow the Java regular expression pattern format.
- Referencing another field tag associated with the field.
Note: The Lumada Data Catalog rule framework does not create new tags. Any tag suggestions to be applied as part of rule action must be for existing tags. If an associated tag does not exist, Data Catalog displays an error message.
Remove Tagging
When actionType is set to remove_tagging, the ruleAction removes the tag associations based on the rule evaluation.
Properties
When actionType is set to Properties, the ruleAction sets custom property values.
Property values are strings. You specify property names with @@ and its string is substituted.
You can use property actions to set and reset property values. To reset a property value, leave the field empty.
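Because wildcard field names follow regular expression syntax rather than shell-style globbing, it can help to test a pattern before using it in a Tagging action. The sketch below uses Python's `re` module, which behaves the same as Java regular expressions for these simple patterns. The `matching_fields` helper and the field names are hypothetical, and matching against the whole field name is an assumption about how the pattern is applied.

```python
import re


def matching_fields(pattern, field_names):
    """Return the field names that the pattern matches in full."""
    regex = re.compile(pattern)
    return [name for name in field_names if regex.fullmatch(name)]


fields = ["email", "email_work", "emaill", "phone"]

# ".*" means "any characters", so this selects every name starting with "email".
prefix_matches = matching_fields("email.*", fields)

# A bare "*" repeats the preceding character ("l"), which is rarely what is intended.
star_matches = matching_fields("email*", fields)

print(prefix_matches)
print(star_matches)
```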
The following code sample gives an example of each action type:
{
  "ruleActions": [
    {
      "actionType": "Tagging",
      "actionName": "TagFieldByFieldName",
      "actionDisplayName": "Bind FieldTag by matching fieldName",
      "actionAttributes": {
        "rule_action_field": "FieldNameToBeTagged",
        "rule_action_threshold": "40",
        "rule_action_tag_name": "Field/TagToBeApplied"
      }
    },
    {
      "actionType": "Tagging",
      "actionName": "TagFieldByAnotherFieldTag",
      "actionDisplayName": "Bind FieldTag by matching another fieldTag",
      "actionAttributes": {
        "rule_action_field": "@SomeDomain/AnotherFieldTag",
        "rule_action_threshold": "40",
        "rule_action_tag_name": "SomeDomain/NewTagToBeApplied"
      }
    },
    {
      "actionType": "Tagging",
      "actionName": "TagFieldByWildcardFieldName",
      "actionDisplayName": "Bind FieldTag by matching wildcard fieldName",
      "actionAttributes": {
        "rule_action_field": "WildcardFieldNam.*",
        "rule_action_threshold": "40",
        "rule_action_tag_name": "SomeDomain/FieldTagToBeApplied"
      }
    },
    {
      "actionType": "Tagging",
      "actionName": "Resource Tagging",
      "actionDisplayName": "Tag Resources with Resource/TagToBeApplied",
      "actionAttributes": {
        "rule_action_threshold": "60",
        "rule_action_tag_name": "SomeDomain/ResourceTagToBeApplied"
      }
    },
    {
      "actionType": "remove_tagging",
      "actionName": "removeFieldTag",
      "actionDisplayName": "TagName2BeRemoved",
      "actionAttributes": {
        "rule_action_threshold": "60",
        "rule_action_tag_name": "SomeDomain/TagToBeRemoved"
      }
    },
    {
      "actionType": "Properties",
      "actionName": "UpdatePropertyValue",
      "actionDisplayName": "propValueUpdate",
      "actionAttributes": {
        "rule_action_property_name": "propName2BeChanged",
        "rule_action_property_value": "newPropValue"
      }
    },
    {
      "actionType": "Properties",
      "actionName": "ResetPropertyValue",
      "actionDisplayName": "propValueReset",
      "actionAttributes": {
        "rule_action_property_name": "propName2BeReset",
        "rule_action_property_value": ""
      }
    }
  ]
}
You can also define or update the rule action using the Add Template button. When using Add Template, the template is inserted on a new line. When entering your definition details, you can select from the system suggestions to help you complete the field entries.
The following selections are available using Add Template:
Tagging template
Adds a tagging template.
Remove Tagging template
Removes a tagging template.
Property template
Adds a property template.
Reset text
Removes any changes and returns the rule action default settings.
Sample rules
You can use these examples of metadata and data rules to help you write rules for your implementation of Lumada Data Catalog:
Metadata rule samples
Metadata rules work with the metadata associated with the resource or field.
The metadata rule examples below show the following situations:
- Field tag binding using a field name
- Field tag binding using a wildcard or partial match of a field name
The following sample rule validates the presence of Built-in_Tags.Social_Security_Number_Delimited for all resources within the HDFS virtual folder that carry the resource tag CA.Employee and contain the field tags Built-in_Tags.Last_Name and Built-in_Tags.US_Address in either the ACCEPTED or SUGGESTED state, then applies the field tag PII.Sensitive to the field named "SSN".
{
  "name": "Sensitive_Tag_Completeness",
  "description": "If field Tags Built-in_Tags/Last_Name Built-in_Tags/Address Built-in_Tags/social_security present then add PII/Sensitive tag to SSN field",
  "ruleBody": "hasFieldTag(Built-in_Tags/Social_Security_Number_Delimited) = 1",
  "metadataRule": [],
  "ruleScope": {
    "virtualFolders": [ "HDFS" ],
    "sourcePropertyFilters": {},
    "fieldTags": [ "Built-in_Tags/Last_Name", "Built-in_Tags/US_Address" ],
    "resourceTags": [ "CA/Employee" ],
    "tagStates": [ "ACCEPTED", "SUGGESTED" ]
  },
  "ruleActions": [
    {
      "actionType": "Tagging",
      "actionName": "PII",
      "actionDisplayName": "PII",
      "actionAttributes": {
        "rule_action_field": "SSN",
        "rule_action_threshold": "40",
        "rule_action_tag_name": "PII/Sensitive"
      }
    }
  ]
}
ruleBody
This field validates the presence of the field tag Built-in_Tags.Social_Security_Number_Delimited in the resources.
ruleScope
This field applies various filters that scope the rule to specific resources:
virtualFolders
This field limits the rule evaluation to the virtual folder named HDFS.
fieldTags
This field further filters for resources that contain the field tags Built-in_Tags.Last_Name and Built-in_Tags.US_Address.
resourceTags
This field further limits evaluation to resources containing the CA.Employee resource tag.
tagStates
This field restricts the rule evaluation to tags in either the ACCEPTED or SUGGESTED state.
ruleActions
This field specifies the attributes for applying the action, in this case attaching PII.Sensitive to the field SSN in all resources to which the rule applies.
rule_action_threshold
This field specifies the minimum percentage evaluation match for the action to be taken. In the above example, all the resources where 40 percent or more of the data passes the rule evaluation are tagged with the PII.Sensitive field tag.
The following sample rule validates the presence of the field tag Built-in_Tags.Email in all resources within the DQM, Holdings, and Bank_Retail virtual folders, then applies the field tag PII.contactType to the fields beginning with email*.
{
  "name": "EmailTagger",
  "ruleBody": "hasFieldTag(Built-in_Tags/Email)=1",
  "ruleScope": {
    "virtualFolders": ["DQM", "Holdings", "Bank_Retail"],
    "sourcePropertyFilters": {},
    "fieldTags": [],
    "resourceTags": [],
    "tagStates": ["ACCEPTED", "SUGGESTED"]
  },
  "ruleActions": [
    {
      "actionType": "Tagging",
      "actionName": "tag",
      "actionDisplayName": "tag",
      "actionAttributes": {
        "rule_action_field": "email*",
        "rule_action_tag_name": "PII/contactType",
        "rule_action_threshold": "0"
      }
    }
  ]
}
ruleBody
This field validates the presence of the field tag Built-in_Tags.Email in the resources.
ruleScope
This field applies various filters that scope the rule to specific resources:
virtualFolders
This field limits the rule evaluation to only the virtual folders named DQM, Holdings, and Bank_Retail.
tagStates
This field restricts the rule evaluation to tags in either the ACCEPTED or SUGGESTED state.
ruleActions
This field specifies the attributes for applying the action, in this case attaching PII.contactType as a field tag to all fields containing email in all qualifying resources.
rule_action_threshold
This field specifies the minimum percentage evaluation match for the action to be taken. In the above example, all the resources where 0% or more of the data passes the rule evaluation are tagged with the PII.contactType field tag.
Data rule samples
A data rule inspects the data for a field or field tag when evaluating a resource.
The data rule examples below show the following situations:
- Resource tag and field tag binding
- Custom property setting and field tag removal
This rule is for resource tag and field tag binding on resources that contain a certain type of data. It checks all resources with the resource tag DQM.Employee and examines the data in fields tagged with DQM.taxCode and DQM.stateCode for qualifying data, then attaches the resource tag DQM.CA_Employee to the resource and the DQM.CA_Tax tag to the field tagged with DQM.stateCode.
{
  "name": "CA_Tag_Rule",
  "ruleBody": "(@DQM/taxCode >= 100 and @DQM/taxCode <= 199) and @DQM/stateCode = '6A'",
  "metadataRule": [],
  "ruleScope": {
    "virtualFolders": ["DQM"],
    "sourcePropertyFilters": {},
    "fieldTags": [],
    "resourceTags": [ "DQM/Employee" ],
    "tagStates": [ "ACCEPTED", "SUGGESTED" ]
  },
  "ruleActions": [
    {
      "actionType": "Tagging",
      "actionName": "ResourceTagging",
      "actionDisplayName": "CA-ResourceTag",
      "actionAttributes": {
        "rule_action_threshold": "50",
        "rule_action_tag_name": "DQM/CA_Employee"
      }
    },
    {
      "actionType": "Tagging",
      "actionName": "FieldTagging",
      "actionDisplayName": "DQM-CA_Tax",
      "actionAttributes": {
        "rule_action_threshold": "10",
        "rule_action_tag_name": "DQM/CA_Tax",
        "rule_action_field": "@DQM/stateCode"
      }
    }
  ]
}
The rule elements are used as follows:
ruleScope
The ruleScope element scopes the rule to the following filters:
virtualFolders
This field limits the rule evaluation to only the virtual folder named DQM.
resourceTags
This field filters for the resources that have the DQM.Employee resource tag.
tagStates
This field restricts the rule to tags in either the ACCEPTED or SUGGESTED state.
ruleBody
The ruleBody element limits the rule to resources in which the fields tagged with DQM.taxCode have values between 100 and 199 and the data in the field tagged with DQM.stateCode contains the value "6A".
ruleActions
The ruleActions element is an array of two actions:
Resource tagging action
rule_action_tag_name
This field specifies the new tag DQM.CA_Employee to bind to qualifying resources.
rule_action_field
This field is left out intentionally to indicate resource tagging.
rule_action_threshold
This field specifies the minimum percentage evaluation match for the action to apply. In the above example, all the resources where 50% or more of the data passes the rule evaluation are tagged with the DQM.CA_Employee resource tag.
Field tagging action
rule_action_tag_name
This field specifies the DQM.CA_Tax tag to bind to the field identified by rule_action_field.
rule_action_field
The presence of rule_action_field indicates field tagging. Its value is a reference to the field tag DQM.stateCode, prefixed with @, which identifies the field to which the DQM.CA_Tax tag is bound.
Note: The rule_action_field is used in two ways:
- When you prefix the value in rule_action_field with @, the value is used as a tag.
- Without the @, this field is interpreted as the field name that is used for binding (full or wildcard).
rule_action_threshold
This field specifies the minimum percentage evaluation match for the action to apply. In the above example, the DQM.CA_Tax tag is bound in all the resources where 10% or more of the data passes the rule evaluation.
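The two interpretations of rule_action_field can be summarized in a few lines. The `describe_action_field` helper below is hypothetical and only restates the convention described in this section; it does not validate that the tag or field exists.

```python
def describe_action_field(value):
    """Classify a rule_action_field value as a field-tag reference or a field name."""
    if value.startswith("@"):
        # "@Domain/Tag": bind to whichever field carries this tag.
        return ("field_tag", value[1:])
    # Otherwise the value is a full or wildcard field name.
    return ("field_name", value)


by_tag = describe_action_field("@DQM/stateCode")
by_name = describe_action_field("SSN")
by_wildcard = describe_action_field("WildcardFieldNam.*")
print(by_tag, by_name, by_wildcard)
```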
This example is for a rule updating a custom property value and removing a field tag.
This rule checks for all resources with the resource tag DQM.Employee and field tag DQM.CA_Tax. It checks the data in fields tagged with DQM.stateCode for the value "6A". It also validates the value of the custom property data_owner, checking to see if the value is "Lara". If both these conditions are satisfied, the rule updates the value of data_owner to "Joe" and removes the tag DQM.CA_Tax bound to fields tagged with DQM.stateCode.
{
  "name": "Remove CA_Tax field tag and update data_owner custom property",
  "ruleBody": "(@DQM/stateCode = '6A') AND (@@data_owner = 'Lara')",
  "metadataRule": [],
  "ruleScope": {
    "virtualFolders": ["Banking", "DQM", "LDC-Warehouse"],
    "sourcePropertyFilters": {},
    "fieldTags": ["DQM/CA_Tax"],
    "resourceTags": ["DQM/Employee"],
    "tagStates": ["ACCEPTED", "SUGGESTED"]
  },
  "ruleActions": [
    {
      "actionType": "Properties",
      "actionName": "Update data_owner",
      "actionDisplayName": "data_owner",
      "actionAttributes": {
        "rule_action_property_name": "data_owner",
        "rule_action_property_value": "Joe"
      }
    },
    {
      "actionType": "remove_tagging",
      "actionName": "Remove CA_Tax field tag",
      "actionDisplayName": "rem-CA_Tax",
      "actionAttributes": {
        "rule_action_threshold": "10",
        "rule_action_tag_name": "DQM/CA_Tax"
      }
    }
  ]
}
ruleScope
This field scopes the rule to the following filters:
resourceTags
This field filters for the resources with the DQM.Employee resource tag.
virtualFolders
This field limits the rule evaluation to only the virtual folders named Banking, DQM, and LDC-Warehouse.
fieldTags
This field filters for resources containing the DQM.CA_Tax field tag.
tagStates
This field restricts the rule evaluation to tags in either the ACCEPTED or SUGGESTED state.
ruleBody
This field checks that the data values for a field tagged with DQM.stateCode are "6A" and that the value of the custom property data_owner is "Lara".
ruleActions
This rule element contains an array of two actions:
Updating custom property value
rule_action_property_name
This field specifies the property name for which the value will be updated, in this case, data_owner.
rule_action_property_value
This field specifies the new value to which the custom property will be updated, in this case, "Joe".
rule_action_threshold
This field is intentionally left out for rule actions on custom properties.
Remove field tag action
rule_action_tag_name
This field specifies the tag DQM.CA_Tax to be removed.
rule_action_field
This field is intentionally left out for the remove_tagging action.
rule_action_threshold
This field specifies the minimum percentage evaluation match for the action to apply to the resource. In this example, the DQM.CA_Tax tag is removed in all the resources where 10% or more of the data passes the rule evaluation.
Requirements for writing rules
Avoid errors by strictly following these requirements when writing rules:
- When using tags in the ruleBody for a data query, you must prefix the tags with the @ qualifier. In the absence of the @ qualifier, a tag D.t is interpreted as a column name which may or may not exist and the corresponding results may be misreported.
- When evaluating rules to set custom properties, you must prefix the custom property with the @@ qualifier.
- Lumada Data Catalog supports minimal SQL functions in the rule definition such as AND, OR, <, >, IN, and length().
- Data Catalog supports CASE statements in predicates with the following syntax:
CASE valueExpression whenClause+ (ELSE elseExpression=expression)? END #simpleCase
or
CASE whenClause+ (ELSE elseExpression=expression)? END #searchedCase
- All tags specified in the actionAttributes field need to pre-exist.
- In rule syntax, a tagDomain.Tag needs the dot replaced with a forward slash. For example, enter Built-in_Tags/Last_Name instead of Built-in_Tags.Last_Name. This replacement should be made regardless of the part of the rule in which it is located.
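When rule text is generated from tag names as they are displayed (with a dot), a small helper can apply the slash, "@", and "@@" requirements consistently. The functions below are hypothetical conveniences, not part of Data Catalog, and they assume that a tag domain name never contains a dot itself.

```python
def to_rule_tag(tag_name):
    """Convert 'Domain.Tag' to 'Domain/Tag'; names already using '/' are unchanged."""
    if "/" in tag_name:
        return tag_name
    domain, separator, tag = tag_name.partition(".")
    return f"{domain}/{tag}" if separator else tag_name


def field_tag_ref(tag_name):
    """Reference the data tagged by a field tag in a data query."""
    return "@" + to_rule_tag(tag_name)


def property_ref(property_name):
    """Reference a custom property in a rule body."""
    return "@@" + property_name


body = f"{field_tag_ref('DQM.stateCode')} = '6A' AND {property_ref('data_owner')} = 'Lara'"
print(body)
```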
Rule workflow
On the Rules Settings page, you can create, update, edit, and delete rules.
Create a rule
Procedure
Click Manage on the menu bar, then select Rules.
The Rules page opens.
Click + Create New Rule.
Enter the following information for your rule:
Field | Description |
Name | Enter the unique name of the rule that your users will recognize. Names must start with a letter, and must contain only letters, digits, hyphens, or underscores. White spaces in names are not supported. |
Current status | Select the status of the rule. You can select Enabled or Disabled for your rule. When a Rules Execution job is triggered, disabled rules are skipped and are not evaluated. By default, a new rule is Disabled. When you select Enabled, all referenced names (custom properties, tags, fields, resources, virtual folders, etc.) are verified for accuracy in the system. |
Scope | Define the filters for evaluating the rule. You can edit the rule manually or use the Insert button on the Scope window, as described in Rule scope. |
Body | Define the rule that will be evaluated using rule syntax. You can edit the rule manually or use the Insert button on the Body window, as described in Rule body. |
Action | Define the action taken once the rule evaluation is accepted as true. This action can be associating a field or resource tag, removing an associated tag, or updating a custom property value. You can edit the rule manually or use the Add Template button on the Action window, as described in Rule action. |
Click Create to save your rule.
The rule is created. If there is a problem while creating your rule, an error notification displays at the top of the page. Resolve the error and click Create.
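The naming requirement for rules can be checked before you submit a rule. The sketch below is a hypothetical pre-check, not the validation that Data Catalog performs, and it assumes that "letters" means ASCII letters.

```python
import re

# Starts with a letter; then only letters, digits, hyphens, or underscores.
RULE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def is_valid_rule_name(name):
    """Return True when the name satisfies the documented naming requirement."""
    return RULE_NAME.fullmatch(name) is not None


for candidate in ["EmailTagger", "CA_Tag_Rule", "1st_rule", "my rule"]:
    print(candidate, is_valid_rule_name(candidate))
```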
Update a rule
Perform the following steps to edit a rule:
Procedure
Click Manage on the menu bar, then select Rules.
The Rules page opens.
Locate the rule you want to edit, click its More actions icon, and then select the Edit option from the drop-down menu.
Edit the fields as needed.
Click Create to save your rule.
The rule is saved with your changes. If there is a problem while updating your rule, an error notification displays at the top of the page. Resolve the error and click Create.
Delete a rule
Procedure
Click Manage on the menu bar, then select Rules.
The Rules page opens.
Select the rule you want to delete.
Click the Delete icon. Alternatively, select the More actions icon, then click Delete from the drop-down menu.
Click Save.
Rule execution
Like any job, you can trigger a Lumada Data Catalog rule as an independent job sequence or as a job template.
Even when triggered as a job sequence from a resource or virtual folder, the rule execution job runs across the entire Data Catalog, not just the resource or virtual folder from which it was triggered.
The command line syntax to execute rules is as follows:
<Agent>$ bin/ldc executeRules [-virtualFolder <VF name> [-path <path to a single resource only>]] \
    [--<system parameters for driver-memory/executor-memory/etc.>]
Where:
-virtualFolder
When specified with the above command, the virtual folder <VF name> overrides the scope specified in the rules.
-path
Specifies the path to a specific resource, file, or table. Use with the -virtualFolder parameter for rule execution on a specific file or table. Rule execution is not recursive, so if -path points to a directory or database, this parameter is ignored.
--<system parameters>
Specifies any optional system-specific parameters such as driver-memory or executor-memory.
Rule execution report
A rule execution report summarizes, for all the rules, how well each rule evaluates against the resources in Lumada Data Catalog.
To generate a rule execution report, submit a job template with additional parameters.
The command syntax to enter in the Command line options field is as follows:
<Agent>$ bin/ldc executeRules [-virtualFolder <VF name> [-path <path to a single resource only>]] \
    [-generateReport <true> -reportName <Name of the report being generated>] \
    [--<system parameters for driver-memory/executor-memory/etc.>]
Where the options are defined as follows:
-virtualFolder
When specified with the above command, the virtual folder <VF name> overrides the scope specified in the rules.
-path
Path to a specific resource, file, or table. Use with the -virtualFolder parameter for rule execution on a specific file or table. Rule execution is not recursive, so if the path points to a directory or database, this parameter is ignored.
-generateReport
If this parameter is passed, rule execution generates a report with the name specified by the -reportName parameter.
-reportName
User-defined name for the report being generated. Use with the -generateReport parameter.
--<system parameters>
Any optional system-specific parameters such as driver-memory or executor-memory.
You can also use -reportsFolder <server folder> to specify a folder for generating reports.
By default, all reports are generated in the /var/log/ldc/generatedReports directory. If you do not provide a report name, Data Catalog randomly generates a unique name that is shown on the command prompt.
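If you assemble the Command line options value in a script, the option dependencies described above can be enforced before the job is submitted. The sketch below only builds the argument list and does not run anything. The `execute_rules_command` function is hypothetical, it uses only the options named in this section, and it omits the system parameters.

```python
import shlex


def execute_rules_command(virtual_folder=None, path=None, generate_report=False,
                          report_name=None, reports_folder=None):
    """Build the executeRules argument list from the documented options."""
    if path and not virtual_folder:
        raise ValueError("-path must be used together with -virtualFolder")
    if report_name and not generate_report:
        raise ValueError("-reportName must be used together with -generateReport")
    command = ["bin/ldc", "executeRules"]
    if virtual_folder:
        command += ["-virtualFolder", virtual_folder]
    if path:
        command += ["-path", path]
    if generate_report:
        command += ["-generateReport", "true"]
    if report_name:
        command += ["-reportName", report_name]
    if reports_folder:
        command += ["-reportsFolder", reports_folder]
    return command


command = execute_rules_command("DQM", generate_report=True, report_name="dqm_rules")
print(shlex.join(command))
```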
Sample report
Based on the sample rules explained in the Metadata rule samples section, the rule execution report looks similar to the following example:
"Sensitive_Tag_Completeness" is the sample metadata rule and "CA_Tag_Rule" is the sample data rule explained in Data rule samples.
You can use the percentage match to identify the data quality, since fewer matches indicate lower quality data. The percentage setting is governed by rule_action_threshold, which also controls the amount of data in the report.