Hitachi Vantara Lumada and Pentaho Documentation

Managing rules

Organizations need to understand the quality of their data to understand its fit for use. They also need to manage their metadata at scale without requiring significant human effort, especially for repeated tasks. With Data Catalog's rules framework, you can define, execute, and manage business rules. These rules can evaluate data and metadata properties to add terms, remove terms, modify custom property values, evaluate data quality metrics, and perform several other actions.

Rule components

Data and metadata rules execute based on the defined rule scope, criteria, and action on all qualified data entities.

You must use the Data Catalog rule components, which provide constructs for expressing scope, criteria, and actions. You can define all of these constructs based on actual business terms, field names, custom properties, and business term association states.

The rule components are:

  • Rule scope

    Sets the scope of resources on which the rule is evaluated and applied.

  • Rule criteria

    Defines the condition.

  • Rule action

    Defines the action to take on resources that conform to the rule's evaluation, such as add term, remove term, set custom property values, set quality dimension, set term assignment state, compute data quality score, and set sensitivity.

Note: Data Catalog always refers to terms by their fully qualified name, in the form <Glossary Name>/<Parent Term>.<Child Term>.<Grandchild Term>. Separate a parent term from its child term with a dot (.). For example, in @Insurance/HomeOwners.State_VA, HomeOwners is a parent term and State_VA is a child term.
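As a rough illustration of the naming convention in the note above, the following Python sketch splits a fully qualified term name into its glossary and term hierarchy. The function name parse_term is hypothetical and is not part of Data Catalog, and the sketch assumes that glossary and term names themselves contain no "/" or "." characters.

```python
def parse_term(qualified_name: str) -> tuple[str, list[str]]:
    """Split '<Glossary>/<Parent>.<Child>' into (glossary, [parent, child, ...]).

    Hypothetical helper for illustration only. A leading '@' (used in data
    rule criteria) is ignored.
    """
    name = qualified_name.lstrip("@")
    glossary, sep, term_path = name.partition("/")
    if not sep or not glossary or not term_path:
        raise ValueError(f"not a fully qualified term name: {qualified_name!r}")
    return glossary, term_path.split(".")


glossary, terms = parse_term("@Insurance/HomeOwners.State_VA")
print(glossary, terms)  # Insurance ['HomeOwners', 'State_VA']
```

A term without a child, such as HomeInsurance/State-NJ, yields a single-element term list.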

Rules creation

To create rules, click Management in the left navigation menu to open the Manage Your Environment page, and then click Business Rules.

On the Business Rules page, you can create, run, track, and manage all rules in Data Catalog.

You can create business rules in two ways:

  • Create a new rule by entering the scope, action, and criteria on the Add New Rule page, and then saving it.
  • Create reusable blocks for scope, action, and criteria. You can load these blocks at any later time by using the Load option on the Add New Rule page when you create a new rule. This is the recommended method because it is optimized for reusability.

When you create rules, they are translated automatically into concrete rules that are bound and executed on individual data resources. This translation occurs regardless of which format and platform the resource is in, such as JDBC tables, Hive tables, CSV, Avro, JSON, or any other file format that Data Catalog supports.

Note: Data Catalog data types differ from the original data types used in databases and file systems. For example, the int data type can be represented as Integer in Data Catalog. You can view data types by clicking the Details tab when you are viewing a resource in the Data Canvas.

You can view the following sample applications of the rules framework:

Rule scope

The rule scope defines the resources for rule execution. When configuring a rule, you can define the rule scope by specifying scope types in the Set Scope section.

  • Virtual folders

    You must include at least one virtual folder. When a single virtual folder is listed, the rule runs against all the resources in the listed virtual folder. You can select multiple virtual folders from the navigation tree.

  • Resource terms

    You can select multiple terms from multiple glossaries in the navigation tree. Selecting resource terms narrows the scope to include only the resources (files and tables) that have the selected resource-level terms.

  • Field terms

    You can select multiple terms from multiple glossaries in the navigation tree. Selecting field terms narrows the scope to include only the resources (files and tables) that contain columns and fields associated with the selected terms.

  • Custom properties

    You can select custom properties from custom property groups in the navigation tree. Selecting custom properties narrows the scope to include only the resources (files and tables) that have the selected resource custom properties.

  • Term states

    You can further filter the resources based on the term association state if you have selected Resource terms or Field terms while defining the rule scope. The possible values are ACCEPTED, SUGGESTED, and REJECTED. If you do not specify a term association state, then all states will be considered.

Rule criteria

The rule criteria define the condition that is translated into a query and evaluated against every qualifying resource defined in the rule scope. You can define the rule criteria either by creating a criteria block and loading it into a rule later, or by entering the criteria manually while creating a new rule.

For example, you can insert a clause that determines what the rule body acts on. The query clause determines if the rule acts on metadata or actual resource data.

Important: Data Catalog uses the JDK regular expression (regex) engine to process regular expressions. When you are editing business rules, make sure your regular expressions conform to JDK regex syntax so Data Catalog can process them accurately.
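The matchRegEx criteria described later in this article accept several patterns, which are combined with OR, and separate criteria can be joined with AND. The Python sketch below mimics that behavior with the standard re module purely as a stand-in: Data Catalog itself uses the JDK engine, the helper matches_any is hypothetical, and the patterns are anchored explicitly so the result does not depend on whether an engine matches the whole value or only part of it.

```python
import re


def matches_any(value: str, *patterns: str) -> bool:
    """Return True if the value matches at least one pattern (OR semantics)."""
    return any(re.search(p, value) for p in patterns)


cities = ["Boston", "Kyoto", "Baku", "Lima"]

# Analogous to: City matchRegEx("B.*", "K.*")
starts_b_or_k = [c for c in cities if matches_any(c, r"^B.*", r"^K.*")]

# Analogous to: City matchRegEx("^B.*") AND City matchRegEx("u$")
starts_b_ends_u = [
    c for c in cities if matches_any(c, r"^B.*") and matches_any(c, r"u$")
]

print(starts_b_or_k)    # ['Boston', 'Kyoto', 'Baku']
print(starts_b_ends_u)  # ['Baku']
```

Simple patterns like these behave the same way in both engines, but more advanced constructs can differ, so check production patterns against JDK regex syntax.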

Rule criteria on resource data

The rule criteria on resource data operates on field terms in combination with field values. For example, the rule criteria (@EMS/Category >= 100 and @EMS/Category <= 199) and @EMS/Tax_State = '6A' inspects the data in the field tagged with EMS/Category for values between 100 and 199, when the data in the field tagged with EMS/Tax_State has a value of "6A".

Rule criteria can also operate on field values by using field names, so a rule on resource data can act directly on the actual values of a field or column. For example, the rule criteria Country = 'USA' inspects the data in a field named Country for the value USA.

Rule criteria defined on resource data inspect the data of the file when the rule is evaluated: Data Catalog identifies the associated fields and checks the data in those fields.
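To make the evaluation concrete, here is a minimal Python sketch of what the criteria (@EMS/Category >= 100 and @EMS/Category <= 199) and @EMS/Tax_State = '6A' express. It is only an illustration under stated assumptions, not how Data Catalog executes rules: the field names cat_code and tax_state, the term-to-field mapping, and the sample rows are all hypothetical.

```python
# Hypothetical mapping from business terms to the fields they are tagged on.
term_to_field = {
    "EMS/Category": "cat_code",
    "EMS/Tax_State": "tax_state",
}

# Hypothetical sample rows from a resource in the rule scope.
rows = [
    {"cat_code": 150, "tax_state": "6A"},
    {"cat_code": 250, "tax_state": "6A"},
    {"cat_code": 120, "tax_state": "7B"},
]


def value_of(row: dict, term: str):
    """Look up the value of the field tagged with the given term."""
    return row[term_to_field[term]]


def satisfies(row: dict) -> bool:
    """Mirror of: (@EMS/Category >= 100 and @EMS/Category <= 199)
    and @EMS/Tax_State = '6A'"""
    category = value_of(row, "EMS/Category")
    return 100 <= category <= 199 and value_of(row, "EMS/Tax_State") == "6A"


matching = [row for row in rows if satisfies(row)]
print(matching)  # [{'cat_code': 150, 'tax_state': '6A'}]
```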

Depending on the rule type, you can use the rule criteria queries shown in the table, where the field term is a fully qualified term name including the domain that it is associated with:

Note: Prefixing a fully qualified field term with an "@" indicates the rule operates on the data tagged by the term.

Example of a fully qualified term name: if the glossary is HomeInsurance and the term is State-NJ, then the fully qualified name of the term is HomeInsurance/State-NJ.

Rule criteria must be written using a specific syntax to execute as desired. For syntax requirements, see Rule criteria requirements and syntax.

Note: The following resource data rule criteria can be used only in Business Rules.

Evaluate data in fields using field name.

  • Syntax: Field ('field name') AND 'Field' = 'Auto'
  • Syntax example: Country IN('USA', 'Australia', 'Japan') AND 'Insurance Type' = 'Auto'

    The first part specifies the names of the fields that are being evaluated, along with the values that are associated with them. In this case, the fields are 'Country' and 'Insurance Type', and the values are 'USA', 'Australia', 'Japan', and 'Auto'.

    The second part specifies the logical conditions that must be satisfied for the data to evaluate as true. In this case, the Country field must be one of 'USA', 'Australia', or 'Japan', and the 'Insurance Type' field must be 'Auto'.

  • Notes: NA

Evaluate data in fields using field terms.

  • Syntax: @`Glossary/Term1` = "someValue"
  • Syntax example: @`Glossary/Term1` = 'someValue' AND (@`Glossary/fieldTerm1` >= 100 AND @`Glossary/fieldTerm1` <= 199) AND @`Glossary/fieldTerm2` = 'some_value'

    @`Glossary/Term1` = 'someValue' requires the field tagged with the term Term1 in the glossary named Glossary to have a value equal to 'someValue'.

    @`Glossary/fieldTerm1` >= 100 AND @`Glossary/fieldTerm1` <= 199 adds the condition that the field tagged with the term fieldTerm1 must have a value greater than or equal to 100 and less than or equal to 199.

    @`Glossary/fieldTerm2` = 'some_value' adds the condition that the field tagged with the term fieldTerm2 must have a value equal to 'some_value'.

    All four comparisons in this example must be met to satisfy the rule criteria.

Evaluate the length of field values by querying for both field name and field term.

  • Syntax: length(field) > 10 OR length(@`Field/Field Term`) > 10
  • Syntax example: length(Username) > 10 OR length(@Domain1/Term1) > 10

    length(Username) > 10 checks whether the value of the field Username is longer than 10 characters. The second clause works the same way on a term: for example, length(@`Customer/Email`) > 10 checks whether the value of the field associated with the term Email in the Customer glossary is longer than 10 characters.

  • Notes: NA

Evaluate the uniqueness of a field using field name.

  • Syntax: fieldName isunique
  • Syntax example: UserID isunique

    UserID is the name of the field that is checked for uniqueness.

  • Notes: NA

Check for the existence of all the given values in the named field.

  • Syntax: fieldname containsall
  • Syntax example: User containsall('John', 'Matt', 'Winston')
  • Notes: NA

Evaluate data in fields with nested terms and terms with spaces using field terms.

  • Syntax: @`Glossary/ParentTerm1.childTerm` >= 100 AND @`Glossary.Term2` = 'value'
  • Syntax example: @Glossary/ParentTerm1.childTerm >= 100 and @`Glossary. Term2` = "value"

    @Glossary/ParentTerm1.childTerm >= 100 evaluates whether data in fields associated with the term childTerm has a value greater than or equal to 100.

  • Notes: NA
matchRegEx matches the column or field values against a regular expression.

  • Syntax: matchRegEx ("RegEx pattern")
  • Syntax example: If there is a column or field named City, which contains city names:

    City matchRegEx ("B.*") matches all the city names that start with the letter B.

    City matchRegEx ("B.*", "K.*") matches all the city names that start with either the letter B or the letter K. This is translated to City matchRegEx ("B.*") OR City matchRegEx ("K.*").

    City matchRegEx ("^B.*") AND City matchRegEx ("u$") matches all the city names that start with the letter B and end with the letter u.

  • Notes: To limit the search to a specific column instead of the entire table, specify the column name before matchRegEx.
The to_date function allows columns that are detected as string type to be compared with date values. For more information on writing the to_date function, see Requirements and guidelines for writing to_date function in rule criteria.

The to_date function can be used in the following ways:

  • Syntax: to_date(`column name`, `date format of the column`)

    Syntax example: to_date(`registration_date`, `yyyy-MM-dd`)

    Notes: The `date format of the column` must always be enclosed in backticks.

  • Syntax: to_date(@Glossary/BusinessTerm, `date format of the associated column`)

    Syntax example: to_date(@Booking/Date, `yyyy-MM-dd`), where Booking is the glossary name and Date is the business term. Use this form when a term is associated with columns in different files that have the same date format.

    Notes: The `date format of the associated column` must always be enclosed in backticks.

  • Syntax: to_date(`date string in yyyy-MM-dd format`)

    Syntax example: to_date(`2016-08-10`)

    Notes: The date string must always follow the yyyy-MM-dd format.
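The idea behind to_date, parsing string-typed values so they can be compared as dates, can be sketched in Python as follows. This is an illustration only: the helper below is a hypothetical stand-in for the Data Catalog function, it translates just three pattern letters (yyyy, MM, dd) into Python strptime directives, and the sample values are invented.

```python
from datetime import date, datetime

# Minimal, hypothetical translation of a few date pattern letters to
# Python strptime directives; real patterns support many more letters.
_TOKENS = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}


def to_date(value: str, pattern: str = "yyyy-MM-dd") -> date:
    """Parse a string value into a date using a yyyy/MM/dd-style pattern."""
    fmt = pattern
    for token, directive in _TOKENS.items():
        fmt = fmt.replace(token, directive)
    return datetime.strptime(value, fmt).date()


# Hypothetical string-typed column values.
registration_dates = ["2016-08-09", "2016-08-10", "2017-01-01"]

# Compare each parsed column value with a literal date, which is the kind of
# comparison to_date makes possible for string columns.
later = [d for d in registration_dates if to_date(d) > to_date("2016-08-10")]
print(later)  # ['2017-01-01']
```

Comparing the raw strings would only work by accident of the yyyy-MM-dd layout; parsing first keeps the comparison correct for other layouts such as dd/MM/yyyy.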

Rule criteria on resource metadata

Rule criteria defined on resource metadata evaluates against metadata discovered by Data Catalog. For example, the rule criteria hasFieldTerm(Built-in_Terms/Social_Security_Number_Delimited) = 1 checks for the presence of the field term Built-in_Terms/Social_Security_Number_Delimited.

For a custom property, include "@@" in the rule criteria. For example, the rule criteria @@business= 'MagnUX' operates on custom properties looking for the value of 'MagnUX'. The inclusion of "@@" indicates the rule is used for custom properties and is a metadata rule.

Depending on the rule type, you can use the following rule criteria queries:

Note: You must write rule criteria using a specific syntax for the rule to execute as desired. For syntax requirements, see Rule criteria requirements and syntax.

Each entry in the following table lists the function and description, syntax, syntax example, parameters, and where the function can be used.

Metadata rules

hasMetadataName: match on data source name.
Syntax: hasMetadataName(DataSource, Name/Regex)=1
Syntax example: hasMetadataName(DataSource,`airport_enrichment_data`)=1

Regex example: hasMetadataName(DataSource,`.*enrichmentdata.*`)=1

  • Type:

    DataSource

  • Name or Regex:

    Regular expression to match name or entire name.

Business Term discovery

(In rule execution datasource, virtualFolder is not applicable)

hasMetadataName: match on virtual folder name.
Syntax: hasMetadataName(VirtualFolder, Name/Regex)=1
Syntax example: hasMetadataName(VirtualFolder, `loandata`)=1
  • Type:

    Virtual folder

Business Term discovery
hasMetadataName: match on resource path.
Syntax: hasMetadataName(ResourcePath, Path_to_Resource/Regex)=1
Syntax example: hasMetadataName(ResourcePath, `/mssql_adworks/Person/ContactType`)=1
  • Type:

    Resource path

  • Name or Regex:

    Regular expression to match the name or entire name.

Business Rules and Business Term discovery
hasMetadataName: match on resource name.
Syntax: hasMetadataName(ResourceName, Name/Regex)=1
Syntax example: hasMetadataName(ResourceName, `.*_finance\.csv`)=1
  • Type:

    Resource name

  • Name or Regex:

    Regular expression to match the name or entire name.

Business Rules and Business Term discovery
hasMetadataName: match on field name.
Syntax: hasMetadataName(FieldName, Name/Regex)=1
Syntax example: hasMetadataName(FieldName,`.*thlete`)=1
  • Type:

    Field name

  • Name or Regex:

    Regular expression to match the name or entire name.

Business Rules and Business Term discovery
hasMetadataName: match on field path.
Syntax: hasMetadataName(FieldPath, Path_to_Field/Regex)=1
Syntax example: hasMetadataName(FieldPath, `/mssql_adworks/Person/ContactType/Name`)=1
  • Type:

    Field path

  • Name or Regex:

    Regular expression to match the name or entire name.

Business Rules and Business Term discovery
hasFieldName: match on field name.
Syntax: hasFieldName(Name/Regex)=1

Syntax example: hasFieldName(CreditCard)=1

Regex example: hasFieldName(`CreditCa.*`)=1

  • Name or Regex:

    Regular expression to match the name or entire name.

Business Term discovery
hasFieldType: checks for the data type of any field in the resource.
Syntax: hasFieldType(DataType, Name/Regex)=1

Example: hasFieldType(data type, CreditCardNumber)=1

Regex example: hasFieldType(long, `CreditCa.*`)=1

  • Type:

    Data type of the field.

  • Name or Regex:

    Regular expression to match the name or entire name.

Note: If the field name is not provided, the function checks whether any field in the resource has that data type.
Business Rules and Business Term discovery
hasFieldTermProximity: checks for the distance (proximity) between two field terms.
Syntax: hasFieldTermProximity(Field_Name/Regex, `Glossary/Term`) operator proximity
Syntax example: hasTermProximity(Built-in_Terms/First_Name, Built-in_Terms/Last_Name) <=2

Fully qualified Term name1

Fully qualified Term name2

Business Rules
matchFieldProximity: checks for the distance (proximity) between two field names.
Syntax: hasFieldProximity(Field1_Name/Regex, Field2_Name/Regex) operator proximity
Syntax example: hasFieldProximity(Firstname, Lastname)<=2
  • Field1:

    Field1 Name or fieldPath

  • Field2:

    Field2 Name or fieldPath

  • Proximity:

    Distance between two fields

Business Rules and Business Term discovery
hasOrdinal: checks for the position of a column (field) in an RDBMS table.
Syntax: hasOrdinal(ColumnName/Regex, ordinal)=1
Syntax example: hasOrdinal(Firstname,2)=1

Ordinals are zero-based because Data Catalog follows Java conventions. If the field Firstname appears as the third column in the Data Canvas UI, give the ordinal as 2.

  • Field name:

    Field name or field Path

  • Ordinal:

    Position number

Business Rules and Business Term discovery
hasFieldTerm: checks for the existence of a field term (suggested or accepted).
Syntax: hasFieldTerm(`Glossary/Term`)=1

Syntax example: hasFieldTerm(Built-in_Terms/Country)=1

  • Term name:

    Fully qualified name of the term

Business Rules
hasAcceptedFieldTerm: checks for the existence of an accepted field term.
Syntax: hasAcceptedFieldTerm(`Glossary/Term`)=1
Syntax example: hasAcceptedFieldTerm(Built-in_Terms/Country)=1
  • Term name:

    Fully qualified name of the term

Business Rules
hasResourceTerm: checks for the existence of a resource term.
Syntax: hasResourceTerm(`Glossary/Term`)=1
Syntax example: hasResourceTerm(`Built-in_Terms/3-Letter_Country_Code`)=1
  • Term name:

    Fully qualified name of the term

Business Rules
matchTermProximity: checks for the distance (proximity) between two resource terms.
Syntax: hasTermProximity(`Glossary/Term`, `Glossary/Term`)=1
Syntax example: hasTermProximity(`Parts/Part Name`,`Parts/Part Number`)=1
  • Field name:

    Fully qualified name of the term

  • Term name:

    Fully qualified name of the term

Business Rules and Business Term discovery
hasStatistic: checks for the field selectivity value.
Syntax: hasStatistic(Selectivity,Field_Name/Regex) operator selectivity_value
Syntax example: hasStatistic(Selectivity,`.*Port`)>=3

Regex example: hasStatistic(Selectivity,`.*pOrt.*`) >3

  • Name:

    Can be one of the values: selectivity, cardinality, stringMin, stringMax, numericMin, or numericMax

  • Field name or regex:

    Regular expression to match name or entire name

Business Rules and Business Term discovery
hasStatistic: checks for the field cardinality value.
Syntax: hasStatistic(Cardinality,Field_Name/Regex) operator cardinality_value
Syntax example: hasStatistic(Cardinality,`.*Port`)>=3

Regex example: hasStatistic(Cardinality,`.*pOrt.*`) >= 3

  • Name:

    Can be one of the values: selectivity, cardinality, stringMin, stringMax, numericMin, and numericMax

  • Field name or regex:

    Regular expression to match the name or entire name

Business Rules and Business Term discovery
metadataStringMin: minimum length of a string field.
Syntax: hasStatistic(name,Fieldname/regex)=1
Syntax example: hasStatistic(stringMin,Sport)>=3

Regex example: hasStatistic(stringMin,`.*Port`)>=3
  • Name:

    Can be one of the values: selectivity, cardinality, stringMin, stringMax, numericMin, and numericMax

  • Field name or regex:

    Regular expression to match the name or entire name

Business Term discovery
metadataStringMax: maximum length of a string field.
Syntax: hasStatistic(name,Fieldname/regex)=1
Syntax example: hasStatistic(stringMax,Sport)>=3

Regex example: hasStatistic(stringMax,`.*Port`)>=3

  • Name:

    Can be one of the values: selectivity, cardinality, stringMin, stringMax, numericMin or numericMax

  • Field name or regex:

    Regular expression to match the name or entire name

Business Term discovery
metadataNumericMin: minimum value of a numeric field.
Syntax: hasStatistic(name,Fieldname/regex)
Syntax example: hasStatistic(numericmin,Sport)>=3

Regex example: hasStatistic(numericmin,`.*Port`)>=3

  • Name:

    Can be one of the values: selectivity, cardinality, stringMin, stringMax, numericMin, or numericMax

  • Field name or regex:

    Regular expression to match name or entire name

Business Rules and Business Term discovery
metadataNumericMax: maximum value of a numeric field.
Syntax: hasStatistic(name,Fieldname/regex)
Syntax example: hasStatistic(numericmax,Sport)>=3

Regex example: hasStatistic(numericmax,`.*Port`)>=3

  • Name:

    Can be one of the values: selectivity, cardinality,stringMin, stringMax, numericMin, or numericMax

  • Field name or regex:

    Regular expression to match the name or entire name

Business Rules and Business Term discovery
lastProfiledTimeStamp: determines if the last profiled date of a resource matches a value.
Syntax: hasLastProfiledTimestamp(``) operator "MM-DD-YYYY HH:MM:SS"
Syntax example: hasLastProfiledTimestamp(``) > "10-01-2022 02:22:45"
  • Timestamp:

    Value in timestamp

Business Rules and Business Term discovery
matchPattern: matches the column or field values against an LDC data pattern.
Syntax: `column/field` matchPattern ("LDC data pattern")

Examples:

  • `City Code` matchPattern ("AANN") matches all City Code values that have two alphabetic characters, in uppercase or lowercase, followed by two digits.
  • `Date Of Birth` matchPattern ("NNNN-NN-NN") matches all dates with a pattern like 1998-09-08.
  • `Date Of Birth` matchPattern ("NNNN-NN-NN", "NN/NN/NNNN") matches all dates with a pattern like 1998-09-08 or 08/09/1998.
  • `Product Code` matchPattern ("NNNN") AND `Product Code` matchPattern ("NNNNNN") matches all values that have at most six consecutive digits.

The matchPattern parameter must be a valid LDC data pattern enclosed in double quotes.

The valid LDC data patterns are:

  • The letter N represents a digit.
  • The letter A represents an alphabetic character, in uppercase or lowercase.
  • A pattern can also contain the following special characters: ` ~ ! @ # $ % ^ & * _ ' = + | " ; : / ? . > , < - ]
Business Rules and Business Term discovery
inColumn: checks whether the values come from a particular column, which is provided as input to inColumn().
Syntax: `column_name` inColumn(`path to the column which has the valid data`)

If you have a table named Valid Zipcodes.csv that contains column values with correct zip codes, and you want to verify if all the values in the zip code column are valid, then use the following rule criteria: zipcode inColumn(`minio/1/Data1/Valid Zipcodes.csv/values`)

Provide the correct path to the column as a parameter to the inColumn.

  • The inColumn parameter can be from any source or virtual folder as long as it belongs to the same agent.
  • The inColumn parameter may not belong to the virtual folder selected in Scope.
  • The inColumn parameter column null values are not validated.
Business Rules and Business Term discovery
Unstructured
containsTerm: checks for the existence of a resource term.
Syntax: containsTerm(`Glossary/Term`)=1
Syntax example: containsTerm(Built-in_Terms/Country)=1

The condition is satisfied if any resource within a virtual folder has the term Country associated with it.

  • Term name:

    Fully qualified name of the term

Business Rules
hasDocumentType: determines the file type (jpeg, video, and so on).
Syntax: hasDocumentType(Type)=1

Example: hasDocumentType(jpeg) = 1

If the unstructured document type is "jpeg", this condition returns true.

  • Type:

    Type of the document

Rule
Data quality
hasFieldQuality: checks for a field's overall data quality.
Syntax: hasFieldQuality(Field_Name/Regex) operator field_quality_score
Syntax example: hasFieldQuality(`.*country.*`) >=30

If the resource has a field named "Country" and the data quality score of that field is 30 or higher, this condition returns true.

  • Field name or regex:

    Regular expression to match name or entire name

Business Rules
hasResourceQuality: checks for a resource's overall data quality.
Syntax: hasResourceQuality(``) operator resource_quality_score
Syntax example: hasResourceQuality(``) >=30

If a resource has a Data Quality Score of 30 or higher, this condition returns true.

No parameter

Business Rules
Term metadata rules
termConfidence: checks term confidence values.
Syntax: hasTermConfidence(`.*oUntry`) operator term_confidence_value
Syntax example: hasTermConfidence(`.*oUntry`) > 21

If any field in a resource has the name "Country", the term Built-in_Terms/Country is associated with that field, and the term confidence percentage is greater than 21, then this condition returns true.

  • Field name or regex:

    Regular expression to match the name or entire name

Business Rules
Truncate
roundDown(@@propertyName)
Syntax: roundDown(@@propertyName) = expected output of the roundDown
Syntax example: roundDown(@@12.678)=12

If the custom property holds the value 12.678, the function will return 12 as output.

For all the truncate functions, a property name can be either a document property or a custom property. The document property must be prefixed with ##, and the custom property must be prefixed with @@. Currently, only unstructured file properties are supported.

Business Rules

roundDown(##propertyName)
Syntax: roundDown(##propertyName) = expected output of the roundDown
Syntax example: roundDown(##pdfVersion)=1

If the document property holds the value 1.6, the function will return 1 as output.

Business Rules
roundDown AND hasResourceTerm
Syntax: roundDown(##property) > DecimalNumber AND hasResourceTerm(termname)
Syntax example: roundDown(##pdfVersion)>0.9 AND hasResourceTerm(HEALTHCARE/City)=1

When used alongside another metadata function, predicate operators for this function can be either AND or OR.

Business Rules
roundDown AND dateDiff
Syntax: roundDown(##property) AND dateDiff(`##property1`,`##property2`)
Syntax example: roundDown(##pdfVersion)>0.9 AND dateDiff(`##Modified`,`##Created`)=0

The roundDown and dateDiff functions are used together along with predicate operators.

Business Rules

Round

round()
Syntax: round(propertyName,`scale`)
Syntax example: round(##filesize,`-3`)=29000

If the document property holds the value 28672, the function will return 29000 as output.

This function takes a property name and a scale value.

The scale value can be negative or positive. If the scale value is negative, rounding occurs on the left-hand side of the decimal point. If it is positive, rounding occurs on the right-hand side. The scale value must be enclosed in backticks.

Business Rules
roundUp()
Syntax: roundUp(propertyName)
Syntax example: roundUp(@@12.123)=13

If the custom property holds the value 12.123, the function will return 13 as output.

A property can either be a document property or a custom property. To differentiate between the two, document properties should be prefixed with ## and custom properties should be prefixed with @@. Currently, only unstructured file properties are supported.

Business Rules

round() AND roundDown()
Syntax: round(propertyName) AND roundDown(propertyName)
Syntax example: round(##filesize,`-3`)=1000 AND roundDown(##pdfVersion)=1

The predicate operators for this function can be either AND or OR. The same property prefix rules apply (## for document properties, @@ for custom properties).

Business Rules

round() AND roundUp()
Syntax: round(propertyName) AND roundUp(propertyName)
Syntax example: round(##filesize,`-3`)=1000 AND roundUp(##pdfVersion)=2

The predicate operators for this function can be either AND or OR. The same property prefix rules apply (## for document properties, @@ for custom properties).

Business Rules
Trim
Removes the leading and trailing spaces in the document property.
Syntax: trim(`##document property`)
Syntax example: trim(`##File format`) = 'pdf'

For all the trim functions, a property can be either a document property or a business term. To differentiate between the two, the document property must be prefixed with ##, and the business term must be prefixed with @. Currently, only unstructured file properties are supported.

Business Rules

Removes the leading and trailing spaces in the business term.
Syntax: trim(@termname)
Syntax example: trim(@Built-in_Terms/Email) = 'xyz@gmail.com'

Business Rules

Filters based on the column, ignoring any leading or trailing spaces that may be present in the cells.
Syntax: trim(##Column) IN ('value1', 'value2')
Syntax example: trim(##Author) IN ('John','Mike')

The function is used in conjunction with the IN operator.

Filters based on whether the column Author contains either 'John' or 'Mike', while ignoring any leading or trailing spaces that may be present in the cells.

Business Rules
Filters to include only the rows where the specified column contains the specified value (after trimming any leading or trailing spaces).
Syntax: trim(##Column)='value' OR trim(##column)='value'
Syntax example: trim(##language)='English' OR trim(##author)='Rostyslav Gordin'

The predicate operators for this function can be either AND or OR.

Filters to include only the rows where the language column contains the value 'English' (after trimming any leading or trailing spaces), or the author column contains the value 'Rostyslav Gordin' (after trimming any leading or trailing spaces).

Business Rules
Trim time details from date and time or timestamp value
Trims or converts the document property time unit.
Syntax: trimTime(##document property, `time unit`)
Syntax example: trimTime(##modified, `hours`)="2023-02-15"

The function sets the given time unit, and every smaller unit, to zero. In this example, the second parameter is `hours`, so the hours and all smaller time units of the ##modified value are set to zero, which leaves only the date.

This function includes two parameters. The initial parameter allows any document property with a date and time value or a timestamp. Currently, only two document properties are available for use: ##modified and ##created, which denote the modified and created timestamp of the document.

The second parameter signifies the desired time unit to trim. The valid options include hours, minutes, seconds, and milliseconds, and they must be enclosed within backticks.

Business Rules
Trims the profiled time value.
Syntax: trimTime(`hasLastProfiledTimestamp()`, `time unit`)
Syntax example: trimTime(`hasLastProfiledTimestamp()`, `seconds`) <= "2023-03-13 21:30"

In this example, the time unit parameter is `seconds`, so the seconds and milliseconds are set to zero. The profiled time value is trimmed up to seconds, which leaves the hours and minutes intact.

This function takes two parameters. The first parameter is metadata that contains a date and time value or a timestamp value, accessed through the function hasLastProfiledTimestamp(), which retrieves the latest profiled time.

The second parameter specifies the time unit to be trimmed. The available options include hours, minutes, seconds, and milliseconds, and they must be enclosed within backticks.

Business Rules
Date difference
Calculates the difference in days between two dates.
Syntax: dateDiff(`##document property`,`@@custom property`)
Syntax example: dateDiff(`##Modified`,`@@DecimalNumber`)=8479

Timestamp is not considered while calculating the difference; the output is always in days.

For all the date difference functions, a property name can be either a document property or a custom property. To differentiate between the two, the document property must be prefixed with ##, and the custom property must be prefixed with @@. Currently, only unstructured file properties are supported.

Business Rules

Calculates the difference in days between two dates.
Syntax: dateDiff(`##document property1`,`##document property2`)
Syntax example: dateDiff(`##Modified`,`##Created`)= 50

Timestamp is not considered while calculating the difference; the output is always in days.

Business Rules

Calculates the difference in days between two dates.
Syntax: dateDiff(`##document property1`,`##document property2`)
Syntax example: dateDiff(`##Created`,`##Modified`)= -50

Depicts a scenario where the output is negative since the end date is in the future.

Business Rules
Calculates the number of days between the parameter date and the current date.
Syntax: dateDiffFromToday(`##document property`)
Syntax example: dateDiffFromToday(`##created`) = 30

Timestamp is not considered in the calculation.

Business Rules
Calculates the difference in days between a future date (represented by the @@futuredate parameter) and the current date.
Syntax: dateDiffFromToday(`@@custom property`)
Syntax example: dateDiffFromToday(`@@futuredate`) = -50

Depicts a scenario where the output is negative since the end date is in the future.

Business Rules
Current Date - Today
Returns the current date.
Syntax: today(``)='current date'
Syntax example: today(``)='2023-03-10'

Returns the current date in the default format 'yyyy-MM-dd'.

No parameter

Business Rules

Returns the current date in the user-provided format.
Syntax: today(`date format`)='current date'
Syntax example: today(`yyyy/dd/MM`)='2023/10/03'

In the example above, the format is 'yyyy/dd/MM', which specifies the year (yyyy), day of the month (dd), and month (MM) separated by forward slashes. This function returns the current date in the provided format.

Parameter: the format that you want to use for the current date.

Business Rules
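The arithmetic behind the rounding and date difference entries in the table above can be checked with a short Python sketch. These helpers are hypothetical stand-ins, not the Data Catalog functions themselves: the half-up rounding mode and the use of floor for roundDown are assumptions, and the sample timestamps are invented so that the dates are 50 days apart.

```python
import math
from datetime import datetime
from decimal import ROUND_HALF_UP, Decimal


def round_down(value: float) -> int:
    """Like roundDown in the table: 12.678 -> 12 (floor is an assumption)."""
    return math.floor(value)


def round_up(value: float) -> int:
    """Like roundUp in the table: 12.123 -> 13."""
    return math.ceil(value)


def round_scale(value: float, scale: int) -> Decimal:
    """Like round(property, `scale`): a negative scale rounds to the left of
    the decimal point. Half-up rounding is an assumption."""
    quantum = Decimal(1).scaleb(-scale)  # scale=-3 gives 1E+3
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


def date_diff(first: datetime, second: datetime) -> int:
    """Difference in whole days (first - second); time of day is ignored."""
    return (first.date() - second.date()).days


created = datetime(2023, 1, 1, 23, 30)
modified = datetime(2023, 2, 20, 0, 15)

print(round_down(12.678))            # 12
print(round_up(12.123))              # 13
print(int(round_scale(28672, -3)))   # 29000
print(date_diff(modified, created))  # 50
print(date_diff(created, modified))  # -50
```

Dropping the time of day before subtracting is what keeps the result at 50 even though the two timestamps are slightly less than 50 full days apart.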

Rule criteria requirements and syntax

You must write rule criteria using a specific syntax for the rule to execute as desired. There are general syntax requirements that you must use regardless of whether the rule is a metadata or a data rule. There are also syntax requirements specific to metadata and those specific to data rules.

Requirements for writing rule criteria for a metadata and data rule

You can avoid errors by adopting the following requirements when writing data and metadata rules:

Rule criteria elements    Syntax requirements

Business terms

Surround the entire fully qualified glossary and business term with backticks. For example,

`glossary_name/business_term.child_business_term`

If your term does not have a child term, do not include it. For example,

`Marketing/Campaign_ID`

`Data Quality Flags/Data Quality Issue`

`Products/Parts.Part_Number`

Field names

Surround the entire field name with backticks. For example,

`first_name`

`First Name`

Regular expressions

Surround regular expressions with backticks. For example,

hasOrdinal(`.*oUntry`,1)=1

Regular expressions are case insensitive. For example, the following rule criteria use a regular expression to match a field name. It matches both `Country` and `cOunTry`:

hasFieldName(`.*OUntry`)=1

SQL functions

Data Catalog supports a minimal set of SQL functions and operators in rule criteria, such as AND, OR, <, >, IN, and length().

Data Catalog's rules framework does not create new terms. It only attaches an existing term to the resource or field specified.

Requirements for writing rule criteria for a data rule

You can avoid errors by adopting the following requirements when writing data rules:

Rule criteria elements    Syntax requirements
Business terms

Prefix a fully qualified field term with an "@" to indicate the rule operates on data tagged by the glossary term, such as @`glossary_name/parent_term_name.child_term_name`

For example, @`Clinical Code/LONIC CODE` = "5792-7"

Custom property

Prefix a custom property name with "@@" to indicate the rule criteria is used for custom properties and is a metadata rule, such as @@property_name

For example, @@retention_date < "10-01-2022 12:00:00"
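
The quoting and prefix requirements above are easy to get wrong when criteria are assembled by hand. The following is a minimal Python sketch that builds criteria strings following those requirements. The helper names (`term_ref`, `field_ref`, `data_term_ref`, `custom_property_ref`) are hypothetical and are not part of Data Catalog; the sketch only formats text and does not validate that a term or property exists.

```python
def term_ref(glossary: str, *terms: str) -> str:
    """Return a backtick-quoted fully qualified term, for example `Glossary/Parent.Child`."""
    if not terms:
        raise ValueError("at least one term name is required")
    return "`" + glossary + "/" + ".".join(terms) + "`"


def field_ref(name: str) -> str:
    """Return a backtick-quoted field name or regular expression."""
    return "`" + name + "`"


def data_term_ref(glossary: str, *terms: str) -> str:
    """Return an @-prefixed term reference for use in data rule criteria."""
    return "@" + term_ref(glossary, *terms)


def custom_property_ref(name: str) -> str:
    """Return an @@-prefixed custom property reference."""
    return "@@" + name


print(term_ref("Products", "Parts", "Part_Number"))
print("hasFieldName(" + field_ref(".*OUntry") + ")=1")
print(data_term_ref("Clinical Code", "LONIC CODE") + ' = "5792-7"')
print(custom_property_ref("retention_date") + ' < "10-01-2022 12:00:00"')
```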

Requirements and guidelines for writing to_date function in rule criteria

You can avoid errors by adopting the following requirements when writing the to_date function:

  • Use the following symbols when writing the date format for the to_date function:
    Symbol    Meaning
    y         year
    M         month-of-year
    d         day-of-month
    H         hour-of-day (0-23)
    m         minute-of-hour
    s         second-of-minute
    S         fraction-of-second
  • The to_date function alone does not perform date comparisons and requires a comparison operator with another to_date function.

    For example:

    to_date(`start date`,`dd/MM/yyyy`)>= to_date(`2021-01-01`)

    to_date(@Time/Start_time,`yyyy-MM-ddTHH:mm:ssZ`)= to_date(`2022-12-15`)

    In the example above, `start date` is a column and `Start_time` is a business term in the glossary called Time.

  • The date format provided to the to_date function, as in to_date(@Time/Start_time,`yyyy-MM-ddTHH:mm:ssZ`), must match the actual data in the column; otherwise, the function cannot compare the date values correctly. To construct the date format, check the column data, replace each date component with its symbol, and leave the special characters as they are.

    For example:

    • If the column value is 2014-09-22 13:57:11.0 (you can check this data on the column's data canvas page), then the format is yyyy-MM-dd HH:mm:ss.S
    • If the column value is 12/Jan/2014, then the format is dd/MMM/yyyy
    • If the sample value is 1940-10-15T20:16:38Z, then the format is yyyy-MM-ddTHH:mm:ssZ
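
Before using a format in a rule, it can help to check it against a sample value. The following is a minimal Python sketch that translates the symbols listed above into Python `strptime` directives and tests whether a sample value parses. It is an approximation, not the parser that Data Catalog uses: the helper names and the symbol mapping are assumptions, and characters such as T and Z are treated as literal text because that is how the examples above use them.

```python
from datetime import datetime

# Assumed mapping from runs of pattern symbols to Python strptime directives.
_DIRECTIVES = {
    "yyyy": "%Y",
    "MMM": "%b",   # abbreviated month name, for example Jan
    "MM": "%m",
    "dd": "%d",
    "HH": "%H",
    "mm": "%M",
    "ss": "%S",
    "S": "%f",     # fraction of a second
}


def to_strptime(pattern: str) -> str:
    """Translate a pattern such as yyyy-MM-dd into strptime directives."""
    parts = []
    i = 0
    while i < len(pattern):
        # Collect a run of identical characters, for example "yyyy" or "MM".
        j = i
        while j < len(pattern) and pattern[j] == pattern[i]:
            j += 1
        run = pattern[i:j]
        # Unknown runs (separators, T, Z) are kept as literal text.
        parts.append(_DIRECTIVES.get(run, run.replace("%", "%%")))
        i = j
    return "".join(parts)


def matches(sample: str, pattern: str) -> bool:
    """Return True if the sample value can be parsed with the given pattern."""
    try:
        datetime.strptime(sample, to_strptime(pattern))
        return True
    except ValueError:
        return False


print(to_strptime("yyyy-MM-dd HH:mm:ss.S"))
print(matches("2014-09-22 13:57:11.0", "yyyy-MM-dd HH:mm:ss.S"))
print(matches("22/09/2014", "yyyy-MM-dd"))
```

Month names such as Jan are parsed according to the Python process locale, so the dd/MMM/yyyy case assumes an English or default locale.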

Rule action

A rule action can perform an array of tasks. Actions can be one of the following:

Important: Rule actions execute in the following order, regardless of the order in which they are created in the user interface.
  • Set Properties
  • Reset Properties
  • Add Business Terms
  • Remove Business Terms
  • Set Term Assignment State
  • Set Sensitivity
  • Set Quality Dimension
  • Compute Data Quality Score
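
To reason about a rule with several actions, it can help to list them in the order in which they will execute. The following is a minimal Python sketch of that ordering. The actionType names are the ones used in this section, but the dictionary shape of each action and the function name `in_execution_order` are hypothetical and chosen only for illustration.

```python
# Execution order as listed above, expressed with the actionType names from this section.
EXECUTION_ORDER = [
    "SetProperties",
    "ResetProperties",
    "AddBusinessTerm",
    "RemoveBusinessTerm",
    "SetTermAssignmentState",
    "SetSensitivity",
    "SetQualityDimension",
    "ComputeDataQualityScore",
]


def in_execution_order(actions):
    """Return the actions sorted by execution order, keeping the original order within a type."""
    rank = {name: position for position, name in enumerate(EXECUTION_ORDER)}
    return sorted(actions, key=lambda action: rank[action["actionType"]])


# Actions in the order a user might create them in the user interface.
created = [
    {"actionType": "ComputeDataQualityScore"},
    {"actionType": "AddBusinessTerm"},
    {"actionType": "SetProperties"},
]
for action in in_execution_order(created):
    print(action["actionType"])
```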

You can define the rule action by specifying action types in the Rule Actions window. The supported rule action types are described below.

  • Add Business Terms

    When actionType is set to AddBusinessTerm, the ruleAction makes term associations based on rule evaluation. You can apply a term suggestion to a specific field or to a qualifying resource. When applying a term suggestion to a field, identify the field by its full field name. You can add multiple terms in an Add Business Terms action.

    Note: The Data Catalog rule framework does not create new terms. Any term suggestions to be applied as part of a rule action must be for existing terms. If an associated term does not exist, Data Catalog displays an error message.
  • Remove Business Terms

    When actionType is set to RemoveBusinessTerm, the ruleAction removes the term associations based on the rule evaluation.

  • Set Properties and Reset Properties

    When actionType is set to SetProperties or ResetProperties, the ruleAction sets or resets custom property values.

    Property values are strings. If you specify a property name prefixed with @@, its string value is substituted for the property name. You can specify multiple property values in the Add Property Value section. Separate property values with commas (,) and do not include trailing spaces.

    You can use property actions to set and reset property values. To reset a property value, use ResetProperties.

  • Set Quality Dimension

    When actionType is set to SetQualityDimension, the ruleAction defines the quality dimension. You can specify the following quality dimensions with a threshold value range (Low Threshold and High Threshold).

    The supported quality dimensions are:

    • Accuracy
    • Completeness
    • Consistency
    • Timeliness
    • Uniqueness
    • Validity

    Specify the field name if the associated action is to be performed at the field level. If you do not specify the field name, then the action is performed at the resource level.

    Note: This action requires non-metadata criteria to execute because it evaluates the actual data of the column.
  • Set Term Assignment State

    When actionType is set to SetTermAssignmentState, the ruleAction updates the business term assignment state. You can specify one or multiple terms whose assignment state is to be changed. If you do not select a term, then the selected assignment state is applied to all the terms associated with the resource.

    The supported business term assignment states are:

    • Accept & Use In Learning
    • Accept & Don’t Use In Learning
    • Reject & Use In Learning
    • Reject & Don’t Use In Learning

    Use In Learning is only applicable at the column level. At the resource level, it can only be Accept or Reject.

    Note: This action requires metadata criteria in the rule to execute.
  • Compute Data Quality Score

    When actionType is set to ComputeDataQualityScore, the ruleAction computes the data quality score based on the provided data quality dimension values.

    When you select the Compute Data Quality Score as the rule action type, the template for the score formula is provided in the Score Formula field. You can customize the template formula by changing the dimension weightage values.

    Score formula template:

    Accuracy*0.17 + Completeness*0.17 + Consistency*0.16 + Timeliness*0.16 + Uniqueness*0.17 + Validity*0.17

    Data quality score formula conditions:

    • The column-level entity should have at least one calculated dimension for the action to execute.
    • If the rule provides weightages to multiple dimensions, and the actual entity has only one dimension, then 0 is considered the value for the other dimensions.
    • The formula should have at least one dimension and all the weights provided should total 1.0.

      The following example shows a valid formula with weights specified for Accuracy, Completeness and Uniqueness quality dimensions:

      Accuracy*0.5 + Completeness*0.20 + Uniqueness*0.30

    Data quality score calculation process:

    Data quality score is calculated based on the Quality Dimension Weightage specified in the formula and Calculated Value Set defined for the quality dimension.

    The following example illustrates the calculation process:

    Data quality score formula:

    • Accuracy*0.5 + Completeness*0.20 + Uniqueness*0.30

    Calculated value set:

    • Accuracy: 60
    • Completeness: 80
    • Uniqueness: No calculated value is set for Uniqueness. The default value for Uniqueness will be 0.

    Data quality score calculation:

    • [60*0.5 + 80*0.2 + 0*0.3] = 46
    Note: This action requires metadata criteria in the rule to execute.
  • Set Sensitivity

    When actionType is set to SetSensitivity, the ruleAction updates the resource sensitivity level. This is applicable only at the resource level. If you do not provide the resource name, then the action changes the sensitivity level of all the resources defined in the rule scope and criteria.

    If multiple sensitivity rules are executed for a particular resource, then the sensitivity level defined in the latest rule is considered, overriding all the previous rules.

    The supported sensitivity levels are:

    • Low
    • Medium
    • High
    • Non Sensitive
    • Unknown
    Note: This action requires metadata criteria in the rule to execute.
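
The data quality score calculation described under Compute Data Quality Score can be reproduced with a short calculation. The following is a minimal Python sketch under the conditions stated there: weights must total 1.0, a dimension without a calculated value counts as 0, and no score is produced when the column has no calculated dimension. The function name `compute_score` and the dictionary inputs are hypothetical; this is not the product's implementation.

```python
import math


def compute_score(weights, calculated_values):
    """Return the weighted data quality score, or None if no dimension has a calculated value."""
    if not weights:
        raise ValueError("the formula needs at least one dimension")
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError("dimension weights must total 1.0")
    if not calculated_values:
        # The action needs at least one calculated dimension to execute.
        return None
    # Dimensions without a calculated value default to 0.
    return sum(weight * calculated_values.get(dimension, 0)
               for dimension, weight in weights.items())


weights = {"Accuracy": 0.5, "Completeness": 0.20, "Uniqueness": 0.30}
values = {"Accuracy": 60, "Completeness": 80}  # no calculated value for Uniqueness
print(compute_score(weights, values))
```

With the example values from this section, the three terms are 60*0.5 = 30, 80*0.2 = 16, and 0*0.3 = 0, which total 46.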

Using a rule for sensitive resource tagging

Many users create business rules to govern sensitive data. The following example identifies and tags all resources containing sensitive data or personal identifiers such as names and addresses. You can modify it to work with other sensitive data like social security numbers and account information.

Using the Data Catalog term discovery features, you can identify field metadata and tag data fields such as first name, last name, and address. The Data Catalog built-in terms can identify these fields. Then, you can use a rule to check for any resources that contain tagged sensitive data fields and tag the resources as "PII/Restricted Access".

The following rule example is term-based and does not depend on an actual field name, resource name, or resource type.

When you run the rule, it is applied to all qualifying resources and attaches the term you specify when a resource contains the sensitive fields and these fields have terms associated with them. If you have 100 CSV files, 200 JDBC tables, and 30 Avro files that are all sensitive, they are all labeled correctly after executing this rule.

Sample metadata rule with field term for sensitive data
Rule component    Definition

Rule Scope

You can define the rule scope by selecting the following parameters in the scope building block:
  • Virtual folders
  • Resource terms
  • Field terms
  • Custom Properties
  • Term states
Rule Criteria

hasFieldTerm(Built-in_Terms/First_Name)=1 AND hasFieldTerm(Built-in_Terms/Last_Name)=1 AND hasFieldTerm(Built-in_Terms/US_Address)=1

Rule Action

You can tag associations, remove associations, set properties, reset properties, set term assignment state, compute data quality score, and set sensitivity level by selecting the action using the Rule Action block.

For example, to tag the SSN field as PII/Restricted Access:

Select Action Type as: Add Business Terms

Select Business Terms as: PII/Restricted Access

Set Action Field as: SSN (this field is case-sensitive)

Set the threshold as needed.
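
The logic of the sample rule criteria can be illustrated with a small simulation. The following is a minimal Python sketch, assuming a hypothetical in-memory model in which each resource is a dictionary that maps field names to the set of terms attached to them. The function names are illustrative and do not correspond to a Data Catalog API; the sketch tags the resource itself when all three field terms are present.

```python
REQUIRED_FIELD_TERMS = (
    "Built-in_Terms/First_Name",
    "Built-in_Terms/Last_Name",
    "Built-in_Terms/US_Address",
)


def has_field_term(resource, term):
    """Return 1 if any field of the resource carries the term, otherwise 0."""
    return 1 if any(term in terms for terms in resource["fields"].values()) else 0


def meets_criteria(resource):
    """Mirror the sample criteria: all three hasFieldTerm checks must equal 1."""
    return all(has_field_term(resource, term) == 1 for term in REQUIRED_FIELD_TERMS)


def apply_rule(resources, tag="PII/Restricted Access"):
    """Attach the tag to every qualifying resource and return the names that were tagged."""
    tagged = []
    for resource in resources:
        if meets_criteria(resource):
            resource.setdefault("terms", set()).add(tag)
            tagged.append(resource["name"])
    return tagged


resources = [
    {"name": "customers.csv", "fields": {
        "fname": {"Built-in_Terms/First_Name"},
        "lname": {"Built-in_Terms/Last_Name"},
        "addr": {"Built-in_Terms/US_Address"},
    }},
    {"name": "orders.avro", "fields": {"order_id": set(), "fname": {"Built-in_Terms/First_Name"}}},
]
print(apply_rule(resources))
```

Because the check depends only on the attached terms, the same logic applies regardless of the field names, resource names, or resource types involved.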

Resource tagging based on data properties

You can use Data Catalog rules to create a simple data rule that defines the resources to which a condition applies, the condition, and then performs a specified action. In the following example, the rule attaches a resource term to all resources where the data for a given field falls within a specified condition and threshold range.

Data rule with resource term
Rule component    Definition

Rule Scope

You can define the rule scope by selecting the following parameters in the scope building block:

  • Virtual Folders
  • Resource Terms
  • Field Terms
  • Custom Properties
  • Term States
Rule Criteria

Country='Australia'

Here Country is the column name, and 'Australia' is the value inside the column.

Use single quotation marks to specify the value.

Rule criteria for data rules always examine the actual data of the given column.

Rule Actions

You can tag or remove associations, set or reset properties, or set data quality by selecting the action in the action building block.

In this example, do the following:

Tag the resource as DQM/CA_Employee:

Select Action Type as: Add Business Terms

Select Business Terms as: DQM/CA_Employee

Keep the Action Field empty.

Set the threshold as needed.
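
The way a data rule combines a condition with a threshold can be sketched as follows. This is a minimal Python illustration, not the product's evaluation logic. It assumes that the criteria are evaluated against every row of the column and that the action is performed when the percentage of matching rows reaches the threshold; the text above does not state the exact threshold semantics, and the function names are hypothetical.

```python
def matching_percentage(rows, column, expected):
    """Return the percentage of rows whose column value equals the expected value."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row.get(column) == expected)
    return 100 * hits / len(rows)


def should_tag(rows, column, expected, threshold):
    """Assumption: the action runs when the matching percentage is at least the threshold."""
    return matching_percentage(rows, column, expected) >= threshold


rows = [
    {"Country": "Australia"},
    {"Country": "Australia"},
    {"Country": "Australia"},
    {"Country": "India"},
]
print(matching_percentage(rows, "Country", "Australia"))
print(should_tag(rows, "Country", "Australia", threshold=50))
```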

Rule workflow

On the Business Rules page, you can create, update, and delete rules.

Create a rule

Perform the following steps to create a new rule. After you create a rule, you can then run the rule using Execute rules.

Procedure

  1. Click Management in the left navigation menu.

    The Manage Your Environment page opens.
  2. Select Business Rules.

    The Business Rules page opens.
  3. Click Add Business Rule.

    The Create Business Rule page opens.
  4. Enter the rule name and description.

  5. Set the Rule Scope. The rule scope includes the following parameters:

    You can create your own scope and save it for future use by yourself or another user or select one of the existing ones using the Load Scope option.

    You can use the Rule Scope JSON box to view the rule scope as you create it or the rule scope that you have selected using the Load Scope option. This helps you visualize the defined rule scope parameters and make the necessary changes. You can also copy the rule scope in the JSON format and use it for REST API requests.

    • Virtual Folders

      Select virtual folders on which to execute the rule. You must include at least one virtual folder for the rule to compile. When a single virtual folder is listed, the rule executes against all the resources in the listed virtual folder. You can select additional virtual folders. If a folder no longer exists when the rule is executed, it is ignored.

    • Resource Terms

      Select business terms to filter the virtual folder resources.

    • Field Terms

      Select the business terms associated with the fields or columns to filter the virtual folder resources.

    • Custom Properties

      Select the custom properties to filter the virtual folder resources.

    • Term States

      You can further filter the virtual folder resources based on the term association state. The possible values are Accepted, Rejected, and Suggested. If you do not specify a term association state, all states are included.

  6. Set the Rule Criteria.

    You can create your rule criteria and save it for future use by yourself or another user or select one of the existing ones using the Load Criteria option.

    Define the criteria that the rule evaluates, using either metadata rule or data rule syntax. See the following examples:

    • Metadata rule

      Evaluates metadata stored in the database. The syntax is:

      • hasResourceTerm(TermFullyQualifiedName)=1

        Applies to the resources that have the given resource term associated with them. For example, hasResourceTerm(Insurance/HomeOwners.State_VA)=1

      • hasFieldTerm(TermFullyQualifiedName)=1

        Applies to the fields and columns that have the given field term associated with them. For example, hasFieldTerm(Insurance/HomeOwners.State_VA)=1

    • Data rule

      Evaluates the actual data present in the resource. Specify the column name directly, or use @TermFullyQualifiedName to refer to the field tagged with the given term. Use @@customPropertyName to refer to a custom property and compare it with specific values.

    Note: All quality dimensions are generated only with data rules.
  7. Set the Rule Actions. The rule actions include the following action types.

    You can create rule actions and save them for future use by yourself or another user or select one of the existing ones using the Load Action option.

    You can use the Rule Actions JSON box to view the rule actions as you create them or the rule actions that you have selected using the Load Action option. This helps you visualize the defined rule actions and make the necessary changes. You can also copy the rule actions in the JSON format and use it for REST API requests.

    Action Type    Description
    Add Business Terms
    • Business Term

      Select the term name that you want to add.

    • Action Field

      Specify the field name if the associated action should be performed on a specific field. The field name specified is tagged with the term or terms you selected.

      If the field name is not specified, then the action will be performed at the resource level.

    • Set Threshold

      Specify the threshold value at which to perform the rule action.

    Remove Business Terms
    • Business Term

      Select the term name that you want to remove.

    • Action Field

      Specify the field name if the associated action should be performed at the column level. The term or terms you selected are removed from the specified field.

      If the field name is not specified, then the action will be performed at the resource level.

    • Set Threshold

      Specify the threshold value at which to perform the rule action.

    Set Properties
    • Select Property

      Select the custom property whose value you want to set.

    • Action Field

      Specify the field name if the associated action should be performed on a field.

    • Add Property Value

      Specify the property value. You can specify multiple property values. Each property value should be comma (,) separated with no trailing space.

    • Set Threshold

      Specify the threshold value at which to perform the rule action.

    Reset Properties
    • Select Property

      Select the custom property whose value you want to reset.

    • Action Field

      Specify the field name if the associated action should be performed on a field.

    • Set Threshold

      Specify the threshold value at which to perform the rule action.

    Set Quality Dimension
    • Set Quality Dimension (for DataRules)

      Select the data quality dimension that defines your rule. This dimension is reflected in the data quality graph on the Data Canvas page. Options include:

      • Accuracy

        The degree to which data correctly describes the "real world" object or event being described.

      • Completeness

        The proportion of stored data against the business definition of “100% complete”.

      • Consistency

        The absence of difference when comparing two or more representations of an item against a definition. Each data item is measured against itself or its counterpart in another data set.

        Note that consistency assessment may not be applicable to all data items.

      • Timeliness

        The degree to which data represent reality from the required point in time. This is measured by the time distance between each correct and incorrect data point.

      • Uniqueness

        The inverse of an assessment of the level of duplication.

      • Validity

        Data is valid if it conforms to the syntax (format, type, range) of its business definition. Typically, this value is the overall measure of data quality.

    • Action Field

      Specify the field name if the associated action should be performed at the column level. The field name specified is tagged with the term provided in the Set Quality Dimension.

      If the field name is not specified, then the action will be performed at the resource level.

    • Set Low Threshold

      Specify the low threshold value. Values below this threshold are shown in red.

    • Set High Threshold

      Specify the high threshold value. Values above this threshold are shown in green.

    You can view the following data quality color range in a donut chart for the column.

    • > High Threshold = green
    • > Low Threshold and < High Threshold = orange
    • < Low Threshold = red

    For example, suppose you want to validate the number of records with the state name California, and you select Validity as the quality dimension.

    Out of a total of 500 records, 50 have the state name California, which gives a validity percentage of 10%.

    If you set the low threshold to 0 (zero) and the high threshold to 9, then 10 is greater than the high threshold, so the data quality chart shows green.

    If you set the low threshold to 0 (zero) and the high threshold to 15, then 10 falls between the two thresholds, so the data quality chart shows orange.

    If you set the low threshold to 11 and the high threshold to 15, then 10 falls below the low threshold, so the data quality chart shows red.

    Similarly, you can set the threshold value for all the data quality dimensions.

    Set Term Assignment State
    • Set Term Assignment State (for MetaData Rules)

      Set the assignment state of a term.

    • Select Term

      Select one or multiple terms for which you want to change the assignment state. If the term is not specified, the selected assignment state is applied to all the terms associated with the resource.

    • Assignment State

      Select the assignment state. The supported term assignment states are:

      • Accept & Use In Learning
      • Accept & Don’t Use In Learning
      • Reject & Use In Learning
      • Reject & Don’t Use In Learning
      Note: Use In Learning is applicable at the column level only. At the resource level, only Accept and Reject apply.
    Compute Data Quality Score
    • Compute Data Quality Score

      Computes the data quality score based on the provided data quality dimension values.

    • Score Formula

      When you select Compute Data Quality Score as the rule action type, the template for the score formula is provided in the Score Formula field. You can customize the template formula by changing the dimension weightage values as required.

      Score formula template:

      Accuracy*0.17 + Completeness*0.17 + Consistency*0.16 + Timeliness*0.16 + Uniqueness*0.17 + Validity*0.17

      Data quality formula conditions:

      • The column-level entity should have at least one calculated dimension for this action to execute.
      • If the rule provides weightages for multiple dimensions and the actual entity has only one calculated dimension, then 0 is used as the value for the other dimensions.
      • The formula must contain at least one valid dimension, and the dimension weightages must total exactly 1.

        Example of a valid formula:

        Accuracy*0.5 + Completeness*0.20 + Uniqueness*0.30

    • Action Field

      Specify the column name if the associated action should be performed only on a specific column, instead of applying the weights on all the columns of the applicable resource.

      Examples of valid formulas used in the Action Block:

      Accuracy*0.5 + Completeness*0.20 + Uniqueness*0.30

      Accuracy*0.2 + Completeness*0.1 + Consistency*0.1 + Timeliness*0.1 + Uniqueness*0.4 + Validity*0.1

    Set Sensitivity
    • Set Sensitivity

      Updates the resource sensitivity level. If multiple Sensitivity rules are executed for a particular resource, then the sensitivity level defined in the latest rule is considered, overriding all the previous rules.

    • Sensitivity Level

      Specify the sensitivity level. The supported sensitivity levels are:

      • Low
      • Medium
      • High
      • Non Sensitive
      • Unknown
    • Action Item

      Specify the resource name if the associated action should be performed on a resource. If the resource name is not provided, then the action will change the sensitivity level of all the resources defined in the rule scope and criteria.

  8. Click Create Rule.
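
The threshold colors described for the Set Quality Dimension action in step 7 can be sketched as a simple comparison. The following minimal Python example reproduces the California validity example. The function name `quality_color` is hypothetical, and the treatment of a value that is exactly equal to a threshold is an assumption (shown as orange), because the procedure does not specify it.

```python
def quality_color(value, low_threshold, high_threshold):
    """Map a quality percentage to the chart color for the given thresholds."""
    if value > high_threshold:
        return "green"
    if value < low_threshold:
        return "red"
    # Assumption: values equal to a threshold fall in the middle band.
    return "orange"


# 50 of 500 records match, which gives a validity percentage of 10.
validity = 100 * 50 / 500
print(quality_color(validity, low_threshold=0, high_threshold=9))
print(quality_color(validity, low_threshold=0, high_threshold=15))
print(quality_color(validity, low_threshold=11, high_threshold=15))
```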

Update a rule

If you have already created and configured a rule, you can edit it from the Business Rules page.

Perform the following steps to edit a rule:

Procedure

  1. Click Management in the left navigation menu.

    The Manage Your Environment page opens.
  2. Select Business Rules.

    The Business Rules page opens.
  3. Locate the business rule you want to configure in the table of rules and select the View Details button (greater-than sign) in its row.

    If you have a large number of rules, select Show Filters to help you find the rule you want to edit.

    The Business Rule page opens for the selected rule.
  4. Edit the fields as needed and click Save Rule.

    The rule is saved with your changes. If there is a problem while saving your rule, an error notification displays at the top of the page. Resolve the error and click Save Rule.

Update a rule created using rule blocks

If you have created and configured a rule using Scope, Action, and Criteria blocks, you can edit it from the Business Rules page.

Perform the following steps to edit a rule:

Procedure

  1. Click Management in the left navigation menu.

    The Manage Your Environment page opens.
  2. Select Business Rules.

    The Business Rules page opens.
  3. Locate the business rule you want to configure in the table of rules and select the View Details button (greater-than sign) in its row.

    If you have a large number of rules, select Show Filters to help you find the rule you want to edit.

    The Business Rule page opens, highlighting the block that is used for the selected rule.

  4. If you want to update the rule block used in the rule, click on the specific block name link (which can be Rule Scope, Rule Criteria, or Rule Actions).

    You will be navigated to the specific block page.
  5. Edit the fields as needed and save the changes.

    The rule will be updated with the block changes.

  6. If you do not want to update the pre-saved rule block currently in use but want to change the rule's scope, criteria, or action, do one of the following:

    • Load a different rule block by clicking the load button at the top right of the scope, criteria, or action component.
    • Update the block on the Update Business Rules page. This creates a new block instead of updating the existing one. You can save the new block with a new name.
    If there is a problem while modifying the block, then an error notification is shown at the top right corner of the page.

Delete a rule

If a rule is no longer needed, you can delete it. Perform the following steps to delete a rule:

Procedure

  1. Click Management in the left navigation menu.

    The Manage Your Environment page opens.
  2. Select Business Rules.

    The Business Rules page opens.
  3. Use the check box to select the rule you want to delete.

  4. Click the Actions menu and then click Remove. Alternatively, select the More actions icon, and then click Remove from the drop-down menu.

    A message appears prompting you to enter the text Yes to delete the rule.
  5. Enter Yes in the input field and click Confirm.

    A message appears confirming that the business rule is deleted.
  6. Click Close on the message box to return to the Business Rules page.

View rules

You can view the list of all the rules in Data Catalog.

Perform the following steps to view the rules:

Procedure

  1. Click Management in the left navigation menu and click Business Rules.

    The list of all rules in Data Catalog is shown.
  2. To view the rules associated with a specific resource, click Data Canvas in the left navigation menu, and then navigate to Virtual folders > Specific file.

  3. Click the Rules tab.

    Rules specific to that file are shown.

Adding rule blocks

You can create scope, criteria, and actions blocks and reuse them later when creating rules.

Perform the following steps to create a block:

Procedure

  1. Click Management in the left navigation menu and click Business Rules.

  2. Click the Blocks tab.

  3. Click Add New Block, and select the scope, criteria, or actions for which you want to create a block.

    Note: If you make any changes to a block, the rules that use that block are affected.

Execute rules

Perform the following steps to execute a Data Catalog rule.

Note: If the rule does not execute as expected, verify the rule configuration and syntax and review the job activity log.
  1. Click Management in the left navigation menu and click Business Rules.
  2. Select the rule you want to run and click the Execution tab.

    The Execution Schedule window opens. You can run the rule immediately or add a schedule.

  3. Select Run Now to run the rule immediately.
  4. Click Add Schedule to schedule the rule and select one of the following schedules:
    • On a date
    • Daily
    • Weekly
    • Monthly

    Note: Set the schedule time in UTC only.
  5. (Optional) If you want to enter parameters, select the Advanced Mode check box, and enter the parameters in the text box.

    For example, to generate a rule execution report, enter the following additional parameters before rule execution:

    -generateReport true -reportName <Name of the report being generated>

    For more information on additional parameters, see Rule execution report.

  6. Click Apply Changes.

Propagate bindings

Propagate bindings associates the rule with all the data entities that fall under the selected resources.

On the Execution Schedule window, click Propagate Bindings to associate the rule with the data entities.

You receive a notification about the propagate bindings status. After the propagation completes, you can go to each data entity and run the rule as required. With propagate bindings, owners of data entities can choose whether to run the rule.

Rule execution report

A rule execution report summarizes how well a rule evaluates the resources in Data Catalog.

To generate a rule execution report, enter the following additional parameters before rule execution:

-generateReport true -reportName <Name of the report being generated>

Define the options as follows:

  • -generateReport

    If this parameter is passed, rule execution generates a report with the name specified by the -reportName parameter.

  • -reportName

    User-defined name for the report being generated. Use with the -generateReport parameter.

All reports are generated in the /var/log/ldc/generatedReports directory. If you do not provide a report name, Data Catalog randomly generates a unique name for each report.
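
If you prepare the additional parameters programmatically before pasting them into the Advanced Mode text box, a small helper keeps the two options consistent. The following is a minimal Python sketch. The function name `report_parameters` is hypothetical, and rejecting report names that contain whitespace is a cautious assumption of this sketch, because the section does not describe how such names are parsed.

```python
from typing import Optional


def report_parameters(report_name: Optional[str] = None) -> str:
    """Build the additional parameters string that requests a rule execution report."""
    params = ["-generateReport", "true"]
    if report_name is not None:
        if not report_name or any(ch.isspace() for ch in report_name):
            # Assumption: avoid names whose quoting rules are not documented here.
            raise ValueError("use a non-empty report name without whitespace")
        params += ["-reportName", report_name]
    return " ".join(params)


print(report_parameters("weekly_pii_report"))
print(report_parameters())
```

When no name is passed, the sketch omits -reportName, which corresponds to the case where Data Catalog generates a unique report name.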