Lumada Data Catalog's Validator utility fixes inconsistencies or invalid entries that were added to the Solr repository because of code defects or unanticipated field situations.
This utility can verify and fix Tag Domains, Tags, Users, Roles, Virtual Folders, or Data Sources, and is available as a self-contained jar in the contrib directory of the Data Catalog installation.
Starting with version 4.4, the validator also requires a Solr jar in the classpath, ahead of the validator jar.
The recommended way to invoke the validator utility is through the validator.sh script found under <LDC App-Server Dir>/bin.
The script automatically includes the jars that the validator requires.
To see the usage information and the supported functionality, run the help command as follows:
<LDC App-Server Dir> $ bin/validator.sh -help
The following options are displayed:
Usage: -help | -verify | -verifyAndFix [-dataSource <dataSourceName> ] [-verifyAction <verifyActionName> ]
<verifyActionName> can be one of the following:
    checkfor-invalid-tag-associations
    checkfor-orphan-tag-associations
    checkfor-tags-immutable
    checkfor-data-source-by-key
    checkfor-invalid-tag-names
    checkfor-invalid-dataset-names
    checkfor-invalid-user-names
    checkfor-invalid-tag-domains
    checkfor-invalid-resource-folder-maps [-dataResourceKey <resourceKey> ]
    checkfor-orphan-resource-folder-maps
    checkfor-duplicate-resource-folder-maps
    checkfor-duplicate-case-sensitive-tags
    checkfor-duplicate-resource-fields
    checkfor-invalid-audit-description
    checkfor-invalid-virtualfolder-definition
    checkfor-incorrectly-created-hive-views
    checkfor-duplicate-entities-with-same-key-attributes [-entityToFix <TagDomain/Tag/User/Role/Source/VirtualFolder> ]
        # Duplicate detection will be done for : TagDomains, Tags, Roles, users, DataSources and VirtualFolders
    checkfor-virtual-folder-run-records [-vfRunRecordsFile <vfRunRecordsFileName> ]
    checkfor-custom-props-in-use [-customPropertyName <customPropertyName> ]
If -verify is specified and -verifyAction is not specified, actions will be picked up from verify.actions file from classpath
If -verifyAndFix is specified -verifyAction is required
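The two rules at the end of the usage text can be checked before the validator is launched. The following is a minimal sketch of such a pre-flight check and is not part of Data Catalog: check_validator_args and KNOWN_ACTIONS are hypothetical names, and the action list is copied from the usage text above.

```shell
#!/bin/sh
# Hypothetical pre-flight check; not shipped with Data Catalog.
# Action names copied from the validator usage text.
KNOWN_ACTIONS="checkfor-invalid-tag-associations
checkfor-orphan-tag-associations
checkfor-tags-immutable
checkfor-data-source-by-key
checkfor-invalid-tag-names
checkfor-invalid-dataset-names
checkfor-invalid-user-names
checkfor-invalid-tag-domains
checkfor-invalid-resource-folder-maps
checkfor-orphan-resource-folder-maps
checkfor-duplicate-resource-folder-maps
checkfor-duplicate-case-sensitive-tags
checkfor-duplicate-resource-fields
checkfor-invalid-audit-description
checkfor-invalid-virtualfolder-definition
checkfor-incorrectly-created-hive-views
checkfor-duplicate-entities-with-same-key-attributes
checkfor-virtual-folder-run-records
checkfor-custom-props-in-use"

# Returns 0 when the argument list satisfies the usage rules, 1 otherwise.
check_validator_args() {
    mode=""
    action=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -help)
                return 0
                ;;
            -verify|-verifyAndFix)
                mode="$1"
                ;;
            -verifyAction)
                # The action name is the argument that follows the flag.
                action="${2:-}"
                ;;
        esac
        shift
    done

    if [ -z "$mode" ]; then
        echo "error: specify -help, -verify or -verifyAndFix" >&2
        return 1
    fi
    # Rule from the usage text: -verifyAndFix requires -verifyAction.
    if [ "$mode" = "-verifyAndFix" ] && [ -z "$action" ]; then
        echo "error: -verifyAndFix requires -verifyAction" >&2
        return 1
    fi
    # Reject action names that do not appear in the usage text.
    if [ -n "$action" ] && ! printf '%s\n' "$KNOWN_ACTIONS" | grep -qxF -- "$action"; then
        echo "error: unknown verify action: $action" >&2
        return 1
    fi
    return 0
}
```

A wrapper could call check_validator_args "$@" and pass the same arguments on to bin/validator.sh only when the check succeeds.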
- verify command looks for inconsistencies.
- verifyAndFix command indicates an intent to take corrective action on inconsistencies and is generally followed by the -verifyAction flag.
- verifyAction takes the action specified by <verifyActionName>, which is one of the following:
- checkfor-orphan-tag-associations checks and lists orphan tag associations.
- checkfor-tags-immutable checks for immutable tags.
- checkfor-data-source-by-key checks for data sources by key.
- checkfor-invalid-tag-names verifies if all tag names are valid.
- checkfor-invalid-dataset-names verifies if all dataset names are valid.
- checkfor-invalid-user-names verifies if all user names are valid.
- checkfor-invalid-tag-domains verifies if all tag domain names are valid.
- checkfor-invalid-resource-folder-maps checks each resource for folder maps that are invalid or incomplete and can be followed by -dataResourceKey.
- checkfor-orphan-resource-folder-maps checks for folder_maps that exist without corresponding resources.
- checkfor-duplicate-resource-folder-maps checks for duplicate folder_maps.
- checkfor-duplicate-case-sensitive-tags checks for duplicate case-sensitive tags.
- checkfor-duplicate-resource-fields checks for duplicate resource fields.
- checkfor-invalid-virtualfolder-definition checks for invalid virtual folder definitions.
- checkfor-incorrectly-created-hive-views checks for incorrectly created Hive views that are disguised as Hive tables.
To verify and fix a specific duplicate entity, use the -entityToFix flag:
$ validator.sh -verifyAndFix \
    -verifyAction checkfor-duplicate-entities-with-same-key-attributes \
    -entityToFix <TagDomain, Tag, User, Role, Source, VirtualFolder>
- checkfor-custom-props-in-use checks for the resources in which the specified custom property is set. This option must be followed by -customPropertyName and the name of the custom property whose usage is to be verified or fixed.
When used with the -verify command, the Validator looks for the resources that have the specified custom property set. The results can be confirmed by examining the wd-ui.log under <LDC Log Dir> (typically /var/log/waterlinedata/).
When used with the -verifyAndFix command, the Validator resets the value of the specified custom property in all the resources found using the -verify command.
Important: -verify must be run before -verifyAndFix for the custom property values to be reset properly.
- entityToFix specifies the entity on which the fix action is to be performed.
If -verifyAndFix is specified, -verifyAction is required.
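Because -verify must be run before -verifyAndFix when resetting a custom property, the two calls can be wrapped in one helper. The following is a minimal sketch, assuming it is run from <LDC App-Server Dir> and that validator.sh returns a non-zero exit status on failure (an assumption; exit codes are not documented here). The names run_validator, reset_custom_property, VALIDATOR and DRY_RUN are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper; VALIDATOR and DRY_RUN are illustration-only names.
VALIDATOR="${VALIDATOR:-bin/validator.sh}"

# Runs the validator, or only prints the command when DRY_RUN=1.
run_validator() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$VALIDATOR $*"
    else
        "$VALIDATOR" "$@"
    fi
}

# Resets one custom property: -verify first, then -verifyAndFix.
reset_custom_property() {
    prop="${1:-}"
    if [ -z "$prop" ]; then
        echo "usage: reset_custom_property <customPropertyName>" >&2
        return 2
    fi
    # Step 1: locate the resources that have the property set.
    run_validator -verify \
        -verifyAction checkfor-custom-props-in-use \
        -customPropertyName "$prop" || return 1
    # Step 2: reset the property on the resources found in step 1.
    # Assumes a failed -verify exits non-zero, so this step is skipped.
    run_validator -verifyAndFix \
        -verifyAction checkfor-custom-props-in-use \
        -customPropertyName "$prop"
}
```

Setting DRY_RUN=1 before calling reset_custom_property prints both commands without running them, which is a convenient way to review them before the fix step changes any resources.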
The following are some sample commands:
- Verify duplicate entity existence:
<LDC App-Server Dir> $ bin/validator.sh -verify \
    -verifyAction checkfor-duplicate-entities-with-same-key-attributes
- Verify and fix all duplicate entities:
<LDC App-Server Dir> $ bin/validator.sh -verifyAndFix \
    -verifyAction checkfor-duplicate-entities-with-same-key-attributes
- Verify and fix a specific duplicate entity:
<LDC App-Server Dir> $ bin/validator.sh -verifyAndFix \
    -verifyAction checkfor-duplicate-entities-with-same-key-attributes \
    -entityToFix <TagDomain, Tag, User, Role, Source, VirtualFolder>
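To fix duplicates one entity type at a time, the last sample command can be repeated for each value that the usage text lists for -entityToFix. The following is a minimal sketch, assuming it is run from <LDC App-Server Dir> and that validator.sh returns a non-zero exit status on failure (an assumption). The names fix_duplicates_by_entity, ENTITY_TYPES, VALIDATOR and DRY_RUN are hypothetical, and the flag is spelled -entityToFix as in the usage text.

```shell
#!/bin/sh
# Hypothetical helper; VALIDATOR and DRY_RUN are illustration-only names.
VALIDATOR="${VALIDATOR:-bin/validator.sh}"

# Entity types taken from the -entityToFix values in the usage text.
ENTITY_TYPES="TagDomain Tag User Role Source VirtualFolder"

# Runs the duplicate check with -verifyAndFix once per entity type.
fix_duplicates_by_entity() {
    for entity in $ENTITY_TYPES; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Print the command instead of running it.
            printf '%s\n' "$VALIDATOR -verifyAndFix -verifyAction checkfor-duplicate-entities-with-same-key-attributes -entityToFix $entity"
        else
            # Stop at the first entity type for which the validator fails.
            "$VALIDATOR" -verifyAndFix \
                -verifyAction checkfor-duplicate-entities-with-same-key-attributes \
                -entityToFix "$entity" || return 1
        fi
    done
}
```

Running the types separately keeps the log entries for each entity type together, at the cost of starting the validator six times. Setting DRY_RUN=1 prints the six commands without running them.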