Hitachi Vantara Lumada and Pentaho Documentation

Workflows

Before you begin working in Data Storage Optimizer, make sure the data sources and storage destinations are already present in Data Catalog. Your role must also have permissions for the operations you want to perform. See Data sources.

You can use the following generalized workflows to identify, classify, and tier or purge files.

Note: To manually migrate or delete a file, see Explore your data in Data Catalog.

Workflow for rules-governed file tiering

Use the example below to create the Data Temperature domain, apply its terms to identify files, and then automatically tier files that were last accessed before a specified date.

  1. From Data Storage Optimizer, click Check Data Temperature to open the Glossary in Data Catalog.
  2. Create a domain. For example, Data Temperature.
  3. Create terms in the domain. For example, Boiling, Hot, Warm, Cold, and Frozen.
  4. Click Management to open the Manage Your Environment page, and on the Metadata Rules card, click Add New. Then click Add Rule Definition and create a rule definition with the following attributes:
    1. Name the rule. For example, Cold Files
    2. Criteria: Object = File
    3. Create a condition: Attribute = Last Access Date
    4. Operator: Less Than Date, and then enter a time and date. For example, 1 October 2022, 00:00:00
    5. Add Action = Apply Business Terms
    6. Add Term = Cold (for example)
    7. Add Action = Remove Business Terms
    8. Add Term = Boiling, Hot, Warm, and Frozen (for example)
    9. Save the rule definition.
  5. Open the Manage Your Environment page, and on the Metadata Rules card, click Add New. Then click Add Rule and create a rule with these attributes:
    1. Name the rule. For example, Cold Files
    2. Select the Source Data Asset on which you want to run the rule.
    3. Select the rule definition created earlier. For example, Cold Files
    4. Choose whether to run the rule now, schedule it to run later, or both.
    5. Apply the rule.
  6. Click the App Switcher, and then select Data Storage Optimizer.
  7. In Data Storage Optimizer, click Management to open the Manage Your Environment page, and on the Rules card, click Add New. Then click Add Rule Definition and create a rule definition to tier files:
    1. Name the rule. For example, Move Cold Data
    2. Criteria: Object = File
    3. Create a condition: Attribute = Business Term
    4. Operator = Equals Ignore Case
    5. Value = Cold (for example)
    6. Click Start Process
    7. Action = Move, and then select the file destination.
    8. Save the rule definition.
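The logic of the two rule definitions above can be sketched in a few lines of Python. This is only an illustration of the conditions and actions, not how Data Catalog or Data Storage Optimizer implements them. The `FileEntry` class and the functions `apply_temperature_rule` and `select_by_term` are hypothetical names introduced for this sketch, and the file paths are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime

# Term names taken from the example Data Temperature domain.
TEMPERATURE_TERMS = {"Boiling", "Hot", "Warm", "Cold", "Frozen"}


@dataclass
class FileEntry:
    """Hypothetical stand-in for a file record and its business terms."""
    path: str
    last_access: datetime
    terms: set = field(default_factory=set)


def apply_temperature_rule(files, cutoff, term):
    """Tag files last accessed before `cutoff` with `term`.

    Mirrors the metadata rule definition: condition "Last Access Date
    Less Than Date", action "Apply Business Terms" for `term`, and action
    "Remove Business Terms" for every other term in the domain.
    """
    tagged = []
    for f in files:
        if f.last_access < cutoff:
            f.terms -= TEMPERATURE_TERMS - {term}
            f.terms.add(term)
            tagged.append(f)
    return tagged


def select_by_term(files, value):
    """Mirror the condition "Business Term Equals Ignore Case <value>"."""
    wanted = value.casefold()
    return [f for f in files if any(t.casefold() == wanted for t in f.terms)]


files = [
    FileEntry("/data/old_report.csv", datetime(2021, 3, 14), {"Warm"}),
    FileEntry("/data/new_report.csv", datetime(2023, 1, 5), {"Hot"}),
]
cutoff = datetime(2022, 10, 1, 0, 0, 0)  # 1 October 2022, 00:00:00

apply_temperature_rule(files, cutoff, "Cold")
to_move = select_by_term(files, "cold")  # case differs on purpose
print([f.path for f in to_move])
```

Only the file last accessed before the cutoff is tagged Cold, so it is the only one the tiering rule would select. The file that was accessed more recently keeps its existing term.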

You can view the history of the rule's execution. Go to the Home page and, on the Data Operations card, locate the name of the rule definition and click its icon. The Data Operations page shows the status of the operation, the file path, and the destination. Files marked Cold are moved to the target destination. A stub file remains in the folder of the data source and can be used for rehydration, if needed.
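To make the move-and-stub behavior concrete, here is a minimal conceptual sketch that runs on a local temporary directory. It assumes a stub is a small JSON file with a `.stub` suffix that records where the data went. That layout, and the functions `tier_file` and `rehydrate`, are hypothetical and chosen only for illustration. The product's actual stub format and rehydration mechanism are not described here.

```python
import json
import shutil
import tempfile
from pathlib import Path


def tier_file(source: Path, destination_dir: Path) -> Path:
    """Move `source` into `destination_dir` and leave a stub in its place.

    The stub layout (JSON with a `moved_to` key) is an assumption made
    for this sketch, not the format the product uses.
    """
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.move(str(source), str(target))
    stub = source.with_name(source.name + ".stub")
    stub.write_text(json.dumps({"moved_to": str(target)}))
    return stub


def rehydrate(stub: Path) -> Path:
    """Bring the file back to its original location using the stub."""
    target = Path(json.loads(stub.read_text())["moved_to"])
    original = stub.with_name(stub.name[: -len(".stub")])
    shutil.move(str(target), str(original))
    stub.unlink()
    return original


# Demonstration in a throwaway directory.
root = Path(tempfile.mkdtemp())
source_file = root / "source" / "old_report.csv"
source_file.parent.mkdir()
source_file.write_text("id,value\n1,42\n")

stub_path = tier_file(source_file, root / "cold_tier")
moved_exists = (root / "cold_tier" / "old_report.csv").exists()
source_gone = not source_file.exists()

restored = rehydrate(stub_path)
restored_text = restored.read_text()
shutil.rmtree(root)  # clean up the demonstration directory
```

After `tier_file` runs, the data lives in the destination and only the stub remains at the source. `rehydrate` reverses the move and removes the stub.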

Workflow for rules-governed file purging

Note: You cannot purge files from AWS/S3 data sources.

Use the example below to create the Data Temperature domain, apply its terms to identify files, and then automatically remove files that were last accessed before a specified date.

  1. From Data Storage Optimizer, click Check Data Temperature to open the Glossary in Data Catalog.
  2. Create a domain. For example, Data Temperature.
  3. Create terms for the domain. For example, Boiling, Hot, Warm, Cold, and Frozen.
  4. Click Management to open the Manage Your Environment page, and on the Metadata Rules card, click Add New. Then click Add Rule Definition and create a rule definition with the following attributes:
    1. Name the rule. For example, Frozen Files
    2. Criteria: Object = File
    3. Create a condition: Attribute = Last Access Date
    4. Operator: Less Than Date, and then enter a time and date. For example, 1 October 2020, 00:00:00
    5. Add Action = Apply Business Terms
    6. Add Term = Frozen (for example)
    7. Add Action = Remove Business Terms
    8. Add Term = Boiling, Hot, Warm, and Cold (for example)
    9. Save the rule definition.
  5. Open the Manage Your Environment page, and on the Metadata Rules card, click Add New. Then click Add Rule and create a rule with these attributes:
    1. Name the rule. For example, Frozen Files
    2. Select the Source Data Asset on which you want to run the rule.
    3. Select the rule definition created earlier. For example, Frozen Files
    4. Choose whether to run the rule now, schedule it to run later, or both.
    5. Apply the rule.
  6. Click the App Switcher, and then select Data Storage Optimizer.
  7. In Data Storage Optimizer, click Management to open the Manage Your Environment page, and on the Rules card, click Add New. Then click Add Rule Definition and create a rule definition to delete files:
    1. Name the rule. For example, Delete Frozen Data
    2. Criteria: Object = File
    3. Create a condition: Attribute = Business Term
    4. Operator = Equals Ignore Case
    5. Value = Frozen (for example)
    6. Click Start Process
    7. Action = Delete.
    8. Save the rule definition.

You can view the history of the rule's execution. Go to the Home page and, on the Data Operations card, locate the name of the rule definition and click its icon. The Data Operations page shows the status of the operation, the file path, and the destination. Files marked Frozen are deleted.

Caution: Purged files are deleted from the data source.
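Because purging is destructive, it can help to reason through which files a rule would match before it runs. The sketch below models the purge rule's condition (Business Term Equals Ignore Case Frozen) and its Delete action on local temporary files. The function `purge_by_term`, its `dry_run` parameter, and the status strings are assumptions made for this illustration and are not product features or product output.

```python
import tempfile
from pathlib import Path


def purge_by_term(tagged_files, value, dry_run=True):
    """Delete files whose business term matches `value`, ignoring case.

    `tagged_files` maps a file path to its set of business terms.
    Returns (path, status) records, loosely analogous to an operation
    history. With `dry_run=True` nothing is deleted.
    """
    wanted = value.casefold()
    history = []
    for path, terms in tagged_files.items():
        if not any(t.casefold() == wanted for t in terms):
            continue  # the condition does not match, so skip the file
        if dry_run:
            history.append((path, "would delete"))
        else:
            Path(path).unlink()
            history.append((path, "deleted"))
    return history


with tempfile.TemporaryDirectory() as root:
    frozen = Path(root) / "archive_2019.log"
    cold = Path(root) / "archive_2022.log"
    frozen.write_text("old")
    cold.write_text("newer")
    tagged = {str(frozen): {"Frozen"}, str(cold): {"Cold"}}

    preview = purge_by_term(tagged, "frozen")  # nothing removed yet
    result = purge_by_term(tagged, "frozen", dry_run=False)
    frozen_exists, cold_exists = frozen.exists(), cold.exists()
```

In this sketch only the file tagged Frozen is removed, and the file tagged Cold is left in place. Unlike tiering, no stub is left behind, so there is nothing to rehydrate from.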