Hitachi Vantara Lumada and Pentaho Documentation

Purge transformations, jobs, and shared objects from the Pentaho Repository

The Purge Utility allows you to permanently delete shared objects (servers, clusters, and databases) stored in the Pentaho Repository as well as content (transformations and jobs). You can also delete revision information for content and shared objects.
Caution: Purging is permanent. Purged items cannot be restored.

To use the Purge Utility, complete these steps.

Procedure

  1. Make sure the Pentaho Repository is running.

  2. Open a shell tool, command prompt window, or terminal window, and navigate to the pentaho/design-tools/data-integration directory.

  3. At the prompt, enter the purge utility command.

    The format for the command, a table that describes each parameter, and parameter examples follow.
    Note: The command must contain the url, user, and password parameters, as well as exactly one of these parameters: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions.
    • Windows

      purge-utility.bat [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] [-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]

    • Linux

      purge-utility.sh [-url] [-user] [-password] [-purgeSharedObjects] [-versionCount] [-purgeBeforeDate] [-purgeFiles] [-purgeRevisions] [-logFileName] [-logLevel]

    Option | Required? | Description
    --- | --- | ---
    -url | Y | URL address for the Pentaho Repository. By default, the Pentaho Server is installed at this URL: http://localhost:8080/pentaho
    -user | Y | Username for an account that can access the Pentaho Server as an administrator.
    -password | Y | Password for the account used to access the Pentaho Server.
    -purgeSharedObjects | N | When set to TRUE, purges shared objects from the repository. This parameter must be used with the purgeFiles parameter; if you try to purge shared objects without including purgeFiles in the command line, an error occurs. When set to FALSE, shared objects are not purged. If you include purgeSharedObjects without setting it to TRUE or FALSE, the Purge Utility assumes TRUE.
    -versionCount | One of four* | Deletes the entire version history except for the last versionCount versions. Set this value to an integer.
    -purgeBeforeDate | One of four* | Deletes all versions before purgeBeforeDate. The date format must be: mm/dd/yyyy
    -purgeFiles | One of four* | When set to TRUE, transformations and jobs are permanently and physically removed. Shared objects (such as database connections) are NOT removed; to remove them as well, also include the purgeSharedObjects parameter. When set to FALSE, files are not purged. If you include purgeFiles without setting it to TRUE or FALSE, the Purge Utility assumes TRUE.
    -purgeRevisions | One of four* | When set to TRUE, all revisions are purged, but the current file remains unchanged. When set to FALSE, revisions are not purged. If you include purgeRevisions without setting it to TRUE or FALSE, the Purge Utility assumes TRUE.
    -logFileName | N | Specifies the file name for the log file. If this parameter is not present, the log is written to a file with this name format: purge-utility-log-YYYYMMdd-HHmmss.txt, where YYYYMMdd-HHmmss is the date and time that the log file was created (e.g., purge-utility-log-20140313-154741.txt).
    -logLevel | N | Indicates the types and levels of detail the logs should contain. Values are: ALL, DEBUG, ERROR, FATAL, TRACE, INFO, OFF, and WARN. The default is INFO. See the Log4J documentation for details on these levels: https://logging.apache.org/log4j/2.x/log4j-api/apidocs/org/apache/logging/log4j/Level.html

    * You must include only one of these: versionCount, purgeBeforeDate, purgeFiles, or purgeRevisions.
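
Because purgeBeforeDate only accepts the mm/dd/yyyy format, it can help to generate the value rather than type it. The following is a minimal sketch for Linux, assuming GNU date (the -d option is not available in every date implementation). The 90-day retention period and the function name purge_cutoff_date are hypothetical and are not part of the Purge Utility.

```shell
#!/bin/sh
# Sketch: build a -purgeBeforeDate argument in the mm/dd/yyyy format.
# Assumes GNU date (coreutils). RETENTION_DAYS is a hypothetical value.
RETENTION_DAYS=90

# Print the date N days ago (default: RETENTION_DAYS) as mm/dd/yyyy.
purge_cutoff_date() {
  date -d "${1:-$RETENTION_DAYS} days ago" +%m/%d/%Y
}

cutoff="$(purge_cutoff_date)"
echo "-purgeBeforeDate=${cutoff}"
```

The printed argument can then be appended to a purge-utility.sh command line in place of a hand-typed date.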
    • In this example, the five most recent revisions of each transformation and job are kept. All earlier revisions are deleted.

      purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -versionCount=5

    • In this example, all revisions before 01/11/2009 (January 11, 2009, in mm/dd/yyyy format) are deleted. Logging is set to the WARN level.

      purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeBeforeDate=01/11/2009 -logLevel=WARN

    • In this example, all transformations, jobs, and shared objects are deleted. Because purgeFiles and purgeSharedObjects are assumed to be TRUE when included without a value, you do not need to set them to TRUE explicitly. Logging is turned OFF.

      purge-utility.bat -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -purgeFiles -purgeSharedObjects -logLevel=OFF
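
Since purging is permanent, it may be worth checking a command line against the parameter rules before running it. The sketch below is one possible pre-flight check for Linux, not part of the Purge Utility: the function name check_purge_args is hypothetical, and it only inspects the arguments and prints the command it would run. It assumes that any occurrence of one of the four purge parameters counts toward the "only one" rule, even when set to FALSE, which the documentation does not confirm.

```shell
#!/bin/sh
# Sketch: pre-flight check of purge-utility arguments (hypothetical helper).
# Rules checked: -url, -user, and -password are required; exactly one of
# -versionCount, -purgeBeforeDate, -purgeFiles, -purgeRevisions is present;
# -purgeSharedObjects is only allowed together with -purgeFiles.
check_purge_args() {
  has_url=0; has_user=0; has_password=0
  has_files=0; has_shared=0; modes=0
  for arg in "$@"; do
    case "$arg" in
      -url=?*)      has_url=1 ;;
      -user=?*)     has_user=1 ;;
      -password=?*) has_password=1 ;;
      -versionCount=?*|-purgeBeforeDate=?*) modes=$((modes + 1)) ;;
      -purgeFiles|-purgeFiles=*) modes=$((modes + 1)); has_files=1 ;;
      -purgeRevisions|-purgeRevisions=*) modes=$((modes + 1)) ;;
      -purgeSharedObjects|-purgeSharedObjects=*) has_shared=1 ;;
    esac
  done
  if [ "$has_url" -ne 1 ] || [ "$has_user" -ne 1 ] || [ "$has_password" -ne 1 ]; then
    echo "error: -url, -user, and -password are all required" >&2
    return 1
  fi
  if [ "$modes" -ne 1 ]; then
    echo "error: include exactly one of -versionCount, -purgeBeforeDate, -purgeFiles, -purgeRevisions" >&2
    return 1
  fi
  if [ "$has_shared" -eq 1 ] && [ "$has_files" -ne 1 ]; then
    echo "error: -purgeSharedObjects must be used with -purgeFiles" >&2
    return 1
  fi
  return 0
}

# Example: validate the arguments, then print (not run) the Linux command.
set -- -url=http://localhost:8080/pentaho -user=jdoe -password=mypassword -versionCount=5
if check_purge_args "$@"; then
  echo "./purge-utility.sh $*"
fi
```

Printing the command instead of executing it keeps the sketch safe to try; replacing the echo with the actual call is a deliberate, separate step.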

  4. When finished, examine the logs to see if there were any issues or problems with the purge.

  5. To see the results of the purge process, disconnect from the Pentaho Repository, then reconnect to it. In the Repository Explorer, on the Browse tab, verify that the items you specified in your purge utility command were purged.
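
The log review in step 4 can be partly automated. The following sketch finds the newest log with the default name format (purge-utility-log-YYYYMMdd-HHmmss.txt) and counts the lines that mention WARN, ERROR, or FATAL. It rests on assumptions the documentation does not state: that the log is written to the directory you pass in, and that level names appear as plain text in each log line. The function names are hypothetical. It does not apply if you set logFileName to a custom name or logLevel to OFF.

```shell
#!/bin/sh
# Sketch: flag problems in the newest purge log (hypothetical helpers).
# Assumptions: default log name purge-utility-log-YYYYMMdd-HHmmss.txt,
# logs located in the directory given as $1, level names in plain text.

# Print the path of the newest default-named log in directory $1.
latest_purge_log() {
  latest=""
  # The timestamp in the name sorts lexicographically, so the last
  # match of the sorted glob is the most recent log.
  for f in "${1:-.}"/purge-utility-log-*.txt; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# Print the number of lines in file $1 mentioning WARN, ERROR, or FATAL.
count_purge_problems() {
  # grep -c exits non-zero when the count is 0; ignore that status.
  grep -Ec 'WARN|ERROR|FATAL' "$1" || true
}

if log_file="$(latest_purge_log .)"; then
  echo "$log_file: $(count_purge_problems "$log_file") line(s) mention WARN, ERROR, or FATAL"
else
  echo "no purge-utility log found in the current directory" >&2
fi
```

A count of zero is only a quick signal; reading the log itself remains the reliable check.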