Hitachi Vantara Lumada and Pentaho Documentation

Searching Data Catalog

To search your metadata, enter keywords into the search box on the LDC Home page. The Search tool is accessible in the left navigation toolbar on most pages.

You can search across the metadata in Lumada Data Catalog in one of three ways:

  • Basic Search

    Use the search box to perform a global keyword search across all the resources, fields, or business terms in the cluster.

  • Saved Search

    Reuse one of your previously saved searches. Click in the search box and then select Saved Searches to open a list of your previous search queries. Additionally, when you click in the search box, several previously searched terms appear in the drop-down list for easy selection.

  • Advanced Search

    Run an advanced search. Click in the search box and select Advanced Search to open a window for filtering your search results.

Whichever search you choose, your results appear on the main Search Result page. You can sort the results by Relevance, Name, or Average Rating. Select Sort by to change the sorting method.

At any time, click Clear Search under the search box to remove the search results and any keywords in the search box.

Searching with keywords

When you enter search keywords, Lumada Data Catalog performs two separate searches and combines the results:

  • A search is performed for full or partial path names, such as /user/hudson/analysis/trend-2016.csv or analysis.
  • A search is performed for all other metadata and sample data, such as the names of files, fields, tables, business terms, and the content of business term descriptions.

Entering keywords into the search box returns matching resources or fields, which are determined by the following attributes:

  • Case sensitivity

    Searches are case-insensitive, including path names.

  • Wildcard characters

    When you type a word in the search box, Data Catalog’s search engine checks the query for the asterisk (*) wildcard character and searches for metadata containing your keyword. The results depend on the position of the wildcard character relative to the keyword. If you enter multiple words in the search box, each word is treated as an independent search term. Both behaviors are shown in the table below:

    Search Text      Description                                                        Finds
    foo              Strict equals search                                               foo
    ^foo* or foo*    Starts with “foo”; includes any number of succeeding characters    foo, food, foodbar
    *bar$ or *bar    Ends with “bar”; includes any number of preceding characters       Escobar, Zanzibar, foodbar, bar
    food bar         OR search on each word                                             food, bar, foodbar, food_bar
  • Special characters

    If a resource or term name contains special characters, such as $, @, or &, you must escape them with a backward slash (\). For example, if your term name is finance@USA$, enter finance@USA\$ in the search box to find it.

    If a resource or term name contains multiple special characters, place the backward slash (\) in front of any one of them. For example, if your term name is Park@Avenue# and this term is associated with a resource at the resource or field level, searches for Park\@Avenue# and Park@Avenue\# are both valid.

    However, if a resource or term name contains the hyphen (-) special character, you must use a forward slash (/) to escape the hyphen. For example, searching for Q1-2016 returns no results, but escaping the hyphen with a forward slash (Q1/-2016) returns Q1 2016 and Q1-2016.

    If you are using a field search, you do not need to use slashes to escape special characters.

    Note: When searching for resources, combine special characters with text.
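The wildcard rules in the table above can be approximated in a few lines of Python. This is only a sketch of the matching semantics, not Data Catalog's implementation: the function names `keyword_to_regex` and `matches` are hypothetical, and backslash escaping of special characters is not modeled.

```python
import re


def keyword_to_regex(keyword: str) -> re.Pattern:
    """Translate one search keyword into a case-insensitive regex (illustrative only)."""
    core = keyword
    # The table lists ^foo* and *bar$ as equivalents of foo* and *bar,
    # so the optional anchors are simply dropped here.
    if core.startswith("^"):
        core = core[1:]
    if core.endswith("$"):
        core = core[:-1]
    # Treat every character literally, then let each * stand for any run of characters.
    pattern = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(pattern, re.IGNORECASE)


def matches(query: str, value: str) -> bool:
    """Return True if any word of the query matches the whole value (OR search)."""
    return any(keyword_to_regex(word).fullmatch(value) for word in query.split())


names = ["foo", "food", "foodbar", "Escobar", "bar"]
for query in ["foo", "foo*", "*bar$", "food bar"]:
    print(query, "->", [name for name in names if matches(query, name)])
```

The table also lists foodbar and food_bar as results for the query food bar. This sketch treats each word as a strict match and does not reproduce that part of the product's behavior.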

Path name searches

Path name searches can match on any part of a path name and are case-insensitive.

Lumada Data Catalog compares the search text to its list of all the path names for all the resources in the catalog. The search is performed as a string comparison where a file, table, or folder is returned when the keyword matches any part of the resource path.

For example, part matches the file /data/transactions/part-r-00000/data. The keyword actions would match the same file.
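The string comparison described above behaves like a case-insensitive substring test. A minimal sketch, assuming the catalog's paths are available as a plain list (the helper names `path_matches` and `search_paths` are hypothetical):

```python
def path_matches(keyword: str, path: str) -> bool:
    """Case-insensitive substring test against a full resource path."""
    return keyword.lower() in path.lower()


def search_paths(keyword: str, paths: list[str]) -> list[str]:
    """Return every path that contains the keyword anywhere in it."""
    return [path for path in paths if path_matches(keyword, path)]


catalog_paths = [
    "/data/transactions/part-r-00000/data",
    "/user/hudson/analysis/trend-2016.csv",
]
print(search_paths("part", catalog_paths))
print(search_paths("ANALYSIS", catalog_paths))
```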

Other metadata searches

Lumada Data Catalog compares the search text to the names of files, fields, tables, terms, and the content of term descriptions.

When building the keyword search indexes for these items, Data Catalog splits the metadata values into tokens, and a keyword matches only when it equals a complete token in the index. White space and characters such as single and double quotation marks, question marks, parentheses, carets (^), pound signs (#), colons, periods, hyphens, and commas indicate the end of a word and are otherwise ignored. Words that include underscores are not broken at the underscore.

The search is case-insensitive. For example, risk matches the field name Risk Band, and the keyword RISK has the same match behavior as risk. Path name searches are also case-insensitive.

However, risk would not match the field Risk_Band because the token is considered the entire phrase risk_band. To find matches that include the keyword somewhere in the tokenized name, you can use the asterisk wildcard character before, after, or both before and after the keyword. The keyword risk* would match the field Risk_Band.

Keyword searches:

  • Are case-insensitive.
  • Match complete words.
  • Accept the asterisk wildcard character to indicate any preceding or following characters.

Characters such as the plus sign (+), minus sign (-), ampersand (&), vertical bar (|), exclamation mark (!), caret (^), tilde (~), colon (:), and other special characters are treated as delimiters and are ignored in the search. For example, if you enter risk-band, the search behaves the same as if you entered risk band.
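The token rules in this section can be illustrated with a short Python sketch. It is a simplification under stated assumptions: every character other than a letter, digit, or underscore is treated as a delimiter, and the helper names `tokenize` and `keyword_matches` are hypothetical rather than part of the product.

```python
import re
from fnmatch import fnmatchcase


def tokenize(value: str) -> list[str]:
    """Lowercase a metadata value and split it into tokens.

    Underscores stay inside a token; any other non-alphanumeric character
    is treated as a delimiter (a simplification of the list in the text).
    """
    return re.findall(r"\w+", value.lower())


def keyword_matches(query: str, value: str) -> bool:
    """Return True if any query keyword matches a complete token of the value."""
    # Keep * in the query keywords so it can act as a wildcard.
    keywords = re.findall(r"[\w*]+", query.lower())
    tokens = tokenize(value)
    return any(fnmatchcase(token, keyword) for keyword in keywords for token in tokens)


print(tokenize("Risk Band"))
print(tokenize("Risk_Band"))
print(keyword_matches("risk", "Risk_Band"))
print(keyword_matches("risk*", "Risk_Band"))
```

Because Risk_Band is kept as the single token risk_band, the plain keyword risk does not match it, while risk* does.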

Refining search results

When you perform a keyword search from the toolbar or from the Advanced Search page, it returns results from the entire cluster, including files and fields that directly match the search criteria. You can use the facets in the left pane of the search results to further refine these results.

You may notice that global search results return matched files and all the fields in those files. When you refine the results, only the fields that directly match the refinement remain in the results. Here's an example of how this process works:

You enter weather in the toolbar search. The search results show:

  • The files that have weather in their name or a file-level term or term description.
  • All the fields in the matched files.
  • The fields that match weather in their name, a field-level term, or term description.

After you refine the results, unlike in the original global search, fields no longer appear simply because they belong to a matched file. For example, if the global search on weather matched a file windspeed.csv that contains detailed address information, including a field with the term US State, then all the fields in windspeed.csv appear in the global search results, but only the fields that directly match the refinement remain afterward. You can use facets to narrow your search to exactly what you are looking for.

Basic search

A basic search performs a global keyword search, lists the total number of results for the search term or terms entered, and groups those results into the following views:

  • Resources

    List of resources that match the search term, including resource name, path, fields, terms, or term associations.

  • Fields

    List of fields that match the search term, including resource path, fields, terms, or term associations.

  • Business Terms

    List of business terms that match the search term, including term hierarchy path, term type, accepted and suggested associations, and sensitivity.

Lumada Data Catalog provides built-in facets for Resources and Fields that can further filter the search results.

View search results using facets

Use the Resources, Fields, or Business Terms view in the search results to see the list of built-in facets for the search results. Facets appear in the left pane next to the search results.

Procedure

  1. In the Search field, select Resources, Fields, or Business Terms to adjust the view of your search results.

  2. In the left pane, click the facets you want to use for filtering.

    By default, the built-in facets appear expanded in the left pane. Click the down arrow by each facet type to collapse the facets for easier viewing.
    Note: Only the facets that have values display on the facets pane in the search results. Empty facets do not display.
  3. To limit the search results to a chosen set of facets, select the check boxes next to the facets.

    The search results automatically update to filter on your selected facets.

Search result information

Lumada Data Catalog search results are organized into self-contained panels. Each result panel contains key information organized for easy viewing, such as path details, description, and type.

  • Resource type and path

    The file type and the path to the file location.

  • File metadata

    Contains the following file metadata parameters:

    • Format
    • Modified (Last modified time)
    • Size
    • Fields
    • Records
  • Sensitivity

    Icon indicates sensitivity of the data. Sensitivity levels include low, medium, high, and unknown.

  • State

    Shows resources that are available for browsing and resources that are no longer available for processing.

  • Description

    Plain text describing the resource, if available.

  • Resource Terms

    Number of overflow terms that you can view by clicking the numeric link. If no terms are associated with the resource or field, select Add term to tag the resource or field with a business term.

  • Action

    Select to open the actions menu for that result. You can choose to go to the Process window or open the Galaxy view.

Search result details also indicate resource popularity metrics like average overall rating with total ratings.

Saved search

You can save your searches to easily repeat search requests when needed. Saved searches store the search term and any facets that you apply.

Saving a search

You can save a search to run in future sessions.

Procedure

  1. Perform a basic or advanced search. The results appear on the Search Result page.

  2. (Optional) Select facets in the left pane to update your search query.

  3. When satisfied with the list of results, click Save Search. The Create new search filter dialog box opens.

  4. Enter a name for your search and click Create.

Next steps

You can access saved searches by clicking in the Search Catalog field that appears throughout the product.

Accessing saved searches

When you save a search, you can view it in the Saved Searches window. From this window, you can run a saved search, rename it, or delete it.

Procedure

  1. Click the Search Catalog field and then select Saved Searches. The Saved Searches window opens.

  2. You can view the list of searches by name, type, or keyword. You can also view the facets that are part of the search.

  3. When you find the search you want, run it by selecting the run arrow in the Search Name field. Optionally, select the More actions icon and select Run.

  4. (Optional) Select the More actions icon to rename or delete a saved search.

Results

The saved search runs and the Search Result page displays the results.

Advanced search

As with a basic search, you can use keywords in an advanced search of Data Catalog. However, instead of only filtering the results after the search, as in a basic search, you can apply filters before searching to limit the search itself. Search results are bound by your user access control permissions.

To perform an advanced search, click in the search box and then click Advanced Search.

Enter a keyword or keywords, then define the filters that you want to apply for your search. For example, in the Resources view you can limit your search to the virtual folder BankRetail, or in the Fields view you can limit your search to the string data type. After selecting the desired Entity type and filters, click Apply filters and search.

Search using Advanced Search

You can apply facets when performing an Advanced Search to set the scope of your returned results.

Selecting more than one value inside the same facet includes files that match either value (OR). Selecting values in more than one facet includes only files that match the selections in every facet (AND). If keywords are also specified, the search results match both the keywords and the facet choices.
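The OR-within-a-facet and AND-across-facets rule can be sketched as follows. The function name `matches_facets` and the facet values (CSV, JSON, file, table) are hypothetical and chosen only for illustration. The facet names come from the facet lists in this article.

```python
def matches_facets(item: dict[str, str], selected: dict[str, set[str]]) -> bool:
    """Apply OR within one facet and AND across facets."""
    return all(
        item.get(facet) in values
        for facet, values in selected.items()
        if values  # a facet with nothing selected does not restrict the search
    )


resources = [
    {"name": "trend-2016.csv", "File format": "CSV", "Resource Type": "file"},
    {"name": "events.json", "File format": "JSON", "Resource Type": "file"},
    {"name": "orders", "File format": "CSV", "Resource Type": "table"},
]

# CSV or JSON (OR inside "File format"), and also a file (AND across facets).
selection = {"File format": {"CSV", "JSON"}, "Resource Type": {"file"}}
print([r["name"] for r in resources if matches_facets(r, selection)])
```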

Note: Data Catalog builds its search results from information collected during a batch process on the cluster called profiling. If files are not already profiled, you will not see results from those files when you search.

In addition to using keywords and facets, you can also apply term-based filters to include or exclude terms and term children to perform conjunctive and disjunctive searches in an advanced search.

  • Including term(s)

    Enter term names you want to include in your search. Only selected term resources and fields are included. When Include child terms is selected, the children for the term are also included.

    Note: If you select a business entity term, search results do not include its children.
  • Excluding term(s)

    Enter term names you want to exclude from your search. All other term resources and fields are included, except those terms marked as excluded. When Exclude child terms is selected, all the children for the term are also excluded.

For example, when you search for the keyword "Personnel Info" and include the term US_State, the search results are limited to resources matching the keyword and having the term (suggested or accepted) US_State. By including or excluding child terms, individual states tagged with US_State can also be filtered.
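The include and exclude term filters can be modeled with set operations. The sketch below is an assumption-laden illustration: the helper names, the child terms California and Texas, and the choice to expand a term to all of its descendants are hypothetical. It applies the rule from the procedure in this article that Excluding Term(s) takes precedence when the two filters contradict each other, and it does not model the business entity exception.

```python
def expand(terms: set[str], children: dict[str, list[str]], with_children: bool) -> set[str]:
    """Return the terms, plus all of their descendants when requested."""
    result = set(terms)
    if with_children:
        stack = list(terms)
        while stack:
            for child in children.get(stack.pop(), []):
                if child not in result:
                    result.add(child)
                    stack.append(child)
    return result


def passes_term_filters(
    item_terms: set[str],
    include: set[str],
    exclude: set[str],
    children: dict[str, list[str]],
    include_children: bool = False,
    exclude_children: bool = False,
) -> bool:
    """Decide whether a resource or field passes the term filters."""
    included = expand(include, children, include_children)
    excluded = expand(exclude, children, exclude_children)
    # Exclusion wins when the two filters contradict each other.
    if item_terms & excluded:
        return False
    # With no included terms, everything that is not excluded passes.
    return not included or bool(item_terms & included)


# Hypothetical term hierarchy: individual states are children of US_State.
hierarchy = {"US_State": ["California", "Texas"]}
print(passes_term_filters({"California"}, {"US_State"}, set(), hierarchy))
print(passes_term_filters({"California"}, {"US_State"}, set(), hierarchy, include_children=True))
```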

As with a basic search, users can select available facets on the Search Result page to further filter the advanced search results.

Perform the following steps to run an Advanced Search. At any time, you can select Reset to clear your selections or click Close to go back to the previous window.

Procedure

  1. Click in the search box, and then click Advanced Search.

    The Advanced Search page opens.
  2. Select the Entity type that you want to search:

    • Resources: to search resources.
    • Fields: to search fields.
  3. Enter your search term or terms in the Keywords field.

  4. (Optional) Enter a term or terms in the Including Term(s) field in the drop-down menu, and select the Include child terms check box if you want to include child terms in your search.

    Selected included terms appear on the Advanced Search page.
  5. (Optional) Enter a term or terms in the Excluding Term(s) field in the drop-down menu, and select the Exclude child terms check box if you want to exclude child terms from your search.

    Note: If the Including Term(s) and Excluding Term(s) fields contradict each other, then Excluding Term(s) takes precedence.
    Selected excluded terms appear on the Advanced Search page.
  6. (Optional) Depending on your selected Entity type, apply facets:

    Resources: You can search any or all of these resource facets:
    • Data source
    • Virtual folder
    • Resource Type
    • File format
    • Processing status
    Fields: You can search any or all of these field facets:
    • Data source
    • Virtual folder
    • Data Type
    • Field term state
  7. Click Apply filters and search.

Results

A list of resources or fields matching your search criteria is returned. Depending on your permissions, some data may be unavailable for viewing.