Hierarchical JSON Input
You can use the Hierarchical JSON input step to load JSON data into PDI from a file. You can specify the input file directly in this step or use a list of files.
General
- Step name: Specifies the unique name of the Hierarchical JSON input step on the canvas. You can customize the name or leave it as the default.
Options
The Hierarchical JSON input step features several tabs with fields. Each tab is described below.
Source tab
Option/Field | Description |
From file | Select to specify the file path and name of the JSON file you want to load into PDI. |
File name | File path and name of the JSON file to load. |
From field | Select to use an incoming field as the JSON file path. |
Field with file name | The incoming field containing the JSON file path. |
Output tab
Field | Description |
Output field | Specify the field name for the output column. |
Split rows across path | Specify the JSON path to be parsed. This field supports regular expressions. |
Filters tab
Use the optional Path field to specify filters that are applied together with the Split rows across path option, so that only a subset of the JSON file is fetched.
Example
The following data is example JSON data in a file that you can load into PDI:
{
  "employees": [
    {
      "name": "emp_name_1",
      "age": 35,
      "addresses": [
        { "country": "Country_1" },
        { "country": "Country_2" }
      ]
    },
    {
      "name": "emp_name_2",
      "age": 35,
      "addresses": [
        { "country": "Country_3" },
        { "country": "Country_4" }
      ]
    }
  ]
}
The following data is extracted from the JSON file when you specify the Split rows across path option as $.employees[*] and do not specify any filters:
Note: The Split rows across path option is especially useful when loading JSON array objects within large JSON files.
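To make the effect of splitting on $.employees[*] more concrete, the following Python sketch reproduces the idea outside PDI using only the standard json module. It is an illustration under stated assumptions, not the step's implementation: the helper name split_rows is hypothetical, it handles only a simple top-level array path, and the exact layout of the rows that PDI emits may differ.

```python
import json

# In-memory copy of the sample document from the example above.
SAMPLE = """
{
  "employees": [
    {"name": "emp_name_1", "age": 35,
     "addresses": [{"country": "Country_1"}, {"country": "Country_2"}]},
    {"name": "emp_name_2", "age": 35,
     "addresses": [{"country": "Country_3"}, {"country": "Country_4"}]}
  ]
}
"""


def split_rows(document: dict, key: str) -> list[str]:
    """Return one JSON string per element of the top-level array `key`.

    Hypothetical helper: it mimics only a path of the form $.<key>[*]
    and is not a general JSONPath engine.
    """
    return [json.dumps(element) for element in document[key]]


# Each employee object becomes its own row, nested addresses included.
rows = split_rows(json.loads(SAMPLE), "employees")
for row in rows:
    print(row)
```

Each element of the employees array ends up as a separate row that still carries its nested addresses array, which is the behavior the Split rows across path option is described as providing.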