Splunk Input
The Splunk Input transformation step enables you to connect to a Splunk server, enter a Splunk query, and get results back for use within a transformation. Once you have completed those steps, you can stream data from Splunk into your transformation. To learn more about Splunk, see the Splunk online documentation.
Prerequisites
Before using the Splunk Input step, you must have read access to a Splunk server. Contact your Splunk system administrator for host and port details.
General
Enter the following information in the transformation Step name field:
- Step name: Specifies the unique name of the Splunk Input step on the canvas. The Step name is set to Splunk Input by default.
Options
The Splunk Input step features two tabs with fields and options for defining a Splunk connection and database fields. Each tab is described below.
Connection tab
In this tab, you can define the following connection properties, as described in the table below.
Option | Description |
Host name(s) or IP address(es) | Specifies the network name or address of the Splunk instance or instances. |
Port | Indicates the port number of the Splunk (splunkd) server. The default value is 8089, but your administrator may have changed the port number. |
User name | Specifies the user name required to access the Splunk server. |
Password | Indicates the password associated with the User name. |
Test connection | After you define the connection, you can test it by clicking this button. |
Preview | Provides a first look at the data. Clicking Preview causes the Enter preview size window to appear. Enter the maximum number of records that you want to preview, then click OK. The preview data appears in the Examine preview data window. |
Fields tab
In this tab, you can define the following properties and fields, as described in the table below.
Option | Description |
Splunk query expression |
This field defines the Splunk query. Unlike queries defined in the Splunk user interface, the query must start with the term search. For example:
search * | head 100
One capability of Splunk search is field selection, which gives you access to Splunk-parsed fields within the _raw column. To select specific fields, use this syntax at the end of your defined search query:
|
Execute for each row |
If checked, a new query is issued for each row of data coming into the step. You can reference incoming fields of data using the
|
Name | Name of the field. |
Splunk name | Indicates the Splunk name for the field. |
Type | Specifies the data type of the field. |
Length | Indicates the length of the field. |
Format | Specifies the format of the field. |
Get fields | Retrieves the field metadata and displays it in the Fields tab. After you have retrieved the field metadata using the Get fields button, you may choose to delete metadata fields that are not relevant to your specific query. Because each field must be translated to its mapped data type, removing unused fields should improve performance. |
Preview | Provides a first look at the data. Clicking Preview causes the Enter preview size window to appear. Enter the maximum number of records that you want to preview, then click OK. The preview data appears in the Examine preview data window. |
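The rule given for the Splunk query expression option, that a query must begin with the term search, can be checked before a query copied from the Splunk user interface is pasted into the step. The following is a minimal sketch in standalone JavaScript, not code that runs inside the step. The function name ensureSearchPrefix is hypothetical, and the sketch assumes the query is a plain search rather than one that begins with another command.

```javascript
// Hypothetical helper: prepend "search" when a query does not already
// begin with it. Assumes a plain search query.
function ensureSearchPrefix(query) {
  const trimmed = query.trim();
  // Match "search" only as a whole leading word.
  return /^search(\s|$)/.test(trimmed) ? trimmed : "search " + trimmed;
}

console.log(ensureSearchPrefix("* | head 100"));        // search * | head 100
console.log(ensureSearchPrefix("search * | head 100")); // search * | head 100
```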
Raw field parsing
The input step automatically attempts to parse the raw field into a number of child fields denoted by:
_raw.<Field Name>
It parses the raw field assuming that the field is formatted with name value pairs separated by a new line character, like this:
<Name1>=<Value1>\n <Name2>=<Value2>\n
If raw field data is not formatted like this, you must post-process those fields with other steps in the transformation flow. Note that your secondary steps may include String variables.
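To make the expected layout concrete, the sketch below splits a raw string of newline-separated name=value pairs into child fields named _raw.<Field Name>. It is an illustration in standalone JavaScript and does not reproduce the step's actual parser. The function name parseRaw, the sample data, and the handling of edge cases (surrounding whitespace is trimmed, and a value may itself contain an equals sign) are assumptions.

```javascript
// Hypothetical illustration of the name=value layout described above.
function parseRaw(raw) {
  const fields = {};
  for (const line of raw.split("\n")) {
    const trimmed = line.trim();
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // skip blank lines and lines without a name
    const name = trimmed.slice(0, eq);
    const value = trimmed.slice(eq + 1); // everything after the first "="
    fields["_raw." + name] = value;
  }
  return fields;
}

// Made-up sample in the <Name>=<Value>\n layout.
const sample = "user=alice\n status=200\n msg=a=b\n";
console.log(parseRaw(sample));
```

A raw value that does not follow this layout, such as free-form log text, produces no child fields in this sketch, which corresponds to the case where post-processing in later steps is needed.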
Date handling
Kettle does not support parsing the ISO-8601 date format, which Splunk uses to pass date objects through web services. However, you can edit the date string returned from Splunk using the Modified Java Script Value step. Use this script to parse the date:
var dateobj = str2date((substr(_time, 0, 23) + "GMT" + substr(_time, 23)).trim(), "yyyy-MM-dd'T'HH:mm:ss.SSSz");
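The string rewrite performed by the script above can be traced in standalone JavaScript. The sketch below only mirrors the substring and concatenation steps; it does not call the Kettle str2date function. The function name toGmtOffsetString and the sample value are hypothetical, and the sketch assumes that _time carries millisecond precision followed by a UTC offset, so that the first 23 characters hold the date and time.

```javascript
// Mirrors the rewrite in the script above: insert "GMT" between the
// 23-character date-time part and the trailing UTC offset.
function toGmtOffsetString(time) {
  return (time.substring(0, 23) + "GMT" + time.substring(23)).trim();
}

const splunkTime = "2012-03-01T10:15:30.000-05:00"; // made-up sample value
console.log(toGmtOffsetString(splunkTime)); // 2012-03-01T10:15:30.000GMT-05:00

// For comparison, a standard JavaScript engine parses the original string directly.
console.log(new Date(splunkTime).toISOString()); // 2012-03-01T15:15:30.000Z
```

The rewritten string has the shape that the pattern yyyy-MM-dd'T'HH:mm:ss.SSSz in the script expects, with the offset expressed as a GMT-relative time zone.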
Metadata injection support
All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.