File Adapter

The file adapter allows reading points from, or writing points to, a file on the local filesystem.

read file

Supported file formats are JSON array, JSON lines, CSV, column-aligned text, and unstructured logs (with grok parsing); see examples below.

read file -file <path> [-format <format>] [-timeField <fieldname>]
Parameter Description Required?
-file File path on the local filesystem, absolute or relative to the current working directory Yes
-format Input file format, supports: csv json jsonl columns or grokfor text parseable by grok No; defaults to json
-timeField Name of the field in the data which contains a valid timestamp No; defaults to time
-pattern When -format 'grok'you can specify the grok matching pattern here. More information on grok here No
-separator When -format 'csv'is used, you can specify the separator between columns in a CSV file. No: defaults to ,
-commentSymbol When -format 'csv'is used, you can specify the comment character that prefixes comment lines in a CSV file. No: defaults to ,
-ignoreEmptyLines When -format 'csv'is used, you can skip empty lines in a CSV file. No: defaults to false
-allowIncompleteLines When -format 'csv'is used, you can allow for parsing of incomplete lines in a CSV file. No: defaults to false

The data is assumed to contain valid timestamps in a field named time by default; a different name for the time field may be specified with -timeField option. If the data contains fields time and another named field specified with -timeField, the original contents of field time will be overwritten by the valid timestamp from timeField.

Timeless data, which contains no timestamps, is acceptable; however certain operations which expect time to be present in the points, such as reduce -every :interval:, will execute with warnings or error out. Timeless data can be joined in the Juttle flowgraph with other data which contains timestamps; a common use case for reading timeless data from a file or another backend is to join it with streaming data for enrichment.

The file adapter does not support any kind of filtering (neither filter expressions, nor full text search). In order to filter the data read from file, pipe to the filter proc.

Example: read from a JSON array data file

The source file has data in JSON array format:

[ 
{ "time": "2015-11-06T04:28:32.304Z", "hostname": "lemoncake", "state": "ok" },
{ "time": "2015-11-06T04:28:32.304Z", "hostname": "applepie", "state": "warn" },
{ "time": "2015-11-06T04:28:42.405Z", "hostname": "lemoncake", "state": "ok" },
{ "time": "2015-11-06T04:28:42.502Z", "hostname": "applepie", "state": "ok" }
]

This program reads the file in, and filters on a specific hostname:

read file -file 'docs/examples/datasets/input1.json'
| filter hostname = 'lemoncake'
| view table

Example: read from a JSON lines data file

The source file has data in JSON lines format, i.e. newline separated JSON objects:

{ "time": "2015-11-06T04:28:32.304Z", "hostname": "lemoncake", "state": "ok" }
{ "time": "2015-11-06T04:28:32.304Z", "hostname": "applepie", "state": "warn" }
{ "time": "2015-11-06T04:28:42.405Z", "hostname": "lemoncake", "state": "ok" }
{ "time": "2015-11-06T04:28:42.502Z", "hostname": "applepie", "state": "ok" }

This program reads the file in, and filters on a specific hostname:

read file -file 'docs/examples/datasets/json_lines.jsonl'
| filter hostname = 'lemoncake'
| view table

The above examples produce the same output table:

┌────────────────────────────────────┬──────────────┬──────────┐
│ time                               │ hostname     │ state    │
├────────────────────────────────────┼──────────────┼──────────┤
│ 2015-11-06T04:28:32.304Z           │ lemoncake    │ ok       │
├────────────────────────────────────┼──────────────┼──────────┤
│ 2015-11-06T04:28:42.405Z           │ lemoncake    │ ok       │
└────────────────────────────────────┴──────────────┴──────────┘

write file

Data is written out to the filename specified in the format you specify.

write file -file <path>
Parameter Description Required?
-file File path on the local filesystem, absolute or relative to the current working directory Yes
-format Input input format: csvfor CSV data, jsonfor JSON data, or jsonlfor JSON lines data No; defaults to json
-append Specifies if the data should be appended to the file or if the file should be overwritten No, defaults to false

If the file already exists it will be truncated and overwritten.

Example: writing data to a file

emit -from :2015-01-01: -limit 2 
| put name = 'write_me', value = count() 
| write file -file '/tmp/write_me.json'

The resulting file /tmp/write_me.json contains:

[
    {
        "time": "2015-01-01T00:00:00.000Z",
        "name": "write_me",
        "value": 1
    },
    {
        "time": "2015-01-01T00:00:01.000Z",
        "name": "write_me",
        "value": 2
    }
]