Filtering
In order to pare down a dataset, Juttle supports various ways of filtering the data.
How to Filter
There are two basic forms of filtering in Juttle.
At any point in the flowgraph, you can can use the filter processor to restrict the points that will flow to those that match the given predicate. For example:
emit -from :0: -limit 10
| put i=count(), even = (i % 2 == 0)
| filter even == true
| view table
Will output:
┌────────────────────────────────────┬──────────┬──────────┐
│ time │ even │ i │
├────────────────────────────────────┼──────────┼──────────┤
│ 1970-01-01T00:00:01.000Z │ true │ 2 │
├────────────────────────────────────┼──────────┼──────────┤
│ 1970-01-01T00:00:03.000Z │ true │ 4 │
├────────────────────────────────────┼──────────┼──────────┤
│ 1970-01-01T00:00:05.000Z │ true │ 6 │
├────────────────────────────────────┼──────────┼──────────┤
│ 1970-01-01T00:00:07.000Z │ true │ 8 │
├────────────────────────────────────┼──────────┼──────────┤
│ 1970-01-01T00:00:09.000Z │ true │ 10 │
└────────────────────────────────────┴──────────┴──────────┘
In addition, most adapters take a filter expression options that are given as part of the invocation of read and turn that into a corresponding query (or queries) to the backend.
For example, assuming you have configured an elasticsearch adapter, then the following juttle will translate the juttle query to search for all documents with a timestamp within the last hour and a message field containing the string "error" and an app field that contains "syslog":
read elastic -from :1 hour ago: -to :now: message~"*error*" app="syslog"
This could have instead been written as:
read elastic -from :1 hour ago: -to :now: | filter message~"*error*" app="syslog"
While these would produce the same results, in the latter case the adapter would pull all of the documents for the last hour out of elasticsearch and into the Juttle runtime where they would be filtered, unlike the former example which sends the query to elasticsearch for execution.
Field comparisons
The basic form of field comparisons is:
field operator expression
or
expression operator field
Depending on the context, you may either be able to reference fields by name or you may use the field reference operators.
The valid comparison operators include:
Operator | Description | Examples |
---|---|---|
=, == | Matches exactly | hostname = "server1" |
!= | Does not match | hostname != "server-" + server_id |
<, <=, >, >= | Is less than, is less than or equal to, is greater than, is greater than or equal to | cpu >= 1 + Math.max(4*20, 79) cpu < max_cpu - 10 |
~, =~ | Wildcard operator for matching with "glob" or regular expressions | True if the value of the "hostname" field is "server" followed by any number of characters: hostname ~ "server*" True if the value of the "hostname" field contains alphanumeric characters: hostname ~ /[A-Za-z0-9]*/ |
!~ | Wildcard negation operator | True if the value of the "hostname" field does NOT begin with "server": hostname !~ "server*" |
in | Check for inclusion in an array | True if the value of "hostname" field is one of "host1", "host2", or the value of the "server" field: hostname in ["host1", host2", server] |
See operators reference for more information.
Filter Expressions
Field comparisons can be combined using the boolean operators AND
, OR
, and NOT
, and can be nested using parentheses. Note that AND
is implicitly added between two field comparison statements.
For example the following will read all points using a hypothetical adapter called email
containing the subject "hello", where the spam rating is either 0 or 1 and the sender is not "self":
read email subject~"*hello*" (spam=0 OR spam=1) AND NOT sender="self"
Full-text search
Juttle supports backend storage systems such as elasticsearch that implement full-text search across all fields in a document through lexical analysis. Full-text searches match any point in which the string returned by expression is present in any field.
The filter processor does not support full-text search -- it can only be used as part of a read from an external backend that supports search.
The search terms for full-text search are currently expressed as standalone strings in the filter expression for read
. Search terms and other filter expressions can be combined with the AND, OR, NOT operators.
The syntax for full-text search will be changed soon to add a ?
operator.
For example the following searches all documents in the last day for the term "alarm":
read elastic -last :1 day: "alarm"
And the following searches all documents in the last day containing the term "alarm" and where the env
field is not equal to "test".
read elastic -last :1 day: 'alarm' AND NOT env = 'test'
Quoted terms match exact phrases only
For example, the following matches points in which one or more fields contain the exact phrase "alarm failed":
read elastic -last :1 day: "alarm failed"
It does not match points in which one field contains "alarm" and another field contains "failed", nor does it match points in which a field contains "alarm has failed". To match those points, use this instead:
read elastic -last :1 day: "alarm" "failed"
Terms analysis
There are many different ways in which a backend storage system may map an incoming document into terms that are available for full-text search.
From the standpoint of Juttle, the terms are passed through to the back end and the specific matching is implemented there.
Search is not available in filter
The filter processor does not implement full-text search. It is only available when interacting with a suitable backend.