read stochastic -source 'cdn'

Simulate hosts and services in a content distribution network (CDN), providing metric and event streams that vary with demand throughout the day.

read stochastic -source 'cdn' source-params read-params

Demo parameters

| Parameter | Description | Required? |
| --- | --- | --- |
| -every | Historic points are downsampled at the specified interval. See Time notation in Juttle for syntax information. | No; by default the interval is derived from -max_samples |
| -max_samples | Historic points are downsampled at an interval calculated so that there are no more than the specified number of intervals | No; default is 100 |
| -nhosts | The number of hosts to simulate | No; default is 1 |
| -pops | A list of Point of Presence (PoP) names to use with -nhosts. Host names are generated by assigning numbers to PoPs in a round-robin fashion, such as pop1.1, pop2.2, pop1.3, pop2.4, and so on. | No |
| -host_names | A list of host names to use, in which case -nhosts is ignored and the number of hosts equals the number of specified host names | No; when -nhosts is more than 1 and -host_names is omitted, host names are randomly generated |
| -service_names | A list of service names to use, including one or more of the following: search, index, authentication | No; default is all three service names |
| -host_capacities | The relative capacities of hosts. Daily CDN demand is divided among the hosts in these proportions. | No |
| -daily | The scale factor for the daily demand wave. A value of n maxes out n hosts during peak hours. | No |
| -dos | The scale of the DoS demand load on a host, between 0 and 1. A value of 1 maxes out any host it hits. | No; default is 0 |
| -dos_dur | The average time that an attack spends on a host before moving to another host. Simulation times increase with this number; keep it to seconds or a few minutes. | No; default is 15 seconds |
| -dos_id | The customer ID of the DoS attacker | No; default is 13 |
| -dos_router | The method for selecting the next host for the DoS demand. roundrobin cycles through the host list, spending -dos_dur on each host; markov makes the shifts less predictable. | No |
| -errp | The syslog error percentage at peak demand | No; default is 0.02 |
| -lpm | The average number of syslog lines per minute per host | No; default is 60 |
| -debug | Set to "1" to emit additional points with behind-the-scenes information | No |
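
Several of these parameters can be combined in one read. The following is a minimal, untested sketch. It assumes that list-valued options such as -pops and -service_names take the same bracketed list syntax as the -host_names example later on this page. The PoP names 'sfo' and 'nyc' are made up for illustration.

```juttle
// Simulate four hosts spread across two hypothetical PoPs,
// emitting points for the search and index services only.
read stochastic -source 'cdn' -nhosts 4 -pops ['sfo', 'nyc'] -service_names ['search', 'index'] -from :1 hour ago: -to :now:
| view table
```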

Read parameters

| Parameter | Description | Required? |
| --- | --- | --- |
| -from | Stream points whose time stamps are greater than or equal to the specified moment. See Time notation in Juttle for syntax information. | Required only if -to is not present; defaults to :now: |
| -to | Stream points whose time stamps are less than the specified moment, which is less than or equal to :now:. See Time notation in Juttle for syntax information. Note: To stream live data only, omit -from and specify -to :end:. To combine historical and live data, specify a -from value in the past and -to :end:. | Required if -from is not present; defaults to :now: |
| -last | Given a duration, shorthand for -from (:now: - duration) -to :now: | No |
| filter-expression | A field comparison, where multiple terms are joined with AND, OR, or NOT. See Filtering for additional details. | No |
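
As an illustration of the read parameters, the sketch below requests the same one-hour window in two ways and then adds a filter expression. It is not taken from the adapter's own examples. The table titles are arbitrary, and the field name used in the filter comes from the field tables later on this page.

```juttle
// Both reads request the most recent hour of simulated data.
read stochastic -source 'cdn' -last :1 hour:
| view table -title 'Using -last';

read stochastic -source 'cdn' -from :1 hour ago: -to :now:
| view table -title 'Using -from and -to';

// A filter expression follows the options; terms are joined with AND, OR, or NOT.
read stochastic -source 'cdn' -last :1 hour: name = 'cpu' OR name = 'disk'
| view table -title 'CPU and disk only'
```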

Characteristics of a simulated CDN

Each host in the simulated CDN runs three services:

  • An index service. Demand for the index service is constant.

  • An authentication service. Demand for the authentication service is constant.

  • A search service. Search demand varies through the day (generally highest from noon to six).

The default "network" is a single host experiencing the full daily demand for the CDN. You can increase the number of hosts with -nhosts or -host_names. When -nhosts is more than 1, host names are randomly generated unless you provide an explicit list of names with -host_names. The daily demand is divided among these hosts in random proportions unless you specify an explicit list of -host_capacities. To keep all hosts busier, increase the daily demand level with the -daily scale factor.

In addition to the daily demand cycle, you can enable a Denial-of-Service (DoS) agent that hops from host to host with blasts of requests. Its demand factor ranges from 0 to 1 and is configured with the -dos option; the default, 0, disables the agent. You can also control how fast this load shifts between hosts (-dos_dur) and how predictable the shifts are (-dos_router). When hosts are under high load, their services begin to degrade and the frequency of error responses increases.
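
One way to observe the effect of the DoS agent is to chart response times per host while the agent is active. This is an untested sketch with two assumptions: that -dos_router accepts the string 'markov' (inferred from the parameter description), and that the timechart view is available in your environment.

```juttle
// Three hosts, with a DoS agent at half strength that stays on each
// host for about 10 seconds before moving on.
// Assumption: 'markov' is passed to -dos_router as a string.
read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -dos_dur :10 seconds: -dos_router 'markov' -from :10 minutes ago: -to :now: name = 'response_ms'
| reduce -every :1 minute: response_ms = avg(value) by host
| view timechart -title 'Response time under DoS load'
```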

Example: Live and historical streams in a simulated CDN

To simulate live data, start the interval now and omit the -to parameter:

read stochastic -source 'cdn' ... -from :now:

To simulate historical data, start the interval at any time in the past and set -to to either another past moment or :now:, for example:

read stochastic -source 'cdn' ... -from :past_moment: -to :now:

To combine historical and live data, start the interval in the past and do not include the -to argument:

read stochastic -source 'cdn' ... -from :2 days ago:

When you combine historical and live data, all data from the -from moment until the present moment is delivered immediately. Then, all subsequent data is delivered in real time.

Data points are produced each time the network is sampled:

  • For live data, a sample is emitted each second.
  • For historical data, 60 samples are emitted to cover the specified time range unless you provide a sampling interval with the -every parameter.
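
To control the spacing of historical samples, pass -every with a duration. A minimal sketch, with an interval chosen only for illustration:

```juttle
// One day of history, sampled every 15 minutes rather than at the default spacing.
read stochastic -source 'cdn' -from :1 day ago: -to :now: -every :15 minutes: name = 'requests'
| view table
```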

Fields in data points in a simulated CDN

All points include these fields:

| Field | Type or values | Description |
| --- | --- | --- |
| time | :moment: | The point's time stamp |
| host | hostname | The host that emitted this point |
| service | index, search, or authentication | The service type to which this point belongs; see Characteristics of a simulated CDN |
| source_type | metric or event | The data type for this point |
| name | string | The name of this data point |
| value | float | This data point's value |

In metric data streams generated by the cdn source, the meaning of the value field depends on the metric:

| Metric name | Value field |
| --- | --- |
| requests | The number of HTTP requests for this service over this period |
| responses | The number of HTTP responses for this service. This metric type also includes a code field that is one of 200, 404, or 500. The value is the event count for this period. |
| response_ms | The average response time for a request to this service in this period, in milliseconds |
| cpu | The CPU load for this host, as a fraction from 0 to 1 |
| disk | The disk load for this host, as a fraction from 0 to 1 |
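
Because the responses metric carries a code field, it can be grouped by status code. The sketch below relies only on the field names described above and has not been run. The output field name responses and the table title are arbitrary.

```juttle
// Total the simulated HTTP responses per status code over the last hour.
read stochastic -source 'cdn' -last :1 hour: name = 'responses'
| reduce responses = sum(value) by code
| view table -title 'Responses by status code'
```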

Similarly, in event data streams generated by the cdn source, the meaning of the value field depends on the event type:

| Event type | Value field |
| --- | --- |
| server_error | For server response events, only HTTP error events are emitted, and the value is the HTTP error code (500). An additional cust_id field contains the ID of the requestor. |
| syslog | The log message text |
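
The cust_id field on server_error events makes it possible to see which requestors trigger the most errors, for example the DoS attacker configured with -dos_id. This is a hedged sketch. It assumes -source_type accepts 'event' in the same way that the examples below pass 'metric'.

```juttle
// Count server_error events per requesting customer over the last hour.
// Assumption: -source_type 'event' selects event points only.
read stochastic -source 'cdn' -last :1 hour: -source_type 'event' name = 'server_error'
| reduce errors = count() by cust_id
| view table -title 'Server errors by customer'
```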

Example: set up a CDN source using different options

read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -dos_dur :10 seconds:
read stochastic -source 'cdn' -host_names ["fenge", "geruth", "amled"] -daily 3.0  name ~ '*cpu*'

Example: historic query for a week of data on one host, with graphs

read stochastic -source 'cdn' -from :1 week ago: -to :now: -source_type 'metric'
name = 'requests' OR name = 'cpu' OR name = 'response_ms' OR name = 'disk'
| (
  filter service = 'search' AND name = 'requests'
  | reduce requests_per_sec = avg(value) by host;

  filter name = 'cpu'
  | reduce cpu = avg(value) by host;

  filter service = 'search' AND name = 'response_ms'
  | reduce response_ms = avg(value) by host;

  filter name = 'disk'
  | reduce disk = avg(value) by host;
)
| join host
| view table -title 'Host Metrics (Last Week)' -columnOrder [ 'host' ]

Example: historic and real-time query of three hosts with some chaos from a DoS attack

The simulation renders the past data, then continues ticking along each second with a new sample.

read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -from :1 minute ago: -source_type 'metric'
  name = 'requests' OR name = 'response_ms' OR name = 'cpu' OR name = 'disk'
| (
  filter service = 'search' AND name = 'requests'
  | reduce -every :10s: requests_per_sec = avg(value) by host;

  filter name = 'cpu'
  | reduce -every :10s: cpu = avg(value) by host;

  filter service = 'search' AND name = 'response_ms'
  | reduce -every :10s: response_ms = avg(value) by host;

  filter name = 'disk'
  | reduce -every :10s: disk = avg(value) by host;
)
| pass // required to make the following join work
| join host