read stochastic -source 'cdn'

Simulate hosts and services in a content distribution network (CDN), providing metric and event streams that vary with demand throughout the day.

read stochastic -source 'cdn' source-params read-params

Demo parameters

Parameter	Description	Required?
`-every`	Historic points are downsampled at the specified interval. See Time notation in Juttle for syntax information.	No; by default, the interval is derived from `-max_samples`
`-max_samples`	Historic points are downsampled at an interval calculated so that there are no more than the specified number of intervals.	No; default is 100
`-nhosts`	The number of hosts to simulate	No; the default is 1
`-pops`	A list of Point of Presence (PoP) names to use with -nhosts. Host names are generated by assigning numbers to PoPs in a round-robin fashion, such as pop1.1, pop2.2, pop1.3, pop2.4, and so on.	No
`-host_names`	A list of host names to use, in which case `-nhosts` is ignored and the number of hosts equals the number of specified host names	No; when nhosts is more than 1 and -host_names is omitted, host names are randomly generated
`-service_names`	A list of service names to use, including one or more of the following: search, index, authentication	No; the default is all three service names
`-host_capacities`	The relative capacities of hosts. Daily CDN demand is divided among them in these proportions.	No
`-daily`	The scale factor for the daily demand wave. A value of n will max out n hosts during peak hours.	No
`-dos`	The scale of the DoS demand load on a host, between 0 and 1. A value of 1 will max out any host it hits.	No; the default is 0
`-dos_dur`	The average time that an attack spends on a host before moving to another host. Simulation times increase with this number; keep it to seconds or a few minutes.	No; defaults to 15 seconds
`-dos_id`	The customer ID of the DoS attacker	No; default is 13
`-dos_router`	The method for selecting the next host for the DoS demand. roundrobin cycles through the host list, spending dos_dur on each host; markov makes the shifts less predictable.	No
`-errp`	The syslog error percentage at peak demand	No; defaults to 0.02
`-lpm`	The average number of syslog lines per minute per host	No; defaults to 60
`-debug`	"1" to emit additional points with behind-the-scenes information	No

Read parameters

Parameter	Description	Required?
`-from`	Stream points whose time stamps are greater than or equal to the specified moment. See Time notation in Juttle for syntax information.	Required only if -to is not present; defaults to :now:
`-to`	Stream points whose time stamps are less than the specified moment, which is less than or equal to :now:. See Time notation in Juttle for syntax information. :information_source: `Note:` To stream live data only, omit `-from` and specify `-to :end:`. To combine historical and live data, specify a `-from` value in the past and `-to :end:`.	Required if -from is not present; defaults to :now:
`-last`	Given a duration, shorthand for `-from (:now: - duration) -to :now:`	No
`filter-expression`	A field comparison, where multiple terms are joined with AND, OR, or NOT. See Filtering for additional details.	No

Characteristics of a simulated CDN

Each host in the simulated CDN runs three services:

An index service. Demand for the index service is constant.
An authentication service. Demand for the authentication service is constant.
A search service. Search demand varies through the day (generally highest from noon to six).

The default "network" is a single host experiencing the full daily demand for the CDN. You can increase the number of hosts with -nhosts or -host_names. When nhosts is more than 1, host names are randomly generated, though you can provide an explicit list of names with the -host_names option. The daily demand is divided among these hosts in random proportions unless you specify an explicit list of -host_capacities. The daily demand level can be increased to keep all hosts busier with the -daily scale factor.

In addition to the daily demand cycle, you can enable a Denial-of-Service (DoS) agent that hops from host to host with blasts of requests. Its demand factor ranges from 0 to 1 and is configured with the -dos option; the default, 0, disables the agent. When hosts are under high load, their services begin to degrade and the frequency of error responses increases. You can also control how fast this load shifts between hosts (-dos_dur) and how predictable the shifts are (-dos_router).

Example: Live and historical streams in a simulated CDN

To simulate live data, start the interval now and omit the -to parameter:

read stochastic -source 'cdn' ... -from :now:

To simulate historical data, start the interval anytime in the past and specify -to as either another past moment or :now::

read stochastic -source 'cdn' ... -from :past_moment: -to :now:

To combine historical and live data, start the interval in the past and do not include the -to argument:

read stochastic -source 'cdn' ... -from :2 days ago:

When you combine historical and live data, all data from the -from moment until the present moment is delivered immediately. Then, all subsequent data is delivered in real time.

Data points are produced each time the network is sampled:

For live data, a sample is emitted each second.
For historical data, 60 samples are emitted to cover the specified time range unless you provide a sampling interval with the -every parameter.
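
The sampling interval for historical data can also be set explicitly. The following is a minimal sketch, assuming -every accepts a Juttle duration as the parameter table describes; the one-day range and one-hour interval are illustrative values, not defaults:

```juttle
// Historical CPU samples for the past day, one sample per hour
// (the range and the interval are illustrative values)
read stochastic -source 'cdn' -from :1 day ago: -to :now: -every :1 hour: name = 'cpu'
| view table
```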

Fields in data points in a simulated CDN

All points include these fields:

Field	Description
`time :moment:`	The point's time stamp
`host hostname`	The host that emitted this point
`service index\|search\|authentication`	The service type to which this point belongs; see Characteristics of a simulated CDN
`source_type metric\|event`	The data type for this point
`name string`	The name of this data point
`value float`	This data point's value

In metric data streams generated by the cdn source, the content of the value field depends on the metric:

Metric name	Value field
`requests`	The number of HTTP requests for this service over this period
`responses`	The number of HTTP responses for this service. This metric type also includes a `code` field that is one of 200, 404, or 500. The `value` is the event count for this period.
`response_ms`	The average response time for a request to this service in this period, in milliseconds
`cpu`	The CPU load for this host, as a fraction from 0 to 1
`disk`	The disk load for this host, as a fraction from 0 to 1
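
As an illustration of the `code` field on the `responses` metric, the following sketch totals responses per service and HTTP status code. It uses only the options and fields described on this page; the one-hour range, the `responses` output field name, and the table title are arbitrary choices:

```juttle
// Total responses per service and HTTP status code over the past hour
read stochastic -source 'cdn' -from :1 hour ago: -to :now: -source_type 'metric' name = 'responses'
| reduce responses = sum(value) by service, code
| view table -title 'Responses by Service and Code'
```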

Similarly, in event data streams generated by the cdn source, the content of the value field depends on the event type:

Event type	Value field
`server_error`	For server response events, only HTTP error events are emitted and "value" is the HTTP error code (500). An additional `cust_id` field contains the ID of the requestor.
`syslog`	The log message text
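
Event points can be queried in the same way as metrics. The sketch below counts `server_error` events by host and customer ID while a DoS agent is active. It assumes that `-source_type 'event'` selects events in the same way that `-source_type 'metric'` selects metrics in the examples on this page; the time range and the `errors` field name are illustrative:

```juttle
// Count server_error events per host and requesting customer
// (assumes -source_type 'event' mirrors the 'metric' usage shown elsewhere)
read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -from :10 minutes ago: -to :now: -source_type 'event'
  name = 'server_error'
| reduce errors = count() by host, cust_id
| view table -title 'Server Errors by Host and Customer'
```

With -dos enabled, the customer ID given by -dos_id (13 by default) would be expected to appear among the `cust_id` values.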

Example: set up a CDN source using different options

read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -dos_dur :10 seconds:

read stochastic -source 'cdn' -host_names ["fenge", "geruth", "amled"] -daily 3.0 name ~ '*cpu*'

Example: historic query for a week of data on one host, with graphs

read stochastic -source 'cdn' -from :1 week ago: -to :now: -source_type 'metric'
name = 'requests' OR name = 'cpu' OR name = 'response_ms' OR name = 'disk'
| (
  filter service = 'search' AND name = 'requests'
  | reduce requests_per_sec = avg(value) by host;

  filter name = 'cpu'
  | reduce cpu = avg(value) by host;

  filter service = 'search' AND name = 'response_ms'
  | reduce response_ms = avg(value) by host;

  filter name = 'disk'
  | reduce disk = avg(value) by host;
)
| join host
| view table -title 'Host Metrics (Last Week)' -columnOrder [ 'host' ]

Example: historic and real-time query of three hosts with some chaos from a DoS attack

The simulation renders the past data, then continues ticking along each second with a new sample.

read stochastic -source 'cdn' -nhosts 3 -dos 0.5 -from :1 minute ago: -source_type 'metric'
  name = 'requests' OR name = 'cpu' OR name = 'response_ms' OR name = 'disk'
| (
  filter service = 'search' AND name = 'requests'
  | reduce -every :10s: requests_per_sec = avg(value) by host;

  filter name = 'cpu'
  | reduce -every :10s: cpu = avg(value) by host;

  filter service = 'search' AND name = 'response_ms'
  | reduce -every :10s: response_ms = avg(value) by host;

  filter name = 'disk'
  | reduce -every :10s: disk = avg(value) by host;
)
| pass // required to make the following join work
| join host