Skip to main content
Sumo Logic

fillmissing

When you run a standard group-by query, Sumo Logic only returns non-empty groups in the results. For example, if your query is grouping by timeslice, then only the timeslices that have data are returned.

This can be a problem because:

  • The lack of data is sometimes also an interesting event, but there is no easy way to capture this information. For example, the outlier operator cannot catch anomalies arising from missing data because it can only mark an existing timeslice as anomalous.
  • Missing data can lead to misleading visualizations. For example, if you plot a line chart across timeslices with missing data, the chart will interpolate across the missing timeslices and represent them deceptively as nonempty.

The fillmissing operator addresses this shortcoming by allowing you to specify groups that should be represented in the output, even if those groups have no data.

How it works

The fillmissing operator allows you to define generators over the fields in the output. Each generator applies to one field and enumerates all the values that you would like to appear in the output for that field, even if some of those values are not present in data.

You can define multiple generators, which enumerate tuples for every combination of the values enumerated by each of the generators (i.e., the Cartesian product). For example, if you used the following query:

| fillmissing values("a1", "a2") in A,
              values("b1") in B,
              values("c1", "c2", "c3") in C

The generators for the three fields (A, B, and C) enumerate the tuples:

A B C
a1 b1 c1
a1 b1 c2
a1 b1 c3
a2 b1 c1
a2 b1 c2
a2 b1 c3

Fields with generators defined are called key fields. The remaining fields are called non-key fields.

Given the list of generators, fillmissing ensures that every tuple enumerated by the generators are present in the output. In particular, when one of the enumerated tuples is missing, fillmissing will append a record to the output with the missing values for the key fields and some constant default value for the non-key fields.

Generators

The fillmissing operator currently supports the following types of generators:

  • Timeslice. Enumerates all the timeslices with a given granularity in a query time range. For example, timeslice(15m) enumerates all the 15-minute timeslices in the query time range.
  • Values. Enumerates the fixed set of values given in arguments. For example, values("a", "b", "c") enumerates the values "a", "b", and "c". Currently, only string literals are supported for the arguments.

Default values for non-key fields

When fillmissing appends a record to the output, the key fields of the record contain the missing values, while the remaining fields contain some constant value. You can configure the constant value for those fields. If you don't, a default value is assigned that depends on the type of the field:

Field Type Default Value
integer 0
double 0.0
boolean false
String (or other) null

 

Syntax

This section describes the syntax for the fillmissing operator.

fillmissing <keyFieldGenerators> [ with <nonKeyFieldSpecs> ]

The keyFieldGenerators is comma-separated list of generators, one for each key field. Currently, two generators are supported; their syntax looks like:

<timesliceGenerator> ::= timeslice ( <granularity> ) [ in <field> ]   <valuesGenerator>    ::= values ( "value1", "value2", ... ) in <field>

The nonKeyFieldSpecs are optional. They allow you to configure the default constant values for one or more non-key fields. The syntax for each specification looks like:

<constantNonKeySpec> ::= <constantValue> for <field>
<constantValue>      ::= <double> | <int> | <stringLiteral> | null

Rules

  • In Live Dashboards, you must use the fillmissing operator after an aggregate operator. 

Examples

Timeslice generator

This example query counts the number of login events in 15-minute timeslices. Notice that in the query with fillmissing, timeslices with a count of zero are shown in the output.

Notice also that for the timeslice generator, the key field name is optional. (It defaults to _timeslice.)

Without Fillmissing With Fillmissing
login
| timeslice 15m
| count by _timeslice
| sort by _timeslice
login
| timeslice 15m
| count by _timeslice
| fillmissing timeslice(15m)
| sort by _timeslice
fillmissing-example-1-without.png fillmissing-example1-with.png

Value generator

This example query counts by a discrete field (type). Here, you can use the value() generator to enumerate the three types (web, api, and internal) that are required in the output.

Notice also how we changed the default value of _count from 0 to -1.

Without Fillmissing With Fillmissing
login
| count by type
login
| count by type
| fillmissing values("web", "api", 
    "internal") in type
  with -1 for type
fillmissing-example2-without.png fillmissing-example2-with.png

Multiple generators and transpose

This example shows how multiple generators can be used to enumerate every combination of the required values in two key fields (type and _timeslice). Notice that while transpose can show some of the missing values, it misses the rows where none of the types have any data.

Without Fillmissing With Fillmissing
login
| parse "Completed in * ms." as latency
| timeslice 15m
| pct(latency, 99) by type, _timeslice
| transpose row _timeslice column type
login
| parse "Completed in * ms." as latency
| timeslice 15m
| pct(latency, 99) by type, _timeslice
| fillmissing timeslice(15m),
  values("rework", "view", "mix") in type

| transpose row _timeslice column type
fillmissing-example3-none.png fillmissing-example3-before.png

Note that in this simple example, you can achieve a similar effect without the need to specify all the expected values for the type field, by applying the fillmissing operator after the transpose, like this:

login
| parse "Completed in * ms." as latency
| timeslice 15m
| pct(latency, 99) by type, _timeslice
| transpose row _timeslice column type
| fillmissing timeslice(15m) 

However, the filled-in fields will always be null (instead of 0 like in the previous example). Currently, there is no way to change to default null value.