Outlier Detection with Monitors

An unusual change or spike in the time series of a key indicator is often the first sign of trouble in a distributed system. Outlier detection with Monitors lets you detect and alert on these unusual changes so that you can investigate quickly. Outlier detection can help when you want to:

  • Configure outlier-based alerts for metrics like latency, traffic volume, and throughput, which don’t have a good static baseline to alert on.
  • Alert another user to anomalies through an outlier-based alert set up in the system. This type of alert must provide all the context necessary to understand and act on the issue.
  • Set the outlier alert to automatically resolve when the anomalous behavior is no longer seen. 

When you add a Monitor, you can choose between a static threshold and an outlier threshold. A static threshold was the only detection method previously offered. An Outlier Detection Monitor is one set to use the outlier detection method. Depending on the type of Monitor, either logs or metrics, you'll have a few different configuration options.

For all the details on Monitors, such as Rules and Limitations, see our Monitors documentation.

Add an Outlier Detection Monitor

On the Monitors page, click the Add button and then New Monitor. The monitor creation dialog box appears.

  1. Select a Monitor Type, either Logs or Metrics.

  2. Choose Outlier as the Detection Method.

  3. Provide a Query. A Logs Monitor can have one query up to 4,000 characters long. Metrics Monitors can specify up to six queries. When providing multiple metrics queries, use the letter labels to reference a query row; see joined metrics queries for details. The monitor will automatically deduce the query row to use for the trigger.

  4. Select the Direction that you want to track.

    • Up. Only get alerted if there is an abnormal increase in the tracked key indicator. 
    • Down.  Only get alerted if there is an abnormal decrease in the tracked key indicator. 
    • Both. Get alerted if there is any abnormality in the data whether an increase or a decrease.
    • If you are adding a Logs Monitor, you need to select the field to trigger alerts on.

  5. Specify the Trigger Type. A monitor can have one critical, warning, and missing data trigger condition, each with one or more notification destinations.

    Triggers have different options depending on the query and alert type. The sections below give the configuration details for each query type.

Logs query

Trigger Type: Critical and Warning

Alert when result is greater than or equal to <threshold> standard deviations from baseline for <consecutive> consecutive out of <window> data points

Parameter     Description
threshold     The number of standard deviations for calculating violations. The default is 3.0.
consecutive   The required number of consecutive indicator data points (outliers) to trigger a violation.
window        The number of data points used to calculate the baseline for outlier detection.
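
The trigger condition above can be read as a rolling check over the data points. The following Python sketch illustrates that reading under stated assumptions; it is not Sumo Logic's implementation. It assumes the baseline is the mean and population standard deviation of the most recent `window` points that were not themselves flagged as outliers, and the names `is_outlier`, `find_violation`, and the `direction` argument are hypothetical, chosen only for illustration.

```python
from statistics import mean, pstdev


def is_outlier(value, baseline, threshold=3.0, direction="both"):
    """Return True if value is at least `threshold` standard deviations from the baseline."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return False  # assumption: a perfectly flat baseline cannot score a deviation
    z = (value - mu) / sigma
    if direction == "up":
        return z >= threshold
    if direction == "down":
        return z <= -threshold
    return abs(z) >= threshold  # "both"


def find_violation(series, threshold=3.0, consecutive=2, window=5, direction="both"):
    """Return the index where `consecutive` outliers in a row are first seen, or None."""
    baseline = list(series[:window])  # seed the baseline with the first `window` points
    run = 0
    for i in range(window, len(series)):
        value = series[i]
        if is_outlier(value, baseline, threshold, direction):
            run += 1
            if run >= consecutive:
                return i
        else:
            run = 0
            # assumption: only normal points slide the baseline forward,
            # so an outlier does not inflate the deviation used for the next point
            baseline = baseline[1:] + [value]
    return None
```

For example, with the series `[10, 11, 10, 9, 10, 11, 10, 9, 10, 50, 52]`, a threshold of 3.0, `consecutive=2`, and `window=5`, `find_violation` returns index 10, the second of the two spikes. With `direction="down"` it returns `None`, because both spikes are increases.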

Recover

The recovery condition will always be the opposite of the alerting condition. For example, if there is no outlier identified for the duration of the detection window from the time the alert was first fired, then the Monitor will be brought back to the normal state. You cannot customize the resolution condition for the Monitor.
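
One way to picture the alert and recovery cycle is as a small state machine over per-point outlier flags. The sketch below is an interpretation, not the product's logic: it assumes the Monitor fires after `consecutive` outliers in a row and returns to normal once `window` data points in a row show no outlier. The function name `monitor_states` and the state labels are hypothetical.

```python
def monitor_states(outlier_flags, consecutive=2, window=5):
    """Map per-point outlier flags to monitor states ("Normal" or "Critical")."""
    states = []
    alerting = False
    outlier_run = 0  # consecutive outliers seen so far
    clean_run = 0    # consecutive non-outliers seen so far
    for flag in outlier_flags:
        if flag:
            outlier_run += 1
            clean_run = 0
        else:
            outlier_run = 0
            clean_run += 1
        if not alerting and outlier_run >= consecutive:
            alerting = True   # alert condition met
        elif alerting and clean_run >= window:
            alerting = False  # recovery: the opposite of the alert condition
        states.append("Critical" if alerting else "Normal")
    return states
```

Because recovery is derived entirely from the alert parameters, the sketch takes no separate recovery setting, which mirrors the fact that the resolution condition cannot be customized.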


Trigger Type: Missing Data

Alert when missing data within <time range>

Parameter    Description
Time range   The time period to use for calculating the baseline outlier detection. Select either 5 minutes, 10 minutes, 15 minutes, 30 minutes, 1 hour, 6 hours, 12 hours, or 24 hours.

Recover

  • Automatically: Sumo Logic automatically resolves the incident when the resolution condition is satisfied. Recover automatically when data becomes available for the affected time span.

Metrics query

Trigger Type: Critical and Warning

Alert when result is greater than or equal to <threshold> standard deviations from baseline for <time range>

Parameter    Description
Threshold    The number of standard deviations for calculating violations. The default is 3.0.
Time range   The duration of time to evaluate. Select either 5 minutes, 10 minutes, 15 minutes, 30 minutes, 1 hour, or 24 hours.

Recover

The recovery condition will always be the opposite of the alerting condition. For example, if there is no outlier identified for the duration of the detection window from the time the alert was first fired, then the Monitor will be brought back to the normal state. You cannot customize the resolution condition for the Monitor.

Trigger Type: Missing Data

Alert when missing data <occurrence type> for <time range>

Parameter         Description
Occurrence type   The time condition you want for the trigger. Choose either for all or for any.
                  If you choose all, you will get notified when all of the metrics meeting the query condition are not sending data in the given time range.
                  Alternatively, you can choose any if you want to get notified when one of the metrics does not receive any data in the given time range. This option requires at least one initial data point.
Time range        The duration of time to evaluate. Select either 5 minutes, 10 minutes, 15 minutes, 30 minutes, 1 hour, 6 hours, 12 hours, or 24 hours.
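
The difference between the all and any occurrence types can be sketched in a few lines of Python. This is an illustration under assumptions, not the product's logic: `missing_data_alert` and its `last_seen` argument (a mapping from each time series to the timestamp of its latest data point) are hypothetical names.

```python
from datetime import datetime, timedelta


def missing_data_alert(last_seen, now, time_range, occurrence="all"):
    """Decide whether a missing data trigger would fire.

    last_seen maps each time series name to the timestamp of its latest
    data point, so every series in it has sent at least one data point.
    """
    cutoff = now - time_range
    # series that have sent nothing inside the evaluated time range
    silent = [name for name, ts in last_seen.items() if ts < cutoff]
    if occurrence == "all":
        # fire only when every matching series has gone quiet
        return len(last_seen) > 0 and len(silent) == len(last_seen)
    # "any": fire as soon as one series has gone quiet
    return len(silent) > 0


now = datetime(2021, 1, 1, 12, 0)
last_seen = {
    "host-a": now - timedelta(minutes=2),   # still reporting
    "host-b": now - timedelta(minutes=20),  # quiet for 20 minutes
}
fires_any = missing_data_alert(last_seen, now, timedelta(minutes=15), "any")
fires_all = missing_data_alert(last_seen, now, timedelta(minutes=15), "all")
```

With a 15 minute time range, `fires_any` is `True` because `host-b` has gone quiet, while `fires_all` is `False` because `host-a` is still sending data.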

Recover

  • Automatically: Sumo Logic automatically resolves the incident when the resolution condition is satisfied. Recover automatically when data becomes available for the affected time span.

  6. (Optional) Set Notifications. When a trigger condition is met, you can send notifications to other people and services. To add notifications, click the Add Notification button. You can add more than one notification channel for a monitor.

    Metrics Monitors have an option to send notifications either as a group or separately. Group Notifications determines whether you receive a separate notification for each time series that matches the monitor query or a single notification for the entire monitor.

    1. The Connection Type specifies the notification channel where you want to get notified, such as an email or webhook. See Connections for details.

      Monitor notifications support variables to reference monitor configuration settings or your raw data. See alert variables for a table of the available variables.

      • Email. You must provide one or more recipient email addresses. You can customize the email Subject and Body.

      • Webhook. By default, the payload defined on the Connection is used. You can customize your payload for each notification if needed.

    2. Select the Alert and Recovery checkboxes for each trigger type based on when you want to send a notification. You can have different Trigger Conditions send a notification to different channels. For example, you can get notified on PagerDuty for Critical incidents and get an email or Slack notification for Warning incidents.

      If your connection type is Microsoft Teams, OpsGenie, PagerDuty, or Slack, the Recovery checkbox enables an automatic resolution process that updates the connection when an alert has recovered within Sumo Logic. Support for other connection types is coming soon.

    3. Click Add Notification to add more notification channels as needed. You can configure different notifications for each trigger type: critical, warning, and missing data.

  7. Enter a Name for the monitor and the Location you want it saved to. A Description is optional.
