Sumo Logic

Lab 18 - Create Alerts with Context

Alerting on simple error counts with a static threshold can yield false positives (Fig. 1 below). In this lab, you will instead learn to create an alert that notifies you when your errors increase at a higher rate than your overall traffic (Fig. 2). For a more in-depth explanation, check out this blog post on creating meaningful alerts.

Fig. 1 & Fig. 2

Alert notification graphic with context

  1. Using Labs/Apache/Access data, search only for messages with status code 200 or 404. 404 messages are your errors, and 200 messages give you a sense of the overall traffic.

  2. Count 200 messages as Successes and 404 messages as Fails.

  3. Sum Successes and Fails separately by timeslice to get counts that show the trend over time.

  4. Create a ratio of fails to successes.

  5. Use the outlier operator to identify anomalies in the ratio.

_sourceCategory=Labs/Apache/Access (status_code=200 or status_code=404)

| timeslice 1m

| if (status_code="200", 1, 0) as successes

| if (status_code="404", 1, 0) as fails

| sum(successes) as success_cnt, sum(fails) as fail_cnt by _timeslice

| fail_cnt/success_cnt as failure_rate

| sort _timeslice desc

| outlier failure_rate window=5, threshold=3, consecutive=1, direction=+
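To see why the ratio is more meaningful than a raw error count, here is a minimal Python sketch of what the query computes up to failure_rate. The sample records, the minute numbers, and the function name failure_rates are hypothetical and only for illustration. The guard for minutes with no 200 messages is a choice made for this sketch, not something the query above does.

```python
from collections import Counter

def failure_rates(records):
    """Return {minute: (success_cnt, fail_cnt, failure_rate)} for 200/404 records."""
    successes = Counter()
    fails = Counter()
    for minute, status in records:
        if status == 200:
            successes[minute] += 1
        elif status == 404:
            fails[minute] += 1
    result = {}
    for minute in sorted(set(successes) | set(fails)):
        s, f = successes[minute], fails[minute]
        # Sketch-only guard: avoid dividing by zero when a minute has no 200s.
        rate = f / s if s else None
        result[minute] = (s, f, rate)
    return result

# Hypothetical data, one (minute, status_code) pair per message:
# minute 0: normal traffic
# minute 1: traffic triples and errors triple with it
# minute 2: normal traffic, but errors triple
records = (
    [(0, 200)] * 100 + [(0, 404)] * 5
    + [(1, 200)] * 300 + [(1, 404)] * 15
    + [(2, 200)] * 100 + [(2, 404)] * 15
)

rates = failure_rates(records)
for minute, (s, f, rate) in rates.items():
    print(minute, s, f, rate)
```

In this sample, a static threshold such as "more than 10 errors per minute" would fire in both minute 1 and minute 2, even though minute 1 is only a traffic surge. The ratio stays flat in minute 1 and rises only in minute 2, which is the case you want to be alerted on.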

  6. Adding the following where clause keeps only the outliers (where the ratio increases more than normal). Using your email address, you can now create a Scheduled Search that alerts when this query returns results.

| where failure_rate_violation > 0
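The outlier step and the final filter can be approximated with a rolling baseline. The Python sketch below is a simplified illustration, not Sumo Logic's implementation. It assumes the baseline is the mean and population standard deviation of the preceding window points, excluding the current one, and that points without a full window of history are never flagged. The function name flag_outliers and the sample values are hypothetical.

```python
from statistics import mean, pstdev

def flag_outliers(values, window=5, threshold=3.0):
    """Flag points above mean + threshold * stddev of the preceding `window` points."""
    flags = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            # Assumption for this sketch: no baseline yet, so no flag.
            flags.append(0)
            continue
        upper = mean(history) + threshold * pstdev(history)
        # Mirrors direction=+ : only upward deviations count.
        # Mirrors consecutive=1 : a single point above the band is enough.
        flags.append(1 if value > upper else 0)
    return flags

# Hypothetical per-minute failure rates with one spike.
failure_rate = [0.05, 0.06, 0.05, 0.04, 0.05, 0.05, 0.15, 0.05]
violation = flag_outliers(failure_rate, window=5, threshold=3.0)

# Equivalent of "| where failure_rate_violation > 0": keep only flagged minutes.
alerts = [(i, r) for i, (r, v) in enumerate(zip(failure_rate, violation)) if v > 0]
print(alerts)
```

Only the spike to 0.15 rises above the band built from the previous five points, so it is the only row left after the filter. A Scheduled Search that alerts when the query returns results would fire for that minute alone.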