Sumo Logic

Lab 8 - TravelLogic Metrics

This lab shows an example of how to use business and operational data from a fictitious travel application to populate metrics.

You can also use Sumo to collect custom business and operational data that is coded into your organization’s applications.

In this lab, we will look at metrics from our TravelLogic Demo. 

Let's begin by testing the Graphite Metrics Rule we just reviewed during Training.

  1. First, let's query a metric using its raw Graphite name. In a metrics query, enter the following:

    travel.training.counters.travel-checkout*.bookings.success.count

  2. In a second query, run the following to query the exact same metric. This time, however, we take advantage of the key-value tags created by the Metrics Rule:

    type=bookings metric=success.count

    step2.png

  3. You will notice you are graphing the exact same metric twice. You can check this by clicking on the Legend tab and verifying the metric's raw name, plus all other metadata fields.
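
The mapping from a raw Graphite name to key-value tags can be sketched in Python. This is only an illustration of the idea, not how Sumo Logic implements Metrics Rules: the function `graphite_to_tags`, the host suffix `-1`, and every tag key other than `type` and `metric` are hypothetical.

```python
def graphite_to_tags(name, fields):
    """Map the dot-separated segments of a Graphite name onto tag keys.

    `fields` lists one key per leading segment. The last key absorbs all
    remaining segments, so a value such as 'success.count' stays whole.
    """
    parts = name.split(".", len(fields) - 1)
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} segments, got {len(parts)}")
    return dict(zip(fields, parts))


# Hypothetical tag keys; only 'type' and 'metric' appear in this lab.
FIELDS = ["app", "env", "kind", "host", "type", "metric"]

# Hypothetical concrete host; the lab query uses the wildcard travel-checkout*.
raw_name = "travel.training.counters.travel-checkout-1.bookings.success.count"

tags = graphite_to_tags(raw_name, FIELDS)
print(tags["type"], tags["metric"])
```

Once a name is split this way, a query can select on `type` and `metric` without knowing the position of each segment in the raw name.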

Now let's learn how to use some additional operators. In the next steps we'll plot data from an online travel website to determine successful versus unsuccessful bookings.

  1. In a new Metrics tab, add a query to search for all your successful bookings for the last 60 minutes:

    type=bookings metric=success.count

  2. In a second query underneath the first one, search for all failed bookings:

    type=bookings metric=fail.count

    step6.png

  3. Click on the Settings tab to see what options are available to you. For example, you can change the chart type (line or area), the color palette, the line width, and the axis labels and scales.

  4. Explore the Legend tab, which allows you to view the details of every time series charted for each metric.

  5. Back in the query tab, toggle off the success.count query by clicking on its orange icon. The chart will now show only the fail.count metric.

    Any pink dots in your chart are identifying outliers in your data. You can edit the settings for your outliers by clicking the pink dot on the top right and editing the Outliers settings: how many outliers to show (Top) and how many standard deviations to use when considering outliers (Threshold).


    step10.png
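
To make the two outlier settings concrete, here is a minimal Python sketch, assuming a simple global mean and standard deviation. The product's actual outlier detection may work differently; the function `find_outliers` and its parameter names are hypothetical and only mirror the Threshold and Top settings described above.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=3.0, top=10):
    """Return up to `top` (index, value) pairs that lie more than
    `threshold` standard deviations from the mean, most extreme first."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # a flat series has no outliers
    scored = [(abs(v - mu) / sigma, i, v) for i, v in enumerate(values)]
    flagged = sorted((s for s in scored if s[0] > threshold), reverse=True)
    return [(i, v) for _, i, v in flagged[:top]]


# Made-up failure counts with one obvious spike at the end.
fail_counts = [2, 3, 2, 3, 2, 3, 2, 3, 2, 40]
print(find_outliers(fail_counts, threshold=2.0, top=1))
```

Lowering the threshold flags more points, while Top caps how many of the flagged points are shown.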


Lastly, click on the 3 grey dots in the top right corner to view the query info, refresh the query, or add this chart to a Dashboard.

Let's now compare KPIs at different time periods using the timeshift operator. The timeshift operator shifts the time series of your query. It's very useful to compare across multiple time periods.

  1. In a new Metrics tab, add a query to search for your mean latency for the last 60 minutes:

    metric=latency.mean

  2. In a second query, compare that with your latency from 1 day ago:

    metric=latency.mean | timeshift 1d

step13.png
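
Conceptually, a time shift moves older data points forward so they line up with the current time range. The following Python sketch illustrates that idea with made-up latency values; the function name `timeshift` is borrowed from the operator for readability, and the code is an assumption about the concept, not the operator's implementation.

```python
from datetime import datetime, timedelta


def timeshift(series, offset):
    """Return a copy of `series` with every timestamp moved forward by `offset`."""
    return [(ts + offset, value) for ts, value in series]


now = datetime(2024, 1, 2, 12, 0)  # arbitrary example time
one_day = timedelta(days=1)

# Made-up mean latency samples: two from today, two from the same times yesterday.
today = [(now, 120.0), (now + timedelta(minutes=1), 135.0)]
yesterday = [(now - one_day, 100.0), (now - one_day + timedelta(minutes=1), 105.0)]

# After shifting, yesterday's points share timestamps with today's points.
for (ts, current), (_, previous) in zip(today, timeshift(yesterday, one_day)):
    print(ts, current - previous)
```

With both series on the same time axis, a difference between today and the same time yesterday is easy to spot on one chart.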

Similar to logs, metrics have the usual aggregation operators (min, max, sum, count, avg). Often, however, what you want to measure is change.

In this next exercise, we will identify rate of change to get early warning on impending issues.

  1. In a new Metrics tab, add a query to search for a count of packets received in the last 60 minutes.

    type=packets_received metric=count

  2. To find the difference between one data point and the next, edit your query to show the delta.

    type=packets_received metric=count | delta

  3. To find the rate of change instead (in this case, packets received per second), edit your query to:

    type=packets_received metric=count | rate

With this last query, you're able to determine whether the rate at which packets are being received is increasing gradually or spiking quickly. An outlier in the rate of change is often a better indicator of an impending problem than an outlier in the raw values.

step15.png
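
The difference between the two operators can be sketched in Python: a delta is the change between consecutive data points, and a rate divides that change by the elapsed seconds. This is a simplified illustration with made-up data; the functions `delta` and `rate` below are hypothetical stand-ins, and the real operators may treat edge cases such as counter resets differently.

```python
def delta(points):
    """Difference between each data point and the one before it."""
    return [(t1, v1 - v0) for (t0, v0), (t1, v1) in zip(points, points[1:])]


def rate(points):
    """Per-second rate of change between consecutive data points."""
    return [
        (t1, (v1 - v0) / (t1 - t0))
        for (t0, v0), (t1, v1) in zip(points, points[1:])
        if t1 > t0  # skip duplicate or out-of-order timestamps
    ]


# Made-up packet counts as (timestamp in seconds, cumulative count).
packets = [(0, 100), (60, 160), (120, 400)]

print(delta(packets))  # change per interval
print(rate(packets))   # packets per second
```

In this sample the second interval carries four times the per-second rate of the first, which is the kind of jump a rate chart makes visible.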

 

Lastly, let's learn how to correlate metrics to relevant logs to identify the root cause.

Metrics allow you to identify symptoms in your environment (WHAT is going on?). Relevant logs help you identify the cause (WHY is this happening?). Let's again look for successful and failed bookings, but this time, let's take a look at the relevant logs to identify why we have failed bookings.

  1. Identify counts of successful bookings and failed bookings for your travel website.

    step6.png

  2. To overlay your metrics with the relevant logs, enter this log query as depicted below:

    _sourceCategory=training/travel/checkout error

    step18.png

  3. Notice the orange bars at the top. The darker the bar, the larger the number of logs containing the word ERROR. Click the bar to view the relevant logs in this same screen. Shift+click to view logs in a Log Search screen.
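
The shading of the bars reflects how many matching logs fall into each time slice. A minimal Python sketch of that counting step follows, assuming one-minute buckets and a case-insensitive keyword match. The function `error_counts_per_bucket`, the bucket size, and the sample log lines are all hypothetical.

```python
from collections import Counter


def error_counts_per_bucket(log_lines, bucket_seconds=60, keyword="error"):
    """Count log lines containing `keyword`, grouped by time bucket.

    `log_lines` is a list of (timestamp in seconds, message) pairs.
    Each key of the result is the start time of a bucket.
    """
    counts = Counter()
    for ts, message in log_lines:
        if keyword.lower() in message.lower():
            counts[ts - ts % bucket_seconds] += 1
    return dict(counts)


# Made-up checkout logs as (timestamp in seconds, message).
logs = [
    (5, "ERROR booking failed"),
    (42, "info booking ok"),
    (59, "payment error"),
    (61, "Error timeout"),
]

print(error_counts_per_bucket(logs))
```

A bucket with a higher count would correspond to a darker bar in the overlay.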