Sumo Logic

Lab 14 - Using Threat Intel wisely to avoid costly data breaches

Learn how to analyze AWS data to detect unauthorized root account usage, monitor security groups, and spot logins from two different IP addresses.

Every day, people may be attempting to access your company's data to obtain personally identifiable information. Keeping them out can be quite challenging. You have data coming in from all sorts of applications running in hybrid cloud systems. It's so hard to track all of this, right? Can you detect malicious intent across all your incoming data, which can be millions or even billions of log messages generated daily, in just seconds? Yes. In just seconds!

Continue with this lab, where you will check incoming public IP addresses for possible malicious intent using our embedded CrowdStrike lookup table. You will also learn how to use scheduled views for a faster response time and how to detect outliers.

If you would like to hear a customer attestation, take a moment to look at this use case.

Lab Activity

Run without a scheduled view

  1. Click Manage Data > Logs > ip_threats. First you will run the query against the incoming data for a week. Highlight the query and copy the code. Notice that the code looks at all the incoming logs, which is something our centralized log management platform can easily do.

  2. Click +New and select Log Search. Paste the code into the query builder. For the time range you will use the data from the past 10 days, so type -10d for the time. While the query is running, let's talk about the code. The wildcard asterisk matches all the incoming cloud and on-prem logs, basically all the data that is connected to Sumo Logic. The query then parses every IP address out of the incoming logs and looks each one up in CrowdStrike to check for "high" malicious indicators. The results provide the associated count by actor, log source, and other identifiers.

  3. The results take several minutes to return. In this example, with a fairly fast internet connection, it took over 15 minutes. The warnings displayed with the results provide useful guidance on how to improve the performance of the query.
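
To make the logic from step 2 concrete, here is a minimal Python sketch of the same three stages: extract every IP address from each log line, look it up in a threat table, and count the matches. It is an illustration only, not how Sumo Logic runs the query. The THREAT_TABLE dictionary, its entries, and the sample log lines are hypothetical stand-ins (the addresses come from reserved documentation ranges); only the regular expression is taken from the lab query.

```python
import re
from collections import Counter

# Same pattern as the lab's parse regex; it does not validate octet ranges.
IP_PATTERN = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

# Hypothetical stand-in for the threat lookup table (assumption, not real data).
THREAT_TABLE = {
    "203.0.113.7": {"malicious_confidence": "high", "actor": "EXAMPLE ACTOR"},
    "198.51.100.23": {"malicious_confidence": "high", "actor": ""},
}


def count_threat_hits(log_lines):
    """Count log occurrences of IP addresses that appear in THREAT_TABLE."""
    hits = Counter()
    for line in log_lines:
        # findall returns every match in the line, similar to "multi" in the query.
        for ip in IP_PATTERN.findall(line):
            entry = THREAT_TABLE.get(ip)
            if entry is None:
                continue  # not a known threat, so it is dropped
            actor = entry["actor"] or "Unassigned"  # mirrors the isEmpty(actor) step
            hits[(ip, entry["malicious_confidence"], actor)] += 1
    return hits


sample_logs = [
    "accepted connection from 203.0.113.7 to 10.0.0.5",
    "GET /login 198.51.100.23 200",
    "denied 203.0.113.7",
    "healthcheck from 10.0.0.9",
]
hits = count_threat_hits(sample_logs)
for key, count in hits.items():
    print(key, count)
```

Because every log line has to be scanned and every address looked up at search time, the cost grows with the volume of raw data, which is consistent with the slow run described in step 3.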

Run as a scheduled view

  1. Now that you've seen the slow performance, you can rerun this query as a scheduled view. A scheduled view has been set up for ip_threats. This means that every minute the query runs and stores the aggregated results. Unlike the query above, which ran against the raw incoming data at search time, a scheduled view relies on a continuous query that runs once per minute, which speeds up the search. You receive the same data as the query above, except that the last minute of data is not included, and the search runs in seconds.

     Click the Logs tab. Hover over ip_threats and, to the right, click the blue Open in Log Search icon, which opens the query for this scheduled view.

  2. Run the ip_threats query for the last 10 days: type -10d for the time and click Start. The results are returned in less than a second. Notice that _view=ip_threats points to the scheduled view rather than the incoming data.


Note that ip_address, or any other parsed value, is best created as a Field Extraction Rule for all the incoming data, so that the data is normalized into a common language. A common metadata language is important because it lets you run queries such as this one across all your incoming data. For example, with the same ip_address metadata tag you can extract the IP address from every incoming log, which makes it easy to pass all incoming IP addresses across all your logs to our CrowdStrike lookup table.

Pre-aggregated scheduled views don't count against your data ingest

  1. Besides the significant performance improvement, you also don't pay a penny to use a pre-aggregated scheduled view! Data doesn't count against ingestion when the view uses aggregation. Because this scheduled view uses an aggregation operator (count), we are not re-ingesting data, unlike the first query, which runs against re-ingested data. Since a scheduled view runs continuously every minute in the background, its information is current up to the last minute. If you need to pivot from 10 days to 30 days, you can do that quickly.
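
The idea behind a pre-aggregated view can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Sumo Logic's implementation: the function names aggregate_per_minute and query_view, the event tuples, and the sample data are all hypothetical. The point it shows is that a later search only has to add up small per-minute counts instead of rescanning every raw event.

```python
from collections import defaultdict


def aggregate_per_minute(events):
    """Build a per-minute count table from (epoch_seconds, ip) events.

    This plays the role of the continuous query behind the view.
    """
    view = defaultdict(int)
    for timestamp, ip in events:
        minute_start = timestamp - timestamp % 60  # bucket into 1-minute slices
        view[(minute_start, ip)] += 1
    return dict(view)


def query_view(view, start, end):
    """Sum the stored counts for minutes in [start, end) without touching raw events."""
    totals = defaultdict(int)
    for (minute_start, ip), count in view.items():
        if start <= minute_start < end:
            totals[ip] += count
    return dict(totals)


# Hypothetical raw events: (seconds since some start time, IP address).
events = [
    (0, "203.0.113.7"),
    (15, "203.0.113.7"),
    (61, "203.0.113.7"),
    (70, "198.51.100.23"),
    (130, "203.0.113.7"),
]
view = aggregate_per_minute(events)
totals = query_view(view, 0, 120)  # first two minutes only
print(totals)
```

Widening the time range, for example from 10 days to 30 days, only changes the start and end arguments in this sketch; the stored counts are reused as they are.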

Detecting outliers

  1. Now you can further enhance the query to detect counts that are outliers. The outlier operator below uses a window of 5 data points for its calculations (window=5). threshold=3 sets the limit at 3 standard deviations. consecutive=1 means that a single violation is enough to flag an outlier, and direction=+- reports both positive and negative outliers. You may want to view the results as a line chart.

* | parse regex "(?<ip_address>\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" multi
| where ip_address != ""
| lookup type, actor, raw, threatlevel as malicious_confidence from sumo://threat/cs on threat=ip_address
| where type="ip_address" and !isNull(malicious_confidence)
| if (isEmpty(actor), "Unassigned", actor) as Actor
| parse field=raw "\"ip_address_types\":[\"*\"]" as ip_address_types nodrop
| parse field=raw "\"kill_chains\":[\"*\"]" as kill_chains nodrop
| timeslice 1m
| count _timeslice, ip_address, malicious_confidence, actor, kill_chains, ip_address_types, _sourceCategory, _source
| outlier _count window=5,threshold=3,consecutive=1,direction=+-
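
As a rough illustration of what the window and threshold settings mean, the following Python sketch flags a value when it lies more than threshold standard deviations away from the mean of the previous window values. It is an approximation for intuition only and may differ from the exact calculation the outlier operator performs; the function name flag_outliers and the sample counts are hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(counts, window=5, threshold=3.0):
    """Return one boolean per value: True if it deviates from the trailing mean
    by more than `threshold` standard deviations (in either direction)."""
    flags = []
    for i, value in enumerate(counts):
        history = counts[max(0, i - window):i]  # the previous `window` points
        if len(history) < window:
            flags.append(False)  # not enough history to judge yet
            continue
        mu = mean(history)
        sigma = pstdev(history)
        # abs() covers both directions, like direction=+- in the query.
        # A single violation is flagged, like consecutive=1.
        flags.append(abs(value - mu) > threshold * sigma)
    return flags


# Hypothetical per-minute counts with one obvious spike.
counts = [10, 12, 11, 9, 10, 11, 50, 10]
flags = flag_outliers(counts)
print(list(zip(counts, flags)))
```

In this sample only the spike of 50 is flagged. The value after it is not, because the spike itself inflates the standard deviation of the trailing window.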



Now, if you feel that detecting incoming attackers is less important than detecting someone who has already broken in and is trying to communicate outbound, you could modify this query. How would you adjust the query to look at outbound malicious activity?


Quiz (True or False?)

  1. Geo lookup receives an ip address and returns location.

  2. The haversine formula was used to convert octets to decimals.

  3. Landspeed violations allow you to detect suspect user activity.


Congratulations! You’ve completed these tasks:

  1. You learned how to format with fields.

  2. You used the operators Geo Lookup and ipv4ToNumber.

  3. You optimized your query run time by using scheduled views.