Sumo Logic

Lab 8 - Using Threat Intel wisely to avoid costly data breaches

Learn how to analyze AWS data to detect unauthorized root account usage, monitor security groups, and spot logins from two different IP addresses.

Each day, people may be attempting to access your company's data to obtain personally identifiable information. Keeping them out can be quite challenging. You have data coming in from all sorts of applications running in hybrid cloud systems, and it's hard to track all of the Indicators of Compromise (IOCs) in your threat intel. Right? But now you can detect malicious intent across all your incoming data, which can be millions or even billions of log messages generated daily. In just seconds? Yes. In just seconds!

Continue with this lab where, using our centralized log management capability, you will check all incoming public IP addresses for possible malicious intent using our embedded CrowdStrike lookup table. For a faster response time, you will also learn how to take advantage of scheduled views. Then, using one of our advanced operators, you will check whether any outliers can be detected.

If you would like to hear a customer attestation, take a moment to look at this use case.

Lab Activity

Run without a scheduled view

  1. Click Manage Data > Logs > Scheduled Views > ip_threats. The window below will open up. First you will run the query against the incoming data. Highlight the query as shown below and copy the code. Notice that the code looks at all the incoming logs, which is something our centralized log management platform can easily do.

    Screen Shot 2020-08-26 at 4.04.40 PM.png

  2. Click +New and select Log Search. Paste the code into the query builder. For the time range, use the data from the past 10 days: type -10d. While it is running, let's talk about the code. The asterisk wildcard matches all the incoming cloud or on-prem logs, basically all the data that is connected to Sumo Logic. Then we parse every IP address out of the incoming logs and look each one up in the CrowdStrike table to check for "high" malicious indicators. The associated counts for actor, log source, and other identifiers are provided.

    Screen Shot 2020-08-26 at 4.07.48 PM.png

  3. The results will take several minutes to return. In this example, with a fairly fast internet connection, it took over 15 minutes. The results may look like this. The warnings provide useful guidance on how to improve the performance of the query.

    Screen Shot 2020-08-26 at 4.53.06 PM.png
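
To picture what the query in step 2 does to each log line, here is a minimal Python sketch of the parse-then-lookup flow. It is only an illustration, not how Sumo Logic runs the search: THREAT_TABLE is a hypothetical stand-in for the CrowdStrike lookup table, and the sample IP addresses come from the reserved documentation ranges.

```python
import re

# Same shape as the pattern in the lab query: four dot-separated groups of 1-3 digits.
IP_PATTERN = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")

# Hypothetical stand-in for the threat lookup table (not real threat data).
THREAT_TABLE = {
    "203.0.113.7": {"malicious_confidence": "high", "actor": ""},
}

def find_threats(log_lines):
    """Return (ip, confidence, actor) for every IP in the logs that is in the table."""
    hits = []
    for line in log_lines:
        # Like "multi" in the query, keep every IP found on the line, not just the first.
        for ip in IP_PATTERN.findall(line):
            entry = THREAT_TABLE.get(ip)
            if entry is None:
                continue  # no match in the table, so nothing to report
            actor = entry["actor"] or "Unassigned"
            hits.append((ip, entry["malicious_confidence"], actor))
    return hits
```

Every log line has to be scanned and every extracted address checked against the table, which is why running this over 10 days of raw logs is slow.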

Run as a scheduled view

  1. Now that you've seen the slow performance, you can rerun this query as a Scheduled View. A scheduled view has been set up for ip_threats. Every minute, the scheduled view query runs and stores the aggregated results. Unlike the query above, which processed the incoming data at search time, a scheduled view takes advantage of a continuous query that runs once per minute, and it speeds up the search because it doesn't need to re-ingest the data. Basically, you will receive all the same data as the above query, with only the last minute of data not included. It will run in seconds. Click the Logs tab. Hover over ip_threats and, to the right, click the blue Open in Log Search icon, which opens the query for this scheduled view.

Screen Shot 2020-08-26 at 3.51.57 PM.png

  2. Run the ip_threats query for the last 10 days: type -10d and click Start. The results are returned in less than a second. Notice that _view=ip_threats points to the scheduled view with pre-aggregated data rather than to the incoming data.

Screen Shot 2020-08-26 at 5.36.09 PM.png
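
The speed-up comes from pre-aggregation: the per-minute counts are already stored, so a search only has to add up small buckets instead of rescanning raw logs. A rough Python sketch of the idea follows. The function names and the (minute, ip) event shape are assumptions made for illustration, not Sumo Logic internals.

```python
from collections import Counter

def build_view(events):
    """Pre-aggregate raw (minute, ip) events into per-minute counts.

    This plays the role of the scheduled view: it is maintained in the
    background, so the raw events never need to be scanned again.
    """
    return Counter(events)

def query_view(view, start_minute, end_minute):
    """Total hits per IP for minutes in [start_minute, end_minute), using only the view."""
    totals = Counter()
    for (minute, ip), n in view.items():
        if start_minute <= minute < end_minute:
            totals[ip] += n
    return dict(totals)
```

Widening the time range only means summing more buckets, which is why pivoting to a longer window stays fast.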


Note that ip_address, or any other parsed value, is best created as a Field Extraction Rule for all the incoming data, so that the data is normalized into a common language. Using a common metadata language is important because it lets you run queries such as this one across all your incoming data. For example, with the same ip_address metadata tag you can easily extract the IP address from all incoming logs. This makes it easy to pass every incoming IP address across all your logs to our CrowdStrike lookup table.

Pre-aggregated scheduled views don't count against your data ingest

  1. Besides the significant performance improvement, you also don't pay a penny to use a pre-aggregated scheduled view! Data doesn't count against ingestion when the view uses aggregation. Because this scheduled view uses an aggregation operator (count), we are not re-ingesting data, unlike the first query, which ran against re-ingested data. Since a scheduled view runs continuously every minute in the background, its information is current up to the last minute. If you need to pivot from 10 days to 30 days, you can do that quickly.

Detecting outliers

  1. Now you can further enhance the query to detect counts that are outliers. Add these 3 lines of code. The fields operator controls which columns display in the results. For this query we are using the minus sign to remove certain columns from the aggregated results. The outlier command below sets the data window to 5, meaning the mean and standard deviation are calculated over the 5 trailing data points. threshold=3 means a value is a violation when it is more than 3 standard deviations from that mean. consecutive=1 means that a single violating data point is enough to be marked as an outlier, and direction=+- will show both positive and negative outliers. You may want to view the results as a line chart.

| fields - ip_address,malicious_confidence,actor,kill_chains,ip_address_types,_sourcecategory,_source
| count by _timeslice
| outlier _count window=5,threshold=3,consecutive=1,direction=+-


  2. Your final query should look like this:

* | parse regex "(?<ip_address>\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" multi
| where ip_address != "" and ip_address != ""
| lookup type, actor, raw, threatlevel as malicious_confidence from sumo://threat/cs on threat=ip_address
| where type="ip_address" and !isNull(malicious_confidence)
| if (isEmpty(actor), "Unassigned", actor) as Actor
| parse field=raw "\"ip_address_types\":[\"*\"]" as ip_address_types nodrop
| parse field=raw "\"kill_chains\":[\"*\"]" as kill_chains nodrop
| timeslice 1m
| count _timeslice, ip_address, malicious_confidence, actor, kill_chains, ip_address_types, _sourceCategory, _source
| fields - ip_address,malicious_confidence,actor,kill_chains,ip_address_types,_sourcecategory,_source
| count by _timeslice
| outlier _count window=5,threshold=3,consecutive=1,direction=+-
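
To build intuition for what the outlier line is checking, here is a simplified Python sketch of a trailing-window test. It is an assumption about the general approach, not Sumo Logic's actual implementation: it compares each count with the mean and standard deviation of the previous window values, and it leaves out the consecutive and direction options.

```python
from statistics import mean, pstdev

def flag_outliers(counts, window=5, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the trailing mean.

    Simplified illustration only: the first `window` points are never flagged
    because there is not yet enough history to compare against.
    """
    flags = []
    for i, value in enumerate(counts):
        history = counts[max(0, i - window):i]  # the `window` points before this one
        if len(history) < window:
            flags.append(False)
            continue
        mu = mean(history)
        sigma = pstdev(history)
        flags.append(abs(value - mu) > threshold * sigma)
    return flags

# Example: a steady trickle of threat hits per minute, with one sudden spike.
per_minute_counts = [10, 12, 11, 9, 10, 11, 50, 10]
print(flag_outliers(per_minute_counts))
```

In this example only the spike of 50 is flagged. A smaller threshold would flag more points, and a larger window would make the baseline slower to react to changes.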



Now, if you feel that detecting incoming attackers is less important than detecting someone who has already broken in and is trying to communicate outbound, you could modify this query. How would you adjust the query to look at outbound malicious activity?


Quiz (True or False?)

  1. You check all incoming public IP addresses for possible malicious intent using our embedded CrowdStrike lookup table.

  2. When using the outlier command, if you set threshold=3, that sets the standard deviation.

  3. Data doesn't count against ingestion when using pre-aggregated scheduled views.


Congratulations! You’ve completed these tasks:

  1. You learned how to format with fields.

  2. You will use the operators  GEO Lookup, ipv4ToNumber

  3. You optimized your query run time by using scheduled views.