Sumo Logic

Lab 12 - Correlation using Transaction

Aggregate comparable messages from different data sources with similar key fields using the Transaction operator.

In the next three labs you will see several ways to correlate your data. The summary below will help you choose the right operator.

Transaction

Transaction allows you to correlate messages (from a single source or multiple sources) based on one or more common keys (IP addresses, session IDs, etc.). It performs an "outer join" and produces an aggregate result. Its main use case in the security space is to check whether a value exists across several data sources, for example when various security tools alert on the same IP address.

Transactionize

Transactionize allows you to correlate messages (from a single source or multiple sources) based on one or more common keys (IP addresses, session IDs, etc.). It performs an "outer join", but operates on the raw messages. Combined with the merge operator, it lets you merge raw messages, or different extracted fields across messages, into a single row in the result set.

Subquery

Subquery lets you filter data from one result set based on the result of another query (or multiple queries), within one or across several datasets. It performs an "inner join" and returns raw data. Its main use case is to find data in one dataset that can be found in another dataset. For example, show all Windows Event Logs for hosts that have been flagged by an Endpoint protection system.

  1. The transaction operator lets you analyze related sequences of messages based on a unique transaction identifier such as a session ID or IP address. It uses the identifier you specify to group related messages together and arranges them by states that you define. In this lab, use the transaction operator to identify source IPs in your Labs/Apache/Access logs that also appear in Labs/Snort network intrusion detection logs filtered to show a Web Application Attack. The query uses two parse expressions, one for each log format, to extract src_ip from both sources.

_sourceCategory=Labs/Apache/Access or (_sourceCategory=Labs/Snort and "[Classification: Web Application Attack]")

| parse "{TCP} *:* -> *:*" as src_ip, src_port, dest_ip, dest_port nodrop /*for Labs/Snort */

| parse regex "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" /* for Labs/Apache/Access */

| transaction on src_ip

 with states %"Labs/Snort", %"Labs/Apache/Access" in _sourceCategory

| where %"Labs/Snort">0 and %"Labs/Apache/Access">0
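To make the grouping logic concrete, here is a rough Python sketch of what the query above computes: one row per src_ip with a count of messages per state, followed by a filter that keeps only IPs seen in both sources. This is an illustration, not how Sumo Logic implements the operator. The function name `transaction_counts`, the sample messages, and the use of exact matching on `_sourceCategory` are all assumptions made for the example.

```python
from collections import defaultdict

def transaction_counts(messages, key, state_field, states):
    """Group messages by `key` and count how many fall into each state.

    Hypothetical helper: it only mimics the shape of the result table,
    with one row per key and one count column per state.
    """
    counts = defaultdict(lambda: {s: 0 for s in states})
    for msg in messages:
        k = msg.get(key)
        state = msg.get(state_field)
        # Skip messages with no key or with a state we were not asked about.
        if k is None or state not in states:
            continue
        counts[k][state] += 1
    return dict(counts)

# Made-up sample messages, already parsed down to the two fields we need.
messages = [
    {"src_ip": "10.0.0.1", "_sourceCategory": "Labs/Snort"},
    {"src_ip": "10.0.0.1", "_sourceCategory": "Labs/Apache/Access"},
    {"src_ip": "10.0.0.1", "_sourceCategory": "Labs/Apache/Access"},
    {"src_ip": "10.0.0.2", "_sourceCategory": "Labs/Apache/Access"},
    {"src_ip": "10.0.0.3", "_sourceCategory": "Labs/Snort"},
]
states = ["Labs/Snort", "Labs/Apache/Access"]

# Aggregate step: every IP gets a row, even if one state has a zero count.
table = transaction_counts(messages, "src_ip", "_sourceCategory", states)

# Filter step: keep only IPs that appear in both sources.
both = sorted(ip for ip, c in table.items() if all(c[s] > 0 for s in states))
print(both)
```

Note that the aggregate step keeps every IP, including those seen in only one source, which is the "outer join" behavior described in the summary. The final filter is what narrows the result to IPs present in both.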

  2. Now that you have identified these IPs, use the Threat Intel Lookup to see whether any of them are Indicators of Compromise (IOCs). Keep in mind that an empty result simply means that none of these IPs is flagged as malicious in the CrowdStrike database.

((_sourceCategory=Labs/Snort "[Classification: Web Application Attack]") or _sourceCategory=Labs/Apache/Access)

| parse "{TCP} *:* -> *:*" as src_ip, src_port, dest_ip, dest_port nodrop

| parse regex "(?<src_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

| transaction on src_ip

 with states %"Labs/Snort", %"Labs/Apache/Access" in _sourceCategory

| where %"Labs/Snort">0 and %"Labs/Apache/Access">0

| lookup type, actor, raw, threatlevel as malicious_confidence

 from sumo://threat/cs on threat=src_ip

| where !isEmpty(type)
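The last two steps of the query (the lookup, then the `isEmpty` filter) can be pictured with the Python sketch below. It is only an analogy under stated assumptions: `lookup_threats` is a hypothetical helper, the threat table is a made-up dictionary keyed by IP, and the real lookup runs against the threat intelligence data inside Sumo Logic.

```python
def lookup_threats(rows, threat_table, key="src_ip"):
    """Attach threat fields to each row; unmatched rows get None values.

    Hypothetical helper that mimics the shape of a lookup:
    every input row is kept, enriched or not.
    """
    enriched = []
    for row in rows:
        match = threat_table.get(row[key], {})
        enriched.append({
            **row,
            "type": match.get("type"),
            "actor": match.get("actor"),
            "raw": match.get("raw"),
            # The query renames threatlevel to malicious_confidence.
            "malicious_confidence": match.get("threatlevel"),
        })
    return enriched

# Made-up inputs: correlated IPs and a tiny stand-in threat table.
rows = [{"src_ip": "10.0.0.1"}, {"src_ip": "10.0.0.9"}]
threat_table = {
    "10.0.0.9": {
        "type": "ip_address",
        "actor": "",
        "raw": "{}",
        "threatlevel": "high",
    },
}

enriched = lookup_threats(rows, threat_table)

# Equivalent of `where !isEmpty(type)`: drop rows that had no match.
flagged = [r for r in enriched if r["type"]]
print(flagged)
```

If `flagged` comes back empty, the correlated IPs simply had no entry in the stand-in table, which parallels the note above about empty results.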