Sumo Logic

Lab 3 - Parsing Options

Parsing your logs allows you to add structure to your messages by identifying the fields that are meaningful to you.

 

  1. Use the json auto option to automatically parse all fields from AWS CloudTrail messages.

_sourceCategory=Labs/AWS/CloudTrail

| json auto

Image of JSON Auto field selections

  1. In the previous example, notice all the parsed fields shown in the Field Browser. You can now use the parsed awsregion field to count messages by region.

_sourceCategory=Labs/AWS/CloudTrail

| json auto

| count by awsregion
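To make the two steps above concrete, here is a minimal Python sketch of what "parse every JSON key, then count by one field" amounts to. It is an illustration only, not how Sumo Logic implements json auto: the sample messages are made up, and the dotted naming of nested keys in the hypothetical `flatten` helper is an assumption chosen for readability.

```python
import json
from collections import Counter

def flatten(obj, prefix=""):
    """Turn nested JSON objects into a flat dict of field names to values.

    Dotted names for nested keys are an assumption made for this sketch.
    """
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            fields.update(flatten(value, name + "."))
        else:
            fields[name] = value
    return fields

# Hypothetical CloudTrail-like messages (not real lab data).
raw_messages = [
    '{"eventName": "ConsoleLogin", "awsRegion": "us-east-1", "userIdentity": {"type": "IAMUser"}}',
    '{"eventName": "DescribeInstances", "awsRegion": "us-west-2", "userIdentity": {"type": "Root"}}',
    '{"eventName": "GetObject", "awsRegion": "us-east-1", "userIdentity": {"type": "IAMUser"}}',
]

# Step 1: extract every field from each message.
parsed = [flatten(json.loads(message)) for message in raw_messages]

# Step 2: count messages per region, like "count by awsregion".
count_by_region = Counter(fields["awsRegion"] for fields in parsed)
print(count_by_region)
```

The point of the sketch is that no field list is supplied up front: every key found in the message becomes available for later aggregation.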

  1. The nodrop option for the parse operator allows you to keep messages in your results that do not match the parse pattern. Run a search for Apache Error logs over the last 15 minutes and notice that not all messages have a client IP.

_sourceCategory=Labs/Apache/Error

  1. Run the same search, but this time parse the client IP. Notice that every message without the [client *] pattern is dropped.

_sourceCategory=Labs/Apache/Error

| parse "[client *]" as client_ip

  1. Add the nodrop option. Notice that non-matching messages are now kept, with an empty client_ip. Combining nodrop with additional parse statements lets you parse logs whose formats vary. Note that the second parse statement here has no nodrop, so messages that do not match it are still dropped.

_sourceCategory=Labs/Apache/Error

| parse "[client *]" as client_ip nodrop

| parse "mod_log_sql: *" as message

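  1. To isolate the messages matched by one statement or the other, filter with the isEmpty, isBlank, or isNull operators. Here, isBlank(client_ip) keeps only the messages in which the first parse found no client IP.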

_sourceCategory=Labs/Apache/Error

| parse "[client *]" as client_ip nodrop

| parse "mod_log_sql: *" as message

| where isBlank(client_ip)
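The drop versus nodrop behavior in the steps above can be simulated in a few lines of Python. This is a rough sketch under stated assumptions, not Sumo Logic's implementation: `parse_anchor` is a hypothetical helper that handles a single `*` wildcard, and the three log lines are invented for illustration.

```python
import re

def parse_anchor(messages, pattern, field, nodrop=False):
    """Mimic a wildcard parse with exactly one `*` in the pattern.

    Matching messages gain `field`. Non-matching messages are dropped,
    unless nodrop is set, in which case they are kept with an empty field.
    """
    prefix, suffix = pattern.split("*")
    tail = re.escape(suffix) if suffix else "$"
    regex = re.compile(re.escape(prefix) + "(.*?)" + tail)
    results = []
    for msg in messages:
        match = regex.search(msg["_raw"])
        if match:
            results.append({**msg, field: match.group(1)})
        elif nodrop:
            results.append({**msg, field: ""})
    return results

# Hypothetical Apache error lines (not real lab data).
logs = [
    {"_raw": "[error] [client 10.0.0.5] File does not exist: /var/www/favicon.ico"},
    {"_raw": "[error] mod_log_sql: database connection error"},
    {"_raw": "[notice] caught SIGTERM, shutting down"},
]

dropped = parse_anchor(logs, "[client *]", "client_ip")             # no nodrop
kept = parse_anchor(logs, "[client *]", "client_ip", nodrop=True)   # nodrop

# Second parse without nodrop, then the equivalent of where isBlank(client_ip).
second = parse_anchor(kept, "mod_log_sql: *", "message")
blank_client = [m for m in second if not m["client_ip"].strip()]

print(len(dropped), len(kept), len(second), len(blank_client))
```

In this sample, only the first line survives the parse without nodrop, all three survive with nodrop, and the second parse then narrows the results to the single mod_log_sql line, whose client_ip is empty.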

  1. The parse field option allows you to do further parsing on an already extracted field. In this example, we want to identify the top 5 committers in GitHub. Search for committers over the last 30 days and parse their email addresses.

_sourceCategory=Labs/Github and "committer"

| parse "\"email\":\"*\"" as email

  1. Now use the parse field option to further parse the email address into user and domain. Lastly, count by user and identify the top 5 committers.

_sourceCategory=Labs/Github and "committer"

| parse "\"email\":\"*\"" as email

| parse field=email "*@*" as users, domain

| count by users

| top 5 users by _count
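As a rough analogy for the query above, the following Python sketch splits an already extracted email value into user and domain, counts by user, and keeps the five largest counts. The email addresses and the `split_email` helper are hypothetical and stand in for whatever the GitHub logs actually contain.

```python
from collections import Counter

def split_email(email):
    """Split an extracted email value at the first "@" into (user, domain)."""
    user, _, domain = email.partition("@")
    return user, domain

# Hypothetical values of the already parsed email field.
emails = [
    "alice@example.com",
    "bob@example.com",
    "alice@example.com",
    "carol@example.org",
]

# Equivalent in spirit to: parse field=email "*@*" as users, domain
pairs = [split_email(email) for email in emails]

# Equivalent in spirit to: count by users, then top 5 users by _count
commits_by_user = Counter(user for user, _domain in pairs)
top_committers = commits_by_user.most_common(5)
print(top_committers)
```

The second parse works on the email field only, not on the raw message, which is the idea behind field=email.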

  1. The parse multi option allows you to extract multiple occurrences of the same pattern within one message. By default, parse extracts only the first occurrence. First, search the Snort data and extract the IP address.

_sourceCategory=labs/snort

| parse regex "(?<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"

  1. Now use parse multi and notice that each message is repeated once for each IP address it contains, which lets you count every occurrence accurately.

_sourceCategory=labs/snort

| parse regex "(?<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})" multi
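The difference between the two Snort searches can be sketched in Python as "first match only" versus "one result row per match". This is an illustration under assumptions: the alert lines are invented, `parse_regex` is a hypothetical helper, and Python spells the named group `(?P<ip_address>...)` where the query above uses `(?<ip_address>...)`.

```python
import re
from collections import Counter

# Same pattern as the lab query, written with Python's named-group syntax.
IP_PATTERN = re.compile(r"(?P<ip_address>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})")

def parse_regex(messages, multi=False):
    """Return one row per message (first match) or one row per match (multi)."""
    rows = []
    for raw in messages:
        matches = list(IP_PATTERN.finditer(raw))
        if not multi:
            matches = matches[:1]  # default: only the first occurrence
        for match in matches:
            rows.append({"_raw": raw, "ip_address": match.group("ip_address")})
    return rows

# Hypothetical Snort-like alerts, each with a source and a destination address.
alerts = [
    "[**] ICMP PING [**] 192.168.1.10 -> 10.0.0.7",
    "[**] SCAN [**] 172.16.0.3:4431 -> 10.0.0.7:80",
]

first_only = parse_regex(alerts)
every_match = parse_regex(alerts, multi=True)

print(Counter(row["ip_address"] for row in first_only))
print(Counter(row["ip_address"] for row in every_match))
```

Without multi, the destination address 10.0.0.7 never appears in the counts because it is always the second match in its message. With multi, it is counted twice.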

  1. Field Extraction Rules (FERs) extract fields at the time log messages are ingested. You can see all available FERs and their details under Manage Data → Settings → Field Extraction Rules. Using the fields extracted by the Apache Access rule, run a search that counts 404 responses by source IP.

_sourceCategory=Labs/Apache/Access and status_code=404

| count by src_ip

Image of Field Extraction Rules settings

QUIZ: True or False

  1. csv, json, split, and keyvalue are all parsing operators.

  2. Once a field has been parsed, it cannot be parsed any further.

  3. Fields parsed by Field Extraction Rules are available in the Field Browser.