Sumo Logic

Best Practices: 7 Search Rules to Live By

Use these easy-to-follow rules to get the most out of your Sumo Logic searches.

Rule 1 - Be Specific with Search Scope

At a minimum, all searches should use one or more metadata tags in the scope, for example: _sourceCategory, _source, _sourceName, _sourceHost, or _collector.

If possible, also use one or more keywords to limit the scope.
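
As a minimal illustration, a scope that combines a metadata tag with a keyword could look like the following. The source category Prod/Apache/Access and the keyword error are placeholder values chosen for this sketch, so substitute the ones used in your own deployment.

Best Approach - Metadata tag AND keyword:

_sourceCategory=Prod/Apache/Access and error

Least Preferred Approach - Keyword only, no metadata tag:

error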

Rule 2 - Limit Search Time Range

Use the smallest time range required for your use case. When reviewing data over long time ranges, build and test your search against a shorter time range first, then extend the time range once the search is finalized.

Rule 3 - Use Fields Extracted via Field Extraction Rules to Limit Data, Avoid the Where Operator

Whenever possible, filter data with keyword searches and fields already extracted by Field Extraction Rules instead of the where operator. If a keyword or pre-extracted field alone is not enough, use both a keyword search AND the where operator, so that the keyword narrows the scope before the where operator is applied.

Best Approach - Field Extraction Rule field AND keyword:

_sourceCategory=foo and fielda=valuea and valuea

Good Approach - Keyword search AND where operator:

_sourceCategory=foo and valuea
| parse "somefield *" as somefield
| where somefield="valuea"

Least Preferred Approach - No keyword search, No pre-extracted field:

_sourceCategory=foo
| parse "somefield *" as somefield
| where somefield="valuea"

Rule 4 - Filter your Data Before Aggregation

When filtering data, make the result set you are working with as small as possible before conducting aggregate operations like sum, min, max, and average. As stated in Rule 1, keywords and metadata in your search scope are the priority. If you must use a where clause, refer to Rule 3.

Best Approach:

_sourceCategory=Prod/User/Eventlog user="john"
| count by user

Least Preferred Approach:

_sourceCategory=Prod/User/Eventlog
| count by user
| where user="john"
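
If the user field is not pre-extracted and a where clause cannot be avoided, the same principle still applies: filter before you aggregate. The sketch below assumes, purely for illustration, that messages contain text of the form user=john followed by a space; the parse pattern is hypothetical and should be adjusted to your own log format.

Good Approach - Keyword search AND where operator, before aggregation:

_sourceCategory=Prod/User/Eventlog and john
| parse "user=* " as user
| where user="john"
| count by user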

Rule 5 - Use Parse Anchor Instead of Parse Regex for Structured Messages

As Rule 3 states, it is best to use pre-extracted fields. If you need to parse a field that is not pre-extracted, use parse anchor. If you are dealing with unstructured messages that are more complex, leverage parse regex and place it in a Field Extraction Rule.
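
To illustrate the difference, here is a sketch of the same extraction written both ways. It assumes Apache access log messages that contain a request such as "GET /blog/index.php HTTP/1.1"; the field name url is only illustrative.

Parse anchor:

_sourceCategory=Apache/Access and GET
| parse "GET * HTTP/1.1" as url

Parse regex:

_sourceCategory=Apache/Access and GET
| parse regex "GET (?<url>\S+) HTTP/1\.1"

Both queries are intended to extract the same url field. For a structured message like this one, the anchor version is shorter and easier to read.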

Rule 6 - When Using Parse Regex Avoid Expensive Tokens

If you need to use parse regex, avoid expensive tokens such as .*, which match greedily and can force the regular expression engine to backtrack. Just as Rule 1 states for your search scope, be as specific as you can with your regular expressions.

Example log message: - - [2016-09-12 20:13:52.870 +0000] "GET /blog/index.php HTTP/1.1" 304 8932

Best Approach:

| parse regex "(?<client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s"

Least Preferred Approach:

| parse regex "(?<client_ip>.*)\s-"

Rule 7 - Put Pipe-Delimited Operations on Separate Lines

For readability, use a soft return in the query field to put each new pipe-delimited operation on a separate line.

Best Approach:

_sourceCategory=Apache/Access and GET
| parse "\"GET * HTTP/1.1\" * * \"*\"" as url,status_code,size,referrer
| count by status_code,referrer
| sort _count

Least Preferred Approach:

_sourceCategory=Apache/Access and GET | parse "\"GET * HTTP/1.1\" * * \"*\"" as url,status_code,size,referrer | count by status_code,referrer | sort _count