Best Practices: Search Rules to Live By

Use these easy-to-follow rules to get the most out of your Sumo Logic searches.

Rule 1 - Be specific with search scope

At a minimum, every search should include one or more metadata fields in its scope, for example _sourceCategory, _source, _sourceName, _sourceHost, or _collector.

If possible, also use one or more keywords to limit the scope.
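
As an illustration, a scope that combines two metadata fields with a keyword might look like the sketch below. The source category, host name, and keyword are hypothetical placeholders, not values from a real deployment.

```
// Two metadata fields narrow the scope; the keyword narrows it further.
// Prod/Apache/Access, web-01, and error are placeholder values.
_sourceCategory=Prod/Apache/Access and _sourceHost=web-01 and error
```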

Rule 2 - Limit search time range

Use the smallest time range required for your use case. When reviewing data over long time ranges, build and test your search against a shorter time range first, then extend the time range once the search is finalized.

Rule 3 - Use fields extracted by FERs and avoid the where operator

Whenever possible, filter data with keyword searches and fields already extracted by Field Extraction Rules (FERs) instead of the where operator. If a keyword or pre-extracted field alone is not enough, combine a keyword search with the where operator so that the keyword narrows the data before where runs.

Best approach - Field Extraction Rule field AND keyword

_sourceCategory=foo and fielda=valuea

Good approach - Keyword search AND where operator

_sourceCategory=foo and valuea
| parse "somefield *" as somefield
| where somefield="valuea"

Least preferred approach - No keyword search, no pre-extracted field

_sourceCategory=foo
| parse "somefield *" as somefield
| where somefield="valuea"

Rule 4 - Filter your data before aggregation

When filtering data, make the result set you are working with as small as possible before conducting aggregate operations like sum, min, max, and average. As stated in Rule 1, keywords and metadata in your search scope are the priority. If you must use a where clause, refer to Rule 3.

Best approach

_sourceCategory=Prod/User/Eventlog user="john"
| count by user

Least preferred approach

_sourceCategory=Prod/User/Eventlog
| count by user
| where user="john"

Rule 5 - Use parse anchor instead of parse regex for structured messages

As Rule 3 states, it is best to use pre-extracted fields. If you need to parse a field that is not pre-extracted, use parse anchor. If you are dealing with unstructured messages that are more complex, leverage parse regex and place it in a Field Extraction Rule.
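
To make the difference concrete, here is a sketch of the same extraction written both ways for an Apache-style access log entry such as "GET /blog/index.php HTTP/1.1" 304 8932. The field names url, status_code, and size and the source category are illustrative assumptions.

```
// Parse anchor: literal text anchors the match and each * captures a field.
_sourceCategory=Apache/Access and GET
| parse "\"GET * HTTP/1.1\" * *" as url, status_code, size

// Roughly equivalent parse regex, shown only for comparison.
_sourceCategory=Apache/Access and GET
| parse regex "\"GET (?<url>\S+) HTTP/1\.1\" (?<status_code>\d+) (?<size>\d+)"
```

The anchor form is shorter and easier to read, which is why it is the better fit when the message layout is predictable.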

Rule 6 - When using parse regex, avoid expensive tokens

If you need to use parse regex, avoid expensive tokens such as .* in your expressions. Just as Rule 1 asks you to be specific with your search scope, be as specific as you can with your regular expressions.

Example log message (the client IP address precedes the first hyphen):

- - [2016-09-12 20:13:52.870 +0000] "GET /blog/index.php HTTP/1.1" 304 8932

Best approach

| parse regex "(?<client_ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s"

Least preferred approach

| parse regex "(?<client_ip>.*)\s-"
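
If the exact shape of a value is not known in advance, a bounded token is usually still a better choice than .* because it cannot run across whitespace. The following is a minimal sketch that assumes the client IP address is the first token of the message. It is not part of the original examples.

```
// \S+ stops at the first whitespace character, so the capture cannot
// spill past the first token the way a greedy .* can.
| parse regex "^(?<client_ip>\S+)\s-"
```

A greedy .* also risks capturing more than intended: when a message contains more than one whitespace-hyphen sequence, .* extends to the last one.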

Rule 7 - Use partitions and scheduled views

Sumo Logic provides two index-based search optimization features: partitions and scheduled views. When you run a search against a partition or scheduled view, results are returned more quickly and efficiently because the search runs against a smaller data set. For more information, see Optimize Search Performance.
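
As a sketch, a search is pointed at a partition with _index and at a scheduled view with _view. The names prod_apache and apache_errors below are hypothetical; substitute the partitions and views defined in your own account.

```
// Query 1: search only the partition named prod_apache.
_index=prod_apache and error
| count by _sourceHost

// Query 2: search only the scheduled view named apache_errors.
_view=apache_errors
| count by _sourceHost
```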

Rule 8 - Use Search Parameters

If your search contains filtering criteria that could change each time the search is executed, take advantage of search templates, which let you replace those criteria with parameters. Search templates make it easier for less experienced users to obtain search results, and they also reduce the risk that such users will run expensive searches.
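
For example, a saved search might expose its keyword as a parameter, as in the sketch below. The parameter name keyword and the source category are illustrative assumptions.

```
// {{keyword}} is replaced with the value the user supplies at run time.
_sourceCategory=Apache/Access and {{keyword}}
| count by _sourceHost
```

Because the scope and the aggregation are fixed in the saved query, the user changes only the keyword and cannot accidentally widen the search.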

Rule 9 - Aggregate before a lookup

Whenever possible, you should aggregate data prior to doing a lookup. In some cases, this will significantly reduce the amount of data the lookup is referencing.

Best approach

| count by client_ip
| lookup is_bad_ip from shared/bad/ips on client_ip=ip

Less preferred approach

| lookup is_bad_ip from shared/bad/ips on client_ip=ip
| count by is_bad_ip

Rule 10 - Put pipe-delimited operations on separate lines

For readability, use a soft return in the query field to put each new pipe-delimited operation on a separate line.

Best approach

_sourceCategory=Apache/Access and GET
| parse "\"GET * HTTP/1.1\" * * \"*\"" as url,status_code,size,referrer
| count by status_code,referrer
| sort _count

Less preferred approach

_sourceCategory=Apache/Access and GET | parse "\"GET * HTTP/1.1\" * * \"*\"" as url,status_code,size,referrer | count by status_code,referrer | sort _count

Rule 11 - Pin searches with long time ranges

A query over a long time range can take longer to run than Sumo Logic's default time window allows. To protect such a query from being interrupted, pin it. A pinned search can run in the background for up to 24 hours.