Skip to main content
Sumo Logic

lookup

The lookup operator maps data in your log messages to meaningful information saved in Sumo or on an HTTPS server. For example, you could use a lookup operator to map "userID" to a real user's name. Or, you could use a lookup operator to find black-listed IP addresses. In either case, you'll point the operator to one of the following:

  • A table of saved data generated by the Save operator.
  • A CSV file hosted on an HTTPS server. Attempting to run the lookup operator against a CSV hosted on an HTTP server will not be loaded in Sumo Logic.

Syntax

  • lookup <outputColumn-1> [,<outputColumn-2>] from <filePath> on <joinColumn-1> [,<joinColumn-2>]

where:

  • outputColumn-x is a list of field names in the header of the filePath.
  • filePath is an HTTPS address of a CSV file containing the external relationship table or a table saved to the Sumo Logic file system by the Save operator.
  • joinColumn-x is a list of pairs of field names that define the relationship between values in the log data results with matching values in an external table.

Rules

  • The size limit for the CSV file is 8MB.
  • If using an HTTPS resource, the file must be downloaded within 10 seconds. If the file cannot be downloaded in 10 seconds, it is probably too large.
  • Your joinColumn-x need to be of the same data type and are case sensitive. If your search result's field consists of integer data then the field in your external lookup table must also be integer data. You can cast data to a string or numeric value, see Casting Data to a Number or String.

Structuring CSV Files

Sumo Logic supports HTTPS-hosted lookup CSV files with the following restrictions:

  • The CSV file must contain a header line.
  • The header line cannot use special characters.
  • All values in the CSV file need to be wrapped in quotes.
  • No spaces are allowed between quotes and values. For example:

"id","name","time"

"1","foo","6-15-12"

"2","zoo","6-14-12"

"3","woo","6-13-12"

Dashboard Limitation

The lookup operator behaves differently when used in live mode versus interactive mode or an interactive search. When used in live mode the lookup operation is done continually to provide real-time results. However, only the most recent data point is looked up in real time, while the previous data points keep their previously looked up result. An interactive search will conduct the lookup operation on all data points when the query is processed. Therefore, when comparing live mode results to interactive results you will likely have differences in your lookup results.

For example, say you're plotting the average price of a stock over the last 30 days.

In live mode, lookup will return the real-time price and retain the previously looked up data points during the 30 day period.

In an interactive search, lookup will only use the real-time stock price to plot over the past 30 days. In this case, you would have to provide the previous stock prices for the past 30 days.

In other words, in live mode, lookup will use and retain the lookup data at that point in time when it ran. Whereas lookup in an interactive search will only use the data that was available when it ran.

Examples

Type the lookup operator in the Search tab, just as you would any other operator.

To match the userID string with a users' ID in your CSV, your query could be:

* | parse "name=*, phone number=*," as (name, phone) 
| lookup email from https://company.com/userTable.csv on name=userName, phone=cell

where the userTable.csv file includes the following:

"id","userName","email","IP","cell"
"1","Joe","joe@example.com","192.168.1.1","650-123-4567"
"2","John","john@example.com","192.168.1.2","212-123-4567"
"3","Susan","susan@example.com","192.168.1.3","914-123-4567"
"4","John","another_john@example.com",192.168.1.4","408-123-4567"
"5","John","yet_another_john@example.com","192.169.1.5","734-123-4567"

Running this query adds three fields to the output: name, phone, and email.

Composite field lookup

In our example above we had several users named John. A lookup operator can be used on a composite set of fields, so you can identify the correct email for each person named John because each unique cell phone number has also been mapped using a query like:

* | parse "name=*, phone number=*," as (name, phone) 
| lookup email from https://company.com/userTable.csv on name=userName, phone=cell

Running this query adds an email field to the output.

Using multiple lookup operators together

Another way to use a lookup operator is to chain lookup operators together. Each operator can call separate CSV files. For example, if you wanted to find user names and the position each user has in a company, your query could be:

* | parse "userID=*," as userID 
| lookup userName from https://company.com/userTable.csv on userID=id 
| lookup position from https://company.com/userPosition.csv on userID=id

where the userPosition.csv file includes the following:

"id","position"
"1","Salesperso"
"2","Salesperson"
"3","Engineer"
"4","Manager"
"5","Senior Engineer"

In our example above, the first operator finds the name, and the second finds the position.

Handling null values

To find a mismatch from a lookup operator query, use the isNull operator.

For example, running a query like:

| parse "code=*]" as code
| lookup status_code from shared/statusupdates on status = code
| if (isNull(status_code), "unknown", status_code) as status_code

Using Lookup to Access Saved Data

Once you've saved the results of a search to the Sumo Logic file system using a Save operator, the lookup operator allows you to search that data.

For example, say we wanted to find the date when users signed up in a file named newDailyUsers (the full path is myFolder/mySubFolder/newDailyUsers). We'd use this query to find that information:

* | parse "user_name=*," as name
  | lookup date from myFolder/mySubFolder/newDailyUsers on name=name

A file generated by a save operator can be saved to an org-level shared folder. This allows for others in your organization to use your search results when running their lookup queries. See saving files to a shared location for details.

Duplicate keys error

If the key you specify in a lookup operation matches several records, you get an error message that warns you of the duplication:

Lookup table folder/myfolder/filename has duplicate keys. The last value associated with a duplicated key will be used in the lookup result.

You only get the last associated value as a result. 

For example, if you are searching your Apache Access logs from 34.87.4.6 and you are looking for an internal server errors by a specific keyid, lookup provides the last result that matches your criteria:

lookup-duplicate.png