About the Search Job API

The Search Job API provides access to resources and log data from third-party scripts and applications. The API follows Representational State Transfer (REST) patterns and is optimized for ease of use and consistency. With the Search Job API, it’s not necessary to wait until a search is complete to see status or results.

  • While the search job is running in the background in the Sumo Logic service, the user can query the job status based on search job ID.
  • Users can start requesting results asynchronously while the job is running and page through partial results while the job is in progress.

Note the following:

  • The Search Job API is available only for Enterprise accounts. See Account Types for more information.
  • Collector registration and API access must use access key/access ID authentication. Username/password are not supported.
  • Cookies must be enabled for subsequent requests to the search job. A 404 status (Page Not Found) on a follow-up request indicates that a cookie did not accompany the request.

Endpoints for API access

Sumo Logic has deployments that are assigned depending on the geographic location and the date an account is created. For API access, you must manually direct your API client to the correct Sumo Logic API URL.

See Sumo Logic Endpoints for the list of the URLs.

API authentication

See API Authentication for a description of options for API authentication.

Rate limiting

A global rate limit of four API requests per second (240 requests per minute) applies across all API calls from a user. If the rate is exceeded, a rate limit exceeded (429) error is returned.
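One way a client might react to the 429 response is to retry with an increasing delay. The following is a minimal sketch, not part of the Sumo Logic tooling: the function name retry_on_429 and the RETRY_DELAY variable are hypothetical, and the wrapped command is assumed to print only the HTTP status code (for example, curl with --output /dev/null --write-out '%{http_code}').

```shell
# retry_on_429 MAX_TRIES CMD [ARGS...]
# Runs CMD, which must print only the HTTP status code on stdout, and retries
# with a doubling delay while that status is 429. Prints the final status and
# returns non-zero if the last attempt was still rate limited.
# The body is a subshell ( ... ) so its variables do not leak to the caller.
retry_on_429() (
  max_tries=$1
  shift
  try=1
  delay=${RETRY_DELAY:-1}   # initial delay in seconds (hypothetical knob)
  while :; do
    status=$("$@")
    if [ "$status" != "429" ] || [ "$try" -ge "$max_tries" ]; then
      echo "$status"
      [ "$status" != "429" ]
      exit                  # exit status of the test above
    fi
    sleep "$delay"
    delay=$((delay * 2))
    try=$((try + 1))
  done
)

# Example (requires network access and real credentials):
#   retry_on_429 5 curl --silent --output /dev/null --write-out '%{http_code}' \
#     -b cookies.txt -c cookies.txt --user ACCESSID:ACCESSKEY \
#     https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID
```

Doubling the delay is a common convention rather than something the API requires; a fixed pause of at least 250 ms between calls would also stay under four requests per second.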

The Search Job API can return up to 10 million records per search query.

Process flow

The following figure shows the process flow for search jobs.

  1. Request. You request a search job, giving the query and time range.
  2. Response. Sumo responds with a job ID. If there’s a problem with the request, an error code is provided (see the list of error codes following the figure).
  3. Request. Use the job ID to request search status. This can be done repeatedly to obtain status updates while the search is running.
  4. Response. Sumo responds with job status. An error code (404) is returned only if the request could not be completed.
    The status includes the current state of the search job (gathering results, done executing, etc.). It also includes the message and record counts based on how many results have already been found while executing the search. For non-aggregation queries, only the number of messages is reported. For aggregation queries, the number of records produced is also reported. The search job status provides access to an implicitly generated histogram of the distribution of found messages over the time range specified for the search job. During and after execution, the API can be used to request available messages and records in a paging fashion.
  5. Request. You request results. It’s not necessary for the search to be complete for the user to request results; the process works asynchronously. You can repeat the request as often as needed to keep seeing updated results, keeping in mind the rate limits. The Search Job API can return up to 10 million records per search query.
  6. Response. Sumo delivers JSON-formatted search results as requested. The API can deliver partial results that the user can start paging through, even as new results continue to come in. If there’s a problem with the results, an error code is provided (see the list of error codes following the figure).

[Figure: Search Job API process flow (search_job_api_process_flow.png)]

Errors

Generic errors that apply to all APIs

Code  Error                Description
301   moved                The requested resource SHOULD be accessed through the URI returned in the Location header.
401   unauthorized         Credential could not be verified.
403   forbidden            This operation is not allowed for your account type.
404   notfound             Requested resource could not be found.
405   method.unsupported   Unsupported method for URL.
415   contenttype.invalid  Invalid content type.
429   rate.limit.exceeded  The API request rate is higher than 4 requests per second.
500   internal.error       Internal server error.
503   service.unavailable  Service is currently unavailable.

Errors when creating the search query (#2 in the process flow)

Code  Error                   Description
400   generic                 Generic error.
400   invalid.timestamp.to    The 'to' field contains an invalid time.
400   invalid.timestamp.from  The 'from' field contains an invalid time.
400   to.smaller.than.from    The 'from' time cannot be larger than the 'to' time.
400   unknown.timezone        The 'timezone' value is not a known time zone. See this Wikipedia article for a list of time zone codes.
400   empty.timezone          The 'timezone' cannot be blank.
400   no.query                No 'query' parameter was provided.
400   unknown.time.type       Time type is not correct.
400   parse.error             Unable to parse query.

Error when requesting status (#3 in the process flow)

Code  Error          Description
404   jobid.invalid  Job ID is invalid.

Errors when paging through the result set (#5 in the process flow)

Code  Error                                Description
400   jobid.invalid                        Job ID is invalid.
400   offset.missing                       Offset is missing.
400   offset.negative                      Offset cannot be negative.
400   limit.missing                        Limit is missing.
400   limit.zero                           Limit cannot be 0.
400   limit.negative                       Limit cannot be negative.
400   no.records.not.an.aggregation.query  No records; query is not an aggregation.

API details

Creating a search job

To create a search job (step 1 in the process flow), send a JSON request to the search job endpoint.

Method: POST

Example endpoint:

https://api.sumologic.com/api/v1/search/jobs

Headers

Header        Value
Content-Type  application/json
Accept        application/json

Query parameters

Parameter  Type    Required  Description
query      string  Yes       The actual search expression.
from       string  Yes       The start of the search time range, as an ISO 8601 date and time. Can also be milliseconds since epoch.
to         string  Yes       The end of the search time range, as an ISO 8601 date and time. Can also be milliseconds since epoch.
timeZone   string  Yes       The time zone if from/to is not in milliseconds. See this Wikipedia article for a list of time zone codes.

Status codes

Code  Text                    Description
202   Accepted                The search job has been successfully created.
400   Bad Request             Generic request error by the client.
415   Unsupported Media Type  Content-Type wasn't set to application/json.

Response headers

Header    Value
Location  https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID

Result

A JSON document containing the ID of the newly created search job. The ID is a string to use for all API interactions relating to the search job.

Example error response:

{
  "status" : 400,
  "id" : "IUUQI-DGH5I-TJ045",
  "code" : "searchjob.invalid.timestamp.from",
  "message" : "The 'from' field contains an invalid time."
}
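
A script that receives an error document like the one above usually wants the code and message values. The sketch below pulls a string field out of a small JSON document with tr and sed only. The helper name json_string_field is hypothetical, and the approach is deliberately limited: it tolerates whitespace around the colon, as in the example response, but it does not handle escaped double quotes or nested objects. A real JSON parser is the safer choice where one is available.

```shell
# json_string_field NAME
# Reads a JSON document on stdin and prints the value of the last string
# field called NAME. Prints nothing if the field is absent or not a string.
json_string_field() {
  # Join all lines first so that pretty-printed documents are handled too.
  { tr -d '\n'; echo; } |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Example:
#   curl ... | json_string_field code
```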

Sample session

The following sample session uses cURL. The Search Job API requires the client to honor cookies. Use the curl -b cookies.txt -c cookies.txt options to receive, store, and send back the cookies set by the API.

curl -b cookies.txt -c cookies.txt -H 'Content-type: application/json' \
  -H 'Accept: application/json' -X POST -T createSearchJob.json \
  --user ACCESSID:ACCESSKEY https://api.sumologic.com/api/v1/search/jobs

The createSearchJob.json file looks like this:

{
  "query": "| count _sourceCategory",
  "from": "2013-01-28T12:00:00",
  "to": "2013-01-28T13:10:00",
  "timeZone": "PST"
}
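
Because from and to can also be given as milliseconds since the epoch, a script that searches a recent window can compute both values with shell arithmetic. This is a sketch under stated assumptions: the function name last_minutes_range_ms is hypothetical, and a "last N minutes" window is only one possible choice of range.

```shell
# last_minutes_range_ms MINUTES [NOW_SECONDS]
# Prints "FROM TO" in epoch milliseconds for the MINUTES that end at
# NOW_SECONDS (epoch seconds; defaults to the current time).
last_minutes_range_ms() (
  minutes=$1
  now_s=${2:-$(date +%s)}
  to_ms=$((now_s * 1000))
  from_ms=$((to_ms - minutes * 60 * 1000))
  echo "$from_ms $to_ms"
)

# Example with a fixed end time of 1359407400 seconds:
#   last_minutes_range_ms 15 1359407400
#   prints: 1359406500000 1359407400000
```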

The response from Sumo Logic returns the Search Job ID as the “Location” header in the format 

https://api.sumologic.com/api/v1/search/jobs/#{SEARCH_JOB_ID}
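
Since the job ID is the last path segment of the Location value, it can be extracted with plain parameter expansion. The sketch below assumes the response headers were saved with curl's -D (--dump-header) option; the function names location_from_headers and job_id_from_location are hypothetical.

```shell
# location_from_headers
# Reads raw HTTP response headers on stdin and prints the Location value.
location_from_headers() {
  tr -d '\r' | sed -n 's/^[Ll]ocation:[[:space:]]*//p'
}

# job_id_from_location URL
# Prints the last path segment of URL, ignoring a trailing slash.
job_id_from_location() (
  location=${1%/}
  printf '%s\n' "${location##*/}"
)

# Example, after a create request made with "-D headers.txt":
#   JOB_ID=$(job_id_from_location "$(location_from_headers < headers.txt)")
```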

Getting the current Search Job status

Use the search job ID to obtain the current status of a search job (step 4 in the process flow).

Method: GET

Example endpoint:

https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID

Query parameters

Parameter    Type    Required  Description
searchJobId  String  Yes       The ID of the search job.

Result

The result is a JSON document containing the search job state, the number of messages found so far, the number of records produced so far, any pending warnings and errors, and any histogram buckets so far.

Sample session

curl -v --trace-ascii - -b cookies.txt -c cookies.txt -H 'Accept: application/json' \
  --user ACCESSID:ACCESSKEY https://api.sumologic.com/api/v1/search/jobs/37589506F194FC80

This is the formatted result document:

{
   "state":"DONE GATHERING RESULTS",
   "messageCount":90,
   "histogramBuckets":[
      {
         "length":60000,
         "count":1,
         "startTimestamp":1359404820000
      },
      {
         "length":60000,
         "count":1,
         "startTimestamp":1359405480000
      },
      ...
      {
         "length":60000,
         "count":1,
         "startTimestamp":1359404340000
      }
   ],
   "pendingErrors":[

   ],
   "pendingWarnings":[

   ],
   "recordCount":1
}

Notice that the state of the sample search job is DONE GATHERING RESULTS. The following table includes possible states.

State                   Description
NOT STARTED             Search job has not been started yet.
GATHERING RESULTS       Search job is still gathering more results; however, results might already be available.
PAUSED                  The query was explicitly paused by the user.
FORCE PAUSED            The query was paused by the system. This state occurs only for non-aggregate queries that are paused at the limit of 100k. This limit is dynamic and may vary from customer to customer.
DONE GATHERING RESULTS  Search job is done gathering results; the entire specified time range has been covered.
CANCELLED               The search job has been cancelled.
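
A polling client has to decide what to do for each of these states, not only for DONE GATHERING RESULTS. The mapping below is one possible policy, offered as a sketch: the function name search_job_next_action and the action words it prints are hypothetical, and treating the paused states as a reason to stop waiting is an assumption rather than documented behavior.

```shell
# search_job_next_action STATE
# Prints a suggested next step for a polling loop.
search_job_next_action() {
  case "$1" in
    "NOT STARTED" | "GATHERING RESULTS") echo wait ;;     # poll again later
    "DONE GATHERING RESULTS")            echo fetch ;;    # page through results
    "PAUSED" | "FORCE PAUSED")           echo paused ;;   # will not progress on its own
    "CANCELLED")                         echo stop ;;     # nothing more will arrive
    *)                                   echo unknown ;;  # unexpected state
  esac
}

# Example inside a polling loop:
#   [ "$(search_job_next_action "$STATE")" = "wait" ] || break
```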

More about results

The messageCount and recordCount values indicate the number of messages and records found or produced so far. Messages are raw log messages and records are aggregates.

For queries that don't contain an aggregation operator, only messages are returned. If the query contains an aggregation, for example, count by _sourceCategory, then the messages are returned along with records resulting from the aggregation (similar to what a SQL database would return).

The pendingErrors and pendingWarnings values contain any pending error or warning strings that have accumulated since the last time the status was requested.

Errors and warnings are not cumulative. If you need to retain the errors and warnings, store them locally.

The histogramBuckets value returns a list of histogram buckets. A histogram bucket is defined by its timestamp, which is the start timestamp (in milliseconds) of the bucket, and a length, also in milliseconds, that expresses the width of the bucket. The timestamp plus the length is the end timestamp of the bucket, and the count is the number of messages in the bucket.
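
As a worked example of the bucket arithmetic, the first bucket in the sample status document starts at 1359404820000 and has a length of 60000, so it ends at 1359404880000. The helper name bucket_end_ms below is hypothetical.

```shell
# bucket_end_ms START_MS LENGTH_MS
# Prints the end timestamp of a histogram bucket in epoch milliseconds.
bucket_end_ms() {
  echo $(($1 + $2))
}

# Example with the first bucket of the sample status document:
#   bucket_end_ms 1359404820000 60000
#   prints: 1359404880000
```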

The histogram buckets correspond to the histogram display in the Sumo Logic interactive analytics API. The histogram buckets are not cumulative. Because the status API will return only the new buckets discovered since the last status call, the buckets need to be remembered by the client, if they are to be used. A search job in the Sumo Logic backend will always execute a query by finding and processing matching messages starting at the end of the specified time range, and moving to the beginning. During this process, histogram buckets are discovered and returned.

Paging through the messages found by a search job

The search job status informs the user about the number of found messages. The messages can be requested using a paging API call (step 6 in the process flow). Messages are always ordered by the latest _messageTime value.

Method: GET

Example endpoint:

https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID/messages?offset=OFFSET&limit=LIMIT

Query parameters

Parameter    Type    Required  Description
searchJobId  String  Yes       The ID of the search job.
offset       Int     Yes       Return messages starting at this offset.
limit        Int     Yes       The number of messages starting at offset to return. The maximum value for limit is 10,000 messages.

NOTE: A query might return fewer than 10,000 messages if the message sizes are large.
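
To page through everything a job has found, a client can step the offset by the page size until it reaches the messageCount reported by the status call. The following sketch only generates the offsets; the function name page_offsets is hypothetical. Because a page can contain fewer messages than the limit when messages are large, a careful client would advance by the number of messages actually returned instead of assuming a full page.

```shell
# page_offsets TOTAL [LIMIT]
# Prints one offset per line, covering TOTAL items in pages of LIMIT
# (default 10000, the documented maximum).
page_offsets() (
  total=$1
  limit=${2:-10000}
  if [ "$limit" -le 0 ]; then
    echo "limit must be positive" >&2
    exit 1
  fi
  offset=0
  while [ "$offset" -lt "$total" ]; do
    echo "$offset"
    offset=$((offset + limit))
  done
)

# Example, with MESSAGES holding the messageCount from the status call:
#   for offset in $(page_offsets "$MESSAGES" 10000); do
#     curl ... "https://api.sumologic.com/api/v1/search/jobs/$JOB_ID/messages?offset=$offset&limit=10000"
#   done
```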

Sample session

curl -v --trace-ascii - -b cookies.txt -c cookies.txt -H 'Accept: application/json' \
  --user ACCESSID:ACCESSKEY 'https://api.sumologic.com/api/v1/search/jobs/37589506F194FC80/messages?offset=0&limit=10'

This is the formatted result document:

{
   "fields":[
      {
         "name":"_messageid",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_sourceid",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_sourcename",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_sourcehost",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_sourcecategory",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_format",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_size",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_messagetime",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_receipttime",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_messagecount",
         "fieldType":"int",
         "keyField":false
      },
      {
         "name":"_raw",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_source",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_collectorid",
         "fieldType":"long",
         "keyField":false
      },
      {
         "name":"_collector",
         "fieldType":"string",
         "keyField":false
      },
      {
         "name":"_blockid",
         "fieldType":"long",
         "keyField":false
      }
   ],
   "messages":[
      {
         "map":{
            "_receipttime":"1359407350899",
            "_source":"service",
            "_collector":"local",
            "_format":"plain:atp:o:0:l:29:p:yyyy-MM-dd HH:mm:ss,SSS ZZZZ",
            "_blockid":"-9223372036854775669",
            "_messageid":"-9223372036854773763",
            "_messagetime":"1359407350333",
            "_collectorid":"1579",
            "_sourcename":"/Users/christian/Development/sumo/ops/assemblies/latest/service-20.1-SNAPSHOT/logs/service.log",
            "_sourcehost":"Chiapet.local",
            "_raw":"2013-01-28 13:09:10,333 -0800 INFO  [module=SERVICE] [logger=util.scala.zk.discovery.AWSServiceRegistry] [thread=pool-1-thread-1] FINISHED findRunningInstances(ListBuffer((Service: name: elasticache-1, defaultProps: Map()), (Service: name: userAndOrgCache, defaultProps: Map()), (Service: name: rds_cloudcollector, defaultProps: Map()))) returning Map((Service: name: elasticache-1, defaultProps: Map()) -> [], (Service: name: userAndOrgCache, defaultProps: Map()) -> [], (Service: name: rds_cloudcollector, defaultProps: Map()) -> []) after 1515 ms",
            "_size":"549",
            "_sourcecategory":"service",
            "_sourceid":"1640",
            "_messagecount":"2044"
         }
      },
      ...
      {
         "map":{
            "_receipttime":"1359407051885",
            "_source":"service",
            "_collector":"local",
            "_format":"plain:atp:o:0:l:29:p:yyyy-MM-dd HH:mm:ss,SSS ZZZZ",
            "_blockid":"-9223372036854775674",
            "_messageid":"-9223372036854773772",
            "_messagetime":"1359407049529",
            "_collectorid":"1579",
            "_sourcename":"/Users/christian/Development/sumo/ops/assemblies/latest/service-20.1-SNAPSHOT/logs/service.log",
            "_sourcehost":"Chiapet.local",
            "_raw":"2013-01-28 13:04:09,529 -0800 INFO  [module=SERVICE] [logger=com.netflix.config.sources.DynamoDbConfigurationSource] [thread=pollingConfigurationSource] Successfully polled Dynamo for a new configuration based on table:raychaser-chiapetProperties",
            "_size":"246",
            "_sourcecategory":"service",
            "_sourceid":"1640",
            "_messagecount":"2035"
         }
      }
   ]
}

More about results

The result contains two lists, fields and messages.

  • fields contains a list of all the fields defined for each of the messages returned. For each field, the field name and field type are returned.
  • messages contains a list of maps, one map per message. Each map associates the fields described in the fields list with the actual values for that message.

For example, the field _raw contains the raw collected log message.

_messagetime is the number of milliseconds since the epoch of the timestamp extracted from the message itself.

_receipttime is the number of milliseconds since the epoch of the timestamp of arrival of the message in the Sumo Logic system.
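
Both time fields are easier to read once converted to a calendar time. The sketch below converts epoch milliseconds to a UTC timestamp with the date command; the function name epoch_ms_to_utc is hypothetical. It tries the GNU date syntax first and falls back to the BSD/macOS syntax, so it assumes one of those two implementations is installed. The millisecond part is dropped.

```shell
# epoch_ms_to_utc MILLIS
# Prints MILLIS (milliseconds since the epoch) as a UTC timestamp with
# one-second resolution.
epoch_ms_to_utc() (
  seconds=$(($1 / 1000))
  # GNU date accepts -d @SECONDS; BSD/macOS date accepts -r SECONDS.
  date -u -d "@$seconds" '+%Y-%m-%dT%H:%M:%SZ' 2>/dev/null ||
    date -u -r "$seconds" '+%Y-%m-%dT%H:%M:%SZ'
)

# Example with the _messagetime of the first sample message, whose raw text
# begins "2013-01-28 13:09:10,333 -0800":
#   epoch_ms_to_utc 1359407350333
#   prints: 2013-01-28T21:09:10Z
```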

The metadata fields _sourcehost, _sourcename, and _sourcecategory, which are also featured in Sumo Logic, are available here.

Paging through the records found by a Search Job

The search job status informs the user of the number of records produced, if the query performs an aggregation. Those records can be requested using a paging API call (step 6 in the process flow), just as messages can be requested.

Method: GET

Example endpoint:

https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID/records?offset=OFFSET&limit=LIMIT

Query parameters

Parameter    Type    Required  Description
searchJobId  String  Yes       The ID of the search job.
offset       Int     Yes       Return records starting at this offset.
limit        Int     Yes       The number of records starting at offset to return. The maximum value for limit is 10,000 records.

Sample session

curl -v --trace-ascii - -b cookies.txt -c cookies.txt \
  -H 'Accept: application/json' --user ACCESSID:ACCESSKEY \
  'https://api.sumologic.com/api/v1/search/jobs/37589506F194FC80/records?offset=0&limit=1'

This is the formatted result document:

{
   "fields":[
      {
         "name":"_sourcecategory",
         "fieldType":"string",
         "keyField":true
      },
      {
         "name":"_count",
         "fieldType":"int",
         "keyField":false
      }
   ],
   "records":[
      {
         "map":{
            "_count":"90",
            "_sourcecategory":"service"
         }
      }
   ]
}

The returned document is similar to the one returned for the message paging API. The schema of the records returned is described by the list of fields as part of the fields element. The records themselves are a list of maps.

Deleting a search job

Although search jobs ultimately time out in the Sumo Logic backend, it's a good practice to explicitly cancel a search job when it is not needed anymore.

Method: DELETE

Example endpoint:

https://api.sumologic.com/api/v1/search/jobs/SEARCH_JOB_ID

Query parameters

Parameter    Type    Required  Description
searchJobId  String  Yes       The ID of the search job.

Sample session

curl -v --trace-ascii - -b cookies.txt -c cookies.txt -X DELETE \
  -H 'Accept: application/json' --user ACCESSID:ACCESSKEY \
  https://api.sumologic.com/api/v1/search/jobs/37589506F194FC80
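
In a script, the delete call can be wrapped in a function and registered with trap so that the job is cancelled however the script exits. This is a sketch under assumptions: the names delete_search_job, API_BASE, and CURL are hypothetical, API_BASE must point at the endpoint for your deployment, and ACCESSID and ACCESSKEY are expected to be set by the caller. Setting CURL=echo gives a dry run that prints the arguments instead of sending the request.

```shell
API_BASE=${API_BASE:-https://api.sumologic.com/api/v1}  # adjust to your deployment
CURL=${CURL:-curl}                                      # set CURL=echo for a dry run

# delete_search_job JOB_ID
# Sends the DELETE request for JOB_ID, reusing the cookie jar.
delete_search_job() {
  "$CURL" --silent -b cookies.txt -c cookies.txt -X DELETE \
    -H 'Accept: application/json' \
    --user "$ACCESSID:$ACCESSKEY" \
    "$API_BASE/search/jobs/$1"
}

# Typical use, once JOB_ID is known:
#   trap 'delete_search_job "$JOB_ID"' EXIT
```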

Bash this Search Job

You can use the following script to exercise the API.

#!/bin/bash

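# Variables.
# Positional arguments: protocol (for example, https), API host
# (for example, api.sumologic.com), access ID, and access key.
PROTOCOL=$1
HOST=$2
ACCESSID=$3
ACCESSKEY=$4
# curl options; the cookie jar is required by the Search Job API.
OPTIONS="--silent -b cookies.txt -c cookies.txt"
# Alternatives for debugging (--trace-ascii needs a file argument; - is stdout):
#OPTIONS="-v -b cookies.txt -c cookies.txt"
#OPTIONS="-v --trace-ascii - -b cookies.txt -c cookies.txt"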

#
# Create a search job from a JSON file.
#
RESULT=$(curl $OPTIONS                                                                \
          -H "Content-type: application/json"                                         \
          -H "Accept: application/json"                                               \
          -d @createSearchJob.json                                                    \
          --user $ACCESSID:$ACCESSKEY                                                     \
          "$PROTOCOL://$HOST/api/v1/search/jobs")
JOB_ID=$(echo $RESULT | perl -pe 's|.*"id":"(.*?)"[,}].*|\1|')
echo Search job created, id: $JOB_ID

#
# Wait until the search job is done.
#
STATE=""
until [ "$STATE" = "DONE GATHERING RESULTS" ]; do
  sleep 5
  RESULT=$(curl $OPTIONS                                                              \
            -H "Accept: application/json"                                             \
            --user $ACCESSID:$ACCESSKEY                                                   \
            "$PROTOCOL://$HOST/api/v1/search/jobs/$JOB_ID")
  # Match up to the next double quote so later strings cannot be swallowed.
  STATE=$(echo "$RESULT" | sed 's/.*"state":"\([^"]*\)".*/\1/')
  MESSAGES=$(echo $RESULT | perl -pe 's|.*"messageCount":(.*?)[,}].*|\1|')
  RECORDS=$(echo $RESULT | perl -pe 's|.*"recordCount":(.*?)[,}].*|\1|')
  echo Search job state: $STATE, message count: $MESSAGES, record count: $RECORDS
done

#
# Get the first ten messages.
#
RESULT=$(curl $OPTIONS                                                                \
          -H "Accept: application/json"                                               \
          --user $ACCESSID:$ACCESSKEY                                                      \
          "$PROTOCOL://$HOST/api/v1/search/jobs/$JOB_ID/messages?offset=0&limit=10")
echo Messages:
echo $RESULT

#
# Get the first record.
#
RESULT=$(curl $OPTIONS                                                                \
          -H "Accept: application/json"                                               \
          --user $ACCESSID:$ACCESSKEY                                                      \
          "$PROTOCOL://$HOST/api/v1/search/jobs/$JOB_ID/records?offset=0&limit=1")
echo Records:
echo $RESULT

#
# Delete the search job.
#
RESULT=$(curl $OPTIONS                                                                \
          -X DELETE                                                                   \
          -H "Accept: application/json"                                               \
          --user $ACCESSID:$ACCESSKEY                                                      \
          "$PROTOCOL://$HOST/api/v1/search/jobs/$JOB_ID")
JOB_ID=$(echo $RESULT | sed 's/^.*"id":"\(.*\)".*$/\1/')
echo Search job deleted, id: $JOB_ID