Sumo Logic

Transaction Analytics

Whatever type of data you're analyzing, from website sign-ups to e-commerce data to system activity across a distributed system, the transaction operator applies to a variety of use cases. In one sense, data is always ordered, at least by timestamp. During analysis, however, the transaction operator can take otherwise unordered data and produce results as ordered data (data that has an ordered flow).

For example, if you ran a retail website, you could use the transaction operator to track your customers' movements through the log events that determine the states of their transactions, such as login, cart, payment, and checkout. From the results of your query, you could visualize your customers as they move or "flow" through the site as a Flow Diagram, and identify any problems, such as a drop-off at the payment state that prevents them from completing their purchase.

The transaction operator requires:

  • One or more transaction IDs to group related log messages together. You could use session IDs, IP addresses, usernames, email addresses, or any other unique IDs that are relevant to your query. You'll define transaction IDs in a query. The transaction IDs are extracted using operators such as parse, parse regex, etc.
  • Mapping from a log message to a state. Specify the mapping from a log message to a state through the syntax of the matches operator, or through fields that are already parsed.

Defining states

Think of states as a way of using log events and fields in your logs to plot the movement of data. The transaction operator needs to have these states defined to produce results. There are two ways you can define states.

Syntax:
with "match string" [in fieldName] as stateName

Example:
with "*LinkAccountAction category=Google*" as linkGoogle,
with "*LinkAccountAction category=Facebook*" as linkFacebook,
with "*LinkAccountAction category=LinkedIn*" as linkLinkedIn,
with "*LinkAccountAction category=Other*" as linkOther

Syntax:
with states stateA, stateB, ..., stateN [in fieldName]

Example:
with states login, cart, checkout, shipping, shipping_method, billing, review, progress

With this second form, the state is the text in the field.
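
To make the two ways of defining states concrete, here is a minimal Python sketch of the idea. It is not how the transaction operator is implemented. The helper names (state_from_message, state_from_field) are hypothetical, and the sketch assumes that "*" in a match string behaves like a simple, case-sensitive wildcard.

```python
from fnmatch import fnmatchcase

# Hypothetical (match string, state name) pairs, mirroring the
# `with "..." as stateName` clauses in the example above.
STATE_PATTERNS = [
    ("*LinkAccountAction category=Google*", "linkGoogle"),
    ("*LinkAccountAction category=Facebook*", "linkFacebook"),
    ("*LinkAccountAction category=LinkedIn*", "linkLinkedIn"),
    ("*LinkAccountAction category=Other*", "linkOther"),
]


def state_from_message(message):
    """First form: return the state whose match string fits the message."""
    for pattern, state in STATE_PATTERNS:
        # Assumption: "*" is a plain wildcard and matching is case-sensitive.
        if fnmatchcase(message, pattern):
            return state
    return None  # the message does not map to any state


def state_from_field(value, states):
    """Second form: the text of an already-parsed field is the state."""
    return value if value in states else None
```

In both forms, a log message that maps to no state contributes nothing to the transaction in this sketch.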

Once states have been defined for ordered data, you can use them to order data using the fromstate and tostate arguments, described in the next section.

Ordered vs Unordered data

When used with ordered data, you can monitor the transition between two distinct states, allowing you to build a Flow Diagram to visually represent the transitions a transaction goes through, and the number of transactions between transitions. On unordered data, you can use the transaction operator to build a table of results.

The difference between ordered and unordered data is the flow (order) that you define in a transaction query. Both types of data require you to define states.

Below you'll see two nearly identical queries. The first searches unordered data, and the results are displayed in a table. The second adds results by flow and aggregates by the fromstate and tostate fields, which lets us build a Flow Diagram.

_sourceCategory=oursite | parse using public/apache
| where !(user_agent matches "*Pingdom*") and status_code==200
| parse regex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| parse regex field=url "^/(?<urlprefix>[A-Za-z0-9.-]+)"
| fields urlprefix, ip
| replace(urlprefix, "-", "") as urlprefix
| transaction on ip with states followus, signup, blog, product, api, about, resources, applications, bigdatachallenge,
sumologicfree, termsandconditions, whatsnew, search, privacy, awstrial in urlprefix

_sourceCategory=oursite | parse using public/apache
| where !(user_agent matches "*Pingdom*") and status_code==200
| parse regex "(?<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
| parse regex field=url "^/(?<urlprefix>[A-Za-z0-9.-]+)"
| fields urlprefix, ip
| replace(urlprefix, "-", "") as urlprefix
| transaction on ip with states followus, signup, blog, product, api, about, resources, applications, bigdatachallenge, 
sumologicfree, termsandconditions, whatsnew, search, privacy, awstrial in urlprefix
results by flow
| count, max(latency) by fromstate, tostate
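
The difference between the two result shapes can be sketched in Python. This is a conceptual illustration only, under stated assumptions: the function names are hypothetical, the unordered table is assumed to hold one row per transaction ID with a count per state, and a flow is assumed to be the sequence of consecutive states for one ID when its events are sorted by timestamp.

```python
from collections import Counter, defaultdict


def state_table(events):
    """Unordered view: per transaction ID, count how often each state occurs.

    events: iterable of (timestamp, transaction_id, state) tuples.
    """
    table = defaultdict(Counter)
    for _, transaction_id, state in events:
        table[transaction_id][state] += 1
    return table


def flow_counts(events):
    """Ordered view: count (fromstate, tostate) transitions across all IDs."""
    by_id = defaultdict(list)
    for timestamp, transaction_id, state in events:
        by_id[transaction_id].append((timestamp, state))

    transitions = Counter()
    for sequence in by_id.values():
        sequence.sort(key=lambda event: event[0])  # order by timestamp
        # Pair each state with the state that follows it.
        for (_, fromstate), (_, tostate) in zip(sequence, sequence[1:]):
            transitions[(fromstate, tostate)] += 1
    return transitions
```

The transition counts correspond to what `count by fromstate, tostate` aggregates in the second query. How the operator itself handles ties in timestamps or repeated states is not covered by this sketch.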

Unordered

Tables generated with unordered data can be added to Dashboards.

Ordered

Flow Diagrams generated with ordered data cannot be added to Dashboards.

Specifying a fringe cut-off

Since transaction operator queries are constrained by a time window, some transactions may be cut off if they occur near the edges of the time window. It is possible to filter them out using the fringe argument.

If tw is the time window for a query, then a transaction is filtered out if it satisfies either of the following conditions:

  • ends in [tw.start, tw.start + fringe)
  • starts in (tw.end - fringe, tw.end]

For example:

... | transaction on sessionid fringe=10m 
with "Starting session *" as init, 
with "Initiating countdown *" as countdown_start, 
with "Countdown reached *" as countdown_done, 
with "Launch *" as launch 
results by transaction
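
The two interval conditions above can be written as a small Python predicate. This is only a sketch of the rule as stated here; the function name is_fringe and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def is_fringe(tx_start, tx_end, tw_start, tw_end, fringe):
    """Return True if a transaction would be filtered out by the fringe rule.

    tx_start, tx_end: first and last timestamps of the transaction.
    tw_start, tw_end: bounds of the query time window (tw).
    fringe: a timedelta, for example timedelta(minutes=10) for fringe=10m.
    """
    # Ends in [tw.start, tw.start + fringe): may have started before the window.
    ends_near_start = tw_start <= tx_end < tw_start + fringe
    # Starts in (tw.end - fringe, tw.end]: may continue after the window.
    starts_near_end = tw_end - fringe < tx_start <= tw_end
    return ends_near_start or starts_near_end
```

For a window from 12:00 to 13:00 with fringe=10m, a transaction that ends at 12:05 or starts at 12:55 is dropped, while one that runs from 12:20 to 12:30 is kept.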

Limitation

For ordered data, there is a limit of 10,000 groups. The transaction operator uses a least-recently-used scheme to phase out transactions. So when this limit is reached, the transactions that are included in the results are not the first 10,000 transactions, but the 10,000 most recently used transactions. As a consequence, some earlier transactions may have ended prematurely, as stated in the following error.

This message is displayed if a query produces more than 10,000 groups with the transaction operator: "Group or memory limit exceeded, some transactions may have ended prematurely."

For unordered data, once the group limit of 10,000 is reached, new transactions are ignored.
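
The two behaviors at the group limit can be contrasted with a short Python sketch. It illustrates the general idea of least-recently-used eviction versus ignoring new groups. It is not the operator's actual implementation, and the GroupTracker class and its attributes are hypothetical.

```python
from collections import OrderedDict


class GroupTracker:
    """Tracks at most `limit` transaction groups (hypothetical helper)."""

    def __init__(self, limit=10_000, ordered=True):
        self.limit = limit
        self.ordered = ordered
        self.groups = OrderedDict()  # transaction ID -> list of states seen
        self.ended_early = []        # IDs evicted before they finished

    def add(self, transaction_id, state):
        """Record a state; return False if the event was ignored."""
        if transaction_id in self.groups:
            self.groups[transaction_id].append(state)
            if self.ordered:
                # Mark this group as the most recently used.
                self.groups.move_to_end(transaction_id)
            return True
        if len(self.groups) >= self.limit:
            if not self.ordered:
                return False  # unordered data: new transactions are ignored
            # Ordered data: phase out the least recently used group.
            evicted_id, _ = self.groups.popitem(last=False)
            self.ended_early.append(evicted_id)
        self.groups[transaction_id] = [state]
        return True
```

With a limit of 2 and ordered=True, adding events for IDs a, b, a, c in that order evicts b, because a was used more recently. With ordered=False, the event for c is ignored instead.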