Sumo Logic

Hash Rules

With a hash rule, any value matched by the expression you choose to hash is replaced by a hash code generated for that value. Hashed data is completely hidden (obfuscated) before being sent to Sumo Logic. This is useful when certain types of data must not leave your premises (such as credit card numbers or Social Security numbers). Each unique value will have a unique hash code.

For example, to hash member IDs, you could use the following:

[Figure: Filter hash]

Log line:

2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8]
[module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=dan@demo.com] [remote_ip=98.248.40.103]
[web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]

Resulting hashed log line:

2012-05-16 09:43:39,607 -0700 DEBUG [hostId=prod-cass-raw-8]
[module=RAW] [logger=scala.raw.InboundRawProtocolHandler] [memberid=3dfg3534ftgfe33ffrf3] [remote_ip=98.248.40.103]
[web_session=19zefhqy...] [session=80F1BD83AEBDF4FB] [customer=0000000000000005] [call=InboundRawProtocol.getMessages]
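
The effect of a rule like this can be sketched in Python. This is only an illustration of the behavior described here, not Sumo Logic's implementation: the helper name `apply_hash_rule` and the pattern `\[memberid=([^\]]+)\]` are hypothetical, the sketch assumes match groups are not nested, and it uses a plain MD5 hex digest, which may not be the same encoding as the hash code shown in the sample above.

```python
import hashlib
import re


def apply_hash_rule(pattern, line):
    """Replace each match group of `pattern` in `line` with an MD5 hex digest.

    Hypothetical helper: text matched outside the groups is left unchanged,
    and a pattern without any match group changes nothing.
    """
    regex = re.compile(pattern)

    def replace_groups(match):
        text = match.group(0)
        offset = match.start()
        # Work from the last group to the first so earlier spans stay valid.
        for index in range(regex.groups, 0, -1):
            start, end = match.span(index)
            if start == -1:  # this group did not take part in the match
                continue
            value = match.group(index).encode("utf-8")
            digest = hashlib.md5(value).hexdigest()
            text = text[:start - offset] + digest + text[end - offset:]
        return text

    return regex.sub(replace_groups, line)


# Shortened version of the sample log line.
line = "[module=RAW] [memberid=dan@demo.com] [remote_ip=98.248.40.103]"
hashed = apply_hash_rule(r"\[memberid=([^\]]+)\]", line)
print(hashed)
```

Only the text inside the parentheses is replaced; the `[memberid=` anchor and the rest of the line pass through untouched.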

Please note the following:

  • Values that you want hashed must be expressed as a match group enclosed in "( )".
  • You can use an anchor to detect specific values. In addition, you can specify multiple match groups. If multiple match groups are specified, each of the values will be hashed uniquely:
    [Figure: Processing rule]
  • The hash algorithm is MD5.
  • If a match group isn't specified, no data will be hashed.
  • Make sure you don't specify a regular expression that matches a full log line. Doing so will result in the entire log line being hashed.
  • For multiline messages, add the single-line modifier (?s) to the beginning and end of the expression so that the string is matched regardless of where it occurs in the message. For example:

(?s).*secur.*(?s)
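
What the single-line modifier changes can be checked with a short Python sketch. The message below is made up, and the sketch assumes the expression has to cover the whole message, which is what the leading and trailing `.*` suggest. Recent versions of Python's `re` module only accept a global inline flag at the start of a pattern, so the trailing `(?s)` from the example above is left out here.

```python
import re

# Hypothetical multiline message; only the second line contains "secur".
message = (
    "2012-05-16 09:43:39,607 -0700 ERROR request rejected\n"
    "    reason: security policy violation\n"
    "    at InboundRawProtocol.getMessages"
)

# Without (?s), "." stops at each newline, so the pattern cannot span
# the whole message.
without_flag = re.fullmatch(r".*secur.*", message)

# With (?s), "." also matches newlines, so "secur" is found wherever it
# occurs in the message.
with_flag = re.fullmatch(r"(?s).*secur.*", message)

print(without_flag is not None)  # False
print(with_flag is not None)     # True
```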