dedup Search Operator

The dedup operator removes duplicate results. You can optionally remove only consecutive duplicates, and you can deduplicate by specific fields. This lets you filter your results down to the most recent event, or the most recent few events, for each identical combination of field values.

For example, to find the most recent value for each service, use the following operation: | dedup 1 by service

Supported features

The dedup operator is supported for the following features:

Syntax

dedup [consecutive] [<int>] [by <field>[, <field2>, ...]]
Parameter: consecutive
Description: Removes duplicate combinations of values that are in succession.
Example: Remove only consecutive duplicate events and keep non-consecutive duplicate events. In this example, events must have the same combination of values in the source and host fields to be removed. Non-consecutive events with the same combination of source and host values are retained.
... | dedup consecutive by source, host

Parameter: int
Description: Specifies the number of most recent events to return.
Example: For search results that have the same source value, keep the first three that occur and remove all subsequent search results.
... | dedup 3 by source

Parameter: field
Description: A comma-separated list of field names to remove duplicate values from. If no fields are specified, the query is run against _raw, the full raw log message. For example, | dedup is the same as | dedup by _raw.
Example: Remove duplicate search results based on _sourceCategory.
... | dedup by _sourceCategory
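
The parameters above can be modeled in a few lines of Python. This is only a sketch of the documented behavior, not Sumo Logic's implementation: the dedup function and the sample rows are hypothetical, and the rows are assumed to arrive in the order the search returned them.

```python
from collections import defaultdict


def dedup(rows, fields, limit=1):
    """Keep the first `limit` rows for each distinct combination of `fields`."""
    seen = defaultdict(int)  # combination of field values -> rows kept so far
    kept = []
    for row in rows:
        key = tuple(row.get(f) for f in fields)
        if seen[key] < limit:
            seen[key] += 1
            kept.append(row)
    return kept


# Hypothetical rows, in the order the search returned them.
rows = [
    {"source": "a", "host": "h1"},
    {"source": "a", "host": "h2"},
    {"source": "b", "host": "h1"},
    {"source": "a", "host": "h1"},
    {"source": "a", "host": "h3"},
]

by_source = dedup(rows, ["source"], limit=3)  # models: dedup 3 by source
by_pair = dedup(rows, ["source", "host"])     # models: dedup by source, host
```

With limit=3, the fourth row whose source is "a" is dropped. With two fields, only the repeated ("a", "h1") combination is dropped.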

Rules

  • Non-aggregate and aggregate queries are supported.
    • Non-aggregate queries process up to 100k results.
    • Aggregate queries process all results.
  • Use the sort operator before dedup to control the order of removed results.
  • Running dedup against the full raw log message is inefficient and is not recommended.
  • The histogram only shows results the dedup operator returned.
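
To illustrate the sorting rule, the Python sketch below models dedup as "keep the first row seen per value" and shows that the input order decides which row survives. The dedup_first helper is hypothetical, and the rows are a subset of this page's sample data. The sketch models the documented behavior only and is not Sumo Logic's implementation.

```python
def dedup_first(rows, field):
    """Keep the first row seen for each value of `field`."""
    seen = set()
    kept = []
    for row in rows:
        if row[field] not in seen:
            seen.add(row[field])
            kept.append(row)
    return kept


# Subset of the sample data, in search order (most recent first).
rows = [
    {"city": "Chennai", "country": "India", "population": 4.7},
    {"city": "Mumbai", "country": "India", "population": 20.7},
    {"city": "Florida", "country": "USA", "population": 2.4},
    {"city": "New York", "country": "USA", "population": 8.8},
]

# Without sorting, the first row per country in search order is kept.
unsorted_result = dedup_first(rows, "country")

# Sorting by population (descending) first keeps the largest city instead.
sorted_rows = sorted(rows, key=lambda r: r["population"], reverse=True)
sorted_result = dedup_first(sorted_rows, "country")

print([r["city"] for r in unsorted_result])  # ['Chennai', 'Florida']
print([r["city"] for r in sorted_result])    # ['Mumbai', 'New York']
```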

Examples

The following examples use this sample data.

Timestamp           | City          | Country | Continent     | Population (in millions)
--------------------|---------------|---------|---------------|-------------------------
05/09/2021 11:32:00 | Las Vegas     | USA     | North America | 2.31
05/09/2021 11:32:00 | Paris         | France  |               | 6.945
05/09/2021 11:30:00 | Karachi       |         | Asia          | 16.1
05/09/2021 11:29:00 | Chennai       | India   | Asia          | 4.7
05/09/2021 11:28:05 | Mumbai        | India   | Asia          | 20.7
05/09/2021 11:28:00 | Bangalore     | India   | Asia          | 12.7
05/09/2021 11:27:00 | Florida       | USA     | North America | 2.4
05/09/2021 11:26:00 | Washington    | USA     | North America | 7.6
05/09/2021 11:25:00 | New York      | USA     | North America | 8.8
05/09/2021 11:24:00 | San Francisco | USA     | North America | 8.5
05/09/2021 11:23:00 | Delhi         | India   | Asia          | 11
05/09/2021 11:22:00 | Kolkata       | India   | Asia          | 4.5

Remove duplicate search results by country

| dedup by country

Returns the most recent record for each country:

[Image: results of dedup by country]

Keep the first 3 duplicate results

For search results that have the same country value, keep the first three that occur and remove all subsequent search results.

| dedup 3 by country

Returns the following results:

[Image: results of dedup 3 by country]

Keep results with same combination of values in multiple fields

For search results that have the same country AND continent values, keep the first two search results that occur and remove all subsequent results.

| dedup 2 by country, continent

Returns the following results:

[Image: results of dedup 2 by country, continent]

Remove only consecutive duplicate events

Remove only consecutive duplicate events and keep non-consecutive duplicate events. In this example, events must have the same combination of values in the country and continent fields to be removed. Non-consecutive events with the same combination of country and continent values are retained.

| dedup consecutive by country, continent

Returns the following results:

[Image: results of dedup consecutive by country, continent]
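
The consecutive behavior can be sketched in Python as collapsing each run of identical key values to its first row. This is an illustrative model under that assumption, not Sumo Logic's implementation, and the dedup_consecutive helper is hypothetical. The rows are the sample-data rows that have both country and continent populated. Rows with an empty field are left out because this page does not say how empty values are compared.

```python
def dedup_consecutive(rows, fields):
    """Drop a row only when it repeats the key of the row directly before it."""
    kept = []
    prev_key = object()  # sentinel that never equals a real key
    for row in rows:
        key = tuple(row.get(f) for f in fields)
        if key != prev_key:
            kept.append(row)
        prev_key = key
    return kept


# Sample-data rows with both fields populated, in search order.
rows = [
    {"city": "Chennai", "country": "India", "continent": "Asia"},
    {"city": "Mumbai", "country": "India", "continent": "Asia"},
    {"city": "Bangalore", "country": "India", "continent": "Asia"},
    {"city": "Florida", "country": "USA", "continent": "North America"},
    {"city": "Washington", "country": "USA", "continent": "North America"},
    {"city": "New York", "country": "USA", "continent": "North America"},
    {"city": "San Francisco", "country": "USA", "continent": "North America"},
    {"city": "Delhi", "country": "India", "continent": "Asia"},
    {"city": "Kolkata", "country": "India", "continent": "Asia"},
]

result = dedup_consecutive(rows, ["country", "continent"])
print([r["city"] for r in result])  # ['Chennai', 'Florida', 'Delhi']
```

Delhi is kept even though an earlier India/Asia row exists, because the USA rows separate the two runs. A non-consecutive dedup on the same fields would keep only Chennai and Florida.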

Copyright © 2024 by Sumo Logic, Inc.