replace Search Operator
The replace operator allows you to replace all instances of a specified string with another string. You can specify the string to replace with a matching regex or literal text. You might use it to find all instances of a name and change it to a new name, or to replace punctuation in a field with different punctuation. This operator is useful anytime you need to rename something.
Syntax
replace(<sourceString>, <searchString>, <replaceString>) as <field>
replace(<sourceString>, /<regex>/, <replaceString>) as <field>
Rules
- An alias is required.
- If any of the inputs are null, the output is null.
- If the searchString is not found or the regex does not match, the sourceString is returned intact.
- Regex must be RE2 compliant.
- Matching is case sensitive.
- When using multiple replace operators on the same field, you must use the same alias. See the example below.
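As a minimal sketch of the last rule, assuming a hypothetical field named msg, two replace operators chained on the same field both write back to the same alias:

```sql
// msg is a hypothetical field name used only for illustration
| replace(msg, "warning", "WARN") as msg
| replace(msg, "error", "ERROR") as msg
```

Because each operator reads msg and writes msg, the second replacement should operate on the output of the first. Both search strings are literal text, so under the case sensitivity rule, values such as Warning or Error would be left unchanged.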
Regex usage
You can use a regex to define what you want to replace. Capture groups are optional. You can use named or numbered capture groups and then reference them in the <replaceString>.
Named capture groups: /(?<section>flight)\/(?<id>[0-9]{5,})/
Here you reference the named capture group section by its name in the <replaceString> with ${section}.
Numbered capture groups: /(flight)\/([0-9]{5,})/
Here you reference the first capture group in the <replaceString> with $1, and the second capture group with $2.
Using $0 references the whole matching string.
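To make the three reference styles concrete, the following sketch applies them to a field named url that contains http://somewebsite.com/flight/12345678/certification. The replacement text ID is only an illustrative placeholder, and the three queries are alternatives, not one chained pipeline.

```sql
// Numbered group: keep "flight" and replace the digits with the placeholder text ID
| replace(url, /(flight)\/([0-9]{5,})/, "$1/ID") as url

// Named group: same idea, referencing the first group by its name
| replace(url, /(?<section>flight)\/(?<id>[0-9]{5,})/, "${section}/ID") as url

// Whole match: wrap the matched digits in square brackets
| replace(url, /[0-9]{5,}/, "[$0]") as url
```

Following the rules above, the first two queries should return http://somewebsite.com/flight/ID/certification, and the third should return http://somewebsite.com/flight/[12345678]/certification.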
Required characters to escape
If the <replaceString> needs to include a dollar sign ($), it must be escaped as \\$. Similarly, a backslash itself must be escaped as \\\\. Other escape sequences include:
- \n: a newline character
- \t: a tab character
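As a hedged illustration of the dollar sign escape, assume a hypothetical field named price that holds values such as 25 USD. The query below combines an escaped dollar sign with a numbered capture group reference:

```sql
// price is a hypothetical field; \\$ is the escaped literal dollar sign, $1 is the captured digits
| replace(price, /([0-9]+) USD/, "\\$$1") as price
```

Under the escaping rules described above, a value of 25 USD should become $25.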
Examples
Replace unique IDs in URLs with a regex
If you have a URL and want to see the number of times it was visited, but do not want the unique IDs in it to split the aggregation, you can replace the IDs with an empty string. Take the following URL, which in this example belongs to a field named url:
http://somewebsite.com/flight/12345678/certification
To remove the ID 12345678 from the field url, you can use the following query with a regex:
| replace(url, /[0-9]{5,}/, "") as url
This returns the URL as:
http://somewebsite.com/flight//certification
This allows you to count the number of times the URL was requested, regardless of the specific ID.
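A minimal sketch of that count, assuming the url field is already available in your search results, pipes the normalized field into a count grouped by url:

```sql
// Strip the numeric ID, then count requests per normalized URL
| replace(url, /[0-9]{5,}/, "") as url
| count by url
```

Note that the regex removes any run of five or more digits, so every URL that differs only by such an ID is grouped into the same row.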