regex_extract

Searches a string and returns text that matches a regular_expression.

Regular expressions are a powerful language for matching patterns of partial words, whole words, or even multiple words. Simple regular expressions are straightforward to use, but complex expressions, while powerful, can be difficult to predict and debug, and difficult for other people in your organization to understand.

So the best practice is: start simple, and add complexity only if you have no other choice.

Syntax

regex_extract(string, regular_expression)

Parameters

string can be any of the following:

regular_expression is a case-sensitive RE2 regular expression (RE2 is an open source engine for processing regular expressions). See the examples and suggestions below. The complete list of operators and syntax is available on GitHub.

Surround the regular expression with quotation marks.
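As a rough illustration of the call shape, here is a minimal Python sketch. The function name regex_extract_sketch is hypothetical, and Python's re module is not the RE2 engine, although simple patterns like these behave the same way in both. The sketch assumes that the function returns the first matching text and that "no match" can be represented as None; neither detail is confirmed by this page.

```python
import re

def regex_extract_sketch(string, regular_expression):
    """Hypothetical stand-in for regex_extract, not the Search Ads 360 implementation.

    Assumption: returns the first matching text, or None when nothing matches.
    """
    match = re.search(regular_expression, string)
    return match.group(0) if match else None

# The pattern can match anywhere inside the string.
print(regex_extract_sketch("Summer Sale 2024", "Sale"))  # Sale
# Matching is case-sensitive, so a lowercase pattern finds nothing here.
print(regex_extract_sketch("Summer Sale 2024", "sale"))  # None
```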

Regular expression syntax

Here's a list of the operators and syntax you may find useful when using regular expressions in Search Ads 360:

Wildcards

.    Matches any single character (letter, number, or symbol). Example: goo.gle matches gooogle, goodgle, goo8gle.
*    Matches zero or more of the previous item. The default previous item is the previous character. Example: goo*gle matches gooogle, goooogle.
+    Matches one or more of the previous item. Example: gooo+gle matches goooogle, but not google.
?    Matches zero or one of the previous item. Example: labou?r matches both labor and labour.
|    Inclusive "or". Example: a|b matches any string that contains a, b, or both.
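The wildcard operators can be tried out with a short Python sketch. The helper first_match is hypothetical and uses Python's re module rather than RE2; for these basic operators the two are assumed to agree.

```python
import re

def first_match(pattern, text):
    """Return the first text matched by pattern, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

cases = [
    ("goo.gle", "goo8gle"),    # . stands for exactly one character
    ("goo.gle", "google"),     # no character left over for the dot
    ("goo*gle", "goooogle"),   # * allows any number of the preceding o
    ("goo*gle", "gogle"),      # ... including zero
    ("gooo+gle", "goooogle"),  # + needs at least one of the preceding o
    ("gooo+gle", "google"),    # only two o's, so no match
    ("labou?r", "labor"),      # ? makes the u optional
    ("labou?r", "labour"),
    ("a|b", "cab"),            # either alternative can match
]

results = {(p, t): first_match(p, t) for p, t in cases}
for (p, t), found in results.items():
    print(f"{p!r} on {t!r}: {found!r}")
```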

Anchors

^    Line starts with. Example: ^site matches site but not mysite.
$    Line ends with. Example: site$ matches site but not sitescan.
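A small Python sketch of the two anchors follows. The helper first_match is hypothetical, and Python's re module stands in for RE2 on the assumption that both treat ^ and $ the same way for single-line text.

```python
import re

def first_match(pattern, text):
    """Return the first text matched by pattern, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

cases = [
    ("^site", "site map"),  # text starts with "site"
    ("^site", "mysite"),    # "site" is present, but not at the start
    ("site$", "mysite"),    # text ends with "site"
    ("site$", "sitescan"),  # "site" is present, but not at the end
]

results = {(p, t): first_match(p, t) for p, t in cases}
for (p, t), found in results.items():
    print(f"{p!r} on {t!r}: {found!r}")
```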

Grouping

()   Grouping. Example: Thank(s|you) matches both Thanks and Thankyou.
[]   Set or range of characters, in any order. Example: [ogl]+ matches google, goooogle, or logic.
-    Expresses a range of characters inside a set. Example: [A-Z] matches any letter of the uppercase English alphabet.
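The following Python sketch shows what text groups and character sets actually pick out. The helper first_match is hypothetical, it always returns the whole matched text, and it uses Python's re module as an assumed stand-in for RE2.

```python
import re

def first_match(pattern, text):
    """Return the whole first match of pattern, or None (hypothetical helper)."""
    m = re.search(pattern, text)
    return m.group(0) if m else None

cases = [
    ("Thank(s|you)", "Thanks"),
    ("Thank(s|you)", "Thankyou"),
    ("Thank(s|you)", "Thank"),     # neither alternative follows "Thank"
    ("[ogl]+", "google"),          # the run stops at "e", which is not in the set
    ("[ogl]+", "logic"),           # the run stops at "i"
    ("[A-Z]+", "visit NYC today"), # first run of uppercase letters
]

results = {(p, t): first_match(p, t) for p, t in cases}
for (p, t), found in results.items():
    print(f"{p!r} on {t!r}: {found!r}")
```

Note that a set such as [ogl]+ matches only the run of listed characters, so the matched text can be shorter than the word that contains it.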

Other

\    Escapes special characters. Example: mysite\.com keeps the dot from being a wildcard.
\s   Whitespace character. Example: \s+.* matches one or more whitespace characters followed by zero or more characters.
\d   Digit. Example: \d65\d matches "2650" but not "2560".
\w   Word character (a-z, A-Z, 0-9, _). Example: ^\w matches any string starting with a word character, such as "Campaign" but not "@Campaign".
\b   Word boundary. Example: \bcity\b matches " city " but not "scarcity".
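The escape and character-class operators can be exercised with the Python sketch below. The helper first_match is hypothetical. Python's re module is used in place of RE2, with the re.ASCII flag so that \d, \w, \s, and \b are limited to ASCII as they are in RE2; treat the equivalence as an assumption.

```python
import re

def first_match(pattern, text):
    """Return the first text matched by pattern, or None (hypothetical helper)."""
    # re.ASCII keeps \d, \w, \s and \b ASCII-only, which is closer to RE2.
    m = re.search(pattern, text, flags=re.ASCII)
    return m.group(0) if m else None

cases = [
    (r"mysite\.com", "mysite.com"),    # escaped dot matches a literal dot
    (r"mysite\.com", "mysite-com"),    # ... and nothing else
    (r"mysite.com", "mysite-com"),     # unescaped dot is a wildcard
    (r"\s+.*", "red  shoes"),          # whitespace, then the rest of the text
    (r"\d65\d", "2650"),               # a digit on each side of 65
    (r"\d65\d", "2560"),               # "65" never appears
    (r"^\w", "Campaign"),              # starts with a word character
    (r"^\w", "@Campaign"),             # starts with a symbol
    (r"\bcity\b", "big city lights"),  # "city" as a whole word
    (r"\bcity\b", "scarcity"),         # "city" inside a longer word
]

results = {(p, t): first_match(p, t) for p, t in cases}
for (p, t), found in results.items():
    print(f"{p!r} on {t!r}: {found!r}")
```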

Example

  • regex_extract(ITEM_TITLE, "\bLabou?r\b")
    If ITEM_TITLE is "Ministry of Labour", the function returns "Labour".
    If ITEM_TITLE is "Ministry of Labor", the function returns "Labor".
    If ITEM_TITLE is "Ministry of labor", the function does not find a match. (regular_expression in Search Ads 360 is case-sensitive)

    Note that if you remove \b, the regular expression matches "Laborious" as well as "Labor". For example:
    regex_extract(ITEM_TITLE, "Labou?r")
    If ITEM_TITLE is "Laborious Hike", the function returns "Labor".
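
The behavior described in this example can be reproduced with a minimal Python sketch. The function regex_extract_sketch is hypothetical, uses Python's re module rather than RE2, and represents "does not find a match" as None, which is an assumption about the return value.

```python
import re

def regex_extract_sketch(string, regular_expression):
    """Hypothetical stand-in for regex_extract: first matching text, or None."""
    match = re.search(regular_expression, string, flags=re.ASCII)
    return match.group(0) if match else None

with_boundaries = r"\bLabou?r\b"
without_boundaries = r"Labou?r"

print(regex_extract_sketch("Ministry of Labour", with_boundaries))  # Labour
print(regex_extract_sketch("Ministry of Labor", with_boundaries))   # Labor
print(regex_extract_sketch("Ministry of labor", with_boundaries))   # None (case-sensitive)
print(regex_extract_sketch("Laborious Hike", with_boundaries))      # None (no word boundary after "Labor")
print(regex_extract_sketch("Laborious Hike", without_boundaries))   # Labor
```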
