Analytics supports regular expressions so you can create more flexible definitions for things like view filters, goals, segments, audiences, content groups, and channel groupings.
In the context of Analytics, regular expressions are specific sequences of characters that broadly or narrowly match patterns in your Analytics data.
For example, if you wanted to create a view filter to exclude site data generated by your own employees, you could use a regular expression to exclude any data from the entire range of IP addresses that serve your employees. Let’s say those IP addresses range from 198.51.100.1 to 198.51.100.25. Rather than enter 25 different IP addresses, you could create a regular expression like 198\.51\.100\.\d* that matches the entire range of addresses. Note that this expression also matches any other address that begins with 198.51.100., so use a narrower expression if other addresses in that block should not be excluded.
Or if you wanted to create a view filter that included only campaign data from two different cities, you could create a regular expression like San Francisco|New York (San Francisco or New York).
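The two expressions above can be tried out in a short sketch. It uses Python's re module as a stand-in for the Analytics regex engine, which may differ in details. The tighter pattern and the matches helper are illustrative assumptions, not part of Analytics.

```python
import re

# Pattern from the text: escaped dots, then zero or more digits.
broad = re.compile(r"198\.51\.100\.\d*")

# Hypothetical tighter pattern: last octet limited to 1-25, anchored at both ends.
tight = re.compile(r"^198\.51\.100\.([1-9]|1[0-9]|2[0-5])$")

# Pattern from the text: either city name.
cities = re.compile(r"San Francisco|New York")

def matches(pattern, value):
    """Return True if the pattern is found anywhere in value (partial match)."""
    return pattern.search(value) is not None

print(matches(broad, "198.51.100.25"))    # True
print(matches(broad, "198.51.100.200"))   # True: \d* also accepts octets above 25
print(matches(tight, "198.51.100.25"))    # True
print(matches(tight, "198.51.100.200"))   # False
print(matches(cities, "New York"))        # True
print(matches(cities, "Chicago"))         # False
```

The broad pattern is shorter and easier to read, while the tight one trades readability for precision. Which is appropriate depends on whether other addresses in the same block should be kept.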
Regex metacharacters
Wildcards
. | Matches any single character (letter, number, or symbol) | 1. matches 10, 1A; 1.1 matches 111, 1A1 |
? | Matches the preceding character 0 or 1 times | 10? matches 1, 10 |
+ | Matches the preceding character 1 or more times | 10+ matches 10, 100 |
* | Matches the preceding character 0 or more times | 1* matches 1, 10 |
| | Creates an OR match. Do not use at the end of an expression | 1|10 matches 1, 10 |
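As a rough check of the wildcard rows above, the sketch below tests each pattern against whole strings with Python's re.fullmatch. Python is only a stand-in here, and the full helper is a hypothetical name. Full matching is used so that each result reflects the metacharacter itself rather than a partial match elsewhere in the string.

```python
import re

def full(pattern, value):
    """Return True if the whole value matches the pattern."""
    return re.fullmatch(pattern, value) is not None

# . matches exactly one character of any kind
print(full(r"1.", "10"), full(r"1.", "1A"), full(r"1.", "1"))      # True True False

# ? makes the preceding character optional
print(full(r"10?", "1"), full(r"10?", "10"), full(r"10?", "100"))  # True True False

# + requires the preceding character at least once
print(full(r"10+", "10"), full(r"10+", "100"), full(r"10+", "1"))  # True True False

# * allows the preceding character any number of times, including zero
print(full(r"1*", ""), full(r"1*", "111"), full(r"1*", "10"))      # True True False

# | matches either alternative
print(full(r"1|10", "1"), full(r"1|10", "10"), full(r"1|10", "100"))  # True True False
```

In a partial match, 1* also matches 10 (as the table states), because the pattern only has to be found somewhere in the string.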
Anchors
^ | Matches the adjacent characters at the beginning of a string | ^10 matches 10, 100, 10x; ^10 does not match 110, 110x |
$ | Matches the adjacent characters at the end of a string | 10$ matches 110, 1010; 10$ does not match 100, 10x |
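The anchor examples can be reproduced with a partial-match search. This is a sketch using Python's re.search as a stand-in for Analytics, and the found helper is a hypothetical name.

```python
import re

def found(pattern, value):
    """Return True if the pattern occurs anywhere in value (partial match)."""
    return re.search(pattern, value) is not None

# ^ pins the match to the start of the string
for value in ["10", "100", "10x", "110", "110x"]:
    print("^10", value, found(r"^10", value))

# $ pins the match to the end of the string
for value in ["110", "1010", "100", "10x"]:
    print("10$", value, found(r"10$", value))
```

The first loop reports True for 10, 100, and 10x and False for 110 and 110x. The second reports True for 110 and 1010 and False for 100 and 10x.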
Groups
( ) | Matches the enclosed characters in exact order anywhere in a string. Also used to group other expressions | (10) matches 10, 101, 1011; ([0-9]|[a-z]) matches any number or lower-case letter |
[ ] | Matches the enclosed characters in any order anywhere in a string | [10] matches 012, 120, 210 |
- | Creates a range of characters within brackets to match anywhere in a string | [0-9] matches any number 0 through 9 |
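To illustrate the difference between parentheses and brackets, here is a minimal sketch in Python, used as a stand-in for the Analytics engine. The found helper is a hypothetical name.

```python
import re

def found(pattern, value):
    """Return True if the pattern occurs anywhere in value (partial match)."""
    return re.search(pattern, value) is not None

# ( ) requires the characters in the exact order given
print(found(r"(10)", "1011"))   # True: contains "10"
print(found(r"(10)", "0011"))   # False: has 0 and 1, but never "10" in order

# [ ] requires only one of the listed characters, in any position
print(found(r"[10]", "0011"))   # True
print(found(r"[10]", "234"))    # False

# - inside brackets defines a range
print(found(r"[0-9]", "abc7"))  # True
print(found(r"[0-9]", "abc"))   # False

# Parentheses can group alternatives: one digit or one lower-case letter
single = re.compile(r"([0-9]|[a-z])")
print(single.fullmatch("7") is not None)  # True
print(single.fullmatch("q") is not None)  # True
print(single.fullmatch("Q") is not None)  # False
```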
Escape
\ | Indicates that the adjacent character should be interpreted literally rather than as a regex metacharacter | \. indicates that the adjacent dot should be interpreted as a period or decimal rather than as a wildcard; 216\.239\.32\.34 matches 216.239.32.34 |
Tips
Default matching behavior in Universal Analytics versus Google Analytics 4
By default, regular expressions in Universal Analytics properties are treated as a "partial match." The expression will be true if the pattern you provide is contained anywhere in the data.
For example, if you provide the pattern "India", the regex matches "India", "Indian", "Indiana", "Indianapolis", and so on. You don't need to use metacharacters to achieve this partial match.
In a Google Analytics 4 property, the default regex is a "full match." The data must exactly match the pattern you provide. For example, the pattern "India" matches only the value "India". To make this regex act like a partial match, you must use metacharacters: "India.*" will return any value that begins with "India" and ends with anything (or nothing) else. To match "India" anywhere in the value, add the same wildcard on both sides: ".*India.*".
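The difference between the two defaults can be approximated in Python, where re.search behaves like a partial match and re.fullmatch behaves like a full match. This is only an analogy for how the two property types evaluate a pattern. The function names and sample values below are hypothetical.

```python
import re

def partial_match(pattern, value):
    """Approximates the Universal Analytics default: pattern found anywhere."""
    return re.search(pattern, value) is not None

def full_match(pattern, value):
    """Approximates the Google Analytics 4 default: whole value must match."""
    return re.fullmatch(pattern, value) is not None

values = ["India", "Indiana", "Indianapolis", "West India"]

print([v for v in values if partial_match(r"India", v)])
# ['India', 'Indiana', 'Indianapolis', 'West India']

print([v for v in values if full_match(r"India", v)])
# ['India']

print([v for v in values if full_match(r"India.*", v)])
# ['India', 'Indiana', 'Indianapolis']

print([v for v in values if full_match(r".*India.*", v)])
# ['India', 'Indiana', 'Indianapolis', 'West India']
```

Wrapping the pattern in .* on both sides makes a full match accept the same values as a partial match.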
Use simple expressions
Keep your regular expressions simple. Simple regex is easier for another user to interpret and modify.
Match metacharacters
Use the backslash (\) to escape regex metacharacters when you need those characters to be interpreted literally. For example, if you use a dot as the decimal separator in an IP address, escape it with a backslash (\.) so that it isn’t interpreted as a wildcard.
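The effect of a missing escape is easy to see in a small sketch. It uses Python as a stand-in for the Analytics engine, and the lookalike value is made up for illustration. re.escape is a Python helper that adds the backslashes for you. Analytics has no equivalent setting, so there the escapes are typed by hand.

```python
import re

unescaped = r"216.239.32.34"     # each dot is a wildcard
escaped = r"216\.239\.32\.34"    # each dot is a literal period

address = "216.239.32.34"
lookalike = "216x239y32z34"      # hypothetical value that is not an IP address

print(re.fullmatch(unescaped, address) is not None)    # True
print(re.fullmatch(unescaped, lookalike) is not None)  # True: wildcards accept x, y, z
print(re.fullmatch(escaped, address) is not None)      # True
print(re.fullmatch(escaped, lookalike) is not None)    # False

# Python can generate the escaped form from the literal text.
print(re.escape(address) == escaped)                   # True
```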
Use metacharacters to limit the match
Regular expressions are greedy by nature: if you don’t tell them not to, they match what you specify plus any adjacent characters. For example, in a partial match, site matches mysite, yoursite, theirsite, parasite: any string that contains “site”. If you need to make a specific match, construct your regex accordingly. For example, if you need to match only the string “site”, then construct your regex so that “site” is both the beginning and end of the string: ^site$.
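As a sketch of the site example, the snippet below filters a hypothetical list of values, first with the bare pattern and then with the anchored one. Python's re.search stands in for a partial match in Analytics.

```python
import re

# Hypothetical sample values, all of which contain "site".
values = ["site", "mysite", "yoursite", "parasite", "sites"]

# Bare pattern: any value containing "site" passes.
partial = [v for v in values if re.search(r"site", v)]

# Anchored pattern: only the exact string "site" passes.
exact = [v for v in values if re.search(r"^site$", v)]

print(partial)  # all five values
print(exact)    # ['site']
```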