About Regular Expressions

Updated by: LunaMetrics on 8 March 2008

Google Analytics supports regular expressions so that users can define more flexible filters, goals, and other settings. A regular expression is a pattern made up of ordinary characters and special characters, together with the rules that govern how those characters are interpreted, that matches or captures portions of a field. Most of the filters included in Google Analytics use these expressions to match the data and perform an action when a match is found.

For instance, an "exclude by IP address" filter will exclude the hit if the regular expression that you write matches the visitor's IP address. Your regular expression in the Exclude by IP Address field might look like this:

163\.212\.171\.123

So when someone at IP address 163.212.171.123 visits your site, a match happens between 163\.212\.171\.123 and 163.212.171.123, and the hit is excluded. The backslashes in the expression above are only one example of the special characters that regular expressions use.
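
The effect of the backslashes can be sketched in a few lines of Python. This is only an illustration: the helper is_excluded is a hypothetical name, and the sketch assumes that Python's re module treats these basic characters the same way Google Analytics does.

```python
import re

# Hypothetical helper: mimics an "exclude" filter by testing whether the
# pattern matches anywhere in the visitor's IP address.
def is_excluded(pattern, ip_address):
    return re.search(pattern, ip_address) is not None

escaped = r"163\.212\.171\.123"   # dots are literal dots
unescaped = "163.212.171.123"     # dots are wildcards

print(is_excluded(escaped, "163.212.171.123"))    # True
print(is_excluded(escaped, "163x212x171x123"))    # False
# Without backslashes, each dot matches any character, including "x".
print(is_excluded(unescaped, "163x212x171x123"))  # True
```

The last line shows why the backslashes matter: without them, the pattern matches strings that are not the intended address.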

Regular Expression Characters

Wildcards

.  Matches any single character (letter, number, or symbol). Example: goo.gle matches gooogle, goodgle, and goo8gle.
*  Matches zero or more of the previous item. By default, the previous item is the previous character. Example: goo*gle matches gooogle and goooogle (and also gogle and google).
+  Works like the star, except that it must match at least one of the previous item. Example: gooo+gle matches goooogle, but never google.
?  Matches zero or one of the previous item. Example: labou?r matches both labor and labour.
|  Lets you do an "or" match. Example: a|b matches a or b.
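
The wildcard examples above can be checked with a short Python sketch. The helper name matching and the extra candidate strings (such as gogle and c) are illustrative additions, and the sketch assumes Python's re module handles these characters the same way Google Analytics does.

```python
import re

# Patterns from the list above, each paired with a few candidate strings.
examples = {
    r"goo.gle": ["gooogle", "goodgle", "goo8gle", "google"],
    r"goo*gle": ["gogle", "google", "gooogle"],
    r"gooo+gle": ["goooogle", "google"],
    r"labou?r": ["labor", "labour"],
    r"a|b": ["a", "b", "c"],
}

def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [text for text in candidates if re.search(pattern, text)]

for pattern, candidates in examples.items():
    print(pattern, "->", matching(pattern, candidates))
```

Here goo.gle does not match google, because the dot must consume exactly one character, while goo*gle matches gogle, because the star allows zero occurrences of the previous o.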

Anchors

^  Requires that your data be at the beginning of its field. Example: ^site matches site but not mysite.
$  Requires that your data be at the end of its field. Example: site$ matches site but not sitescan.
Note: to understand why anchors are necessary, please read Tips for Regular Expressions at the bottom of this page.
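
As a minimal sketch of the two anchors, again assuming Python's re module behaves like Google Analytics for these characters (the helper name is_match is hypothetical):

```python
import re

def is_match(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# ^ pins the match to the beginning of the field.
print(is_match(r"^site", "site"))      # True
print(is_match(r"^site", "mysite"))    # False

# $ pins the match to the end of the field.
print(is_match(r"site$", "site"))      # True
print(is_match(r"site$", "sitescan"))  # False
```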

Grouping

()  Use parentheses to create an item, instead of accepting the default. Example: Thank(s|you) matches both Thanks and Thankyou.
[]  Use brackets to create a list of items to match. Example: [abc] creates a list with a, b, and c in it.
-  Use dashes with brackets to extend your list. Example: [A-Z] creates a list for the uppercase English alphabet.
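
The grouping characters can be tried out in the same way. The sample strings cabbage and Google Analytics are illustrative choices, and the sketch assumes Python's re module matches Google Analytics for these characters.

```python
import re

# Parentheses turn "s|you" into a single item that follows "Thank".
thanks = re.compile(r"Thank(s|you)")
print(bool(thanks.search("Thanks")))    # True
print(bool(thanks.search("Thankyou")))  # True
print(bool(thanks.search("Thank")))     # False

# Brackets match any one character from the list.
abc_hits = re.findall(r"[abc]", "cabbage")
print(abc_hits)    # ['c', 'a', 'b', 'b', 'a']

# A dash inside brackets extends the list to a range.
upper_hits = re.findall(r"[A-Z]", "Google Analytics")
print(upper_hits)  # ['G', 'A']
```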

Other

\  Turns a regular expression character into an everyday character. Example: mysite\.com keeps the dot from being a wildcard.
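
If you build patterns in a script before pasting them into Google Analytics, the escaping can be automated. The sketch below uses Python's re.escape, which is a Python feature and not part of Google Analytics; the string mysiteXcom is an illustrative test value.

```python
import re

# With the backslash, the dot only matches a literal dot.
print(bool(re.search(r"mysite\.com", "mysite.com")))  # True
print(bool(re.search(r"mysite\.com", "mysiteXcom")))  # False

# Without the backslash, the dot is a wildcard and matches "X" too.
print(bool(re.search(r"mysite.com", "mysiteXcom")))   # True

# re.escape adds the backslashes for you.
escaped = re.escape("mysite.com")
print(escaped)  # mysite\.com
```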

Tips for Regular Expressions

  1. Keep each regular expression as simple as possible so that you and your colleagues can work with it easily in the future.
  2. Make sure you use a backslash if you have characters like "?" or "." and you wish to match those literal characters -- otherwise, they will be interpreted as special regular expression characters.
  3. Not all regular expressions include special characters. For example, you can specify that a Google Analytics goal be a regular expression, and even if you don't have any special characters, your goal will be interpreted according to the rules of regular expressions.
  4. Regular expressions match anywhere in a field unless you anchor them. For example, site matches mysite, yoursite, and sitescan. Using site as your regular expression is equivalent to asking to match all strings that contain site. Therefore, use anchors whenever necessary to get a more accurate match. ^site$, which uses both a beginning ^ and an ending $ anchor, ensures that the field starts with site, ends with site, and includes nothing else. Notice too that there are no special characters in the regular expression site; it is interpreted as a regular expression only because it is in a regular expression-sensitive field.
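
The last tip can be sketched as follows. The list of field values and the helper name fields_matching are hypothetical, and the sketch assumes Python's re.search behaves like a regular expression-sensitive field in Google Analytics.

```python
import re

# Hypothetical field values, taken from the examples in tip 4.
fields = ["site", "mysite", "yoursite", "sitescan"]

def fields_matching(pattern, values):
    """Return the values in which the pattern is found."""
    return [value for value in values if re.search(pattern, value)]

# No anchors: every value that contains "site" matches.
print(fields_matching(r"site", fields))
# ['site', 'mysite', 'yoursite', 'sitescan']

# Both anchors: only the exact value "site" matches.
print(fields_matching(r"^site$", fields))
# ['site']
```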