To create a regular expression, you must use specific syntax—that is, special characters and construction rules. For example, the following is a simple regular expression that matches any 10-digit telephone number, in the pattern nnn-nnn-nnnn:
\d{3}-\d{3}-\d{4}
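
To try the pattern above outside the product, here is a minimal sketch using Python's built-in `re` module. Python is only a stand-in here: its regex engine is not the one that evaluates Content Compliance rules, and the helper name `contains_phone_number` and the sample strings are hypothetical.

```python
import re

# The pattern from the text: three digits, a dash, three digits, a dash, four digits.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

def contains_phone_number(text: str) -> bool:
    """Return True if text contains a number shaped like nnn-nnn-nnnn."""
    # search() looks for the pattern anywhere in the text, not only at the start.
    return PHONE_PATTERN.search(text) is not None

print(contains_phone_number("Call 555-867-5309 today"))  # dashes in the expected places
print(contains_phone_number("Call 5558675309 today"))    # same digits, no dashes
```

The first call reports a match and the second does not, because the dashes in the pattern are literal characters that must appear in the text.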
For additional instructions and guidelines, see Guidelines for Using Regular Expressions and Examples of Regular Expressions. See also Configure Content Compliance settings.
The following table describes some of the most common special characters for use in regular expressions. The characters are grouped into anchors, metacharacters, character classes, shorthand character classes, groups, and quantifiers:
Characters | Description |
---|---|
Anchors | |
^ | (caret) Matches the start of the line or string of text that the regular expression is searching. For example, a content rule with the location Subject line and the regular expression ^abc captures any email message whose subject line begins with the letters abc. |
$ | (dollar) Matches the end of the line or string of text that the regular expression is searching. For example, a content rule with the location Subject line and the regular expression xyz$ captures any email message whose subject line ends with the letters xyz. |
Metacharacters | |
. | (dot) Matches any single character, except a new line. |
\| | (pipe) Indicates alternation, that is, an “or.” For example: cat\|dog matches the word cat or dog. |
\ | (backslash) Indicates that the next character is a literal rather than a special character. For example: \. matches a literal period rather than any single character (the dot character). |
Character Classes | |
[...] | Matches any character from a set of characters. To specify a range, separate its first and last characters with a dash. For example: [123] matches the digit 1, 2, or 3; [a-f] matches any letter from a to f. Note: Regular expressions in Content Compliance policies are case sensitive. |
[^...] | Matches any character not in the set of characters. For example: [^a-f] matches any character that’s not a letter from a to f. Note: Regular expressions in Content Compliance policies are case sensitive. |
[:alnum:] | Matches alphanumeric characters (letters or digits): a-z, A-Z, or 0-9 Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:alnum:]]. |
[:alpha:] | Matches alphabetic characters (letters): a-z or A-Z Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:alpha:]]. |
[:digit:] | Matches digits: 0-9 Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:digit:]]. |
[:graph:] | Matches visible characters only—that is, any characters except spaces, control characters, and so on. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:graph:]]. |
[:punct:] | Matches punctuation characters and symbols: ! " # $ % & ' ( ) * + , \ - . / : ; < = > ? @ [ ] ^ _ ` { | } Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:punct:]]. |
[:print:] | Matches visible characters and spaces. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:print:]]. |
[:space:] | Matches all whitespace characters, including spaces, tabs, and line breaks. Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:space:]]. |
[:word:] | Matches any word character—that is, any letter, digit, or underscore: a-z, A-Z, 0-9, or _ Note: This character class must be surrounded with another set of square brackets when you use it in a regular expression, for example: [[:word:]]. |
Shorthand Character Classes | |
\w | Matches any word character, that is, any letter, digit, or underscore: a-z, A-Z, 0-9, or _. Equivalent to [[:word:]]. |
\W | Matches any non-word character, that is, any character that’s not a letter, digit, or underscore. Equivalent to [^[:word:]]. |
\s | Matches any whitespace character. For example, use this character to specify a space between words in a phrase: stock\stips matches the phrase stock tips. Equivalent to [[:space:]]. |
\S | Matches any character that’s not a whitespace character. Equivalent to [^[:space:]]. |
\d | Matches any digit from 0-9. Equivalent to [[:digit:]]. |
\D | Matches any character that’s not a digit from 0-9. Equivalent to [^[:digit:]]. |
Groups | |
(...) | Groups parts of an expression. Use grouping to apply a quantifier to a group or to match a character class before or after a group. |
Quantifiers | |
{n} | Matches the preceding expression exactly n times. For example: [a-c]{2} matches exactly two consecutive letters from a to c. Thus, the expression matches ab or bc as a whole, but it cannot match all of abc or aabbc. |
{n,m} | Matches the preceding expression a minimum of n times and a maximum of m times. For example: [a-c]{2,4} matches a run of 2 to 4 consecutive letters from a to c. Thus, the expression matches ab or abc as a whole, but it cannot match all of aabbc. |
? | Indicates that the preceding character or expression can match 0 or 1 times. Equivalent to the range {0,1}. For example, the following regular expression: colou?r matches either colour or color, because the ? makes the letter u optional. |
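
The examples in the table can be checked with a short script. The following is a sketch in Python's built-in `re` module, used only as a convenient stand-in. It is not the engine behind Content Compliance, and it does not implement the bracketed classes such as [[:alnum:]], so those rows are left out. The helper names `found` and `whole` and the sample strings are hypothetical. `found` looks for the pattern anywhere in the text, while `whole` requires the pattern to cover the entire text, which is the reading the quantifier examples rely on.

```python
import re

def found(pattern: str, text: str) -> bool:
    """True if the pattern matches anywhere inside the text."""
    return re.search(pattern, text) is not None

def whole(pattern: str, text: str) -> bool:
    """True only if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# Each entry: (description, actual result, result the table leads us to expect)
checks = [
    # Anchors
    ("^abc, subject starts with abc", found(r"^abc", "abc quarterly report"), True),
    ("^abc, abc appears later", found(r"^abc", "re: abc"), False),
    ("xyz$, subject ends with xyz", found(r"xyz$", "report xyz"), True),
    # Metacharacters
    ("cat|dog, text contains dog", found(r"cat|dog", "hot dog"), True),
    ("escaped dot, text has no period", found(r"\.", "no period here"), False),
    # Character classes (case sensitive by default)
    ("[a-f] against uppercase ABC", found(r"[a-f]", "ABC"), False),
    ("[^a-f] against abcz", found(r"[^a-f]", "abcz"), True),
    # Shorthand character classes
    ("stock\\stips in a phrase", found(r"stock\stips", "hot stock tips"), True),
    # Quantifiers, read as whole-text matches
    ("[a-c]{2} covers ab", whole(r"[a-c]{2}", "ab"), True),
    ("[a-c]{2} covers abc", whole(r"[a-c]{2}", "abc"), False),
    ("[a-c]{2} found inside abc", found(r"[a-c]{2}", "abc"), True),
    ("[a-c]{2,4} covers abc", whole(r"[a-c]{2,4}", "abc"), True),
    ("[a-c]{2,4} covers aabbc", whole(r"[a-c]{2,4}", "aabbc"), False),
    ("colou?r covers color", whole(r"colou?r", "color"), True),
    ("colou?r covers colour", whole(r"colou?r", "colour"), True),
]

for description, actual, expected in checks:
    status = "ok" if actual == expected else "MISMATCH"
    print(f"{status:8} {description}: {actual}")
```

Note the pair of checks for [a-c]{2} against abc: the pattern cannot cover all three letters, but an unanchored search still finds ab inside the text. When a rule should reject longer runs, adding the ^ and $ anchors is one way to require a whole-line match.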