The data-redaction feature helps to prevent the inadvertent collection of PII in the form of email addresses and URL query parameters. Data redaction uses text patterns to identify likely email addresses across all event parameters and the URL query parameters that are included as part of the event parameters page_location, page_referrer, page_path, link_url, video_url, and form_destination.
A query parameter is a key-value pair (e.g., language=english) that follows the question mark (?) at the end of a URL. For example, the value for the event parameter page_location might be a URL followed by its query parameters.
Data redaction evaluates events before they are collected to find and remove any text it understands as an email address or query parameter key-value pair. After removal of the offending text, data collection proceeds as expected.
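To make the idea of pattern-based email redaction concrete, here is a minimal Python sketch. It is not the implementation Analytics uses: the regular expression, the "(redacted)" placeholder, and the function name redact_emails are all assumptions chosen for illustration. It scans every event parameter value and replaces anything that looks like an email address.

```python
import re

# Illustrative pattern only (an assumption); the text patterns Analytics
# actually applies are not described in this article.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
PLACEHOLDER = "(redacted)"  # assumed placeholder text


def redact_emails(event_params: dict) -> dict:
    """Return a copy of the event parameters with email-like text replaced."""
    return {
        name: EMAIL_PATTERN.sub(PLACEHOLDER, value)
        for name, value in event_params.items()
    }


# Hypothetical event: one email inside a URL, one inside free text.
event = {
    "page_location": "https://www.example.com/signup?email=jane.doe@example.com",
    "form_destination": "https://www.example.com/thanks",
    "search_term": "contact jane.doe@example.com today",
}
redacted = redact_emails(event)
for name, value in redacted.items():
    print(f"{name}: {value}")
```

In this sketch the email check runs over all parameters, matching the description above, while parameters with no email-like text pass through unchanged.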
You configure data redaction as one of the settings for a web data stream.
When you create a new property, email data redaction is on by default. For properties that you created before the release of this feature, you need to enable data redaction using the instructions below.
It's important to remember that while data redaction provides a powerful tool against inadvertently collecting PII, the ultimate responsibility for meeting regulatory requirements still lies with the entity collecting data. To further help you meet that responsibility, this feature lets you test your configuration to understand whether the text patterns you identify are redacted as expected. You can also use DebugView to monitor in real time how Analytics collects events from your site.
Some things to keep in mind
Data redaction is currently available only for web data streams.
Data redaction evaluates event data for email addresses on a best-effort basis.
Data redaction occurs client side after Analytics modifies or creates events (which also occurs client side) and before data is sent to Analytics.
Data redaction accepts percent-encoded URL query parameters, including Unicode characters accepted by browsers. For example, if you enter 名 as a query parameter, and enter https://www.example.com?名=john as a test URL, data redaction will interpret it as follows:
|Test URL after you enter it||Redacted version|
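As background on percent-encoding, the Python sketch below shows that the raw character 名 and its percent-encoded form refer to the same query parameter name once decoded. It uses only the standard library and says nothing about how Analytics displays the test result; the helper query_keys is a hypothetical name introduced for this example.

```python
from urllib.parse import parse_qsl, quote, urlsplit


def query_keys(url: str) -> list:
    """Return the decoded query parameter names found in a URL."""
    query = urlsplit(url).query
    return [key for key, _ in parse_qsl(query, keep_blank_values=True)]


raw_url = "https://www.example.com?名=john"
# quote() percent-encodes the UTF-8 bytes of the character.
encoded_url = "https://www.example.com?" + quote("名") + "=john"

print(encoded_url)
print(query_keys(raw_url) == query_keys(encoded_url))
```

Both forms decode to the same parameter name, which is why a parameter entered as 名 can be matched in a URL that a browser has percent-encoded.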
Data redaction may incorrectly interpret text as an email address and redact the text; for example, if the text includes "@" followed by a top-level domain name (e.g., example.com) it may be incorrectly removed.
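The false-positive risk can be illustrated with a deliberately loose pattern. The Python sketch below is an assumption for demonstration only and is not the pattern Analytics uses: it treats "@" followed directly by a dotted domain as email-like, so a handle-style string is flagged alongside a genuine address.

```python
import re

# Deliberately loose, hypothetical pattern: optional local part, "@",
# then a dotted domain name.
LOOSE_EMAIL = re.compile(r"[\w.+-]*@[\w-]+(?:\.[\w-]+)+")

samples = [
    "reach me at jane@example.com",    # a genuine address
    "find us on social @example.com",  # handle-like text, not an address
    "meeting @ 10.30 tomorrow",        # "@" with no domain directly after it
]

# True where the loose pattern would treat the text as containing an email.
flagged = [bool(LOOSE_EMAIL.search(text)) for text in samples]
for text, hit in zip(samples, flagged):
    print(f"{hit!s:5} {text}")
```

The second sample is flagged even though it is not an email address, which is the kind of over-matching to check for when you test your own configuration.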
Data redaction does not evaluate HTTP header values (for example, referer, which may contain query parameters on older browsers).
Configure data redaction
- In Admin, under Data collection and modification, click Data streams.
- Click the relevant web data stream.
- In the Events section, click Redact data.
- If you want to redact email addresses and/or URL query parameters, turn on the switch for each option.
- If you choose to redact URL query parameters, enter a list of the query parameters you want to redact (e.g., firstname, lastname, email_address). Press return/Enter after each parameter.
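The effect of a query parameter list such as the one in the last step can be sketched in Python as follows. This is an illustrative approximation, not Analytics' own logic: the function name redact_query_params, the "(redacted)" placeholder, the choice to keep the key while replacing the value, and the case-sensitive key comparison are all assumptions.

```python
from urllib.parse import parse_qsl, quote, urlsplit, urlunsplit

PLACEHOLDER = "(redacted)"  # assumed placeholder text


def redact_query_params(url: str, keys_to_redact: set) -> str:
    """Replace the values of the listed query parameters in a URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    new_query = "&".join(
        # Keep the key; swap the value only when the key is on the list.
        f"{quote(key)}={PLACEHOLDER if key in keys_to_redact else quote(value)}"
        for key, value in pairs
    )
    return urlunsplit(parts._replace(query=new_query))


# Hypothetical configuration mirroring the example list in the step above.
configured = {"firstname", "lastname", "email_address"}
sample = "https://www.example.com/form?firstname=john&lastname=doe&lang=en"
result = redact_query_params(sample, configured)
print(result)
```

Parameters that are not on the list (lang in this example) are left as they are, and a URL with no query string is returned unchanged.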
Use the Test data redaction section to see how Analytics removes data. Analytics will test for the options you turned on in the steps above.
- Enter sample text containing an email address, or a URL that includes the query parameters you entered in the steps above along with sample values (e.g.,
- Click Preview redacted data.
Under Redacted version, you'll see an example of the data that Analytics would collect given your settings. For example, if your sample text is:
then the redacted version will be: