Companies that implement consent requirements for Analytics cookies lose data in their Google Analytics reporting in proportion to the number of users who decline those cookies. The resulting measurement is incomplete, which prevents companies from answering questions like:
- How many Daily Active Users do I have?
- How many new users did I acquire from my last campaign?
- What is the user journey from landing on my website to actually making a purchase?
- How many of my site visitors are based in Germany vs. the UK?
- What is the difference in user behavior between mobile vs. web visitors?
Behavioral modeling for consent mode aims to fill this data gap by modeling the behavior of users who decline Analytics cookies based on the behavior of similar users who accept them. The training data for the model is the consented user data from the property where modeling is activated.
For example, behavioral modeling estimates data based on user and session metrics, such as daily active users and conversion rate, that may be unobservable when identifiers like cookies or user IDs are not fully available. Without modeling, you have a less complete understanding of user behavior on your site based only on the observed data you have available.
Modeled data vs. observed data
When users visit your site and grant consent for Analytics cookies, or when they don't opt out of personalization using advertising ID in Android Settings, Analytics associates user behavior with various identifiers to provide continuity in measurement. We refer to this kind of data as observed data because it comes from users who have given us permission to observe their behavior.
When users don't grant consent to the use of Analytics cookies or equivalent app identifiers, events are not associated with a persistent user identifier. For example, if Analytics collects 10 pageview events, it can’t observe and report whether that’s 10 users or 1 user. Instead, Analytics applies machine learning to estimate the behavior of those users based on the behavior of similar users who do accept analytics cookies or equivalent app identifiers.
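The ambiguity described above can be illustrated with a minimal sketch. It assumes each event carries an optional identifier, here a hypothetical `client_id` field that is `None` when consent was declined; this is an illustration only and not how Analytics stores or processes events.

```python
from typing import Optional, Sequence, Tuple


def user_count_bounds(client_ids: Sequence[Optional[str]]) -> Tuple[int, int]:
    """Return (lower, upper) bounds on the number of users behind a set of events.

    client_ids holds one entry per event; None marks an event with no
    persistent identifier (hypothetical representation of declined consent).
    """
    # Events with an identifier can be deduplicated into observed users.
    observed_users = len({cid for cid in client_ids if cid is not None})
    # Events without an identifier could all come from one user, or each from a different user.
    unidentified_events = sum(1 for cid in client_ids if cid is None)
    lower = observed_users + (1 if unidentified_events else 0)
    upper = observed_users + unidentified_events
    return lower, upper


# Ten pageviews with no identifier: anywhere from 1 to 10 users.
print(user_count_bounds([None] * 10))  # (1, 10)
```

Observation alone yields only a range here, which is the gap that modeling is meant to narrow.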
Google's behavioral modeling approach
Google's behavioral modeling approach applies the following machine learning best practices.
Check for accuracy and communicate changes
Holdback validation maintains the accuracy of Google’s models. Estimated user data is compared to a portion of observed user data that was held back from model training, and the information is used to tune the models. Google will communicate changes that might have a large impact on your data.
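As a rough illustration of the general idea of holdback validation, the sketch below fits a deliberately simple estimator (a users-per-event ratio) on part of the consented data and checks it against a held-back portion. The estimator, the function names, and the data shape are all assumptions made for this example; Google's actual models and validation procedure are not described in this article.

```python
import random
from typing import List, Sequence, Tuple

# Each row is (event_count, user_count) for one slice of consented traffic (hypothetical shape).
Row = Tuple[int, int]


def fit_users_per_event(rows: Sequence[Row]) -> float:
    """Toy model: average number of users per event in the training rows."""
    total_events = sum(events for events, _ in rows)
    total_users = sum(users for _, users in rows)
    return total_users / total_events


def holdback_relative_error(rows: Sequence[Row], holdback_fraction: float = 0.2, seed: int = 0) -> float:
    """Train on part of the observed data and measure the error on the held-back part."""
    shuffled: List[Row] = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_holdback = max(1, int(len(shuffled) * holdback_fraction))
    holdback, training = shuffled[:n_holdback], shuffled[n_holdback:]

    ratio = fit_users_per_event(training)
    predicted_users = ratio * sum(events for events, _ in holdback)
    observed_users = sum(users for _, users in holdback)
    return abs(predicted_users - observed_users) / observed_users


# Synthetic rows where users are always one fifth of events, so the toy model fits exactly.
rows = [(10 * i, 2 * i) for i in range(1, 11)]
print(holdback_relative_error(rows) < 1e-9)  # True
```

A large error on the held-back rows would signal that the estimator needs tuning before its output is trusted.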
Maintain rigorous reporting
Behavioral modeling is included only when the prerequisites are met and there is high confidence in model quality. For example, if there isn’t enough consented traffic to inform the model, then events triggered by unconsented users aren't reported. This helps ensure the accuracy of the data.
Customize for your business
Google’s more general modeling algorithm is separately applied to reflect your unique business and customer behavior.
How behavioral modeling appears in Google Analytics
Analytics seamlessly integrates modeled data and observed data in your reports. When Analytics includes modeled data, you will probably see differences versus reports that include only observed data (e.g., higher user counts in reports that include modeled data).
Administrators can manage behavioral modeling for consent mode in Admin > Property column > Reporting Identity.
Use the data-quality icon to see when modeled data is integrated.
The following table summarizes the messages you might see via the icon.
|Data-quality icon status||Description|
|Including estimated user data||As of [modeling effective date], Analytics is estimating data that's missing due to factors such as cookie consent.|
|Including estimated user data*||As of [modeling effective date], Analytics is estimating all possible data that’s missing due to factors like cookie consent. *Cards with realtime or retention data don’t include estimation, however, and only contain data from users who consented to the use of identifiers.|
|Excluding estimated user data||Your property’s reporting identity setting doesn’t allow Analytics to estimate data that’s missing due to factors such as cookie consent. Unless you use the blended setting, your reports only include data available from users who consented to the use of identifiers.|
|Estimated user data unavailable||The date range selected is prior to when this property became eligible for estimated data.|
|Estimated user data unavailable||This report includes only realtime or retention data from users who consent to the use of identifiers. It doesn't include estimated data.|
Some pages in the Analytics interface will also display a banner with information about the modeling status.
The following table summarizes the messages you might see via a banner.
|Banner message||Banner location|
|Most templates include only data from users who consented to the use of identifiers, except for the free-form and segment-overlap templates, which do include data from estimated users.||Explorations home page|
|If an exploration has a segment with a sequence, it will show only data for users who consented to the use of identifiers.||Exploration detail page|
|This [report/exploration/audience] includes only data from users who consented to the use of identifiers.||Exploration detail page|
|If this segment includes a sequence, it will show only data for users who consented to the use of identifiers.||Segment builder|
To successfully train behavioral models, Analytics requires that your Google Analytics 4 property meet the following requirements:
- Consent mode is enabled across all pages of your site(s) and/or all app screens of your app(s).
- At least 1,000 daily events with analytics_storage='denied' for at least 7 days.
- At least 1,000 daily users sending events with analytics_storage='granted' for at least 7 of the previous 28 days. Note that it may take more than 7 days of meeting the data threshold within those 28 days to successfully train the model; however it's possible that even the additional data won't be sufficient for Analytics to train the model.
Note that behavioral modeling starts from the date a given property becomes eligible.
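The data thresholds above can be checked against your own daily counts with a small sketch like the following. The function names and the input lists are hypothetical, and the code assumes the 28-day window also applies to the denied-event requirement, which the requirements do not state explicitly. Passing this check does not guarantee that Analytics can train a model.

```python
from typing import Sequence

DAILY_THRESHOLD = 1_000   # minimum daily events (denied) or daily users (granted)
MIN_QUALIFYING_DAYS = 7   # number of days that must meet the threshold
LOOKBACK_DAYS = 28        # stated for granted users; applied to denied events as an assumption


def qualifying_days(daily_counts: Sequence[int]) -> int:
    """Count the days within the most recent LOOKBACK_DAYS that meet the threshold."""
    recent = daily_counts[-LOOKBACK_DAYS:]
    return sum(1 for count in recent if count >= DAILY_THRESHOLD)


def meets_data_thresholds(denied_events: Sequence[int], granted_users: Sequence[int]) -> bool:
    """Check both volume requirements; inputs are per-day counts, oldest first."""
    return (
        qualifying_days(denied_events) >= MIN_QUALIFYING_DAYS
        and qualifying_days(granted_users) >= MIN_QUALIFYING_DAYS
    )


# Hypothetical daily counts for the last 28 days.
denied = [1200] * 7 + [300] * 21
granted = [500] * 20 + [1500] * 8
print(meets_data_thresholds(denied, granted))  # True
```

The check covers only the two volume requirements; whether consent mode is enabled on every page and screen has to be verified separately.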
The following features don't support using modeled behavioral data:
- Realtime reports and cards with realtime data
- Explorations except free-form tables
- Predictive Metrics
- Data export