Reporting Information

Inferred Demographics

Like many ads on the web, Google Consumer Surveys infers the age and gender of anonymous respondents from the websites they visit, and their location from their IP addresses. Income and urban density are then approximated using census data for particular geographic regions. To see what inferences are associated with your browser, visit google.com/ads/preferences.

Please note that it’s possible that we may miscategorize people. For example, if someone visits websites that are usually frequented by younger people, they may be categorized as younger than their actual age. Similarly, if a household uses a shared computer, we may categorize that “user” based on the combined interests of the household.

All responses are anonymous and collected in aggregate. For more information, see Google’s privacy policy.

 

Weighting

Weighting is designed to remove bias from a survey sample and make the results better represent the target population. For example, if the share of female respondents in a given survey is smaller than their share of the target population (for example, US Internet users), we can increase the weight of the female responses by a factor that brings them in line with the target population.
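As a rough illustration of the idea, the sketch below computes one weight per group so that the weighted sample matches the target shares. It is a minimal single-variable example, not the weighting scheme Consumer Surveys actually uses; the function name, the group labels, and all of the numbers are hypothetical.

```python
def poststratification_weights(sample_counts, target_shares):
    """Return a weight per group: target share divided by sample share.

    Assumes every group in sample_counts has at least one respondent
    and a matching entry in target_shares.
    """
    total = sum(sample_counts.values())
    weights = {}
    for group, count in sample_counts.items():
        if count == 0:
            raise ValueError(f"no respondents in group {group!r}")
        sample_share = count / total
        weights[group] = target_shares[group] / sample_share
    return weights


# Hypothetical survey: 400 female and 600 male respondents,
# with a target population that is split 50/50.
sample_counts = {"female": 400, "male": 600}
target_shares = {"female": 0.5, "male": 0.5}
weights = poststratification_weights(sample_counts, target_shares)

# Each female response now counts for 1.25 responses and each male
# response for about 0.83, so both groups contribute 500 weighted responses.
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}
```

Turning weighting off corresponds to giving every response a weight of 1.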

Consumer Surveys weights results by inferred gender, age and geography when possible to make the sample as representative as possible of the internet population. If you prefer to see unweighted data, you can turn off weighting to see raw results.
 

Root Mean Square Error (RMSE)

When targeting an audience representing the US Internet population, Consumer Surveys attempts to find respondents that match the distribution of people in the US by age, gender and location as reported in the US Census Current Population Survey (CPS).  The sampling bias table at the bottom of the survey’s results page tells you the difference between the collected answer distribution and the desired distribution from CPS.  At the bottom of this table is a root mean square error (RMSE).  

The RMSE measures the differences between the desired distribution and actual distribution for each targeted population segment and calculates a weighted average error.  The RMSE technique weights large errors more than small errors.  Thus, if the difference in one segment is very large, it would have a greater effect on the RMSE than if there were small errors across several segments.  The lower the RMSE score, the closer we are to representing the US Internet population.
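The sketch below shows why one large error raises the RMSE more than several small ones. It computes a plain, unweighted RMSE; how Consumer Surveys weights the individual segments is not described here, so treat this as an assumption. The segment names and shares are hypothetical.

```python
import math


def rmse(actual, target):
    """Root mean square error between two distributions keyed by segment."""
    squared_errors = [(actual[seg] - target[seg]) ** 2 for seg in target]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical target: four equally sized age segments.
target = {"18-24": 0.25, "25-34": 0.25, "35-44": 0.25, "45+": 0.25}

# Two samples with the same total absolute error (0.16):
# one concentrates it in two segments, the other spreads it evenly.
concentrated = {"18-24": 0.33, "25-34": 0.17, "35-44": 0.25, "45+": 0.25}
spread = {"18-24": 0.29, "25-34": 0.21, "35-44": 0.29, "45+": 0.21}

rmse_concentrated = rmse(concentrated, target)  # about 0.057
rmse_spread = rmse(spread, target)              # about 0.040
```

Because the errors are squared before averaging, the concentrated sample scores worse even though both samples miss the target by the same total amount.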

Note: The table provided on the survey results page communicates the sampling bias for three segments: gender, age and region. However, our targeted population segments also include combinations of gender, age, and region. Combinations can include any two or three of these segments.
 

Statistical Significance

Consumer Surveys automatically analyzes your results. In the upper right corner of your question-level results, you’ll see an indication of the system’s confidence in either the complete ordering of your results or in the winning answer. Your results are statistically significant when there is 95% confidence. If the system has 95% confidence in the complete ordering of the results, you’ll see that. If it has 95% confidence in the winning answer but not in the complete ordering, that will be indicated instead. If neither the complete ordering nor the winner can be declared with statistical significance, it will say “Too close to call”.
 

Insights

Insights are statistically significant differences between demographic subpopulations, for example, differences between genders or locations. Consumer Surveys only shows insights in which it has 98% confidence, a stricter confidence level chosen to minimize false positives.
 

Error Bars

The error bars on Consumer Surveys graphs indicate the 95% confidence interval. This is calculated using the Wilson score interval. A 95% confidence interval means that a retrial of the survey should produce a result within the interval 95% of the time. Error bars decrease in size as sample size (or the number of respondents) increases.
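For reference, the standard Wilson score interval for a proportion can be computed as in the sketch below. It works on raw, unweighted counts; how Consumer Surveys applies the interval to weighted results is not described here, and the function name and example counts are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion.

    successes: number of respondents who chose the answer
    n:         total number of respondents
    z:         normal quantile (1.96 corresponds to 95% confidence)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half_width, center + half_width


# Same observed share (50%) with 100 and with 1,000 respondents:
low_100, high_100 = wilson_interval(50, 100)
low_1000, high_1000 = wilson_interval(500, 1000)
```

With 100 respondents the interval runs from roughly 40% to 60%; with 1,000 respondents it is noticeably narrower, which is why error bars shrink as the sample grows. When the observed share is not 50%, the interval is asymmetric around it, which is why the plus and minus values shown next to an answer can differ.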
 

Confidence Interval

In the results report, the values in parentheses next to each answer give the upper and lower margins of the confidence interval. The error bar discussed above is the graphical illustration of this interval. If you run the same survey again, there's a 95% chance that the percentage of respondents who choose that answer will fall within the confidence interval.

E.g. Answer A 33.5% (+9.9/-8.7)

If you run the survey again, there's a 95% chance that the percentage of respondents who choose Answer A will be between 24.8% and 43.4%, because 33.5 - 8.7 = 24.8 and 33.5 + 9.9 = 43.4.
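The same arithmetic as a small sketch, using the numbers from the Answer A example. The helper name is hypothetical, and results are rounded to one decimal place to match how the report displays them.

```python
def interval_bounds(estimate, plus, minus):
    """Turn a reported value such as 33.5% (+9.9/-8.7) into (lower, upper)."""
    return round(estimate - minus, 1), round(estimate + plus, 1)


lower, upper = interval_bounds(33.5, plus=9.9, minus=8.7)
print(f"{lower}% to {upper}%")  # prints "24.8% to 43.4%"
```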