Filtering invalid traffic to ensure quality
To keep reporting accurate, Google removes invalid clicks and impressions, whether they come from automated services (non-human users) or from human traffic found to be suspicious or illegitimate. This activity is sometimes known as "spam filtering." The number of clicks and impressions removed as invalid for your ads is available in reporting, in the invalid traffic category of metrics.
Other events, like bid requests and video events, may also benefit from traffic filtering, but aren't surfaced in "invalid" metrics in reporting.
How filtration is done
Google's Ad Traffic Quality team has numerous systems in place to detect suspicious activity. These systems are regularly updated, and all filtration is performed so that the user (browser, robot, and so on) isn't given any indication that their traffic has been flagged for filtration. This helps retain the effectiveness of traffic quality filtering.
Pre-bid and post-serve filtration
Depending on when the traffic is identified as invalid, Google will remove it either before inventory is bid on, or after an event occurs (like a click or impression). Traffic that is removed pre-bid is never bought (because it wasn't bid on), and traffic that is removed post-serve is not paid for (because it is credited back to your account).
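To make the billing effect of the two stages concrete, here is a minimal sketch. It is not Google's implementation: the `Opportunity` class, the `invalid_at` labels, and the costs are hypothetical and exist only to illustrate that pre-bid traffic is never bought while post-serve traffic is charged and then credited back.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Opportunity:
    cost: float                       # amount paid if the event is billed (hypothetical)
    invalid_at: Optional[str] = None  # None, "pre_bid", or "post_serve" (hypothetical labels)


def settle(opportunities):
    """Return (charged, credited, net) totals for a list of opportunities."""
    charged = 0.0
    credited = 0.0
    for opp in opportunities:
        if opp.invalid_at == "pre_bid":
            continue  # never bid on, so nothing is bought
        charged += opp.cost
        if opp.invalid_at == "post_serve":
            credited += opp.cost  # credited back after the event occurred
    return charged, credited, charged - credited


traffic = [
    Opportunity(2.0),
    Opportunity(1.5, "pre_bid"),
    Opportunity(3.0, "post_serve"),
    Opportunity(0.5),
]
charged, credited, net = settle(traffic)
print(charged, credited, net)
```

In this sketch, the pre-bid opportunity never appears in the charged total, while the post-serve one appears in both the charged and credited totals, so neither contributes to the net amount.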
Kinds of traffic that get filtered
General invalid traffic and sophisticated invalid traffic
Two broad categories of traffic that get filtered are general invalid traffic (GIVT) and sophisticated invalid traffic (SIVT). General invalid traffic is identified using lists of known spiders and robots, or other routine checks. Sophisticated invalid traffic is often more difficult to identify, and requires human intervention or more in-depth analysis. For a more detailed description of GIVT and SIVT, see page 6, section 1.1.2, of the Media Rating Council's Invalid Traffic Detection and Filtration Guidelines Addendum.
Categories of general invalid traffic (GIVT)
|Category|Description|
|---|---|
|Data Center|Ad traffic originating from servers in data centers whose IPs are linked to invalid activity (typically non-human traffic). These are usually known data center IPs that are likely included in an industry list, such as the Trustworthy Accountability Group (TAG) Data Center IP list.|
|Known Crawler|A program or automated script that requests content and declares itself as non-human through a variety of identification mechanisms. These crawlers are usually included in the IAB International Spiders and Bots List.|
|Irregular Pattern|Ad traffic that includes one or more attributes (e.g., user cookie) associated with known irregular patterns, such as auto-refresh traffic or duplicate clicks.|
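The routine, list-based nature of these GIVT checks can be sketched as follows. This is an illustration only, not how Google's systems work: the IP range, the crawler token, the event fields, and the two-second duplicate-click window are all placeholder assumptions, and real industry lists (such as those from TAG or the IAB) are not reproduced here.

```python
import ipaddress

# Placeholder entries; real lists are far larger and maintained by industry bodies.
DATA_CENTER_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # documentation-only range
KNOWN_CRAWLER_TOKENS = ("examplebot",)  # hypothetical self-declared crawler name


def classify_givt(event, seen_clicks, window_seconds=2.0):
    """Return a GIVT category name for a click event, or None if no check matches.

    `event` is a dict with hypothetical keys: ip, user_agent, cookie, ad_id, timestamp.
    `seen_clicks` maps (cookie, ad_id) to the timestamp of the last click seen.
    """
    ip = ipaddress.ip_address(event["ip"])
    if any(ip in network for network in DATA_CENTER_NETWORKS):
        return "Data Center"

    user_agent = event["user_agent"].lower()
    if any(token in user_agent for token in KNOWN_CRAWLER_TOKENS):
        return "Known Crawler"

    # Duplicate click: same cookie and ad within a short window (assumed threshold).
    key = (event["cookie"], event["ad_id"])
    last = seen_clicks.get(key)
    seen_clicks[key] = event["timestamp"]
    if last is not None and event["timestamp"] - last <= window_seconds:
        return "Irregular Pattern"
    return None


seen = {}
click = {"ip": "198.51.100.2", "user_agent": "Mozilla/5.0",
         "cookie": "c1", "ad_id": "a1", "timestamp": 10.0}
print(classify_givt(click, seen))                          # first click: no match
print(classify_givt({**click, "timestamp": 11.0}, seen))   # repeat within the window
```

The checks run in a fixed order, so an event that matches several lists is reported under the first matching category. Sophisticated invalid traffic, by contrast, generally cannot be caught by lookups of this kind.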
Categories of sophisticated invalid traffic (SIVT)
|Category|Description|
|---|---|
|Automated Browsing|A program or automated script that requests web content (including digital ads) without user involvement and without declaring itself as a crawler. This primarily refers to botnets.|
|False Representation|An ad request for inventory that is different from the actual inventory being supplied, including ad requests where the actual ad is rendered on a different website or application, device, or other target (such as geography).|
|Misleading User Interface|A web page, application, or other visual element modified to falsely include one or more ads. This includes rendering ads that are not visible to the user, injecting ads without a publisher’s consent, or tricking users into clicking on an ad.|
|Manipulated Behavior|A browser, application, or other program that triggers an ad interaction without a user’s consent, such as an unintended click, an unexpected conversion, or false attribution for installation of a mobile application.|
|Incentivized Behavior|The use of an explicit incentive to drive users to interact with one or more ads for the sole purpose of receiving that incentive, without the advertiser knowing the incentive exists. Often these incentives are financial in nature.|
|Undisclosed Classification|Invalid traffic that cannot be classified using any of the other categories in the taxonomy, or sensitive invalid traffic that cannot be disclosed.|