
Ensure your ads.txt files can be crawled

Once an ads.txt file is set up on your domain, the Google crawler will:

  • Attempt to crawl the file every 24 hours.
  • Parse the contents of the file to determine seller IDs that are authorized to monetize your inventory.

To ensure your ads.txt file can be crawled, we recommend working through the following troubleshooting steps.

Confirm the file is not temporarily unavailable

If a previously seen ads.txt file is unavailable on a subsequent crawl, its entries will be:

  • Purged if the response is a hard 404 error (a page that actually doesn't exist; HTTP 404 status).
  • Retained for up to five days if the response is a soft 404 error (a real page returned for a URL that doesn't actually exist; HTTP 200 status) or a 500 server error.
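
One way to check which case applies is to request a path that should not exist on your domain and inspect the status code. The following is a minimal sketch in Python; the domain and probe path are placeholders, so substitute your own:

    import urllib.request
    import urllib.error

    # Placeholder URL: a path that should NOT exist on your domain.
    probe_url = "https://example.com/this-path-should-not-exist"

    try:
        with urllib.request.urlopen(probe_url) as response:
            # HTTP 200 for a nonexistent path suggests the server serves soft 404s.
            print(f"Status {response.status}: possible soft 404 behavior")
    except urllib.error.HTTPError as e:
        # A hard 404 (or other error status) surfaces here.
        print(f"Status {e.code}: server returns a hard error for missing pages")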

Confirm that the file is reachable from the root domain

Redirects from domain.com/ads.txt to www.domain.com/ads.txt are fairly common. Ads.txt crawling starts at the root domain, and the root domain must either return the ads.txt file or redirect to it.

An ads.txt file on www.domain.com/ads.txt will only be crawled if domain.com/ads.txt redirects to it.
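
You can verify the redirect chain and the final status by fetching ads.txt from the root domain and printing each hop. Below is a minimal sketch using the third-party requests library; example.com is a placeholder for your root domain:

    import requests

    # Start at the root domain, as the crawler does; redirects are followed.
    response = requests.get("http://example.com/ads.txt", timeout=10)

    # response.history lists each redirect hop in order.
    for hop in response.history:
        print(f"{hop.status_code} {hop.url} -> {hop.headers.get('Location')}")
    print(f"Final: {response.status_code} {response.url}")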

Ensure crawling is not disallowed by robots.txt

The ads.txt file for a domain may be ignored by crawlers if the robots.txt file on that domain disallows either of the following:

  • The crawling of the URL path on which an ads.txt file is posted.
  • The User Agent of the crawler.

Example: Crawling disallowed on ads.txt file path

For example1.com:

  1. An ads.txt file is posted on example1.com/ads.txt.
  2. The following lines are included in example1.com/robots.txt:
    User-agent: *
    Disallow: /ads
  3. The ads.txt file will be ignored by crawlers that respect the robots.txt standard.
  4. You can modify the robots.txt file as follows to allow crawling of the file (other approaches are possible):
    • Option 1: Modify the disallowed path.
      User-agent: *
      Disallow: /ads/
    • Option 2: Explicitly allow ads.txt (this depends on crawler support for the Allow robots.txt directive).
      User-agent: *
      Allow: /ads.txt
      Disallow: /ads

Example: Crawling disallowed for User Agent

For example2.com:

  1. An ads.txt file is posted on example2.com/ads.txt.
  2. The following lines are included in example2.com/robots.txt:
    User-agent: Googlebot
    Disallow: /
  3. The ads.txt file will be ignored by the Google crawler.
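
You can check both conditions, the disallowed path and the disallowed User Agent, with Python's standard-library robots.txt parser. A minimal sketch; example.com is a placeholder, and the user-agent tokens are illustrative:

    import urllib.robotparser

    # Fetch and parse the live robots.txt file.
    robots = urllib.robotparser.RobotFileParser("https://example.com/robots.txt")
    robots.read()

    ads_txt_url = "https://example.com/ads.txt"
    for agent in ("*", "Googlebot"):
        allowed = robots.can_fetch(agent, ads_txt_url)
        print(f"{agent}: ads.txt {'allowed' if allowed else 'disallowed'}")

Note that Python's parser supports the Allow directive, but as noted above, not every crawler does.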

Ensure the file is returned with an HTTP 200 OK status code

Even though a request for an ads.txt file may return the contents of the file in the response body, the crawler relies on the status code in the response header. If that status code indicates the file was not found (e.g., status code 404):

  • The response will be ignored.
  • The file will be considered non-existent.

Make sure the file is served with an HTTP 200 OK status code.
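
You can confirm this by comparing the status code with the response body. A minimal sketch using the third-party requests library; the URL is a placeholder:

    import requests

    response = requests.get("https://example.com/ads.txt", timeout=10)

    if response.status_code == 200:
        print("OK: ads.txt served with HTTP 200")
    else:
        # A non-empty body does not help if the status code says otherwise.
        print(f"Problem: HTTP {response.status_code} "
              f"with a body of {len(response.text)} characters; "
              "the file will be treated as non-existent")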

Ensure there are no formatting errors or invalid characters in the file

Formatting errors, such as invalid whitespace characters, may be hard to spot but can prevent a crawler from parsing an ads.txt file, which may result in the file being ignored. Avoid copying and pasting ads.txt entries from a rich text editor; we recommend using a plain text editor.
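
Because these characters are invisible in most editors, a short script can help find them. The sketch below scans a local copy of the file for a few common offenders; the filename and the list of suspect characters are illustrative, not exhaustive:

    # Assumes a local copy of the file saved as ads.txt.
    with open("ads.txt", "rb") as f:
        raw = f.read()

    suspects = {
        0x00A0: "non-breaking space",
        0x200B: "zero-width space",
        0xFEFF: "byte-order mark",
    }

    text = raw.decode("utf-8", errors="replace")
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ord(ch) in suspects:
                print(f"Line {lineno}: {suspects[ord(ch)]} (U+{ord(ch):04X})")
            elif ch == "\ufffd":
                print(f"Line {lineno}: byte that is not valid UTF-8")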

Make an ads.txt file reachable via both HTTP and HTTPS

The Google crawler attempts to crawl all ads.txt files over both HTTP and HTTPS. However, a 404 (or 40X) response causes previously crawled entries to be purged, even if the file was successfully crawled over the other protocol. Therefore, if crawling via HTTPS returns a 404 (or 40X):

  • The previously crawled entries will be purged.

Please ensure the ads.txt file is accessible via both HTTP and HTTPS.
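
A quick way to confirm both schemes is to request the file over each and compare the results. A minimal sketch using the third-party requests library; example.com is a placeholder:

    import requests

    for scheme in ("http", "https"):
        url = f"{scheme}://example.com/ads.txt"
        try:
            status = requests.get(url, timeout=10).status_code
            print(f"{url}: HTTP {status}")
        except requests.RequestException as error:
            print(f"{url}: request failed ({error})")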
