It's important that the Google crawler can access your ads.txt file. After you create an ads.txt file and set it up on your root domain, the Google crawler will:
- Attempt to crawl the file.
- Parse the contents of the file to determine seller IDs that are authorized to monetize your inventory.
Troubleshoot ads.txt crawler issues
To ensure your ads.txt file can be crawled, we recommend working through the following troubleshooting steps.
Advanced: these steps require an understanding of HTTP status codes.
Confirm the file is not temporarily unavailable
If a previously seen ads.txt file is unavailable on a subsequent re-crawl, the previously seen entries will be:
- Purged if the response is a hard 404 error (page that actually doesn’t exist; HTTP 404 status).
- Retained for up to five days if the response is a soft 404 error (a real page returned for a URL that doesn't actually exist; HTTP 200 status) or a 500 server error.
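These retention rules can be sketched as a small decision function. This is a hypothetical helper to illustrate the rules described above, not Google's actual implementation; the names and the soft-404 flag are illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical sketch of the retention rules above; not Google's
# actual crawler logic. Names and parameters are illustrative.
RETENTION_WINDOW = timedelta(days=5)

def keep_previous_entries(status_code, body_is_error_page, last_seen):
    """Decide whether previously seen ads.txt entries are kept.

    status_code:        HTTP status of the re-crawl attempt
    body_is_error_page: True if a 200 response is really an error page
                        for a nonexistent URL (a "soft 404")
    last_seen:          when the file was last crawled successfully
    """
    if status_code == 404:
        return False  # hard 404: entries are purged immediately
    soft_404 = status_code == 200 and body_is_error_page
    server_error = 500 <= status_code < 600
    if soft_404 or server_error:
        # entries are retained for up to five days
        return datetime.now() - last_seen <= RETENTION_WINDOW
    return True  # file still available: entries stay
```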
Confirm that the file is reachable from the root domain
Redirects from domain.com/ads.txt to www.domain.com/ads.txt are fairly common. Ads.txt crawling starts at the root domain, and the root domain must either return the ads.txt file or redirect to it. An ads.txt file at www.domain.com/ads.txt will only be crawled if domain.com/ads.txt redirects to it.
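One way to confirm this is to request ads.txt from the bare root domain and see where the request lands. A minimal sketch using Python's standard library, assuming the domain and user-agent string are placeholders:

```python
import urllib.request

def root_ads_txt_url(domain):
    """Crawling starts at ads.txt on the bare root domain."""
    return f"http://{domain}/ads.txt"

def final_ads_txt_location(domain, timeout=10):
    """Fetch ads.txt from the root domain and report where it lands.

    urllib follows redirects automatically, so resp.geturl() shows the
    final URL after any domain.com -> www.domain.com hops.
    """
    req = urllib.request.Request(
        root_ads_txt_url(domain),
        headers={"User-Agent": "ads-txt-check"},  # placeholder UA
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.geturl()
```

If the final URL differs from the root-domain URL, the redirect is in place; a 404 here means the root domain neither serves nor redirects to the file.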
Ensure crawling is not disallowed by robots.txt
The ads.txt file for a domain may be ignored by crawlers if the robots.txt file on the domain disallows either of the following:
- The crawling of the URL path on which the ads.txt file is posted.
- The user agent of the crawler.

For example1.com:
- An ads.txt file is posted at example1.com/ads.txt.
- The following lines are included in example1.com/robots.txt:
  User-agent: *
  Disallow: /ads
- The ads.txt file will be ignored by crawlers that respect the robots.txt standard.

You can modify the robots.txt file as follows to allow crawling of the file (other approaches are possible):
- Option 1: Modify the disallowed path so it no longer matches /ads.txt:
  User-agent: *
  Disallow: /ads/
- Option 2: Explicitly allow ads.txt; this depends on crawler support for the Allow robots.txt directive:
  User-agent: *
  Allow: /ads.txt
  Disallow: /ads
For example2.com:
- An ads.txt file is posted at example2.com/ads.txt.
- The following lines are included in example2.com/robots.txt:
  User-agent: Googlebot
  Disallow: /
- The ads.txt file will be ignored by the Google crawler.
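You can test whether a given robots.txt would block an ads.txt crawl with Python's standard-library robots.txt parser. The user-agent string below is illustrative, not necessarily the ads.txt crawler's actual token:

```python
import urllib.robotparser

def ads_txt_allowed(robots_txt, user_agent="Googlebot"):
    """Return True if robots_txt permits user_agent to fetch /ads.txt.

    robots_txt is the file's contents as a string; the user agent
    name here is illustrative.
    """
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, "/ads.txt")
```

For the example1.com rules, `ads_txt_allowed("User-agent: *\nDisallow: /ads")` returns False, while both corrected variants (the narrowed `Disallow: /ads/` path and the explicit `Allow: /ads.txt` rule) return True.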
Ensure file is returned with an HTTP 200 OK status code
While a request for an ads.txt file may return the contents of the file in the response body, if the status code in the response header indicates the file was not found (e.g., status code 404):
- The response will be ignored.
- The file will be considered non-existent.
Make sure the file is returned with an HTTP 200 OK status code.
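To see what the response header says (rather than just the body), a short sketch with Python's standard library:

```python
import urllib.error
import urllib.request

def ads_txt_header_status(url, timeout=10):
    """Return (status_code, body) for an ads.txt URL.

    Only the status code in the response header decides whether the
    file "exists" to the crawler; a body accompanying a 404 is ignored.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses may still carry a body; urllib raises here
        return err.code, err.read()
```

A response whose first element is anything other than 200 will be treated as not found, even if the second element contains ads.txt-like text.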
Ensure there are no formatting errors or invalid characters in the file
Formatting errors, such as invalid whitespace characters, can be difficult to detect but may make an ads.txt file hard for a crawler to parse, and may therefore result in the file being ignored. Avoid copying and pasting ads.txt entries from a rich text editor; we recommend a plain text editor. You can also check for invalid UTF-8 characters in your ads.txt file using a hex editor.
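A plain-Python alternative to a hex editor is to scan the raw bytes for decode errors and suspicious characters. The list of suspect characters below is illustrative, not exhaustive:

```python
# Illustrative (not exhaustive) set of characters that commonly
# sneak in via rich text editors and break ads.txt parsing.
SUSPECT_CHARS = {
    "\u00a0": "non-breaking space",
    "\u200b": "zero-width space",
    "\ufeff": "byte order mark",
    "\u201c": "curly quote",
    "\u201d": "curly quote",
}

def find_suspect_characters(raw_bytes):
    """Return (line_number, description) pairs for problem characters,
    or a decode-error marker if the bytes are not valid UTF-8."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError as err:
        return [(None, f"invalid UTF-8 at byte {err.start}")]
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for char, name in SUSPECT_CHARS.items():
            if char in line:
                problems.append((lineno, name))
    return problems
```

Run it on the file's raw bytes (for example, `find_suspect_characters(open("ads.txt", "rb").read())`); an empty list means none of the listed problems were found.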
Make an ads.txt file reachable via both HTTP and HTTPS
The Google crawler attempts to crawl every ads.txt file over both HTTP and HTTPS. However, a 404 (or other 40X) response causes previously crawled entries to be purged, even if the file is still successfully crawled over the other protocol. Therefore, if crawling via HTTPS returns a 404 (or other 40X) response, the previously crawled entries will be purged. Make sure the ads.txt file is accessible via both HTTP and HTTPS.
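A sketch that requests the file over both schemes and reports the status codes (the domain is a placeholder):

```python
import urllib.error
import urllib.request

def status_for(url, timeout=10):
    """HTTP status code for url; urllib raises on 4xx/5xx, so catch it."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code

def check_both_schemes(domain):
    """Return {'http': ..., 'https': ...} status codes for domain's ads.txt.

    Both should be 200; a 40X on either scheme can cause previously
    crawled entries to be purged.
    """
    return {
        scheme: status_for(f"{scheme}://{domain}/ads.txt")
        for scheme in ("http", "https")
    }
```

If either entry in the result is not 200, fix that scheme's response (certificate, redirect, or server configuration) before the next crawl.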