robots.txt report

See whether Google can process your robots.txt files

The robots.txt report shows which robots.txt files Google found for the top 20 hosts on your site, the last time they were crawled, and any warnings or errors encountered. The report also enables you to request a recrawl of a robots.txt file for emergency situations.

A robots.txt file is used to prevent search engines from crawling your site. Use noindex if you want to prevent content from appearing in search results.

This report is available only for properties at the domain level. That means either:

  • A Domain property, or
  • A URL-prefix property at the host level (such as https://example.com/)

Open robots.txt report

 

See your robots.txt files and crawl status

In a Domain property, the report includes robots.txt files from the top 20 hosts in that property.

For each robots.txt file checked by Search Console, you can see the following information:

  • File path - The full URL where Google checked for the presence of a robots.txt file. A URL will appear in the report only if it had a Fetched or Not Fetched status any time in the last 30 days. See Location of robots.txt files.
  • Fetch status - The status of the latest fetch request for this file. The following values are possible:
    • Not Fetched - Not found (404): A 404 error (the file doesn't exist) occurred when requesting this file. If you have posted a robots.txt file at the listed URL but are seeing this error, try inspecting the URL to see if there are any availability issues. A file that has status Not found (404) for 30 days will no longer appear in the report (though Google will continue checking it in the background). Not having a robots.txt file is fine, and means that Google can crawl all URLs on your site, but read how Google behaves when there's a robots.txt error for full details.
    • Not Fetched - Any other reason: Some other issue occurred when requesting this file. See List of indexing issues.
    • Fetched: The last crawl attempt successfully returned a robots.txt file. Any issues found while parsing the file will be listed in the Issues column. Google ignores the lines with issues and uses those that it can parse.
  • Checked on - When Google last tried to crawl this URL, in local time.
  • Size - The size of the fetched file, in bytes. If the last fetch attempt failed, this will be empty.
  • Issues - The table shows a count of any parsing issues in the contents of the file when last fetched. Errors prevent a rule from being used. Warnings do not prevent a rule from being used. Read how Google behaves when there's a robots.txt error. To fix parsing issues, use a robots.txt validator.

See the last fetched version

You can see the last fetched version of a robots.txt file by clicking it in the files list in the report. If the robots.txt file has any errors or warnings, they will be highlighted in the displayed file contents. You can cycle through the errors and warnings using the arrow keys.

See previously fetched versions

To see fetch requests for a given robots.txt file in the last 30 days, click the file in the files list in the report, then click Versions. To see the file contents at that version, click the version. A request is included in the history only if the retrieved file or fetch result is different from the previous file fetch request.
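The rule for what appears in the version history can be sketched as a small filter. This is only an illustration of the rule described above, not how Search Console is implemented; the `Fetch` record and its field names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Fetch(NamedTuple):
    """Hypothetical record of one fetch request (field names are assumptions)."""
    checked_on: str          # when the fetch happened
    status: str              # for example "Fetched" or "Not Fetched"
    content: Optional[str]   # file contents, or None if the fetch failed


def version_history(fetches: List[Fetch]) -> List[Fetch]:
    """Keep a fetch only if its result or contents differ from the previous fetch."""
    history = []
    previous = None
    for fetch in fetches:
        current = (fetch.status, fetch.content)
        if current != previous:
            history.append(fetch)
        previous = current
    return history
```

With this sketch, two consecutive fetches that return the same file produce a single history entry, while a failed fetch between two identical files produces three entries.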

If Google encountered a fetch error in the latest fetch attempt, Google will use the last successfully fetched version without errors for up to 30 days.

Request a recrawl

You can request a recrawl of a robots.txt file when you fix an error or make a critical change.

When to request a recrawl

You generally don't need to request a recrawl of a robots.txt file, because Google recrawls your robots.txt files often. However, you might want to request a recrawl of your robots.txt in the following circumstances:

  • You changed your robots.txt rules to unblock some important URLs and want to let Google know quickly (note that this doesn't guarantee an immediate recrawl of unblocked URLs).
  • You fixed a fetch error or other critical error.

How to request a recrawl

To request a recrawl, select the more settings icon next to a file in the robots.txt files list and click Request a recrawl.

Websites on website hosting services

If your website is hosted on a website hosting service, it might not be easy to edit your robots.txt file. In that case, see your site host's documentation about how to block specific pages from being crawled or indexed by Google. (Note that most users are concerned with preventing files from appearing in Google Search, rather than with preventing them from being crawled by Google. If this is your concern, search your hosting service for information about blocking pages from search engines.)

What happens when Google can't fetch or read your robots.txt

If a robots.txt file is not found for a domain or subdomain, Google assumes that it can crawl any URL within that host.

If Google finds a robots.txt file but can't fetch it, Google follows this behavior:

  1. For the first 12 hours, Google stops crawling the site but keeps trying to fetch the robots.txt file.
  2. If Google can't fetch a new version, for the next 30 days Google will use the last good version, while still trying to fetch a new version. You can see the last good version in the version history.
  3. If the errors are still not fixed after 30 days:
    • If the site is generally available to Google, Google will behave as if there is no robots.txt file (but still keep checking for a new version).
    • If the site has general availability problems, Google will stop crawling the site, while still periodically requesting a robots.txt file.
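The timeline above can be summarized as a small decision function. This is a sketch of the documented behavior only, not Google's implementation. The function name and return labels are hypothetical, and it assumes the 30-day window starts after the initial 12 hours, which the text does not state explicitly.

```python
from datetime import timedelta

# Thresholds taken from the steps above.
INITIAL_PAUSE = timedelta(hours=12)
# Assumption: the 30 days are counted after the first 12 hours.
LAST_GOOD_WINDOW = INITIAL_PAUSE + timedelta(days=30)


def crawl_behavior(time_since_first_failure: timedelta,
                   site_generally_available: bool) -> str:
    """Return a label for how Google behaves while robots.txt can't be fetched."""
    if time_since_first_failure < INITIAL_PAUSE:
        # Step 1: crawling stops, fetch attempts continue.
        return "pause_crawling"
    if time_since_first_failure < LAST_GOOD_WINDOW:
        # Step 2: the last good version is used, fetch attempts continue.
        return "use_last_good_version"
    # Step 3: the outcome depends on whether the rest of the site is reachable.
    if site_generally_available:
        return "crawl_as_if_no_robots_txt"
    return "stop_crawling"
```

For example, `crawl_behavior(timedelta(days=10), True)` falls in the second step and returns `"use_last_good_version"`.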

If Google finds and can fetch a robots.txt file: Google reads the file line by line. If a line has an error or can't be parsed to a robots.txt rule, it will be skipped. If there are no valid lines in the file, Google treats this as an empty robots.txt file, which means no rules are declared for the site.
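The line-by-line behavior can be illustrated with a deliberately simplified parser. This is not Google's parser. The set of recognized fields and the function name are assumptions made for the example, and real parsers may be more forgiving about which lines they accept.

```python
from typing import List, Tuple

# Assumption: a minimal set of fields, chosen for illustration only.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def parse_lenient(text: str) -> Tuple[List[Tuple[str, str]], List[Tuple[int, str]]]:
    """Parse robots.txt text, skipping lines that can't be read as a rule.

    Returns (rules, skipped): rules is a list of (field, value) pairs and
    skipped is a list of (line_number, raw_line) pairs.
    """
    rules, skipped = [], []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue                           # blank or comment-only line
        field, separator, value = line.partition(":")
        field = field.strip().lower()
        if not separator or field not in KNOWN_FIELDS:
            skipped.append((number, raw))      # unparseable line: skip it
            continue
        rules.append((field, value.strip()))
    return rules, skipped
```

If every line is skipped, `rules` is empty, which corresponds to the "empty robots.txt file" case described above: no rules are declared for the site.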

Location of robots.txt files

Terminology:

  • A protocol (also called a scheme) is either HTTP or HTTPS.
  • A host is everything in the URL after the protocol (http:// or https://) until the path. So the host m.de.example.com implies 3 possible hosts: m.de.example.com, de.example.com, and example.com, each of which can have its own robots.txt file.
  • An origin is the protocol + host. So: https://example.com/ or https://m.example.co.es/

Per RFC 9309, the robots.txt file must be at the root of each protocol and host combination of your site.
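As an illustration of these terms, the following sketch splits a URL into protocol, host, and origin using Python's standard `urllib.parse` module. The function name `origin_parts` is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlsplit


def origin_parts(url: str) -> Tuple[str, str, str]:
    """Return (protocol, host, origin) for a URL.

    The host is taken from the URL's network location, so it includes
    a port if the URL specifies one.
    """
    parts = urlsplit(url)
    protocol = parts.scheme.lower()
    host = parts.netloc.lower()
    return protocol, host, f"{protocol}://{host}/"
```

For example, `origin_parts("https://m.example.co.es/some/page")` yields the protocol `https`, the host `m.example.co.es`, and the origin `https://m.example.co.es/`.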

For a Domain property:

  1. Search Console chooses the top 20 hosts, sorted by crawl rate. For each host, the report may show up to 2 origins, which means the table can show up to 40 rows. If you can't find the robots.txt URL for one of your hosts, create a Domain property for the missing subdomain.
  2. For each host, Search Console checks two URLs:
    • http://<host>/robots.txt
    • https://<host>/robots.txt
  3. If the robots.txt file at the requested URL is reported as Not found for 30 days, Search Console does not show the URL in this report, although Google will keep checking the URL in the background. For any other result, the report shows the URL checked.
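The first two steps can be sketched as follows. This only illustrates the arithmetic behind the 40-row limit. The function name is hypothetical, and the list of hosts is assumed to be already sorted by crawl rate, which only Search Console knows.

```python
from typing import List


def candidate_robots_urls(hosts_by_crawl_rate: List[str], max_hosts: int = 20) -> List[str]:
    """Return the robots.txt URLs checked for the top hosts.

    Each host contributes two URLs, one per protocol, so 20 hosts
    give at most 40 URLs.
    """
    urls = []
    for host in hosts_by_crawl_rate[:max_hosts]:
        urls.append(f"http://{host}/robots.txt")
        urls.append(f"https://{host}/robots.txt")
    return urls
```

For example, `candidate_robots_urls(["example.com", "m.example.com"])` returns four URLs: the HTTP and HTTPS robots.txt locations for each of the two hosts.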

For a URL-prefix property at the host level (such as https://example.com/) Search Console checks only a single origin for that property. That is: for property https://example.com, Search Console checks only https://example.com/robots.txt, not http://example.com/robots.txt or https://m.example.com/robots.txt.

Common tasks

View a robots.txt file

To open a robots.txt file listed in this report, click the file in the list of robots.txt files. To open the file in your browser, click Open live robots.txt.

You can open any robots.txt file on the web in your browser. See below to learn which URL to visit.

Where robots.txt files can be located

A robots.txt file is located at the root of a protocol and host. To determine the URL, cut off everything after the host (and optional port) in the URL of a file and add "/robots.txt". You can visit the robots.txt file in your browser, if one is present. Robots.txt files are not inherited by subdomains or parent domains, and a given page can be affected by only one robots.txt file. Some examples:

File URL                                           URL of robots.txt that can affect that file
http://example.com/home                            http://example.com/robots.txt
https://m.de.example.com/some/page/here/mypage     https://m.de.example.com/robots.txt
https://example.com?pageid=234#myanchor            https://example.com/robots.txt
https://images.example.com/flowers/daffodil.png    https://images.example.com/robots.txt
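The mapping in these examples can be reproduced with a short helper built on Python's standard `urllib.parse` module. The function name `robots_txt_url` is hypothetical; the sketch simply keeps the protocol and host (with any port) and appends /robots.txt.

```python
from urllib.parse import urlsplit


def robots_txt_url(file_url: str) -> str:
    """Return the URL of the robots.txt file that can affect file_url."""
    parts = urlsplit(file_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {file_url!r}")
    # Path, query string, and fragment are all dropped.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"
```

For example, `robots_txt_url("https://example.com?pageid=234#myanchor")` gives `https://example.com/robots.txt`.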

See which robots.txt file affects a page or image

To find the URL of the robots.txt file that affects a page or image:

  1. Find the exact URL of the page or image. For an image, in the Google Chrome browser, right-click and select Copy image URL.
  2. Remove everything in the URL after the host, that is, after the top-level domain (for example, .com, .org, .co.il) and any port number, and add /robots.txt to the end. So the robots.txt file for https://images.example.com/flowers/daffodil.png is https://images.example.com/robots.txt.
  3. Open the URL in your browser to confirm that it exists. If your browser can't open the file, then it doesn't exist.

Test if Google is blocked by robots.txt

  • If you want to test whether a specific URL is blocked by a robots.txt file, you can test the availability of the URL with the URL inspection tool.
  • If you want to test a specific robots.txt rule against a file that isn't on the web yet, or test a new rule, you can use a third-party robots.txt tester.
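For a quick local check of a draft rule, one option is Python's standard `urllib.robotparser` module. This is a rough sketch and not a substitute for a tester that mirrors Google's behavior: Python's parser may not match Google's handling in every case (for example, wildcard patterns), and the helper name `is_allowed` is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against robots.txt rules supplied as text (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Draft rules that are not published anywhere yet.
draft_rules = """\
User-agent: *
Disallow: /private/
"""

blocked = is_allowed(draft_rules, "Googlebot", "https://example.com/private/page")
allowed = is_allowed(draft_rules, "Googlebot", "https://example.com/public/page")
```

Here `blocked` is `False` (the URL is disallowed) and `allowed` is `True`.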
