Index Coverage report

See which pages Google has found on your site, which pages have been indexed, and any indexing problems encountered.

Video: Index coverage status in Search Console - Google Search Console Training

Getting started

Non-experts

If you are new to indexing or SEO, read these guidelines first; otherwise you probably won't understand this report.

  1. Read how Google Search works. If you don't understand indexing, this report will confuse or frustrate you--trust us.
  2. Decide whether you need to use this report. If your site has fewer than 500 pages, you probably don't need to use this report. Instead, use one of the following Google searches to see if your site is indexed:

    • site:<<site_root_domain_or_path>> - See a subset of pages that Google knows about on your site. Examples: site:example.com or site:example.com/petstore
    • site:<<your_site>> term1 term2 - Search for indexed pages containing specific terms on your site. Example: site:example.com/petstore iguanas zebras.
    • site:<<exact-url>> - Search for the exact URL of a page on your site to see whether Google has indexed it. Example: site:http://example.com/petstore/gerbil
    If you get no search results, then look at this report to verify whether your site truly has zero indexed pages. If this report says zero valid pages (or zero pages of any status), see the troubleshooting section.
  3. Use this report to understand the general index status of your site. The report is not useful for investigating the index status of specific pages. To find the index status of a specific page, use the URL Inspection tool.
  4. What to look for in this report:
    • Are most of the URLs green (valid) and/or gray (excluded)? Your site should consist mostly of valid and excluded pages: Valid, because those pages are in the index; Excluded, because Search Console kept those URLs out of the index for a reason that you can agree with.
    • Are few (if any) URLs red (error)? Error URLs are almost always a problem. However, how much time you want to spend fixing indexing errors depends on how important the page is to your site.
    • Are the gray (excluded) URL reasons what you expect? Excluded URLs are not indexed, but we think that's probably not an error. Reasons for exclusion mean that the page is explicitly blocked from indexing (for example, a robots.txt rule on your site, or a noindex tag on the page). Duplicate pages are also excluded (Google only indexes one version of a set of duplicate pages). Make sure that the reasons your pages are excluded are acceptable. If not, fix them according to the documentation for the specific excluded status.
    • Is Google indexing the most important URLs on your site? The Index coverage report isn't meant for checking individual URLs, but you can filter results to show only the valid URLs, then see if your important URLs are listed. (Note that the list of example URLs in the report is limited to 1,000 items, and isn't guaranteed to show all URLs in a given status, even when that status has fewer than 1,000 URLs.) Check the index status of your homepage and key pages using the URL Inspection tool.
    • Is Google finding most of your URLs? The report shows all the URLs that Google knows about on your site, whether or not they are indexed. If the total URL count in this report is much smaller than your site's page count, then Google isn't finding pages on your site. Some possible reasons for this:
      • The page or site is new. It can take a week or so for Google to start crawling and indexing a new page or site. If your site or page is new, wait a few days for Google to find and crawl it. In an urgent situation, or if waiting doesn't seem to be working, you can explicitly ask Google to crawl individual pages.
      • The pages aren't findable by Google. The pages should be linked from somewhere known to Google: from your homepage, from other known pages on your site, from other sites, or from a sitemap. For a new website, the best first step is to request indexing of your homepage, which should start Google crawling your website. For missing parts of a site, make sure they are linked properly. If you are using a site hosting service such as Wix or SquareSpace, check your site host's documentation to learn how to publish your pages and make them findable by search engines.
    • Read the documentation for your specific status type to understand the reason and any possible fix recommendations for a specific status. Skipping the documentation will cost you more time and effort in the long run than reading it.
  5. What not to look for:
    • Don't expect every URL on your site to be indexed. Some URLs might be duplicates or might not contain meaningful information.
    • Excluded URLs are usually fine. Read and understand the specific reason for each excluded URL to confirm that the page is properly excluded.
    • Error URLs should probably be fixed; read the error reason to understand the issue and how to fix it.
    • The total coverage numbers above the chart are complete and accurate from Google's perspective, but don't expect them to match exactly your estimate of the number of URLs on your site. Small discrepancies can occur for various reasons.
    • Just because a page is indexed doesn't guarantee that it will show up in your search results. Search results are customized for each user's search history, location, and many other variables, so even if a page is indexed, it won't show up in every search, or in the same ranking when it does. Therefore, if Search Console says a URL is indexed but it doesn't turn up in your search results, you can still assume that it is indexed and eligible to appear in results for other searches or other users.

FAQs

What does this report show?

The Index coverage report shows whether specific URLs have been crawled and indexed by Google. (If you don't have a good understanding of what these terms mean, please read how Google Search works). Google finds URLs in many ways, and tries to crawl most of them. If a URL is missing or unavailable, Google will probably continue to try crawling that URL for a while.

A URL in this report can have one of the following statuses:

  • Valid: Google found and indexed the page. Nothing else to do.
  • Warning: Google found and probably indexed the page, but we think there is a problem. Read the warning description below to understand your next steps.
  • Error: The URL is not indexed, and we think it's because of a mistake that you can correct. Read the error description below to understand your next steps.
  • Excluded: The URL is not indexed, but that is probably the right thing. Either you are blocking Google from crawling and indexing the page, or the page has been classified as a duplicate of another, crawled page on your site.

What is indexing?

Indexing is when Google finds (crawls) your page, then processes the content of the page and puts it into the Google index (indexes it), where the page is eligible to appear in Google Search results, as well as on other Google services, like Discover. For more about indexing, read how Google Search works.

How do I get my page or site indexed?

If you are using a site hosting service such as Wix or SquareSpace, your hosting service will probably tell Google whenever you publish or update a page. Check your site host's documentation to learn how to publish your pages and make them findable by search engines.

If you are creating a site or page without a hosting service, you can use a sitemap or various other methods to tell Google about new sites or pages.
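
If it helps to see what a minimal sitemap looks like, here is a rough sketch that generates one with Python's standard library. The URLs are hypothetical placeholders; a real sitemap should list the canonical URLs you want Google to know about.

  # Rough sketch: generate a minimal sitemap.xml with Python's standard library.
  # The URLs below are hypothetical placeholders; substitute your own pages.
  import xml.etree.ElementTree as ET

  NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
  pages = [
      "https://example.com/",
      "https://example.com/petstore",
      "https://example.com/petstore/gerbil",
  ]

  urlset = ET.Element("urlset", xmlns=NS)
  for page in pages:
      url = ET.SubElement(urlset, "url")
      ET.SubElement(url, "loc").text = page

  # Write the file you would then submit in the Sitemaps report or reference from robots.txt.
  ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)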

We strongly recommend ensuring that your homepage is indexed. Starting from your homepage, Google should be able to index all the other pages on your site, if your site has comprehensive and properly implemented site navigation for visitors.

Is it OK if a page isn't indexed?

Absolutely. Google doesn't index pages that are blocked by a robots.txt rule or noindex tag, pages that are duplicates of other pages on your site, or pages that are inappropriate to index (for example, variations of a page with different filters applied). Use the URL Inspection tool to see why a specific page isn't indexed. If there is an indexing error, or if a page was excluded for a reason that doesn't make sense, follow the documentation to understand and fix the issue.

SEOs, developers, and experienced website owners

If you're an experienced SEO, developer, or website owner, but haven't used the Index coverage report yet:
  1. Read how Google Search works. If you don't understand indexing, this report will just be confusing or frustrating; trust us.
  2. Follow the guidelines in Navigating the report, including What to look for and What not to look for.
  3. Read the troubleshooting section to understand and fix common problems.
  4. Remember that Excluded is not necessarily a bad status for a URL. These URLs are excluded, and we think you probably intended for them to be. In the case of a duplicate URL, understand why the URL is a duplicate, and why Google made the choice it did. If you think the wrong page was chosen as canonical, you can give signals to Google about your preferred canonical URL.
  5. Read the documentation for your specific status and reason to understand the issue, and see tips for fixing it.

Navigating the report

The Index Coverage report shows the Google indexing status of all URLs that Google knows about in your property.

  • The top-level summary page shows the results for all URLs in your property grouped by status (error, warning, valid, or excluded) and specific reason for that status (such as Submitted URL not found (404)).
  • Click a table row in the summary page to see a details page that focuses on all URLs with the same status/reason.

Summary page

The top level page in the report shows the index status of all pages that Google has attempted to crawl on your site, grouped by status and reason.

Primary crawler

The Primary crawler value on the summary page shows the default user agent type that Google uses to crawl your site. Available values are: Smartphone or Desktop; these crawlers simulate a visitor using a mobile device or a desktop computer, respectively.

Google crawls all pages on your site using this primary crawler type. Google may additionally crawl a subset of your pages using a secondary crawler (sometimes called alternate crawler), which is the other user agent type. For example, if the primary crawler for your site is Smartphone, the secondary crawler is Desktop; if the primary crawler is Desktop, your secondary crawler is Smartphone. The purpose of a secondary crawl is to try to get more information about how your site behaves when visited by users on another device type.

What to look for

Ideally you should see a gradually increasing count of valid indexed pages as your site grows. If you see drops or spikes, see the troubleshooting section. The status table in the summary page is grouped and sorted by "status + reason".

Your goal is to get the canonical version of every important page indexed. Any duplicate or alternate pages should be labeled "Excluded" in this report. Duplicate or alternate pages have substantially the same content as the canonical page. Having a page marked duplicate or alternate is usually a good thing; it means that we've found the canonical page and indexed it. You can find the canonical for any URL by running the URL Inspection tool. See more reasons why pages might be missing.

What not to look for

  • 100% coverage: You should not expect all URLs on your site to be indexed, only the canonical pages, as described above.
  • Immediate indexing: When you add new content, it can take a few days for Google to index it. You can reduce the indexing lag by requesting indexing.

Status

Each page can have one of the following status values:

  • Error: The page is not indexed. See the specific error type description to learn more about the error and how to fix it. You should concentrate on these issues first.
  • Warning: The page is indexed, but has an issue that you should be aware of.
  • Excluded: The page is not indexed, but we think that was your intention. (For example, you might have deliberately excluded it by a noindex directive, or it might be a duplicate of a canonical page that we've already indexed on your site.)
  • Valid: The page is indexed.

Reason

Each status (error, warning, valid, excluded) has a specific reason for that status. See Status type descriptions below for a description of each status type, and how to handle it.

Validation

The validation status for this issue. You should prioritize fixing issues that are in validation state "failed" or "not started".

About validation

After you fix all instances of a specific issue on your site, you can ask Google to validate your changes. If all known instances are gone, the issue is marked as fixed in the status table and dropped to the bottom of the table. Search Console tracks the validation state of the issue as a whole, as well as the state of each instance of the issue. When all instances of the issue are gone, the issue is considered fixed. (For actual states recorded, see Issue validation state and Instance validation state.)

More about issue lifetime...

An issue's lifetime extends from the first time any instance of that issue was detected on your site until 90 days after the last instance was marked as gone from your site. If ninety days pass without any recurrences, the issue is removed from the report history.

The issue's first detected date is the first time the issue was detected during the issue's lifetime, and does not change. Therefore:

  • If all instances of an issue are fixed, but a new instance of the issue occurs 15 days later, the issue is marked as open, and "first detected" date remains the original date.
  • If the same issue recurs 91 days after the last instance was fixed, the previous issue has already been closed, and so this is recorded as a new issue, with the first detected date set to "today".

Basic validation flow

Here is an overview of the validation process after you click Validate Fix for an issue. This process can take several days, and you will receive progress notifications by email.

  1. When you click Validate Fix, Search Console immediately checks a few pages.
    • If the current instance exists in any of these pages, validation ends, and the validation state remains unchanged.
    • If the sample pages do not have the current error, validation continues with state Started. If validation finds other unrelated issues, these issues are counted against that other issue type and validation continues.
  2. Search Console works through the list of known URLs affected by this issue. Only URLs with known instances of this issue are queued for recrawling, not the whole site. Search Console keeps a record of all URLs checked in the validation history, which can be reached from the issue details page.
  3. When a URL is checked:
    1. If the issue is not found, the instance validation state changes to Passing. If this is the first instance checked after validation has started, the issue validation state changes to Looking good.
    2. If the URL is no longer reachable, the instance validation state changes to Other (which is not an error state).
    3. If the instance is still present, issue state changes to Failed and validation ends. If this is a new page discovered by normal crawling, it is considered another instance of this existing issue.
  4. When all error and warning URLs have been checked and the issue count is 0, the issue state changes to Passed. Important: Even when the number of affected pages drops to 0 and issue state changes to Passed, the original severity label will still be shown (Error or Warning).

Even if you never click "start validation" Google can detect fixed instances of an issue. If Google detects that all instances of an issue have been fixed during its regular crawl, it will change the issue state to "N/A" on the report.

When is an issue considered "fixed" for a URL or item?

An issue is marked as fixed for a URL or item when either of the following conditions are met:

  • When the URL is crawled and the issue is no longer found on the page. For an AMP tag error, this can mean that you either fixed the tag or that the tag has been removed (if the tag is not required). During a validation attempt, it will be considered as "passed."
  • If the page is not available to Google for any reason (page removed, marked noindex, requires authentication, and so on), the issue will be considered as fixed for that URL. During a validation attempt, it is counted in the "other" validation state.

Revalidation

When you click Revalidate for a failed validation, validation restarts for all failed instances, plus any new instances of this issue discovered through normal crawling.

You should wait for a validation cycle to complete before requesting another cycle, even if you have fixed some issues during the current cycle.

Instances that have passed validation (marked Passed) or are no longer reachable (marked Other) are not checked again, and are removed from the history when you click Revalidate.

Validation history

You can see the progress of a validation request by clicking the validation details link in the issue details page.

Entries in the validation history page are grouped by URL for the AMP report and Index Status report. In the Mobile Usability and Rich Result reports, items are grouped by the combination of URL + structured data item (as determined by the item's Name value). The validation state applies to the specific issue that you are examining. You can have one issue labeled "Passed" on a page, but other issues labeled "Failed", "Pending," or "Other".

Issue validation state

The following validation states apply to a given issue:

  • Not started: There are one or more pages with an instance of this issue that you have never begun a validation attempt for. Next steps:
    1. Click into the issue to learn the details of the error. Inspect the individual pages to see examples of the error on the live page using the AMP Test. (If the AMP Test does not show the error on the page, it is because you fixed the error on the live page after Google found the error and generated this issue report.)
    2. Click "Learn more" on the details page to see the details of the rule that was violated.
    3. Click an example URL row in the table to get details on that specific error.
    4. Fix your pages and then click Validate fix to have Google recrawl your pages. Google will notify you about the progress of the validation. Validation typically takes up to about two weeks, but in some cases can take much longer, so please be patient.
  • Started:  You have begun a validation attempt and no remaining instances of the issue have been found yet. Next step: Google will send notifications as validation proceeds, telling you what to do, if necessary.
  • Looking good: You started a validation attempt, and all issue instances that have been checked so far have been fixed. Next step: Nothing to do, but Google will send notifications as validation proceeds, telling you what to do.
  • Passed: All known instances of the issue are gone (or the affected URL is no longer available). You must have clicked "Validate fix" to get to this state (if instances disappeared without you requesting validation, state would change to N/A). Next step: Nothing more to do.
  • N/A: Google found that the issue was fixed on all URLs, even though you never started a validation attempt. Next step: Nothing more to do.
  • Failed: A certain threshold of pages still contain this issue, after you clicked "Validate." Next steps: Fix the issue and revalidate.

Instance validation state

After validation has been requested, every instance of the issue is assigned one of the following validation states:

  • Pending validation: Queued for validation. The last time Google looked, this issue instance existed.
  • Passed: [Not available in all reports] Google checked for the issue instance and it no longer exists. Can reach this state only if you explicitly clicked Validate for this issue instance.
  • Failed: Google checked for the issue instance and it's still there. Can reach this state only if you explicitly clicked Validate for this issue instance.
  • Other: [Not available in all reports] Google couldn't reach the URL hosting the instance, or (for structured data) couldn't find the item on the page any more. Considered equivalent to Passed.

Note that the same URL can have different states for different issues. For example, if a single page has both issue X and issue Y, issue X can be in validation state Passed while issue Y on the same page is in validation state Pending.

URL discovery dropdown filter

You can use the dropdown filter above the chart to filter index results by how Google discovered the URL. The following values are available:

  • All known pages [Default] - Show all URLs discovered by Google through any means.
  • All submitted pages - Show only pages submitted in a sitemap, either using this report or by sitemap ping.
  • Specific sitemap URL - Show only URLs listed in a specific sitemap that was submitted using this report. This includes any URLs in nested sitemaps.

A URL is considered to be submitted by a sitemap even if it was also discovered through some other mechanism (for example, by organic crawling from another page).

Details page

Click on a row in the summary page to open a details page for that status + reason combination. You can see details about the chosen issue by clicking Learn more at the top of the page.

The graph on this page shows the count of affected pages over time.

The table shows an example list of pages affected by this status + reason. You can use the following row elements:

  • Click the row to see more details about that URL.
  • Click the open-in-new-tab icon to open the URL in a new tab.
  • Click the inspect icon to open URL Inspection for that URL.
  • Click the copy icon to copy the URL.

The Source value on the details page shows which user agent type (Smartphone or Desktop) was used to crawl the URLs listed.

When you've fixed all instances of an error or warning, click Validate Fix to let Google know that you've fixed the issue.

See a URL marked with an issue that you've already fixed? Perhaps you fixed the issue after the last Google crawl, so check the crawl date for that URL. Confirm your fix, then request re-indexing.

Sharing the report

You can share issue details in the coverage or enhancement reports by clicking the Share button on the page. This link grants access only to the current issue details page, plus any validation history pages for this issue, to anyone with the link. It does not grant access to other pages for your resource, or enable the shared user to perform any actions on your property or account. You can revoke the link at any time by disabling sharing for this page.

Exporting report data

Many reports provide an export button to export the report data. Both chart and table data are exported. Values shown as either ~ or - in the report (not available/not a number) will be zeros in the downloaded data.

Troubleshooting

You can confirm the indexing status for any URL shown in this report as follows:

  1. Decide whether the index status really is a problem based on the status type, indexing goal, and specific error.
  2. Read the specific information about the issue.
  3. Inspect the URL with the URL Inspection tool:
    1. Click the inspect icon next to the URL in the examples table to open URL Inspection for that URL.
    2. See crawl and index details for that URL in the Coverage > Crawl and Coverage > Indexing sections of the URL Inspection report.
    3. To test the live version of the page, click Test live URL.

Common issues

Here are some of the most common indexing issues that you might see in this report:

Drop in total indexed pages without corresponding errors

If you see a drop in total indexed pages without a corresponding increase in errors, you might be blocking access to your existing pages via robots.txt, 'noindex' or a required login. Look at the Excluded URLs for a spike that corresponds to your drop in Valid pages. Note that if these URLs were submitted in a sitemap, they would be marked as errors, not excluded.

More Excluded than Valid pages

If you see more Excluded than Valid pages, look at the exclusion reasons. Common exclusion reasons include:

  • You have a robots.txt rule that blocks Google from crawling large sections of your site. If you are blocking the wrong pages, unblock them.
  • Your site has a large number of duplicate pages, probably because it uses parameters to filter or sort a common collection (for example: type=dress or color=green or sort=price). These pages probably should be excluded, if they are just showing the same content sorted, filtered, or reached in different ways. If you are an advanced user, and you think that Google is misunderstanding parameters on your site, you can use the URL Parameters tool to customize the handling of your site's parameters (see the sketch below).
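
To illustrate the duplicate-parameter case, here is a small sketch, using hypothetical parameter names, showing how two filter/sort URLs that differ only in parameter order describe the same page; this is why Google folds such URLs into one canonical and excludes the rest.

  # Small sketch: two URLs that differ only in query-parameter order point to
  # the same content. The parameter names (type, color, sort) are hypothetical.
  from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

  def normalized(url):
      parts = urlsplit(url)
      # Sorting the parameters makes ordering differences disappear.
      query = urlencode(sorted(parse_qsl(parts.query)))
      return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

  a = "https://example.com/dresses?type=dress&color=green&sort=price"
  b = "https://example.com/dresses?sort=price&color=green&type=dress"
  print(normalized(a) == normalized(b))  # True: same page, two URLs
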
Error spikes

Error spikes might be caused by a change in your template that introduces a new error, or you might have submitted a sitemap that includes URLs that are blocked for crawling by robots.txt, noindex, or a login requirement.

If you see an error spike:

  1. Compare the sparkline next to each error row on the summary page against your total error count or total indexed count; a matching rise or drop is a clue to which issue is affecting your totals.
  2. Click into the details pages for any errors that seem to be contributing to your error spike. Read the description about the specific error type to learn how to handle it best.
  3. Click into an issue, and inspect an example page to see what the error is, if necessary.
  4. Fix all instances for the error, and request validation by clicking Validate Fix in the details page for that reason. Read more about validation.
  5. You'll get notifications as your validation proceeds, but you can check back after a few days to see whether your error count has gone down.
Server errors

A server error means that Googlebot couldn't access your URL, the request timed out, or your site was busy. As a result, Googlebot was forced to abandon the request.

Check the host status verdict for your site in the Crawl Stats report to see if Google is reporting site availability issues that you can confirm and fix.

Testing server connectivity

You can use the URL Inspection tool to see if you can reproduce a server error reported by the Index Coverage report.

Fixing server connectivity errors

  • Reduce excessive page loading for dynamic page requests.
    A site that delivers the same content for multiple URLs is considered to deliver content dynamically (for example, www.example.com/shoes.php?color=red&size=7 serves the same content as www.example.com/shoes.php?size=7&color=red). Dynamic pages can take too long to respond, resulting in timeout issues. Or the server might return an overloaded status to ask Googlebot to crawl the site more slowly. In general, we recommend keeping parameter lists short and using them sparingly. If you're confident about how parameters work for your site, you can tell Google how they should be handled.
  • Make sure your site's hosting server is not down, overloaded, or misconfigured.
    If connection, timeout, or response problems persist, check with your hosting provider and consider increasing your site's ability to handle traffic.
  • Check that you are not inadvertently blocking Google.
    You might be blocking Google due to a system level issue, such as a DNS configuration issue, a misconfigured firewall or DoS protection system, or a content management system configuration. Protection systems are an important part of good hosting and are often configured to automatically block unusually high levels of server requests. However, because Googlebot often makes more requests than a human user, it can trigger these protection systems, causing them to block Googlebot and prevent it from crawling your website. To fix such issues, identify which part of your website's infrastructure is blocking Googlebot and remove the block. The firewall may not be under your control, so you may need to discuss this with your hosting provider.
  • Control search engine site crawling and indexing wisely.
    Some webmasters intentionally prevent Googlebot from reaching their websites, perhaps using a firewall as described above. In these cases, usually the intent is not to entirely block Googlebot, but to control how the site is crawled and indexed. If this applies to you, review the mechanisms you use to control crawling (such as robots.txt rules or firewall rate limits) and confirm that they still allow Googlebot the access you intend.

404 errors

In general, we recommend fixing only 404 error pages, not 404 excluded pages. 404 error pages are pages that you explicitly asked Google to index, but were not found, which is obviously a bug. 404 excluded pages are pages that Google discovered through some other mechanism, such as a link from another page. If the page has been moved, you should return a 3XX redirect to the new page. Learn more about evaluating and fixing 404 errors.
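
For the "page has moved" case above, here is a minimal sketch of returning a permanent redirect using Python's standard library; the old and new paths are hypothetical examples, and a real site would normally configure this in its web server or framework instead.

  # Minimal sketch: answer a moved URL with a 301 redirect to its new location,
  # and a real 404 for pages that are genuinely gone. Paths are hypothetical.
  from http.server import BaseHTTPRequestHandler, HTTPServer

  MOVED = {"/petstore/gerbil": "/petstore/gerbils"}

  class RedirectHandler(BaseHTTPRequestHandler):
      def do_GET(self):
          if self.path in MOVED:
              self.send_response(301)                  # permanent redirect
              self.send_header("Location", MOVED[self.path])
              self.end_headers()
          else:
              self.send_error(404)                     # genuinely gone

  if __name__ == "__main__":
      HTTPServer(("", 8000), RedirectHandler).serve_forever()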

Missing pages or sites

If your page is not in the report at all, one of the following is probably true:

  • Google doesn't know about the page. Some notes about page discoverability:
    • If this is a new site or page, remember that it can take some time for Google to find and crawl new sites or pages.
    • In order for Google to learn about a page, you must either submit a sitemap or page crawl request, or else Google must find a link to your page somewhere.
    • After a page URL is known, it can take some time (up to a few weeks) before Google crawls some or all of your site.
    • Indexing is never instant, even when you submit a crawl request directly.
    • Google doesn't guarantee that all pages everywhere will make it into the Google index.
  • Google can't reach your page (it requires a login, or is otherwise not available to all users on the internet)
  • The page has a noindex tag, which prevents Google from indexing it
  • The page was dropped from the index for some reason.

To fix:

Use the URL Inspection tool to test the problem on your page. If the page is not in the Index Coverage report but it is listed as indexed in the URL Inspection report, it was probably indexed recently, and will appear in the Index Coverage report soon. If the page is listed as not indexed in the URL Inspection tool (which is what you'd expect), test the live page. The live page test results should indicate what the issue is: use the information from the test and the test documentation to learn how to fix the issue.

"Submitted" errors and exclusions
Any indexing reason that uses the word "Submitted" in the title (for example, "Submitted URL returned 403") means that the URL is listed in a sitemap that is either referenced by your robots.txt file or submitted using the Sitemaps report.
To fix a "Submitted" issue:
  • Fix the issue that prevents the page from being crawled
    or
  • Remove the URL from your sitemap and resubmit the sitemap in the Sitemaps report (for fastest service)
    or
  • Using the Sitemaps report, delete any sitemaps that contain the URL (and ensure that no sitemaps listed in your robots.txt file include this URL).
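
If you go the "remove the URL from your sitemap" route, here is a rough sketch of scripting that edit with Python's standard library; the file name and the URL to remove are hypothetical examples.

  # Rough sketch: drop one URL from an existing sitemap.xml before resubmitting it.
  # The file name and the URL to remove are hypothetical examples.
  import xml.etree.ElementTree as ET

  SM = "http://www.sitemaps.org/schemas/sitemap/0.9"
  ET.register_namespace("", SM)  # keep the default namespace when rewriting

  tree = ET.parse("sitemap.xml")
  urlset = tree.getroot()
  for url in urlset.findall(f"{{{SM}}}url"):
      loc = url.find(f"{{{SM}}}loc")
      if loc is not None and loc.text.strip() == "https://example.com/removed-page":
          urlset.remove(url)

  tree.write("sitemap.xml", encoding="utf-8", xml_declaration=True)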

FAQs

Why is my page in the index? I don't want it indexed.

Google can index any URL that it finds unless you include a noindex directive on the page (or it has been temporarily blocked), and Google can find a page in many different ways, including someone linking to your page from another site.

  • If you want your page to be blocked from Google Search results, you can either require some kind of login for the page, or you can use a noindex directive on the page.
  • If you want your page to be removed from Google Search results after it has already been found, you'll need to follow these steps.

Why hasn't my site been reindexed lately?

Google reindexes pages based on a number of criteria, including how often it thinks the page changes. If your site doesn't change often, it might be on a slower refresh rate, which is fine if your pages haven't changed. If you think your site needs a refresh, ask Google to recrawl it.

Can you please recrawl my page/site?

Ask Google to recrawl it.

Why are so many of my pages excluded?

Look at the exclusion reasons detailed by the Index Coverage report. Most exclusions are due to one of the following reasons:

  • You have a robots.txt rule that is blocking us from crawling large sections of your site. Use the URL Inspection tool to confirm the problem.
  • Your site has a large number of duplicate pages, typically because it uses parameters to filter or sort a common collection (for example: type=dress or color=green or sort=price). These pages will be labeled as "duplicate" or "alternate" in the Index Coverage report.
  • The URL redirects to another URL. Redirect URLs are not indexed; the redirect target is.

Google can't access my sitemap

Be sure that your sitemap is not blocked by robots.txt, is valid, and that you're using the proper URL in your robots.txt entry or Sitemaps report submission. Test your sitemap URL using a publicly available sitemap testing tool.

Why does Google keep crawling a page that was removed?

Google continues to crawl known URLs for a while even after they return 4XX errors, in case the error is temporary. The only case when a URL won't be crawled is when it returns a noindex directive.

To avoid showing you an eternally growing list of 404 errors, the Index Coverage report shows only URLs that have shown 404 errors in the past month.

I can see my page, why can't Google?

Use the URL Inspection tool to see whether Google can see the live page. If it can't, it should explain why. If it can, the problem is likely that the access error has been fixed since the last crawl. Run a live crawl using the URL Inspection tool and request indexing.

The URL Inspection tool shows no problems, but the Index Coverage report shows an error; why?

You might have fixed the error after the URL was last crawled by Google. Look at the crawl date for your URL (which should be visible in either the URL details page in the Index Coverage report or in the indexed version view in the URL Inspection tool). Determine if you made any fixes since the page was crawled.

How do I find the index state of a specific URL?

To learn the index status of a specific URL, use the URL Inspection tool. You can't search or filter by URL in the Index Coverage report.

Status reasons

The following status types are exposed by the Index Coverage report:

Error

Pages with errors have not been indexed.

Server error (5xx): Your server returned a 500-level error when the page was requested. See Fixing server errors.

Redirect error: Google experienced a redirect error of one of the following types: A redirect chain that was too long; a redirect loop; a redirect URL that eventually exceeded the max URL length; there was a bad or empty URL in the redirect chain. Use a web debugging tool, such as Lighthouse, to get more details about the redirect.

Submitted URL blocked by robots.txt: You submitted this page for indexing, but the page is blocked by your site's robots.txt file.

  1. Click the page in the Examples table to expand the tools side panel.
  2. Click Test robots.txt blocking to run the robots.txt tester for that URL. The tool should highlight the rule that is blocking that URL.
  3. Update your robots.txt file to remove or alter the rule, as appropriate. You can find the location of this file by clicking See live robots.txt on the robots.txt test tool. If you are using a web hosting service and do not have permission to modify this file, search your service's documentation or contact their help center to tell them about the problem.
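
As a quick complement to the robots.txt tester, here is a small sketch that checks whether a URL is blocked for Googlebot using Python's standard robots.txt parser; the URLs are hypothetical, and this parser does not implement every rule Google supports, so treat it only as a first check.

  # Small sketch: check whether robots.txt blocks a URL for Googlebot.
  # The site and page URLs are hypothetical examples.
  from urllib.robotparser import RobotFileParser

  rp = RobotFileParser("https://example.com/robots.txt")
  rp.read()  # fetch and parse the live robots.txt

  url = "https://example.com/petstore/gerbil"
  print("Googlebot allowed:", rp.can_fetch("Googlebot", url))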

Submitted URL marked ‘noindex’: You submitted this page for indexing, but the page has a 'noindex' directive either in a meta tag or HTTP header. If you want this page to be indexed, you must remove the tag or HTTP header. Use the URL Inspection tool to confirm the error:

  1. Click the inspection icon next to the URL in the table.
  2. Under Coverage > Indexing > Indexing allowed? the report should show that noindex is preventing indexing.
  3. Confirm that the noindex tag still exists in the live version:
    1. Click Test live URL.
    2. Under Availability > Indexing > Indexing allowed? see if the noindex directive is still detected. If noindex is no longer present, you can click Request Indexing to ask Google to try again to index the page. If noindex is still present, you must remove it in order for the page to be indexed.

Submitted URL seems to be a Soft 404: You submitted this page for indexing, but the server returned what seems to be a soft 404. Learn how to fix this.

Submitted URL returns unauthorized request (401): You submitted this page for indexing, but Google got a 401 (not authorized) response. Either remove authorization requirements for this page, or else allow Googlebot to access your pages by verifying its identity. You can verify this error by visiting the page in incognito mode.
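
Google documents a reverse-then-forward DNS check for verifying that a request really comes from Googlebot; here is a rough sketch of that check in Python (the IP address is only an example), which can help you let Googlebot through an authorization or firewall layer without opening the page to everyone.

  # Rough sketch: verify a crawler IP using the documented reverse DNS lookup
  # followed by a confirming forward lookup. The IP below is only an example.
  import socket

  def is_googlebot(ip):
      try:
          host = socket.gethostbyaddr(ip)[0]          # reverse DNS
      except socket.herror:
          return False
      if not host.endswith((".googlebot.com", ".google.com")):
          return False
      # Forward-confirm: the hostname must resolve back to the same IP.
      return ip in socket.gethostbyname_ex(host)[2]

  print(is_googlebot("66.249.66.1"))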

Submitted URL not found (404): You submitted a non-existent URL for indexing. See Fixing 404 errors.

Submitted URL returned 403: The server denied Googlebot access to the content (403 Forbidden). If this page should be indexed, please grant access to anonymous visitors; otherwise, do not submit this page for indexing.

Submitted URL blocked due to other 4xx issue: The server returned a 4xx response code not covered by any other issue type described here for the submitted URL. You should either fix this error, or not submit this URL for indexing. Try debugging your page using the URL Inspection tool.

Warning

Pages with a warning status might require your attention, and may or may not have been indexed, according to the specific result.

Indexed, though blocked by robots.txt: The page was indexed, despite being blocked by your website's robots.txt file. Google always respects robots.txt, but this doesn't necessarily prevent indexing if someone else links to your page. Google won't request and crawl the page, but we can still index it, using the information from the page that links to your blocked page. Because of the robots.txt rule, any snippet shown in Google Search results for the page will probably be very limited.

Next steps: If you want this page crawled and indexed normally, update your robots.txt file so Google can crawl it. If you don't want the page indexed at all, remove the robots.txt block and use a 'noindex' directive instead; Google must be able to crawl the page in order to see the directive.

Page indexed without content: This page appears in the Google index, but for some reason Google could not read the content. Possible reasons are that the page might be cloaked to Google or the page might be in a format that Google can't index. This is not a case of robots.txt blocking. Inspect the page, and look at the Coverage section for details.

Valid

Pages with a valid status have been indexed.

Submitted and indexed: You submitted the URL for indexing, and it was indexed.

Indexed, not submitted in sitemap: The URL was discovered by Google and indexed. We recommend submitting all important URLs using a sitemap.

Excluded

These pages are typically not indexed, and we think that is appropriate. These pages are either duplicates of indexed pages, or blocked from indexing by some mechanism on your site, or otherwise not indexed for a reason that we think is not an error.

Excluded by ‘noindex’ tag: When Google tried to index the page it encountered a 'noindex' directive and therefore did not index it. If you do not want this page indexed, congratulations! If you do want this page to be indexed, you should remove that 'noindex' directive. To confirm the presence of this tag or directive, request the page in a browser and search the response body and response headers for "noindex". 
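
Here is a small sketch of the manual check described above: fetch the page and look for "noindex" in the X-Robots-Tag response header and in a robots meta tag. The URL is hypothetical, and the meta-tag pattern is only a rough one; real markup can vary.

  # Small sketch: look for a noindex directive in the response headers and body.
  # The URL is a hypothetical example.
  import re
  import urllib.request

  url = "https://example.com/petstore/gerbil"
  with urllib.request.urlopen(url) as resp:
      header = resp.headers.get("X-Robots-Tag", "")
      body = resp.read().decode("utf-8", errors="replace")

  print("noindex in header:", "noindex" in header.lower())
  print("noindex in meta tag:", bool(re.search(
      r'<meta[^>]+name=["\'](?:robots|googlebot)["\'][^>]*noindex', body, re.I)))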

Blocked by page removal tool: The page is currently blocked by a URL removal request. If you are a verified site owner, you can use the URL removals tool to see who submitted a URL removal request. Removal requests are only good for about 90 days after the removal date. After that period, Googlebot may go back and index the page even if you do not submit another index request. If you don't want the page indexed, use 'noindex', require authorization for the page, or remove the page.

Blocked by robots.txt: This page was blocked to Googlebot with a robots.txt file. You can verify this using the robots.txt tester. Note that this does not mean that the page won't be indexed through some other means. If Google can find other information about this page without loading it, the page could still be indexed (though this is less common). To ensure that a page is not indexed by Google, remove the robots.txt block and use a 'noindex' directive.

Blocked due to unauthorized request (401): The page was blocked to Googlebot by a request for authorization (401 response). If you do want Googlebot to be able to crawl this page, either remove authorization requirements, or allow Googlebot to access your page.

Crawled - currently not indexed: The page was crawled by Google, but not indexed. It may or may not be indexed in the future; no need to resubmit this URL for crawling.

Discovered - currently not indexed: The page was found by Google, but not crawled yet. Typically, Google wanted to crawl the URL but this was expected to overload the site; therefore Google rescheduled the crawl. This is why the last crawl date is empty on the report.

Alternate page with proper canonical tag: This page is a duplicate of a page that Google recognizes as canonical. This page correctly points to the canonical page, so there is nothing for you to do.

Duplicate without user-selected canonical: This page has duplicates, none of which is marked canonical. We think this page is not the canonical one. You should explicitly mark the canonical for this page. Inspecting this URL should show the Google-selected canonical URL.
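
If you are deciding which canonical to declare, it can help to see what the page currently claims; here is a sketch that extracts a page's rel="canonical" link tag so you can compare it with the Google-selected canonical reported by the URL Inspection tool. The URL is a hypothetical example.

  # Sketch: report the canonical URL a page declares via its rel="canonical"
  # link tag. The page URL is a hypothetical example.
  import urllib.request
  from html.parser import HTMLParser

  class CanonicalFinder(HTMLParser):
      canonical = None
      def handle_starttag(self, tag, attrs):
          a = dict(attrs)
          if tag == "link" and (a.get("rel") or "").lower() == "canonical":
              self.canonical = a.get("href")

  url = "https://example.com/petstore/gerbil?color=brown"
  html = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
  finder = CanonicalFinder()
  finder.feed(html)
  print("Declared canonical:", finder.canonical)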

Duplicate, Google chose different canonical than user: This page is marked as canonical for a set of pages, but Google thinks another URL makes a better canonical. Google has indexed the page that we consider canonical rather than this one. We recommend that you explicitly mark this page as a duplicate of the canonical URL. This page was discovered without an explicit crawl request. Inspecting this URL should show the Google-selected canonical URL.

Not found (404): This page returned a 404 error when requested. Google discovered this URL without any explicit request or sitemap. Google might have discovered the URL as a link from another site, or possibly the page existed before and was deleted. Googlebot will probably continue to try this URL for some period of time; there is no way to tell Googlebot to permanently forget a URL, although it will crawl it less and less often. 404 responses are not a problem if they are intentional. If your page has moved, use a 301 redirect to the new location. Read Fixing 404 errors.

Page with redirect: The URL is a redirect, and therefore was not added to the index.

Soft 404: The page request returns what we think is a soft 404 response. This means that it returns a user-friendly "not found" message without a corresponding 404 response code. We recommend returning a 404 response code for truly "not found" pages, or adding more information to the page to let us know that it is not a soft 404. Learn more

Duplicate, submitted URL not selected as canonical: The URL is one of a set of duplicate URLs without an explicitly marked canonical page. You explicitly asked this URL to be indexed, but because it is a duplicate, and Google thinks that another URL is a better candidate for canonical, Google did not index this URL. Instead, we indexed the canonical that we selected. (Google only indexes the canonical in a set of duplicates.) The difference between this status and "Google chose different canonical than user" is that here you have explicitly requested indexing. Inspecting this URL should show the Google-selected canonical URL.

Blocked due to access forbidden (403): The user agent provided credentials, but was not granted access. However, Googlebot never provides credentials, so your server is returning this error incorrectly. This error should either be fixed, or the page should be blocked by robots.txt or noindex.

Blocked due to other 4xx issue: The server encountered a 4xx error not covered by any other issue type described here.
