Crawl errors

Not found errors (404)

What is a Not Found error?

Google discovers content by following links from one page to another. Generally, a Not Found status error (usually a 404 HTTP status code) is returned when Googlebot attempts to visit a page that doesn’t exist—either because you deleted or renamed it without redirecting the old URL to a new page, or because of a typo in a link.

Dealing with Not Found errors

Generally, 404 errors don’t impact your site’s ranking in Google, and you can safely ignore them. Typically, they are caused by typos, by misconfigurations (for example, in links that a content management system generates automatically), or by Google’s increased efforts to recognize and crawl links in embedded content such as JavaScript. Here are some pointers to help you investigate:

  • See where invalid links are coming from by viewing the Linked from these pages section, which you reach by clicking the URL.
  • Fix or delete broken links that come from your own site.
  • Capture intended traffic from misspelled links on other sites with 301 redirects.
    For example, a misspelling of a legitimate URL (www.example.com/redshuz instead of www.example.com/redshoes) probably happens when someone intended to link to your site and simply made a typo. In this case, you can capture that misspelled URL in your server configuration and create a 301 redirect to the correct URL. You can also contact the webmaster of a site with an incorrect link, and ask for the link to be updated or removed.
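
As a rough illustration of the redirect described above, here is a minimal Python sketch of the lookup a server could perform before answering a request. The names TYPO_REDIRECTS, KNOWN_PATHS, and resolve are hypothetical, and the paths come from the example above. In practice you would more likely express this rule in your web server's own redirect configuration.

```python
from typing import Optional, Tuple

# Hypothetical map of known misspelled paths to the pages they were meant to reach.
TYPO_REDIRECTS = {
    "/redshuz": "/redshoes",
}

# Hypothetical set of paths that really exist on the site.
KNOWN_PATHS = {"/", "/redshoes"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_target) for a requested path."""
    if path in KNOWN_PATHS:
        return 200, None
    if path in TYPO_REDIRECTS:
        # Permanent redirect: the misspelling should always lead to the real page.
        return 301, TYPO_REDIRECTS[path]
    # Anything else does not exist, so a plain 404 is the honest answer.
    return 404, None


if __name__ == "__main__":
    for requested in ("/redshoes", "/redshuz", "/bluehats"):
        print(requested, resolve(requested))
```

Only misspellings you have actually seen in your crawl errors need an entry. Unknown paths still fall through to a 404.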

404s are a perfectly normal (and in many ways desirable) part of the web. You will likely never be able to control every link to your site, or resolve every 404 error listed in Search Console. Instead, check the top-ranking issues, fix those if possible, and then move on.

When to return a 404 status code

When you remove a page from your site, think about whether that content is moving somewhere else, or whether you no longer plan to have that type of content on your site. 

  • When moving content to a new URL, redirect the old URL to the new URL—that way when users come to the old URL looking for that content, they’ll be automatically redirected to something relevant to what they were looking for.
  • When you permanently remove content without intending to replace it with newer, related content, let the old URL return a 404 or 410. Currently Google treats 410s (Gone) the same as 404s (Not found). 

Returning a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404) can be problematic. Such pages are called soft 404s, and can be confusing to both users and search engines.
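
The guidance above can be sketched as two small Python helpers. This is an illustration under stated assumptions, not how Google classifies pages: status_for_retired_url and is_soft_404 are hypothetical names, and the second check only makes sense for a URL you already know has no content and no replacement.

```python
from typing import Optional

# Status codes that clearly tell users and crawlers the page does not exist.
HARD_NOT_FOUND = {404, 410}


def status_for_retired_url(replacement_url: Optional[str]) -> int:
    """Pick a status code for a URL whose page has been removed."""
    if replacement_url:
        # The content moved: redirect permanently to its new location.
        return 301
    # The content is gone with no replacement: 404 (410 would also work).
    return 404


def is_soft_404(status: int) -> bool:
    """For a URL known to have no content and no replacement,
    flag any response that is not a 404 or 410."""
    return status not in HARD_NOT_FOUND


if __name__ == "__main__":
    print(status_for_retired_url("/new-location"))
    print(status_for_retired_url(None))
    print(is_soft_404(200), is_soft_404(302), is_soft_404(410))
```

A 200 response with an error message in the body and a redirect to the homepage would both be flagged here, which matches the two soft 404 cases mentioned above.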

Unexpected 404 errors

In Crawl Errors, you might occasionally see 404 errors for URLs you don't believe exist on your own site or on the web. These unexpected URLs might be generated by Googlebot trying to follow links found in JavaScript, Flash files, or other embedded content.

For example, your site may use the following code to track file downloads in Google Analytics:

<a href="helloworld.pdf" onClick="_gaq.push(['_trackPageview','/download-helloworld']);">Hello World PDF</a>

When Googlebot sees this code, it might try to crawl the URL http://www.example.com/download-helloworld, even though it’s not a real page. In this case, the link may appear as a 404 (Not Found) error in the Crawl Errors feature in Search Console.
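
If you want to anticipate which strings in your own markup might be mistaken for URLs, a small script can list them. The Python sketch below is only an approximation built on an assumption: it treats any quoted string starting with "/" inside an inline event handler as path-like. How Googlebot actually decides what to crawl is not described here, and the names InlineScriptPaths and find_path_like_strings are hypothetical.

```python
import re
from html.parser import HTMLParser

# Assumption: a quoted string that starts with "/" looks like a crawlable path.
PATH_LIKE = re.compile(r"""['"](/[^'"\s]+)['"]""")


class InlineScriptPaths(HTMLParser):
    """Collect path-like strings from inline event handlers such as onClick."""

    def __init__(self):
        super().__init__()
        self.paths = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # HTMLParser lowercases attribute names, so onClick arrives as onclick.
            if name.startswith("on") and value:
                self.paths.extend(PATH_LIKE.findall(value))


def find_path_like_strings(html: str) -> list:
    parser = InlineScriptPaths()
    parser.feed(html)
    return parser.paths


# The tracking link from the example above.
SNIPPET = (
    '<a href="helloworld.pdf" '
    "onClick=\"_gaq.push(['_trackPageview','/download-helloworld']);\">"
    "Hello World PDF</a>"
)

if __name__ == "__main__":
    print(find_path_like_strings(SNIPPET))
```

Any path this reports that does not correspond to a real page is a candidate for an unexpected 404 in your crawl errors.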

Google strives to detect these types of issues and resolve them so that they will disappear from Crawl Errors. 
