
Crawl Errors report for websites

Website crawl errors can prevent your page from appearing in search results

The Crawl Errors report for websites provides details about the site URLs that Google could not successfully crawl or that returned an HTTP error code.

Open the crawl errors report

Looking for the Crawl Status report for apps?
The report has two main sections:

  • Site errors: This section of the report shows the main issues for the past 90 days that prevented Googlebot from accessing your entire site (click any box to display its chart).
     
  • URL errors: This section lists specific errors Google encountered when trying to crawl specific desktop, phone, or Android app pages. Each main section in the URL Errors reports corresponds to the different crawling mechanisms Google uses to access your pages, and the errors listed are specific to those kinds of pages.

Site errors overview

In a well-operating site, the Site errors section of the Crawl Errors report should show no errors (this is true for the large majority of the sites we crawl). If Google detects any appreciable number of site errors, we'll try to notify you in the form of a message, regardless of the size of your site.

When you first view the Crawl Errors page, the Site errors section shows a quick status code next to each of the three error types: DNS, Server connectivity, and robots.txt fetch. If the codes are anything other than a green check mark, you can click the box to see a graph of crawling details for the last 90 days.

High error rates

If your site shows a 100% error rate in any of the three categories, it likely means that your site is either down or misconfigured in some way. This could be due to a number of possibilities that you can investigate:

  • Check that a site reorganization hasn't changed permissions for a section of your site.
  • If your site has been reorganized, check that external links still work.
  • Review any new scripts to ensure they are not malfunctioning repeatedly.
  • Make sure all directories are present and haven't been accidentally moved or deleted.
If none of these situations apply to your site, the error rate might just be a transient spike, or due to external causes (someone has linked to non-existent pages), so there might not even be a problem. In any case, when we see an unusually large number of errors for your site, we'll let you know so you can investigate.

Low error rates

If your site has an error rate less than 100% in any of the categories, it could just indicate a transient condition, but it could also mean that your site is overloaded or improperly configured. You might want to investigate these issues further, or ask about them on our forum. We might alert you even if the overall error rate is very low — in our experience, a well configured site shouldn't have any errors in these categories.

Site error types

The following errors are exposed in the Site section of the report:

DNS Errors

What are DNS errors?

A DNS error means that Googlebot can't communicate with the DNS server either because the server is down, or because there's an issue with the DNS routing to your domain. While most DNS warnings or errors don't affect Googlebot's ability to access your site, they may be a symptom of high latency, which can negatively impact your users.

Fixing DNS errors

  • Make sure Google can crawl your site.
    Use Fetch as Google on a key page, such as your home page. If it returns the content of your homepage without problems, you can assume that Google is able to access your site properly.
  • For persistent or recurring DNS errors, check with your DNS provider.
    Often your DNS provider and your web hosting service are the same. 
  • Configure your server to respond to non-existent hostnames with an HTTP error code such as 404 or 500.
    A website such as example.com can be configured with a wildcard DNS setup to respond to requests for foo.example.com, made-up-name.example.com and any other subdomain. This makes sense in the case where a site with user-generated content gives each user account its own domain (http://username.example.com). However, in some cases, this kind of configuration can cause content to be unnecessarily duplicated across different hostnames, and it can also affect Googlebot's crawling.

DNS error list

Error Type Description
DNS Timeout

Google couldn't access your site because your DNS server did not respond to the lookup request in time.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Check with your registrar to make sure your site is correctly set up and that your server is connected to the Internet.

DNS Lookup

Google couldn't access your site because your DNS server did not recognize your hostname (such as www.example.com).

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Check with your registrar to make sure your site is correctly set up and that your server is connected to the Internet.

Server errors

What is a server error?

When you see this kind of error for your URLs, it means that Googlebot couldn't access your URL, the request timed out, or your site was busy. This happens either because the server is too slow to respond, or because your site is blocking Google. As a result, Googlebot was forced to abandon the request.

Fixing server connectivity errors

  • Reduce excessive page loading for dynamic page requests.
    A site that delivers the same content for multiple URLs is considered to deliver content dynamically (e.g. www.example.com/shoes.php?color=red&size=7 serves the same content as www.example.com/shoes.php?size=7&color=red).  Dynamic pages can take too long to respond, resulting in timeout issues. Or, the server might return an overloaded status to ask Googlebot to crawl the site more slowly. In general, we recommend keeping parameters short and using them sparingly. If you're confident about how parameters work for your site, you can tell Google how we should handle these parameters.
  • Make sure your site's hosting server is not down, overloaded, or misconfigured.
    If connection, timeout, or response problems persist, check with your web host and consider increasing your site's ability to handle traffic.
  • Check that you are not inadvertently blocking Google.
    You might be blocking Google due to a system level issue, such as a DNS configuration issue, a misconfigured firewall or DoS protection system, or a content management system configuration. Protection systems are an important part of good hosting and are often configured to automatically block unusually high levels of server requests. However, because Googlebot often makes more requests than a human user, it can trigger these protection systems, causing them to block Googlebot and prevent it from crawling your website. To fix such issues, identify which part of your website's infrastructure is blocking Googlebot and remove the block. The firewall may not be under your control, so you may need to discuss this with your hosting provider.
  • Control search engine site crawling and indexing wisely.
    Some webmasters intentionally prevent Googlebot from reaching their websites, perhaps using a firewall as described above. In these cases, the intent is usually not to block Googlebot entirely, but to control how the site is crawled and indexed. If this applies to you, note that you can request a change in Googlebot's crawl rate if you would like to change how frequently Googlebot crawls your site. Hosting providers can also verify ownership of their IP addresses.
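The parameter-order problem mentioned in the first bullet above (shoes.php?color=red&size=7 versus shoes.php?size=7&color=red) can be sketched with a small canonicalization helper. This is an illustrative approach, not something Google requires; it sorts query parameters so that URLs differing only in parameter order collapse to one canonical form:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonicalize(url):
    """Return a canonical form of `url` with query parameters sorted.

    URLs that serve the same content with parameters in a different
    order (a common source of duplicate dynamic URLs) map to a single
    canonical URL.
    """
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(params), parts.fragment))
```

Both example URLs from the text canonicalize to the same string, which makes duplicates easy to spot in server logs or a sitemap generator.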

Server connectivity errors

Error Type Description
Timeout

The server timed out waiting for the request.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Truncated headers

Google was able to connect to your server, but it closed the connection before full headers were sent. Please check back later.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connection reset

Your server successfully processed Google's request, but isn't returning any content because the connection with the server was reset. Please check back later.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Truncated response

Your server closed the connection before we could receive a full response, and the body of the response appears to be truncated.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connection refused

Google couldn't access your site because your server refused the connection. Your hosting provider may be blocking Googlebot, or there may be a problem with the configuration of their firewall.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Connect failed

Google wasn't able to connect to your server because the network is unreachable or down.

It's possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Connect timeout

Google was unable to connect to your server.

Use Fetch as Google to check if Googlebot can currently crawl your site. If Fetch as Google returns the content of your homepage without problems, you can assume that Google is generally able to access your site properly.

Check that your server is connected to the Internet. It's also possible that your server is overloaded or misconfigured. If the problem persists, check with your hosting provider.

Robots failure

What is a robots failure?

This error means that Googlebot couldn't retrieve your site's robots.txt file. Before Googlebot crawls your site, and roughly once a day after that, Googlebot retrieves your robots.txt file to see which pages it should not be crawling. If your robots.txt file exists but is unreachable (in other words, if it doesn't return a 200 or 404 HTTP status code), we'll postpone our crawl rather than risk crawling URLs that you do not want crawled. When this happens, Googlebot will return to your site and crawl it as soon as we can successfully access your robots.txt file. More information about the robots exclusion protocol.
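The behavior described above can be summarized as a small decision function. The return labels are illustrative, not Google's internal terminology:

```python
def robots_crawl_decision(status_code):
    """Model how a crawler reacts to the robots.txt fetch status.

    200 -> fetch succeeded: crawl, honoring the rules in the file.
    404 -> no robots.txt exists: crawl without restrictions.
    Anything else (e.g. 5xx or a timeout) -> postpone the crawl
    rather than risk fetching URLs the site owner wanted blocked.
    """
    if status_code == 200:
        return "crawl with robots.txt rules"
    if status_code == 404:
        return "crawl unrestricted"
    return "postpone crawl"
```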

Fixing robots.txt file errors

  • You don't always need a robots.txt file.
    You need a robots.txt file only if your site includes content that you don't want search engines to index. If you want search engines to index everything in your site, you don't need a robots.txt file—not even an empty one. If you don't have a robots.txt file, your server will return a 404 when Googlebot requests it, and we will continue to crawl your site. No problem.
  • Make sure your robots.txt file can be accessed by Google.
    It's possible that your server returned a 5xx (unreachable) error when we tried to retrieve your robots.txt file. Check that your hosting provider is not blocking Googlebot.  If you have a firewall, make sure that its configuration is not blocking Google.

URL errors overview

The URL errors section of the report is divided into categories that show the top 1,000 URL errors specific to that category. Not every error that you see in this section requires attention on your part, but it's important that you monitor this section for errors that can have a negative impact on your users and on Google crawlers. We've made this easier for you by ranking the most important issues at the top, based on factors such as the number of errors and pages that reference the URL. Specifically, you'll want to consider the following:

  • Fix Not Found errors for important URLs with 301 redirects. While it's normal to have Not Found (404) errors, you'll want to address errors for important pages linked to by other sites, older URLs you had in your sitemap and have since deleted, misspelled URLs for important pages, or URLs of popular pages that no longer exist on your site. This way, the information that you care about can be easily accessed by Google and your visitors.
  • Update your sitemaps. Prune old URLs from your sitemaps, and if you add newer sitemaps that are intended to replace older ones, be sure to delete the old sitemap (don't redirect it to the newer one).
  • Keep redirects clean and short. If you have a number of URLs that redirect in a sequence (e.g. pageA > pageB > pageC > pageD), it can be challenging for Googlebot to follow and interpret the sequence. Try to keep the "hops" to a low number. Read more about Not followed errors.
  • Make sure deep links to your Android Apps are properly configured.  You can read more about this on the App Indexing for Google Search site.

Viewing URL error details

You can view URL errors in a variety of ways:

  • Click Download to retrieve a list of the top 1,000 errors for that crawler type (e.g. desktop, smartphone).
  • Use the filter above the table to locate specific URLs.
  • See error details by following the link from individual URLs or Application URIs.
Desktop or phone URL error details show status information about the error, a list of pages that reference the URL, and a link to Fetch as Google so you can troubleshoot problems with that URL.

Mark URL errors as fixed

Once you've addressed the issue causing an error for a specific item, you can hide it from the list. You can do this singly or in bulk. Select the checkbox next to the URL, and click Mark as fixed. The URL will be removed from the list.

If the issue remains unresolved, the URL will reappear in the list the next time Google crawls your site, even if you have marked it as fixed.

URL error types

Common URL errors
Error Type Description
Server error

When you see this kind of error for your URLs, it means that Googlebot couldn't access your URL, the request timed out, or your site was busy. As a result, Googlebot was forced to abandon the request.

Read more about server connectivity errors.

Soft 404

Usually, when a visitor requests a page on your site that doesn't exist, a web server returns a 404 (not found) error. This HTTP response code clearly tells both browsers and search engines that the page doesn't exist. As a result, the content of the page (if any) won't be crawled or indexed by search engines.

A soft 404 occurs when your server returns a real page for a URL that doesn't actually exist on your site. This usually happens when your server handles faulty or non-existent URLs as "OK," and redirects the user to a valid page like the home page or a "custom" 404 page.  

This is a problem because search engines might spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage because your real, unique URLs might not be discovered as quickly or visited as frequently due to the time Googlebot spends on non-existent pages.

We recommend that you configure your server to always return either a 404 (Not found) or a 410 (Gone) response code in response to a request for a non-existing page. You can improve your visitors' experience by setting up a custom 404 page when returning a 404 response code. For example, you could create a page containing a list of your most popular pages, or a link to your home page, or a feedback link. But it's important to remember that it's not enough to just create a page that displays a 404 message. You also need to return the correct 404 or 410 HTTP response code.
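As a minimal sketch of the recommendation above, here is a WSGI-style handler (the page registry is hypothetical, standing in for a real router) that serves a helpful custom error page while still returning a genuine 404 status code:

```python
def app(environ, start_response):
    # Hypothetical page registry standing in for a real router.
    pages = {"/": b"<h1>Home</h1>", "/contact": b"<h1>Contact us</h1>"}
    body = pages.get(environ.get("PATH_INFO", "/"))
    if body is None:
        # A friendly page for humans, but with a real 404 status code,
        # so search engines never treat this as a soft 404.
        start_response("404 Not Found", [("Content-Type", "text/html")])
        return [b"<h1>Page not found</h1><p><a href='/'>Back to home</a></p>"]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [body]
```

The key point is in the `if body is None` branch: the custom content and the 404 status code are sent together, rather than redirecting to the homepage with a 200.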

404

Google discovers content by following links from one page to another. Generally, a Not Found status error (usually a 404 HTTP status code) is returned when Googlebot attempts to visit a page that doesn't exist—either because you deleted or renamed it without redirecting the old URL to a new page, or because of a typo in a link.

Dealing with Not Found errors

Generally, 404 errors don't impact your site's ranking in Google, and you can safely ignore them. Typically, they are caused by typos, misconfigurations (for example, for links that are automatically generated by a content management system) or by Google's increased efforts to recognize and crawl links in embedded content such as JavaScript.  Here are some pointers to help you investigate:

  • See where invalid links are coming from by viewing the Linked from these pages section, which you reach by clicking the URL.
  • Fix or delete links from your own site.
  • Capture intended traffic from misspelled links on other sites with 301 redirects.
    For example, a misspelling of a legitimate URL (www.example.com/redshuz instead of www.example.com/redshoes) probably happens when someone intended to link to your site and simply made a typo. In this case, you can capture that misspelled URL in your server configuration and create a 301 redirect to the correct URL. You can also contact the webmaster of a site with an incorrect link, and ask for the link to be updated or removed.
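The misspelled-URL tip above might look like this in server-side routing logic. The redirect map is hypothetical; real deployments usually configure this in the web server rather than application code:

```python
# Hypothetical map of known misspellings to the real page.
REDIRECTS = {"/redshuz": "/redshoes"}

def route(path):
    """Return a (status, location) pair for a request path."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target  # permanent redirect to the correct URL
    return 200, path        # serve the page as requested
```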

404s are a perfectly normal (and in many ways desirable) part of the web. You will likely never be able to control every link to your site, or resolve every 404 error listed in Search Console. Instead, check the top-ranking issues, fix those if possible, and then move on.

When to return a 404 status code

When you remove a page from your site, think about whether that content is moving somewhere else, or whether you no longer plan to have that type of content on your site. 

  • When moving content to a new URL, redirect the old URL to the new URL—that way when users come to the old URL looking for that content, they'll be automatically redirected to something relevant to what they were looking for.
  • When you permanently remove content without intending to replace it with newer, related content, let the old URL return a 404 or 410. Currently Google treats 410s (Gone) the same as 404s (Not found). 

Returning a code other than 404 or 410 for a non-existent page (or redirecting users to another page, such as the homepage, instead of returning a 404) can be problematic. Such pages are called soft 404s, and can be confusing to both users and search engines.

Unexpected 404 errors

In Crawl Errors, you might occasionally see 404 errors for URLs you don't believe exist on your own site or on the web. These unexpected URLs might be generated by Googlebot trying to follow links found in JavaScript, Flash files, or other embedded content.

For example, your site may use the following code to track file downloads in Google Analytics:


<a href="helloworld.pdf"
  onClick="_gaq.push(['_trackPageview','/download-helloworld']);">
  Hello World PDF</a>

When Googlebot sees this code, it might try to crawl the URL http://www.example.com/download-helloworld, even though it's not a real page. In this case, the link may appear as a 404 (Not Found) error in the Crawl Errors feature in Search Console.

Google strives to detect these types of issues and resolve them so that they will disappear from Crawl Errors. 

Access denied

In general, Google discovers content by following links from one page to another. To crawl a page, Googlebot must be able to access it. If you're seeing unexpected Access Denied errors, it may be for the following reasons:

  • Googlebot couldn't access a URL on your site because your site requires users to log in to view all or some of your content.
  • Your robots.txt file is blocking Google from accessing your whole site or individual URLs or directories.
  • Your server requires users to authenticate using a proxy, or your hosting provider may be blocking Google from accessing your site.

To fix:

  • Test that your robots.txt is working as expected and does not block Google. The Test robots.txt tool lets you see exactly how Googlebot will interpret the contents of your robots.txt file. The Google user-agent is Googlebot. 
  • Use Fetch as Google to understand exactly how your site appears to Googlebot. This can be very useful when troubleshooting problems with your site's content or discoverability in search results.
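You can also check locally how a given user agent is affected by your robots.txt rules using Python's standard-library parser (the rules below are illustrative):

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /private/",
])

# Googlebot is blocked from /private/ but allowed everywhere else.
blocked = rp.can_fetch("Googlebot", "http://www.example.com/private/page.html")
allowed = rp.can_fetch("Googlebot", "http://www.example.com/index.html")
```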
Not followed

The Not followed section lists URLs that Google could not completely follow, along with some information as to why. Here are some reasons why Googlebot may not have been able to follow URLs on your site:

Flash, JavaScript, active content

Some features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash can make it difficult for search engines to crawl your site. Check the following:

  • Use a text browser such as Lynx to examine your site, since many search engines see your site much as Lynx would. If features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.
  • Use Fetch as Google to see exactly how your site appears to Google.
  • If you use dynamic pages (for instance, if your URL contains a ? character), be aware that not all search engine spiders crawl dynamic and static pages. In general, we recommend keeping parameters short and using them sparingly. If you're confident about how parameters work for your site, you can tell Google how we should handle them.

Redirects

  • If you are permanently redirecting from one page to another, make sure you're returning the right HTTP status code (301 Moved Permanently).
  • Where possible, use absolute rather than relative links. (For instance, when linking to another page in your site, link to www.example.com/mypage.html rather than simply mypage.html).
  • Try to make every page on your site reachable from at least one static text link. In general, minimize the number of redirects needed to follow a link from one page to another.
  • Check that your redirects point to the right pages! Sometimes we discover redirects that point to themselves (resulting in a loop error) or to invalid URLs.
  • Don't include redirected URLs in your Sitemaps.
  • Keep your URLs as short as possible. Make sure you aren't automatically appending information (such as session IDs) to your redirect URLs.
  • Make sure your site allows search bots to crawl your site without session IDs or arguments that track their path through the site.
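Keeping hops low, as suggested above, can be checked with a small sketch that walks a redirect map and flags loops or long chains. The map shape and hop limit are illustrative:

```python
def follow_redirects(start, redirect_map, max_hops=5):
    """Walk a chain of redirects, reporting loops and excessive hops."""
    chain = [start]
    url = start
    while url in redirect_map:
        url = redirect_map[url]
        if url in chain:
            return "loop", chain          # redirect cycle detected
        chain.append(url)
        if len(chain) - 1 > max_hops:
            return "too many hops", chain  # chain longer than the limit
    return "ok", chain
```

Feeding it the redirect targets you see in your server configuration quickly reveals pageA > pageB > pageC style sequences worth collapsing into a single hop.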
DNS error

When you see this error for URLs, it means that Googlebot could either not communicate with the DNS server, or your server had no entry for your site.

Read more about DNS errors.

Mobile-only URL errors (Smartphone)
Error Description
Faulty redirects

The Faulty redirect error appears in the URL Errors section of the Crawl > Crawl Errors page under the Smartphones tab.

Some websites use separate URLs to serve desktop and smartphone users and configure desktop pages to direct smartphone users to the mobile site (e.g. m.example.com). A faulty redirect occurs when a desktop page incorrectly redirects smartphone users to a smartphone page not relevant to their query. A typical example of this occurs when all desktop pages redirect smartphone users to the homepage of the smartphone-optimized site.

This kind of redirect disrupts users' workflow and can cause them to stop using the site and look elsewhere, so when our systems detect that smartphone results redirect to a homepage instead of a relevant URL, the Search results provide a note to the user:

May open the site's home page.

A user can still access the link by clicking Try Anyway. Even if a user perseveres and finds the correct page on the smartphone-optimized site, an irrelevant redirect makes them work harder to find your page on a slower mobile network. In addition to frustrating users, faulty redirects can cause problems with our crawling, indexing, and ranking algorithms.

Following are some tips to help you create a mobile-friendly search experience and avoid faulty redirects:

  • Do a few searches on your own phone (or set your browser to act like a smartphone) to see how your site behaves.
  • Use the example URLs provided in the report as a starting point to debug exactly where the problem is with your server configuration.
  • Set up your server so that it redirects smartphone users to the equivalent URL on your smartphone site.
  • If a page on your site doesn't have a smartphone equivalent, keep users on the desktop page, rather than redirecting them to the smartphone site's homepage. Doing nothing is better than doing something wrong in this case.
  • Consider using responsive web design, which serves the same content for desktop and smartphone users.
  • Finally, read our recommendations for having separate URLs for desktop and smartphone users.
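The redirect advice above boils down to one rule: redirect only when a true smartphone equivalent exists, and otherwise keep the user on the desktop page. A hypothetical sketch of that decision:

```python
# Hypothetical desktop-to-mobile URL map.
MOBILE_MAP = {"/products/shoes": "http://m.example.com/products/shoes"}

def smartphone_target(desktop_path):
    """Return the mobile URL to redirect to, or None to stay put.

    Returning None (no redirect) is better than sending every
    smartphone user to the mobile site's homepage.
    """
    return MOBILE_MAP.get(desktop_path)
```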
URLs blocked for smartphones

The "Blocked" error appears on the Smartphone tab of the URL Errors section of the Crawl > Crawl Errors page. If you get the "Blocked" error for a URL on your site, that means that the URL is blocked for Google's smartphone Googlebot in your site's robots.txt file.

This may not necessarily be a smartphone-specific error (for example, the equivalent desktop pages may also be blocked). However, it often indicates that the robots.txt file needs to be modified to allow crawling of smartphone-enabled URLs. When the smartphone-enabled URLs are blocked, the mobile pages can't be crawled and because of this, they may not appear in search results.

If you get the "Blocked" smartphone crawl error for URLs on your site, examine your site's robots.txt file and make sure that you are not inadvertently blocking parts of your site from being crawled by Googlebot for smartphones.

For more information, see our recommendations.

Flash content

The Flash content error appears in the URL Errors section of the Crawl > Crawl Errors page under the Smartphones tab.

Our algorithms list URLs in this section as having content rendered mostly in Flash. Many devices cannot render these pages because Flash is not supported by iOS or Android versions 4.1 and higher. In addition, for these URLs, users of these operating systems see the following notice in Google Search results:

Uses Flash. May not work for your device.

We recommend that you improve the mobile experience for your website by using responsive web design for your site, a practice recommended by Google for building search-friendly sites for all devices.  You can learn more about this in Web Fundamentals, a comprehensive resource for multi-device web development.

Whichever approach you take to address this issue, be sure to allow Googlebot access to all assets of your site (CSS, JavaScript, and images) and do not block them with robots.txt or by other means. Our algorithms need these external files to detect your site's design configuration and treat it appropriately. You can make sure our indexing algorithms have access to your site by using the Fetch as Google feature in Search Console.

News-only errors

In order to view error reports specific to Google News, news publishers need to be included in Google News, create a Search Console account, and add their site to it. Once these steps are done, follow the steps below in Search Console:

  • On the Home page, click the site's URL.
  • On the Dashboard, click Crawl > Crawl Errors.
  • Click on the News tab to see crawl errors for your news content.
  • Crawl errors are organized into categories, such as "Article extraction" or "Title error." Clicking one of these categories will display a list of affected URLs and the crawl errors they're generating.
Note: Please keep in mind that our news index is compiled by computer algorithms. While we strive to include as much of your content as possible, we can't guarantee the inclusion of every single article. We appreciate your understanding.
Error Description
Article disproportionately short

The article body that we extracted from the HTML page is too small when compared to other clusters of text without links on the page. This applies to most pages that contain news briefs or multimedia content, rather than full news articles. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

This problem is often caused by:

  • Too many snippets for related articles - to help our extractor, consider making these snippets clickable.
  • Features such as 'Send this article to friends' with long descriptions - consider setting a "display:none" or "visibility:hidden" style to make the text invisible, or writing those pieces of HTML dynamically with JavaScript.
  • User comments - consider enclosing the comments in an iframe, dynamically fetching them with AJAX, or moving them to an adjacent page.
Article fragmented

The article body that we extracted from the HTML page appears to consist of isolated sentences not grouped together into paragraphs. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

  • Check that your paragraphs are formatted such that each is more than one sentence in length.
  • Make sure your sentences are well punctuated.
  • Make sure you don't use frequent <br> and <p> tags within your paragraphs, and try to avoid breaking up the article body in general.
  • Consider removing some of the non-article text from the article page.
Article too long

The article body that we extracted from the HTML page appears to be too long to be a news article. We generated this error to avoid including what might be an incorrect piece of text. Common causes include news articles that contain user-contributed comments below the article, or HTML layouts that contain other material besides the news article itself.

Recommendations

Consider removing some of the non-article text from the article page. If the article page contains user comments, consider one of the following options:

  • enclosing them in an iframe.
  • dynamically fetching them with AJAX.
  • moving part of the comments to an adjacent page.
Article too short

The article body that we extracted from the HTML page appears to contain too few words to be a news article. This applies to most pages that contain news briefs or multimedia content, rather than full news articles. We generated this error to avoid including what might be an incorrect piece of text.

Recommendations

  • Try formatting your articles into text paragraphs of a few sentences each. If the article content appears to contain too few words to be a news article, we won't be able to include it.
  • Make sure your articles have more than 80 words.
Date not found

We were unable to determine the publication date of the article.

Recommendations

Follow the date formatting recommendations below:

  • Place a clear date and time for each of your articles between the article's title and the article's text, on a separate line of HTML. The date should specify when the article was first published.
  • Remove any other dates from the HTML of the article page so that the crawler doesn't mistake them for the correct publication time.
  • If you'd like to use a date metatag, please contact us first. Date meta tags should be of the form: <meta name="DC.date.issued" content="YYYY-MM-DD">, where the date is in W3C format, using either the "complete date" (YYYY-MM-DD) format, or the "complete date plus hours, minutes and seconds" (YYYY-MM-DDThh:mm:ssTZD) format with a time zone suffix.
  • Create a News Sitemap. The <publication_date> tag will ensure we're able to pick the correct date for your articles.
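The first and third recommendations above might look like this in a page's source (the date, title, and body text are illustrative; as noted, contact Google News before relying on the date metatag):

```html
<head>
  <!-- Optional date metatag, W3C "complete date" format -->
  <meta name="DC.date.issued" content="2023-04-01">
</head>
<body>
  <h1>Article title</h1>
  <!-- Publication date on its own line, between title and body -->
  <p>April 1, 2023, 9:30 a.m.</p>
  <p>Article text begins here…</p>
</body>
```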
Date too old

The date that we determined for this article, either from a <publication_date> tag in the Sitemap, or from a date in the page HTML itself, is too old.

Recommendations

  • Make sure your article is two days old or less. Currently we only collect articles published within the last two days.
  • Follow the date formatting recommendations above.
Empty article

The article body that we extracted from the HTML page appears to be empty.

Recommendations

  • Make sure that the full text of each of your articles is available in the source code of your article pages (and not embedded in a JavaScript file or iframe, for example).
  • Make sure that you're not using a style in the source code of your articles such as "display:none" or "visibility:hidden".
  • Make sure the links to your articles lead directly to your article pages rather than to an intermediate page that uses a JavaScript redirect.
Extraction failed

We were unable to extract the article from the page. Extractions fail when we are unable to identify a valid title, body, and timestamp for the article. We list URLs with this error to provide you with information regarding why some articles may not appear in Google News.

Recommendations

  • Make sure that your title, body, and timestamp are easily crawlable (available as text rather than as images, for instance). At this time, this error is primarily for informational purposes; we are actively working to improve our extraction methods so that you'll see it less often.
  • Submit a News Sitemap.
No sentences found

The article body that we extracted from the HTML page appears not to contain punctuated sequences of contiguous words. We generated this error to avoid including what might be an incorrect section of text.

Recommendations

  • If the article content doesn't have punctuated sequences of contiguous words, we won't be able to include it in Google News. Make sure that the text of your articles is made up of sentences, and that you don't use frequent <br> or <p> tags within your paragraphs.
  • Make sure that the full text of each of your articles is available in the source code of your article pages (and not embedded in a JavaScript file, for example).
  • Make sure the links to your articles lead directly to your article pages rather than to an intermediate page that uses a JavaScript redirect.
Off-site redirect

The section or article page redirects to a URL on a different domain.

Recommendations

  • All section pages and articles must be located within the domain of the site included in Google News.
  • If you are not using off-site redirects, please make sure your site has not been modified by a third party. Read more about hacked sites.
Page too large

The section or article page length exceeds the maximum allowed.

Recommendation

  • The HTML source of the page can be up to 256 KB in size.
Title not allowed

The title that we extracted from the HTML page suggests that it is not a news article.

Recommendation

  • Often this problem can be fixed by setting the <title> tag on the HTML page to the title of the article, and repeating the title in a prominent place on the HTML page, such as in an <h1> tag. Read more about titles.
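Concretely, the fix described above can look like this (the headline is a made-up example): set the <title> tag to the article's title and repeat it in an <h1> in the body.

```html
<head>
  <title>Mayor opens new bridge across the river</title>
</head>
<body>
  <!-- Repeat the article title in a prominent place on the page -->
  <h1>Mayor opens new bridge across the river</h1>
  <p>Article text…</p>
</body>
```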
Title not found

We were unable to extract a title for the article from the HTML page.

Recommendations

  • Follow our title formatting recommendations.
  • To make sure your articles display properly on mobile devices, don't include a leading number (which sometimes corresponds to an access key) in the anchor text of the title.
Uncompression failed

Googlebot-News detected that the page was compressed, but was unable to uncompress it. This can be caused by a bad network connection, or by a misconfigured or misbehaving web server.

Recommendation

  • Please check your network and web server configuration.
Unsupported content type

The page had an HTTP content-type that is not supported by Google News.

Recommendation

  • Articles must have a content-type of text/html, text/plain, or application/xhtml+xml.