Common Product Crawl Issues

We routinely crawl your product pages and images to check for quality issues. If we cannot do this, we will be unable to show your items on Google Shopping.

The most common reasons for Product Crawl Issues are:

  • Page not found (404) error: The URL you provided was incorrect (e.g. it contained a typo), so the page returned a ‘Page not found (404)’ error. Please check that the URL is correct and that your website is live.
  • Server's robots.txt disallows access: A ‘robots.txt’ file on your server prohibits crawl access to the page. We do not crawl pages that are disallowed in this way. Please resolve this by configuring the ‘robots.txt’ file to allow our crawler.
  • Invalid URL: Your URL contains invalid characters or is not formatted as a valid link.
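
The first and third issues can be checked before you submit a feed. The sketch below uses Python's standard library to validate a URL's structure and to test a sample ‘robots.txt’ against a crawler's user agent. The domain, paths, and robots.txt rules are illustrative only, and Google's crawler may interpret robots.txt rules differently from Python's parser.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def is_valid_product_url(url):
    """Basic structural check: scheme must be http(s) and a hostname must be present."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

# A hypothetical robots.txt that blocks all crawlers from /products/
# but explicitly allows Googlebot everywhere.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /products/

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

print(is_valid_product_url("https://example.com/products/item-1"))  # True
print(parser.can_fetch("Googlebot", "/products/item-1"))            # True: explicitly allowed
print(parser.can_fetch("OtherBot", "/products/item-1"))             # False: blocked by the wildcard rule
```

Running a check like this over every URL in a feed catches malformed links and accidental robots.txt blocks before the crawl fails.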

Note: Once the issue has been resolved, your product may take up to 48 hours to be reinserted into Google Shopping.

There are a number of other issues that may also prevent us from crawling your page.

Common Issues
  • Page requires authentication: The URL provided is protected by an authentication mechanism that prevents Google from accessing the content.
  • HTTP 4xx response, HTTP 5xx response: The server hosting your website returned an HTTP error that prevented us from accessing the content.
  • Hostname not resolveable: We were unable to resolve the hostname of your server to an IP address, so we could not access the page.
  • Malformed HTTP response: The response from your server was malformed and could not be parsed.
  • Private IP: Your website is hosted behind a firewall or router on a private IP address, and we were unable to reach it.
  • Network error: A network error occurred while we were fetching the page.
  • Timeout reading page: The server took too long to return the page, so we abandoned the crawl of that product.
  • Server redirects too often: Your server redirected our crawler too many times, so the crawl was abandoned.
  • Redirect URL too long, Empty redirect URL, Bad redirect URL: The redirect URL your server returned was not valid, so we could not follow it.
  • Server's robots.txt unreachable, Timeouts reading robots.txt: We were unable to read your robots.txt file, so we could not crawl your page. See the Robots Exclusion Protocol for more information.
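
Many of the issues above surface as distinct exceptions when a page is fetched, so you can reproduce them yourself before a crawl fails. The sketch below uses Python's standard library to map common fetch failures onto the issue names used in this article; the mapping and the `try_fetch` helper are illustrative and are not how Google's crawler actually classifies errors.

```python
import socket
import urllib.error
import urllib.request

def classify_crawl_error(exc):
    """Map a fetch failure onto the issue names used above (illustrative mapping)."""
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code == 404:
            return "Page not found (404)"
        if 400 <= exc.code < 500:
            return "HTTP 4xx response"
        if 500 <= exc.code < 600:
            return "HTTP 5xx response"
    if isinstance(exc, urllib.error.URLError):
        if isinstance(exc.reason, socket.gaierror):
            return "Hostname not resolveable"
        if isinstance(exc.reason, socket.timeout):
            return "Timeout reading page"
    return "Network error"

def try_fetch(url, timeout=10):
    """Return (HTTP status, None) on success, or (None, issue name) on failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, None
    except (urllib.error.URLError, OSError) as exc:
        return None, classify_crawl_error(exc)
```

For example, `try_fetch("https://example.com/missing-item")` would return `(None, "Page not found (404)")` if the server responds with a 404, letting you spot and fix the broken URLs in a feed yourself.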