Googlebot ignores robots.txt
Two months ago I disallowed some directories from crawling with robots.txt. Since then I have monitored the log files every day and found that Googlebot completely ignores robots.txt.

Really, it still crawls every single URL that it was crawling before the directories were disallowed.

The Search Console test shows all URLs from the disallowed directories as allowed!

Only the Search Console live test shows these URLs as disallowed; it says the disallow rules are correct and working.

Because crawling is disallowed, URLs from the disallowed directories should appear in the index without snippets. But they appear with snippets, even though their cache dates are from this week.

None of the basic rules I know about Google and websites seem to apply here.

Any ideas what could be happening here? Example url.
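
For readers who want to reproduce the "is this URL disallowed?" check outside Search Console, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, host, and paths are hypothetical placeholders, not the ones from this site. Note also that Python's parser is not Googlebot's: it has no wildcard support and uses first-match rather than most-specific-match precedence, so it is only a rough sanity check for simple prefix rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com and both paths are placeholders.
blocked_ok = is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html")
public_ok = is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html")
print(blocked_ok, public_ok)
```

If a check like this and the live test both report the URL as blocked, the rule itself is probably fine, and the open question becomes which copy of robots.txt the crawler is working from.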
All Replies (16)
Hi E. O.,

As far as I can see, this rule does block Googlebot, and it can't reach this URL.
Your page hasn't been crawled since 25 Jan, and as you say, the GSC live test tool reports that it is blocked by robots.txt. I would suggest waiting for some time.

In case you don't want this URL to appear in the SERP, using a noindex tag or putting it behind a captcha would be the best approach.
Correct, this blocking rule was set at the end of November 2019. Since then Google has crawled the site weekly, most recently on 25.01.2020, as the cache date shows.

The question is: why on earth does Google ignore the blocking rule and keep snippets in the SERP, when it should display no snippet because crawling is blocked?

Two rules that have been the way to go for any SEO for many years are being ignored.

No, thank you for the recommendation, but I want exactly what I have done: not noindex (which is something slightly different from blocking crawling) and not a captcha, login, etc. I simply want to block this URL from being crawled.
E.O.

Do you mind sharing any log file details? I'd like to escalate the issue but would need details to do so.
  • I have seen robots.txt files cached, so Google didn't see the new version for an extended period
  • I have seen user agents maliciously using Googlebot's name that ignore robots.txt
  • Some user agents don't honor robots.txt: https://support.google.com/webmasters/answer/1061943?hl=en
  • I've never seen a case of Googlebot ignoring robots.txt; if it were actually happening, Google would really want to know
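
To gather the kind of log details asked for above, something like the following sketch could pull out requests that claim to be Googlebot and hit a disallowed path after the rule went live. It assumes the Apache/Nginx "combined" log format, which is a guess about this server; the sample lines, prefixes, and cut-off date are made up for illustration, and `googlebot_hits` is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone

# Assumed "combined" access-log format; adjust if the server logs differently.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)


def googlebot_hits(lines, disallowed_prefixes, since):
    """Return (timestamp, ip, path) for Googlebot-named requests to blocked paths."""
    hits = []
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "Googlebot" not in m["ua"]:
            continue  # unparseable line, or a different user agent
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        if ts >= since and m["path"].startswith(tuple(disallowed_prefixes)):
            hits.append((ts, m["ip"], m["path"]))
    return hits


# Made-up sample lines, not taken from the poster's logs.
UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample = [
    f'66.249.66.1 - - [25/Jan/2020:10:00:00 +0000] "GET /private/a.html HTTP/1.1" 200 512 "-" "{UA}"',
    f'66.249.66.1 - - [25/Jan/2020:10:01:00 +0000] "GET /public/b.html HTTP/1.1" 200 512 "-" "{UA}"',
    '203.0.113.9 - - [25/Jan/2020:10:02:00 +0000] "GET /private/a.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]
hits = googlebot_hits(sample, ["/private/"], datetime(2019, 12, 1, tzinfo=timezone.utc))
for ts, ip, path in hits:
    print(ts.isoformat(), ip, path)
```

The user-agent string alone proves nothing, since anyone can send it, so the IPs in the result would still need to be verified separately.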
 
@OptimistPrime Sure, but not until Monday, when I'm back in the office.

I verify all log file entries with a reverse DNS lookup to keep only "real" Googlebots (as Google recommends), so I'm pretty sure they aren't fakes.
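
The reverse DNS check mentioned above can be sketched roughly as follows: look up the hostname for the IP, confirm it is under a Google crawler domain, then resolve that hostname forward and confirm it maps back to the same IP. The function name `is_real_googlebot` and the injectable resolver parameters are illustrative choices, not a Google-provided API, and the suffix list is an assumption that should be checked against Google's current documentation. The forward lookup used here is IPv4-only.

```python
import socket

# Assumed hostname suffixes for Google's crawlers; verify against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_real_googlebot(ip, reverse=socket.gethostbyaddr, forward=socket.gethostbyname_ex):
    """Forward-confirmed reverse DNS check for a claimed Googlebot IP."""
    try:
        hostname = reverse(ip)[0]  # (hostname, aliases, addresses)
    except OSError:
        return False  # no PTR record or lookup failure
    if not hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]  # IPv4 addresses for the hostname
    except OSError:
        return False
    # The forward lookup must lead back to the IP that made the request.
    return ip in addresses
```

Calling `is_real_googlebot("66.249.66.1")` performs live DNS lookups; the two resolver parameters exist only so the logic can be exercised offline with stub functions.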
This question is locked and replying has been disabled.