Comprehensive documentation of the robots.txt & robots meta tags
In the Robots.txt Specifications, you give an example of a valid robots.txt URL, http://www.müller.eu/robots.txt, which redirects to http://www.xn--mller-kva.eu/robots.txt.
You then say that it is valid for both www.müller.eu and www.xn--mller-kva.eu.
Does this mean that a domain which is redirected to another inherits the robots.txt of the target domain?
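For context on the question above, the two hostnames look like the Unicode and ASCII (punycode) spellings of the same host, which may be why the specification lists them together. A minimal sketch using Python's built-in `idna` codec (my choice for illustration; it says nothing about how Googlebot itself handles redirects between genuinely different domains):

```python
# Sketch: convert the Unicode hostname from the example to its ASCII form.
# Python's built-in "idna" codec is used here purely for illustration.
unicode_host = "www.müller.eu"
ascii_host = unicode_host.encode("idna").decode("ascii")

# If both spellings name one host, a single robots.txt would cover both.
print(unicode_host, "->", ascii_host)
```

If that reading is right, the example does not show one domain inheriting another's robots.txt, so the question still stands for ordinary cross-domain redirects.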
You say "The path value must start with "/" to designate the root. If a path without a beginning slash is found, it may be assumed to be there". You then give an example below and say that fish/ is equivalent to /fish/. However, the robots.txt testing tool in Webmaster Tools doesn't assume a / at the start of a directive. Is this an issue with the testing tool? Does the testing tool use the same code as the real Googlebot, or is it a simulator?
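To make the behaviour I expected from the quoted text concrete, here is a minimal sketch. The function names `normalize_rule_path` and `rule_matches` are hypothetical, and plain prefix matching is an assumption for illustration, not Googlebot's actual code:

```python
def normalize_rule_path(path: str) -> str:
    """Prepend '/' when a rule path lacks it, as the quoted spec text permits."""
    return path if path.startswith("/") else "/" + path


def rule_matches(rule_path: str, url_path: str) -> bool:
    """Assumed behaviour: a rule matches when the URL path starts with it."""
    return url_path.startswith(normalize_rule_path(rule_path))


# Under this reading, "fish/" and "/fish/" behave identically.
print(rule_matches("fish/", "/fish/salmon.html"))
print(rule_matches("/fish/", "/fish/salmon.html"))
```

The testing tool appears to skip the normalization step, so a rule written as fish/ matches nothing there.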
Finally, you give an example of the URL http://example.com/page.htm and say that Allow: /page and Disallow: /*.htm would result in an undefined outcome. The testing tool says that this example would be disallowed which follows the rule about "the most specific rule based on the length of the [path] entry will trump the less specific (shorter) rule". Does this mean that the testing tool is not consistent with the real Googlebot?
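The length-based rule I am referring to can be sketched as follows. This is only my reading of the quoted sentence: the helper names are hypothetical, the handling of `*` and a trailing `$` is assumed, and ties keep the first rule found. It is not a claim about how Googlebot or the testing tool is implemented:

```python
import re


def pattern_to_regex(path):
    """Translate a rule path ('*' wildcard, optional trailing '$') to a regex."""
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def decide(rules, url_path):
    """Return the verdict of the longest matching rule, or 'allow' if none match."""
    best = None
    for verdict, path in rules:
        if pattern_to_regex(path).match(url_path):
            # Assumption: "most specific" means the longest [path] entry.
            if best is None or len(path) > len(best[1]):
                best = (verdict, path)
    return best[0] if best else "allow"


rules = [("allow", "/page"), ("disallow", "/*.htm")]
print(decide(rules, "/page.htm"))
```

Here /*.htm is six characters and /page is five, so a purely length-based comparison picks the Disallow rule, which is what the testing tool reports, while the documentation calls the same case undefined.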
Thanks. Overall the documentation is very thorough and I have learned a lot.
In the robots.txt specification: http://code.google.com/web/controlcrawlindex/docs/robots_txt.html you mentioned that the following URL:
is *undefined* if robots.txt contains the following rules:
When you test this example in the Webmaster Tools “Test robots.txt” tool, though, the result is different and clearly gives the upper hand to Allow. See the result below:
* Allowed by line 4: Allow: /*.htm
Can you confirm which one is correct: the specification or the Webmaster Tools “Test robots.txt” tool?
Would it be possible to add a starring option? For example, on the API pages such as http://code.google.com/apis/accounts/ you can click the star to the left of the header ("Authentication and Authorization for Google APIs"), and the page will then show up in the "My favorites" dropdown (located to the right of your email/account login, in the top right part of the page).
1. include it in the Google Code Site Directory of Resources
2. support the starring option offered by Google Code.
I don't need this documentation very often, so I haven't bookmarked it, but when I do need it I haven't found a natural way of locating it. Each time I've had to do a web search to find this thread. (Maybe I just need to bookmark it, or bump this thread at regular intervals to try to make sure it never gets dropped from Google's index.)
Actually, I can't even find a reference to it in any Webmaster Help Center article; those articles only ever link to pages on www.robotstxt.org.
Hmmm, I guess that http://code.google.com/web/controlcrawlindex/ might just be a dumping ground, to be ignored in future.