3/8/12
Original Poster
Aeth

Googlebot trying to FTP into server?

The secure log on a VPS I manage shows two sets of three anonymous FTP login attempts yesterday from what appears to be Googlebot, as follows (server details anonymised for security):

Line 164878: Mar 7 10:05:42 server proftpd[3795]: server.example.com (66.249.71.116[66.249.71.116]) - USER anonymous: no such user found from 66.249.71.116 [66.249.71.116] to (IP address removed):21

Line 164879: Mar 7 10:05:42 server proftpd[3795]: server.example.com (66.249.71.116[66.249.71.116]) - mod_delay/0.5: delaying for 121 usecs

Line 164880: Mar 7 10:05:42 server proftpd[3795]: server.example.com (66.249.71.116[66.249.71.116]) - FTP session closed.

Line 164882: Mar 7 10:05:42 server proftpd[3796]: server.example.com (66.249.71.116[66.249.71.116]) - USER anonymous: no such user found from 66.249.71.116 [66.249.71.116] to (IP address removed):21

Line 164883: Mar 7 10:05:42 server proftpd[3796]: server.example.com (66.249.71.116[66.249.71.116]) - mod_delay/0.5: delaying for 74 usecs

Line 164884: Mar 7 10:05:42 server proftpd[3796]: server.example.com (66.249.71.116[66.249.71.116]) - FTP session closed.

66.249.71.116 appears to be crawl-66-249-71-116.googlebot.com, but I'm puzzled why a search bot would attempt an FTP login on a VPS. Is Googlebot's IP address being spoofed? If not, why would Googlebot try to log in to a VPS via FTP like this? (The VPS is set not to allow anonymous FTP.)
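
One common way to check whether an address like this really belongs to Googlebot is a reverse DNS lookup followed by a forward lookup that must return the same IP. The following is a minimal Python sketch of that check, not an official tool. The function name, the injectable lookup parameters, and the list of accepted hostname suffixes are assumptions made for illustration, and the forward lookup shown here only handles IPv4.

```python
import socket

# Assumption: hostnames ending in these suffixes are treated as Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google hostname that
    forward-resolves back to the same `ip`.

    The two lookup arguments are hypothetical hooks so the logic can be
    exercised without network access; by default real DNS is used.
    """
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        hostname = reverse_lookup(ip)
    except OSError:
        # No PTR record or a resolver failure: cannot be verified.
        return False

    # Step 1: the reverse name must sit under an accepted domain.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    # Step 2: the forward lookup of that name must include the original IP.
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

Calling `is_verified_googlebot("66.249.71.116")` with no extra arguments performs live DNS queries, so the result depends on the DNS records at the time of the call.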
All Replies (4)
JohnMu
3/8/12
Hi Aeth

When we find links to FTP content, we'll generally attempt to crawl those URLs. If they're publicly accessible and return normal content, we may choose to index them as well. While it's not that common, there are occasionally queries where a file on an FTP server is a good result. For example, for the query https://www.google.com/search?q=ISO%2FIEC+8859-11 I currently see a link to ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-11.TXT as one of the top results.

If you wish to block crawling of a public FTP server, you can use the robots.txt file just as you would on a normal website. If your FTP server isn't publicly accessible, then you wouldn't need to do anything specific to prevent that content from being indexed (as it can't be accessed). 
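
As an illustration of the robots.txt approach described above, here is a small Python sketch that parses a robots.txt body and asks whether a crawler identifying itself as Googlebot would be allowed to fetch a given FTP URL. The host `ftp.example.com`, the file path, and the helper name `googlebot_may_fetch` are hypothetical. Python's standard-library parser only approximates how a real crawler interprets the file, so treat this as a quick sanity check rather than a definitive answer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt placed in the root of the FTP server:
# it asks Googlebot not to crawl anything.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def googlebot_may_fetch(url, robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt body allows Googlebot to fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    # The parser matches rules against the URL path, whatever the scheme.
    return parser.can_fetch("Googlebot", url)
```

For example, `googlebot_may_fetch("ftp://ftp.example.com/pub/file.txt")` evaluates the rules above against the path `/pub/file.txt`.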

Hope it helps!
John
3/10/12
Original Poster
Aeth
Hi John, we don't have any public FTP links at all on the server. We're just a message board and HTML files with images, which is why I'm extremely puzzled by this. We are by no means a public FTP server and have nothing available for anyone to download anywhere, so why would Googlebot think we have? We know there are no FTP files to be indexed, which is why I don't understand why Googlebot is attempting to log in via anonymous FTP.
JohnMu
3/10/12
Hi Aeth
I'm happy to double-check where we found those links if you can post the URL/hostname. Feel free to use a URL shortener if you prefer. Otherwise, especially if you're sure that there's no FTP content that is meant to be indexed, it's fine to just leave it like that -- our algorithms will generally recognize that there's nothing crawlable there and slow their fetch-attempts.

Cheers
John
--
11/30/12
--
How does one stop Googlebot from attempting FTP access at all? It still consumes bandwidth attempting to log in. Also, "robots.txt" is an HTTP thing; there's no such equivalent for FTP. John, are you saying that if Googlebot encounters a "robots.txt" file in an FTP root directory, it will honor the file?
 
Regardless, I still receive HTTP requests for "http://ftp....." and HTTP requests for a "/robots.txt" under that.  Does googlebot want an FTP server's robots.txt file under the HTTP protocol?
 