Jan 8, 2023

Googlebot tries to load images from invalid URLs

Googlebot keeps trying to load images from URLs formed by appending the image's title to the URL of the page it is crawling. The images are lazy-loaded: a placeholder image is in the src attribute, the real URL is in data-src, and an IntersectionObserver copies the real URL into src.
The image also has schema.org properties, including contentUrl, which holds the image URL.
Last edited Jan 8, 2023
Recommended Answer
Jan 9, 2023
Thanks. I don't see anything obvious that would cause an issue.

What do you mean by "Googlebot keeps trying to load"? How are you seeing that?

The title, alt, and metadata for each image are the image's filename, including the .jpg extension. I wonder if Google is noticing that this text looks like an image URL and testing it out. Either way, for usability you should drop the .jpg from those text fields.
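
To make that guess concrete, here is a minimal Python sketch of what happens if a client treats a filename-like title as a relative URL. The page URL and filename are made up for illustration, and nothing here is confirmed Googlebot behaviour; it only shows standard relative URL resolution.

```python
from urllib.parse import quote, urljoin


def resolve_title_as_relative_url(page_url: str, title_text: str) -> str:
    """Resolve text such as an image title against the page URL,
    the way a client would if it mistook the text for a relative path."""
    # Percent-encode spaces; keeping "," "(" ")" literal is an assumption
    # about how the requesting client encodes the path.
    encoded = quote(title_text, safe=",()")
    return urljoin(page_url, encoded)


# Hypothetical page URL and image title.
print(resolve_title_as_relative_url(
    "https://example.com/malta-gallery/", "Harbour view (3).jpg"
))
# → https://example.com/malta-gallery/Harbour%20view%20(3).jpg
```

A title without a file extension is less likely to be mistaken for a path, which is one more reason to drop the .jpg.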

Do you have an image sitemap?

An example of one of these strange image URLs would help. 

Original Poster Alex 1187 marked this as an answer
All Replies
Jan 9, 2023
It would help to have an example page to look at. You can share it via a URL shortener like https://bitly.com if you want to keep it hidden.
Jan 10, 2023
Thanks very much, Tony. I will drop the .jpg; I think it might help Google not to treat those values as something resembling relative paths.
I do not have an image sitemap. I can see the requests Googlebot-Image makes in my web server logs; they look like this:

"GET /Hamrun--Il-Hamrun--Malta---Imsida--L-Imsida--Malta---Malta---Paola--Paola--Malta---Xaghra--Ix-Xaghra--Malta/Mdina,%20Malta%20-%20panoramio%20(9).jpg HTTP/1.1" 200 47626 "-" "Googlebot-Image/1.0" 443

"GET /Gourbeyre--Guadeloupe--Guadeloupe---Guadeloupe---Le-Moule--Guadeloupe--Guadeloupe---Petit-Canal--Guadeloupe--Guadeloupe/Pointe-%C3%A0-Pitre%20P%C3%A9licans%20(3).JPG HTTP/1.1" 200 38541 "-" "Googlebot-Image/1.0" 443

"GET /archaeology---beach/Amman--Amman--Jordan---Asia---Beirut--Beyrouth--Lebanon---Eilat--Southern-District--Israel---Manama--Manama--Bahrain/Dar-Kulayb--Southern-Governorate--Bahrain---archeological-sites---places/Holon--Tel-Aviv--Israel---gardens---places/Holon--Tel-Aviv--Israel---theatres---places/Holon--Tel-Aviv--Israel---forests---places/DSC%200049-2.jpg HTTP/1.1" 200 51669 "-" "Googlebot-Image/1.0" 443
Jan 10, 2023
That's Google's image bot.

I tested those URLs, and it seems your website returns success (200 OK) for URLs like that, including ones ending in .jpg. However, the response body is HTML, not an image.
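
If you want to spot this pattern yourself, a check along these lines might help. It is only a sketch: the function name and the list of extensions are my own assumptions, and you would feed it the path, status, and Content-Type header from whatever tool you use to fetch the URL.

```python
from posixpath import splitext

# Assumed list of image extensions; extend it to match your site.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def is_html_served_as_image(path: str, status: int, content_type: str) -> bool:
    """True when an image-looking URL answers 200 OK with an HTML body."""
    extension = splitext(path)[1].lower()
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return (
        status == 200
        and extension in IMAGE_EXTENSIONS
        and media_type == "text/html"
    )


print(is_html_served_as_image("/some-page/photo.JPG", 200, "text/html; charset=utf-8"))
# → True
```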

I suspect these are old image URLs that the image bot is retesting, and it is getting confused because they return 200 OK with an HTML body.

You should have your website return a 404 (Not Found) status for invalid URLs.
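
I don't know what your site runs on, so the following is just a rough Python WSGI sketch of the idea, with a hypothetical KNOWN_PAGES table standing in for your real routing: anything that is not a known page gets a 404 instead of falling through to an HTML page with 200 OK.

```python
# Hypothetical routing table; a real site would look pages up in its
# own router or database instead.
KNOWN_PAGES = {
    "/": b"<h1>Home</h1>",
    "/gallery/": b"<h1>Gallery</h1>",
}


def app(environ, start_response):
    """Minimal WSGI app that serves known pages and 404s everything else."""
    path = environ.get("PATH_INFO", "/")
    body = KNOWN_PAGES.get(path)
    if body is None:
        # Unknown path, e.g. /gallery/Some%20title.jpg: report it as missing.
        start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"Not Found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]
```

The same rule applies on any stack: only respond with 200 when the requested path maps to real content.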