May 31, 2023

Too many POST requests from Googlebot

Hi there!
Today I found many POST requests from Googlebot (and Bingbot too) to my WordPress website.

EXAMPLE LOGS
POST /?wc-ajax=get_refreshed_fragments HTTP/1.1" 200 281 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/113.0.5672.126 Safari/537.36" "66.249.64.104"
POST /?wc-ajax=get_refreshed_fragments HTTP/1.1" 200 281 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/112.0.0.0 Safari/537.36" "40.77.202.17"

Too many requests overloaded my server, so I have blocked POST requests from Googlebot to keep things running smoothly.
But I don't understand what this means; Google's crawler should be using GET requests, not POST requests like this.
Is this an update from Google? Does blocking POST requests affect SEO?
Hope to get an answer <3
Recommended Answer
May 31, 2023
I imagine it depends on how well the 'client' pages handle that request being blocked.

If they 'gracefully' handle the 403, and the page renders OK despite the error, then it's probably fine.

But if the pages 'freak out' over not getting a proper reply to the POST request, then it could affect the indexability of the page (e.g. if the page ends up triggering a JavaScript exception, it might halt further JS processing, such that the page doesn't render 'well').

As the requests seem to be something to do with WooCommerce, perhaps disabling the requests from being made in the first place would be even better, i.e. prevent the HTTP request from ever being made; see the sketch below.
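
For illustration, WooCommerce's cart-fragments script is what fires this AJAX call, and it can be dequeued. A minimal sketch, assuming a reasonably current WooCommerce where the script handle is 'wc-cart-fragments' (it would go in the theme's functions.php or a small plugin):

<?php
// Stop WooCommerce from enqueueing the cart-fragments script, which is
// what fires the POST to /?wc-ajax=get_refreshed_fragments on each page.
// Trade-off: the mini-cart won't auto-refresh on cached pages without it.
add_action( 'wp_enqueue_scripts', function () {
    wp_dequeue_script( 'wc-cart-fragments' );
}, 20 ); // priority 20, so this runs after WooCommerce enqueues the script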
Original Poster Myfrogtee marked this as an answer
All Replies (2)
Recommended Answer
May 31, 2023
Hi,

This will be from the rendering of your pages, i.e. when Google executes the JavaScript, like a user would.

And yes, it can be a problem if what that API call returns is used to build an important part of your content.

Use the URL Inspection tool and do a live test, then check the rendered HTML: is all your content in there? If so, you're fine to leave this blocked; if not, then unblock.

This isn't new; Google has made POST requests for resources, when JavaScript requests them, for a long time now. Perhaps something changed on your site that now triggers this, if you hadn't noticed it before?
Original Poster Myfrogtee marked this as an answer
May 31, 2023
Hi dwsmart!
Thank you for your help.
I have tested and it works fine. Recently I discovered a lot of requests from a fake Googlebot: these requests claim Googlebot in the user agent but come from non-Google IPs. Maybe it's not Googlebot slowing down my server but a large amount of fake-Googlebot traffic causing it.


May 31, 2023
Blocking a 'fake' Googlebot is a very different thing to blocking requests from real clients.

I suggest looking at the two different issues separately; don't conflate them.
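
One way to tell them apart: Google's documented verification for Googlebot is a reverse DNS lookup on the requesting IP (the hostname should end in googlebot.com or google.com), followed by a forward lookup that must return the same IP. A rough PHP sketch of that check (the function name is mine, just for illustration):

<?php
// Rough check that an IP really belongs to Googlebot: reverse DNS must
// end in googlebot.com or google.com, and the forward lookup of that
// hostname must return the same IP.
function looks_like_real_googlebot( $ip ) {
    $host = gethostbyaddr( $ip );
    if ( $host === false || $host === $ip ) {
        return false; // no usable reverse DNS record
    }
    if ( ! preg_match( '/\.(googlebot|google)\.com$/i', $host ) ) {
        return false; // resolves to some other domain
    }
    return gethostbyname( $host ) === $ip; // forward-confirm
}

// The IP from the logs above should pass if it is really Google:
var_dump( looks_like_real_googlebot( '66.249.64.104' ) );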
Recommended Answer
May 31, 2023
Well, both Googlebot and Bingbot 'render' pages, i.e. load pages up in a headless browser.

These are probably XMLHttpRequest requests being made by the page during rendering. You can't usually separate XMLHttpRequest requests from other requests in the access log, but the 'ajax' in the URL kinda reinforces that.
Original Poster Myfrogtee marked this as an answer
May 31, 2023
Hmm, a clear 403 sounds better than 'denying' the connection. That could clog things up with terminated connections, and might not even work with HTTP/2 anyway.

A redirect is a terrible idea. Redirects are for when content has been moved, and you haven't 'moved' the AJAX response to the homepage. Clients will probably blindly follow the redirect, causing even more 'wasteful' HTTP requests.

Stick with the 403 (something like the sketch below), or even better, stop the requests from being made in the first place.
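
For what it's worth, a clean 403 at the application level could look something like this. A minimal sketch only, assuming a WordPress mu-plugin; note it matches on the user-agent string, so it would catch the fake bots as well as the real ones:

<?php
// Return a clean 403 for crawler POSTs to the wc-ajax endpoint,
// rather than dropping the connection or redirecting.
add_action( 'init', function () {
    $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
    if ( 'POST' === ( $_SERVER['REQUEST_METHOD'] ?? '' )
        && isset( $_GET['wc-ajax'] )
        && preg_match( '/Googlebot|bingbot/i', $ua ) ) {
        status_header( 403 ); // WordPress helper to set the response code
        exit;
    }
}, 0 );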

May 31, 2023
Agree, redirects are a bad idea for sure.

Also agree you're better off blocking the fake traffic vs. trying to conditionally manage this one call.