Mar 17, 2021
Limit pagination crawling via Robots.txt on Forums
www.site.com/forum/thread-name?page=1
The number of pages in a thread is unbounded, so we could hypothetically have an unlimited number of pages:
www.site.com/forum/thread-name?page=1500
We'd like to limit or prevent Google from crawling all this pagination, as the content tends to digress anyway. I've been toying around with regex to understand whether this can be done via robots.txt, and tried the rule below to limit Google's crawl to pages 1 through 9; however, the rule doesn't work in the robots.txt Tester:
Disallow: /forum/*?page=[1-9]
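
From what I can tell, robots.txt doesn't support regex character classes like [1-9] at all; only the * and $ wildcards are recognised. The closest workaround I can think of is a blanket Disallow on the page parameter plus more specific Allow rules for the single-digit pages, relying on Google picking the longest matching rule (with Allow winning ties). A rough sketch, assuming the paginated URLs always end right after the page number:

User-agent: *
Disallow: /forum/*?page=
Allow: /forum/*?page=1$
Allow: /forum/*?page=2$
Allow: /forum/*?page=3$
Allow: /forum/*?page=4$
Allow: /forum/*?page=5$
Allow: /forum/*?page=6$
Allow: /forum/*?page=7$
Allow: /forum/*?page=8$
Allow: /forum/*?page=9$

The $ anchor should be what keeps /forum/thread-name?page=15 blocked: it matches the Disallow but not the page=1 Allow, since that Allow requires the URL to end immediately after the 1. This would break if other query parameters can follow page=, and blocking crawling doesn't deindex pages that are already indexed, so I'm not certain it's the right approach, hence the question.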
Would love to hear your thoughts on this, potential implementations, and/or a better solution. Thanks in advance.
All Replies (6)