2/10/20
How to set noindex for paginated pages?
Any suggestions or tips on how to set noindex for paginated pages in WordPress?

I'm not looking for a plugin solution.
All Replies (26)
2/10/20
You would probably need to set that in your theme's functions.php file if you don't want to use a plugin. This might point you in the right direction: https://wordpress.org/support/topic/noindex-for-archive-sub-pages/
2/10/20
Thank you.
Do you know if there is a way to verify that this code works?

And do you know of a solution that does not involve plugins?

// Yoast SEO filter: return noindex for page 2 and beyond of a paginated archive.
add_filter( 'wpseo_robots', function ( $robots ) {
    if ( is_paged() ) {
        return 'noindex,follow';
    }
    return $robots;
} );

He mentions this, but it uses "wpseo_robots", so I guess it works only with the Yoast plugin. What about a more general solution?

If, for example, I was not using WordPress, how would I do that?

I want to learn the principles of how this works and how to make it work. I am tired of asking around, so I would like to understand how this is done.
2/10/20
This?

User-agent: *
Disallow: /page/
2/10/20
Well, the fundamentals are that pages you don't want indexed would have:
  • A robots meta tag with noindex in it, i.e. <meta name="robots" content="noindex">
    and / or
  • An X-Robots-Tag HTTP header with noindex, i.e. X-Robots-Tag: noindex
 
How you get those onto the page depends on how the site is built, be that a CMS like WordPress, a custom-coded page, a JavaScript-powered SPA framework like React, or even static HTML pages you build yourself.

There's nothing special about it being page 2, 3, etc. It's a URL, and the end methods are the same.
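
To make the principle concrete outside of WordPress, here is a minimal Python sketch. The function name `robots_directives` and its `page_number` parameter are made up for illustration, and it assumes your own site code already knows which page of a paginated listing it is rendering. It only builds the two signals described above; wiring them into your templates or response headers is specific to your stack.

```python
def robots_directives(page_number, directive="noindex, follow"):
    """Return the robots signals for one page of a paginated listing.

    Assumption (hypothetical convention): page_number is 1 for the first
    page, and only pages 2 and up should carry the directive.
    """
    if page_number <= 1:
        # First page: emit nothing, so it stays indexable by default.
        return {"meta_tag": None, "headers": {}}
    return {
        # Goes inside <head> of the rendered HTML.
        "meta_tag": f'<meta name="robots" content="{directive}">',
        # Alternatively (or additionally) sent as an HTTP response header.
        "headers": {"X-Robots-Tag": directive},
    }


second_page = robots_directives(2)
print(second_page["meta_tag"])
print(second_page["headers"])
```

Either signal on its own is enough; the meta tag is usually easier when you control the template, and the header is useful when you only control the server response.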
 
The 'magic' is in how you get them to appear where you want them to, and that's specific to the site / CMS. Just like if you wanted to add 'Deon Rules!' in an H1 on every paginated page, you'd need to work out how to manage that in your CMS.

There's perhaps a wider question as to whether and why you should noindex these pages, but I assume you have good reason, know what you want to achieve by doing so, and understand the possible downsides.
2/10/20
This?
 
User-agent: *
Disallow: /page/
 
That's a robots.txt rule. robots.txt blocks crawling, but not necessarily indexing. A blocked URL can still be partially indexed; it just won't be crawled. So although Google knows the URL is there, it will not know the content.
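
If you want to see which URLs a rule like that actually covers, here is a rough Python sketch using the standard library's `urllib.robotparser`. The example.com URLs and the `may_crawl` helper are made up for illustration, and Python's parser is only an approximation of how Googlebot reads robots.txt (it does plain prefix matching and has no wildcard support), so treat the result as a sanity check rather than proof.

```python
from urllib.robotparser import RobotFileParser

# The rule from the post above.
rules = """\
User-agent: *
Disallow: /page/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def may_crawl(url):
    """Return True if the parsed rules let Googlebot crawl this URL."""
    return parser.can_fetch("Googlebot", url)


# Hypothetical URLs for illustration only.
for url in (
    "https://example.com/",
    "https://example.com/page/2/",
    "https://example.com/category/news/page/2/",
):
    print(url, "crawlable" if may_crawl(url) else "blocked")
```

Note that the rule matches from the start of the path, so in this sketch a paginated category URL such as /category/news/page/2/ is not covered by Disallow: /page/. And whatever the rule blocks is only blocked from crawling, which also means a noindex tag on those pages could never be seen.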
 
It can lead to dead ends. Say you had a product that appeared on page 1, then moved to page 2 as you added new products. Effectively, it would no longer have internal links unless it was linked from other pages in the site. The same goes for articles in a category.

Orphaned pages can often struggle in search longer term, because they sit outside the internal link graph that Google builds by crawling pages and seeing their links. You might find they struggle to maintain rankings and even indexing.

That being said, the same is true of URLs linked to only from noindex pages.

If it's a page you care about ranking, make sure it's linked to from an indexable, canonical page somewhere.

Careful thought needs to be put into whether you want to noindex or block with robots.txt whole swathes of your site.
2/10/20
Hi,
thanks for this. I am going to read the link you sent me and do some experiments.

Could you tell me how I can verify that I set things up properly?

I will try to do some tests, but how can I check that I did it right (before waiting for Googlebot to come and maybe find some mistakes)?
2/10/20
Hi Deon,
 
Use the URL Inspection tool in Search Console on a page you'd expect to be noindexed, and do a live test. That should tell you if a noindex has been found.
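
For a quick local check before Googlebot visits, something like the following Python sketch can confirm that the noindex is actually in your output. The names `RobotsMetaParser` and `has_noindex` are made up for illustration, and the HTML strings stand in for pages you would fetch from your own site. It is deliberately simplified: it only looks for a plain noindex token and ignores user-agent-specific forms, so it complements the live test rather than replacing it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the content of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.append(attr_map.get("content", ""))


def has_noindex(html, headers=None):
    """Return True if the HTML or the response headers carry noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    values = list(parser.directives)
    # Also look at the X-Robots-Tag response header, if one was passed in.
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            values.append(value)
    return any(
        token.strip().lower() == "noindex"
        for value in values
        for token in value.split(",")
    )


# Stand-in pages for illustration only.
page_two = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
page_one = "<html><head><title>Blog</title></head></html>"

print(has_noindex(page_two))
print(has_noindex(page_one))
print(has_noindex(page_one, {"X-Robots-Tag": "noindex"}))
```

In practice you would pass in the body and headers of a real response from a paginated URL and from page 1, and check that only the paginated one reports a noindex.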
2/10/20
Hi,
You wrote that if I use the robots.txt disallow page function, Googlebot will not crawl the URLs, but if it doesn't crawl them, how can it index them?

Second, I am fascinated by your knowledge of this topic. Is this something you learned entirely from here: https://developers.google.com/search/docs/guides/get-started

Is that the best resource on the topic, or are there more?
Maybe I should find something that combines the Google docs with WordPress as a CMS, because I started reading it and I find it very generic and not specific to WordPress.

On a final note, I was wondering if there is any way I can find you on Google and, if this is your profession, whether I can hire you for a consultation on a specific issue I am having (of which the /page/ is just a minor part).
2/10/20
You wrote that if I use the robots.txt disallow page function, Googlebot will not crawl the URLs, but if it doesn't crawl them, how can it index them?
 
Usually from links, either internally in your site (there's normally a link to get to at least the next page), or external, i.e. people linking directly to one of the other pages.
 
If the URLs are live already, Google may have already crawled them and know about them from there.
 
Second, I am fascinated by your knowledge of this topic. Is this something you learned entirely from here: https://developers.google.com/search/docs/guides/get-started
 
Practical experience plays a part here I guess, but the developer guides are a good point of reference for the pure 'fact' of it, if maybe not the wider implications always.
 
Maybe I should find something that combines the Google docs with WordPress as a CMS, because I started reading it and I find it very generic and not specific to WordPress.
 
The specific hows are probably always better asked somewhere like the WordPress forums. The Google docs tend to be more concerned with 'This is what you need to output'; basically, Google doesn't really care about the behind-the-scenes stuff that makes that happen. They care about what's output.
 
On a final note, I was wondering if there is any way I can find you on Google and, if this is your profession, whether I can hire you for a consultation on a specific issue I am having (of which the /page/ is just a minor part).
 
I do do this for a living, and whilst I'm certainly not trying to hide and be anonymous, generally I like to keep forum stuff on the forum, but kind of you to ask :)
 
3/17/20
My site is webmaikl. It shows as indexed when I search site:mysite.com in Firefox, but when I search site:mysite.com in Google Chrome it shows nothing :( What should I do?
This question is locked and replying has been disabled.