Jan 2, 2022

Many pages discovered not indexed for months

I've been trying to optimise our website for the last month and am unable to get the list of pages "Discovered not indexed" down. It just seems to keep growing, and some pages have been in there for over three months. We are up to 30 pages now, and many are our bookings pages, so this will hurt the business.

I have read the general guidance and know there may be issues with server performance, content, backlinks or structure but I don't know which to focus on. I can't see any pattern to which pages get indexed and which don't.

- crawl stats show drop in daily crawls since mid Nov, not sure why but seems likely related
- crawl stats show average response time ~1.6 seconds. Occasional blips with server connectivity (~5% once a month). No idea if this is average or bad?
- Site has >150 backlinks, only to a few pages though
- I am moving pages around with the correct 301 redirects to a better URL structure but it has been weeks since I did the biggest change
- I have been adding structured markup for Event pages (they were marked as Product). These all check OK, with just two optional fields not present (we don't have performers or offers). Again, it has been weeks since the main change.
- I have added a few new robots.txt disallow rules as the crawler was finding some non-content URLs (API calls), hoping this lets it use its crawl budget on content pages.
- I know page performance timings are not great, but not sure if that may be it?
- sitemaps all read ok and pages listed.
- I checked internal page links and a lot of the pages are shown in the live test as having no detected internal links, but that's also true for many of the pages that are indexed. Not sure if I should really spend time figuring out why Google doesn't find the internal links.
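
One way to sanity-check the new robots.txt disallow rules mentioned above is to test a few URLs against them locally. This is only a minimal sketch using Python's standard `urllib.robotparser`; the rules and URLs are hypothetical placeholders, not the real site's. Note that this parser does simple prefix matching and may not treat wildcards the way Google's crawler does.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only (not the site's real robots.txt).
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
"""

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given rules would stop `agent` from fetching."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]

# Hypothetical URLs: one content page, one API call.
urls = [
    "https://www.example.com/bookings/summer-workshop",
    "https://www.example.com/api/availability?id=42",
]
print(blocked_urls(ROBOTS_TXT, urls))
```

If a bookings page ever showed up in the returned list, that would suggest a disallow rule is broader than intended.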

Any advice appreciated! Happy to dig out other information needed to help debug.

Example URLs:

Thanks,
Jonathan

Recommended Answer
Jan 2, 2022
Trying to find a 'pattern' to them is pointless, because there isn't one.

The URL hasn't been crawled yet. So it's nothing about the specific page, because Google doesn't even know what the page contains yet.

It was just unlucky that it reached the head of the processing queue during a time of low quota availability. So Google aborted the crawl to avoid making too many requests to your server.

I.e. it entered that state for no other reason than it happened to come up for crawling at a bad time.

The fact that they haven't been recrawled very quickly (as Google has already re-added them to the processing queue) speaks to low crawl demand. Google just isn't 'excited' about expending resources crawling; it doesn't think it needs more pages.

So quit thinking about specific pages, and think about improving crawl budget for the site as a whole.
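
To watch site-wide crawl activity rather than individual pages, one option is to summarise crawler hits per day from your own server logs. The following is a rough sketch only: the record format, the sample values and the function name `googlebot_daily_stats` are assumptions for illustration, and real logs would need parsing first. It also matches on the user-agent string alone, which can be spoofed.

```python
from collections import defaultdict

# Hypothetical, simplified log records: (date, user agent, response time in seconds).
SAMPLE = [
    ("2021-11-10", "Googlebot/2.1", 1.2),
    ("2021-11-10", "Googlebot/2.1", 2.0),
    ("2021-11-10", "Mozilla/5.0", 0.4),
    ("2021-11-20", "Googlebot/2.1", 1.6),
]

def googlebot_daily_stats(records):
    """Return {date: (request_count, mean_response_seconds)} for Googlebot hits."""
    times = defaultdict(list)
    for date, agent, seconds in records:
        if "Googlebot" in agent:  # assumption: user-agent match only, no IP verification
            times[date].append(seconds)
    return {d: (len(t), sum(t) / len(t)) for d, t in sorted(times.items())}

for day, (count, mean) in googlebot_daily_stats(SAMPLE).items():
    print(f"{day}: {count} crawls, mean {mean:.2f}s")
```

A falling daily count or a rising mean response time in such a summary would be a site-wide signal, which is the level this answer suggests focusing on.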
 
 
 
Last edited Jan 3, 2022
Original Poster Jonathan (CM) marked this as an answer
All Replies (2)
Jan 2, 2022
Thank you, that helps a lot. I now know to concentrate on the site improvement actions.