9/12/15
Original Poster
Eran Rosenthal

Indexing problem for Wikipedia articles containing the ' character

Articles in Hebrew Wikipedia that contain the ' character in their name aren't indexed by Google.
For example: https://he.wikipedia.org/wiki/%D7%90%D7%A0%D7%92'%D7%9C%D7%94_%D7%A4%D7%9C%D7%96%D7%A0%D7%A1
What could be the problem?
All Replies (9)
9/13/15
Gaieus
Hi Eran, 

That's the "yod" character, isn't it? I wonder if it is (somehow) replaced by a "simple" apostrophe character (as it seems from the transcript of the URL).
9/13/15
Original Poster
Eran Rosenthal
It seems to be an issue with the apostrophe character ('): new pages that contain this character in their title aren't indexed. Other examples are:
* https://he.wikipedia.org/wiki/%D7%9E%D7%99%D7%A7%D7%94_%D7%91%D7%96'%D7%96'%D7%99%D7%A0%D7%A1%D7%A7%D7%99 (created on 30 August 2015)
* https://he.wikipedia.org/wiki/%D7%92'%D7%A8%D7%9E%D7%99_%D7%A7%D7%95%D7%A8%D7%91%D7%99%D7%9F (created on 14 August 2015)
This is a widespread issue and there are many more such examples. Searching Google for a sample sentence taken from the pages linked above doesn't yield any results.



9/13/15
Original Poster
Eran Rosenthal
See also cross post in Wikimedia bug tracker: https://phabricator.wikimedia.org/T112425
* It seems to be an issue specific to Google and Baidu (which may copy Google results?), but Bing sometimes does yield results
* It isn't specific to Hebrew, examples from English Wikipedia are:
** https://en.wikipedia.org/w/index.php?title=Conference_USA_Men%27s_Soccer_Freshman_of_the_Year
** https://en.wikipedia.org/w/index.php?title=Holland%27s_Next_Top_Model_%28cycle_8%29
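
For readers comparing the URL forms in this thread: the Hebrew links carry a literal ' while the English links carry %27, and both denote the same character once percent-decoded. The following is a minimal sketch using Python's standard `urllib.parse`; the helper names `decode_title` and `has_apostrophe` are made up for illustration, and nothing here says anything about how Google's crawler treats either form.

```python
from urllib.parse import quote, unquote

# Title parameters copied from the English Wikipedia examples above.
encoded_titles = [
    "Conference_USA_Men%27s_Soccer_Freshman_of_the_Year",
    "Holland%27s_Next_Top_Model_%28cycle_8%29",
]


def decode_title(encoded: str) -> str:
    """Percent-decode a title (hypothetical helper for this example)."""
    return unquote(encoded)


def has_apostrophe(encoded: str) -> bool:
    """Report whether the decoded title contains an ASCII apostrophe."""
    return "'" in decode_title(encoded)


for title in encoded_titles:
    print(decode_title(title), has_apostrophe(title))

# Going the other way, quote() percent-encodes the apostrophe by default.
print(quote("Men's"))
```

A literal ' and %27 decode to the same title, so a check like this can be used to collect every affected page regardless of which form a given link uses.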
9/13/15
Gaieus
It's true that a URL can hardly be properly indexed with special characters in it (therefore one should never use them in URLs, directory paths/names or file names), but regarding the Hebrew text, there should not be an apostrophe there, so I guess it's a mismatch of characters. Or this particular character was replaced by an apostrophe.

This character: י should not be replaced by an apostrophe: '.
9/13/15
Original Poster
Eran Rosenthal

* This is not an issue specific to Hebrew; for example, "Conference USA Men's Soccer Freshman of the Year" in English Wikipedia is affected as well.

* Search indexes should handle, and do handle, apostrophes properly. This is a new regression from the last month or so (older pages containing an apostrophe are indexed properly).

* (Though this isn't relevant to this issue) Yud and apostrophe are different characters. In Hebrew, the apostrophe is widely used for transcribing English names containing J (e.g., Jon is "ג'ון"). Since readers here aren't familiar with Hebrew, which is totally OK, let's use the English examples above.
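
To make the yud/apostrophe distinction concrete, here is a small sketch that lists the Unicode code point and name of each character using Python's standard `unicodedata` module. The helper name `describe` is made up for this example, and the name string is spelled with escapes on the assumption that it consists of gimel, an ASCII apostrophe, vav and final nun, as written in this thread.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Return (code point, Unicode name) for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


yod = "\u05d9"                # the Hebrew letter mentioned earlier in the thread
apostrophe = "'"              # the character that appears in the page titles
jon = "\u05d2'\u05d5\u05df"   # "Jon" as transcribed in Hebrew (assumed spelling)

for label, text in [("yod", yod), ("apostrophe", apostrophe), ("Jon", jon)]:
    print(label, describe(text))
```

The two characters have unrelated code points, so the apostrophe in these titles is not a mangled yud.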

9/13/15
Gaieus
Fine, my Hebrew is also rusty nowadays. At first I just thought the problem was particular to the specific URL you mentioned first.

I have been looking high and low now but could not figure out any definitive solution to the issue. 
This page is indexed for instance: 
But the other one (with just some added parts in the URL in brackets, which is again bad practice IMO) is not (as you mentioned, of course):
https://en.wikipedia.org/wiki/Holland%27s_Next_Top_Model_(cycle_8) 

So I believe it's not particularly the apostrophe itself that makes these (new?) URLs hard to index.
9/14/15
Original Poster
Eran Rosenthal
I still don't understand why new URLs with an apostrophe aren't indexed while older ones are included in the index. This seems to be a new bug introduced in the last month or so.
10/5/15
Original Poster
Eran Rosenthal
Where can I report a bug in the Google search engine? Pages with an apostrophe character in their title are consistently not indexed.
10/5/15
John Mueller
Thanks for posting, Eran -- I'll double-check with the team here.

Cheers
John