Comments on Words and what not: #SEO and #Wikipedia

Santhosh Thottingal സന്തോഷ് തോട്ടിങ്ങല് (2011-05-02 15:29):

This is not that simple. Search engines cannot (and should not) do that normalization unless a canonical equivalence is defined between the two code points. Remember, search is not done by search engines alone; string comparison happens everywhere. So the ideal solution is to get a canonical equivalence defined for these two code points, and optionally to deprecate one of them. Once that equivalence is implemented in glibc, CLDR, ICU and so on, our collation, searching and string comparisons become error-free.

For the കൗതുകം/കൌതുകം patterns, webfonts are not required, since all fonts display both properly. Linguistically both words are the same, with the same meaning, but technically they are completely different words. That is the problem.

GerardM (2011-05-02 10:28):

You can only normalise when people have the right fonts to see a text. As we use the latest version of Unicode AND provide webfonts, anyone can read our text.

Knowing that we provide webfonts is why Google et al. can normalise for OUR website.
Thanks,
GerardM

Bawolff (2011-05-02 09:27):

In reply to:
>When the search engines know that we
>provide web-fonts, they can make
>searches like കൗതുകം and കൌതുകം
>equivalent; for a native Malayalam these
>are the same and as we present a text in the latest Unicode encoding a search
>engine could present us as a result for
>both searches.

Wait, so the search engines are going to normalize in one direction, and only one direction? That doesn't really make sense. If the search engines normalize the different Malayalam sequences to the same thing, both become equivalent and a search for either gives exactly the same results. Also, how do web fonts enter into this at all? The font only changes how the text is displayed. Google et al. are computers; they only look at the code point of each character. They don't need any fonts.
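
To make the code point argument in these comments concrete, here is a minimal Python sketch. It assumes the two spellings differ only in the vowel sign after ക: U+0D57 (MALAYALAM AU LENGTH MARK) in the first and U+0D4C (MALAYALAM VOWEL SIGN AU) in the second. The names `code_points`, `equal_under`, `FOLD` and `search_key` are made up for illustration, and the fold table is an application-level assumption, not something Unicode or any search engine is known to define.

```python
import unicodedata

# The two spellings from the thread, written with escapes so the single
# differing code point is visible.
WITH_AU_LENGTH_MARK = "\u0d15\u0d57\u0d24\u0d41\u0d15\u0d02"  # uses U+0D57
WITH_VOWEL_SIGN_AU = "\u0d15\u0d4c\u0d24\u0d41\u0d15\u0d02"   # uses U+0D4C


def code_points(text):
    """Return the code points of text as 'U+XXXX' labels."""
    return [f"U+{ord(ch):04X}" for ch in text]


def equal_under(form, a, b):
    """Compare two strings after applying one Unicode normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Hypothetical application-level fold (an assumption, not a Unicode rule):
# map the older vowel sign onto the length mark. Which form is the target
# is a choice for the application.
FOLD = {0x0D4C: 0x0D57}


def search_key(text):
    """Build a comparison key; applied to indexed text and queries alike."""
    return unicodedata.normalize("NFC", text).translate(FOLD)


if __name__ == "__main__":
    print(code_points(WITH_AU_LENGTH_MARK))
    print(code_points(WITH_VOWEL_SIGN_AU))
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        print(form, equal_under(form, WITH_AU_LENGTH_MARK, WITH_VOWEL_SIGN_AU))
    print(search_key(WITH_AU_LENGTH_MARK) == search_key(WITH_VOWEL_SIGN_AU))
```

Because the same `search_key` is applied to both the indexed text and the query, the folding works in both directions, and no font is involved at any step. None of the standard normalization forms unifies the two spellings, which is why a canonical equivalence (or an explicit fold like this one) would be needed.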