Friday, February 25, 2011

Supporting #Unicode in #MediaWiki

MediaWiki stores its text in Unicode. For a #Wikipedia to be displayed properly on a screen or printed on paper, the device needs fonts that support the Unicode characters of its script. Whatever input method is used, it has to insert the right Unicode characters in the database.

There are languages where many input methods are used to generate text. Some of them are tied to a particular font with an incompatible encoding, like Zawgyi. When such an input method is used, the characters have to be converted to proper Unicode. As a consequence, the text will not be displayed properly for a user who relies on the Zawgyi font. This can be remedied by displaying the text using a web font.

For many input methods, this is less of a problem. People are accustomed to the characters being in "their" place on the keyboard, and they see the resulting text properly. This is where the early gains are for the Narayam extension. It is a complete solution when the appropriate fonts are available on the system.


That the proper fonts are present on a system is an assumption, and it cannot be relied on for any script, including the Latin script. It would be good if a user could indicate whether he can read a Wikipedia properly. When there is an issue, we should provide web fonts.
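
To make this a bit more concrete, here is a rough sketch in Python of how one could work out which scripts a piece of text uses, as a first step towards deciding which web fonts to offer. It is only an illustration, not how MediaWiki or any extension does it. The function names and the `WEB_FONTS` table with its font names are made up for this example, and taking the first word of a character's Unicode name as its script is a crude heuristic.

```python
import unicodedata
from collections import Counter

# Hypothetical table: script label -> web font to offer.
# The font names are placeholders, not real fonts.
WEB_FONTS = {
    "MYANMAR": "ExampleMyanmarWebFont",
    "MALAYALAM": "ExampleMalayalamWebFont",
}

def scripts_in_text(text):
    """Count letters per script, using the first word of each
    character's Unicode name as a rough script label."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation, spaces and combining signs
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    return counts

def web_fonts_needed(text):
    """Return the (hypothetical) web fonts for the scripts found in text."""
    found = scripts_in_text(text)
    return sorted(WEB_FONTS[s] for s in found if s in WEB_FONTS)

# "Wiki" followed by a Burmese word, written with escapes so that
# the example does not depend on the fonts of whoever reads it.
sample = "Wiki \u1019\u103c\u1014\u103a\u1019\u102c"
print(dict(scripts_in_text(sample)))
print(web_fonts_needed(sample))
```

A check like this only tells which scripts are in the text. It cannot tell whether the reader's system has a font for them, which is why asking the user remains useful.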
Thanks,
       GerardM
