Thursday, May 03, 2012

#Font subsets

A font with all the characters for the Latin or Cyrillic script is big: over a megabyte. That is considered too big for a web font, particularly when mobile devices are targeted as well. For this reason, moves are under way to split such large fonts into subsets.

SIL is working on font subsets. Their criterion is to include all the characters used in a given "region", so each subset explicitly targets a range of languages. This reduces the size, and one font can still serve as a web font for all of those languages. To use these subsets on Wikipedia, it will still be necessary to identify the specific language and to associate that language with a particular region.
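To make the "region" idea concrete, here is a minimal sketch in Python. It assumes a region is simply a list of languages and that the subset is the union of the characters those languages need. The character lists, the language codes and the region name are illustrative assumptions of mine, not SIL's actual data or method.

```python
import string

BASIC_LATIN = set(string.ascii_letters)

# Illustrative and deliberately incomplete character lists (assumptions,
# not authoritative data for these languages).
LANGUAGE_CHARS = {
    "en": BASIC_LATIN,
    "de": BASIC_LATIN | set("äöüßÄÖÜ"),
    "fr": BASIC_LATIN | set("àâçéèêëîïôùûüÿœæ"),
}

# Hypothetical region definition: a region is a list of language codes.
REGIONS = {"western-europe": ["en", "de", "fr"]}


def region_subset(region):
    """Return the union of the characters needed by every language in a region."""
    chars = set()
    for lang in REGIONS[region]:
        chars |= LANGUAGE_CHARS[lang]
    return chars


subset = region_subset("western-europe")
# The regional subset is larger than any single language needs,
# but far smaller than a font covering the whole script.
print(len(LANGUAGE_CHARS["en"]), len(subset))
```

The mapping from language to region is exactly the piece that would still have to be supplied before such a subset can be picked automatically for a reader.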

Typically a font does not include all characters anyway. It is created with a language in mind, and when another language needs extra characters, that language is out of luck. Many languages do use the same subset of characters, so when a font is identified as complete for one of them, it is complete for all the others that share that subset as well.

SIL needs a way to subset its existing fonts. Google, in contrast, already provides many web fonts that are subsets of the Latin script. As it is not made obvious whether these fonts support languages like German, French or Dutch, they are not really attractive when English is not your language.

Both Google and SIL provide solutions. The key question they do not explicitly answer is: does it support my language?
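Answering that question comes down to comparing the code points a font maps with the characters a language needs. The sketch below shows one way to do it in Python. The "Latin" subset and the German character list are made-up illustrations, and the function names are my own. With a real font file, the set of covered code points could for instance be taken from the keys of the mapping that the fontTools library returns from `TTFont(path).getBestCmap()`.

```python
import string


def missing_characters(covered_codepoints, required_chars):
    """Return the required characters the font has no mapping for, sorted."""
    return sorted(
        ch for ch in set(required_chars) if ord(ch) not in covered_codepoints
    )


def supports_language(covered_codepoints, required_chars):
    """True when every required character is covered by the font."""
    return not missing_characters(covered_codepoints, required_chars)


# Hypothetical "Latin" web font subset: unaccented ASCII letters only.
latin_subset = {ord(ch) for ch in string.ascii_letters}

# Illustrative character list for German (an assumption, not authoritative).
german = string.ascii_letters + "äöüßÄÖÜ"

print(supports_language(latin_subset, string.ascii_letters))
print(missing_characters(latin_subset, german))
```

Publishing the list of missing characters, rather than a bare yes or no, would also tell a font maker what has to be added before a language is supported.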
