Thursday, March 04, 2010

UNESCO document on measuring linguistic diversity

The UNESCO publication "Twelve years of measuring linguistic diversity in the Internet: balance and perspectives" is an update to a previous UNESCO study on this subject that was issued for the World Summit on the Information Society in 2005. It is based on research done from 1996 to 2008.

I browsed this document and I do not like it. It does include some parts that I could use, but it seems to me to be a regurgitation of things that others have done. The percentage of languages used on the Internet is based on Google statistics. The problem is that Google only recognises a limited number of languages, so any language it does not recognise is left out of these figures.

I would expect this report to include information on the things that prevent languages from getting a presence on the Internet; words like Unicode, CLDR or even locale are not found. Information on the percentage of documents that accurately flag their language, another relevant statistic, is missing as well.
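
To make that last statistic concrete, here is a minimal sketch in Python of how it could be measured. It is an illustration under my own assumptions, not something taken from the UNESCO report: it assumes a hand-labelled sample of pages for which the actual language is already known, it only looks at the `lang` attribute on the `html` element, and it compares only the primary language subtag. The names `declared_language` and `flagging_accuracy` are made up for this example.

```python
from html.parser import HTMLParser


class LangAttrParser(HTMLParser):
    """Records the lang attribute of the first <html> tag, if there is one."""

    def __init__(self):
        super().__init__()
        self.declared = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.declared = dict(attrs).get("lang")


def declared_language(html_text):
    """Return the primary language subtag a page declares, or None."""
    parser = LangAttrParser()
    parser.feed(html_text)
    parser.close()
    if not parser.declared:
        return None
    # Assumption: only the primary subtag matters here, so "pt-BR" -> "pt".
    return parser.declared.split("-")[0].lower()


def flagging_accuracy(sample):
    """Percentage of pages whose declared language matches the known one.

    sample is a list of (html_text, actual_language_code) pairs; the
    actual codes are assumed to come from manual labelling.
    """
    if not sample:
        return 0.0
    correct = sum(
        1 for html_text, actual in sample
        if declared_language(html_text) == actual.lower()
    )
    return 100.0 * correct / len(sample)
```

A page without any `lang` attribute counts as not flagged correctly, which is the point of the statistic: a page that does not say what language it is in is harder for software to find and to process.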

I do applaud UNESCO for having an interest in linguistic diversity, but I am not convinced that the methodology used for this document helps languages find their way to the Internet.
