Thursday, January 12, 2006

Dialects within a language

There are Wikipedias in many languages; so far there are some 212. Most of these languages have an ISO-639 code. Two versions of this code are "official", and there is one version that is workable: ISO/DIS-639-3, which is currently maintained by SIL International. However, workable does not mean perfect. Today there were two moments where the current practices relating to ISO-639 were the issue.

Java uses ISO-639 for its language codes, specifically ISO-639-1. Consequently the Neapolitan language is not known to it. OmegaT is an open source CAT tool, and it uses the languages known to Java as the languages that it can translate. So in order to translate to Neapolitan you have to pretend that it is a different language. Not nice. So the nice people of Sun were asked about this, and we have great expectations.
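To make the limitation concrete, here is a minimal sketch of how one could check which codes Java's standard library lists. It relies on `Locale.getISOLanguages()`, which is documented to return the two-letter ISO 639 codes. The class name `IsoLanguageCheck` and the helper `isKnownTwoLetterCode` are made up for this illustration, and nothing here is taken from OmegaT's own code; "nap" is the three-letter code for Neapolitan, which has no two-letter code.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper class, for illustration only (not part of OmegaT).
public class IsoLanguageCheck {

    // Returns true if the code appears in the list of two-letter
    // ISO 639 language codes that the Java runtime reports.
    static boolean isKnownTwoLetterCode(String code) {
        return Arrays.asList(Locale.getISOLanguages()).contains(code);
    }

    public static void main(String[] args) {
        // Italian has a two-letter code, so it is in the list.
        System.out.println("it:  " + isKnownTwoLetterCode("it"));
        // Neapolitan only has a three-letter code ("nap"),
        // so a list of two-letter codes cannot contain it.
        System.out.println("nap: " + isKnownTwoLetterCode("nap"));
    }
}
```

A tool that builds its language menu from such a list would have no entry for Neapolitan, which is presumably why you end up pretending it is another language.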

Today there was a request on Meta, the website about the Wikimedia Foundation's projects, for a new Wikipedia. The request is for Tarantino, which is considered a dialect of Neapolitan. This request is problematic because Tarantino does not even have an ISO-639 code. Consequently there is little chance of a Wikipedia being created for it. Now, with the new namespace manager, it is possible to create a separate namespace within the Neapolitan Wikipedia for the Tarantino dialect. This is also a solution for the problematic request for a Lower Saxon Wikipedia that will be in an orthography that is not German.

It is sobering to see that standards can both enable things and prevent them from happening. Good standards are vital, and ISO/DIS 639-3 is a big move forward.
