Wednesday, October 12, 2011

#Google, #Acehnese is not #Indonesian

The Acehnese #Wikipedia is recognised by the Google Chrome browser as Indonesian. Chrome even offers to translate it into English.

However, in the HTML there is an attribute identifying the language as "ace", or Acehnese:
<html lang="ace" dir="ltr" class="client-nojs" xmlns="">
The translation bar has several options; one of them is to report an incorrect language detection. It shows a drop-down list of languages to select from, with an option at the bottom to specify a language that is not in the list.

It is wonderful that Google provides a translation service. There is, however, a reason why we specify a language: we want everyone to know what language a text is in.

Specifying the language in the HTML is best practice for a website, but sadly many if not most websites get it wrong. It is intended to allow services like the Google translation service to know what language a text is in. The Localisation team will go a step further for MediaWiki: it will provide services based on the language indicated there, such as WebFonts, keyboard mappings and appropriate text direction settings.
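
To make this concrete, here is a minimal sketch of how a service could read the declared language and text direction from a page before deciding what to offer. It uses only Python's standard html.parser module. The names LangSniffer and declared_language are made up for this illustration; this is not how Chrome or MediaWiki actually do it.

```python
from html.parser import HTMLParser


class LangSniffer(HTMLParser):
    """Records the lang and dir attributes of the first <html> tag seen.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.lang = None
        self.direction = None
        self._seen_html = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == "html" and not self._seen_html:
            self._seen_html = True
            attributes = dict(attrs)
            self.lang = attributes.get("lang")
            self.direction = attributes.get("dir")


def declared_language(markup):
    """Return (lang, dir) as declared on the <html> tag, or None for each if absent."""
    sniffer = LangSniffer()
    sniffer.feed(markup)
    return sniffer.lang, sniffer.direction


# The tag quoted earlier in this post, wrapped in a tiny page.
page = '<html lang="ace" dir="ltr" class="client-nojs" xmlns=""><body></body></html>'
print(declared_language(page))  # ('ace', 'ltr')
```

A service that trusted this value would treat the page as Acehnese and would not need to guess. Guessing from the text itself only makes sense as a fallback, when the function returns None for the language.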

Identifying a language is relevant: many services rely on it, and it is what gives visibility to a language on the Internet.
