Monday, December 31, 2007

Happy New Year

Some of my friends are already in the New Year; others, like me, are still celebrating the old year and awaiting the new one. The SignWriting you see is in Czech Sign Language.

One of my hopes is for SignWriting to do well in the new year. I hope that many people will learn to read and to write their sign language. My wish for all of us is that the best is still ahead of us... Happy New Year :)

Linguistic tolerance

The Wikimedia Foundation has a policy about what linguistic entities it accepts and what linguistic entities it does not accept. When a linguistic entity is recognised as such and has an ISO-639-3 code, it is considered a language. As a rule, the language committee gives a conditional approval to requests for languages that have a code.

There are problems with this policy. As I wrote elsewhere, a language may be dead. A dead language is one that is no longer actively used: nobody writes or speaks it, and no new terminology is being created. Good examples are Hittite (hit) and Akkadian (akk). In my opinion they can have a Wikisource, but a Wikipedia is problematic because you cannot write in these languages for a modern public without changing the language into something completely different. It does not even make sense to have a MediaWiki localisation for such languages.

There are more problems. What to do with languages whose culture prohibits writing the language down? What to do with languages that have only a few speakers? What to do with languages in which few people are truly literate? What to do with constructed languages?

The biggest problem underlying all these issues is one of competency. Who is competent to judge that a language is truly dead? At what size does a community have enough people to support a WMF project in its language? How do we judge the quality of our projects and, just as importantly, who is to judge? And does the WMF have a responsibility for less-resourced languages?

Brianna blogged about the Volapük Wikipedia. For her and for many others, Volapük became an issue because its community had the audacity to create enough articles to be noticed. People like Brianna feel offended because it upsets their notion of what Wikipedia is. Brianna introduces the notion of a "language ego", but I am sure she will agree that every non-dead language deserves its place under the sun and that only the people who communicate in a language can realise a WMF project in it. The good news is that there is plenty of sun and it is not expensive to have another language.

When people equate artificial languages with languages without merit, they have a problem. Many languages started out this way. One of the more interesting examples is Italian: it was standardised by Dante and used as a lingua franca until the unification of Italy, when it became an official language. Another example is Sardinian, where a constructed, merged linguistic entity that has not been recognised by the ISO-639-3 registrar is recognised in Italian law.

When you compare Volapük with Klingon, the biggest difference is that Volapük allows you to express all modern subjects.

For me the issue of the Volapük Wikipedia is a non-issue. I know three people who speak Volapük; not all of them are involved in this project. Given the competency of the people who are, there are no issues in getting the information in Wikipedia right. Wikipedia has always allowed for a project to evolve and insisted on the independence of communities. I am sure that in the end both Wikipedia and the Volapük Wikipedia will emerge stronger from all this.


Sranan Tongo

Sranan Tongo is a creole language spoken in Suriname. A request was made for a Wikipedia in this language and as it does have an ISO-639-3 code (srn), granting a conditional approval was a formality.

What is becoming a comfortable routine is asking the great people at BetaWiki to support another language. People who know the language can now start with the most used messages. I wish the proposers of this new project well, and I hope to hear from them when they consider themselves ready for the big time :)


Saturday, December 29, 2007

Farmer; tactical technology

Farmer is an extension for MediaWiki. Farmer helps with the maintenance of wiki farms and enables changes to the configuration from a web interface. I blog about it because it is one of those little gems that may make a difference in making MediaWiki more popular.

MediaWiki can be found in an NGO in a box. This is an initiative of Tactical Tech, an organisation that aims to "demystify technology for non profits". The big selling points for MediaWiki are the many people that know MediaWiki through Wikipedia and the many languages that are supported by it.

Recently Farmer was welcomed as an extension in BetaWiki, and the developers active at BetaWiki have been working hard at improving the messages of Farmer. Farmer will make the maintenance of MediaWiki less challenging. With its messages translated into more and more languages, MediaWiki becomes more and more tactical technology.


Friday, December 28, 2007

Firefox 3 beta

I have been bold: I have installed Firefox 3 beta. I cannot say it is all good, but for many things it is much better. I would not have installed it without Firefox supporting Chatzilla. Chatzilla is mandatory for me.

What I like:
  • When I click on an Arabic text it will cleanly select the whole word.
  • URLs are shown much more cleanly
  • Firefox is still a great program; it seems more stable and responsive
What I do not like:
  • I do not like the presentation of the browsing history
  • There is a bug that makes a tab jump back to the beginning of the page

Monday, December 24, 2007

Localisation fast and furious

When things reach a certain maturity, visible things can happen really quickly. Another Christmas present is this presentation of the localisation of MediaWiki. It shows the quality of the localisation of the MediaWiki software.

You will see a lot of red, which is not good. This is a typical "cup is half full" situation, as the list now includes more languages. With a new visualisation you cannot compare with what came before, so you do not notice the many recently added extensions. You do not notice the many recently added languages. You do not notice that there would have been more red for many languages. :)

If you want to help MediaWiki, help us improve the MediaWiki localisation for your language at BetaWiki.


Friday, December 21, 2007

BetaWiki exports .po files

BetaWiki has given us a splendid Christmas gift. Nikerabbit, Hashar and Siebrand have developed a tool that exports the MediaWiki system messages to the .po format. This format is the standard used by many open source applications.

The most important thing about this format is that many tools support it for offline translation. Many translators will not work online. For many languages we do not need to accommodate offline translation, because there are sufficient people willing to maintain the localisation. However, when you look at the statistics, you will find that many languages are not supported or are poorly supported in MediaWiki. Hindi is a good example. The Wikipedia is well localised, but for Hindi only 3.99% of the system messages are translated. Hindi is spoken by 180 million people...

For Hindi, .po files will not be the solution. Collaboration between Indian Wikimedians and the BetaWiki administrators will be a better solution. There are also languages like Neapolitan where it helps to localise and then have the localisation proofread. When the number of collaborators for a language is small, it is typically easy and safe to work offline. You do not have to wait for loading and saving, and you can combine it with a translation memory to make it efficient.

Nikerabbit has now created a .po importer. What he is looking for are translations to test this new functionality...
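
For readers who have never seen a .po file, here is a minimal sketch in Python of how system messages could be written out as gettext entries. It assumes that the message key goes into msgctxt, the English source text into msgid and the translation into msgstr; that layout, the function names and the example keys are my own assumptions for illustration, not necessarily what the BetaWiki exporter produces.

```python
def po_escape(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a .po string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def to_po(messages: dict) -> str:
    """Render {key: (source_text, translation)} as gettext .po entries."""
    entries = []
    for key, (source, translation) in messages.items():
        entries.append(
            f'msgctxt "{po_escape(key)}"\n'
            f'msgid "{po_escape(source)}"\n'
            f'msgstr "{po_escape(translation)}"\n'
        )
    # Entries in a .po file are separated by a blank line.
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical message keys and texts, for illustration only.
    example = {
        "example-search": ("Search", "Zoeken"),
        "example-untranslated": ("Go to page", ""),  # empty msgstr = not yet translated
    }
    print(to_po(example))
```

An empty msgstr marks a message that still needs translating, which is what an offline translation tool would present to the translator.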


Thursday, December 20, 2007

A great Christmas card

The question: what language is it, and what does it say?

Happy holidays,

Wednesday, December 19, 2007

WOSI, a really cool Open Source/Standards project

WOSI is a project of a Dutch school, the HVA, working on a software environment for "woningcorporaties". A woningcorporatie is an organisation that is involved in public housing. As organisations they are genuinely capital intensive, and their IT requirements are complex and evolving.

The objective of the school is to provide an environment for their students that will give them a real feel for what it is like to work in the ICT business and teach them Open Source and Open Standards. Students who are part of this project will experience that a project does not start from scratch: there is always something to build upon, there are always conflicting requirements, and there is always the need to ensure interoperability, because the use of Open Standards is a precondition.

The WOSI project is in its second year and it is growing in size. Students from other disciplines are getting involved, as there is overlap with other specialties like communications and marketing. More woningcorporaties are interested in the project, as are other schools. The great thing is that, as Open Source and Open Standards are key to this curriculum, professionals who know how to apply these notions in real world scenarios will enter the job market. This is likely to prove the biggest boon of this really cool project.


Tuesday, December 18, 2007

Interoperability is important

Wikipedia, and particularly the English language Wikipedia, is a rich resource of information. The amount of information in it is staggering. Much of the information is duplicated in other Wikipedias and on other websites. This is great, because with more applications for the same data, more eyeballs will find what is in error.

I am subscribed to the DBpedia mailing list, and today I read about errors in Wikipedia that had to do with Wigan and Manchester City. Errors were found and the gentleman wrote that he can and will make the necessary updates. His question is when DBpedia will reflect the changes.

When the data of Wikipedia is analysed with tools, and when the results are found to be of value, it adds relevance to what enables this collaboration. It typically relies on the availability of dumps. When the data is analysed, a new work emerges. When it has a completely different format, it is possible to mesh it with other data sources. This in turn will help establish the validity of the Wikipedia data and will allow for the extension of the data.

When multiple data sources are meshed, the issues of copyright and licensing raise their ugly heads. You can create static and dynamic meshes. In a dynamic mesh you can build the mesh depending on what the person has access to. In a static mesh you can only include the data that is still available to the least privileged person who will get access to the data.
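
To make the difference concrete, here is a rough sketch in Python. It assumes that every record carries the name of its source and that a user's rights can be expressed as a set of source names; the function names, the record layout and the example sources are all made up for illustration.

```python
def dynamic_mesh(records, user_access):
    """Built per request: keep the records whose source this user may see."""
    return [r for r in records if r["source"] in user_access]


def static_mesh(records, all_user_access):
    """Built once for everyone: keep only sources that every user may see."""
    if not all_user_access:
        return []
    # The least privileged audience decides what can be included.
    common = set.intersection(*(set(access) for access in all_user_access))
    return [r for r in records if r["source"] in common]


if __name__ == "__main__":
    # Hypothetical sources and placeholder facts.
    records = [
        {"source": "open-encyclopedia", "fact": "fact A"},
        {"source": "licensed-database", "fact": "fact B"},
    ]
    subscriber = {"open-encyclopedia", "licensed-database"}
    visitor = {"open-encyclopedia"}

    print(dynamic_mesh(records, subscriber))            # both records
    print(static_mesh(records, [subscriber, visitor]))  # only the open record
```

The static mesh is simpler to publish, but it throws away everything that is not available to all; the dynamic mesh keeps more at the price of checking rights on every request.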

The consequence is that many people and organisations will mesh sources, manipulate data, publish and not indicate what all the sources are. They will not do that because they do not want to be bound by all kinds of licenses and because they do not want to be hassled.

This DBpedia example shows that the presentation of facts is important. It demonstrates that interoperability will result in a better Wikipedia. It is important for Wikipedia to be as open and engaging as it can be. Frankly, when people analyse our data in a similar way to DBpedia, the result is a new work; it should not be considered derivative. Best practice is to publish sources and this, more than the viral nature of a license like the GFDL or CC-by-sa, will drive collaboration and give Wikipedia more relevance.


Sunday, December 16, 2007

Localisation of MediaWiki

When you wonder in what languages MediaWiki has been localised, and to what extent the localisation is usable, BetaWiki has some great statistics.

The localisation statistics show the languages that have a central localisation and the percentage of the messages that have been done. They clearly show that the MediaWiki localisation leaves a lot to be desired; for some 144 languages fewer than half of the messages have been localised. At this moment there are 235 languages known to MediaWiki. When you compare this to the 253 languages that have a Wikipedia and add the languages that are starting in the Incubator, you get a clear picture of how much effort is needed to better support the readers of MediaWiki projects.

When you look at the statistics, the glass is half full, and it is filling. On average five languages are introduced every month and more than 500 messages are translated every day. The languages that are in the Incubator are doing well; Seeltersk, for instance, has reached an astonishing 99.3%.

One of the latest innovations in BetaWiki is the core top 500 messages. They contain the most important messages, and with these messages translated, MediaWiki is usable for a language. BetaWiki has a dedicated team of people who make MediaWiki, and as a consequence MediaWiki projects, usable for many people. With your help, we can improve the localisation even further. One message at a time will slowly but surely provide proper support for all the languages MediaWiki supports.
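
To put the figures in perspective, a few lines of Python are enough. Only the numbers 144, 235 and 253 come from the statistics quoted above; the helper function is just something I made up for the occasion.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


languages_known = 235   # languages known to MediaWiki
under_half = 144        # languages with less than half of the messages localised
wikipedias = 253        # languages that have a Wikipedia

# Share of the known languages that are less than half localised.
print(share(under_half, languages_known))   # 61.3

# How many more languages have a Wikipedia than are known to MediaWiki.
print(wikipedias - languages_known)         # 18
```

So roughly six out of every ten languages known to MediaWiki are less than half localised.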


Friday, December 14, 2007

It is perfect after all

In my latest post I wrote about the Oostvaardersplassen. Today I received a mail telling me that fish will in future be able to swim into and out of the Oostvaardersplassen.

This makes me perfectly happy. Now I know how the water flows from the Oostvaardersplassen into the "Wilgenbos" and I know that there will be a lot of work that needs doing. But when Staatsbosbeheer states, as it does, that fish will be able to swim in and out, that is really great news.


Wednesday, December 12, 2007

Vindication of a kind

One of the Wikipedia articles I am proud of is the Dutch article about the Oostvaardersplassen. The Oostvaardersplassen are close to where I live and I think it is one of the best examples that nature is something that not only evolves, but also can be engineered. I have followed its development with considerable interest and my favourite point of view has been that the water management has been detrimental to the natural diversity.

I visited the Oostvaardersplassen this weekend, and I learned that a small dyke will be removed, leading to a more natural distribution of water and a more dynamic water level. This will have a huge impact on the fish stock; the current population of mainly mature carp will make room for many more smaller fish. This will allow many small herons and other fish eaters to find their niche.

The one remaining question for me is whether fish will be able to freely migrate in and out of the nature reserve. It would be grand if this is the case. As only one dyke is mentioned, I do expect it to be great but not "perfect".

Monday, December 10, 2007


The word of the day for OmegaWiki should be burglary. It is not, as another word had already been entered. I was sleeping and woke up because I heard the breaking of glass. I looked out of my window and saw someone breaking and entering. I called the police, and they arrived quickly.

After all the excitement, I find it hard to go back to sleep. Anyway, this is real life drama. Not dramatic, but it keeps me from sleeping.

NB the word of the day is íshokkí.


Wednesday, December 05, 2007

Shameless plug ...

A friend of mine sent me what she called a "shameless plug". I agree with her; there is no shame in announcing that wikiHow is supporting the Dutch language.

As I absolutely approve of great projects doing great things, I am happy to shamelessly plug wikiHow and I wish it and all its language versions great editors and a great audience.