Tuesday, October 30, 2007

Kotava - another constructed language

Kotava is a constructed language. There is not yet an article in the English Wikipedia, but there are articles in eight other languages. (Myth busting: if it is not in the English Wikipedia, it is not in Wikipedia.)

At this moment Kotava is not eligible for a Wikipedia, and it will not be enabled for editing in OmegaWiki. It does not have an ISO 639-3 code yet. What is special is that there are clear indications that the process for a code is under way. The code is likely to be "avk".

At OmegaWiki there is a Kotava enthusiast who has already done a lot of the preparations needed for another language. It will be given once the code is official. As for Wikipedia, they may ask for a Kotava Wikipedia. With the ISO process under way, the language committee does not have to do anything until the code is granted.

It is great that the language committee has reserved the right to do nothing.


WCN, the network - an unsung hero

When you are at a conference and the networking just works, you will not hear anyone talk about it. At the Wikimedia Conferentie Nederland NOBODY mentioned the network; it was just there and it just did what it was supposed to do.



Sunday, October 28, 2007

Google docs - published presentation

When I am not signed in to Google Docs, Blogger or any other Google application and I look at the URL that I used in my blog, I get this screen. At the very bottom it indicates that I can have a look at the presentation. It then works for me.


Saturday, October 27, 2007


Today the Wikimedia Conferentie NL was held in Amsterdam. It was a great occasion. It was impossible to do justice to the program; they had three tracks, and to choose one presentation over the other was an injustice to what was missed.

I had the privilege to give a presentation, and as I had to do some serious travelling to be there, I considered to what extent I could reduce what I took with me. I decided that with the latest Google application I did not need to bring anything. I could rely on there being a network, as Kim brought his Merakis, which are still on the old functional software, and assuming that one functional laptop would be available should not be a problem either.

To prove that it was indeed the Google presentation tool, I selected one of the available backgrounds. It is different. What I need in a presentation is basic. I need to show some texts and some screen dumps. I think there is some need to polish the handling of the screen dumps.

One of the nice things is that the presentation can be made available. So have a look and let me know what you think.


Monday, October 22, 2007


Breton is a Celtic language spoken by some of the inhabitants of Brittany (Breizh) in France. According to Ethnologue, over half a million people speak the language.

There is an organisation that is actively promoting the Breton language. I am really happy to inform you that they have taken the trouble to localise the system messages of OmegaWiki and have started to add translations in Breton. I have sent them a file with many of the languages that are in ISO 639-1. This combination will localise most of OmegaWiki in Breton.

The argument that proved convincing is that you get all the information we have. By steadily increasing the translations available in Breton, the experience will improve.


Saturday, October 20, 2007

Import, export

In OmegaWiki we provide some statistics. One of them is a breakdown of the Expressions per language. It is interesting because it shows what people are working on. It is a good indicator because after the initial import from GEMET, all the new work was done by hand. We are now experimenting with the import and export of data that are in collections and this will change things quite a bit.

The export creates a txt file with the columns separated by tabs. The columns are the number identifying the DefinedMeaning, and combinations of the Expressions and Definitions. The reason why we start with this export is that it is still much quicker to translate in a spreadsheet than it is to translate on the web. These experiments are done in a test environment, and we hope to bring it live soon. I can already send you a file if you are interested :)
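To give an idea of how such a tab-separated file could be handled outside a spreadsheet, here is a minimal Python sketch. The column names (`dm_id`, `expression_en`, `definition_en`, `expression_br`, `definition_br`) and the sample rows are assumptions made up for illustration; the actual export may well be laid out differently.

```python
import csv
import io

# Hypothetical sample of an export. The header names and the order of the
# Expression/Definition columns are assumed for illustration only.
SAMPLE_EXPORT = (
    "dm_id\texpression_en\tdefinition_en\texpression_br\tdefinition_br\n"
    "1001\twater\tA clear liquid.\tdour\t\n"
    "1002\thouse\tA building to live in.\t\t\n"
)


def read_export(text):
    """Parse the tab-separated text into a list of dicts, one per DefinedMeaning."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


def missing_translations(rows, lang):
    """Return the DefinedMeaning ids that have no Expression yet for `lang`."""
    column = f"expression_{lang}"
    return [int(row["dm_id"]) for row in rows if not row[column].strip()]


rows = read_export(SAMPLE_EXPORT)
todo = missing_translations(rows, "br")
print("DefinedMeanings still to translate into Breton:", todo)
```

A list like this is the kind of thing a translator could work through in a spreadsheet before the file is imported again.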

When we start to import, it is likely that languages can make quite a jump in the statistics. I hope it will encourage people to help us by providing translations particularly for the less resourced languages.


Friday, October 19, 2007


I chatted with Duesentrieb the other day. He mentioned dbpedia. I had another look and it is really a great resource. For those that do not know, dbpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web.

It does a great job, and it is different from that other project that deals with structured information, Semantic MediaWiki, in that it operates by data mining information from Wikipedia, while Semantic MediaWiki is an integral part of a MediaWiki project.

The great thing about dbpedia is that it explicitly encourages interlinking. With interlinked data, data that can be found in another resource becomes available, limited only by the quality of the interface.

You might ask how OmegaWiki fits into all this. Dbpedia's information is in English, while OmegaWiki allows for the representation of information in any language. Both OmegaWiki and dbpedia link to Wikipedia articles, and consequently, where the two share a link, the information can be mashed together.
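The idea of mashing data together on a shared link can be sketched in a few lines of Python. Everything here is hypothetical: the record layout, the sample facts and the function name `mash` are invented for illustration, and neither dbpedia nor OmegaWiki is claimed to expose its data in this shape.

```python
# Hypothetical records, keyed by the Wikipedia article both resources link to.
dbpedia_facts = {
    "http://en.wikipedia.org/wiki/Brittany": {"type": "region", "country": "France"},
    "http://en.wikipedia.org/wiki/Georgia_(country)": {"type": "country"},
}
omegawiki_labels = {
    "http://en.wikipedia.org/wiki/Brittany": {"en": "Brittany", "br": "Breizh"},
    "http://en.wikipedia.org/wiki/Kotava": {"en": "Kotava"},
}


def mash(facts, labels):
    """Combine both resources for every Wikipedia link they have in common."""
    shared = facts.keys() & labels.keys()
    return {
        url: {"facts": facts[url], "labels": labels[url]}
        for url in sorted(shared)
    }


merged = mash(dbpedia_facts, omegawiki_labels)
for url, record in merged.items():
    print(url, record)
```

Only the articles that both sides link to end up in the result, which is why every extra link to Wikipedia makes the combination richer.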

In Wikiprotein there is a large amount of medical information available. This information does link to external databases. Given that the medical articles relate to external databases, there is an opportunity to link the data to Wikipedia articles. With Wikipedia articles linked in this way, more specialists will find their way to the Wikipedia medical articles and this in turn will make more enriched information available.

The thing to consider now is how OmegaWiki can benefit from dbpedia. One of the issues is the difference in license. Then again, dbpedia provides its algorithms, and consequently the result does not necessarily fall under the license that dbpedia posts.


Friday, October 12, 2007

Domain names in other scripts

The Washington Post is reporting that ICANN is experimenting with URLs in other scripts. This is good news. For people who do not read or write a language that uses the Latin script, it is a big handicap to have to type, for instance, http://ar.wikipedia.org; even http://ar.ويكيبيديا.org is problematic. In order to enable people, you have to have the whole string in the appropriate script.

From a localisation point of view, this is one of the ultimate challenges. Consider: the .org has two components, the dot (.) and the org. The org needs an equivalent in all scripts. This top level domain or TLD is just one of many. ISO 15924 defines many scripts, and a particular combination that is auspicious in one language can be the equivalent of wtf in another. Choosing these codes is not trivial. There are not only TLDs but also ccTLDs, or country code top level domains.

We have agreed that ويكيبيدي is Arabic for Wikipedia. Wikipedia is very much an international movement. We have agreed that Wikipedia is to be used for the Latin script in our domain name. The question is whether ويكيبيدي will be accepted to represent Wikipedia in the Arabic script and, when there are multiple ways of writing Wikipedia, whether we need to register all these domains.

To make it even more confusing, what will the rules be when it comes to domain squatting? I can imagine that a brand is only registered for one script and not necessarily for another. I wonder what the position of the WMF will be; I am sure that it has not been considered yet.

ICANN is courageous; they are now experimenting with the technical issues, and this will show that it can be done. I expect that the next part will be a proposal on how all the top level domain names are to be "transscripted". Then it will become interesting, because from that moment onwards the Internet will be truly global in its reach and no longer centered on one language or script.
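On the technical side, labels in other scripts are already carried through the DNS in an ASCII form that starts with "xn--" (Punycode). The sketch below uses the `idna` codec that ships with Python to show the round trip for the host name mentioned above. The helper names are my own, and the built-in codec follows the older IDNA 2003 rules, so treat this as an illustration rather than as what ICANN is testing.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Encode each label of a host name into its ASCII-compatible form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(ascii_hostname: str) -> str:
    """Decode an ASCII-compatible host name back into its original script."""
    return ascii_hostname.encode("ascii").decode("idna")


# The Arabic label is kept in its own variable so the dots stay readable.
arabic_label = "ويكيبيديا"
host = "ar." + arabic_label + ".org"

ascii_host = to_ascii_hostname(host)
labels = ascii_host.split(".")

print(ascii_host)                       # only the middle label is converted
print(to_unicode_hostname(ascii_host))  # back to the mixed-script host name
```

Note that the "ar" and the "org" stay in the Latin script; only the middle label changes, which is exactly the part of the problem that is still open.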


Saturday, October 06, 2007

A picture paints a thousand words

I mentioned that in OmegaWiki we are really moving on the localisation. I am thrilled to announce that a lot of work has been done for Spanish, French, Portuguese, German and Dutch. But the icing on the cake is that for two other scripts a start has been made; for Serbian and Georgian the localisation has started.

I think the Georgian script is pretty :)


Now with grammatical gender

In OmegaWiki we now have support for grammatical gender. When we know how to say it in "your" language, we can show it. :)


OmegaWiki vs Semantic MediaWiki

Both OmegaWiki and Semantic MediaWiki provide semantic support. Some people have argued that OmegaWiki should not include particular types of data because Semantic MediaWiki does a better job.

Both OW and SMW are extensions to MediaWiki, so at first glance it seems like a reasonable suggestion. The two extensions, however, do completely different things.

Semantic MediaWiki will shine when it becomes part of a project like the English Wikipedia; when key data that is in the article is marked, it will provide a great improvement in making these facts available. SMW even provides a really rich environment to query the information. It is absolutely great and it is absolutely mono-lingual.

OmegaWiki is at this moment very much a stand-alone application. It does not derive data from anything; it is great at presenting the same data in many languages. This means that when we know that the information exists, we will show it in "your" language.

SMW can export and when OW can import, we have the best of both worlds; that is to say we have the best of both worlds when we can link the resources to each other. As OmegaWiki is not encyclopaedic and does not want to be, it is our stated intention to link to Wikipedia articles. As the SMW is tightly linked to the Wikipedia articles, this may be just the trick.

The suggestion that OmegaWiki would only have the semantic information that is in Wikipedia is incorrect. In Wikiprotein we already have a really rich set of annotations of proteins. This is the kind of information that is not encyclopaedic. It enables scientists to maintain information on "their" proteins. The language of the science of proteins is English; however, many proteins are known in other languages. It is great that all this can integrate, as it brings diverse information together.

Both OmegaWiki and Semantic MediaWiki have their own, different strengths. Within Open Progress we use Semantic MediaWiki for our internal wiki. It works absolutely fabulously. I love both MediaWiki extensions!


Thursday, October 04, 2007

Changes in the user interface

A DefinedMeaning in OmegaWiki can have a lot of data associated with it. China borders on so many countries and seas that it is just a bit much. So it makes sense to bundle certain types of information together and keep them separate. This has a profound impact on what the data looks like. So far I was happy when we had the data on the screen, but now it becomes possible to think about where it makes sense to have the data. I asked Erik if the incoming messages could be at the bottom of the page.

Well, some things seem like miracles and we can have them in five minutes. The problem with the collations is that at this moment it has to be done by hand. This means that it does not scale. With the "borders on" example, however, we have a great showcase of WHY we need to be able to sort these texts and also why they should be Expressions in OmegaWiki like all the other information that we hold.

Really, I could not be happier with the progress that has started to happen. :)


Wednesday, October 03, 2007

Is it a Wiki ?

OmegaWiki is a wiki. Some people, however, disagree; they consider the fixed format conclusive evidence that it is not. As the software is maturing, this argument loses a lot of its lustre. Obviously we have been saying all along that these people are wrong :)

The best argument why OmegaWiki is a wiki is that people can add or change little items one at a time; they are not compelled to do everything in one go. As this is acceptable, our data has to be correct but does not necessarily need to be complete.

With the new terminological support, we can now indicate that a language is a language; all kinds of additional information can be added once you have stated that it is a language. Have a look at French, for instance: the "incoming relations" and the annotations provide a lot of information. At the moment of writing it is not yet clear that French is an official language of France, for instance.

With the expansion of existing classes and of the class attributes, information will become more available and integrated. The best bit is that, as the OmegaWiki-specific user interface can be translated into many languages, people will be challenged to ensure that the right terminology is used in translation.

With the new functionality OmegaWiki became much more of a wiki. It is there for everyone to see, and everyone is cordially invited to have a look, create a user, add some Babel templates and have a go at it.