Friday, April 28, 2006

and now for some other news ..

I read on the BBC news website that the UN is cutting the daily rations of the refugees in the Darfur region in half because of a severe funding shortfall ...

From May, the ration will be half of the minimum amount required for each day.

They are starving and they are the lucky ones ......

Thursday, April 27, 2006


The classic business model for lexicology, terminology and thesauri is ... create a dictionary and keep it to yourself. Protecting this investment is difficult; the entries are facts, so they can only be protected as a collection of data. The classical way of "proving" that the collection is "stolen" is to add some nonsense words or other bogus data to your collection.

The open / free business model for lexicology, terminology and thesauri is ... work together on a stellar resource and make the data available to everybody. With the data available to everybody, the big question is how to achieve the best result. For an open / free resource, the best way is to provide the data in a standard way. The first standard I want WiktionaryZ to use is the "TermBase eXchange" or TBX standard. Given what we do in WiktionaryZ, LMF and SKOS are two other great standards.

Why use standards? Simple, our definition of success is: "when people find a use for our data we did not think of". By providing the data in a standard way, it will be available in a stable way, and as a result it will be easier for people to make use of our data. It will be easier for WiktionaryZ to become a success.

Yes, I love our "competition", but I will love them to bits when they want to be as relevant as we want to be.


Monday, April 17, 2006

What to do with stuff that is good but not standard

What to do if you can get a lot of content that is good in some respects and lousy in others? German uses different characters than the English language, and there are ways to indicate an Ä/ä, Ö/ö, Ü/ü or ß. When you are using German, these characters should be used and not an "ue", for instance. So when should we accept content with German that is non-standard?

I have been thinking about this for a few days. The answer for me became obvious; it is in the database. When we have the MisSpelling table, we can have the community identify the words that should have an umlaut. With proper logic, the representation for our public will be the proper German with the umlauts. But the first thing is to have the MisSpellings.
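To make the idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration: the dictionary stands in for the MisSpelling table, and the names `MISSPELLINGS`, `display_form` and `display_text` as well as the sample words are my own assumptions, not anything that exists in WiktionaryZ. The point it shows is why a table of identified words works better than blindly replacing every "ue" with "ü": a word like "neue" is perfectly good German and must be left alone.

```python
import re

# Hypothetical stand-in for the MisSpelling table: each entry maps a
# non-standard spelling to the form the community identified as correct.
MISSPELLINGS = {
    "Maedchen": "Mädchen",
    "schoen": "schön",
    "Tuer": "Tür",
}


def display_form(word):
    """Return the proper spelling if the word is a known misspelling,
    otherwise return the word unchanged."""
    return MISSPELLINGS.get(word, word)


def display_text(text):
    """Replace every known misspelling in a piece of text, word by word.
    Words that are not in the table (for example "neue") are untouched."""
    return re.sub(r"\w+", lambda match: display_form(match.group(0)), text)


if __name__ == "__main__":
    print(display_text("Das neue Maedchen ist schoen"))
```

The stored content stays as it was contributed; only the representation shown to the public changes, and it gets better each time somebody adds a word to the table.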


Friday, April 14, 2006

Diana posted an answer ..

Diana posted an answer to an entry of this blog. I know she did because I was sent a message to inform me of the fact. I was quite happy with her message; she wants to get in contact with me, but I do not know how to get in contact with her.

I am GerardM .. :) You can find me on the WiktionaryZ site. I am very happy to talk about collaboration. I am happy to remind everyone how we define success for the project: success is when people find an application of our data that we did not consider in the first place.


Tuesday, April 04, 2006

Some of the best things in life are free

Last week I went to the Berlin 4 Open Access - From Promise to Practice conference in Golm, Germany. For me it was an education. The really big thing that I now appreciate even more than before is the extent to which science is prevented from being science because of restrictive practices.

Typically, something can be called scientific when the conclusions are arrived at in a methodical way and the method is repeatable. This is exactly what the Open Access movement wants to bring back. In order to do this, they have to wrest away the restrictions that copyright has put on scientific data for far too long. A lot of bad science is the result of these restrictions, and so is a lot of wastage.

What I learned is that many superb resources are becoming available to the world as a consequence of this movement. Open Access is a rich tapestry with many threads in many fabrics of many colours.

If there was one disappointing thing, it was the lack of awareness of licenses. "It is a CC license" was the answer, and people applauded. Well, Creative Commons has great licenses, but it is a bit like Animal Farm; all CC licenses are Free but some are more Free than others ...