Wednesday, September 21, 2005

Exciting times

At Wikimania, Jimbo Wales said that lexicological content ought to be free. When this happens, all kinds of things start to become possible. Logos is about to make a lot of its content available under the GFDL. Its content has long been freely available; now they are going to license it as well.

The great thing is that this allows for, among other things, the use of this content for teaching languages. Consider: when you create a language exercise, you need words to select from. When these words are available in an electronic resource, you can make selections that include particular vocabulary related to a specific subject matter.
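To make the idea concrete, here is a minimal sketch of selecting exercise words by subject. The word list, the "subject" labels and the function name are all made up for illustration; they are not taken from Logos or from any existing resource.

```python
import random

# Hypothetical vocabulary: each entry tags a word with a subject area.
VOCABULARY = [
    {"word": "bread", "subject": "food"},
    {"word": "cheese", "subject": "food"},
    {"word": "apple", "subject": "food"},
    {"word": "ticket", "subject": "travel"},
    {"word": "platform", "subject": "travel"},
]


def pick_exercise_words(vocabulary, subject, count, seed=None):
    """Pick up to `count` random words that belong to `subject`."""
    candidates = [entry["word"] for entry in vocabulary if entry["subject"] == subject]
    rng = random.Random(seed)  # a seed makes the selection repeatable
    return rng.sample(candidates, min(count, len(candidates)))


print(pick_exercise_words(VOCABULARY, "food", 2))
```

The larger and better tagged the resource, the more specific such a selection can be.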

For both the provider and the user of Free content / Free education, it is a win-win situation. Both need ample data, and with more eyes looking at content and structure, the data can only improve in quantity and quality.

The good news is all of this is happening.


Wednesday, September 07, 2005


On the English Wiktionary they have this wonderful resource called "Entry layout explained". It explains what is needed for the different kinds of content that are available there. I just discovered it because it was mentioned on IRC.

For me it is a treasure trove, because all the content described needs its place in the Ultimate Wiktionary. I have to think through how to add homophones and how to have them as content. At this stage I do not need to concern myself with where they will end up in the actual screens. I just have to have them in the database.

Homophones led to the most drastic change in a long time. I divorced Relations from Meanings. Now RelationType is connected to the Table table. This in effect allows me to use the Relation table in combination with Words as well. This gives me "right" and "rite" as homophones.
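A rough sketch of what this change makes possible, assuming a relation is just a typed link between two items. The class and field names here are hypothetical and are not the actual Ultimate Wiktionary schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    spelling: str
    language: str


@dataclass(frozen=True)
class Relation:
    relation_type: str  # e.g. "homophone"
    source: object      # a Word here; other relation types could link Meanings
    target: object


def related(relations, item, relation_type):
    """Return everything linked to `item` by `relation_type`, in either direction."""
    found = []
    for relation in relations:
        if relation.relation_type != relation_type:
            continue
        if relation.source == item:
            found.append(relation.target)
        elif relation.target == item:
            found.append(relation.source)
    return found


right = Word("right", "en")
rite = Word("rite", "en")
relations = [Relation("homophone", right, rite)]

print([word.spelling for word in related(relations, rite, "homophone")])
```

Because the relation no longer assumes it points at a Meaning, the same structure can serve word-level links such as homophones.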

Tuesday, September 06, 2005


The Wikimedia Foundation had a wonderful offer of a database of eponyms. This is great, and obviously once we have them, we want to include them in Ultimate Wiktionary as well.

Eponyms are, however, a funny thing; they are definitely related to words and not to their meaning. Actually this should be quite obvious, because in German you have "Röntgenstrahlen" while in English it is called "x-ray". One is an eponym while the other is not, and they do share the same meaning.

Thinking about eponyms and how to include them in the database design made me see that I was really wrong about how I had etymologies in the data design. Like eponyms, etymologies are word related and not meaning related; they too are language specific.
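A small sketch of the corrected idea, with the etymology and the eponym stored on the word while two words share one meaning. All names are hypothetical and only illustrate the principle, not the real data design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Meaning:
    definition: str


@dataclass(frozen=True)
class Word:
    spelling: str
    language: str
    meaning: Meaning
    etymology: Optional[str] = None  # belongs to the word, not to the meaning
    eponym_of: Optional[str] = None  # person the word is named after, if any


xray = Meaning("penetrating electromagnetic radiation")

words = [
    Word("Röntgenstrahlen", "de", xray, eponym_of="Wilhelm Conrad Röntgen"),
    Word("x-ray", "en", xray),
]


def eponyms(word_list):
    """Return only the words that are named after someone."""
    return [word for word in word_list if word.eponym_of is not None]


print([word.spelling for word in eponyms(words)])
```

Both words point at the same meaning, yet only the German one is an eponym, which is exactly why this information cannot live on the meaning.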

The funny bit is that many people have looked at it and nobody noticed. I think it is as with so many things: the best designs do not survive reality unscathed. :)


Sunday, September 04, 2005

Open Office is LGPL

Open Office is nowadays LGPL. This matters because it is a major shift for Sun Microsystems, and Open Office is seen by many as the major free office suite. Making OO LGPL makes it much easier to share code.

Sun has its own Computer Aided Translation (CAT) tools. These open language tools are written in Java and are licensed under the CDDL. It would be great if they could share their code with the OmegaT CAT tool, which is also written in Java.

My point is that there is such a concentration of effort and power in the commercial CAT tool business that there is a genuine monopoly. It does not make sense for all the Open/Free CAT tools to work separately. To stimulate cooperation, we hope to make a success of the reference tool for a translation glossary. The best thing that could happen is that some serious attention is given to more cooperation.