Monday, January 23, 2006

Standing on the shoulders of giants

At a conference I went to in December, I received a promotional CD for the "Referentiebestand Nederlands" of the TST-Centrale. Yesterday I spent a considerable amount of time digesting the information on the CD. I learned a lot, particularly about the power of having content in a database that can be used in many different ways. The "referentiebestand" is a corpus-based dataset of 45,000 expressions with morphological, syntactic, collocational, semantic and pragmatic descriptions. (Many of the articles on these terms are considered stubs in the English Wikipedia; please help describe these subjects better.)

I have learned a few things. I was right in thinking that the position of words in a sentence is relevant. The TST-Centrale uses a fixed notation for it; I wonder how universal it is. The CD explains that people who use this content can select the information they want to use, and that this is part of the secret of its success. We hope that having WiktionaryZ available as a database will serve in a similar way.

Reading back the first paragraph, I feel like Tom Thumb. Every second noun is something to look up, and it was hard work writing it. The great thing is that when you have it in front of you, when you see it demonstrated, it does make sense. When we collaborate on content and structure, when we make WiktionaryZ something that is useful because it has an application, we will have giants who help us little people make progress, who allow us to stand on their shoulders and who move forward with us.
