Wednesday, July 20, 2005

Sound and sign

I am progressing with the data design for Ultimate Wiktionary. The current challenge I am facing is how to deal with both oral languages and sign languages.

The easiest part for sign languages was the realisation that a movie is the "Pronunciation" of a signed word. This made me change the field name from "Soundfile" to "Mediafile". More complicated is the fact that there are some four written sign languages. These I would really want in the Ultimate Wiktionary. The question is whether they have, as Chinese does, their own characters in Unicode that can be stored as UTF-8. If they do, I do not have to do anything. It would just work as designed.
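A minimal sketch of what the renamed field could look like, assuming a simple record per pronunciation. The class names, the `media_kind` field, the language labels and the file names are all hypothetical illustrations, not the actual Ultimate Wiktionary design; the only thing taken from the design is that one "Mediafile" field serves both sound and movie.

```python
from dataclasses import dataclass
from enum import Enum


class MediaKind(Enum):
    AUDIO = "audio"  # a recording of a spoken word
    VIDEO = "video"  # a movie of a signed word


@dataclass(frozen=True)
class Pronunciation:
    expression: str        # the word or sign as stored in the database
    language: str          # hypothetical language label
    media_file: str        # was "Soundfile"; now holds any media file
    media_kind: MediaKind  # assumed extra field, not part of the original design


# The same record type covers an oral and a signed pronunciation.
spoken = Pronunciation("water", "English", "water.ogg", MediaKind.AUDIO)
signed = Pronunciation("water", "a sign language", "water_sign.ogv", MediaKind.VIDEO)
```

Whether the kind of media needs to be recorded at all, or can simply be read from the file itself, is left open here.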

I have realised that languages like Arabic and Chinese are formal written languages. There are many people who have a spoken language that is grammatically and syntactically (does this word exist?) different from the formal written language. So when I record pronunciations, how do I deal with those? How do I register those languages? How do I indicate that these languages use Chinese or Arabic as their written language?

My working theory for the moment is that there may be transcriptions for those languages. Certainly when they have been noted by someone who has some authority, these can be used to link the essentially oral words with something that has characters. These characters are needed at this moment to make it possible to enter the words in the database. Now the question is how to relate them to the written language. At this time it is just a matter of having the written word as a translation; in effect this is correct.
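The working theory could be sketched roughly as follows: an oral word is entered through its transcription, and the formal written word is attached to it as an ordinary translation. This is only an illustration under assumptions. The `Expression` and `Dictionary` classes, their fields and the language labels are hypothetical, and the Cantonese example (the Jyutping transcription "keoi5" against written Chinese "他") is chosen just to show the shape of the data.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Expression:
    spelling: str                   # characters that can be entered in the database
    language: str                   # hypothetical language label
    is_transcription: bool = False  # True when the spelling transcribes an oral word
    source: str = ""                # who or what scheme the transcription comes from


@dataclass
class Dictionary:
    # Each translation is stored as an unordered pair of expressions.
    translations: list[tuple[Expression, Expression]] = field(default_factory=list)

    def add_translation(self, a: Expression, b: Expression) -> None:
        self.translations.append((a, b))

    def translations_of(self, expr: Expression) -> list[Expression]:
        """Return every expression linked to `expr`, in either direction."""
        found = []
        for a, b in self.translations:
            if a == expr:
                found.append(b)
            elif b == expr:
                found.append(a)
        return found


# Illustrative data: an oral word entered via a transcription,
# linked to the formal written word as a plain translation.
oral = Expression("keoi5", "Cantonese (spoken)", is_transcription=True, source="Jyutping")
written = Expression("他", "Chinese (written)")

dictionary = Dictionary()
dictionary.add_translation(oral, written)
```

Treating the link as a translation means no special relation type is needed for now; a dedicated "is written as" relation could replace it later if the design calls for one.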

