Thursday, May 31, 2007

Uploading texts to Wikipedia

A friend of mine had worked on a text for Wikipedia. It was a great text, and I urged him to upload it. This did not happen. I tried to be polite and did not prod him aggressively. Almost a month later, I asked him about it.

His complaint was that it did not work. He had tried many times, only to find that the file was not of a type that could be uploaded. It finally dawned on me that my terminology was the problem; had I said that he had to copy and paste the article, everything would have been fine.


Sunday, May 27, 2007

A license for SignWriting

The SignWriting script is the product of a process that has been under way for over 30 years. It is a process that has been gaining more and more momentum.

The script has been available online for everyone to use from the start. When people had questions or needed support, it was always provided. People are learning to write their sign language all over the world. The last time I spoke to Valerie Sutton, the creator of the SignWriting script, it had to be short because people from Switzerland wanted to talk to her :)

As the SignWriting movement is ancient in Internet time, it started before licenses and copyright were considered. People needed to be able to use it, so it was made available. In a reply to a previous post on the subject of SignWriting, David Gerard asked about the license and the copyright and asked if there is an organisation behind SignWriting.

Yes, there is an organisation, the "Center for Sutton Movement Writing Inc.", which is a USA nonprofit, tax-exempt educational membership organization. As I understand it, the copyright is with Valerie Sutton, and she is considering what license to use. I have been discussing this with Valerie, and the license that we are considering is the SIL Open Font License.

The big question she asked me to ask David and my readers is: "If all SignWriting symbols were under the SIL Open Font License (OFL) would you feel free to use the symbols?"


Sunday, May 20, 2007

More on scripts, fonts and SignWriting

There were two bits of interesting news this week on scripts and fonts:
  • The Chinese script is much older than was assumed; it is some 8000 years old. This makes the Chinese script some 3500 years older than previously thought.
  • Red Hat made available the "Liberation fonts"; these are fonts that allow for the replacement of the proprietary Microsoft fonts. The lack of such fonts is one of the impediments to adopting Linux. It is also one of the more visual aspects where Microsoft shows that it does not care about interoperability.
The Chinese script is old, it has had a long evolution and it is a living script. It is very much in use, and one of its important aspects is that it brings together people who speak different languages. The script is very much what unites the Chinese. It is not realistic to replace the Chinese script with the Latin script because the latter, being based on sounds, will not serve in the same way.

The SignWriting script is young, it represents a revolution in the signing world and it has yet to be adopted by many signing communities. It is different from other written representations of sign languages because it is actually used day to day. As a script, it is different from Chinese because it represents movements, like the Latin script represents sounds, while a Chinese character represents in essence a concept. For this reason the SignWriting characters will mean different things in different sign languages.

The Liberation fonts make it possible to replace the Microsoft fonts without a need for reformatting the text. When you analyse this, it means that Unicode characters are now available in two interchangeable sets of fonts.

SignWriting does not have Unicode characters and it does not have fonts. The symbols that it uses are complete for most sign languages and some missing characters for the Ethiopian Sign Language are being added at the moment.

One really powerful argument for why SignWriting is so important is that it helps deaf people to learn a written language that is foreign to them. English is in essence foreign to a person who grew up with American Sign Language; the written English language is not connected to the everyday language of the student. When you learn a second language, you learn the shared concepts quickly. Now in order to know what these shared concepts are, you use a dictionary. Without a dictionary, without a written representation of the primary language, it is extremely hard to learn. It requires a really well trained memory.

It is exactly because deaf people have to live in a world based on sounds that the bridge that the written word provides is so important. There is anecdotal evidence that kids who learned to write their sign language are better able to learn the written form of the spoken language that surrounds them.

It is for all these reasons that SignWriting emancipates the deaf. It will emancipate them because as a group they will become better able to communicate in their own world. A world that is both signing and speaking.


Thursday, May 17, 2007


Over the last month I have become interested in SignWriting. It is a fascinating subject for someone who is passionate about dictionaries. Given that OmegaWiki aims to include all words of all languages, it had bugged me for a long time how to include sign languages as well as written languages.

SignWriting is a recognised script; its ISO 15924 code is Sgnw. It is only recently that it became possible to write a sign language grammatically correctly using computer programs. It is written from top to bottom and its characters are not included in Unicode. There are some 30,000 characters at the moment and they are being converted from bitmaps into SVG.

I have been watching an instruction video on SignWriting, and I have watched a video of some kids singing in sign language. When I see these people sign, I cannot even distinguish the individual signs; it goes way too quickly for me. It is quite something. It makes me realise that I am dumb when it comes to signing and illiterate when it comes to writing sign languages. My redeeming quality would be that I am willing to be informed about it.

As the aim of OmegaWiki is to include all languages and as sign languages are as relevant as any other, I hope that the signing communities and particularly the SignWriting community will work with me to achieve this goal. Along this road there is the Unicode challenge and the challenge to get MediaWiki to support SignWriting.

This is likely to happen as SignWriting fulfils its promise and becomes the universally accepted script for sign languages.


Tuesday, May 15, 2007

Wiktionary quality issues II

In a previous blog I wrote about the Russian Wiktionary being ostracised by the Polish Wiktionary. There was not enough content and consequently they did not want to have links on the Polish Wiktionary.

Yesterday, I blocked a bot run by an Arabic Wiktionarian. I blocked it because it did not comply with the way interwiki links are created; it also did not have a bot flag. A link is only created when the words are exact matches. The problem at the Arabic Wiktionary is that they have used a bot to import many English words, and these are all in upper case.
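
To make the rule concrete, here is a minimal sketch in Python of what "exact match" means for interwiki links. It is not the code of any real bot; the function name, the language codes and the page titles are made up for illustration. The only assumption taken from the rule above is that titles are compared as exact, case-sensitive strings.

```python
# Hypothetical sketch: an interwiki link is only valid when another
# Wiktionary has a page with exactly the same title (case-sensitive).

def interwiki_targets(title, titles_by_wiki):
    """Return the codes of the wikis that have a page with exactly this title."""
    return sorted(code for code, titles in titles_by_wiki.items() if title in titles)


# Made-up sample data: the "ar" wiki only has the upper-case imports.
titles_by_wiki = {
    "en": {"house", "House", "water"},
    "ar": {"HOUSE", "WATER"},
    "nl": {"house", "water"},
}

print(interwiki_targets("house", titles_by_wiki))  # ['en', 'nl']
print(interwiki_targets("HOUSE", titles_by_wiki))  # ['ar']
```

With data like this, the upper-case pages never match the lower-case pages elsewhere, so a bot that links them anyway creates links that do not follow the rule.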

I have blocked the bot because what it does is technically wrong. I do run my own bot to correct the "damage". The biggest damage, however, is in the lack of communication between the Wiktionary projects. It is for this reason that I am of the opinion that many of them, though courageous efforts, are failures.


Saturday, May 12, 2007

Looking at Encyclopedia of Life differently

There have been a lot of comments on the Encyclopedia of Life in the Wiki community lately. All of them miss the most important point. The point is made by the Encyclopedia of Life itself; it will provide "aggregation and will use mash-up technology and Wiki style editing and accumulation of content".

Key in all this is that the Encyclopedia of Life is not only Wiki style editing; it also provides for aggregation and mash-up technology. This is the one thing that Wikipedia and Citizendium alike are incapable of. With the functionality MediaWiki provides, you are restricted to copying data into the article, and once that is done, the data loses any practical connection with its origin. Because of this functionality, these projects are reduced to being sources of information.

When you look at the examples on the Encyclopedia of Life website, you find that they take information from many sources and present it together. In this way they provide a composite view of the subject under consideration. The data can remain fresh because, with this methodology, a change in one of the sources can be reflected in the composite view.

One of the exciting things they provide in their mash-up is the connection to old literature. This literature is extremely relevant to taxonomy, as the oldest name that described a taxon is the one to be used to describe it. The consequence is that many names that were valid once are not valid anymore. It is interesting to learn how they will manage the linking of old valid names to the newer names. This has other applications as well: when an old paper mentions aspects of a species, it may mean that there is no mapping. It is then of relevance to consider the mapping to a subspecies or a different species that is referred to in the old literature.

Mapping the historical names of taxa in time is a hell of a job, but an important job because it opens up the understanding of the old literature. For plants, the IPNI resource is of immense value. If anything, I hope that IPNI will be or become part of the mash-up.
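
As a rough illustration of the "oldest name wins" rule, here is a small Python sketch. It is only a sketch under strong assumptions: it ignores the many exceptions in the real nomenclature codes, it is not how the Encyclopedia of Life does it, and the names and years are invented placeholders, not real taxa.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PublishedName:
    name: str  # a name as it appears in the literature (placeholder values below)
    year: int  # year of publication


def map_to_valid_name(names):
    """Map every published name of one taxon to the oldest published name."""
    valid = min(names, key=lambda n: n.year)
    return {n.name: valid.name for n in names}


# Invented example: three names that are assumed to describe the same taxon.
names = [
    PublishedName("Exemplum novum", 1890),
    PublishedName("Exemplum vetus", 1758),
    PublishedName("Exemplum medium", 1820),
]

mapping = map_to_valid_name(names)
print(mapping["Exemplum novum"])  # Exemplum vetus
```

The hard part is not this lookup but deciding which names describe the same taxon in the first place; that is exactly the information the old literature has to supply.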


Tuesday, May 01, 2007


adarente is an Italian word. This word was defined in OmegaWiki. It has multiple meanings. One of these meanings, not the most common one, became a DefinedMeaning.

To understand this DefinedMeaning from its definition, it is extremely relevant to know that "adarente" is indeed the expression it was defined for. When you do not know this, you will not appreciate that "conforme" is a synonym.

The whole notion of the DefinedMeaning is extremely important to OmegaWiki; it is fundamental. It is sad that, while we do record the word, the Expression, that goes with the DefinedMeaning, we still do not make this visible.