Wednesday, July 25, 2007


WiktionaryDev was pointed out to me and there is much to like. I really like the fact that they brought some tables to the party. The system now knows when there is content in a language. I experimented a bit and added both the word Cherokee and a word in that language.

Superb is the possibility to indicate that two labels refer to content in the same language; this brings the content of Greek and Greek (modern), for instance, under the same heading. Having indexes for each language is really powerful; it allows people who are interested to work on one specific language.
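To make the idea of labels sharing a heading concrete, here is a minimal sketch in Python. It is not how WiktionaryDev does it; the table `LABEL_TO_HEADING` and the function `build_language_index` are names I made up for illustration, and the entries are placeholders.

```python
from collections import defaultdict

# Hypothetical alias table: each label maps to the heading it is shown under.
LABEL_TO_HEADING = {
    "Greek": "Greek",
    "Greek (modern)": "Greek",
    "Cherokee": "Cherokee",
}


def build_language_index(entries):
    """Group (label, word) pairs under one heading per language."""
    index = defaultdict(list)
    for label, word in entries:
        # Labels that are not in the table simply keep their own heading.
        heading = LABEL_TO_HEADING.get(label, label)
        index[heading].append(word)
    # One sorted word list per heading gives a simple per-language index.
    return {heading: sorted(words) for heading, words in index.items()}
```

With a table like this, anything filed under "Greek (modern)" ends up in the same index as "Greek", while a language the table does not know still gets an index of its own.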

The WiktionaryDev functionality builds upon the standards that were adopted in the en.Wiktionary. It is absolutely fabulous that the hard work of standardising Wiktionary results in all this new functionality. :)


Tuesday, July 24, 2007

Licenses are often not even a nice pain

Many people produce content and many people collaborate on such content. They have a reason to do so: they want to make their work available to the world, so they select a license. Often, this content is produced as an extra to something else.

You would create a spell checker under the GPL, but it is incompatible with the GFDL, the CC-by-sa ... You would create a machine translation engine under the GPL ...

The best reason for selecting a Free or Open license for software is because you want to ensure that the freedoms will remain available to the people who receive the programs downstream. In essence it is a defensive measure using copyright law.

Facts are different from programs. You cannot copyright facts. You can copyright collections of facts. Large collections of facts are available under a Free / Open license and these are incompatible with other Free / Open licenses. This means that you either take these collections and just use them, or you get into long discussions about licenses. Either option has its issues.

When you get into discussions about licenses, you have to indicate that the license does not liberate the facts for use in another setting. People get really grumpy, even upset, when they are told that their favourite Open or Free license is the issue.

The worst thing happens when you are asked to cooperate on a project. A project that has obvious merits. You are asked to help out because you know the subject matter. You are asked to help and comply with their license. You are asked to collaborate for free but you are not permitted to use your own work. When you then tell these people that you do not want to collaborate because their Free / Open license is an issue, you first get stunned silence and disbelief and then you get the same old religious arguments why their license is best. To me licenses are only great if you believe in the copyright system. I believe the copyright system is evil.

In OmegaWiki, we make our community data available under a combined license, CC-by and GFDL. In this way we reach out to both the Free and the Open communities. The people who use our data downstream can pick either license or, when their license is more restrictive, they can even relicense our data. Our data will remain Free / Open. People can come back to us and improve and append our data and from OmegaWiki it will be available to the people who use the data downstream.

We are happy to cooperate with anyone. We are happy to collaborate on any database of facts but we have to insist that we work in our community environment. Facts need to be liberated and be available to all. This notion I learned from the people I met in the Open Access world.


Monday, July 23, 2007

Happy news

This article on the BBC News website made me smile. It is excellent news and it indicates a big milestone towards a Free Culture.

Congratulations to all the people who are involved and make it happen :)


Sunday, July 22, 2007

Harry Potter and the Deathly Hallows

The number of e-mails on the Wikimedia mailing lists has been somewhat lower the last few days. I expect that many, like myself, have enjoyed or are enjoying the latest tome by J.K. Rowling. It is a great time sink. If I had been smart I would have saved it for the plane to Taipei... Resistance was futile.


Friday, July 20, 2007

Ishi could have spoken Coos

According to a study from 1962 there were two people in Oregon who spoke Coos. It is not an unreasonable assumption that in 2007 nobody speaks Coos any more. Suppose that Ishi had some profound things to say; his message was recorded and it was recorded on a wax cylinder in 1911.

So let us consider this situation. Ishi spoke Coos, his language is now extinct and he used the technology of the day and left a recording of his message. We do not speak the language any more and we have problems with the technology less than a hundred years later. This wax cylinder is owned by the Phoebe Hearst Museum of Anthropology and we might be lucky; there could be a complete set of annotations including a translation of Ishi's words. If we are lucky, we can listen to the Ishi recording 96 years later and have some understanding through the possible annotation; a window is opened on our past.

If Ishi had spoken his message in English, we would consider it to be easier for us to understand. The message would still be from a different culture; it would still require the same annotations for us to understand it properly. Now Bill, who was also living in Oregon, had an equally profound message. Suppose Ishi and Bill knew each other; Ishi's background would be Coos, and Bill's background would be Welsh. Our ability to understand their true message would depend on our understanding of that time. Without sufficient understanding of the culture, the profound message of either Bill or Ishi will not reach us. It would be an artefact of a museum, an artefact to be studied.

For Bill and Ishi it might have been of great significance that their profound message was recorded. For Ishi it would be natural to communicate in Coos; it was his language and it is not unlikely that it was only recorded and annotated because it was Coos. Bill's message was not recorded; his English and his message were not considered of similar significance.

Much of what is said and done is done only for the present moment. My message is written in English because it is the best way for me to convey my message. Many of the messages of Sabine are in Neapolitan when she reaches out to that particular audience. My message is written on Blogger; I do not spend much thought considering its format. If at all, it may be saved for posterity thanks to the effort of the Internet Archive. If people cannot read or understand it in 100 years' time, I do not really mind as they are not my intended audience.

Much of what we do on our Wikis is for our current audience. Our content is transient, its shelf life is limited. We aim to bring information to our public and we aim to do this now. We provide Free content and when a wealth of content is available in a format like Flash, we should imho provide it because we aim to provide the best possible service now. With the continued development of Gnash, I feel reasonably safe that a future generation will still be able to experience some of what our day and age is about. The stuff that I really enjoyed.. well that is another story.. some people try to preserve it..


Thursday, July 19, 2007

Ishi meets IRENE

Ishi was a Native American who was recorded in 1911 on a wax cylinder. It is unique authentic material and reproducing the sound can deteriorate the wax cylinder. There are many such recordings and for those who have an interest in such things the invention of IRENE is absolutely fabulous.

IRENE is a method that instead of a needle uses a camera to find all the little grooves in the track. With this information it is able to emulate what a needle would find and play the music. There are many priceless recordings and it will be great when they are digitised. This will ensure that they are less likely to get lost.

Obviously, given the age of this material, it is all in the public domain.


Wednesday, July 18, 2007

Today's word of the day: hypoxia

An article on the BBC News website informs you about the dead zones of the Gulf of Mexico. There are several reasons why I like the word hypoxia. The first is that this word is not part of the GEMET thesaurus while it is a genuine term that deals with the environment. OmegaWiki does know the term and as such it shows that an interested community can add value to a resource that is considered to be authoritative.

The sad thing about hypoxia is that it is preventable. Hypoxia was a phenomenon that happened in the Wadden Sea as a result of the pollution that came in from the Rhine. As a result of cleaning up this river, the hypoxic areas or dead zones have diminished and the areas where seagrass is growing are on the rise again. The same could happen with the Mississippi and the Gulf of Mexico. The main thing required is the prevention of nitrate and phosphate getting into the waterways. It is well known how this can be done; it just takes the political will to make this happen.

My interest in this subject can be understood from the fact that I wrote most of the articles about the freshwater fish of the Benelux on the Dutch Wikipedia.


Sunday, July 15, 2007


There is an interesting read about copyfraud on Slashdot; it refers to a paper published by the Social Science Research Network. I have been reading this paper now for some time and it does a good job of explaining the problems with copyright claims on works that are in the public domain. The issue is very much that, as the paper explains, nobody cares about what is technically an offence.

Organisations that deal in copyfraud are legally fraudsters. While reading this paper one question that comes up for me is: how can industries that do not themselves respect the law, on such a massive scale, expect their customers to respect the law?

The paper describes the many ways in which copyfraud prevents people from using material that is in the public domain. There have been many threads on the WMF mailing lists about this subject and it is quite clear that our projects would benefit enormously from a strengthened public domain.

This paper does address the issue of how the public domain can be strengthened, it mentions among other things that courts ruled that those with dirty hands because of the assertion of copyright on public domain material were denied copyright enforcement.

Industry protects its copyright through organisations that represent it. With an industry massively breaking the law by claiming copyright where it is not theirs to claim, the moral footing of their representatives is undermined. There are cases where the court denied copyright enforcement to copyright owners with unclean hands. Many industries have engaged in the implementation of digital rights management. These implementations do not take into consideration the fact that copyrights expire. Consequently I would argue that these implementations are broken by design and that they are not a legally correct implementation of copyright restrictions. I think it could also be argued that, given the massive copyfraud perpetrated by the industry, copyright enforcement by organisations representing the whole of an industry should not be allowed because of the fraudulent behaviour of these industries.


Saturday, July 14, 2007

Deletions of pictures

At one time I was a prolific writer on Wikipedias. This was when there was no Commons yet. At the time all pictures had to be present at the individual wikis in order to be seen. In 2004, we won the Prix Ars Electronica and I sent an e-mail off to the organisation behind this prize; we got permission to use their logo with our articles. There was a restriction that it was to be used with an article about the prize or its winners. A reasonable restriction because we are talking about their logo, their trade mark.

I have had two instances now of people insisting on deleting this logo because it is not Free. It will probably be deleted from the Dutch Wikipedia, where the information about the permission is documented, and after this it will be deleted from the English Wikipedia because the reference for the permission will no longer be readable.

There are several issues that I want to raise. Commons is, as far as I am concerned, not as good as it could be because it does not have a way to deal with the restricted use of logos. Organisations allow the use of their logo within the limitations that they have to insist on because it is part of a trademark.

Permissions given on one project are often referred to on another. This is not considered when the licenses of pictures are evaluated. This is understandable because there is so much material. Much of the older material does not have all the templates and doodahs because these did not exist at the time. The consequence is that much is lost because of this insistence on compliance with later policies.

For the photos I made, I do not care too much if they are kept on projects or not. Everybody can make a photo of a building, an animal ... The problem is with material that is of benefit to our projects that we do not consider because it requires licenses that are by necessity restrictive. Restrictive in a way that even the organisation that puts the restriction on cannot help.

One additional benefit for accepting a license for logos and stuff would be that as a result our own logos would no longer have a "status aparte" on Commons.


Friday, July 13, 2007

Proposal: localisation sure but enable languages first !!

The Pan African L10N had a conference in Morocco in February. I learned about it through a blog I read today. I am really happy with the progress that is reported on of more languages being supported. I am however coming more and more to the conclusion that there is a need for a stage before actual localisation that will provide a service to the bilingual people of a language.

At this stage, the support of a language is very much an all or nothing affair. There is a localisation or there is nothing. This is not how it needs to be. When a language is known to exist, the lowest level of support for that language is the acknowledgement that this language exists. This is currently not done, and I think it is a missed opportunity.

The first thing to consider is what languages and linguistic entities exist and how you support them. This is a surprisingly complex question. Languages are recognised in the ISO 639 standard. There are several versions of the standard and not all languages have a script that is supported in Unicode. Even when a script is supported in Unicode, it does not mean that an associated font is available for a language. The consequence of these two points is that a subset is needed on a computer. On the other hand the currently recognised versions of the ISO 639 do not recognise orthographies or dialects or other entities that make a difference to how documents are to be supported.

This is not an issue the organisations that develop and localise software want to tackle. For them this is a distraction. Deciding what linguistic entities can be supported is something that is best addressed by one organisation that exists to deal with issues like these. The World Language Documentation Centre (WLDC) is that organisation. Through its association with Geolang and because of its board of experts in many of the relevant fields, it is already in a prime position to do the research that goes into the development of the ISO 639-6.

With the WLDC and Geolang able to provide researched and verified information about linguistic entities that can be safely supported, it is then up to the applications to at least acknowledge the existence and allow a user to create content in that language. As more information becomes available, spell checkers can be added specific to that linguistic entity. In this way slowly but surely the functionality grows without the need to first localise the application.

In a way this is a solution for a "chicken and egg" problem. This problem is solved when you think of it in an evolutionary way. First there was the egg, the support of the language, and then the chicken evolved, the localisation of the application.


Thursday, July 12, 2007


I have postponed it long enough: I need to be able to pay abroad and I have to be able to pay to and receive money from outside of Europe as well. To me this is not a straightforward affair. In Europe it is not customary to pay by credit card, and transferring money within the EU costs nothing or it costs as much as a national transfer.

Credit card payments cost money, PayPal payments cost money. I would expect that credit card payments are more expensive and it seems customary that you pay your PayPal bill with a credit card as well. I have the option to pay a PayPal bill directly with my checking account. Intuitively this seems to be the option that is less expensive. So I have opted for this.

The question I am left with is how PayPal compares with transferring money using traditional banking methods. As PayPal has existed for some time now, I am sure it will have had an effect on the banks. PayPal benefits from ubiquitous Internet and highly automated procedures; these benefits are available to banks as well...

One of these days I will know the answer to these questions.


PS Yes, I am going to Taiwan :)

Wednesday, July 11, 2007

Finally, Ariana may stay

I have known Ariana for many years; I see her every so often. She is a nice bright woman, a registered nurse and a university student. She is also from Kosovo and escaped the atrocities of war and fled to the Netherlands.

When Ariana got her nursing diploma she was not allowed to work; she was a refugee. After many years she finally got her permit to stay in the Netherlands. Many of the people who know Ariana have been appalled by the, in our eyes unjust, policies of the Dutch government. My mother was not the only one willing to provide her with a sanctuary and sabotage these loathed policies that deal with refugees.

I am happy that Ariana can stay and now has more of a future. It is a shame that she has had to suffer all these years of uncertainty and doubt that have added to an already troubled past.


Sunday, July 08, 2007

RTFM, read the fine manual

SignWriting is the script that gives sign languages the potential of writing. It is way beyond an experiment, it is used. It is used in daily life, one of the telling things is that they have message pads with layouts that indicate who called, who left a message.

There is a mailing list and in the last week there were two things that really got my attention. The first one said that not all the symbols that make up the total character set are to be used for one language. There are for instance characters specific to the Italian and the Ethiopian sign language. This led to the observation that there is a need to identify the characters that are used for a particular language. This is similar to, for instance, the Latin script where only so many characters are used for a language.
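Identifying the characters used for a particular language can be pictured with the Latin script. The Python sketch below is only an illustration: the inventories in `ALLOWED` are ones I made up and they are not authoritative, and the function name `unexpected_characters` is my own. A real inventory, for SignWriting or any other script, would have to come from the people who know the language.

```python
# Hypothetical per-language character inventories (illustrative, not authoritative).
BASIC_LATIN = set("abcdefghijklmnopqrstuvwxyz")
ALLOWED = {
    "en": BASIC_LATIN,
    # Basic Latin plus e-acute and e, i, o, u with diaeresis, as an example.
    "nl": BASIC_LATIN | set("\u00e9\u00eb\u00ef\u00f6\u00fc"),
}


def unexpected_characters(text, language):
    """Return the letters in text that are outside the inventory of language."""
    allowed = ALLOWED[language]
    found = {ch for ch in text.lower() if ch.isalpha() and ch not in allowed}
    return sorted(found)
```

The same text can be fine for one language and contain unexpected characters for another, which is exactly why the subset has to be recorded per language and not per script.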

In the same way I appreciate how this latest amazing story unfolds; there is a lot of documentation on how to use the SignWriting characters... People do not really read it. This is of course to be expected but it brings these wonderful moments where people find that there is more to it. That, like for any other language, the basic tenets have to be really understood. That you can go find a character, a movement and it will be readable but it might prove not to be the best one. What makes it so nice to read is the wonder and delight these people express that such great documentation exists.

I am impressed with SignWriting and I believe that it would be good if serious funding found its way towards the further development of SignWriting. To sum up some points why: kids who learn to write their first language first have an easier time learning the dominant written language that surrounds them, and it gives the deaf people of the whole world the opportunity to express themselves in their own language. It allows for a better preservation of so many cultures that do not have anything but video.


Saturday, July 07, 2007

Another BBC journalist comes home

Another BBC journalist came home. I was so happy to write about Alan Johnston coming home. In the same week Frances Harrison is coming home from Iran. Frances was the bureau chief of the BBC in Tehran, in her last presentation she writes how people are afraid to talk to the BBC because they fear for their liberty, their freedom. Frances writes how she struggles with her conscience because she has to justify her need as a journalist with the danger that goes with it.

I am moved. I know several Iranians; they are wonderful people. I have been to concerts of Persian music; it is enchanting. I am saddened because with this continued breakdown of the free exchange of ideas and with the deterioration of the freedom of the press, this wonderful country, these wonderful people may become painted even more as an enemy of our culture.

Frances writes: "The Islamic system of government has deliberately erased much of what was Persian culture and it is only by looking hard that you can catch glimpses of the past." It is self righteous politicians that try to make the world in their own image. It is self righteous politicians that do not allow themselves and their own actions to be judged in the same way as they judge others.


Thursday, July 05, 2007

Who cares about the process ...

A lot of "drama" can be observed in how the elections for the WMF board are run. There are many ways in which the process can be seen. The thing that bothers me most is that even though the process was defined prior to the actual running, the way it is run has changed a lot during the process.

The best example is the effort of Gregory Maxwell to "get the votes out". The way he has done it, is really American. Bringing people to the registration office and then actually getting them to the voting booth is done by all US-American parties. This is accepted in that system while it is not done in the systems I am familiar with. Greg invited everyone from the English language Wikipedia who had not voted yet to vote.

As it is so clear that the process is allowed to change while under way, I invite you to Danny Wool's question list where he is asked why he has not heeded the call for him not to stand. The reason why he was not prevented from running is that the process of the vote was considered to be defined. Danny proved unwilling to answer the question.

The process of the vote is that once the results are known, the board is to pass a resolution announcing the result. As a result of the previous vote, the board nominated three people instead of the one the vote was called for. So I agree with our "drama queen"; the board can repudiate results it does not like. I am known to be of the opinion that Danny should not be a candidate in the first place as his behaviour makes his hostility towards other board members quite clear to me. Also, given the statutes of the WMF, if Danny were to be elected, he can be removed when my misgivings about him prove to be correct.

What we have observed in these elections is that many aspects of the process have been changed while it was under way. We have observed that our Foundation has become more politicised, that this time around projects and languages are being pitted against each other.

In our Wiki world the notion of "so fix it" is accepted. In the mail of Jan-Bart it was clear that the board did not change the process re Danny's candidature because the process was considered to be under way. Given the many changes to the process that have happened during the process, I think it is fair to ask Danny again, WHY DO YOU STAND !!!

I am sure the answer will remain the same; because he does. It is the best answer he can give as there is no good answer for him, given the compelling arguments why he should not stand.


Wednesday, July 04, 2007

Alan Johnston is free

The best news of today is the release of Alan Johnston. Alan is a reporter for the BBC who was kidnapped by some criminals and held hostage for 114 days. It is wonderful news because at least one reporter is alive to tell the tale.

When you read what UNESCO has to say about Freedom of the Press, it is sickening to see how many journalists have been killed for keeping our society Free and informed. When you are part of the Wiki world like I am, many of the notions of what makes good journalism, of what it takes to provide people with good information, are the same.

Today we celebrate the release of Alan Johnston, it is wonderful that he is Free. The price of the kidnap may be that there will be no journalist of Alan's stature left to report from places like Gaza. The price may be that we will not be informed of what is happening in many parts of the world. The price may be that we will only know our point of view without the possibility of getting something like a neutral point of view.

Tuesday, July 03, 2007

An organisation gets the governance it deserves

The Wikimedia Foundation is selecting three members for their board. You may also have noticed that there is a lot of anxiety about what the result may be. There is a distinct feeling that the result may be biased. Of particular interest will be the perception of the English language Wikipedia. The perception of some seems to be that the WMF is US-America centred; the feeling that I get is that some Americans feel that they are losing their influence over "their" project, their Foundation.

So let us consider all this for a moment. The English Wikipedia is with its 1.8 million articles only 0.2 million articles bigger than Commons and is likely to be eclipsed by it in the near future. It used to be that it was growing faster than every other project or language version; this is no longer true. Over time the relevance of the English Wikipedia grew: its quality is great, its coverage is broad, it is what people turn to when they need a first impression on a subject. This has not changed; the English Wikipedia is splendid.

So why this anxiety.. it is not as if the English language Wikipedia is likely to become less dominant in the perception of what it is we do. Many issues of other projects however have not really been considered. There are nasty situations that do not get attention, issues about POV, issues of good faith, issues with living people.

The real issues have little to do with the English language Wikipedia; they have everything to do with the example it has given. It is now for our Foundation to make this promise come true for the other projects and language versions. There are basically two scenarios: our Foundation is allowed and enabled to deal with these issues, or the Foundation is to primarily deal with the big projects that insist that they should be the beneficiary of all this attention.

To me, the Wikimedia Foundation is an organisation with many projects, many languages it provides support for. In reality it is the projects that mostly organise themselves. It is for this reason that I am really bewildered why there is so much fear of our organisation not being American. This bewilderment also comes from my appreciation that the WMF never was American but global.


Monday, July 02, 2007


Yesterday there was an interesting meeting in the Netherlands billed as a "moderator workshop". There were many nl.wikipedians and some of them, myself included, were not nl.wikipedia moderators at all. To my delight this was deemed to be good; it was even considered that the name for this recurring event is wrong. In order to get more interest from outside the fairly limited group of moderators, it is being considered to hold these meetings under a different name.

The most interesting presentation was the one where Siebrand informed about the state of play at Commons. I knew that Siebrand was one of those working hard to keep Commons clean and sane. It was another thing to hear him explain how these things are done... seeing the amount of pictures moved, the amount of duplicate pictures deleted. It is really impressive.

At the same meeting there was a guy who was articulating his opposition to Commons and its policies. He gave examples why he was disgusted with Commons. The issue as I understand it is in two things; the balance between the need for getting things done and the need for discussing individual issues and the understanding of the procedures used at Commons.

The need to get things done is easily explained. With 1.6 million images, and a growth that exceeds the English language Wikipedia, it is a big project. With all the pictures of all the different projects that should have their home at Commons, the amount of work that needs doing is mind boggling. On the other hand there are people who have a few pictures, who have strong feelings about their work and or who do not know and understand about the procedures at Commons. There is a balance between these two needs.

Over time, many changes have happened to the procedures at Commons; many things have been automated and have been improved to better reflect the needs that people have. If anything this is what I got from Siebrand's presentation. Siebrand indicated that there is still room for improvement and I got the distinct impression that a lot of work is done to achieve even more.

AGF, Assume Good Faith; I do expect that the better nl.wikipedia moderators sign up to this notion. I was disappointed when the improvements that were made in Commons were not even acknowledged. Well, that is probably the difference between a long serving moderator and a moderator that will help the community move forward in this brave new world where we cooperate and seek consensus on a continuously bigger scale.