Saturday, January 31, 2009

Damaged goods

This is a stereoscopic picture of a building being built in Tel Aviv. This is the White City, nowadays a World Heritage Site, under construction.

When you look at this picture, it is obviously damaged. For normal Wikipedia usage, stereoscopic pictures are not needed, so only one frame needs to be restored. The second frame is still valuable, however, as it shows the parts missing in the first frame. With some clever copying and pasting, a reasonable result can be achieved.

Stereoscopic pictures are not needed for Wikipedia, but that does not mean that someone else is not interested in restoring the second frame as well. Saving the restored file in a format that is not compressed ensures that only the second frame needs to be restored. So having the intermediate files is in the interest of Open Content and Open Collaboration.

Friday, January 30, 2009

Shōki is shockingly big

Shōki is a featured picture candidate on the English language Wikipedia. It is also a whopping big file: 53.5 MB. When I tried to look at it, it crashed my system. This is a JPG compressed file, and it is not usable for people truly interested in assessing the quality of restored pictures. The original file is much bigger.

Of interest is that in order to show the picture, a new template had to be created because of the dimensions of this Japanese woodcut.

When you load the full-sized picture from Commons, you have to wait a long time before you get an impression of the picture. At the Library of Alexandria, Open Source software was demonstrated that uses an algorithm that builds up the image by first giving an outline and then adding details.

PS this is a much reduced version of the unrestored image.

Thursday, January 29, 2009

What is on the menu

Today I read an article about the "mouse man", a scientist who had a big influence on the research of cancer and genetics. His Wikipedia article does not have a picture. It looks boring, and this is probably because there are no appropriately licensed pictures around. There are not even pictures of laboratory mice...

Pictures illustrate, and having great illustrations makes great articles a lot more accessible. When we are lucky, we can find material that is in the public domain. Museums and archives are the custodians of such material and when we are really lucky, they make quality scans available.

Many of these images are damaged and need restoration. Many of these images are restored by one of a growing group of Wikipedians. Durova is probably the most prolific restorer. It is hard to keep track of all the images she has done.

In order to reach out to other people interested in restoration work, Durova has started to upload her pictures to Flickr as well. The question is whether the upload limit will allow her to upload all her work. Flickr has a slideshow, so have a look.

To answer pfctdayelise: yes, it would be great to have a slide show in MediaWiki as well.

Gaza and the BBC and journalism

For me the BBC is first and foremost the BBC World Service and the BBC news website. I understand that the BBC is under fire because it will not broadcast a fund-raiser in the aftermath of the latest fighting in the Gaza Strip.

It is this same BBC that continues its reporting about the hardships in the Gaza Strip. This is what they wrote today. It is great reporting and it demonstrates how the BBC has continued its quality reporting after the abduction of Alan Johnston in 2007.

The problem that I find in the reporting about Gaza is that there is so much less attention for the conflicts elsewhere and the devastation caused by them, often on a much bigger scale. The people leaving the Swat province of Pakistan because the Taleban prevents their children from being educated. The genocide that is the cholera epidemic of Zimbabwe. The civil war in Sri Lanka and the trouble with reporting this news.

These conflicts, and sadly many more, do not get the attention Gaza gets. Key in this is journalism. Journalists are needed on the scene before we can get the news in our newspaper, on our website or television. Journalists die in droves. It is the most dangerous occupation. In 2009 journalists died in Russia, the Philippines, Sri Lanka, Somalia and Gaza. In 2008, 63 journalists died.

There are no tweets coming from Zimbabwe, Pakistan, Sri Lanka, Sudan or Somalia to tell us about what happens. There are no people piling up the pressure to give more attention to these causes like it happens for Gaza. What we need are organisations that bring us the best journalistic intelligence. This is what the BBC does for me.

Wednesday, January 28, 2009

Sadly I will now have to approve your comments :(

One of the gadflies of Wikipedia found it necessary to frequent my blog. He spouted the same kind of "meaningful" prose he got banned for from the en.wikipedia. As my blog is not a platform for his propaganda (he has his own blog), I have found it necessary to change one of the settings on my blog. I now have to approve your comments before they are published.

I will leave this setting for now and I may just remove it at a later date.

CLDR or Common Locale Data Repository needs you

Unicode is best known for its standards about characters. The CLDR is another project of the Unicode Consortium; it provides key building blocks for software to support the world's languages. It is an extremely helpful resource and at Betawiki we use it more and more in our operations. One of the aspects of the CLDR is the translation of language names into other languages.

For many languages you will not find information in the CLDR. I blogged about a project for the inclusion of African languages before. There are also languages supported in the WMF that we do not have this information for. This makes it more difficult to support these languages in Betawiki and by inference it makes it more difficult to support these languages in MediaWiki.

I wrote to the nice people of the CLDR and expressed our need to include the names of our languages missing in the CLDR. They are willing to consider our special request. They pointed to their policy on what languages are currently supported.

The good news is that we can and will create a list for their approval. The better news is that their policy explicitly states that when someone wants to provide the minimal (level 30) information for their language, it will be accepted. For us as a community, it should be possible to do better than that. At OmegaWiki we have a list of languages with translations that should cover a lot of ground, and we should be able to improve on this when we are permitted to include both the translations for other languages and the translations in other languages.

Please check out if your language is properly supported :)


To my dismay I read this blog by Michael Arrington of TechCrunch. TechCrunch provides interesting information about the movers and shakers in the start-up technology world. It is important that we have this. It is important that we recognise that Michael is a journalist and as such deserves all the respect and protection that so many journalists fail to get and die for.

In another blog, Adam Frost talks about a misunderstanding about the way Barack Obama's name is to be signed and written in American Sign Language. From my perspective the coolest thing about this controversy is that it is only possible thanks to SignWriting.

Michael has had some really nasty experiences. There is no excuse for how he was treated. His tolerance is now at the breaking point. Adam recognised a situation where an injustice was done to a student. He apologised publicly and by doing so he demonstrated understanding and tolerance.

Michael wrote that some things need to change. I can only agree and wish him well.

Tuesday, January 27, 2009

Approved bug 46247

Brion has approved bug 46247. This bug is about gender. You will be asked to indicate your gender on the preferences. You may prefer to leave it as "not indicated".

There are people who object to collecting irrelevant information and to them I want to say that it is very relevant for many of our Wikipedias. In many languages, particularly the Slavic languages, a male is addressed differently from a female. It is considered impolite to address either in the wrong way. It is for this reason that a request was made to support gender where this matters in the messaging of our MediaWiki software.

The software written by Nikerabbit creates the mechanism whereby a message can be different based on the knowledge of the sex of our public. This does not mean at all that our messages have been changed. We do not even know what languages will make use of our new gender support.

What we have in front of us is that people will proofread the MediaWiki messages and decide that a specific message needs to be made gender aware. They will change the message. For the other languages we will need a bot to FUZZY that message so that people can make the changes for their language as well.

Betawiki now has more than 1000 volunteers working on localisation. Particularly for the languages that want gender support it will be hard work; the first messages will come fast. The ratings for these languages will suffer and slowly but surely the gender support will improve.

The Betawiki developers put a lot of effort into assessing new and changed messages. They now have the added burden of indicating what messages need gender support. What we hope for is that one of our Slavic volunteers steps up to the plate and slowly but surely becomes a Betawiki developer.
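To make the idea of a gender-aware message concrete, here is a minimal sketch in Python. It is not the MediaWiki implementation: the placeholder syntax, the function name `render_message` and the user names are assumptions made up for this illustration. A message carries three alternative forms, and the form is picked from what the user indicated in the preferences, with the third form used when nothing was indicated.

```python
import re

# Hypothetical placeholder syntax for this sketch only:
#   {{GENDER:<user>|<male form>|<female form>|<not indicated form>}}
GENDER_PATTERN = re.compile(
    r"\{\{GENDER:([^|}]*)\|([^|}]*)\|([^|}]*)\|([^|}]*)\}\}"
)


def render_message(template, gender_of):
    """Replace every gender placeholder with the form matching the user.

    gender_of maps a user name to "male" or "female"; users that are
    missing from the mapping are treated as "not indicated".
    """
    def pick(match):
        user, male, female, neutral = match.groups()
        gender = gender_of.get(user)
        if gender == "male":
            return male
        if gender == "female":
            return female
        return neutral

    return GENDER_PATTERN.sub(pick, template)


# Hypothetical preferences: only Ana indicated a gender.
preferences = {"Ana": "female"}
print(render_message("{{GENDER:Ana|He|She|They}} edited the page.", preferences))
print(render_message("{{GENDER:Sam|He|She|They}} edited the page.", preferences))
```

A message without a placeholder passes through unchanged, which is why existing messages keep working until a translator decides that a message needs gender support.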

Jimmy Wales; character approved

The USA Network hands out the "Character approved" awards to the people who change the character of American culture. They get a sculpture by Bruce Gray and a $10,000 honorarium that will be given to the charity of their choice.

This year our own Jimmy Wales has been considered for this award. Jimmy is now officially an approved character. There is a really nice video made where the approved characters tell their story.

I am happy to approve Jimmy's choice. I know Jimmy well enough to know that he is the first to acknowledge that Wikipedia is us, the community, so I feel that in Jimmy all our characters are approved. Most of us anyway :)

Wikipedia, the missing manual

Today O'Reilly Media announced the migration of its book about Wikipedia to the English language Wikipedia. As of today, the entire contents of Wikipedia: The Missing Manual (O'Reilly, $29.99) by John Broughton is available for free online for editing and updating just like any Wikipedia entry.

I think this is great news. I have read the book and it is excellent. This way it becomes even better and more useful. The Missing Manual can be found on en.wp at Help:Wikipedia: The Missing Manual.

The Pontic Wikipedia

The community for the Pontic Wikipedia, which is still to be created, is eager for its project to start. Zaharias has been in contact with me regularly to finalise the issues around the approval process. I came to appreciate him as a hard working Wikimedian. Today he popped up again and mentioned that they have started the process of choosing their admins and bureaucrat. They want to be ready for the moment when they can finally start for real.

Having learned about their election, I am happy to endorse Zaharias for the role of bureaucrat. I hope that they can soon start in earnest.

NB: the Bugzilla number is 16955.

Monday, January 26, 2009

Looking at the larger context of restored images

These two girls are part of a recently restored picture. The detail I am interested in is the hair of these girls. This restoration was done from material available at the Library of Congress. They are part of a collection of images that were used to interest people in foreign places and foreign people. In those days people did not travel. It is pictures like this one that had the first tourists go to Marken. They are still coming.

When original material like this is restored, we share the result at Commons in a compressed format. This is great for the use on Wikipedia and for the re-use in the projects of students. For the restoration process itself it is a disaster. Restoration is a repeatable process and a restoration happens in many phases. By saving the intermediate files there is room for other restorers to improve on the end result.

The problem is we cannot. The files are too large. The .tiff format is not supported. Quite a lot of development effort is needed to make this happen and this is not likely to happen. So there is a need for plan B.

At Meta we make the case for this. The project proposal will help grow the community of people who restore images. It will allow them to save the intermediate stages of their work. It will establish the Wikimedia restorations as relevant in the world of archives and museums and this will hopefully open up those organisations to collaborate with us.

Sunday, January 25, 2009

Comparing former Yugoslav localisations

I was asked if I could write with some regularity something about localisation for a Serbian public. So I am looking with more interest at the Serbian effort in Betawiki. I find that there is enough to write about.

When you compare the localisation of these closely related languages, I find that the Serbian language is not keeping up with the others. In a way it is not that easy, because both the Latin and the Cyrillic script have to be maintained. The number of articles would suggest that the amount of attention the Serbian Wikipedia is getting is much bigger.

bs Bosanski 100.00% 99.91% 100.00% 39.12% 12.07% 26,057
hr Hrvatski 100.00% 99.45% 94.58% 35.74% 41.72% 54,003
mk Македонски 100.00% 98.67% 25.72% 18.78% 74.83% 23,626
sr-ec ћирилица 100.00% 89.44% 65.61% 23.88% 63.79% 70,702
sr-el latinica 79.34% 57.72% 7.85% 3.78% 0.34% 70,702

Localisation is one of the most important usability factors. It is one of the few things that our communities can do to improve the user experience; localisation and writing the best articles. The rest is in the hands of the developers.

Friday, January 23, 2009


All the languages that have a Wikipedia exist on the Internet. You would expect that all the components that are needed to support such a language properly on the Internet are in place. Proper support means that the characters are defined in Unicode and exist in fonts, proper support means that we know how to sort the words in the right way for that language, and proper support begins with the knowledge that the language exists. Sadly proper support is not available for all the Wikipedia languages.

I will do (almost) anything to improve this. To support a language properly many separate issues need to be addressed and many of these issues are dealt with by separate organisations. The Wikimedia Foundation needs a holistic solution for its languages; the recognition, the characters in a font, the support by a browser and finally MediaWiki to deal with it are all needed to make this work right.

NLnet is a Dutch foundation that financially supports organizations and people that contribute to an open information society. I had approached them to learn if they were interested in supporting a project to help the Wikipedia languages to establish themselves fully on the Internet. As they are more inclined to the technical aspects of the Internet, they are sympathetic to the idea but this is not really their thing.

I was really surprised to receive a phone call from Michiel Leenaars today. He told me that the very subject of supporting languages on the Internet was discussed at a conference he attended in Brussels. He spoke about my proposal with a big foundation and they may be interested in considering helping us establish our languages properly.

I am thrilled.

Thursday, January 22, 2009

The failure that is copyright

Copyright as it is currently practiced is a failure. It is not even-handed and it prevents our use of much historic material. A good example is this picture of Aletta Jacobs. It can be found in the archive of the Library of Congress; they make it available in a high quality tiff file. It mentions a photographer, one F. Julius Oppenheim, and nobody knows anything about the man.

Aletta Jacobs died in 1929. As it is possible that this image is still under copyright, this picture will not be accepted in Commons. The biography of Dr Jacobs was published in 1924, clearly no longer under copyright, and it is published on the Internet under copyright by the "Digitale bibliotheek voor de Nederlandse Letteren" (dbnl). The problem is that they have the audacity to claim copyright. They claim copyright on everything they publish including work from the Middle Ages.

Copyright is a failure because important material like this picture of Aletta Jacobs is lost to us while people effectively steal the rights to her works by being bold. The dbnl does great work by making literature available on the Internet, but they lose their goodwill by taking what is not theirs to take.

I found this image in the archive of the Library of Congress; Durova restored it for me as a favour. I am disappointed that such unclear copyright prevents her work from making a difference.

PS this picture is a PNG.

Tuesday, January 20, 2009

The clock has it wrong

Kamusi is the Swahili word for dictionary. Kamusi is also the name of a great project for Swahili dictionaries. The Kamusi project has been around for quite some time, they had a great time at Yale University, and last year became independent at because both the and the project were taken.

As of today Kamusi can be found at, the story behind it is a great read and a happy story. As a service to its readers, Kamusi would be happy to redirect from Sadly it is extremely likely that this domain has been cyber squatted by GoDaddy.

Anyway, the Kamusi project will have to change their merchandising, they are a sign of their time.

Securing a place for a language in cyberspace

For a language to survive in our modern world it needs to establish itself on the Internet. Having a Wikipedia in a language is considered to be the gold standard of Internet presence. In order to be ready for a well functioning language on the Internet, many things have to be considered and taken care of.

Unesco has published a book called "Securing a place for a language in Cyberspace". If you are interested in this subject, it is a must read book. The book is available in English, French, Spanish, Russian and Portuguese.

You would expect that all the issues discussed in this book have been covered for the languages supported by a Wikipedia. This is however not the case. Several of our languages have problems that are hard to resolve unless resources are dedicated to go for a complete solution.

The problem of supporting a language has many facets. Most organisations take care of only one aspect and leave the rest to others. The Wikimedia Foundation in contrast has an interest in a holistic solution for its languages. In order to get involved, the WMF must make such a holistic approach a priority.

It is often wondered why certain languages do not function. It is obvious to me that when the chain of MediaWiki, browser, Unicode, font is broken, the effect will be that people will not recognise the opportunity that Wikipedia provides.

Monday, January 19, 2009


Today Michael Snow gave prominence to an idea of the WMF Swedish chapter to make 2009 the "year of the picture". We should have more and better illustrations. I love the idea and I responded on the Foundation-l.

I have been writing and phoning all day, and for relaxation I went to the English language Wikipedia. Today's featured article is about Edgar Allan Poe. I read it and I looked at the illustrations.

The size of the picture is great, but it is damaged. It surprised me that such a picture is part of a featured article. Hanging out with people who are into restorations certainly has raised my expectations of what gets featured.

Sunday, January 18, 2009

The Linguists

A program is going to be aired on PBS on Thursday, February 26, 2009 at 10 PM. When the message is deemed to be so important that they ask me to join a Facebook group, the message must actually be quite desperate. In many ways it is; in another way it is a celebration of what is still there.

All you want to know about The Linguists can be found here. Background information is also available... Now how do I and all the other people not living in the US get to see it?

Closed curtains

Yesterday, as part of the New Year event of three Dutch Open Content organisations, we had a guided tour of de Wallen, the famous red light district of Amsterdam. Even though the area is world famous, there is much more than meets the eye. Our guide was able to provide us with an astonishing wealth of details. Being mostly Wikipedians, we noticed the friezes in the classicist style and wondered if it was Hera next to the head of Zeus, where others had eyes only for the scantily clad ladies in the many windows. I must say they reminded me of a Patti Page song.

On a more serious note the relative merits of a red-light district were mentioned. With a decreased official tolerance for prostitution, more and more illegal brothels pop up all over the country. The girls are typically illegal and often forced into prostitution.

At the end of the tour we made a group photo in front of a house with the curtains drawn. It is certain that Paul has an alibi as well, he took the picture ... :)

Saturday, January 17, 2009


Today the Wikimedia Vereniging Nederland and Creative Commons Nederland have their new year's do. The first part of the meeting will be about photos and making them available under a CC license. There will be a tour of Amsterdam; it will be fun.

In order to make the photos even more interesting, we have proposed to revisit the places where these historical pictures were taken. In the nineteenth century people did not travel in the same way as we do today. Publishers had photographers travel the world to bring the world closer to people by selling cards. Many of such cards ended up in archives like the Library of Congress.

Durova has promised to restore at least two of these pictures when they get their modern equivalent. These old pictures are gorgeous; have a look at this picture of the Dam with Naatje still on it.


Friday, January 16, 2009

Accountant wanted

Yesterday at a Holland Open meeting, the current status of Open Source and Open Standards was discussed. One of the big issues for the adoption of freely licensed accountancy software is that there are no accountants willing to do their work when a company is using it.

The message was that we need to find an accountant who will work with us, and this will drive the adoption of more open practices in Dutch accountancy.

Commonist revisited

Earlier today I blogged about Commonist. Seven hours ago Commonist was enabled on Betawiki. There are 50 messages in Commonist; across all languages, over 444 messages have already been translated. When the use of Commonist is promoted to the communities for those languages, we may get many, many more files from other parts of the world.


Thursday, January 15, 2009

The Stanton project has a face

On the Wikimedia Foundation website I stumbled upon a photo of Naoko Komura who is the project lead of the Stanton project. It was a surprise to me to learn that she is a woman ... I am happy with that because most of the women that I know in the field of computing outperform the men.

It will be cool when the press release is replaced by some actual information about the project. It is of course early days, but I will not deny that I am eager to see a MediaWiki with an improving usability.


Mardetanha, the fa.wp contributor with 5 GB of pictures waiting to be uploaded to Commons, did not know about Commonist. Commonist is a tool that allows you to prepare the uploading of files to a MediaWiki installation offline and have the uploading done in batch mode.

I informed Mardetanha about the existence of Commonist, and he informed people at the Persian Wikipedia. The news about Commonist was received well. While we were chatting, he told me that one of his goals is to get many more pictures about his beautiful country on Commons.

I know Mardetanha as a Betawiki translator, and for Commonist to be adopted in Iran, it is best when the software is localised. The article about Commonist does not mention this possibility at all; it is only when you get to the website of the developer that you find that this is possible.

When Commons is to be a multi-lingual project, it has to think in a multi-lingual way. It means that it has to recognise that the localisation of its tools is important. There are many people who can help with localisation at Betawiki. To allow these people to make a difference, the Commoners need to appreciate that most of the pictures that Commons needs will be made by people who live in countries where English is not the first language. Once this appreciation sinks in, Commons will be at the tipping point of being a true multi-lingual project.

Wednesday, January 14, 2009

A tiff and an opportunity

Tiff is a lossless format for images. It is often used when images are digitised, and such files are the basis of everything that follows. When a scan is insanely great, an image may be compressed into a JPEG or another format. When an image was damaged or dirty it is necessary to restore the scan. This is best done from the .tiff file, because compressing changes the picture and makes it extra difficult to do the restoration.
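The difference between lossless and lossy storage can be shown with a small Python sketch. It is only an illustration under stated assumptions: `zlib` stands in for a lossless format, and a crude quantisation stands in for the lossy step of a format like JPEG; neither is what tiff or JPEG actually do internally, and the function names are made up for this example.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress and decompress without losing information (zlib as a stand-in)."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_round_trip(pixels: bytes, step: int = 16) -> bytes:
    """Crude stand-in for lossy compression: round every value down to a multiple of step."""
    return bytes((value // step) * step for value in pixels)


# A made-up "scan": one row with every possible grey value.
scan = bytes(range(256))

print(lossless_round_trip(scan) == scan)  # the original comes back exactly
print(lossy_round_trip(scan) == scan)     # the fine detail is gone for good
```

Once the fine gradations have been thrown away, no later restorer can get them back, which is why the intermediate files are best kept in a lossless format.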

This is an example of a before and an after image of an illustration of a camera obscura from a seventeenth century manuscript that was restored by Durova. The original is from the Library of Congress, and they are a truly magnificent resource because they provide best quality tiff scans and this allows for quality restorations.

Best practice has it that when you restore a picture you provide both the original and the restored version. This allows other people to have their go at a restoration if they think they can do better than a Wikipedia featured picture. The problem is that they cannot.

They cannot because as a collaborative platform for restoring images MediaWiki sucks. MediaWiki does not allow people to save tiff files and this is really sad because there is a fledgling community of people restoring all kinds of files. As best practice has it, the original file and the restoration are saved but in a compressed format and consequently, the files do not qualify as the basis to get an even better result.

With the great cooperation with organisations like the Bundesarchiv, we are in a great position to entice archives to provide us on request with full size original scans that need to be restored. We would restore these images to their former glory and everybody is a winner.

It is beyond a doubt that the restoration work done by our volunteers looks smashing. But to impress the people from archives, the technical quality has to be as good. Restorations that are compressed just do not cut it.

Providing tiff support in MediaWiki improves the appreciation of the restoration work, and it makes MediaWiki a platform that can be used by the people who collaborate on these restorations.

Usability - Semantic Forms at Chickipedia

A friend of mine had me look at Chickipedia. I must say that I was impressed with the way MediaWiki was made to look good. In this website they use Semantic MediaWiki and Semantic Forms. I am impressed because it hides a lot of hard stuff from the users.

It looks as good as the Wikia info boxes. What I would welcome are demonstration wikis where all the competing approaches to usability can be shown. When we test the extensions, when we ensure great documentation, we can package all these approaches so that all organisations can make up their mind of the state of the MediaWiki opportunities.

Tuesday, January 13, 2009

Continued Betawiki activity and info about the EOY prize

The one language that really stood out in the last month was Tagalog. In a really short period of time AnakngAraw did most of what we have to do. As a consequence of his efforts, the MediaWiki 1.14 release will have a full Tagalog localisation.

It is now some time after we ran our Betawiki end of year prize. This effort has been a huge success. Our initial hope was to have some 20 people participate and in the end there were 35. There was some anxiety about continued activity after the event, but as you can see in the diagram we are doing really well.

With so many participants, even 1000 Euro amounts to only 28,57 Euro each. For people in Europe, the costs of the money transfer are negligible, but transferring outside of the EC means costs that start at 10 or 18 Euro, and then you may still have to pay premiums. It is for this reason that we are looking for alternative ways of paying the money. PayPal is interesting, but it has its restrictions and it is not available in all countries.
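The arithmetic behind this is simple enough to check in a few lines of Python. This is just a sketch of the numbers mentioned above; the function name is made up, and the premiums that may come on top of the transfer costs are left out because their amounts are not known.

```python
def net_share(prize_pool: float, participants: int, transfer_fee: float) -> float:
    """Amount left for one participant after an equal split and a flat transfer fee."""
    return prize_pool / participants - transfer_fee


share = net_share(1000.0, 35, 0.0)
print(f"share per participant: {share:.2f} Euro")

# The two starting transfer costs mentioned for payments outside of the EC.
for fee in (10.0, 18.0):
    left = net_share(1000.0, 35, fee)
    print(f"fee {fee:.0f} Euro: {left:.2f} Euro left, {fee / share:.0%} lost to the fee")
```

With a fee of 18 Euro, more than half of a share never reaches the translator, which is the reason for looking at alternatives.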

Usability - Infoboxes at Wikia

A friend of mine had me look again at how MediaWiki editing is done at Wikia.

While editing, it shows a little grey text "Infobox Film". When you press it you get a pop up screen relevant to info boxes. Info boxes are typically at the very start of an article. They are intimidating and they hide what people want to do.

The Wikia approach hides the clutter, and when you select the infobox, the relevant content is nicely separated from the code.

If you do not like this more WYSIWYG way of editing, you can even select to use the old "Code" screen. This can be selected from the taskbar.

There is still enough work left for the Stanton project... The article includes several references; I think you will agree that they need a similar treatment.


Monday, January 12, 2009

Chicken or egg

At Betawiki we use CLDR data for the language names in all our languages. Many languages are supported in the CLDR but some are not. Picard is one language that is not supported in the CLDR yet. Not only are the locales for the language missing, the language is currently not known, and consequently the translations for Picard in many languages are not available to us. In OmegaWiki we have a small list of names for Picard.

I have requested edit rights for the CLDR so that I can add values like pcd, Picard and the Dutch "Picardisch". I am grateful that I am given the opportunity to contribute.
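What a missing entry means for software can be sketched in Python. The data structures and the function name below are assumptions for this illustration; they are not the CLDR file format nor the way Betawiki reads it. The lookup tries the CLDR-style data first, then a supplementary list like the one in OmegaWiki, and finally falls back to showing the bare language code.

```python
# Hypothetical, tiny excerpt of CLDR-style data: display locale -> language code -> name.
cldr_names = {
    "en": {"nl": "Dutch", "fr": "French"},
    "nl": {"nl": "Nederlands", "fr": "Frans"},
}

# Hypothetical supplementary list with the names that are waiting to be added.
supplementary_names = {
    "en": {"pcd": "Picard"},
    "nl": {"pcd": "Picardisch"},
}


def language_name(code: str, display_locale: str) -> str:
    """Return the name of a language as shown to a reader of display_locale."""
    for source in (cldr_names, supplementary_names):
        name = source.get(display_locale, {}).get(code)
        if name is not None:
            return name
    # Nothing known: the reader only gets to see the code.
    return code


print(language_name("fr", "nl"))
print(language_name("pcd", "nl"))
print(language_name("pcd", "de"))
```

Every name that ends up in the CLDR itself removes the need for such a supplementary list, and it benefits all other software that uses the CLDR as well.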

I have now been asked if I can add values for the locales for Picard. Locales, because Picard is spoken in Belgium and France. The person who asked for localisation permission at Betawiki is obviously a much better person to ask to do this. It would be great if we stimulated the people from our community to contribute to the CLDR, as it is one of the cornerstones for the presence of a language on the Internet.

Thursday, January 08, 2009

Template:Restored-original version

The great thing about digitally restoring files is that it is a non-destructive and repeatable process. You start with a digital file, you save it, you work your magic and you end up with the best that you can do. So best practices for restoration have it that you save the original file. This file is unlikely to be used anywhere, and a new template has been created to signal the relevance of an original file.

On the English language Wikipedia, images that are not used are deleted or moved to Commons. This is not always possible for images that are restored. Some images are public domain in the USA but not elsewhere and consequently cannot be uploaded to Commons.

A great example of an original file is this poster for Rome. It is great because it is quite clear that this is how this poster was made available by the Library of Congress. And it allows anyone who thinks he can do better to improve on the existing restoration, a featured picture..

Another reason for keeping the original file is that it prevents a lot of grief. Many people assume that because a file is available under a free license, they can change the file because it needs an "improvement". Such an "improvement" can result in a cropped version or a change in the colour balance. While it is technically correct that you can change a file in this way, it is often very controversial. It is very much in the Wiki spirit that you allow people to make up their own mind about what is best.

Wednesday, January 07, 2009

A tale of two cities

There are two photographers that I know. One, let us call him David, and the other, let us call him Mardetanha. Both have a camera, both use the Internet and both make important contributions to the Wikimedia Foundation. David lives in the USA and Mardetanha in Iran. There are many differences: broadband versus dial-up, great pictures of all USA landmarks versus not that much, an income that allows for some travelling versus not very much at all.

Mardetanha has some 5 Gb of digital images that he wants to upload. This would take forever on his connection. Given the cost, he prefers to wait until a CD is filled with some 10 Gb and send it to someone in the USA to upload it for him.
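To put a number on "forever", here is a back-of-the-envelope calculation. It assumes that the 5 Gb means five gigabytes and that the dial-up line manages a nominal 56 kbit/s without interruption; both are my assumptions, and a real dial-up upload is usually slower.

```python
def upload_days(size_bytes, bits_per_second):
    """Days of continuous uploading needed for size_bytes at the given line speed."""
    seconds = size_bytes * 8 / bits_per_second
    return seconds / 86_400  # seconds in a day


# Assumed values: 5 gigabytes of images, a nominal 56 kbit/s modem.
days = upload_days(5 * 10**9, 56_000)
print(f"about {days:.1f} days of continuous uploading")
```

Even under these generous assumptions it comes to more than a week of keeping the line open.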

David has a fine reputation as a photographer, Mardetanha has to work much harder for other people to be even aware that his work exists. Wikipedia needs his work; more illustrations will provide better information about Iran. Not only for people in Iran but for people in the whole world.

Brion speaks at Fosdem

FOSDEM, the Free and OpenSource Software Developers' European Meeting

Brion will be speaking at FOSDEM in the "Collaboration" track. This is a VERY good reason to come to Brussels as well and enjoy great beer, great company and great conversations about everything and MediaWiki. I know that Siebrand intends to be there. Several others have expressed their intention to come. It is likely that we will demonstrate the ExtensionTesting environment...

So check your agenda, and see you in Brussels :)



Waldir, one of the Betawiki volunteers, has Cape Verdean Creole, or Kabuverdianu, as his mother tongue. Kabuverdianu is a language; its ISO-639-3 code is kea. Waldir is one of the active localisers for Portuguese. His Babel information did not specify a mother tongue, but his Portuguese profile provided this information, so his information was completed.

I asked Waldir to add the Babel information on Betawiki and he had an initial problem with this. Cape Verdean Creole is not a homogeneous language; Cape Verde consists of 10 islands, and the language differs from island to island. Modern times bring radio, television and the Internet, and as a consequence the differences are disappearing. There is also a movement to standardise the language. As Waldir is from one of the smaller islands, he is not comfortable writing the standard orthography.

We have agreed that it is best when the Babel information is in Kabuverdianu, so Waldir did localise the Babel extension. Betawiki now supports kea, it falls back to Portuguese and it will wait patiently for the moment when people want to start localising in earnest.
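The idea of a fallback can be sketched in a few lines. This is not how MediaWiki implements it; the table, the messages and the function name are all made up for illustration, and the final step from Portuguese to English is my assumption for the sake of the example.

```python
# Hypothetical fallback table: each code points to the language it falls back to.
FALLBACK = {"kea": "pt", "pt": "en"}

# Hypothetical message store; kea has no messages yet.
MESSAGES = {
    "en": {"search": "Search", "edit": "Edit"},
    "pt": {"search": "Pesquisar"},
    "kea": {},
}


def lookup(key, lang):
    """Return the message for key, walking the fallback chain from lang."""
    seen = set()  # guards against a circular fallback table
    while lang is not None and lang not in seen:
        seen.add(lang)
        text = MESSAGES.get(lang, {}).get(key)
        if text is not None:
            return text
        lang = FALLBACK.get(lang)
    return None  # no language in the chain has this message


print(lookup("search", "kea"))  # found in pt
print(lookup("edit", "kea"))    # found only in en
```

As soon as someone adds a kea message, it is found first and the fallback is no longer consulted for that message.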

One thing I am really grateful for is that Waldir will provide information needed to support Kabuverdianu with locale data. This information is essential for a language to function on the Internet.

Tuesday, January 06, 2009

A nice quote

Awadewit is a fine Wikipedian and lady. Her specialty is literature; she has been active on articles about writers like Jane Austen and Mary Shelley, among others. She told some professors that much of her writing was on Wikipedia. They were astounded because Wikipedia does not have that good a reputation. The reason why she writes for Wikipedia dawned on them when they learned how many people read these articles: Austen, Shelley...

This same argument is valid for restored images; they typically become featured pictures. Look for instance at the traffic numbers for this Vietnamtunnel.jpg. It is obvious when it was featured :)


In a way, it is a running gag. Wikipedia is doomed. There are always some ten reasons why Wikipedia will crash, burn, disappear. There is the latest scandal, there are the article numbers that no longer grow exponentially, there is the fundraiser with its banner, Jimmy's plea, the number of new editors...

Many gags have a kernel of truth; it is what makes them funny. They are funny because the doom sayers apparently do not consider all the positive things as the counterweight that they are:
  • The Stanton project will have a major effect on the usability
    • This will design ways for more people to contribute to Wikipedia
  • There have been issues with the arbitration committee but if democracy works, new people are aware of the pitfalls of the past and may make a difference
  • There is the WikiCup 2009, a challenge played and won by skill of editing
    • competitors have offered to teach the fine points of their skills
  • The expectation of exponential growth is not realistic. Continued exponential growth has all the grains of sand on the beach editing Wikipedia ...
  • The fundraiser was a great success, our aim was achieved
    • the banners looked much better than last year
    • the fundraiser was much better organised this year
    • the public showed its appreciation for the work done in the past and gave us the room to do more in the future
    • Jimmy's personal plea made the point why Wikipedia deserves support splendidly
  • The Wikimedia Foundation is much better organised and as a consequence much better able to ensure the sustainability of Wikipedia
There are other things that I feel make a difference
  • MediaWiki is adopted by more and more organisations
  • The quality of the localisation of a small group of languages is such that MediaWiki is used as the default language by external projects
  • Thanks to the community at Betawiki, localisation is happening for many languages
    • Without localisation we are not able to share the sum of all knowledge with every single human.
  • A small group of people restore images digitally and are willing and able to teach their skills
    • A wide variety of images are restored, photos, prints, drawings
    • Material from many cultures, epochs and languages is chosen to prevent bias
All these positive things make their own contribution to why Wikipedia is not doomed. It is not even Wikipedia that it is all about; Wikipedia is about sharing and collaboration. In 2009 the United States of America will welcome a new president. He made the material on his website available under the CC-by license. Even when Wikipedia is finally doomed, I would already consider changes like this as part of its legacy.

Sunday, January 04, 2009

Do deaf schizophrenic people hear voices ??

So I asked the people on the SignWriting mailing list. I was told by someone who has interpreted for schizophrenic deaf people, that some do hear voices and others SEE voices..

Writing in an extinct language

When new texts are written in an extinct language, it is highly academic whether what is written is correct, useful and understandable. Often there are several ways of writing a language because languages evolve over time and the orthography changes with them. It may even be that the script changed as well. Given that there are no living people who use the language for their daily communications, there is no final word on what is right or wrong.

There are several Wikipedias in truly extinct languages and I am not impressed with what I have seen so far. I recently learned about a dispute in the "Anglo Saxon" Wikipedia where people are fighting about the use of specific characters. When you read about this, you learn that the language used a different script anyway. It is also about the validity of a book on the subject from the 1970s.

With a transliteration to another script, with the use of extended Latin script, I wonder if people who read the texts of the Anglo Saxon Wikipedia actually learn something that helps them understand old original texts.

Supporting languages in Betawiki itself

The people who localise for their language at Betawiki, all have an ability to understand English. In many ways, the first person to start the process of localisation is a pioneer. The objective is often simply to meet the requirement for a new project. For new languages, when 50% of the "most used messages" are done, the user interface becomes available in the Incubator and all other WMF projects.
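The 50% requirement is simple enough to express as a check. The function below is only an illustration; its name and the message counts are invented, and only the 50% figure comes from the rule described above.

```python
def meets_threshold(translated, total, threshold=0.5):
    """True when the share of translated messages reaches the threshold."""
    if total <= 0:
        raise ValueError("total must be a positive number of messages")
    return translated / total >= threshold


# Invented counts: a group of 500 "most used messages".
print(meets_threshold(250, 500))  # exactly half is enough
print(meets_threshold(249, 500))  # one message short
```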

For some languages the localisation has become a continuous process; the usability for these languages becomes as good as it gets. When more and more work is done, a need arises for other activities. Proofreading and the use of consistent terminology require other, different skills. For these people English is not as much of a requirement as it is for the people who do the primary localisation work, and Betawiki needs to provide great support for the other languages.

A good example of a well supported language is Persian. The language has had consistent support for quite some time. It is of a quality where MediaWiki is usable in education or where native hosting in Persian can be offered. Betawiki allows its users to change the interface to right to left support, a gadget still missing for Commons.

To support the people who are not comfortable with English, the Betawiki user interface itself has to be localised. When the most basic localisation is completed, proof reading and improved terminology are activities that become increasingly important. The Persian effort demonstrates how good it can get..

Friday, January 02, 2009


The BBC informed me that the lusophone countries have started a process to adopt new spelling rules. These spelling rules are especially welcomed by countries like Angola who struggle to raise literacy in their country; one orthography would make things easier. At the same time, some Portuguese politicians see it as a capitulation to Brazilian interests...

What I am interested in is what effect it will have on projects like Wiktionary and Wikipedia. The orthography has been adopted by some and not by others. It is intended that the whole of the Lusophony will adopt this orthography. So what will it be for the Wiki projects ??

Thursday, January 01, 2009

On the yearly and monthly Betawiki information

Betawiki has had a great year. To learn how great a year, it is best to read what Siebrand had to say in his update. It gives you all the hard information that you need. Recommended reading! There is however always more to say..

A lot of extra work has gone into 30 languages because of the Betawiki end of year prize. The prize is going to be divided among the 35 people who made a claim. For Czech, Spanish, Finnish, and Chinese multiple people claimed their cut; five people have asked us to donate on their behalf to the Wikimedia Foundation.

I am especially happy that Amharic, Khmer, Tagalog, Telugu and Punjabi benefited from our end of year push. Many of the languages that benefited most were already doing really well, Arabic, Chinese, German and Russian for instance. For other languages like Czech, Estonian, Finnish and Sicilian it helps to complete their localisation.

Some people indicated that it was next to impossible for them to contribute because of the severed Internet cables in the Mediterranean. This may mean that we might have done better..

The project did what we hoped it would; we received a lot of attention, a lot of work was done in a small period of time. The one thing we do not know yet is if more people will find their way to Betawiki in 2009.

Prosit Neu Jahr

Wikipedia Affiliate Button

The great news about fund-raisers is that they come to an end. For me the goal of six million was reached on the first morning of the year. I woke up and it was done. At the same time the WMF office was still in 2008. So arguably the result is very much a 2008 event.

It is quite clear that Jimmy's appeal made all the difference. Well actually, given the numbers, it is impossible to ignore the importance of his personal appeal. Jimmy is important to our visibility and it has again been proven what vital role he plays in getting our message out.

I am sure we will continue to support Wikipedia. Wikipedia is a rich resource and it deserves our continued support. We could not have had a more beautiful gift than reaching our fundraising goal at the start of the new year. I wish us all the best 2009 we can have...
