Wednesday, April 30, 2008

Fighting the fight

I care deeply about languages. I want all our projects to do well. Occasionally I do things to help these projects on their way.

Today I was spreading the word about localisation on the Kikongo Wikipedia. Their main page said BALLZ! and I changed it to its previous and proper state. When you check the history, you find more vandalism. Small projects without much of a community suffer a lot of this kind of nonsense.

I am active on many projects and I realised that if I had the admin flag I would be able to do more. I could apply for the Steward setting, but I am not eager or willing to do all the things that many stewards do; I want to do the odd jobs that come my way. The only reason I want it is that it makes things easy when I find something that needs doing.

I do not think that I want to stand in a steward election; for me it is too much hassle, too bureaucratic. I am afraid that I do not have the stomach for all the politicking. Asking for admin on all the small projects is a non-starter. With some exceptions, there are no viable communities in any of the 100-odd Wikipedias that have fewer than 1000 articles. There is nobody there. I still try to get support for, and interest in, helping out in Betawiki. I am an optimist :)

Extension of the week: Global blocking

Global blocking is the extension of the week at Betawiki. According to its technical documentation it has beta status, and according to the info on Meta it is live on all the WMF projects.

According to the Meta documentation, I should be one of the lucky few who can use this tool. However, it is not at all clear to me how to use it; I have not yet found any documentation. What I understand is that it only works for IP addresses, and that it is really useful for preventing a lot of vandalism. I think Werdna has done a fine job by creating "Global blocking".

What I find interesting is that the discussion about who will be able to use this tool and how it affects the autonomy of projects is still ongoing. There have been postings on the Foundation mailing list and there is a lot that can be found on Meta.

Sadly we do not have a project or volunteer council; IMHO it would be the obvious place where these things should be discussed and determined.

Some nice factoids:
  • At a conference in Beijing, Denny reported that Semantic MediaWiki has been successfully localised in Betawiki.
  • We are deleting the Babel templates and using the Babel extension instead.

How to get the message out?

When I talk about MediaWiki, I typically think of my audience as people who are editing in a wiki. They need to know that localising is good and that you do it at Betawiki. They are the ones who may help push the WMF into using Semantic MediaWiki in its projects, or who share my interest in things to do with languages.

When you consider who this localising is done for, I find there are two groups. The people who localise help themselves: by collaborating at Betawiki, they get better tooling and their work has an effect in all the WMF projects. These are the easy ones to reach. The second group, the people and organisations that have their own MediaWiki wikis, is much harder to reach.

Today I informed Tactical Tech about our "language pack" and how we provide support for the 1.12 stable release. To me it is important that the MediaWiki that is released in the NGO in a box provides the best that we can do. When NGOs use MediaWiki, more people will know our tool and this may stimulate them to write in the Wikipedia in their language too.

By reaching out to Tactical Tech, I hope to reach many more organisations but I do not reach the NGOs that got MediaWiki from the box already. Messaging in this way does not scale either.

What is the best way to inform the MediaWiki users, the people and organisations that have their own wiki?

Saturday, April 26, 2008

Babel templates

I love Babel templates. They are great to express what languages you are familiar with and what level of confidence you have when using a language.

Babel templates are used on many projects. On OmegaWiki, my home wiki, they are required of the people who want to edit. There is one nuisance with Babel templates: there are a lot of them, and it is hard to get all the texts that make up the templates.

MinuteElectron is changing all that. He made it an extension. Given that it is being developed first at Betawiki, adding the messages is easy and obvious. The messages are stable, so while the software is being fine-tuned, the localisation has already started.

One of the clever things in the code is that data from different standards is combined to make up the full list of languages. Both ISO-639-1 and ISO-639-3 codes are used, and information from the CLDR is also considered. It is now a matter of fine-tuning the code and selecting the languages it supports.

All in all, a great new extension is being developed.

Thursday, April 24, 2008

WALS, The World Atlas of Language Structures

In 2005, Oxford University Press published "The World Atlas of Language Structures". It is a 712-page hardcover book and is available for £425.00. It comes with a CD that gives you a visual grasp of where languages are spoken. It is filled to the brim with all kinds of information and it is considered an important work for people studying linguistics.

In a joint project, several of the Max Planck Gesellschaft institutes published WALS online. It is a rich resource and uses Google Maps to indicate which languages are spoken where. The data is on the Internet and, in order to be of value in a world where Open Access is increasingly considered essential, it is available under a Creative Commons license.

For Wikipedians, the license will be disappointing; the CC-by-sa-nc license prevents Wikipedia from using this data. WALS is considered a standard work and there are many references to the ISO-639-3 standard; check out Finnish for an example. However, many of the languages recognised in this standard are missing; Stellingwerfs (stl) and Abau (aau), to name just two. I hope that this will be resolved as it will increase the value of WALS a lot.

Even when the use of this data is limited, there is a lot to learn from WALS. It integrates many of the facts known about many languages, it presents them in maps, and it makes the best use of what Google Maps has to offer. With this much information available it is fun to see what languages are spoken where. WALS Online is truly of interest and I heartily recommend it to people who share my interest in languages.

Youtube ... but which one?

YouTube is a video sharing website. It is probably the most widely used and has a wide variety of stuff, including pop songs, comedy and TV shows from my youth. There is also a large amount of useful stuff, like a how-to for solving the Rubik's Cube ... The really cool thing about YouTube videos is that you can embed them in a website. It is for this reason that many people want to include YouTube in MediaWiki, and it shows. There are currently seven extensions known at MediaWiki and I have been reliably informed that there are more.

So which extension to choose? What makes a YouTube extension the best? It is a bewildering choice: do you want YouTube only, what other video sharing sites are supported, is it localised, is it supported and, most importantly, will it work for me?

Answering these questions takes time, and then the apparently best choice may still leave things to be desired. A friend of mine has to answer these questions. How to do YouTube is not the only functionality that he has to decide on. He will, like many others, have the extensions tested and maybe fixed, for both documentation and function.

As we are certain that this is an often repeated exercise, we want to do it in a way that provides more benefits; a way that helps people understand which extension and which version to use for which MediaWiki release, and how to install and use it.

Our current thinking is to test it on a virtual server and publish the results. We have not decided on the precise details. Suggestions are welcome.

Wednesday, April 23, 2008


Word2MediaWikiPlus is an extension for Microsoft Word. Its function is to convert a Microsoft Word document to MediaWiki markup. There are many people who are really excited about this new functionality.

What makes this software different from all the other MediaWiki extensions is that it is not written in PHP; it is written to work with Word. This is why it is rather special that it is now also supported in Betawiki. It required a new approach to exporting the messages.

MediaWiki is not only used within the Wikimedia Foundation but also outside its projects. What interests me is whether new people will come to Betawiki to help support this extension. I hope it will help us in getting more people cooperating on functionality that serves our shared needs.

Tuesday, April 22, 2008

Works for me ... the BBC

I am a big fan of BBC news. I used to listen to it in my car on 648 kHz: the BBC World Service, news, no adverts, no nonsense. At the moment it is my primary source of news.

The BBC has a new skin. As always some like it, some hate it. It is functional. It is functional except for the embedded video; when I choose to see one, I get a black screen with a few dots moving in a circle.

As I want to have it work for me, I contacted the BBC and informed them about my browser and the problem I had. The reply was that they had tested Firefox in their Quality Assurance process. I was even informed that "In fact I use it on Firefox every day."

I would love to repeat after this fine person; it works for me...

Monday, April 21, 2008

From Little Things Big Things Grow

I talked with Brianna, and as always it was fun. One of the things we discussed was "From Little Things Big Things Grow". It is not only a really nice song; it is about Australia being busy turning over one of its darkest pages.

I was wondering what Australians could do in Wikipedia to create more awareness of the Aboriginal heritage. I would like to see an article for all the Australian languages. If some of these articles were even of "featured article" quality, I am sure this would have meaning.

Interim MediaWiki releases or MediaWiki language packs

When the latest stable release of MediaWiki was prepared by Brion, Betawiki was going to continue to support MediaWiki by keeping track of the messages that are valid for the new stable release. 13947 new localisations have been added to the 202201 messages; this is an increase of 6.90%.
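As a quick sanity check of that percentage, here is a small Python sketch. It assumes the 202201 messages are the base of the calculation; the function name is made up for this illustration.

```python
def percent_increase(added: int, base: int) -> float:
    """Return how much `added` is as a percentage of `base`."""
    return 100.0 * added / base


# Figures quoted above: 13947 new localisations on top of 202201 messages.
print(f"{percent_increase(13947, 202201):.2f}%")  # prints "6.90%"
```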

As the many localisers at Betawiki continue to do their great work, we can provide better support for the new stable release. The first thing that needed doing was identifying the messages involved and then it was necessary to export the messages. This last part was buggy until now.

The improved localisation is currently available here. When Brion finds a need to issue an interim release for 1.12, we expect that the localisation will become part of the package.

What we have to figure out is how to inform all the people and organisations that will benefit from this new feature.

Sunday, April 20, 2008

Semantic MediaWiki update

Yesterday I posted that Semantic MediaWiki is now supported by Betawiki. Today already five languages have been completely localised. They are:
  • French
  • Dutch
  • Norwegian (bokmål)
  • Occitan
  • Slovak
Six other languages now have better than 90% localisation and twelve languages have some localisation. It is funny that German and Croatian do not have a complete localisation :)

I will agree that I would LOVE to see an experiment with SMW on a Wikipedia...

Car glass

I have a car. It is parked outside of my apartment. Next to the parking place is a building that has been squatted. For whatever reason, people find a need to kick in the windows of this building. This week I rang the police because another window was kicked in. The police came, but the boys, aged 15 or so, were gone. Yesterday I came to my car to find that its front window had been kicked in; I reported it to the police. Today, another window of the building was kicked in. Again I called the police.

As the building is squatted, nobody seems to care if the windows are whole. Some people want the squatters out and this may be a reason why they smash the windows. With a vandalised building, it is easy for people to think that nobody cares. The problem is that when people think like that, some think nothing of damaging other things as well, like my car.

I know the police had a talk today with the boys. I am grateful that they did. I do not know what has been said, but it may help get the message out that you should not damage other people's property.

Saturday, April 19, 2008

Semantic MediaWiki meets Betawiki

The picture says it all; Semantic MediaWiki is now supported by Betawiki. For those of you who do not know Semantic MediaWiki, it is one of the more brilliant pieces of software to extend MediaWiki. With Semantic MediaWiki enabled, you get one new instruction, ::, and it provides really powerful possibilities for your wiki. What Semantic MediaWiki offers is the ability to provide structure to the information in a page. When you write on the Queen Beatrix article somewhere in the text [[mother::Queen Juliana]], it will allow you to query the information on the Queen Juliana page. This is the basis for the many cool things that Semantic MediaWiki can bring to Wikipedia.
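To give a feel for what such an annotation carries, here is a minimal Python sketch that pulls property and value pairs out of a piece of wikitext. It is only an illustration: Semantic MediaWiki itself does this inside MediaWiki, and the regular expression and the function name here are my own assumptions, not its actual parser.

```python
import re

# Matches annotations such as [[mother::Queen Juliana]].
# An optional "|label" part is tolerated but ignored (an assumption of this sketch).
ANNOTATION = re.compile(r"\[\[([^:\[\]|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_properties(wikitext: str) -> list[tuple[str, str]]:
    """Return the (property, value) pairs found in the wikitext."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in ANNOTATION.finditer(wikitext)
    ]


page = "Beatrix is the daughter of [[mother::Queen Juliana]]."
print(extract_properties(page))  # prints [('mother', 'Queen Juliana')]
```

An ordinary link like [[Queen Juliana]] has no :: in it, so it yields no pair; only the annotated links become queryable facts.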

The one thing that makes me really happy is how quickly all this was done; Markus Krötzsch replied to a message on Wikitech-l and stated that localisation was indeed something that has a high priority for him. Siebrand provided him with some pointers about Betawiki, and the same evening these two fine gentlemen had made the necessary changes to the SMW software and its messages and had it imported in Betawiki.

With the localisation issue dealt with, I think that Semantic MediaWiki has done most if not all the major work that was needed when it was last presented at Wikimania. Its performance has been dramatically improved, the number of instructions has been reduced to one, it provides loads of extensions of its own and last but not least many of the functionalities of SMW can be turned on or off.

It does not make sense to write new software when software exists that has been developed over a long time and that WORKS. I hope the Wikimedia Foundation will soon enable Semantic MediaWiki in one project to learn that it can be trusted in all its projects.

Friday, April 18, 2008

Paypal and safe browsers

According to the BBC, Paypal will restrict the use of its website to safe browsers. When you browse the Paypal website, it is clear: IE7 is the preferred and only choice. It has all these must-have safety features, blah, blah, blah.

I must say that I am surprised that the "other" browsers are not good enough. Not only would Paypal lose a big share of its customers, I just don't believe it. Their "safer browser faq" gives some of the information that is missing: it allows for the fact that Firefox 2 or later and Opera 9.1 or later have anti-phishing features.

What Paypal is talking about is a technique called "extended validation secure socket layer certificates". A company called Comodo holds the evsslcertificate website. On its main page they provide information about three browsers: Mozilla, Opera and ... Konqueror. No info about Internet Explorer. When I read what they have to say, I get the distinct impression that these "other" browsers have been part of this project from the start.

I really have the impression that I am being manipulated / misinformed. This makes me feel cross. When Paypal's customers find that 25+% of them will not be able to use Paypal, or more correctly that they are being manipulated in this way, they will rightly be upset as well.

Really, from a public relations point of view, it sucks for Paypal.

Thursday, April 17, 2008


The Finnish Wikipedia is the 14th Wikipedia in size. Its localisation is not as good as we would expect of a Wikipedia with more than 150,000 articles. Nikerabbit, the main developer at Betawiki, is Finnish and he was challenged on IRC to eat his own dog food.

Nike has now started to work on the localisation and we hope/expect that in the true spirit of Open Source, he will start to scratch his own itch. When you are translating string after string, a lot of time is wasted clicking on buttons. Ajax is a technology that improves the responsiveness of programs. It would be cool if we would find that fewer clicks are needed to do the work...

Wednesday, April 16, 2008

Now with daily quality checks ..

The word advertise means "call attention to" or "make publicity for". Announce has very much a similar meaning but with an advertisement you want people to DO something.

This is an advertisement...

Betawiki now reports on its daily quality checks. Every day a script checks the messages that are problematic and updates a page with the current issues. This helps our localisers to find which messages are problematic. There are several categories, like "variables" (the number of variables used is not the same as in the original English message), "plural" (the plural is not implemented in the message for the language), et cetera.
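To make the two categories concrete, here is a minimal Python sketch of such a check. It is not the script that runs at Betawiki; it assumes the usual MediaWiki message conventions ($1, $2 and so on for variables, {{PLURAL:...}} for plurals), it compares the set of variables rather than just counting them, and the function name is invented for the example.

```python
import re

VARIABLE = re.compile(r"\$\d+")                      # $1, $2, ...
PLURAL = re.compile(r"\{\{PLURAL:", re.IGNORECASE)   # {{PLURAL:$1|...}}


def check_message(english: str, translation: str) -> list[str]:
    """Return the problem categories found for one translated message."""
    problems = []
    # "variables": the translation does not use the same variables as the original.
    if set(VARIABLE.findall(english)) != set(VARIABLE.findall(translation)):
        problems.append("variables")
    # "plural": the original uses PLURAL but the translation does not.
    if PLURAL.search(english) and not PLURAL.search(translation):
        problems.append("plural")
    return problems


print(check_message("$1 has {{PLURAL:$2|one edit|$2 edits}}",
                    "$1 heeft $2 bewerkingen"))  # prints ['plural']
```

Running something like this over every message of every language, once a day, would give the kind of list described above.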

With this new function we hope to help our localisers be more efficient. As the time they spend is the most valuable resource that we have, we are mindful of how we can make this time as fun and well spent as possible. Every day the localised messages are submitted to SVN and once the codebase of the WMF servers is updated, the updates will be live.

Nice factoids
  • The localisation for Manx got started recently and is doing nicely (55% of the most used messages)
  • Egyptian Arabic completed the localisation for the most used messages
  • The bounty program is still open for more languages ... see Betawiki for details....
  • Twenty nine new core messages were added in the last week

Tuesday, April 15, 2008

Beeld en Geluid

Beeld en Geluid is a Dutch organisation tasked with the maintenance, the safeguarding and the exploitation and exploration of the audiovisual heritage of Dutch radio and television. In Hilversum, the centre of Dutch radio and television, they have a magnificent building that allows the public to experience this history, a big archive with room for people who want to do research, and an office with people tasked with opening up their data as much as possible.

Preservation of so much material is a big task, and much effort is put into the digitisation of material. Not only the really old material but also much of the material from the digital age needs care in order to preserve it for the future. As Beeld en Geluid is paid for by the Dutch taxpayers, one of its tasks is to give this material a wider availability. They provide specialised tools for the broadcasting organisations and the researchers. Much of the material cannot be made available to the public because of copyright issues. It is reassuring to know that this material will still be there when it becomes public domain.

Beeld en Geluid is exploring MediaWiki functionality. Their need is quite different; much of the material that they want to include exists in all kinds of databases and it is a lot of hard work to make it into a wiki. We discussed MediaWiki and its uses, and we talked about technology and extensions like Semantic MediaWiki and the OmegaWiki mark II project.

As happens so often, we found that there are many organisations that use MediaWiki and that with more collaboration MediaWiki would gain more utility. We discussed how such a thing could be done. One thing was clear: before we can cooperate, we need to know our shared needs.

Monday, April 14, 2008

When is a Wikipedia not a Wikipedia ?

When you go from one Wikipedia to the next, you always expect an encyclopaedia. You expect things to be largely the same but in a different language. According to Bugzilla bug 13578, this is something that will change. The Alemannic community has found that having a Wikipedia, a Wikibooks, a Wiktionary and a Wikiquote is more than they can chew.

What the Alemannic community has decided is to fold all the projects into their Wikipedia and have separate name spaces for the content of their old projects. Their communities seem to have decided on this; they want their name spaces now or they will move all their content into the Wikipedia name space.

What I wonder is whether this is something that the wider community is aware of, and whether it is considered acceptable. When such major changes are going to happen, it makes sense to include the change to the more appropriate gsw language tag.

Saturday, April 12, 2008

Statistics that get a life of their own ...

One of the metrics I use to explain why there is still so much localisation to do is to quote the percentage of languages that have less than 50% of the most used messages localised. For me it demonstrates really well how much work still needs to be done. Currently 42% of the linguistic entities do not even have these messages localised. This is already a big improvement :)

Today I noticed Siebrand informing people working on a new language that he will only commit a new language to MediaWiki when 50% of the most used messages are done. I love the idea that things can only improve.

Some great factoids
  • Kaustubh has been made an honourable member of Betawiki for his work on Marathi and Hindi.
  • One of the participants of the bounty program is giving his bounty to the Wikimedia Foundation :)
  • The Asian languages are doing REALLY well in Betawiki

Friday, April 11, 2008

Getting the message out

The success of Betawiki is in its community; a group of people who maintain the localisation of MediaWiki for their language. There is always work to do at Betawiki because the MediaWiki developers constantly modify the MediaWiki functionality and the functionality of the extensions.

When a localisation is largely done, it is best to regularly return to Betawiki and do some maintenance on the localisation. In this way the amount of work needed is not so bad. Then again, it is easy to forget and time flies...

It is with pleasure that I can announce that Siebrand has written his first extension. An extension that allows us to send an e-mail to the users that have a confirmed e-mail address. When people do not want to receive an e-mail, they can opt out and read the information on the wiki.

We intend to send a message once a month informing about the latest developments at Betawiki. By keeping our community informed about the latest developments, we hope to maintain the interest in Betawiki and the quality of the MediaWiki localisations.

Thursday, April 10, 2008

Criteria for the closure of projects

For quite some time, we have had people arguing for the closure of projects. I have seen many arguments for and against closures. What has been missing in all these discussions are objective criteria for why it makes sense to find fault with a project.

I have come up with three objective arguments.
  • A project is not what it is advertised to be. For instance when a language is always written in a particular script, a project in any other script is problematic.
  • A project does not have at least 90% of the most relevant messages localised. For your information there are only 498 messages in this category at the moment.
  • A project should have at least 1000 articles. When there is nothing to see what is the point ?
The first argument is an absolute, never mind the size.

For the second and third I would argue for closure when both conditions are not met. When there is activity in either it may be reason for giving an ultimatum. The ultimatum would be that both conditions need to be met within three months.
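Put as a small decision procedure, the three arguments could be sketched like this in Python. The names are invented for the illustration, and two readings are assumptions: that "both conditions are not met" means neither is met, and that a project meeting exactly one condition gets a "review" outcome, a case the text above leaves open.

```python
from dataclasses import dataclass


@dataclass
class Project:
    is_what_is_advertised: bool  # e.g. written in the script the language uses
    localised_share: float       # share of the most relevant messages, 0.0 to 1.0
    articles: int
    has_activity: bool           # activity in localisation or in articles


def assess(project: Project) -> str:
    """Return 'close', 'ultimatum', 'review' or 'keep' for a project."""
    # The first argument is an absolute, never mind the size.
    if not project.is_what_is_advertised:
        return "close"
    localised_ok = project.localised_share >= 0.90
    articles_ok = project.articles >= 1000
    if localised_ok and articles_ok:
        return "keep"
    if not localised_ok and not articles_ok:
        # Activity buys three months to meet both conditions.
        return "ultimatum" if project.has_activity else "close"
    # Exactly one condition met: not covered above, labelled "review" here.
    return "review"


print(assess(Project(True, 0.10, 8, False)))  # prints "close"
```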

The most important reason why we need viable projects is because it is sad to see so much time wasted by good people on projects that have little or no objective value. No value because nobody actively cares. Yes, people may come along and get an interest and eventually they will, but time of valuable people is wasted now and that is in my opinion a really strong argument.

Tuesday, April 08, 2008

Draft blogger

It is nice when software gets new functionality. Blogger, the software environment that I use to blog has a "Draft" version. This is where new features are exposed to the people who dare to live dangerously.

On this blog you will now find a "blog list"; it shows a link to the latest entry of a specific blog. I have them sorted by blog with the most recent blog entry first ...

This blog entry I will post tonight at 11:00 PM. It is really cool that it is now possible to post at a specified date and time. This is particularly good when you have news that is under embargo until a specified moment...

There are more features pending a general release. These are the ones I like best.

Monday, April 07, 2008

I want a PC not a VC

Effeietsanders has proposed the setup of a council that is to complement the staff, the advisory board and the board of trustees. The name proposed is the "Volunteer Council" or VC. To make this all happen a PVC or a provisional VC is to be created that is to get a mandate to specify what the VC is going to be. A group of good people are selected for this, people that I trust can do a good job.

I find myself opposing something that I think is sorely needed. The reason for this is very much one of focus. This council will particularly deal with the projects, the issues of the projects and thereby be complementary. The staff and both boards are about the organisation that enables the projects to function. This council will be about the projects and will deal with what makes the projects function better.

The council will deal mostly with issues of the projects, not with issues of the volunteers. The fact that this council will consist of volunteers is incidental; what is relevant is that this council will be there for the projects and consequently it will represent the projects and not so much the volunteers. When you consider the English Wikipedia for instance, it is well able to do all the things needed; it has the attention of the organisation and of the world. There are many capable people beavering away and it does, within reason, its own thing. It does not really need a council.

For most of the other projects things are starkly different. A recurring theme from Wikinews is that it does not get the necessary attention. You can however exchange Wikinews for most other projects including most Wikipedias. With the ever increasing number of wikis, it becomes ever more impossible for projects to be heard.

It is for all these reasons that I propose to name the council the "Project Council" (PC) and for it to represent the interests and needs of the projects. The first thing it should do is ensure that the policies that exist are actually implemented and functioning. Then it should look into what makes the projects better represented within the limits of what is possible. Limits, because the WMF functions very much as an ISP and this distinction prevents liability. In this way the projects and by inference the volunteers get a better representation and help the WMF fulfil its role as an enabler for the projects.

Practically, I would have the same group of people prepare for a Project Council. I would not give them a blank check; I would have them prepare a draft proposal. While working on this paper, I would have them consult with members of the staff, particularly Sue and Mike, and consult with members of the board of trustees as well. I am sure that it will be good because there IS a need for a council any way you slice it.

Hundred languages have basic support

The Venetian language has the honour of being the 100th language that has better than 98% of the most often used messages localised. There were two languages that reached this state today; the other was Furlan.

At this moment we support 313 distinct linguistic entities in Betawiki. This means that 31.95% have basic support. Since December 2007, 52 languages have reached this status.

Congratulations to all the people that make Betawiki a do-ocracy. :)

Sunday, April 06, 2008

Ndonga and Dzongkha

Ndonga and Dzongkha are two languages. One is from Africa, the other from Asia. Both still have a Wikipedia; one is officially closed, the other is in the process of being closed. The dz.wikipedia is officially closed but has not actually been closed. The localisation effort is well under way and currently 78.12% of the most used messages have been localised. The ng.wikipedia has eight articles and no localisation, and the usual people object to the closure of this project. On the talk page of Drini I found an anonymous coward indicating that something may be done soon ....

When projects are voted to be closed, they are not closed. This means that people like Drini have to continue their fight against spammers. There are two parts to the problem as I see it.
  • When the time of stewards like Drini is free and infinite and when they do not object to doing this work there seems to be no problem. We should however value the time of people like him and make sure that it is used sparingly and well.
  • When policies lead to resolutions and these resolutions have no effect, what is the point of having these policies, these resolutions and all the time spent?
When policies have no effect and nobody cares, either the policies are wrong or we have to do a cost-benefit analysis of the policies. When we find that the cost of having dead projects is less than the cost of blocking these projects, we should abandon the policy for closing projects.

PS I do not think it wise to close the Dzongkha wikipedia at this time.

Friday, April 04, 2008

Egyptian Arabic

Egyptian Arabic is a language recognised with the arz code in ISO-639-3. For this language a Wikipedia was requested. A lot of work needs to be done before it is ready to have its own Wikipedia. There is the Incubator project, finding a group of people interested in becoming a community, and the localisation of the most used MediaWiki messages.

It would be cool if we could have this project go live in Alexandria; maybe have a talk or a lightning talk about the experience?

Tuesday, April 01, 2008


Nikerabbit has a blog. As his pearls of wisdom are not heard on aggregators, I find this entry interesting enough to point it out to a wider public.

On the monthly Betawiki info

In March we had contributions in 186 languages in Betawiki. This is an increase of 30 languages over last month! The statistics for this month show a continued increase in all the relevant metrics; more languages support the most often used MediaWiki messages, more languages support the extensions.

The number of languages has increased by only one. This is partly because, for the first time, two languages are no longer supported in Betawiki: Middle Dutch and Old Norse. The decision to withdraw support was made by the Betawiki management :)

When you follow the project on a daily basis, the one thing that stands out is the support for the languages from India. I am impressed with the speed with which these languages have made their mark. With Hindi finally starting to move, we are slowly getting to the stage where we support most of the official languages of India.

The one thing that I will be watching closely is a request from the Greek localisers to have an approval process for new and changed localisations. The argument is that in this way it is possible to ensure consistent terminology and to prevent dubious localisations from going live. The initial reply has been that Betawiki is a wiki too and that the process is an iterative process. Everyone can do the initial localisation and everyone can proofread and improve. "we are basically a do-ocracy. You do the work, you decide how it works." I read on IRC ...

What was agreed on is that features will be created to allow for improved proofreading functionality. Nikerabbit has a Summer of Code project (not Google) to improve the Betawiki functionality, so I am sure that we will see loads of interesting changes in the future.