Thursday, March 28, 2013

I want #Wikimedia #statistics I can rely on

Erik Moeller asked the question: "what statistics do you like best, do you think most relevant?" My answer is: statistics I can rely on.

There are several developers working on statistics, many more than several years ago. I am sure that there is a need for data-driven development and I am sure that the underlying numbers for many projects require time and effort. That does not mean that the statistics that are updated on a daily basis can be neglected.

Today I learned that two new Wikivoyage projects were created; one in Hebrew and one in Ukrainian. And after more than a month I checked the statistics for pageviews for Wikivoyage. They are still very much broken. There are other bugs that I reported as well. I cannot be bothered to check their continued existence.

Erik, really, having statistics is wonderful. Having many statistics is even better. None of it matters if you cannot rely on them. That way they become the infamous mantra of "Lies, damned lies and statistics".

Hmmm, I am complaining here. Check out my blog; I use statistics frequently to report on things Wiki.

#ASL; the most challenging language at #Translatewiki II

An effort is underway to localise American Sign Language at translatewiki.net. It is a challenge and, at this stage, it works by copying and pasting translations from Signpuddle. The result at twn looks really weird; it is a string of numbers.

To learn if these numbers actually work, I copied the localisation for Wednesday to my userpage on the ase.testproject. It did not work. The copied content did not create something in SignWriting. What I got were some numbers, and to make it work I needed something else:
  • M512x531S2e508489x504S18600492x470
  • <signtext clear=0>M512x531S2e508489x504S18600492x470</signtext>
Although this label does the job in the Signpuddle software, a different attribute could be more strategic. How about:
  • <lang=ase-Sgnw>M512x531S2e508489x504S18600492x470</lang>
Software will pick this up as easily. The benefit is that it actually indicates what language it is. This will drive search engines to recognise that content exists in the given language. Software that supports SignWriting will pick up the Sgnw label and it can be any of the signed languages.

Not only knowing that something is in SignWriting but also what sign language it is will be really powerful.
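To show that the "string of numbers" has structure, here is a minimal sketch in Python that takes the Wednesday string apart. It assumes the layout I read into it: a lane letter (B, L, M or R), a maximum coordinate, and then one or more symbols, each a six-character key starting with "S" followed by an x and y position. The names parse_sign, Symbol and wrap_signtext are my own invention for this illustration; they are not part of Signpuddle or MediaWiki.

```python
import re
from typing import List, NamedTuple


class Symbol(NamedTuple):
    key: str  # six characters, e.g. "S2e508"
    x: int
    y: int


class Sign(NamedTuple):
    lane: str  # assumed to be one of B, L, M, R
    max_x: int
    max_y: int
    symbols: List[Symbol]


# Assumed layout: lane letter, "NNNxNNN", then zero or more symbols.
SIGN_RE = re.compile(r"^([BLMR])(\d{3})x(\d{3})((?:S[0-9a-f]{5}\d{3}x\d{3})*)$")
SYMBOL_RE = re.compile(r"(S[0-9a-f]{5})(\d{3})x(\d{3})")


def parse_sign(text: str) -> Sign:
    """Split one sign string into its lane, box size and symbols."""
    match = SIGN_RE.match(text)
    if match is None:
        raise ValueError(f"not a sign string in the assumed layout: {text!r}")
    lane, max_x, max_y, body = match.groups()
    symbols = [Symbol(key, int(x), int(y)) for key, x, y in SYMBOL_RE.findall(body)]
    return Sign(lane, int(max_x), int(max_y), symbols)


def wrap_signtext(text: str) -> str:
    """Wrap a sign string in the signtext tag that made it render for me."""
    parse_sign(text)  # refuse to wrap something that is not a sign string
    return f"<signtext clear=0>{text}</signtext>"


wednesday = "M512x531S2e508489x504S18600492x470"
sign = parse_sign(wednesday)
for symbol in sign.symbols:
    print(symbol.key, symbol.x, symbol.y)
print(wrap_signtext(wednesday))
```

Checking the string before wrapping it would catch a bad copy and paste at the moment of localising, instead of only when the message finally shows up in the labs environment.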

Saturday, March 23, 2013

#ASL; the most challenging language at #Translatewiki

A #Wikipedia in "our" language is what people are working towards in the incubator. There are many challenges that need to be met before this dream is fulfilled. Messages need to be localised, articles need to be written and finally the text has to be proven to be in "my" language. These are challenges that take time and effort, all for the big moment when one more Wikipedia is created.

They are challenges but they are dwarfed by the challenges facing the people working towards a Wikipedia in American Sign Language. First of all, their effort cannot be part of the incubator. Their incubator project is in a Wikimedia Labs environment. An environment with experimental software that allows them to write their own language.

The SignWriting script is written from top to bottom as you can see in the screenshot. This is not supported by MediaWiki. The SignWriting script is not yet part of Unicode. MediaWiki only supports scripts that are encoded in Unicode.

Consider what it does to localising the software at translatewiki.net. Below you find what Wednesday looks like for Adam, who is working on the localisation of the most used messages. It means copying data from one environment to the next, not even knowing what the effect will be when it finally shows in the labs environment.

If anything I know how much hard work goes into new projects. The effort for the Wikipedia in American Sign Language is probably the most complicated of them all.

Friday, March 22, 2013

#FUEL makes for standards in localisations

#RedHat and #Wikimedia Foundation engineers have been collaborating on providing support for the languages of India for some time now. However, both Red Hat and the Wikimedia Foundation support people from all over the world. Both organisations know the perils of internationalisation intimately.

As this cooperation is progressing nicely, one of the early results can be found at translatewiki.net. This is where MediaWiki is localised and this is where software is localised in more languages than anywhere else. What you will find is that the Fuelproject can now be localised in languages other than the languages of India.

This project "aims at solving the Problem of Inconsistency and Lack of standardization in Software Translation across the platform". For this to work optimally, such an approach should not be restricted to India; it should be aimed at any and all languages.

How the terminology defined in FUEL will be used in translatewiki.net is not yet clear. What is obvious is the intention to standardise terminology as much as possible. This will make using software more predictable and that is very much to the benefit of everyone.

#MediaWiki powers #jQuery for language support

#Wikipedia is powered by MediaWiki. Wikipedia has a version in over 280 languages, and more languages are waiting in the wings. All these languages need support. This support is needed in all the computer languages used for Wikipedia: in PHP, JavaScript, even Lua.

With so many languages to support, you really need a standardised way to provide the support. Support for localisation, for input methods, for fonts.

One of the really brilliant software projects provides language support in jQuery and consequently in JavaScript. It is under active development and consequently more and more languages are supported in this way. Currently there are over 155 input methods for over 75 languages. It is already the largest repository of input methods on the Web, and it is open source.

This development has not gone unnoticed. People who support languages and scripts not supported by the Wikimedia Foundation have a hard time finding support for their languages, for their fonts and input methods. I have seen an implementation for the Thuɔŋjäŋ languages developed by Andrew Cunningham.

Even better, Andrew was happy enough to talk to Martin and this may result in even more collaboration on languages that deserve the same support as any other language.

Do you want to experience the full power of jQuery.ime like me? We just have to wait for the Wikimedia Foundation to roll it out to its own websites. It is great dogfood, they just have to eat it.


Monday, March 18, 2013

Endowment and #Wikimedia

Every so often, there is talk about an endowment fund for the Wikimedia Foundation. This means that a huge pile of money is kept in reserve for a rainy day.

There are a few problems as far as I am concerned. An endowment makes sense when there is a reasonable fear that we will not be able to fund the projects of the WMF in the future. It assumes that all the things the WMF could fund are being funded and are competently managed. It assumes that the return on investment of putting money in an investment fund .. eh .. endowment fund is better than the return on investment of putting this money to work. And people who administer investment funds are bankers.

Given the success of the fundraisers of the WMF, the question whether the WMF can raise enough funds for its current activities is a joke.

What the WMF currently does and what it could do are two different things. The second can only be answered by looking at what the WMF is there for. I refer to the mission statement on Meta.

The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally.
In collaboration with a network of chapters, the Foundation provides the essential infrastructure and an organizational framework for the support and development of multilingual wiki projects and other endeavors which serve this mission. The Foundation will make and keep useful information from its projects available on the Internet free of charge, in perpetuity.

When I look at this text, the last two words can be seen as a reason for an endowment.

However, I am certain that we can do more to "empower and engage people around the world". Just consider how much we do in the United States and compare it to Africa, Asia and South America combined. The WMF works in collaboration with a network of chapters. Most countries do not have a chapter or another organisation .. yet.

Given that realistically the WMF is only investing in Wikipedia, it cannot be said that we are doing everything possible to realise the mission statement. It can be argued that we do everything we can manage at this time.

There is enough room for growth and much of it does not need to cost us anything but time and effort. What is needed is the coordination, the planning and the will to go where Wikimedians have not gone before.

PS: For an endowment to work well you have to trust and pay a banker. Personally I have more trust in us spending money wisely.

Help #Wikipedia Zero and localise the #mobile software

Learning about the current state of Wikipedia Zero can be done through this presentation. It is well worth it. The things that I found most interesting are:

  • The use of mobile phones is different from what was expected
  • The Wikipedia Zero project has a measurable and positive impact
  • There are things that can be improved upon, things like localisation
What many people do not realise is that the localisation of the mobile software is not part of the MediaWiki software. Traffic from mobile devices is what drives much of the growth in traffic for Wikipedia. The experience is that when the software has been properly localised, it has a big impact.

In India we have had multiple localisation drives. Every time we ask the same things:
  • Please do the most used messages first; they are what people see the most
  • Then do the rest of the MediaWiki messages
  • The extensions used by Wikipedia are the next lot that needs doing
  • Yes, there are all these other MediaWiki messages as well
Localisation has a clear impact on the number of readers and that is what ultimately grows the community.

So what we really, really need is for the mobile messages to be localised as well, because this is where the new readers for our projects can be found.

Friday, March 15, 2013

#Pope Francis on #Wikidata

The good news about Wikidata is that it may become the repository of data on many subjects. The bad news: it can get it wrong.

Pope Francis is the new pope of the Roman Catholic church. He is not the pope of Catholicism. This is rather basic.

There are all kinds of identifiers associated with him. Ever heard of a "VIAF identifier", a "GND identifier", "SUDOC" or a "GND identity type"? And really, should they be so prominently displayed?

The picture is not of the pope; it is of the former cardinal. The coat of arms is the coat of arms of the former cardinal.

The picture in my blog is of pope Francis; he is smiling. And yes, I can edit Wikidata.

Saturday, March 02, 2013

Using #Wikidata in #OmegaWiki

Wikidata knows about the existence of #Wikipedia articles. One really nice side effect is that OmegaWiki will point you to the article in your language, Arabic in the example, even when OmegaWiki does not have a translation in your language.
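
A minimal sketch of such a lookup in Python, assuming the sitelinks of a Wikidata item arrive as a dictionary keyed by site id (for instance "arwiki"), each entry with a "title" field; that is the shape the Wikidata wbgetentities API returns. The function name article_url and the hand-written sample data are mine, for illustration only; this is not how OmegaWiki is actually coded.

```python
from urllib.parse import quote


def article_url(sitelinks, lang):
    """Return the Wikipedia URL for `lang`, or None when no article is linked."""
    # Site ids use underscores where language codes use hyphens.
    site_id = lang.replace("-", "_") + "wiki"
    link = sitelinks.get(site_id)
    if link is None:
        return None
    title = link["title"].replace(" ", "_")
    return f"https://{lang}.wikipedia.org/wiki/{quote(title)}"


# Hand-written sample in the assumed shape; not fetched from Wikidata.
SAMPLE_SITELINKS = {
    "enwiki": {"site": "enwiki", "title": "Amsterdam"},
    "arwiki": {"site": "arwiki", "title": "أمستردام"},
}

print(article_url(SAMPLE_SITELINKS, "ar"))
print(article_url(SAMPLE_SITELINKS, "nl"))
```

The point is that the link comes from what Wikidata knows, so it does not depend on OmegaWiki having a translation in that language.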