Thursday, March 31, 2011

On #Wikipedia we have no friends

#Facebook is wildly successful because all your "friends" are there. Wikipedia by contrast has a problem growing its community; it is not where you have friends.

Most of my Facebook friends I know because of a mutual Wikimedia interest. Many of these friends I have met and I would consider helping them when asked. Leigh Thelmadatter came to me for help because I advertised myself as a GLAM ambassador.

She is preparing a GLAM partnership with a Mexican museum. I have enjoyed many an interesting conversation with her and was happy to help her with several Tropenmuseum pictures of Mexican objects.

Conversations among friends are not limited to the business at hand. We talked about education, Mexico, photography, pain in my back ... The photography part provides another reason why Leigh is relevant to Wikimedia; she has uploaded over 5100 pictures taken by her and her husband of the nooks and crannies of Mexico and this makes it an awesome collection.

As she likes the picture of the San Miguel Concá mission very much and as it clearly demonstrates a different style of church, I nominated it for featured picture.

I wish that we could all be friends, Wikipedia friends. I wish I could help my friends who come new to Wikipedia. It will help when you can find the friends that are already there. When you make a booboo as a newbie, it is friends who can help you best settling in, who can make you feel welcome.

Wikipedia is not Facebook, but we can learn from Facebook how to be nice and have friends.

Best blogging practices II

My purpose as a blogger is to inform widely about significant developments for a range of topics. Because of my publishing frequency, the stories together read like a narrative; they became more about the journey than about an arrival.

Every time a subject is raised, there has to be a story with a fresh angle, with fitting illustrations and appropriate hyperlinks. Doing this well grows the audience for a blog and it also raises the ranking of my articles in the search results. This mechanism is the only reason I can think of why this is my best read article with currently 5,692 page views.

It is important to understand that a writing style defines a public. For me no long academic style articles; I think these are overrated because many readers do not finish them. They are not for me because an academic article aims to cover the whole ground while I prefer to celebrate developments and call attention to opportunities. By more than doubling my readers within a year, the numbers prove the viability of my formula.

Traditionally the English Wikipedia and its public dominate wiki publications. The distribution on the map shows that my public is largely in Europe and Asia. This coincides with where the most interesting Wikimedia community developments occur. It is where I expect the biggest growth in activity and traffic for the Wikimedia projects.

The Wikimedia Foundation aims to grow in the "global south"; there is a public, there are able communities held back by a lack of base technology. I raise these issues and I look for partners in crime. It should not be a crime, it should be our way forward.
GerardM is ahead of the curve

Running on #MediaWiki "head" all the latest messages are available for localisation at twn. As a consequence there is not much disruption for many languages when new functionality goes into production. Typically there is ample time to get the localisation done right.

When new functionality is developed for translatewiki itself, testing is as essential as it is for any production platform. This testing is done on the "sandwiki" and its purpose is very much to prevent avoidable downtime.

There are essentially two categories of functionality we test; development on the Translate extension itself and other extensions that will impact the effectiveness of our community of localisers. Because of this we already run Narayam for all languages in production and we are now testing WebFonts in "sandwiki".

Now that Santhosh has commit rights, further development by him and Niklas can happen where many eyeballs can make any bug shallow. Once it is in SVN, we can also start localising the WebFonts software.

"sandwiki" with the Rufscript font

Wednesday, March 30, 2011

What comes first, the study or the paradigm shift

Citation needed
#Wikipedia on a #mobile sucks big time for many languages. All it takes to realise this is to try it. Even if you don't, our statistics on mobile traffic for Wikipedia prove this conclusively.

When we are to generate more traffic, it is relevant to realise that a mobile is essentially a computer with a different form factor. As mobiles gain their smarts, it is not so much that they become better looking but more that they gain more attributes from a computer.

The big difference between computers and mobile phones is that computers have an open architecture while mobiles do not. Many of the mobile architectures are based on computer architectures and, fortunately, the way text is displayed is one of the things they share. As a result, the smarter the phone, the easier it is to send web fonts to a device and get a readable result.

There is proof that we can show Malayalam text on an iPhone. When we implement web fonts technology, not only mobiles will benefit. It will bring a paradigm shift to reading Wikipedia; as long as a device can show Unicode properly, any Wikipedia will be readable.

Does it take a study to understand such benefits? No. Does implementation invalidate the study? Possibly. Is there a reason to wait on implementation of web fonts? Yes, it takes more development. Should this development get the highest priority? Depends on how highly we value bringing knowledge to more people.

The Congress Party of #India does not understand #Wikipedia

The Minister of Law and Justice in the Indian government, Mr Veerappa Moily, and with him the Congress Party do not know how to deal with Wikipedia. Take Mr Moily's Wikipedia article for instance; it does not even have a picture of him.

For politicians it is important to be seen as the friendly, smiling person, the person who will deliver what they promise in the elections. Wikipedia is where people find information about Mr Moily because Wikipedia gets a high ranking in the search engines. No picture though.

In an article in the Financial Express, he hits out at WikiLeaks and Wikipedia. The good news is that he knows them apart. The bad news is that he does not understand that an encyclopaedia with a neutral point of view is not the enemy.

The LDF in contrast does understand the opportunity Wikipedia offers; they made many pictures available under a free license and consequently articles about their politicians have a high quality picture. We may not have an article for all their politicians, but the pictures are available to be used as illustrations.

Mr Moily, I make you an offer; make a high quality picture of you available to us under a free license. I will blog about it and I will propose it as a featured picture candidate. When it becomes a featured picture, millions of people in the whole world will see it.

Using the Ş or Ș In #Romanian

The definition of the subset of #Unicode characters used for the Romanian language is quite clear; only the Ș and the ș are correct for the șe. This does not mean however that everybody uses the comma under the S and not a cedilla.

Before Unicode became commonplace, typing the comma under a character was really hard. As a consequence many, many expressions of Romanian are erroneous. People got used to writing with a cedilla.

With the later versions of MS Windows, the keyboard mapping for Romanian made it easy to write correct Romanian. For the people stuck with a wrong keyboard we can easily have Narayam provide a modern input method for Romanian.

This is currently a big deal for the Romanian and English language Wiktionaries. They are in the process of correcting every occurrence of a wrongly written șe and țe. This is quite an undertaking because it affects interwiki links to other Wiktionaries as well.

It also means that they want to ensure that only a correct șe or țe is written in Romanian. This is complicated by the fact that a Ş is correct in for instance Turkish. Being able to identify a text for its language is therefore quite important.

The solution currently implemented on the Romanian Wikipedia is that any t or s with a cedilla is converted to a proper șe or țe. As a consequence Turkish names of people and places are likely to be spelled incorrectly.
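The trade-off described above can be illustrated with a minimal sketch. It is not the code used on the Romanian Wikipedia; the function name `fix_cedillas` and the `lang` parameter are assumptions made for this example. It converts the cedilla forms to the comma-below forms only when the text is known to be Romanian, which is exactly why identifying the language of a text matters.

```python
# Cedilla forms (correct in Turkish) mapped to the comma-below forms
# that Unicode defines for Romanian.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",  # Ş -> Ș
    "\u015F": "\u0219",  # ş -> ș
    "\u0162": "\u021A",  # Ţ -> Ț
    "\u0163": "\u021B",  # ţ -> ț
})


def fix_cedillas(text, lang):
    """Replace s/t with cedilla by s/t with comma below, for Romanian only.

    `lang` is a hypothetical language tag supplied by the caller; text in
    any other language (for instance Turkish, "tr") is returned unchanged.
    """
    if lang != "ro":
        return text
    return text.translate(CEDILLA_TO_COMMA)
```

Calling `fix_cedillas(name, "tr")` leaves a Turkish name alone, while a blanket conversion without the language check would rewrite it.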

PS Please note that the font used for the title does not cope with these characters.

Tuesday, March 29, 2011

Alpha support for #Malayalam on the #iPhone

#MediaWiki in one of the languages of #India on an iPhone. Given that an iPhone or an Android phone is quite a powerful tool, it is not a real surprise that they can show Unicode properly. It is however something that people in India have not seen yet.

The extension that brings this functionality to you is alpha software. It does not support Internet Explorer yet. This can be fixed, this will be fixed. Help is welcome.

This proof of concept demonstrates that when you add Narayam for its input methods you can edit Wikipedia in the languages of India. Is more study needed?

Monday, March 28, 2011

The answer to our "most significant challenge": #kittens

Lack of new editors with staying power is #Wikipedia's biggest challenge. What amazes me most in all this is the lack of credible suggestions to alleviate this. The only one I find is to give kittens to people. Not a good idea as kittens become quite rapidly sexually mature and have this nasty habit of becoming octamoms.

It is hard to enter Wikipedia because of the increasing levels of "required" sophistication. This makes Wikipedia more high brow, reliable, wordy and both hard to read and edit. By making quality a priority, Wikipedia is no longer the inviting community it once was. With this change the number and size of illustrations vis a vis the number of words has gone down, the number of footnotes has gone up and developing new templates is too much for most of our best.

With the number of participants and the quality level a balancing act, there are things we can do to get people to stay with the program. The most obvious thing to do is motivating people. Equally obvious is making participation less complicated. There are technical solutions like the Sentence-editing tool by Jan Paul Posma that make editing a lot easier...

Another big group of potential editors is left disenfranchised because they do not have the support for their language on their device. Many of their issues are solvable, and solvable without a need for more studies or reflection teams. They are solvable when usability and accessibility are truly a primary goal for this year.

Sunday, March 27, 2011

Hopes are up for #Toolserver II

No tool is more appropriate than "#SVG translate" to be the first of the Toolserver tools to be internationalised. This tool has been around for quite some time and was recently adopted by Jarry1250. He came to #mediawiki-i18n to learn how to get localisations for this tool.

One of the things that is quite nifty is that it not only enables translations for the SVG files at Commons; you can also translate files from other websites.

With Jarry1250 and Krinkle setting up the internationalisation for Toolserver tools, the functionality of SVG Translate got attention as well. The image that has its labels translated is now shown together with the labels that need translation.

There is some Dutch in this screen shot :)

With the usability improvements and the accessibility improvements SVG translate has the potential to become a tool that will seriously support the use of SVG in all the Wikimedia projects. A tool like this unlocks a potential that has been inherently present all the time.

#Google suggests

Google suggest is one of my favourite helpers when navigating the Internet. It catches my typos and provides me with an improvement to what I typed. This functionality is available in many languages and not in others.

Santhosh found out that the ":" is used by many people writing Malayalam. This is the kind of mistake that Google suggest can pick up. It is also the kind of mistake that a clever spell checker can automatically fix.

When people use the wrong character in preference to the correct one, it is because it is easier to type the wrong character. To what extent is this because of using a deficient input method?

Supporting multiple languages in one article

Many #Wikipedia articles include text in other languages. There are many reasons why this is the case but technically these foreign bodies are not marked as such. For people there is no problem, it is easy to spot what is in a different language. For computers rendering a text is not much of a problem either; if a character is not available in one font it will likely be in another.

When it is reasonably certain that the majority of our audience do not have a font, current practice is to include a screen shot. While functional, it is not the best solution because search engines will not be able to read these. For Wikimedia projects like Wiktionary, including foreign text is the rule and not the exception and many of its readers see the rectangles of the Unicode fonts missing on their system.

With multiple languages present in a text, a spell checker will mark much of a text as in error. It does this because it is not aware of changes in language.

With the languages in our articles properly marked, a spell checker can prevent obvious errors, a search engine can do its job and we can provide additional targeted support by providing input methods and web fonts.
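As a rough illustration of how candidate passages for such marking could be found, here is a minimal sketch that splits a text into runs of the same script. It rests on an assumption that is only a heuristic: the first word of a character's Unicode name (for example "LATIN" or "CYRILLIC") is used as its script. A script is not a language, so the result is only a starting point for marking; the function names are invented for this example.

```python
import unicodedata


def script_of(ch):
    """Return a rough script label for a letter, or None for anything else.

    Heuristic assumption: the first word of the Unicode character name
    (e.g. "CYRILLIC CAPITAL LETTER EM" -> "CYRILLIC") stands for the script.
    """
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else None


def split_by_script(text):
    """Split text into (script, run) pairs.

    Spaces, digits, punctuation and combining marks have no script of
    their own here and simply stay with the run they occur in.
    """
    runs = []  # each entry is [script, text]
    for ch in text:
        script = script_of(ch)
        if runs and (script is None or runs[-1][0] is None or runs[-1][0] == script):
            if runs[-1][0] is None:
                runs[-1][0] = script  # a run of neutral characters adopts the first script seen
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, run) for script, run in runs]
```

Each run that is not in the main script of the article is a candidate for a language mark, after which a spell checker, a search engine or a web font can treat it appropriately.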

#wikimediaZA is the latest #Wikimedia chapter

Chapters are enablers and the news of a South African chapter is really welcome. South Africa is a large country with many native cultures. Realising the Wikimedia aim is going to be an epic journey.

South Africa is a multicultural country that is developing its infrastructure quickly. It is a magnet for people from other African countries and as a consequence it is rich in opportunities. There are many Wikipedias for languages native to South Africa: Zulu, Xhosa, Afrikaans; even English qualifies.

As a large part of the population is young, there are plenty of opportunities to reach out and build both an audience and an editor community.

This will work once people understand that it is their Wikipedia to write and it will get its public when these Wikipedias are relevant in the content they provide.

Different characters for #MediaWiki

These screen shots say it all; different characters. Different characters because a different font can be selected.

This functionality is of use for all scripts, even for the Latin script, because the fonts people have on their PC do not necessarily show all the characters used in a specific language. By providing a web font, we can make sure that all the characters are available for people to see.

The great thing about this extension created by Santhosh is that it will be of universal use. It will enable people to read our content as long as there is a free font that we can use. For many of our Wikipedias this will dramatically extend the reach of our projects. It is one of the key improvements when we are to increase our reach in the "global south".

PS the code for the extension is not in SVN yet because Santhosh does not have commit rights.

Saturday, March 26, 2011

Hopes are up for #Toolserver

With Toolserver effectively an English language only project, its use is limited. This is sad because many really useful tools are based at Toolserver.

Krinkle has written to the Toolserver mailing list that he is working on I18N for Toolserver tools. This raises expectations. In this mail he is asking people to help him come up with a name for the tool, for the user that will be used to run the program.

I am intrigued by this system because it is to be super easy for developers. There are however two groups of developers involved; they are the developers of the individual tools and they are the developers at

As internationalisation is an architecture, it will be interesting to learn how its logic and structures will be fitted in, if there will be support for constructs like plural and gender. Even when initially such constructs are absent, it will be a huge improvement for the accessibility of the tools that will be taking their game to the next level.

About 2000+ maps and a treasure

#Wikimedia OTRS received an offer in March from Geographicus Fine Antique Maps for us to receive over 2000 historic maps with annotations; it created the enthusiasm needed to have them on-line within a month.

As we gain experience, the process of uploading becomes more sophisticated and now that it is all there, it is time for our community to work its magic. While rich in annotations, these maps need to be categorised and find their place in articles. It will be sweet when the cartographers are identified and get their own articles.

In his announcement of the completion of the upload, Multichil mentioned the things we could do with such exquisite maps; one of them is to overlay old maps on top of modern maps. A tool from the New York Public Library does exactly that. Combine it with the OpenStreetMap collaboration that is still waiting in the wings and we bring to two communities the shared benefit of open content and collaboration.

Among all these maps is one pointing the way to El Dorado. There is one thing lacking and it is the X that makes it a genuine treasure map. Maybe you have to go there and look at the end of the rainbow.

Friday, March 25, 2011

Doing a good job, and then the reward ...

Having been involved on the #Wikimedia side of GLAM, it was my privilege to get to know people in many parts of the world interested in different aspects of both Wiki and GLAM. With the whole program becoming more official, it was time to make myself an "ambassador" while I still can, or forget about it.

I did, and there is clearly a need for people who are available for questions, who are willing to give people their bearings. The lessons learned from the Tropenmuseum and from my language involvement make for a happy fusion.

I have had wonderful conversations with two ladies, one from Mexico and one from South Africa. Both were a mix of GLAM and language and exhilaratingly different.

Thelmadatter was getting into contact with me independently about Wikipedia for native Mexican languages and the different ways you can approach a museum. The Tropenmuseum has many Mexican objects in its collection and this made for an interesting exchange of views. Mexican arts and crafts have a rich tradition and they can do with increased visibility.

An organisation in South Africa wants to improve information about Africa in Wikipedia for Africans. This conversation went very much the other way; we started with GLAM and ended up with what practical activities will infuse activity in Wikipedias like the one in Xhosa or Zulu. I gave them a challenge and, when they accept, I will blog about their organisation and what they aim to do.

#Statistics do motivate at

AnakngAraw provided us with proof that statistics motivate. She is one of our hardest workers and localises for a Tagalog public. Her complaint was that the daily updates of statistics did not happen any more.

Keeping our community happy is really important for us and even though Siebrand is really busy making a difference at the Wikimedia Chapter conference in Berlin, he found the time to look into it.

One of the affected statistics is the "group statistics"; it shows what percentage of defined groups are localised in a language for MediaWiki. Statistics like this are generated by a pywikipedia bot and the software was broken. With a quick hack it is fixed locally and our numbers are up to date again.

Money makes the world go around

The #Wikimedia meta discussion on movement roles has a personal point of view published by Sue. One of the issues she addresses is funding.

Funding produces the go-juice for our many activities but this go-juice is not distributed equitably and its mileage can be improved. Funding has a life-cycle; first you acquire the spending power, then you spend it and finally you enjoy the benefits.

Money tends to be held tightly. Rather than prying it from tight fists, it would work miracles when money can only be held on the condition that its benefits go around our world. Whether benefits go around or not is observable. Take WikiPortret, Toolserver or PDF export; they all have limited use while having a global potential. Contrast that with Wiki Loves Monuments Europe, GLAM outreach or MediaWiki development where the result goes beyond a language, a country, a culture.

So let's make use of the inclination to be tight-fisted and let our organisations retain control over money when their projects have an observable global reach. When "others" rate the reach of projects, it will promote cooperation. As each project is rated, it will be known, and that is in itself observably good.

Thursday, March 24, 2011

Logging into

Your #Wikipedia data footprint is hidden in the public data that is analysed and researched in many ways. Many a tool has been created producing insights into what we have done but also into what you have done.

As a result of privacy concerns many of these tools ended up on the scrap heap. The data however is still there and the genie is out of the bottle. When one smart cookie can find the data and produce a really nice presentation, others can as well.

Privacy is not a crime and it is not guaranteed by destroying such insights. Privacy is better served by presenting private data specific to a member of our community to him or her in the privacy of the browser. Such privacy can be provided by authenticating first with a SUL account and using HTTPS.
  • Many people find motivation in well presented statistics
  • the continued use of research tools enables more sophisticated research in the future
  • becomes a collaborative place for production grade statistics

#Publishing is what you do to get a public

When you google for a definition of "publishing" the first two results are fundamentally different for a writer seeking an audience.
When publishing is seen as a business, as a writer you hand over your text and wait for it to be published in order to get to an audience. When publishing is a process, printing is one option among others to get information to a public. For the viability of a publishing industry it is important to separate the business from the printing.

When something is to be published, many skills are needed to prepare for publication. The material may need editing, proof reading, peer review, presentation, marketing before it is readied for consumption. The skills involved maximise impact and distribution. Choosing a medium for a publication is one of them.

When publishing is not the printing business, maximising a paying public is what an author looks for. Each medium has its own cost structure and each medium has its own public. Getting this mix right and optimising for a return on investment makes the publishing business a business with a future.

Wednesday, March 23, 2011

The 10 suggestions

Triggered to name the top #MediaWiki challenges, the rule of not more than 8 paragraphs in a blog post is going to be broken in order to bring to you the top 10 from my point of view.

1 - All development is done to support MediaWiki projects 
MediaWiki projects are collaborative projects aimed at sharing knowledge with a global public. This dictates MediaWiki's functionality and the need for usability for a world audience.

2 - Thou shalt conform to Standards
As Wikimedians can be quite argumentative, one way to prevent fruitless natter is by insisting on standards and best practices. This has the added benefit that we build on the shoulders of giants and make it easier for others to understand what we are on about.

3 - All scripts and languages are equal in our eyes
When the first chapter of the MediaWiki book was written, Unicode and not ASCII was chosen to represent text. This resulted in over 270 languages with active Wikimedia projects. All scripts can do with TLC but improved infrastructure support for scripts results in benefits for many if not most languages.

4 - Internationalisation and localisation are key to fulfil our prime objective
As our projects are to be edited and read by a global audience, the totality of the functionality offered needs to be, at a minimum, internationalised. The linguistic requirements can be quite daunting and we have in a MediaWiki project that provides a service that is only limited by its developer resources. Localisation itself is left to the language communities.

5 - Thou shalt address men and women with equal respect
The user interface of MediaWiki is for people, men and women; it is them we address. The language should be personal, proper and not stilted. Great texts and great localisations appropriate for a language are a gift to our public and our communities.

6 - Thou shalt use the best technology wherever it comes from
Around MediaWiki a whole ecosystem of extensions and websites evolved. The least they do is function as a proof of concept, often conceptualising in directions we have feared to tread. Much of this software has awesome potential as it can enable a Wikimedia movement, e.g. the Wikia social software, the OpenID functionality and last but not least Translate.

7 - Thou shalt know developments by their numbers
Numbers indicate developments of the MediaWiki projects. While MediaWiki is open source, the numbers game is not as open. With a more open structure numbers relevant to our partners can be included and a localised interface will broaden its appeal.

8 - Thou shalt know our craftsmen by their tools
While MediaWiki provides a specialised editing environment, for some of us even more tools help do their job. Some of these tools are highly personal, but many tools have the potential to raise the skill level in our communities. To achieve this, these tools need to gain the attributes of Open Source and be optimised for more general usage.

9 - What you do for the least of us, you do to further the good cause
Never mind how skilled you are, never mind how sublime your tools, there is a limit to what you can do. Only by sharing your experience, your skills and tricks will we realise our goal. Supporting this with newbie incubators and outreach tools we can do more.

10 - Our journey is leading us towards universal information
To make it universal, we have to grow both our traffic and our reach. MediaWiki is our bus, it enables us to move together. It moves easily through the pipes of the Internet and is preparing to make its way with dead wood technology and mobile technology; sneaker net and phones. Our tools open up the new frontiers where our information may help knowledge grow.

A #PDF library with potential

#Wikipedia content can be combined and exported in the PDF or Portable Document Format. This ability is extremely powerful as it allows people to produce speciality content as readers straight from a Wikipedia and share it either printed, by mail or by sneaker net.

While awesome, the current software is deeply flawed because it does not support all the scripts and languages of our projects. The excuse has been that the library that "everybody" is using is external to MediaWiki and the problem is in this library.

Has been, but not for much longer. Development is progressing nicely on another library, a library that will support other scripts. Vasudev Kamath's first package entered Debian and it is the python-pypdflib. It is experimental and it may not yet support right to left scripts properly, but with its support for the scripts from India and probably Cyrillic it forecasts a brighter day.

The project page can be found here and here is your opportunity to test the existing functionality.

Tuesday, March 22, 2011

Dear #Apple why do I need to jail break to support my language

When I want support for the #Ge'ez script on my #iPhone, the only option open for me is to jail break my phone and lose the warranty. This is very unsettling because I love my language and I love my iPhone.

Could you please make an app available that allows me to read content in my language and, while you are at it, please add a Ge'ez keyboard as well.

Monday, March 21, 2011

#Google search and #Wikipedia

#Quality or #quantity is one of the classic arguments and with the recent "Panda" update to the Google search algorithm many websites with high quantity and low quality content suffered; they lost most of their traffic.

Contrary to what some people think, it will not have a negative impact on any of our Wikipedias. The reason for this is astonishingly simple. A Wikipedia with not that great a quality will typically be in a language that does not have a big footprint on the Internet. Because of a lack of competition our articles will do well anyway.

This does not mean that quality is irrelevant; it means that quality is not the only criterion.

When the English Wikipedia was populated with stubs on all the places in the United States, it resulted in a community who took care of these stubs and produced something awesome. The info boxes of many of the towns and villages of Europe now have their arms and flags in SVG and an article in English as well. A project with well prepared stubs will do the same thing for the cities, towns and villages of India first on the English Wikipedia and then maybe on all the other Wikipedias relevant for an Indian public.

We know what articles are the most sought after and we can know what articles people were looking for but could not find. When these sought after articles are improved or created first, they are most likely the same articles people are searching for using any of the search engines. This leads to optimisation of our search results; this will make a Wikipedia article relevant in the eyes of Google.

When the quality of the highly sought after articles is optimised first, the effect will be disproportionate to the time invested in any other category of articles. More people will read high quality articles and are more likely to select Wikipedia in their language as a quality resource.

Happy Nowruz

In the Netherlands it is the first day of spring. In Iran and other countries people celebrate Nowruz, the first day of the new year. As part of the celebrations seven traditional items are put on a table in a pleasing manner.

The drama that is #Unicode support for the #Myanmar script

Unicode is a standard that defines the characters of a script and their behaviour. As it is a standard, implementing a font for a script and calling it Unicode means that the specifications have to be followed exactly.

I have written in the past about Zawgyi; it abuses Unicode by not having the characters in the designated places. This time it is the Ayar Unicode Group that is the focus of attention. The Ayar system gets the characters in the right place but there are questions about the placement of characters that form the syllables. With 12% of them in the wrong order, this affects around 70% of the words in Myanmar.

With Myanmar people fighting over the "correct" implementation of Unicode it makes sense when the claims and counter claims are assessed by someone who can be accepted as an expert. It makes sense to do so and do it publicly because this ongoing madness has to stop.

Tooting the horn with #statistics

Statistics serve many purposes. One of them is the analysis of what goes on; another is showing the world what goes right.

The motto of the Wikimedia Foundation is "Imagine a world in which every single human being can freely share in the sum of all knowledge. That's our commitment." When the statistics show that people are increasingly turning to Wikipedia, it is good news. When more localisations at make it the 48th biggest wiki, it is good news.

It is good news when statistics can be influenced over time by our actions.

Good statistics show that our work is relevant and that motivates. When we get more traffic, it motivates. When we watch the localisation complete, it motivates. The most important task for statistics is to provide credible information and use it to motivate.

Motivating statistics are repetitive; you set them up and let them run. The key things for them to work are presentation, automation and timeliness. When statistical analysis shows that we have not as many newbies and regulars, it does not motivate; statistics with the number of people in the Russian incubator do, on many levels.

The challenge is not so much in doing an initial analysis, as many people are poring over the raw numbers; it is in packaging them, automating them and finally getting them onto

When the statistics show flukes, our developers and our community have to be agile. Being agile keeps the MediaWiki development show on the road.

Sunday, March 20, 2011 has more edits than many a #Wikipedia

#Statistics give bragging rights and the statistics at do the trick for me. Every now and then I get pinged that has gained a few places in the ranking of the largest MediaWiki wikis.

Currently at number 48, there are only 4 Wikia wikis and Wikihow in front of translatewiki, the rest are Wikimedia Foundation wikis.

When you follow the activity, translatewiki will continue to rise in these rankings. With no bots editing, it is a triumph for the localisation of the supported projects.

Dear #Google, how about supporting #India on the #Android?

#Reading #Wikipedia in one of the languages of India is one of the things we really want you to support. As you can see from our statistics, there is ample room for improvement. As you are the force behind the Android phones, it would be cool if the rendering of Indian languages got your attention.

Recently the Android 2.2 client for looking up the meaning of words using the Silpa dictionary service was launched. For now it supports only the English-Malayalam and English-Hindi dictionaries. You can read the meanings in Malayalam or Hindi even though Android 2.2 doesn’t support the rendering of their characters.

Given that Android does use Unicode fonts as its underlying technology, it should not be difficult to add fonts for the languages of India. It is not as if the Android phones are not capable enough.

Google, my promise to you: when Android supports the languages of India, I will make a lot of noise to get people to adopt this. I will also post a blog post three months after this functionality becomes available. I expect that the numbers will be quite different from the ones you can see above.

What to do when the #CLDR does not have it II

In #MediaWiki messages we support the plural of items. Its implementation differs per language and at we use the CLDR as the source for these rules.

This presents its own problem as the CLDR does not do fractions. Saper has just added Bug 28128 - "Consider whether {{PLURAL:}} should handle fractional numbers".

Having proper support for fractions is relevant when you want to report, for instance, how many millions the Fundraiser has raised for us. The current messages are mostly based on lists so that keeps it nicely integer.

Given the complexity of the support for integers in all our languages, it is a nice can of worms we are opening up. It helps that the Fundraiser only starts in November.
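To illustrate why fractions are a can of worms, here is a minimal sketch in Python. It is not the MediaWiki or CLDR implementation; the function name `plural_category_ru` is hypothetical, and the rule shown is the well-known Russian integer plural rule, used here only as an example of a language with more than two plural forms. The sketch refuses fractional input because the integer rule says nothing about which form "1.5 million" should take.

```python
from decimal import Decimal


def plural_category_ru(n):
    """Return the plural category for an integer count, Russian-style.

    Hypothetical helper for illustration only. The categories depend on
    the last one or two digits, which is only meaningful for integers.
    """
    value = Decimal(str(n))
    if value != value.to_integral_value():
        # Assumption of this sketch: without a rule for fractions we
        # cannot pick a form, so we refuse rather than guess.
        raise ValueError("no plural rule defined for fractional numbers")
    i = abs(int(value))
    if i % 10 == 1 and i % 100 != 11:
        return "one"
    if 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14:
        return "few"
    return "many"


if __name__ == "__main__":
    for count in (1, 3, 5, 11, 21):
        print(count, plural_category_ru(count))
    try:
        plural_category_ru("1.5")
    except ValueError as error:
        print("1.5:", error)
```

Every language would need its own answer to the question that the `ValueError` dodges here, which is what makes the bug a discussion rather than a quick fix.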

Saturday, March 19, 2011

What to do when the #CLDR does not have it

At we use standards. It makes our life a lot easier because it prevents us from arguing about things we do not really know about. When there is a standard, we have always referred to the standard and asked people to update the data in the standard.

The CLDR is a standard where you can enter the name of languages and currencies and time formats... A long list of items is supported. One problem is that the CLDR supports 217 languages. At we support 344 languages. When our people experience problems with getting data into the CLDR, at some stage we want to reconsider.

As we may be getting into the collection of translations for things like the names of languages, we will do this reluctantly and we will continue to give precedence to the information provided by the standards. Another thing is that there are many lists with the translations of language names and it does not make sense to do the same thing yet again. One such can be found at OmegaWiki...

Another reason to be reluctant is that working on such functionality keeps our developers from more hard-core work needed at We are very much like any open source project; there are more than enough ambitions and we welcome all the help we can get.

Nostalgia and #Indonesia ...

The #Tropenmuseum had a symposium with the title Colonial Nostalgia. The first part of the evening consisted of two introductions of books and the second part started with four highly esoteric introductions to the subject.

As my family had no connections to "de oost", Indonesia was for me practically the Indonesian rice table, the remembrance plaques in the chapel near my school and the statue of Jan Pieterszoon Coen in front of the Westfries Museum.

In the introductions a structure was given to what it is that gets people interested in times gone by. I did not fit in. The question whether there had been signs of "whites only" at Indonesian swimming pools was new to me. It did evoke passions in the public, and one of the speakers was willing to give 100 euro for proof that such signs actually existed.

My involvement in Indonesia originated in the wish by the Tropenmuseum to share its Indonesian collection with Wikipedia. The realisation of this resulted in the use of its great collection in many languages as illustrations, particularly in Bahasa Indonesia. The people that use the Tropenmuseum collection may write because of a sense of nostalgia; it may also be because they write an encyclopaedia and historic articles ask for historically relevant illustrations.

Our #bugmeister makes me happy

Bug 6100 - "Allow different directionality (rtl/ltr) for user interface and wiki content" is an old one. It goes back all the way to 2006. Many people commented over time and several issues mentioned have already been overcome.

By raising the priority of this bug to "highest" and "major", a clear signal is given by the Wikimedia Foundation that support for languages like Arabic, Farsi and Hebrew is important. Actions like this and the recent work on the implementation of Narayam demonstrate this support visibly.

Support was announced in words but these actions speak for themselves and empower the words with reality.

#Fundraising in #India II

Every year #Wikipedia hosts banners urging people to #donate to the #Wikimedia Foundation. Every year it is wonderful to see the response of so many people, a response that allows us to run one of the biggest and one of the most efficient websites in the world.

The focus of the fundraiser is very much on the collaboration with the chapters. This has proven very effective. Fundraising in countries that do not have a chapter is therefore sadly something of a side show.

The Hindu mentions that India raised $193,000 for the WMF. This is serious money, and it is serious money coming from the "global south". It shows us what Hans Rosling has been making clear to us for so long: there is no such thing as one homogeneous global south. There are prosperous people everywhere and they are not that dissimilar to people elsewhere.

For a global organisation like the WMF, the Hindu shows that doing well in the fundraiser is something people take notice of and take pride in. When the cost of the traffic to the countries of the global south is paid by their own people, it shows ownership and involvement in their projects, their Wikipedias, their Wiktionaries, Wikibooks and Wikisources.

The localisation of the fundraiser software at is a process that is largely independent. It has been recently completed for languages like Swahili as well. For me this is a sign that they want to own their projects as much as we do.