Tuesday, July 31, 2012

Learning a new #script .. #Arabic

As I am learning a new language with a script that is new to me, I find that the Internet is not yet the resource it is for languages written in the Latin script.

There are several obstacles; I have to configure my computing devices for the Arabic script and keyboard. I have to find the characters on my keyboard and only when I do will I get the facility that I need.

The one thing that would make my day is a typing tutor that is ready for an international public. This means that, for the same input method and the same language, the instructions are available in multiple languages. When the language-specific material is separated from the code, there are three specific parts.
  • the user interface
  • the exercises
  • the input method or keyboard layout itself
This allows the same software to be used for multiple input methods and scripts / languages. The software can be localised at translatewiki.net and the typing tutor can be distributed with the input methods themselves. 
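A minimal sketch of this separation, in Python. All names here (KeyboardLayout, Exercise, UI_MESSAGES, hint, prompt) are hypothetical and only illustrate the idea; the key positions are assumed from the common Arabic PC layout and the Dutch strings are just an example of a second interface language.

```python
from dataclasses import dataclass


@dataclass
class KeyboardLayout:
    """Part 3: the input method, as a map from character to physical key."""
    name: str
    keys: dict


@dataclass
class Exercise:
    """Part 2: an exercise, tied to a layout but not to an interface language."""
    layout: str
    text: str


# Part 1: the user interface messages, translatable without touching the code.
UI_MESSAGES = {
    "en": {"prompt": "Type: {text}", "hint": "Press the {key} key"},
    "nl": {"prompt": "Typ: {text}", "hint": "Druk op de toets {key}"},
}

# Illustrative subset of an Arabic layout (assumed key positions).
ARABIC = KeyboardLayout(name="arabic", keys={"ب": "F", "ل": "G", "ا": "H", "ت": "J"})

EXERCISES = [Exercise(layout="arabic", text="باب")]


def prompt(exercise, ui_language):
    """Show the exercise text with instructions in the learner's own language."""
    return UI_MESSAGES[ui_language]["prompt"].format(text=exercise.text)


def hint(layout, ui_language, char):
    """Tell the learner which physical key produces the character."""
    return UI_MESSAGES[ui_language]["hint"].format(key=layout.keys[char])
```

With this split, adding a Dutch interface for the Arabic keyboard means adding only messages, and adding another script means adding only a layout and its exercises.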

Wednesday, July 25, 2012

Multilingual cluefull bots

The #BBC writes that #Wikipedia without its bots is doomed. It is a good story, a happy story and well worth a read. Wikipedia, as you know, is not only the English language Wikipedia; there are over 280 Wikipedias in different languages. They are not all created equal; most of the smaller Wikipedias have fewer than 100,000 articles, a small community of editors and their own issues with people who think it funny to write penis wherever they can.

The BBC article writes about "Cluebot NG", which is said to reside on a computer somewhere. It would be a great project to move Cluebot NG to a Wikimedia server and make instances for all the other languages: first for the 40-something languages with more than 100,000 articles and, once its lessons are learned, there may be scope to expand.

There is a committee supervising the bot activities; it will be wonderful when it has the potential to expand its expertise as widely as possible.

Tuesday, July 24, 2012

Levels of competence for #Arabic in #Wikipedia III

Tools like #Google #Chrome use an engine to do the rendering for them. For Chrome it is WebKit. It is responsible for connecting two items that are in separate HTML elements and for making sure that the rules of the Arabic script are applied.

What started as an observation in MediaWiki became a Chrome bug and is now a WebKit bug.

WebKit is used in multiple browsers. This makes a fix relevant for Apple's Safari browser as well. The one question left is what it will take to bring a fix into production within those browsers in the near future.

Sunday, July 22, 2012

Levels of competence for #Arabic in #Wikipedia II

It does matter what browser you use. Google's Chrome browser is the reason why the Arabic Babel templates are displayed incorrectly.

If you use Chrome to browse Arabic websites, you want to add your vote to fix this bug. You vote by marking the star that is at the bottom of the page.

I am really pleased with how quickly this issue was analysed. It is now for Google to support its users who use languages in the Arabic script.

Levels of competence for #Arabic in #Wikipedia

As my competence in the Arabic language is growing, I added Arabic to my Babel information. To my amazement, what I read is wrong. The "ba" is written on its own while it should be connected with the following "alif". It is one of those peculiarities of the Arabic script that a character like the "ba" takes a different shape depending on its position in a word.
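The position-dependent shapes can be made visible with the Unicode character database. The sketch below uses only Python's standard unicodedata module; the helper name positional_forms is my own. It looks through the Arabic Presentation Forms-B block for the compatibility characters that decompose to a given letter. This only illustrates which shapes exist; a real renderer picks the shape through its text shaping engine and does not work this way.

```python
import unicodedata

BEH = "\u0628"   # the "ba"
ALEF = "\u0627"  # the "alif"


def positional_forms(letter):
    """Return {position: presentation form} for one Arabic letter."""
    forms = {}
    target = f"{ord(letter):04X}"
    # Arabic Presentation Forms-B occupies U+FE70..U+FEFF.
    for code_point in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(code_point)).split()
        # Single-letter forms look like "<initial> 0628"; ligatures have more parts.
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(code_point)
    return forms


for position, form in positional_forms(BEH).items():
    print(position, unicodedata.name(form))
```

The "ba" has isolated, final, initial and medial forms. At the start of a word followed by an "alif" it should take the initial form, and the Babel text showed the isolated one instead. The "alif" itself only has an isolated and a final form, because it never connects to the letter that follows it.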

I had a look at translatewiki.net to understand why it is wrong; as you can see, the "ba" (ب) is connected to a wiki link and the word is not shaped according to what is in the variable.

It is fun to learn other languages. What is really amazing is that I can find glaring issues like this one.

Thursday, July 19, 2012

European cranes want to be free


This wonderful video of European crane courtship is currently copyrighted and not freely licensed. The video comes with a button that allows you to embed it in a website and its article has a button where you can order the video.

The video is on the website of Stichting Natuurbeelden and they made a nice offer; 50 of their videos will become available under the CC-by-sa license. There are many great videos to choose from and for all kinds of reasons, the initial offering is to choose at most 50 of them. 

Organising this selection will not be easy and the relevance of some videos is very much in the context of nature in the Netherlands. European cranes have started breeding there again in the last few years. When the fifty videos are selected and when they are used in our projects, these videos will be seen quite often. At the moment when I write this, the number of views for this video is only 136.

Saturday, July 14, 2012

Can everybody read #Wikipedia?

A lot of effort goes into making "Wikipedia the encyclopaedia everybody can edit". The result is wonderful; there are Wikipedias in over 280 languages, a big effort is under way to make editing even easier and, as so many people do edit, it has become a rich, almost authoritative resource. When Wikipedia goes off-line, students despair.

People do read Wikipedia; it is very popular, but the notion that Wikipedia is hard to read is not really considered. Take for instance today's featured article. There are too many words on a line. For many people this makes it hard to read; some give up on an article or on Wikipedia.

Yes, you can change the way content is displayed on a computer screen. The problem is that people who have trouble reading typically view websites in the default format.

The Wikimedia Foundation does have the expertise to consider these issues. The people who have it already have too much on their plate supporting the current software development. However, the proof of the pudding that is Wikipedia is in people READING its content and that makes this a key concern.

Tuesday, July 10, 2012

Something positive to say about #Apple

Apple is innovative; it routinely adds new functionality and quality to its products. Many people love Apple products and pay a healthy premium for the latest and greatest.

One of the recent innovations goes by the name of "retina display". It is a high resolution screen of such quality that the human eye no longer sees individual pixels. Innovations like the retina display add demand for high quality images, and they stimulate the use of SVG, or scalable vector graphics, in WMF projects like Commons and Wikipedia.

The improved technology used in Apple hardware stimulates quality improvements at content providers. In this way Apple stimulates a healthy and innovating content ecosystem. As other hardware suppliers are continuously catching up with the leader of the pack, there is real value in buying Apple.

I did it; there is nothing negative to read in this post about Apple.

#GLAM - About recognition

 Left Hand Bear, Oglala chief
This year's #Commons picture of the year contest was different from last year's. The many old images that were so lovingly restored and featured on the Commons main page were not there any more.

A thread on the mailing list reminded me about all the hard work that gives images of the past a new lease of life. The image of Left Hand Bear, the Oglala chief is used a lot. As you can see below it is even used to make ties, mugs and buttons.

The image of Left Hand Bear has been lovingly restored by Adam Cuerden. The original of this image is at the Library of Congress and I owe a debt of gratitude to both Adam and the LoC.

Adam restored an image preserved at the Library of Congress. Knowing this, I am sure that this is indeed an image of Left Hand Bear. The image is obviously in the public domain and as such I am not required to acknowledge either the LoC or Adam. I may put the image on mugs, ties and buttons and sell them.

For both Commons and Wikipedia, acknowledging the LoC and Adam brings important benefits. Acknowledging the LoC provides the provenance of the image; this is the equivalent of providing a source for a fact. Acknowledging Adam links the much improved image to the original. It recognises Adam for his work.

Acknowledging the LoC and Adam IS a best practice. It is a best practice promoted by organisations like Europeana. It is a best practice that is not a requirement; it is, however, something that we should aspire to.

Monday, July 02, 2012

#Kiwix - the interview

#Kiwix is the tool that allows you to read the content of a Wiki offline. It has been developed with Wikipedia in mind but is equally usable for Wikisource or Wikibooks. I am really happy to have interviewed Emmanuel who knows all the ins and outs of this wonderful piece of software.

What is Kiwix and what is it used for?
Kiwix is software that enables people to read Web content without an internet connection. It's a reader which works with ZIM files containing all the content. It's used to access Wikipedia offline, by reading pre-packaged Wikipedia ZIM files. It's mainly used by people who want to have an encyclopedia, but are too poor to have access to the internet. It's also used, for example, by people who are travelling (plane, ship, train), prisoners and students at school.

Can you tell us something about its popularity?
We have users all over the world and the audience is increasing quickly: we had 25,000 downloads of Kiwix in May.

In how many languages is Kiwix supported?
Thanks to the Translatewiki Web site and its community, Kiwix is localised in more than 80 languages. We also provide content (ZIM files), mainly Wikipedia, in around 25 languages. But we want to do more: thanks to a grant from WMCH, we will soon offer ZIM files of Wikipedia in all languages.

How do you support languages written in scripts like Malayalam or Tibetan?
Contents are Web contents and Kiwix itself is a sort of browser getting the Web pages from the ZIM file instead of the Web. So, we do not have special handling in Kiwix itself to render the contents. Everything should be well organised in the ZIM file, for example by using Web fonts. But, from the Kiwix fulltext search engine point of view, this is challenging. Natural languages have a lot of particularities. Kiwix uses the Xapian search engine and tries to integrate CLucene. We do our best with them to offer the smoothest user experience possible. 

Do you provide fonts with Kiwix for the languages that use these scripts?
The Wikipedia ZIM files we are preparing still do not provide the Web fonts. Web fonts have been integrated in the Wikimedia projects for a few months already, so we have to fix that ASAP; this is not a big challenge.

For some languages like Chinese and Serbian, we show the content in two scripts ... Can Kiwix do this as well?
Kiwix does not provide any transliteration tool for now, but all the technology is already there in the software. We use a powerful Unicode library called ICU (http://site.icu-project.org/) which can do that. We want to use it to allow users to do custom transliterations. C++ developer wanted there!
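To give an idea of what showing content in two scripts involves, here is a small sketch for Serbian in plain Python. It does not use ICU and it is not how Kiwix works; the table and the function name to_latin are my own, and a library like ICU covers far more scripts and edge cases (for instance words written entirely in capitals).

```python
# Serbian Cyrillic to Latin, lower case; some letters map to two Latin letters.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}

# Add the capital letters: "Љ" becomes "Lj", "Џ" becomes "Dž".
TABLE = dict(LOWER)
TABLE.update({cyr.upper(): lat.capitalize() for cyr, lat in LOWER.items()})

TRANSLATION = str.maketrans(TABLE)


def to_latin(text):
    """Transliterate Serbian Cyrillic text; other characters pass through."""
    return text.translate(TRANSLATION)


print(to_latin("Википедија"))  # Vikipedija
```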

Kiwix uses the openZIM format ... can you tell us more about this format?
The format is called ZIM. There is a volunteer-driven project called openZIM, created a few years ago to specify the format and develop a standard library. The ZIM format allows you to put millions of content items together, to compress a part of them, and to add metadata. In the end, you get only one file, which is, at the same time, extremely compressed and allows constant and quick random access.
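The combination of one file, compression and quick random access can be illustrated with a toy example. The sketch below is not the ZIM format and does not use the openZIM library; the names pack and read, and the choice of zlib, are assumptions made for the illustration. Articles are grouped into small clusters, each cluster is compressed on its own, and an index records where every article lives, so reading one article only requires decompressing one cluster.

```python
import zlib


def pack(articles, cluster_size=2):
    """Pack {title: text} into one blob of compressed clusters plus an index."""
    blob = bytearray()
    index = {}  # title -> (cluster offset, cluster size, start in cluster, length)
    titles = sorted(articles)
    for i in range(0, len(titles), cluster_size):
        raw = bytearray()
        spans = []
        for title in titles[i:i + cluster_size]:
            data = articles[title].encode("utf-8")
            spans.append((title, len(raw), len(data)))
            raw += data
        compressed = zlib.compress(bytes(raw))
        for title, start, length in spans:
            index[title] = (len(blob), len(compressed), start, length)
        blob += compressed
    return bytes(blob), index


def read(blob, index, title):
    """Fetch one article by decompressing only the cluster that holds it."""
    offset, size, start, length = index[title]
    raw = zlib.decompress(blob[offset:offset + size])
    return raw[start:start + length].decode("utf-8")


blob, index = pack({"Crane": "A large bird.", "Kiwix": "An offline reader.",
                    "Wiki": "A website anyone can edit."})
print(read(blob, index, "Kiwix"))
```

Larger clusters compress better but make each read more expensive, which is the trade-off any format of this kind has to balance.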

Nowadays, many publications are in the EPUB standard ... can Kiwix handle this as well?
Kiwix is not able to deal with EPUB yet, but in the future it will. We think the EPUB and ZIM formats are complementary and we want Kiwix to be able to deal with EPUB perfectly. Our plan is to integrate "Monocle" to do that. Also there, developers are wanted.

How do people find content available for Kiwix?
Kiwix has its own content manager, so you can download content from Kiwix itself. But you may also download the ZIM file from the Kiwix Web site (http://www.kiwix.org) or by using the MediaWiki Collection extension.

In the future, we want to have a platform (something like iTunes) that makes it really easy to find and download content (both ZIM and EPUB files). We have started a project with that in mind. We need your support!

What is your biggest challenge at this moment in time with Kiwix?
Building Kiwix-mobile for Android. We will have a first release in autumn. But we have many other projects running at the same time and others for which we still need volunteers.

Thank you