Tuesday, September 08, 2009

Reality check; looking at the numbers

The 10 biggest Wikipedias generate 92.65% of the Wikipedia traffic. These 10 languages are written in three scripts: Latin, Cyrillic and the Japanese writing system. When you look at the next 10 languages, there are three additional scripts: Chinese, Thai and Arabic.

When you look at the top 20 languages by page views, you find that some are growing really fast: Russian at 117%, Arabic at 44% and Indonesian at 32%, compared with 13% growth for English. Some are not doing well: German and Dutch at 4%, Finnish at 2%.

The English Wikipedia is only 155th in traffic growth at 13%, but as it gets most of the traffic anyway (53.54%), it dominates the overall figure. All the other projects together lift the total by only two percentage points, to 15% growth in traffic.
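To see how much of that 15% comes from English and how much from everyone else, the arithmetic can be sketched as below. This is only a rough sketch under my own assumptions: that 53.54% is the English share of the current traffic, and that the 13% and 15% growth figures cover the same period. The function name growth_contribution is made up for illustration.

```python
def growth_contribution(current_share, growth, total_growth):
    """Part of the total growth (as a fraction of the previous total
    traffic) that comes from one project.

    Assumptions (not from the original statistics):
    current_share is the project's share of traffic at the END of the
    period, and both growth rates cover the same period.
    """
    # Increase of this project, expressed as a fraction of the current total.
    increase_vs_current_total = current_share * (1 - 1 / (1 + growth))
    # Re-express that increase relative to the previous total.
    return increase_vs_current_total * (1 + total_growth)


total_growth = 0.15
english = growth_contribution(0.5354, 0.13, total_growth)
others = total_growth - english

print(f"English contributes {english * 100:.2f} points of growth")
print(f"All other projects contribute {others * 100:.2f} points of growth")
```

Under these assumptions the two parts always add up to the total growth, so the split depends entirely on the share and growth rate you feed in.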

When I look at what I remember of the group statistics at translatewiki.net, I get the impression that the languages that improved their localisation have increased their traffic as well. I am also quite pleased with the growth of the newly created Wikipedias. Pontic is the sad exception.

So what does this all mean? The top 30 languages, with seven scripts, represent 98.32% of our traffic. When you look at the rest, you find that many languages are growing quite nicely and are known to have really active communities. These beautifully compiled numbers by Erik Zachte are nice because they prevent the easy comparisons based on article counts.

For me it does not change much, because I do not yet know how to interpret these numbers. There is still so much I do not know, like where the traffic comes from and what percentage of people read a Wikipedia in their second language. I am still looking for a better understanding of the relation between traffic numbers and our localisation numbers, and finally I am looking for what makes a project take off: what the inflection points are and what factors affect them.
