Thursday, October 09, 2014

#Wikidata - #Statistics are a #data game


The Wikidata statistics are a marvel. They exist in their own little corner of the Wikiverse and rely on the dumps that are regularly produced. When everything is fine, a refresh is generated automatically. Some crazy people find them of interest and go over the numbers trying to understand what is happening. Every now and again, they are amazed or appalled.

Recently the dumps that are available in JSON changed format in the middle of a dump run. The resulting hodgepodge of data made the statistics unrealistic. Magnus was on holiday. Yes, he has a real life, so it took a bit of time before he reasoned his way out of the mess.

It is wonderful that our community has people like Erik Zachte and Magnus Manske. They spend so much time and effort providing us with meaningful statistics. It is important to remember that they rely on the underlying data, and it is their skill that ensures the data remains comparable over time.

NB Currently 56.83% of the Wikidata items have 0, 1 or 2 statements. :)
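
For the curious, a percentage like this could be computed along the following lines. This is only a minimal sketch and not how the actual statistics are produced: it assumes each item is a parsed JSON object whose "claims" key maps a property to a list of statements, and the names `statement_count`, `share_with_at_most` and `sample_items` are made up for illustration.

```python
def statement_count(entity):
    """Count the statements on one item.

    Assumption: 'claims' maps property ids to lists of statements.
    A missing or empty 'claims' value counts as zero statements.
    """
    claims = entity.get("claims") or {}
    return sum(len(statements) for statements in claims.values())


def share_with_at_most(entities, limit=2):
    """Return the percentage of items that have at most `limit` statements."""
    counts = [statement_count(entity) for entity in entities]
    if not counts:
        return 0.0
    return 100.0 * sum(1 for count in counts if count <= limit) / len(counts)


# Hypothetical sample data, not taken from a real dump.
sample_items = [
    {"id": "Q1", "claims": {}},
    {"id": "Q2", "claims": {"P31": [{}]}},
    {"id": "Q3", "claims": {"P31": [{}], "P21": [{}, {}]}},
    {"id": "Q4", "claims": {"P31": [{}, {}]}},
]

print(share_with_at_most(sample_items))
```

On this made-up sample, three of the four items have two statements or fewer, so the result is 75.0.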
Thanks,
       GerardM
