Thursday, August 22, 2013

#Wikidata needs more data to be useful

In case you did not know it, Mr José Mujica is the current president of Uruguay. Reading any of his Wikipedia articles will tell you so. Yet this "statement" was only recently added to the Wikidata item about Mr Mujica.

When you check out the infobox on the English Wikipedia, you will find many more statements that can be made about Mr Mujica, statements that any competent bot operator can get from either Wikipedia or DBpedia. Mr Mujica is not the first Uruguayan president; many presidents preceded him. For all of them there are the dates they started and ended as president, as well as predecessors and successors. Most of these statements are not in Wikidata.

For Wikidata to be useful, it needs data. A lot more data is available and waiting to be added. Magnus Manske has information on 300,000 people waiting to be added. It is not included because some people want the information added by bots to be sourced. As you know, Wikipedia is not a source...

When you consider how many people have an entry in Wikidata, you will realise that 300,000 people is a nice effort but also a drop in the ocean. When you consider that many of these people have links to external sources like VIAF and GND, you will realise that much of this "unsourced" information can be checked against information elsewhere.

Wikidata is not yet at the tipping point where it is actually useful. Until it is, artificial restraints on adding credible statements will only move this point even further into the future.
