Sunday, July 17, 2016

#Reasonator - the perspective on #Wikidata people do not get

#Wikidata is where Wikimedia data lives. It started with a big service to Wikipedia: it centralised the interwiki data, and this was a huge step forward in its quality. A lot of work is still being done to improve it even further, because many of the remaining problems need a different perspective.

The next official challenge is to provide data to infoboxes. This problem is utterly different from the challenge of replacing interwiki links. It is impossible to import all the data from infoboxes at once and start improving from there. The quality of the data in infoboxes is worse, but that is not the problem.

So people have imported oodles of data and the quality is as expected: poor but improving. One problem is that all the work is happening at Wikidata and it does not transfer to Wikipedia. There is not even an official way to have a good look at the data available at Wikidata. The unofficial tool is Reasonator; it is currently broken, and that is why I am reflecting.

Reasonator provides an intelligible perspective on the data of an item. It makes many problems "obvious". It shows imported statements and it shows all the references to the item that is shown. It allows you to see all (with a maximum of 500) statements that share common properties.

With a functional Reasonator, many people work on data from Wikipedia with a Wikidata perspective. If Wikidata is to fulfil its promise of considerably improving the quality of the data in Wikipedia, the first thing to do is change both objective and perspective. The perspective could be Wikipedia based, and the objective would be quality rather than replacing the data in infoboxes. The good thing is that it is actually possible to achieve this.

A few observations: all wikilinks are in effect links between Wikidata items. Many of these links indicate that an article "needs" to be in a category, and consequently this can be automated.
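To make the first observation concrete, here is a minimal sketch of how the wikilinks of one article could be turned into Wikidata items. It assumes the public MediaWiki API on English Wikipedia, using the links generator together with the `wikibase_item` page property. The function names and the User-Agent string are made up for illustration, and continuation for articles with very many links is left out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; any other Wikipedia endpoint would work the same way.
API = "https://en.wikipedia.org/w/api.php"


def build_query_url(title, api=API):
    """Build a query that lists the pages linked from `title` with their Wikidata item."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "titles": title,
        "generator": "links",      # iterate over the wikilinks of the article
        "gplnamespace": "0",       # only links to articles
        "gpllimit": "max",
        "prop": "pageprops",
        "ppprop": "wikibase_item",  # the Q-number connected to each linked page
    }
    return api + "?" + urllib.parse.urlencode(params)


def linked_items(response):
    """Map each linked title to its Wikidata item, or None when there is none."""
    pages = response.get("query", {}).get("pages", [])
    return {
        page["title"]: page.get("pageprops", {}).get("wikibase_item")
        for page in pages
    }


def fetch_linked_items(title):
    """Fetch one batch of links; continuation is not handled in this sketch."""
    request = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "wikilink-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return linked_items(json.load(reply))
```

Links that come back with `None` are red links or articles without an item; those are the places where items could still be created.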

Why do this? When people look at all the wikilinks with a Wikidata perspective, a lot of faulty links will become obvious. A 16th-century painter did not receive a 20th-century award, for instance. Quality will improve. As more statements and possibly items are created, it will affect every article about the same and related topics.
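The painter example can be sketched as a simple plausibility check. The classes and field names below are hypothetical; in Wikidata terms the years would come from date of death (P570) on the person and inception (P571) on the award, and the link itself would become an "award received" (P166) statement. A hit is only a hint that a wikilink deserves a second look, because posthumous awards do exist.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    label: str
    died: Optional[int]            # year of death, None when unknown


@dataclass
class Award:
    label: str
    first_awarded: Optional[int]   # year the award was established, None when unknown


def is_suspicious(person: Person, award: Award) -> bool:
    """Flag an award that was established only after the person had died."""
    if person.died is None or award.first_awarded is None:
        return False  # not enough data to say anything
    return award.first_awarded > person.died


# Hypothetical data, for illustration only.
painter = Person("a painter of the 16th century", died=1576)
modern_award = Award("a 20th century award", first_awarded=1901)
print(is_suspicious(painter, modern_award))
```

Run over all the links in an article, a check like this produces a short list of links for a human to look at, which is the kind of "obvious" that Reasonator provides for Wikidata.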

It needs only one thing: a Reasonator-like view of the data from a Wikipedia point of view.
