Sunday, August 09, 2015

#Quality for #Wikidata and for external #sources

There is always an argument to be found for not accepting Wikidata as a quality resource. Many Wikipedians ignore Wikidata because they do not trust the quality of its data. They require sources, because sources are what make them trust a fact in Wikipedia to be good.

The practical problem is that Wikidata has some 15 million items and most have one or more statements. If sources are a requirement, every one of those statements should be sourced. Given the speed at which new information arrives in Wikidata, sourcing all statements is not going to happen anytime soon, and consequently an alternative way to demonstrate quality is needed.

One best practice of Wikidata is publishing external sources for our items. This already adds a feeling of quality because it allows a person to see what those external sources have to say. It takes some software and a workflow to build on this sense of quality and turn it into a measurable quality improvement.

Obviously both Wikidata and the external sources have their issues. Where they all agree, there is the least need to work on improving quality. Where Wikidata has no data, the obvious step is to add the data and use the external source as a reference. It becomes interesting when there is a difference.
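The three cases above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the names `Comparison` and `compare` are hypothetical, values are compared as plain strings, and nothing here talks to Wikidata or to any real external source.

```python
from enum import Enum
from typing import Optional


class Comparison(Enum):
    AGREE = "agree"      # both say the same; least need for attention
    MISSING = "missing"  # Wikidata has no data; add it with a reference
    DIFFER = "differ"    # the values disagree; this is the interesting case


def compare(wikidata_value: Optional[str],
            external_value: Optional[str]) -> Optional[Comparison]:
    """Classify one Wikidata statement against one external source.

    Hypothetical helper: returns None when the external source has
    nothing to say, because then there is nothing to compare.
    """
    if external_value is None:
        return None
    if wikidata_value is None:
        return Comparison.MISSING
    if wikidata_value == external_value:
        return Comparison.AGREE
    return Comparison.DIFFER
```

Real data would need normalisation before the comparison (date precision, spelling of names, units); the plain equality test is a placeholder for that.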

The first thing to do is flag a differing statement as suspicious. The flag signals to both software and people that attention is needed. People can then research the issue and come to the conclusion that
  • Wikidata is correct
  • the external source is correct
  • both are incorrect
In all these circumstances, the flag for the statement will be changed, the statement itself may be changed, and in every case a source is to be provided. This is where true sources make the biggest difference: the flag does not go away, and with quality sources added where the need is this obvious, the quality of Wikidata becomes easier to appreciate.
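As a rough sketch of that workflow, a flagged statement could be modelled like this. Everything here is an assumption made for illustration: the `Flag` values, the `Statement` fields and the `resolve` function are hypothetical and do not describe any existing Wikidata feature or tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Flag(Enum):
    SUSPICIOUS = "suspicious"                  # set when the values differ
    WIKIDATA_CORRECT = "Wikidata is correct"
    EXTERNAL_CORRECT = "the external source is correct"
    BOTH_INCORRECT = "both are incorrect"


@dataclass
class Statement:
    value: str
    flag: Flag = Flag.SUSPICIOUS
    source: Optional[str] = None


def resolve(statement: Statement, outcome: Flag, source: str,
            corrected_value: Optional[str] = None) -> Statement:
    """Record the conclusion of the research on a suspicious statement."""
    if outcome is Flag.SUSPICIOUS:
        raise ValueError("the outcome must be one of the three conclusions")
    if not source:
        # In every case a source is to be provided.
        raise ValueError("a source is required")
    # The statement may be changed, but only when Wikidata was not correct.
    if outcome is not Flag.WIKIDATA_CORRECT and corrected_value is not None:
        statement.value = corrected_value
    # The flag is changed rather than removed, so it stays visible.
    statement.flag = outcome
    statement.source = source
    return statement
```

Refusing to resolve without a source is the design choice that matters: it is what ties the flag to a verifiable reference instead of to someone's opinion.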
