Monday, August 10, 2015

#Wikidata - #corroboration and #sourcing

The problem with the sources recorded on statements in Wikidata is that even when something is by definition the source of a statement, it is not what we understand a source to be. When I use tools to add statements to Wikidata based on lists and categories from a Wikipedia, that Wikipedia is my source. My tools do not help me record this fact, so I do not add Wikipedia as a source. Other tools do, and consequently some 20 million statements are sourced in this way.

When no source is available, a statement can be corroborated by finding identical information in an external source. The difference is important. The external source does not prove the veracity or the origin of the fact; it merely indicates that its data does not differ. Corroboration is important because it improves the likelihood that a statement is correct. It adds a notion of quality.

Wikidata items often refer to many external sources. Only when a fact new to Wikidata is added as a statement from one of these sources is that external source actually the source.

Some external sources provide information with the authority of a respected organisation. When the RKD Netherlands Institute for Art History indicates that Nora van de Vlier received the Willink van Collen Prize in 1954, I would consider it a source and happily accept it as a source for a new statement in Wikidata. When such information is from DBpedia or Freebase, I would appreciate more references at a later date.

When the external source is not the original source, the only thing I care to know is that there is no discrepancy between the data in Wikidata and the data available at the external source. When external data is pushed into Wikidata as a reference anyway, it could easily be considered fraud. It is certainly clutter.
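
The distinction between a source and a corroboration can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the `Statement` class, its `origin` field and the `assess` function are hypothetical names, not part of Wikidata or of any tool, and the values in the usage example are written by hand rather than fetched from anywhere.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """Hypothetical, simplified stand-in for a Wikidata statement."""
    subject: str
    prop: str
    value: str
    # Where the fact was actually taken from when it was added, if known.
    origin: Optional[str] = None


def assess(statement: Statement, external_name: str,
           external_value: Optional[str]) -> str:
    """Classify the relation between a statement and one external source."""
    if statement.origin == external_name:
        # The fact entered Wikidata from this source: it IS the source.
        return "sourced"
    if external_value is None:
        # The external source says nothing about this fact.
        return "unknown"
    if external_value == statement.value:
        # Same data, different origin: corroboration, not a source.
        return "corroborated"
    # The two disagree; this is the case worth a human look.
    return "discrepancy"


# Usage with hand-written illustrative values.
from_list = Statement("Nora van de Vlier", "award received",
                      "Willink van Collen Prize", origin="Wikipedia")
print(assess(from_list, "RKD", "Willink van Collen Prize"))  # corroborated

from_rkd = Statement("Nora van de Vlier", "award received",
                     "Willink van Collen Prize", origin="RKD")
print(assess(from_rkd, "RKD", "Willink van Collen Prize"))   # sourced
```

Only the "sourced" outcome would justify a reference on the statement; "corroborated" is a quality signal that can live outside Wikidata, and "discrepancy" is the one result that needs attention.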
