Thursday, September 26, 2019

The lowest hanging fruit in #DBpedia

What I hate with a vengeance is make-work. DBpedia as a project retrieves information from all the Wikipedias, wrangles it into shape and publishes it. In one scenario it has unanimous support: one or more Wikipedias agree on the same fact, and they may each have their own references.

We should import such agreed-upon data without further ado. An additional manual step to import it into Wikidata is not smart, because manual operations introduce new errors. Arguably, when there is no unanimous support, manual intervention may improve the quality, but given the quantity of data involved, it means that a lot of data will not become available. THAT in and of itself has a negative impact on the quality of the available data as well.

So what to do? Harvest all the data that is of an acceptable quality, that is, the data DBpedia accepts for its own purposes. Enable an interface where people can verify the data wherever their project's version is challenged.
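To make the idea concrete, here is a minimal sketch of the split I have in mind. It is not how DBpedia or Wikidata actually do it; the function name `partition_claims` and the shape of a claim as a `(wiki, subject, prop, value)` tuple are my own assumptions for illustration. Facts on which every reporting Wikipedia agrees go straight to import, and only the disputed ones are queued for people to verify.

```python
from collections import defaultdict


def partition_claims(claims):
    """Split claims into unanimous facts and disputed ones.

    claims: iterable of (wiki, subject, prop, value) tuples.
    This tuple shape is a hypothetical simplification, not a DBpedia format.
    """
    # Collect, per (subject, property), the value each Wikipedia reports.
    values_by_key = defaultdict(dict)
    for wiki, subject, prop, value in claims:
        values_by_key[(subject, prop)][wiki] = value

    unanimous, disputed = {}, {}
    for key, by_wiki in values_by_key.items():
        distinct = set(by_wiki.values())
        if len(distinct) == 1:
            # Every Wikipedia that states this fact gives the same value.
            unanimous[key] = next(iter(distinct))
        else:
            # Keep who said what, so each project can check its own claim.
            disputed[key] = dict(by_wiki)
    return unanimous, disputed
```

The `unanimous` part can be imported without a manual step, while `disputed` keeps track of which Wikipedia said what, which is exactly the information a verification interface would need to show.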

When we truly aim to engage people, we enable them to target the data they want to work on. I will happily work on scientists but do not expect me to work on "sucker stars". More than likely there will be people who care about soccer stars but not about "crazy professors".
Thanks,
      GerardM
