Sunday, February 10, 2019

#Wikidata - A quick and dirty "HowTo" to improve exposure of a subject in Wikidata

When you want to expose a particular subject, any subject, in Wikidata, this is the quick and dirty way to expose much of what there is to know. There are a few caveats. The first is that the aim is not to be complete; the second is that the approach is biased towards scientists who are open about their work at ORCiD.

You start with a paper or a scientist. They have a DOI or an ORCiD identifier, and they may already be in Wikidata. First comes the discovery process for the available literature and the authors involved. The SourceMD tool is key: given a SPARQL query or a list with one QID per line, it runs a process that either updates publications by adding missing authors, or adds missing publications and adds missing authors to known publications.
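As a rough illustration of the two kinds of input mentioned above, here is a minimal Python sketch. It relies on the Wikidata properties P496 (ORCID iD) and P50 (author); the function names are made up for this post, and which items SourceMD actually expects (authors or publications, and under which variable name) depends on the mode you pick in the tool, so treat the query as a starting point rather than as the tool's required input.

```python
import re

QID_PATTERN = re.compile(r"Q[1-9]\d*")
ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def works_by_orcid_query(orcid):
    """Build a SPARQL query for works whose author carries this ORCID iD.

    Hypothetical helper: P496 = ORCID iD, P50 = author.
    """
    if not ORCID_PATTERN.fullmatch(orcid):
        raise ValueError(f"not an ORCID iD: {orcid!r}")
    return (
        "SELECT DISTINCT ?work WHERE {\n"
        f'  ?author wdt:P496 "{orcid}" .\n'
        "  ?work wdt:P50 ?author .\n"
        "}"
    )


def qid_lines(uris_or_qids):
    """Turn entity URIs or bare QIDs into a deduplicated, QID-per-line string."""
    seen = []
    for value in uris_or_qids:
        # Keep only the part after the last slash, e.g. ".../entity/Q42" -> "Q42".
        qid = value.rsplit("/", 1)[-1].strip()
        if QID_PATTERN.fullmatch(qid) and qid not in seen:
            seen.append(qid)
    return "\n".join(seen)
```

You would paste either the query text or the QID list into the tool; anything that does not look like a QID is silently dropped from the list.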

When you treat this as an iterative process, more authors and publications become known. When you run the same process for (new) co-authors, more publications and authors become known that are relevant to your subject.
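The iteration described above is essentially a breadth-first walk over co-authors. The sketch below only shows the bookkeeping: `coauthors_of` is a hypothetical function you supply yourself (for example backed by a query), and `max_rounds` is an assumed cut-off so the process stops somewhere.

```python
from collections import deque


def expand_authors(seed_authors, coauthors_of, max_rounds=2):
    """Collect authors reachable from the seeds within max_rounds co-author hops.

    coauthors_of: caller-supplied function, author QID -> iterable of co-author QIDs.
    Returns the authors in the order they were discovered.
    """
    order = list(dict.fromkeys(seed_authors))  # seeds, deduplicated, order kept
    known = set(order)
    frontier = deque((author, 0) for author in order)
    while frontier:
        author, depth = frontier.popleft()
        if depth >= max_rounds:
            continue  # do not expand beyond the last round
        for coauthor in coauthors_of(author):
            if coauthor not in known:
                known.add(coauthor)
                order.append(coauthor)
                frontier.append((coauthor, depth + 1))
    return order
```

Each round corresponds to one more run of the process for the newly found co-authors; a small `max_rounds` keeps the result relevant to your subject.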

To review your progress, you use Scholia. It has multiple modes that help you gain an understanding of authors, papers, subjects, publications and institutions. You will see the details evolve. NB: mind the lag Wikidata needs to update its database. It is not instant gratification.
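To jump straight to the right Scholia view for an item, a small helper can build the page address. This is a sketch under assumptions: the base address is where Scholia is hosted at the time of editing, and the mapping of the modes named above to the aspect names "author", "work", "topic", "venue" and "organization" is my reading of the Scholia pages, so check it against the site.

```python
import re

# Assumed base address and aspect names; verify against the live Scholia site.
SCHOLIA_BASE = "https://scholia.toolforge.org"
ASPECTS = {"author", "work", "topic", "venue", "organization"}


def scholia_url(aspect, qid):
    """Return the Scholia page address for a Wikidata item viewed as the given aspect."""
    if aspect not in ASPECTS:
        raise ValueError(f"unknown aspect: {aspect!r}")
    if not re.fullmatch(r"Q[1-9]\d*", qid):
        raise ValueError(f"not a QID: {qid!r}")
    return f"{SCHOLIA_BASE}/{aspect}/{qid}"
```

For example, `scholia_url("author", "Q42")` gives the author view for item Q42.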

A few observations. Your aim may be to be "complete", but publications are added all the time and the same is true for scientists. People increasingly turn to ORCiD for a persistent identifier for their work. The real science is in assigning a subject to a paper. Arguably the subject may be in the title of the article, but as an approach that is a bit coarse. I leave that to you, as your involvement makes you a subject "specialist".
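To show why matching on the title is coarse, here is a minimal sketch that proposes "main subject" (P921) candidates from titles alone. The function name, the QIDs and the titles are invented for illustration; the output is meant for human review, not for adding to Wikidata unchecked.

```python
def subject_candidates(titles_by_qid, term, subject_qid):
    """Propose (paper QID, "P921", subject QID) triples for titles containing the term.

    Purely a substring match, ignoring case; it knows nothing about meaning.
    """
    needle = term.casefold()
    return [
        (qid, "P921", subject_qid)
        for qid, title in titles_by_qid.items()
        if needle in title.casefold()
    ]


# Invented sample data: placeholder QIDs and titles.
papers = {
    "Q101": "Mercury levels in freshwater fish",
    "Q102": "The orbit of Mercury revisited",
    "Q103": "Zinc uptake in soil",
}
candidates = subject_candidates(papers, "mercury", "Q999")
```

Both the paper about the element and the paper about the planet end up as candidates for the same subject, which is exactly where the subject specialist has to step in.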

