Saturday, February 13, 2016

#Wikidata - all notable #Psychiatrists - a query for big data

Wikidata currently knows some 2992 psychiatrists. When you think about psychiatry, its relevance is in the number of people who have to deal with it.

The tool used is one that does not get much attention. It takes its time to complete, but it returns whatever the query asks for. The tool has done its job admirably for several years now. Its one redeeming advantage is that it does what 'official query' does not offer: it can be used regardless of the size of the results.
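For illustration only, here is a rough sketch of what a "people with occupation psychiatrist" query might look like, once in WDQ-style claim syntax and once as SPARQL for 'official query'. It only builds the query strings; it does not contact any service. P106 is the Wikidata property for occupation; the item number used for "psychiatrist" (Q211346) is an assumption that should be checked on Wikidata before use, and the function names are made up for this sketch.

```python
# Sketch: build the query text for "all items whose occupation is X".
# Nothing here talks to a server; it only shows the shape of the query.

OCCUPATION_PROPERTY = 106      # P106, "occupation" on Wikidata
PSYCHIATRIST_ITEM = 211346     # ASSUMPTION: Q211346 is "psychiatrist"; verify first


def wdq_claim(property_id: int, item_id: int) -> str:
    """Return a WDQ-style claim expression, e.g. CLAIM[106:211346]."""
    return f"CLAIM[{property_id}:{item_id}]"


def sparql_for_claim(property_id: int, item_id: int) -> str:
    """Return a SPARQL query selecting every item with the given claim.

    The wd: and wdt: prefixes are the ones the Wikidata query service
    predefines, so they are not declared here.
    """
    return (
        "SELECT ?item WHERE { "
        f"?item wdt:P{property_id} wd:Q{item_id} . "
        "}"
    )


if __name__ == "__main__":
    print(wdq_claim(OCCUPATION_PROPERTY, PSYCHIATRIST_ITEM))
    print(sparql_for_claim(OCCUPATION_PROPERTY, PSYCHIATRIST_ITEM))
```

The query itself is tiny either way; the point of the argument here is not how hard the query is to write but whether the service behind it will return a result set of a few thousand items or more.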

Some say that it is unrealistic to ask for this quality of service from 'official query' because "it has not been designed with this in mind". The friendliest thing to say is that this is a mistake. Official Query was supposed to replace WDQ, and when it cannot do so by design, the design is wrong. A better argument would be that Wikidata is one of the biggest public-facing resources on the Internet and people are actually using it; at this time it cannot cope. It takes money, and lots of it, to serve the whole world. Possibly. That, however, is an acceptable argument: it allows for seeking one or more solutions.

One existing solution is the "Toolkit": when you have your own datastore, you can throw as much hardware at it as it takes to get results. You can give the WMF targeted money to have more hardware for you and me, or to implement existing software that may do the trick. We could explore whether federated technology as it exists for Wikipedia could make a difference. What cannot be done is hiding behind an arbitrary choice that was insufficient from the start, because official query is meant to replace what we already have, not take away from it.
