Friday, November 08, 2019

Bias in @Wikidata and a SMART approach

When quality was presented at WikidataCon, it was rated on a scale from 1 to 5. This approach has its own bias because it does not consider what may not be there. What is not there can be made visible using assumptions like "a university has more than one employee" (employees include professors) and "every country has at least one university".
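Assumptions like these can be turned into simple checks. The sketch below is a minimal illustration in Python, run against a tiny hand-made sample and not against live Wikidata data. The function name `find_gaps`, the record layout and all sample values are hypothetical and only serve to show the idea.

```python
from collections import Counter


def find_gaps(countries, universities, employments):
    """Return (countries without a university, universities with fewer than two employees).

    countries:    iterable of country names
    universities: dict mapping university name -> country name
    employments:  iterable of (person, university) pairs
    """
    # Assumption 2: every country has at least one university.
    countries_with_university = set(universities.values())
    missing_university = sorted(
        c for c in countries if c not in countries_with_university
    )

    # Assumption 1: a university has more than one employee.
    # Duplicate (person, university) pairs are counted once.
    staff = Counter(university for _, university in set(employments))
    understaffed = sorted(u for u in universities if staff[u] < 2)

    return missing_university, understaffed


# Hypothetical sample data, for illustration only.
countries = ["Country A", "Country B", "Country C"]
universities = {"University X": "Country A", "University Y": "Country B"}
employments = [
    ("Person 1", "University X"),
    ("Person 2", "University X"),
    ("Person 3", "University Y"),
]

missing_university, understaffed = find_gaps(countries, universities, employments)
print("Countries without a university:", missing_university)
print("Universities with fewer than two employees:", understaffed)
```

Everything the check reports is something that is probably missing from the data, not something that is wrong in the data, and that is exactly what a 1 to 5 rating of existing items does not show.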

The bias in Wikidata starts with the way it is mostly used and consequently how it is taught. People are shown what Wikidata looks like, immediately followed by training in the use of queries and tools. At every level it takes considerable skill to make use of Wikidata. The first hurdle to overcome is to understand the data in a single item. When your language is not English you are toast. This is Cape Town in Newari, and this is a useful presentation using Reasonator. With Reasonator the information is easy to digest and adding missing labels is just one click away.

The second hurdle is knowing which bias you want to remedy. For a known bias like the gender gap, the Women in Red have lists of missing Wikipedia articles. A Wikidata gap is expressed by the absence of data. Listeria lists are great at showing that. These are all the universities of Africa. If you do not get the extent of what we miss, you have some thinking to do. When you apply this principle to the science of Africa, you find a lot of lists, and the biggest issue remains: missing lists.

When you tackle a missing subject like I did for the "Affiliates of the African Academy of Sciences", you will find a source as a reference for the group and a reference on every affiliate. To ensure that the data is relevant and actionable, I added all of them and linked them to ORCiD and/or Google Scholar, enabling SourceMD to link them to their papers. I added nationality because this may trigger inclusion on the Women in Red lists and, when it was obvious, I added employers so that they may be included as scholars on African University lists.
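What makes such data actionable can be expressed as a small completeness check. The sketch below is only an illustration in Python, assuming hand-made records and not real Wikidata items. The `Affiliate` class, the `missing_fields` function and the sample values, including the placeholder identifier, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Affiliate:
    name: str
    orcid: Optional[str] = None
    google_scholar: Optional[str] = None
    nationality: Optional[str] = None
    employer: Optional[str] = None


def missing_fields(affiliate):
    """List what is still missing before tools and lists can pick up this person."""
    gaps = []
    # Either identifier is enough to link a person to their papers.
    if not (affiliate.orcid or affiliate.google_scholar):
        gaps.append("author identifier")
    # Nationality may trigger inclusion on lists of missing articles.
    if not affiliate.nationality:
        gaps.append("nationality")
    # An employer allows inclusion on the lists of university scholars.
    if not affiliate.employer:
        gaps.append("employer")
    return gaps


# Hypothetical sample records, for illustration only.
affiliates = [
    Affiliate("Person 1", orcid="0000-0000-0000-0000",  # placeholder, not a real iD
              nationality="Country A", employer="University X"),
    Affiliate("Person 2", google_scholar="placeholder-id", nationality="Country B"),
    Affiliate("Person 3"),
]

report = {a.name: missing_fields(a) for a in affiliates}
for name, gaps in report.items():
    print(name, "->", ", ".join(gaps) if gaps else "complete")
```

An empty list for a person means every statement that the lists and tools mentioned above rely on is present.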

When we as a movement want to fight bias, we have to consider the use of lists, and particularly Listeria lists, to show the development of a subject. With lists available on many Wikipedias, it becomes possible to gain traction on what we miss. This approach is distinctly different as it acknowledges the need for more support for item-based editing, and it makes the point that missing data is a fundamental quality issue that needs to be addressed.
Thanks,
      GerardM
