Tuesday, November 12, 2019

Instant gratification at @Wikidata

As I write this, it is 11:46am; at 09:26am I added papers to Prof. Hafida Merzouk. The edits are picked up by Reasonator but not by Scholia. In a similar way, edits are not picked up by Listeria.

Instant gratification is now a thing of the past. The work done at Wikidata may eventually be picked up in Scholia or Listeria, but it is not funny. Can I tweet about the things I find or have done when Wikidata no longer reflects the relevant changes?

This may sound trivial, but it does mean that when I want to look back at my work, there is no longer a timely way to do so.

Instant gratification motivates and it is a factor in maintaining quality. We are losing it.
Thanks,
      GerardM

Saturday, November 09, 2019

Put (modern) #science of #Africa on the map

A young African scholar commented that the info on the websites of African scholarly organisations was all about their past. There is a point to recognizing those who did good and consequently making it obvious that the science of today is rooted in the past.

African scientists, like any other scientists, have a place in Wikidata with their affiliations, papers, co-authors and also with their scholarly advisors. My proposal is for all scholars to check if they are on Wikidata and check if their doctoral thesis is on Wikidata. Then add their doctoral advisor to their own item and, in return, add themselves as a doctoral student on the advisor's item.
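The two-way link between advisor and student can be checked mechanically. What follows is only a minimal sketch on made-up data: the identifiers and the "advisors" / "students" keys are hypothetical stand-ins for the doctoral advisor and doctoral student statements, not real Wikidata items or an actual API.

```python
# Hypothetical items keyed by made-up identifiers (not real Wikidata IDs).
# "advisors" stands in for doctoral advisor statements,
# "students" for doctoral student statements.
items = {
    "scholar_a": {"advisors": ["scholar_b"], "students": []},
    "scholar_b": {"advisors": [], "students": ["scholar_c"]},
    "scholar_c": {"advisors": ["scholar_b"], "students": []},
}


def missing_reciprocal_links(items):
    """Return (advisor, student) pairs where the advisor's item does not list the student."""
    missing = []
    for student, data in items.items():
        for advisor in data["advisors"]:
            listed_students = items.get(advisor, {}).get("students", [])
            if student not in listed_students:
                missing.append((advisor, student))
    return sorted(missing)


missing = missing_reciprocal_links(items)
for advisor, student in missing:
    print(f"{advisor} should list {student} as a doctoral student")
```

In this sample, scholar_a names scholar_b as advisor, but scholar_b only lists scholar_c, so that one pair is reported.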

Do not forget to include where you studied and for what university you work(ed). Check if your ORCiD profile includes trusted organisations like CrossRef that will update your profile when appropriate. When many of you do this at Wikidata you will be surprised what the impact will be.
Thanks,
      GerardM

Friday, November 08, 2019

Bias in @Wikidata and a SMART approach

When quality was presented at the WikidataCon, it was rated from 1 to 5. This approach has its own bias because it does not consider what may not be there. What is not there can be made visible using assumptions like "a university has more than one employee" (employee includes professors) and "every country has at least one university".
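Such assumptions translate directly into checks. This is a minimal sketch, assuming the data has already been collected into two plain dictionaries; the country and university names are invented placeholders, and in practice the data would have to come from Wikidata itself.

```python
# Hypothetical sample data; names are placeholders, not real entries.
universities_by_country = {
    "Country A": ["University X", "University Y"],
    "Country B": [],
}
employees_by_university = {
    "University X": ["Prof. One", "Dr. Two"],
    "University Y": ["Prof. Three"],
}


def find_gaps(universities_by_country, employees_by_university):
    """Apply both assumptions and return what violates them.

    Returns a tuple:
      - countries with no known university
      - universities with fewer than two known employees
    """
    countries_missing = sorted(
        country
        for country, universities in universities_by_country.items()
        if not universities
    )
    universities_thin = sorted(
        university
        for universities in universities_by_country.values()
        for university in universities
        if len(employees_by_university.get(university, [])) < 2
    )
    return countries_missing, universities_thin


countries_missing, universities_thin = find_gaps(
    universities_by_country, employees_by_university
)
print("Countries without a university:", countries_missing)
print("Universities with fewer than two employees:", universities_thin)
```

Everything the check reports is, by assumption, missing data rather than a fact about the world.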

The bias in Wikidata starts with the way it is mostly used and consequently how it is taught. People are shown what Wikidata looks like, immediately followed up with training in the use of query and the use of tools. At every level it takes considerable skill to make use of Wikidata. The first hurdle to overcome is to understand the data in a single item. When your language is not English you are toast. This is Cape Town in Newari and this is a useful presentation using Reasonator. With Reasonator the information is easy to digest and adding missing labels is just one click away.

The second hurdle is knowing what bias it is you want to remedy. For a known bias like the gender gap, the Women in Red have lists of missing Wikipedia articles. A Wikidata gap is expressed by the absence of data. Listeria lists are great at that. These are all the universities of Africa. If you do not get the extent of what we miss, you have some thinking to do. When you apply this principle to the science of Africa, you find a lot of lists and the biggest issue remains: missing lists.

When you tackle a missing subject like I did for the "Affiliates of the African Academy of Sciences", you will find a source as a reference for the group and a reference on every affiliate. To ensure that the data is relevant and actionable, I added all of them and linked them to ORCiD and/or Google Scholar, enabling SourceMD to link them to their papers. I added nationality because this may trigger inclusion on the Women in Red lists and, when it was obvious, I added employers so that they may be included as a scholar on African University lists.

When we as a movement want to fight bias, we have to consider the use of lists, and particularly Listeria lists, to show the developments of a subject. With lists available on many Wikipedias, it becomes possible to gain traction on what we miss. This approach is distinctly different as it acknowledges the need for more support for item based editing and it makes the point that missing data is a quality issue that needs to be addressed as a fundamental issue.
Thanks,
      GerardM

Thursday, November 07, 2019

@Wikipedia talks about @Wikidata

"WD is unreliable. WP:V and WP:RS are completely ignored (from any editors). International NPOV is a problem too." It is so SMART, that the best I can do is ignore it. Then again, it is an open invitation to talk about Wikipedia. There is no one Wikipedia; there are over 300 Wikipedia language editions. So even the acronyms are lost on me, as there is no one Wikipedia to rule them all.

So forget about acronyms and let's talk Wikidata and by inference raise issues particularly for the English Wikipedia where appropriate. First, Wikidata includes more items than there are subjects raised in any and all Wikipedias. Its quality can be considered in many ways and verifiability is largely ensured because of the association with other "authorities" about a subject. Thanks to the increased use of open data, it is possible to verify that specific statements are shared, increasing the likelihood that they are correct. For some information, like for scientists who are members of the AAS Affiliates Programme, we have/may have references to the authoritative source. Such references may be on a project or on an item level; either way they make verifiability easy and obvious.

Wikidata has an issue with all kinds of gaps in its coverage. For many African countries no universities are known and there are hardly any scholars associated with them. Thanks to Listeria functionality we can monitor if and when data is added. Many Wikipedias do not have such tools because of the aversion to Wikidata by some. At the same time projects like Women in Red rely on Listeria lists and by inference Wikidata to know what to work on.

Tools like Reasonator and Listeria generate lists and, when you compare them with Wikipedia lists, the quality is measurably better. I published frequently in the past about the Polk award. In its lists Wikipedia has a likely error rate of six percent. When they fudge the record by not linking at all, the quality of Wikidata lists is even better because Wikidata is much better at linking items than Wikipedia is at linking red links. There is a solution; it just requires a willingness by Wikipedians to cooperate.

I understand what is meant by "international NPOV" and it is where Wikidata is by definition better than an individual Wikipedia. By definition because Wikidata represents data from ALL Wikipedias. Thanks to the people of DBpedia, there is a potential to highlight where Wikipedias differ and it is more likely that the fruit of their labour will enrich Wikidata than Wikipedias.

So a Wikidatan walks into a bar..
Thanks,
       GerardM

Monday, October 21, 2019

Adegoke O. S. - Fellow of the African Academy of Sciences

It is easy enough to add "O.S. Adegoke" to Wikidata and mark him as a fellow of the AAS. With only initials there is no way to know the gender, and to me that is quite unsatisfactory. This is when Google becomes your friend: you find that Mr Adegoke is addressed as "Silvester".

There are some 384 fellows and slowly but surely they find their way into Wikidata. If there is a point to it, it is the same point why there are fellows of the African Academy of Sciences: "they provide Advisory and Think Tank functions and help to develop strategies that promote science in Africa and that are relevant to the continent".

The objective of Wikipedia and, by inference, Wikidata is to share in the sum of all knowledge. As we do not really consider what is relevant for our public in Africa and for those interested in Africa, the AAS in its choices of its fellows at least points in the right direction. Adding the AAS fellows to Wikidata is a puzzle because the format of names differs. Some 240+ fellows are known at Wikidata as data, but for it to become informative there is a need for supplemental data and, even better, Wikipedia articles.
Thanks,
     GerardM

Saturday, October 05, 2019

Rebecca R. Richards-Kortum

A text on the Internet read: "She’s Rice’s first-ever MacArthur grant winner. But her real claim to fame? Her clever medical inventions might just save your life." It is not as if I know her even though I added to her Wikidata item in the past.

I looked her up because she approves of the NEST360° organisation on Twitter. It is an organisation committed to reducing neonatal mortality in sub-Saharan hospitals by 50 percent.

Such organisations deserve a place in Wikidata; it has members that I am adding. I consider it part of my "Africa project" even though it does not have a place there yet.

Yesterday I added an item for "neonatal care", and all the papers about neonatal care that are already included in Wikidata need to be associated with the subject. Scientists like Prof Joy Lawn are to be marked for their specialty.

How is it possible that it takes a 60 year old white male from the Netherlands to add something this basic to Wikidata? We are talking about more yearly deaths than Ebola.
Thanks,
       GerardM

Tuesday, October 01, 2019

What data is wrangled is obvious when its presentation is considered

When you watch a game, you want to know the score. When you have a favourite author, you want to know all his/her publications and when you hear about a place you want to know where it is. Easy.

Such data may be included in a repository like Wikidata and, in essence, the data is still simple. You still want to know the score, the publications or the location. The question is how you get the data in a format that makes sense.

People are really good at understanding data when it is in an agreeable format. These are three formats for the same data: a scientist in Wikidata. This is how Wikidata presents its data and imho the data is really hard to understand. This is the same data in Reasonator, a general purpose tool that shows data and its relations. It can be used for all kinds of data; it is my goto tool to get to grips with data related to one item. Finally, Scholia presents data formatted in a way that makes sense for this scientist.

Given how awful the default presentation of Wikidata is, it is obvious why everyone teaching the use of Wikidata focuses on querying the data, and therefore people seek/work on the results provided in what is their default tool. I typically focus on particular subjects. Today it was Dr Shima Taheri: I added a reference, some publications and genders for her co-authors. To do this I am triggered by the presentation of the data in the tools I use.

The holy grail for Wikidata is the use of its data in Wikipedia info boxes. However, people are taught to query data and that approach does not align well with the data items you find in info boxes. So when the purpose of Wikidata is in Wikipedia info boxes, presentation needs to become a priority.
Thanks,
      GerardM