Sunday, January 12, 2020

Science and Africa - what collaboration exists and how do we know?

As I am adding a large number of African scientists to Wikidata, I find that I have moved into a green field. A green field as far as Wikipedia and Wikidata are concerned.

To learn how the information about African science evolves in Wikidata, I created Listeria lists that inform about universities by country, fellows/members of academies of science and members of African young science organisations.

What I produce is scaffolding: basic information that enables. The information that I use from the Royal Society of South Africa for its fellows includes dates, other awards, employers and even dates of death. Slowly but surely more information is being added for these people, and consequently you will also find, for instance for Rhodes University, more employees and additional papers (currently only 1385 papers are known for its 84 scholars).

A scholar like Tebello Nyokong, a scholar at Rhodes University, has 637 papers to her name. She is a world class scientist and has four Wikipedia articles to her name. All kinds of questions may be queried for her co-authors: the gender distribution, the organisations they represent, the nationality of the co-authors.
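A question like the gender distribution of co-authors can be sketched as a SPARQL query for the Wikidata Query Service. The helper below only builds the query text; it is a minimal sketch, not the query Scholia itself uses. The function name is hypothetical and the QID in the usage line is a placeholder, not the item of any scientist mentioned here. It assumes the Wikidata properties P50 (author) and P21 (sex or gender).

```python
import re


def coauthor_gender_query(author_qid: str) -> str:
    """Build a SPARQL query counting an author's co-authors per gender.

    Hypothetical helper: it only returns the query text and does not
    contact the Wikidata Query Service.
    """
    # Accept only well-formed Wikidata item IDs such as "Q42".
    if not re.fullmatch(r"Q[1-9]\d*", author_qid):
        raise ValueError(f"not a Wikidata item ID: {author_qid!r}")
    return f"""
SELECT ?gender ?genderLabel (COUNT(DISTINCT ?coauthor) AS ?count) WHERE {{
  ?work wdt:P50 wd:{author_qid} ;      # works by the given author
        wdt:P50 ?coauthor .            # everyone else credited on them
  FILTER(?coauthor != wd:{author_qid})
  OPTIONAL {{ ?coauthor wdt:P21 ?gender . }}  # gender may be missing
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
GROUP BY ?gender ?genderLabel
ORDER BY DESC(?count)
""".strip()


# Placeholder QID for illustration only.
print(coauthor_gender_query("Q42"))
```

Swapping P21 for P27 (country of citizenship) or P108 (employer) would give the nationality or organisation variant. Co-authors who exist only as an "author string" are not counted, because they have no item yet.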

Obviously, African science is not well represented at this time. This is a reflection of how people perceive and value African science... In essence it reflects a bias of regular Wikimedia editors. The regular Wikimedia editors are in the West and have no reason to consider African science, but this is a bias. It is highly likely that it will be hard to get Wikipedia articles accepted for African scientists because of a lack of sources and probably a lack of this perceived Western relevance.

Adding one scientist at a time does not make much of a difference. When scientists are added as part of a SourceMD process, any and all scientists who have a public ORCiD profile are likely to get included in Wikidata. This is why so many African scientists are already known. When a notable scientist is then recognised as a recipient of an award, we may already know about the papers they authored.

The SourceMD process is no longer available. It coincides with a lack of resources at Wikidata so any and all resources used for science papers are now available to something else. Understandable, but the result is that I am no longer motivated to seek ORCiD identifiers and consequently, the process is increasingly broken.

Thursday, January 02, 2020

Scaffolding in Wikidata - the Christiaan Hendrik Persoon Medal

Professor Brenda D. Wingfield is another scientist who won an award Wikidata knew nothing about. This time it is the Christiaan Hendrik Persoon Medal. The Wikidata basics for an award are its name, the conferring organisation and a website with details. A bonus is a link to the person or organisation the award is named after.
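The basics for an award can be written down as a small checklist. This is a minimal sketch, assuming the Wikidata properties P1027 (conferred by), P856 (official website) and P138 (named after). The class and field names are hypothetical, and the conferring organisation and website are deliberately left empty because they are not given here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Wikidata property IDs for the basics of an award (the name is the label).
BASICS = {
    "conferred_by": "P1027",
    "official_website": "P856",
}


@dataclass
class AwardScaffold:
    """Hypothetical record of what is known about an award item."""
    label: str
    conferred_by: Optional[str] = None
    official_website: Optional[str] = None
    named_after: Optional[str] = None  # the bonus: P138

    def missing_basics(self) -> List[str]:
        """List the basic statements that still need to be added."""
        return [
            f"{field} ({pid})"
            for field, pid in BASICS.items()
            if getattr(self, field) is None
        ]


medal = AwardScaffold(
    label="Christiaan Hendrik Persoon Medal",
    named_after="Christiaan Hendrik Persoon",
)
print(medal.missing_basics())
```

A check like this shows at a glance which awards in a list still lack their scaffolding.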

The Persoon Medal is a South African award, and its conferring organisation is new to Wikidata as well. In an article about Prof B.D. Wingfield, all the other recipients are named as well; there are only six in a time span of 53 years so it was not hard to add them all.

The objective is to make connections. For both the award and Prof Wingfield, connections show best in a Scholia. One of the more frequent co-authors is a Michael J. Wingfield. He is a co-recipient of the Persoon Medal, a fellow member of the Academy of Science of South Africa and a duplicate of Mike Wingfield, and yes, he is the spouse of Prof B.D. Wingfield as well. He is at this time Q73879566 in the Scholia, waiting for papers to be attributed to the earlier item.

Another frequent co-author, Bernard Slippers, is known from a different context. He is a member of both the South African Young Academy of Science and the Global Young Academy. Given some personal connections it was easy to ask if by chance Prof Wingfield is his doctoral advisor (Mike Wingfield is the primary; they both are advisors).

The point of scaffolding is that it provides the structures that enable finding the data, preferably in a context. Given that most data is static, the static representation that is Listeria is really powerful. When you group the lists like I did for African science or Young Academies, you get the satisfaction of understanding what work is done/has been done on a subject. The icing on the cake is when you enable collaboration. I am grateful to Robert Lepenies for picking up the lead and informing other young academies about what Wikidata may mean for them. I am grateful to Daniel Mietchen for improving the queries I use; they now show the number of publications known for each scientist and a link to the tool that enables attribution to that scientist.

My role is a simple one. I add data. Data that connects, gives relevance but most importantly data that may be picked up in queries, lists by others. The scaffolds are made by others, relevance only happens when others pick it up. My point: there has to be something to pick up.
Thanks and happy new year,

Friday, December 27, 2019

The value of incomplete data - Fellows of the Ecological Society of America

This is about understanding data in Wikidata. The article is about understanding what you can and cannot do with incomplete data; it is not so much about the Ecological Society of America.

The most recent work started with the news of a new Wikipedia article. Prof Cottingham is a 2015 fellow of the ESA and there is a category for fellows. Adding her and other missing fellows to Wikidata showed that for one fellow there was no Wikipedia article. At the time there were 90 known fellows and for only two it was known when they became a member.

I expected that new fellows would be known to Wikidata not just as an "author string" but that they would be an "item". So I added 14 of the 2019 cohort and found this not to be the case. I then looked up the known fellows on the ESA webpage and added their dates to Wikidata because I wondered whether it is particularly the older fellows that are represented in Wikipedia.

While adding the dates, I added many alternate names to aid disambiguation. I removed one item and found two false friends: fathers mistaken for their sons. When I was done, I had a good impression of the data on the website and even though I do not have the full numbers, I feel confident in my belief that it is the old ecology/ecologists that are represented in Wikipedia.

When you scrutinize the list of fellows, you will find that it includes "Early Career Fellows". They are "elected for advancing the science of ecology and showing promise for continuing contributions" and they take part for a limited amount of time. Programs like these are known from all over the world and from many science organisations. This time I did not spend time on them but from previous experience I can safely say that promising is putting it mildly.

Wikidata is a wiki and as such, the work that I did is of value even though it is incomplete. I did not add all the missing fellows, for instance. The ESA is very much an organisation for America (check the employment of its fellows), yet it takes pride in global attention and solicits membership fees from all over the world. It takes a lot of additional data when you want to determine whether its subject matter is biased towards America and in what way.

For many of the fellows I added, there are papers with "author strings" waiting to be linked to an author. The same can be said for the fellows that are still missing. The ESA can be compared to other ecological organisations but how to deal with the differences takes a completely different understanding. It takes more data to make this possible but the data does not need to be complete; that is the beauty of averages.

Thursday, December 26, 2019

Why didn’t @Wikidata have an item on Margaret Nakakeeto, a champion for living babies?

Ed Erhart famously wrote in 2018 "Why didn’t Wikipedia have an article on Donna Strickland, winner of a Nobel Prize?" A year later we can say that it is extremely likely that a Donna Strickland or a Margaret Nakakeeto is known in Wikidata, if only because they are a co-author of a paper (technically: an "author string").

When Ed wrote his article, it was to highlight the gender gap we have in Wikipedia. It is arguably relevant and important and it needs the attention it gets. However, it does not follow that it is the only "gap" that needs addressing, nor does it follow that the gender gap is the gap with the biggest impact.

When you consider Africa and particularly science in Africa, the subjects that matter most in Africa are reflected in, for instance, the Scholia for the members of the Academy of Science of South Africa. As far as I now know, its gender ratio is 27% and this is a list with a mix of Wikipedia articles and Wikidata items. It shows the attention African science gets in Wikipedia nicely.
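How such a ratio can be computed from an incomplete list is sketched below. It assumes the ratio means the share of women among the members whose gender is recorded, which is my reading and not something the list states. The function name is hypothetical and the sample values are made up for illustration; they are not the academy's data.

```python
from collections import Counter
from typing import Iterable, Optional


def female_share(genders: Iterable[Optional[str]]) -> Optional[float]:
    """Share of "female" among entries with a recorded gender.

    Entries without a recorded gender (None) are left out of the
    denominator instead of being counted as male.
    """
    counts = Counter(g for g in genders if g is not None)
    known = sum(counts.values())
    if known == 0:
        return None  # nothing recorded, so no ratio can be given
    return counts["female"] / known


# Made-up sample: five recorded genders, two unknown.
sample = ["female", "male", "male", None, "female", "male", None]
print(female_share(sample))
```

The number only describes the items that have a gender statement, so it shifts as more statements are added.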

In Africa there is a huge amount of attention for maternal and neonatal care (e.g. Uganda) and as programs impact the health and survival of women, it follows that more women will become notable, notable for Wikipedia.

By giving attention to female African scientists, the subjects they are known for gain relevance. Their Scholias are developed, including links to co-authors and papers. It will improve the likelihood that when African science awards are announced, we will at least know the recipients in Wikidata.

Saturday, December 21, 2019

#Science and America first

Several US American science organisations are quite adamant that for them, it is America first. Stupidity has its place and these days the United States has a lot of it, particularly as those same science organisations expect people from the rest of the world to accept the "pre-eminence" of the USA.

There may be good reasons to be a member of these organisations but from my perspective, it is one thing to be with stupid, it is another to have these organisations argue their case on "your" behalf. So when you are a scientist, chances are that we already know you at Wikidata. We may even know about your science, your co-authors, your memberships.

Take for instance Prof Lise Korsten. She is probably South African and this is her Scholia. She has many co-authors; for some we do not know their gender and for most we do not know their nationality. We do not know if she is a member of any science organisation and we do not know that for her co-authors either. So you may add your professional memberships and your nationality at Wikidata, and when you do know the nationality of your co-authors, you may add that as well.

In this way we make it obvious to US American stupid that science is global.

Thursday, December 12, 2019

Disseminate science says @EstherNgumbi, @Wikimedia projects have the power to do just that

In this day and age science is of the utmost importance. When I am pointed to a conference where an African scientist gives the plenary lecture, with the message on display in the picture, I take an interest.

When you want to disseminate research, when you want the science to be known by society, you have to pick your platform. You can do worse than choosing the Wikimedia projects.

Professor Esther Ngumbi is employed at the University of Illinois at Urbana–Champaign. Her ORCiD profile has only one paper but at Wikidata we knew of others. As she is now known at Wikidata with her papers, she has a Scholia. At first there was only one co-author, a bit sparse, so others were added. They were linked to the papers they have on Wikidata. The same was done for some authors who cited Professor Ngumbi.

When you and your science are known in Wikidata, you are more likely to get a Wikipedia article and yes, working for an American university helps. An ORCiD profile that is open will be even more potent when you trust organisations like your university or CrossRef to update your ORCiD when they know about your papers, your new papers.

In this day and age where our ecology is no longer stable, it is vital to know and respect the science. While we aim for the best we have to be prepared for the worst; we have to see it coming. It is why our Wikimedia projects should inform about all the science and not just what a Wikipedia article has as a reference.

Wednesday, December 11, 2019

Jack needs help, so do we and, so do our audiences

Jack penciled his aspirations for Twitter in a tweet. In it he states: "... Second, the value of social media is shifting away from content hosting and removal, and towards recommendation algorithms directing one’s attention. Unfortunately, these algorithms are typically proprietary, and one can’t choose or build alternatives. Yet."

It is good news that Jack seeks a way out. He intends to hire a "small independent team of up to five open source architects, engineers, and designers" and "Twitter is to become a client of this standard".

In the Wikimedia projects we have similar challenges and opportunities. For all kinds of reasons we cannot expect there to be a Wikipedia article for every scientist who is very much in the news (aka relevant); Dr Tewoldeberhan is a recent example. But there is no reason why we cannot have her, her work and the work of any other scientist in Wikidata. With tools like Scholia we already have a significant impact by making more known than just what may be found in a Wikipedia. Jack, we do know many scientists by their Twitter handle; they already make the case for their science on Twitter. This makes it easy for you to link to and expand on Scholia. What we give our readers is more to read so that they can find confirmation for what they read.

Jack, Wikidata is not proprietary, Scholia is not proprietary and the Wikimedia motto is "to share in the sum of all knowledge". Together we can shift focus from what we have read before in the Wikipedias to what there is to read on the Internet. We can put stuff in context and bring the scientists who care to inform about their science into the limelight.

What we do not have is the pretense that we cover everything well. We do aim to cover everything notable well. What we provide is static, Twitter is much more dynamic and together we will change the landscape. Great technology combined with both the Twitter and Wikimedia communities has the potential of being awesome.