Tuesday, October 29, 2024

The virality of co-authors in urology

Happy birthday, Wikidata, and many happy returns.

When you start enriching the data for a Dutch urologist, an academic who has published quite a number of scientific papers, there are bound to be many co-authors. Many of them are yet to be identified; at this moment some 339 are still to be added for Jakko A. Nieuwenhuijzen.

The main consideration is what has the biggest impact. As a colleague of Mr Nieuwenhuijzen is known at Google Scholar, adding papers for that colleague brought new publications to Mr Nieuwenhuijzen and to many of his co-authors. Enriching the data for these co-authors makes the graph more complex.

At some point more precision in the data for a single author is no longer worth the effort. When you then find another urologist with many papers not yet attributed and many co-authors whose gender Wikidata does not know yet, the focus shifts and many more edits make their way into Wikidata.

Many of these co-authors are from the same institute, but people from elsewhere find their place in these graphs as well. Many are Dutch, but urology knows many international collaborations, and this is reflected in the expanding number of co-authors.

As a topic is developed in this way, it easily results in thousands of edits. As many subjects are researched in this way, the enriched data is there for the world to use. This data is only of value when there is a public. Sharing in the sum of all knowledge has always been what we stand for. Sharing freely and widely gives us both a public and a future.

Thanks,

      GerardM

Sunday, October 27, 2024

The fallibility of notability

When Wikidata is split into a "science" part and "all the rest", scientists who have a Wikipedia article will need to be part of the "rest" as well. This is necessary because all Wikipedia articles link to Wikidata through the "interwiki" mechanism.

It follows that there will be an overabundance of USA scientists and there will hardly be any scientists from Africa or South America.

Some data about scientists is likely to be considered part of "all the rest", awards for instance. Are scientists who received an award to be known in two data sets? Some scientists had a career as an athlete, another reason for duplication. It is hard enough to maintain the interwiki links and the existing duplication within Wikidata; it will become exponentially more difficult when another data set is added.

When the creation of Wikimedia Commons was considered, similar good reasons led to hesitation and kept us from biting the bullet for quite some time. Commons started with the creation of a wiki and a MediaWiki patch that showed a picture in a Wikipedia; it then took a long time before most of the duplicate pictures existed only in Commons. It was not technically perfect, but it was done perfectly in the wiki way.

I hope that we will bite the bullet this time as well. With a new unrestricted Wikibase, the old batch jobs can be dusted off to make up for the years of academic data we missed. I pray that Scholia will become functional soon after.

I will still be able to do my Wikidata thing: projects like African politicians, Muslim countries and their rulers (past and present), and awards that can do with an update, obviously including science awards. I will not be bored, but maybe I will be working .. maybe not.

Thanks,

       GerardM

Saturday, October 26, 2024

Old soldiers never die, they march in the remembrance parades

As our movement matures, the people who were there at the beginning age. They get other priorities, they get sick, they are operated upon, and as a consequence they have a windfall of time to do more work at Wikidata.

I did a similar job for a dear fellow Wikimedian. It is now my turn: my surgeon is in this picture, and as I add missing co-authors the picture becomes more complex. It will also become more complex when existing co-authors are enriched with new and linked papers.

With Wikipedia there is the promise that even though the information will evolve, all the work people have put in will be there in the future and will enable people to read and study the subjects each editor cared for.

The data of Wikidata as it is will be split into parts. This is for the best of reasons, but once its structure is broken, the tools that bring structure to the data will be broken as well. The same tools that enable the enrichment of the data will be broken. Much of my Wikimedia legacy will be lost because there will no longer be a public enabled to learn about scholarly works in a wiki way.

For a few years now this sword of Damocles has hung over Wikidata. As a consequence the potential of Wikidata is not being realised. The data could be so much richer if automated processes brought free knowledge together. References in Wikipedia could point to later papers and so improve its quality.

As long as I can I will do my Wikidata thing; hope is eternal.

Thanks,

       GerardM