Wednesday, December 26, 2018

Professor @steve_hanke and reading what is #FAIR

Professor Hanke is on Twitter. He has five Wikipedia articles and his information on Wikidata is well developed. With a scholar of his eminence, you would expect a lot of known publications as well. However, never mind the 153 English Wikipedia references, never mind the links to 13 external authorities, finding his work is neither easy nor obvious.

The problem with the Wikipedia references is that they are a hodgepodge of links about him and links to his works. His VIAF registration may bring you some of his works but it will not tell you where his books are cited. Mr Hanke does not have an ORCID identifier and consequently it is not easy to include his data in Wikidata.

This is not about Mr Hanke; in certain fields of science people do not have an ORCID identifier or are not open about their publications. When you are interested in a specific subject or a specific scientist, it helps when the information is FAIR.

So what is missing? There is a database with all Wikipedia references; it needs to be included in Wikidata as soon as possible. It may require a fair deal of social manoeuvring to include all Wikipedia references in Wikidata. But the benefits will be huge. Given that Wikipedia references are backed up by the Internet Archive, this will extend to these links in Wikidata as well. It makes them FINDABLE and ACCESSIBLE. At Wikidata, this data becomes INTEROPERABLE and REUSABLE (FAIR).

So my 2019 wish for the Wikimedia Foundation is to become FAIR in what it says and what it does.

Tuesday, December 25, 2018

Dear Katherine: Socialization Tactics in Wikipedia and Their Effects

The contract of Wikimedia employees says that they are not allowed to blow their own horn in any of the Wikimedia projects. According to a very senior Wikimedia official, this is why they cannot add or contribute information about scientific papers like Socialization Tactics in Wikipedia and Their Effects to Wikidata.

Dear Katherine, you will agree with me that this is a perverse effect of a well-intentioned item in personnel contracts. So let me tell you more about the effects and how we can overcome this issue.

As you know, there is a thriving research community, and its presentations showcase the research on Wikimedia projects. These presentations are recorded and may be found on YouTube. Typically these showcases are based on scientific papers. These papers should be recorded in Wikidata with all the details, as is done for any and all subjects. When a paper is properly covered, we know all its authors, the papers it cites and, in time, the papers that in turn cite it. When an author is well covered, we know every paper published, the co-authors, the subjects and the citing authors. We know this because of Scholia. Scholia is what prevents Wikidata from being a stamp collection; it is what makes a subject come alive, what brings data together, makes it digestible and gives it relevance.

Not so for subjects relating to Wikimedia, apparently for contractual reasons. There are several strategies to overcome this. But first let us decide what we are, what we do and why this matters.

Wikimedia is a publisher of scientific papers; currently there are three, and in order to raise the impact of the papers it publishes, they have to gain visibility. To do this we can associate with ORCID, and publish and certify all the details of papers to their authors. One of the things we do on a big scale is re-publish data from ORCID. They have a program whereby they can sync their information with ours. They collaborate with Crossref and so could we. When we do, we make Open Science much more visible.

Dear Katherine, what we have shown is that we can and do care about publications, about citations. We care about science. The least we want is for our own research to be presented as well as we can. In order to achieve this we have to consider the unintended impact of a provision in a labour contract and overcome this self-inflicted barricade.

Tuesday, December 18, 2018

#Wikidata and the papers of Professor Wiesje van der Flier

Professor van der Flier has an ORCID identifier. She works at the Neurology/Alzheimer center of the VU University Medical Center.

Mrs van der Flier has a co-author in Iris E. Sommer. We know that they have at least one co-author in common, Edwin van Dellen. There may be more, and we will certainly know of those co-authors who are as open about their work. Professor Sommer was the initial interest because she is a member of "de Jonge Akademie".

We will know because they have an ORCID identifier. At Wikidata it serves two vital functions. First, it helps with disambiguation; a job was run for all people with the surname "Li"... Second, given that ORCID allows people to share their information publicly, it allows us to import the publications of authors and identify their equally open co-authors.
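Because disambiguation hinges on the identifier itself, it helps to check that an ORCID iD is well-formed before matching on it. What follows is a minimal sketch, not the job that was actually run: it assumes the publicly documented ORCID format (16 characters in four hyphenated groups, the last being a check character computed with ISO 7064 MOD 11-2). The function names are my own and purely illustrative.

```python
import re

# Four groups of four characters; only the final character may be "X".
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True when the string has the ORCID iD shape and a correct check character."""
    candidate = orcid.strip().upper()
    if not ORCID_PATTERN.match(candidate):
        return False
    digits = candidate.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    print(is_valid_orcid("0000-0002-1825-0097"))
    print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this only catches typos and malformed strings. It says nothing about whether the iD belongs to the person you think it does; that still requires looking at the public record.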

The Scholia page for Professor van der Flier listed 31 people who were certainly new to me. They are being processed, and chances are that at the end of it Wikidata will know more of Mrs van der Flier's co-authors and more of her papers, and her representation in Wikidata will be more complete.

Yes, it will only know the co-authors that are open about their work, but that is only FAIR.

Sunday, December 09, 2018

#Science; I can read

The basis for what Wikipedia articles offer is their sources. Those sources can be anything, and when we want to know the veracity of what we read, the sources have to be available. Not only that, we rely on those sources to be consistent and we rely on those sources to be readable.

When sources are on the web, the Internet Archive will have iterations of a source available in its Wayback Machine. It ensures that sources remain available and thereby much of the integrity of Wikipedia is maintained.

For scientific sources we are unlucky. Reading a scientific paper can set you back $45, and that only allows you to read the paper for a day. In effect such papers cannot be read; we "have to" trust them, and there are plenty of papers that are extremely problematic and also expensive to read.

Increasingly, papers are FAIR: Findable, Accessible, Interoperable and Reusable. The best first-line partners we have are again the Internet Archive and ORCID. Organisations like the Biodiversity Heritage Library store scientific papers at the IA, thereby making them available for as long as the IA exists. ORCID is where living scientists identify themselves and, if they so choose, the publications they (co-)authored. It makes them and/or their papers findable. The papers typically include a DOI, making them accessible. After that it is anyone's guess if you can actually read them.
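To make the DOI point concrete: a DOI becomes a link by putting it behind the https://doi.org/ resolver. Here is a small sketch of that step, assuming the DOI may arrive with a "doi:" prefix or already wrapped in a resolver URL. The function name and the loose sanity check are my own; this is not a full DOI syntax validator.

```python
from urllib.parse import quote

# Prefixes people commonly paste along with a DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(doi: str) -> str:
    """Turn a DOI, with or without a common prefix, into a doi.org resolver URL."""
    cleaned = doi.strip()
    lowered = cleaned.lower()
    for prefix in KNOWN_PREFIXES:
        if lowered.startswith(prefix):
            cleaned = cleaned[len(prefix):].strip()
            break
    # Loose sanity check only: DOIs start with "10." and have a "/" after the prefix.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path.
    return "https://doi.org/" + quote(cleaned, safe="/")


if __name__ == "__main__":
    # 10.1000/182 is the DOI of the DOI Handbook itself.
    print(doi_to_url("doi:10.1000/182"))
```

The link tells you where the paper lives. Whether you get to read it once you arrive is, as said, another matter.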

Scientists who are open about their work may find that they and their work have found their way into Wikidata. For Karsten Suhre this was done; his scientific work is represented in his Scholia and many of his co-authors have been automatically added from ORCID and have been processed as well. His co-authors that are not as open are largely missing, but that is only FAIR; I do not volunteer to promote them.

What Wikidata has is not representative of all of science, but it increasingly represents the science that is open access: the science that I can read, that you can read, that is there for all of us to read. The science that deserves to be used as sources in Wikipedia. We can read.