Tuesday, January 01, 2019

The #decline of #Wikipedia (as we know it)

We are regularly told of misgivings about Wikipedia: it cannot stay as it is, it is in decline, it is all doom and gloom. NB the use of the phrase "doom and gloom" increased in the 1950s.

So Wikipedia will not remain as we know it? GOOD, it forces us to think how we can improve what we have. When things are to change, what will have a healthy impact? How will we get something that serves us better in "sharing the sum of all knowledge"? How will we get more people to use what we have to offer and how will we entice more people to contribute to the data collection that is included in all the Wikimedia Foundation projects?

First thing: our projects need to be less US-American. A POV situation I was in was "obviously" decided in favour of considering only the USA point of view; I let it slide but went to pastures green. The money we raise is for "keeping the servers going"; an objective a bit too limited to my taste, but it raises the cash. Money is mainly raised in the USA, but in order to be truly global it is better to raise it more equally in every country, at least for the amount it costs to serve that country. Gapminder is where you may be reminded that money is everywhere. As to the servers, why have all crucial eggs in one USA basket? Given its current politics, a doom and gloom scenario is indeed possible. Having the servers more dispersed will bring our data closer to our audience and our editors, benefiting them with better performance; that is the easy win. A more complicated solution lies in implementing the Vrije Universiteit research on a peer-to-peer MediaWiki.

When our projects are to be less US-American, it is important for spending to be more global too.

When today's Wikipedia practices are no longer considered to be set in stone, we can finally implement features that enable, ensure and enhance its future. First, we should be less self-centred; after all, there is only one sum of all knowledge and we define only a part of it. Magnus showed how to maintain lists in an efficient way and Amir recently added a "task" to Phabricator to implement proper disambiguation of "red links". We are increasingly aware not only of the references of all Wikipedias but also of publications by scientists that enable their work to be found. Complement this with the scientific papers we publish and we improve the public relevance of scientists by making them findable, by pointing to their science.

With a changed approach at Wikipedia, we may be bold and change the outlook on what Wikipedia is there for as well. Why not make Wikipedia the gateway to information held elsewhere? Why not show a Scholia page for every scientist we know, why not offer the books at OpenLibrary or inform on the availability of books at the local library? Why not partner with other organisations with which we share an objective? But most importantly, let us be aware that an African professor teaches in Africa and that we allow for and enable the context of our partners and volunteers.

For me there is no reason for doom and gloom, as there are so many opportunities to become even more effective. With a whole new year in front of us, let us do well.

Wednesday, December 26, 2018

Professor @steve_hanke and reading what is #FAIR

Professor Hanke is on Twitter. He has his five Wikipedia articles and his info on Wikidata is well developed. With a scholar of his eminence, you would expect a lot of known publications as well. However, never mind the 153 English Wikipedia references, never mind the links to 13 external authorities, finding his work is neither easy nor obvious.

The problem with the Wikipedia references is that they are a hodgepodge of links about him and links to his works. His VIAF registration may bring you some of his works but it will not tell you where his books are cited. Mr Hanke does not have an ORCID identifier and consequently it is not easy to include his data on Wikidata.

This is not about Mr Hanke; in certain fields of science people do not have an ORCID identifier or are not open about their publications. When you are interested in a specific subject or a specific scientist, it helps when the information is FAIR.

So what is missing? There is a database with all Wikipedia references; it needs to be included in Wikidata as soon as possible. It may require a fair deal of social manoeuvring to include all Wikipedia references in Wikidata. But the benefits; the benefits will be huge. Given that Wikipedia references are backed up by the Internet Archive, this will extend to these links in Wikidata as well. It makes them FINDABLE and ACCESSIBLE. At Wikidata, this data becomes INTEROPERABLE and REUSABLE (FAIR).

So my 2019 wish for the Wikimedia Foundation is to become FAIR in what it says and what it does.

Tuesday, December 25, 2018

Dear Katherine: Socialization Tactics in Wikipedia and Their Effects

In the contract of Wikimedia employees it says that they are not allowed to blow their own horn in any of the Wikimedia projects. According to a very senior Wikimedia official, this is why they cannot add or contribute information in Wikidata about scientific papers like Socialization Tactics in Wikipedia and Their Effects.

Dear Katherine, you will agree with me that this is a perverse effect of a well-intentioned item in personnel contracts. So let me tell you more about the effects and how we can overcome this issue.

As you know, there is a thriving research community and its presentations showcase the research on Wikimedia projects. These presentations are recorded and may be found on YouTube. Typically these showcases are based on scientific papers. These papers should be recorded in Wikidata with all the details, as is done for any and all subjects. When a paper is properly covered, we know all its authors, the papers it cites and in time the papers that in turn cite it. When an author is well covered, we know every paper published, co-authors, subjects, citing authors. We know this because of Scholia. Scholia is what prevents Wikidata from being a stamp collection; Scholia is what makes a subject come alive, it is what brings data together, makes it digestible and gives it relevance.

Not so for subjects relating to Wikimedia, apparently for contractual reasons. There are several strategies to overcome this. But first let us decide what we are, what we do and why this matters.

Wikimedia is a publisher of scientific papers; currently there are three and in order to raise the impact of the papers it publishes, they have to gain visibility. To do this we can associate with ORCID, and publish and certify all the details of papers to their authors. One of the things we do on a big scale is re-publish data from ORCID. They have a program whereby they can sync their information with ours. They collaborate with Crossref and so could we. When we do, we make Open Science much more visible.

Dear Katherine, what we have shown is that we can and do care about publications, about citations. We care about science. The least we want is for our own research to be presented as well as we can. In order to achieve this we have to consider the unintended impact of a provision in a labour contract and overcome this self-inflicted barricade.

Tuesday, December 18, 2018

#Wikidata and the papers of Professor Wiesje van der Flier

Professor van der Flier has an ORCID identifier. She works at the Neurology/Alzheimer center of the VU University Medical Center.

Mrs van der Flier has in Iris E. Sommer a co-author. We know that they have at least one co-author in Edwin van Dellen. There may be more, and we will certainly know of those co-authors who are as open about their work. Professor Sommer was the initial interest because she is a member of "de Jonge Akademie".

We will know because they have an ORCID identifier. At Wikidata it serves two vital functions. First, it helps with disambiguation; a job was run for all people with the surname "Li"... Second, given that ORCID allows people to share their information publicly, it allows us to import the publications of authors and identify their equally open co-authors.
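To illustrate why an identifier helps with disambiguation, here is a minimal sketch in Python. It is not how any Wikidata job actually works; the records, names and identifiers are made up for illustration, and the placeholder ORCID iDs are not real ones. The only point is that grouping on the identifier merges name variants of one person and keeps namesakes apart.

```python
from collections import defaultdict

# Hypothetical author records; names, papers and ORCID iDs are made up.
records = [
    {"name": "Li Wei", "orcid": "0000-0000-0000-0001", "paper": "Paper A"},
    {"name": "Li Wei", "orcid": "0000-0000-0000-0002", "paper": "Paper B"},
    {"name": "W. Li", "orcid": "0000-0000-0000-0001", "paper": "Paper C"},
]

def group_by_orcid(records):
    """Group papers by ORCID iD, so name variants merge and namesakes stay apart."""
    people = defaultdict(lambda: {"names": set(), "papers": []})
    for record in records:
        person = people[record["orcid"]]
        person["names"].add(record["name"])
        person["papers"].append(record["paper"])
    return dict(people)

people = group_by_orcid(records)
for orcid, person in sorted(people.items()):
    print(orcid, sorted(person["names"]), person["papers"])
```

Grouping on the name alone would have lumped both people called "Li Wei" together and treated "W. Li" as a third person.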

The Scholia page for Professor van der Flier knew 31 people who were certainly new to me. They are being processed and chances are that at the end of it Mrs van der Flier will know more co-authors, more papers and her representation in Wikidata will be more complete.

Yes, it will only know the co-authors that are open about their work, but that is only FAIR.

Sunday, December 09, 2018

#Science; I can read

The basis for what Wikipedia articles offer is their sources. Those sources can be anything and when we want to know the veracity of what we read, the sources have to be available. Not only that, we rely on those sources to be consistent and we rely on those sources to be readable.

When sources are on the web, the Internet Archive will have iterations of a source available in its Wayback Machine. It ensures that sources remain available and thereby much of the integrity of Wikipedia is maintained.

For scientific sources we are unlucky. Reading a scientific paper can set you back $45 and it only allows you to read that paper for a day. In effect all such papers cannot be read; we "have to" trust them and there are plenty of papers that are extremely problematic and also expensive to read.

Many papers are increasingly FAIR. They are Findable, Accessible, Interoperable and Reusable. The best first line partners we have are again the Internet Archive and ORCiD. Organisations like the Biodiversity Heritage Library store scientific papers at the IA thereby making them available for as long as the IA exists. ORCiD is where living scientists identify themselves and if they so choose, the publications they (co-)authored. It makes them and/or their papers findable. The papers typically include a DOI making them accessible. After that it is anyone's guess if you can actually read them.

Scientists that are open about their work may find that they and their work have found their way into Wikidata. For Karsten Suhre this was done; his scientific work is represented in his Scholia and many of his co-authors have been automatically added from ORCiD and have been processed as well. His co-authors that are not as open are largely missing but that is only Fair; I do not volunteer to promote them.

What Wikidata has is not representative of all of science but it increasingly represents the science that is open access, the science that I can read, that you can read, that is there for all of us to read. The science that deserves to be used as sources in Wikipedia. We can read.

Thursday, November 15, 2018

Bringing more #science to @Wikidata

Slowly but surely more scientific papers and their authors find their way into Wikidata. Particularly when scientists have staked their claims in ORCiD, adding is easy and obvious.

It is easy because in ORCiD every author, paper, organisation and so on has its own unique identifier. So when you add a paper, all authors who claimed authorship are already linked.

Earlier today, I added papers and co-authors for Jaume Piera. As a consequence Laura Recasens was added today, and as you can see in the illustration of her co-authors, several new authors popped up in turn.

To do this I use a combination of tools. Reasonator is my preferred tool to display data; for a scientist it tells me if he or she is known to be an author. When that is the case, Scholia presents the scholarly author information. Of particular relevance to me is the co-author presentation. For co-authors shown in white, no gender is given in Wikidata, and when the name is an initial and a surname, I will look up the ORCiD information to find a full name. Typically that is how people are known in ORCiD.

I use the SourceMD tool for two purposes: "creating and amending papers for authors" and to "add metadata from ORCiD authors to Wikidata". It is processed as a batch job; I run one job for up to 15 authors at a time and it takes forever to run.
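For anyone preparing such jobs from a longer list of authors, splitting the list into groups of at most 15 can be sketched in a few lines of Python. This is only an illustration of the batching; it does not talk to SourceMD, whose interface I make no assumptions about, and the identifiers below are hypothetical placeholders rather than real ORCiD iDs.

```python
def batches(items, size=15):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical placeholder identifiers, not real ORCiD iDs.
orcid_ids = [f"author-{n:03d}" for n in range(1, 33)]

jobs = list(batches(orcid_ids))
for number, job in enumerate(jobs, start=1):
    print(f"job {number}: {len(job)} authors, {job[0]} .. {job[-1]}")
```

Each slice can then be pasted into the tool as one job, in the same order as the original list.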

Other people run other jobs; a particular hat tip to Daniel Mietchen, who makes sure that recent publications find their way into Wikidata and finds many other reasons to improve on what we have. All this would not be possible without the many tools by Magnus, and for Scholia I do thank Finn Årup Nielsen; thanks to this evolving presentation, science as a process comes alive.

There is more to do: the Wikipedia citations are in a separate database and much of its data may be found in Wikidata. Who will merge them? Publications do cite other publications; it is a field I am not really interested in. The citations are being added, so there must be a tool.

When you are interested in a particular scientist or a particular paper, just use the tools, and slowly but surely we all make Wikidata a great tool to represent science fact.

Saturday, November 10, 2018

More #impact for your #science is in being a #source at @wikipedia

In a study about how students research a new subject it was found that they read the Wikipedia article first. Then they move to its sources and from there it takes off.

In order to have an impact you, as a scientist, want to be there first, getting attention for your work. Here are a few tips.
  • Make sure that you and your work are known. First make your work known at ORCiD. From there it gets into Wikidata.
  • PS check out the Scholia presentation of you and your scientific work. (example)
  • Make sure that your work can be read. Wikipedia actively seeks free reads using the OAbot.
  • Do not think that current practices of your field will benefit new scientists in the future. Many fields are not well represented at ORCiD.
For your information: there is a database with the sources used in Wikipedia. The only thing lacking is that this database still needs to be integrated in Wikidata for it to gain a real impact.