Saturday, July 17, 2021

Making Wikicite a success by putting the Wikipedia editor first

Wikicite as a project has come to an end after a five year run. Much has been achieved (read the article). One great follow up idea is to harvest all the references of all the other Wikipedias and have all this data together so that we can analyse our quality even better.

All the data, all the Wikicite activities have been valued; people have been thanked, accomplishments announced and it is now left open how to move on. The best way to move forward is to bring a public to the data and add value. 

Every Wikipedia article has its references, and we put Wikipedia editors and readers first when we show all the known references and their relations with older and newer publications. One button that provides the latest information will create buy-in and make it more interesting. It links to papers and to authors, and requested updates are not part of a serial process but of a targeted one.

We can provide a similar service for the authors of scholarly works. They will find in Wikidata what we know about them: identifiers for VIAF, ORCID, Google Scholar, ResearchGate, Twitter, even Scopus. They can improve the information on their publications; they can add identifiers, references to other publications, co-authors and subjects, and see it improve the associated Scholia representations. What we can do for them is regularly harvest information and update Wikidata with the new or altered information.

When we truly build relationships, it is no longer essential that everything should be on Wikimedia servers. Why have the same project at the Internet Archive and at the Wikimedia Foundation? Why not share the workload? When we put our Wikipedia editors first, we provide them with the best information on the literature and publications we have on offer. Most references are in the Wayback Machine anyway; we already rely on this, so why not collaborate and share both the effort and the cost?

When we truly care about sharing the sum of all knowledge, it is in the eating that we find the proof, not in the dogma.

Thanks, GerardM

Monday, July 05, 2021

The pain that is in maintaining lists of African governmental politicians

At the French Wikipedia, there is a category with 16 Finance ministers of Morocco. A category is an unsorted list and the French category contains a few more entries than the English category. Neither Wikipedia has a (sorted) list.

I maintain lists of African heads of state and African governmental ministers, so this is just the next list to prepare. The workflow is as follows: I create an item for a position, and on the talk page I add a template that contains the data and pointers to what is missing. Typically such information can be found on a Wikipedia list; this time I have to rely on templates on Wikipedia articles. All my lists are "works in progress" and are on my watch list.

I have added copies of the Listeria lists to many Wikipedias. The texts are in English, the data is shown in the local language. It being in English is a reason to refuse these lists in several communities.

The Listeria functionality is replaced by Wikimedia Foundation maintained software. All the list definitions and expressions exist in one place. The data is shown in a language that can be selected. When a list changes, it is reflected on watch lists. You can watch for all data and possibly for data and labels in "your" language. A project can opt in to enable the use of these lists; when it does not, functionality is available to compare a local list with the centrally maintained list.
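
To make that last comparison concrete, here is a minimal sketch in Python. It is not existing Wikimedia software; the function name `compare_lists` and the placeholder entries are made up for illustration, and it assumes both lists can be reduced to a set of identifiers.

```python
def compare_lists(local_items, central_items):
    """Report how a local list differs from the centrally maintained list."""
    local = set(local_items)
    central = set(central_items)
    return {
        # entries the central list has but the local list lacks
        "missing_locally": sorted(central - local),
        # entries that exist only in the local list and may need review
        "only_local": sorted(local - central),
    }


# Placeholder entries, for illustration only.
central = ["minister-A", "minister-B", "minister-C"]
local = ["minister-A", "minister-C", "minister-X"]

report = compare_lists(local, central)
print(report)
```

A community that does not opt in could be pinged with such a report whenever "missing_locally" is not empty.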

Thanks, GerardM 

Sunday, July 04, 2021

The pain that is in maintaining the same list 300 times

As you may know, the board of the Wikimedia Foundation recently changed its composition. The President of the board resigned. This was easily reflected in English. Tonight there is a meeting of candidate board members for the Middle East and North Africa.

Obviously the information in Arabic needs to be correct and at this time it is not. There are many more languages supported by Wikimedia in the Middle East and North Africa and all of them could have information about the board and what it stands for. This inspired me to come up with the following user story:
An administrative person of the Wikimedia Foundation is tasked with maintaining specific lists relevant to the movement. The data is maintained at Wikidata and lists that exist potentially in all Wikimedia languages, some 300, are updated with software maintained by the Wikimedia Foundation. Each WMF list is on a watch list; when changes occur in the list, quality is centrally maintained.

Thanks, GerardM 

Saturday, July 03, 2021

Two "user stories" for Nigeria

I spoke with my Nigerian friend Olatunde Olalekan Isaac on Facebook about growing more interest for Wikimedia content in Nigeria. I will not bore you with a verbatim account of the discussion we had. Bringing you two user stories is much more satisfying.

Nigerian kids looking for pictures of a fireman go to their Wikipedia and search for an "onye oku oku". The search engine knows from where the request comes and shows the pictures of Nigerian firemen first.

The pictures are from Commons, in the meta data of the picture it says where the photo was taken. Happy kids, happy teachers and happy Nigerian Wikimedians because this brings more attention for the projects they care for.
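
A minimal sketch of such a ranking, in Python. It assumes every picture carries a country code taken from its metadata; `Picture` and `rank_for_reader` are hypothetical names and this is not how the Commons search engine is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Picture:
    title: str
    country: str  # where the photo was taken, according to its metadata


def rank_for_reader(pictures, reader_country):
    """Show pictures from the reader's own country first.

    sorted() is stable, so the existing relevance order is kept
    within the local group and within the remaining group.
    """
    return sorted(pictures, key=lambda p: p.country != reader_country)


# Made-up search results, already ordered by relevance.
results = [
    Picture("Fireman in Berlin", "DE"),
    Picture("Fireman in Lagos", "NG"),
    Picture("Fireman in Paris", "FR"),
    Picture("Fireman in Enugu", "NG"),
]

ranked = rank_for_reader(results, "NG")
for picture in ranked:
    print(picture.title)
```

Nothing is hidden in this approach; the pictures from elsewhere still follow the local ones.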

At first it is an experiment that brings more traffic in their language. They then launch a photo contest in Nigeria.

People find pictures in Nigerian languages, and to increase the choice of pictures, a photo contest is launched for everyday Nigerian objects, traffic signs, shops of all kinds and professionals; "be bold and show us your Nigeria" is the mantra. People find more pictures about Nigeria, and even after the contest people continue to add to the selection available for illustrations.

The Wikimedia Foundation has another challenge: there are copycats all over the world and the public use of Commons increases by 200% in a few months' time.

Thanks,  GerardM 

The promise of things to come that is in @Wikicite

Wikicite is for me the biggest disappointment of all the Wikimedia projects. A disappointment not because it is not worthwhile, not because it did not bring us insights about what happens in English Wikipedia, but because all the papers, conferences and data did not result in a transformation into active user stories. Scholia comes closest: it identifies Wikipedia articles where a scholarly paper is used as a reference.

Wikicite conferences have come and gone. They brought the best and brightest minds together discussing all kinds of ideas, all kinds of research. Once the papers were presented, the assertions discussed and analysed, when the conference was done they went home. From what I can observe in Wikimedia land, not much has changed.

Wikimedia Foundation has a real good friend in the Internet Archive. It already does much of what Wikicite could do. It archives everything that is used as a reference. When a reference goes dark, it links the reference to the backup. In FatCat it has information on scholarly papers and many of the papers, in Open Library it has information on books and many of the books. Many bots originating from the IA run on many Wikimedia projects. 

If it were up to me, I would have Wikicite as a joint project of the Internet Archive and the Wikimedia Foundation. A joint project will be based on the existing reality that is in the Internet Archive. Wikicite and Wikimedia projects bring it additional data, a public and additional user stories. Funding from the Wikimedia Foundation enables the development that such a synergy brings.

Data for all the citations in English Wikipedia linked to scholarly papers was available. We know in Wikidata, in FatCat, in ORCID, in Scopus how to disambiguate authors. When all this data gets integrated, a user story mentioned in a previous blog post is not fancy and easily becomes best practice. Thanks to collaboration with the Internet Archive there is less duplication of effort and the sum of the shared knowledge we hold is so much bigger; we can provide an even better service to our public.

Friday, July 02, 2021

Calling a spade a spade and the "friendly space policy"

In my previous blog post I established that facts supported by science trump personal opinions, even community consensus. It follows that pointing out a biased opinion can be hugely unpopular and considered offensive. The "friendly space policy" is applied on the spot and at best can be later appealed at the "trust and safety". This is after the fact, it takes a huge time effort and it is why I did not bother. One other reason is that I told the respected Wikipedian that what he said was biased and I did not make excuses for saying so after prodding by a trusted officer of the Wikimedia Foundation. I am sanctioned and can no longer use certain functions.

I run for a seat on the board of the Wikimedia Foundation and even though it is a popularity contest, I am not in it to be Mr Wonderful; I will not kiss babies. My platform is to improve our service particularly to the projects other than English Wikipedia. To be successful, I have established that this is doable and I have to undermine accepted opinions that prevent us from improving on our service.

I truly respect the Wikipedian who made the biased remarks but I will not apologise for pointing his bias out. 

When I am to establish my aims at board level, there must be acceptance that we do not serve everyone with the sum of all the knowledge we have when there is this persistent bias towards our English speaking public. The recent developments at the Croatian Wikipedia show that "community consensus" can be overridden when this is necessary to establish our accepted global Wikimedia policies. It follows that the board should be mindful of the evolving science about the "gender gap" and insist that English Wikipedia consider its recommendations (aka clean up its act). It follows that even the most respected English Wikipedian, and even in the light of the "friendly space policy", can be called out for such bias.

As to the organisation of the WMF; its director reports to the board and is to inform about the development for other languages and the effects on the traffic in other languages. 

Thanks, GerardM

Thursday, July 01, 2021

What science has to say about the English Wikipedia gender gap

Why Men Don’t Believe the Data on Gender Bias in Science
 A respected Wikipedian expressed the opinion that people have it wrong when they say that English Wikipedia is biased against women. In the same week a professor stated on Twitter that her students no longer edit Wikipedia because of the toxic reception they get. As an example she mentioned a quote from an award winning scientist that was removed because "that scientist lacks relevance".

In this same week the  American sociologist Francesca Tripodi published the paper: "Ms. Categorized: Gender, notability, and inequality on Wikipedia". The paper is a scholarly read with 55 references. Most of these references are previous scholarly works, some are references to Wikipedia resources like the notability page. The references have been included in Wikidata and this is visualised in the Scholia for the paper. Please read at least the Discussion and conclusions of the paper. 

This and previous research leaves no room for evasion: English Wikipedia is biased. A personal opinion of the respected Wikipedian may differ, the consensus of the community may differ but both are biased.

Thanks, GerardM

Sunday, June 27, 2021

Science Is Shaped by Wikipedia: Evidence From a Randomized Control Trial

A scholarly paper has it that a paper cited in Wikipedia gets a bigger exposure in other scholarly papers. That is in itself quite huge, it speaks of the impact that Wikipedia has in scholarly circles.

When the cited papers trigger citations, it follows that the new papers represent much of the scientific development of the subject of the Wikipedia article. That makes it obvious that in order to keep Wikipedia articles up to date, it is relevant to know the papers that cite these scholarly articles.

To help editors, Scholia already provides assistance. There is a Scholia for each paper; this one is for the paper I referred to earlier, and there is a random Scholia for a subject identified in Wikidata. Just think what a Scholia specific for a Wikipedia article could be and do.

Maintaining information is a drag and there are many bots that are used to append and amend the information that we have based on what we have in a Wikipedia article. First, there is the Internet Archive bot that safeguards information on websites by making a copy for the Wayback Machine. For many of the references we know a DOI and it is easy enough to ensure that we know the associated paper in Wikidata. It is easy enough to make a process out of it for a single paper.

What it takes is a different take on processes; not the traditional serial process with serial results but a  focused process that supports a user story. 

In the initial phase of a review of a Wikipedia article, the Wikipedian starts a process that updates all the citations to the existing Wikipedia references. Once the process is done, the citations are known, these papers have been fleshed out with open information and the Wikipedian will know the latest science for the subject reviewed.
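
One step of such a process could be a query to Wikidata for the papers that cite a given reference. The sketch below, in Python, only builds the SPARQL text and does not send it anywhere. It assumes, to the best of my knowledge, that P356 is the Wikidata property for a DOI and P2860 the one for "cites work", and that DOIs are stored in upper case; the function name and the example DOI are made up for illustration.

```python
def citing_papers_query(doi: str) -> str:
    """Build a SPARQL query for papers in Wikidata that cite the paper with this DOI."""
    # Normalise to upper case (assumed Wikidata convention) and escape
    # backslashes and quotes so the DOI is safe inside a SPARQL string literal.
    safe_doi = doi.strip().upper().replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "SELECT ?citing ?citingLabel WHERE {",
        f'  ?cited wdt:P356 "{safe_doi}" .',  # P356: DOI (assumed)
        "  ?citing wdt:P2860 ?cited .",       # P2860: cites work (assumed)
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
        "}",
    ]
    return "\n".join(lines)


# A made-up DOI, for illustration only.
query = citing_papers_query("10.1234/example")
print(query)
```

Running one such query per reference with a known DOI is what makes the process targeted at a single article rather than serial over everything.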

Much of the functionality exists; when the user story is supported, it is not only Wikipedians and scholars but also our public that is invited to share in the sum of what we know.

Thanks, GerardM

Sunday, June 20, 2021

@Wikimedia and its defined #predisposition towards its editors

The predisposition of the Wikimedia Foundation can be found in its mission statement: "The mission of the Wikimedia Foundation is to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally". Contrast this with a previous version where: "every single human being can freely share in the sum of all knowledge". The Wikimedia Foundation defines the movement with "Wikimedia is a global movement whose mission is to bring free educational content to the world". 

It is easy to understand how the Wikimedia Foundation prioritises its communities when developing its programs. The assumption has always been that once the data is available, a public will come. The second part of its mission statement, however, gives the Wikimedia Foundation a defined role to play in the global and effective dissemination of the educational content it holds.

When you monitor the effectiveness like I have, then you do not care for promises for the future, you want the data to show how a difference is made. An easy one is to compare content delivery in English against all other languages; English content is slightly less than 50% of traffic, this has been the same for many years. When a Commons project that worked for all languages no longer worked because of a failed text integration, it was seen as "we did not test for that". More importantly, the text integration was not removed. This shows that the second part of the WMF mission statement is not seen as critical.

I am a candidate for the board. In many ways I am the wrong guy for the job. The wrong guy because I have a platform and I do question the lack of growth in the delivery of the content that we have. For me this is a classic situation where lack of data equals no problem. It is not a bias measured in percentage of Wikipedia articles for one group versus another group; the bias I demand attention for is the percentage of data delivered in a language relative to the size of the population that relies on this language.
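
As a rough sketch of how such a measure could be computed, here is a small Python example. The function name and all figures are made up for illustration; they are not actual Wikimedia traffic or population data.

```python
def delivery_ratio(traffic_share: float, population_share: float) -> float:
    """Share of traffic delivered in a language, relative to the share
    of the population that relies on that language.

    A value near 1 means delivery is in line with the population,
    well below 1 means the language is underserved.
    """
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return traffic_share / population_share


# Hypothetical shares (fractions of the total), for illustration only.
languages = {
    "language A": {"traffic": 0.50, "population": 0.20},
    "language B": {"traffic": 0.01, "population": 0.05},
}

ratios = {
    name: delivery_ratio(shares["traffic"], shares["population"])
    for name, shares in languages.items()
}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

With data like this reported on a regular basis, a lack of growth in delivery would be visible instead of being "no data, no problem".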

The wrong guy because this is not a popular issue. People are offended when you point out that their point of view implies a bias. People fear they lose out because their hobby horse is a niche project comparatively and not part of the mission statement. The right guy because you can check out my past in my blog that I kept for over 15 years, you will find that I have been practical and on mission for dissemination of our content. 

For me it does not matter if I win or lose; the mission itself is critical and I want the Wikimedia board and the Wikimedia director to realise that the second part of the Foundation's mission statement is where it could have done so much better.

Thanks, GerardM

Tuesday, June 15, 2021

What to do on the Wikimedia board; imho first things first

It is a stated objective of the Wikimedia Foundation to have "knowledge equity". When you care to know the practicalities, just search Wikimedia in one of the 200 languages with the least Wikipedia content and compare it with a search in English. English provides you with vastly more of everything.

Nowadays, "knowledge equity" is accepted and strategic at the Wikimedia Foundation. This acceptance is a recent development; gone are the days when a most senior WMF executive stated that one Wikipedia, at most five, should be enough. 

Given that we now support some 300 languages, the translatewiki.net community has always been the cornerstone of the knowledge equity we provide in our projects. Without its high quality internationalisation and localisation efforts the knowledge equity we provide would not have been possible. It follows that any and all knowledge equity for a culture, a language has as its precondition that the tooling, the environment has been properly localised.

It becomes really strategic when the tooling is provided by outside parties. A key role in ensuring that our references remain available is performed by the Internet Archive. It has been on my wish list for a long time that the software of its Wayback Machine, Open Library and FatCat is localised at translatewiki.net. Any and all of our language communities that find the resources will benefit our projects and strengthen us in the shared aim of "sharing in the sum of all knowledge".

When we are to achieve "knowledge equity", the first thing we should provide is a level playing field. With all tools critical for the maintenance of our projects properly internationalised and available for localisation as a movement we achieve the most basic objective; we enable all our communities to be their best.

Thanks, GerardM

Sunday, June 13, 2021

Board member of the @Wikimedia Foundation .. for me it is about customer value

Recently many changes happened at the Wikimedia Foundation; its director and its chair of the board ended an illustrious term of office. New elections have been called for board members of the Wikimedia Foundation. This makes it the best of times to ask attention for the customer value our projects provide. 

We want more people to make use of our services and we want to provide more and improved services to them. 

The Wikimedia services are optimised for the English language. More than 50% of our traffic is for the other languages and this makes it easier to improve our service for the 299+ other languages. This has never been a priority there and this makes for many easy pickings.

Commons for instance provides 100% false positives when you look for "bever" with the environment set to Dutch. The old functionality worked really well and it should be easy enough to revert to the fully functional version. Research will show whether, as a result, people change their language, more pictures are linked to Wikidata and Wikidata gains more labels. When people search with a string that provides no result, we can search Wiktionary and show suggestions for what we find. Oh and when people are from Singapore, why not be biased and show pictures from Singapore first?

One of the challenges for any Wikipedia is the maintenance of lists. Particularly for the smaller projects it is too much of a drain. I have dabbled with for instance lists of African politicians. Arguably information on the current national functionaries for any country should be available and up to date as much as American or European politicians. When a Wikipedia wants to have an up to date list, it should be possible to subscribe to a list. The functionality is to be supported by the Wikimedia Foundation. When for whatever reason a community does not want such functionality, they can still be pinged when there is some work for them to do.

Wikisourcerers transcribed original works in many languages. As a project Wikisource is great for the sourcerers but for readers ... For some languages websites were created where the finished works have a friendly interface for READERS. Bringing all the work to a public is what brings value to all the work the sourcerers put in.

We may not be a commercial organisation but the service we provide is relevant and valuable to our "customers". When you read my blog that has been running for sixteen years, you find that I have been consistent in bringing attention to what we could do to improve our service. Given that we could do so much more, I put myself forward as a candidate for board member of the Wikimedia Foundation. The least I hope to achieve is attention for the results of what we do.

Thanks, GerardM

Monday, May 24, 2021

@Wikimedia needs your support because what it does, what we do is not enough

 An article in the "Daily Dot" insists that Wikimedia has plenty of money.  This is based on the growth of Wikimedia budgets and yes, it has grown substantially over time. Particularly the English Wikipedia provides a lot of content and serves some 50% of the Wikimedia traffic. 

When people analyse its content, it becomes problematic. Even though its content is referenced, many of the references are old and could do with new insights that science brings on a regular basis. The content is male oriented and thanks to projects like "Women in Red" it has improved substantially but not enough. 

We know all mayors of Denver and we do not know National government ministers of African countries. Lists are to be maintained on EVERY Wikipedia, English consensus insists, and they are not properly maintained as a result. Not even on the English Wikipedia.

Money buys you things. When you donate to the WMF, you gain a sense of ownership. That is important; we may not need more money but we do need a sense of ownership in India, Colombia, Nigeria and Guinea. When the other 50% of Wikimedia traffic takes ownership away from those who had enough, we find topics with more real world relevance. Commons becomes usable in the other 299 languages and we seek out these 299 communities to make it work for them.

Given that we don't do enough for 300 languages, given that we can do much better, I will argue that Wikimedia needs more support, even money.

Thanks, GerardM

Saturday, May 22, 2021

Listeria, motivation and the Sharif of Mecca

Awn ar-Rafiq was the Sharif of Mecca from 1882 to 1905. Doesn't he look splendid in his regalia?

The Sharif of Mecca was a high placed functionary of the Ottoman Empire for many years and as such there is a Listeria list for the Sharif of Mecca. The point of these lists is that the information we maintain about the Ottoman Empire is not of a high quality. The Ottoman Empire existed longer than the Roman Empire, covered a larger expanse and it is not yet one hundred years ago that it came to an end.

I looked into lists of functionaries on several Wikipedias and they are not consistent. So I created Listeria lists and as people add things to Wikidata, these lists are updated; that is the theory and I am happy that this functionality has been restored.

The best bit is when other people take an interest as well. A new functionary is added; ولي الدين رفعت باشا and he will get his place in the list .. Google translates it into "Wali al-Din Rifat Pasha" and hey we can bring things to the next level.

Lists like these should be stable and they do entice cooperation. It is why I am so grateful that Magnus spent some time reviving the functionality.

Thanks, GerardM


Thursday, May 13, 2021

Ponderings on a book: Flammable Australia The Fire Regimes and Biodiversity of a Continent

Books are often used as a reference in a scientific publication. As I often add "citations" to individual papers, I find that books are a headache. Wikidata only "knows" about this book through two book reviews. When you google the book there are two versions of the book. Open Library "knows" about both versions but has no readable version. What I need is a reference to only one chapter of the book.

The first thing I did was change the two review items into proper reviews, add the book and link the reviews to the book. Open Library knows about two versions of the book. I linked both versions to the same author and linked the 2001 version to Wikidata. Finally I added the chapter in a haphazard way to the paper I am working on.

It is unlikely that I will ever read the book and it is very likely that others will frown on the way I added the book to Wikidata. However, for me "level 0 of data quality" applies; with data available particularly linked data, it is much easier to find fault and improve on what is there. I know that a bot will format the ISBN-10 entry I included.

The one reason why I add books is because books linked to OpenLibrary may be read by a wider public. Obviously not all books are available at this time, but the cumulative effect of adding all the books, all these links enriches the ability to read, to share in the sum of available knowledge.

Thanks, GerardM

Sunday, May 09, 2021

The @Wikimedia's #endowment aim is to reduce risk.

When you assess risk, when the purpose of the Wikimedia's endowment is to ensure the future operation of Wikipedia and probably other projects like Commons as well, money is not the only risk to consider.

The current government of the USA operates in a way that poses little or no threat to the continued operation of the WMF. However, the USA has a two-party system, and the same cannot be said if the Republican Party returns to power.

All the servers that run the Wikimedia projects are currently in the United States. With servers able to run the full stack of Wikimedia projects elsewhere, two objectives are served. 

  • up to date data becomes closer to the readers close to the new servers
  • the risk of a USA that turns on the Wikimedia Foundation is mitigated
The WMF does not need to use the endowment to make this happen. Given its current finances, it should be an operational decision.

Thanks, GerardM

Sunday, May 02, 2021

Wikipedia is a "Work in Progress"

In many ways Wikipedia reflects the world as both are ever evolving. Once an article is finished for the moment, it in effect becomes a time capsule comprised of text, images and references. All these sleeping beauties wait for an update that comes once the realisation sets in that the article is behind the times. With 6,290,486 articles, it is obvious that there is a bell curve of articles ranging from "up to date" to "out of date".

So far the "Watch list" has been reactive. There are a few areas that can indicate that an article needs attention because it may be out of date. Wikipedia articles have references; when a reference is a "scholarly article", it is normal for that article to be cited by later articles. These publications can strengthen or weaken the assertions made in a Wikipedia article.

In Wikidata there is a bot that continuously updates "scholarly articles" with its citations. When one of these articles is used as a reference in Wikipedia, it merits attention. This can be reflected both on a Watch list and on the article itself.

Having the latest literature available will help settle disputes among editors as well.

Thanks, GerardM

Tuesday, April 27, 2021

How to find pictures of a თახვი (it means beaver) or a Wisent? Please help!!

Wikimedia Commons is the biggest repository of freely licensed media files. It serves a global multi lingual community editing Wikipedia and it has the ambition to serve students as well. When you live in Georgia and you don't speak English, you search for a "თახვი" and this picture is all that you find.

There is a tool for that. Hay's sdsearch does a good job when its interface is localised. Sadly, no such luck for Georgian. Wikimedia produced a tool as well; Special:MediaSearch does a good job. You have to know that the tool exists and when you do, you find that for Dutch it does not work at all.

It is suggested that you can change the preferences for the best result. I fail at getting the results that Special:MediaSearch used to provide. There is no documentation. Please help!!

Thanks, GerardM

Sunday, April 25, 2021

Scholarly articles in @Wikidata and its link with #Fatcat

 "Dam-site selection by beavers in an eastern Oregon basin" is a scholarly paper. It has a DOI that is not functional, it is known to Researchgate and consequently we can find a PDF of the paper. 

This paper is cited by a paper I have an interest in. I am adding its references into Wikidata (this is its Scholia) and I added the "Dam-site selection" paper to make a complete reference.  The result is something like this.

The PDF of the paper is available for download and, as is to be expected, the Wayback Machine of the Internet Archive has a copy as well for the URL of the download page. And then there is Internet Archive's Fatcat:

"Fatcat is a versioned, publicly-editable catalog of research publications: journal articles, conference proceedings, pre-prints, blog posts, and so forth. The goal is to improve the state of preservation and access to these works by providing a manifest of full-text content versions and locations."

Wikidata is used as a source of Fatcat and as I include an additional paper, it will at some stage be picked up by Fatcat. If there is one thing to wish for, it is a function where entering a Wikidata Qid will trigger Fatcat to update its data based on the Wikidata info. If I can have two more wishes, one would be an icon for Fatcat. The second would be a Fatcat identifier making it easy to link from Wikidata to Fatcat.

Thanks, GerardM