Sunday, August 12, 2018

#Knowledge - three types of knowledge and why "academic" is only one and overrated

There are three types of knowledge: academic, professional, and knowledge from experience. The scheme to the right was published by Jaap van der Stel. He works in the field of psychiatry and is known for his work on addiction, in combination with the use of peers in recovery from addiction.

In the Wikimedia world we insist on the primacy of academic knowledge, and up to a point it serves us well. Operationally it means that most studies are done outside the WMF; whatever they point out, they hardly ever make an operational difference. When internal WMF researchers study a subject, they are typically directed to study particular phenomena, and their work may point to operational issues, issues that are then either addressed by the WMF itself or adopted by the community.

When scientists compile all the sources in all the Wikipedias, it is academic work when the result is static. It may indicate what sources are used multiple times, but it does not help any editor weed out sources that are biased or false. Magnus started work on a tool that knows about all the sources in two Wikipedias and Wikispecies. It is updated in real time, and that gives it valid operational credentials.

I know from experience that there are issues with source information as we have it in Wikidata. We cannot invalidate sources by reference, we are only strong in the biomedical field, and adding new information is not at all user friendly.

Now this user experience does not get much priority, for valid operational reasons, but the effect is that Wikidata is only useful for geeks. Its lack of usability prevents its data from being used on Wikipedias in the "other" languages, where there is little or no academic or operational interest.

Saturday, August 11, 2018

#GenderGap - The Ginetta Sagan Award (and others)

The Ginetta Sagan Award is conferred by Amnesty International USA. It is an annual award; according to Wikidata, when I looked, the last recipient received it in 2014. The English Wikipedia covers the award as part of the article on Ginetta Sagan and has information up to 2017. When you read the texts, you will find how notable these people are and, by inference, the people without an article.

Arguably, there is a lack of balance between the number of men and the number of women having an article in any Wikipedia. This is known as the "gender gap" and the "women in red" project works to great effect to improve that balance. There is no lack of fine notable ladies who have no article.

I am really happy to present two queries. The first shows women who won an award and have no article at all (2502 results). The second shows women who won an award and have no article on the English Wikipedia (29083 results).
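The queries themselves live on the Wikidata Query Service. As a rough sketch, assuming the usual identifiers (P21 "sex or gender", Q6581072 "female", P166 "award received"), the second query could look something like the text below, here wrapped in Python so the sitelink filter can be varied per wiki. The identifiers are from memory and the real queries behind the post may differ.

```python
# Sketch of a Wikidata SPARQL query in the spirit of the second query:
# women (P21 = Q6581072) who received an award (P166) but have no
# article on a given Wikipedia. Not the exact query from the post.
def award_winners_without_article(site: str = "https://en.wikipedia.org/") -> str:
    """Build the SPARQL text; paste it into query.wikidata.org to run it."""
    return f"""
SELECT ?woman ?womanLabel WHERE {{
  ?woman wdt:P21 wd:Q6581072 ;   # sex or gender: female
         wdt:P166 ?award .      # award received
  FILTER NOT EXISTS {{
    ?article schema:about ?woman ;
             schema:isPartOf <{site}> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""

print(award_winners_without_article())
```

Swapping the `site` argument for another Wikipedia's URL turns it into the same query for any other language edition.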

Let these women be an inspiration to you.

Monday, August 06, 2018

#MADinAmerica - cause and effect

MAD in America is an organisation about mental health, particularly in America. Their take is that there is a lot that can be improved. The part I am most interested in is that they highlight research that tells you how the science behind many mental health practices fails scrutiny.

One publication they recently highlighted is about brain abnormalities in people with schizophrenia. Current wisdom has it that "cortical thickness and surface area abnormalities" are indicative of schizophrenia. This paper compares people with schizophrenia who were medicated with people who were not, and the research shows that these differences are due to the medication.

Adding a paper like this to Wikidata is easy. Making it stand out for its results is not. The paper probably indicates previous research that it debunks, but how do you model that? And when papers like this are to be used as sources, how do you ensure that they are even considered?
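One way to think about the modelling question is a "debunks" relation between papers, with checks built on top of it. The sketch below is a toy in Python, not actual Wikidata practice; "debunks" is a hypothetical relation, not an existing property, and the QIDs are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Paper:
    qid: str          # Wikidata item identifier (placeholders below)
    title: str
    debunks: list[str] = field(default_factory=list)  # QIDs of refuted papers

# Placeholder items standing in for the schizophrenia literature
papers = {
    "Q1": Paper("Q1", "Earlier study linking brain differences to schizophrenia"),
    "Q2": Paper("Q2", "Medicated versus unmedicated comparison", debunks=["Q1"]),
}

def is_contested(qid: str) -> bool:
    """A source is contested when any known paper claims to debunk it."""
    return any(qid in p.debunks for p in papers.values())

print(is_contested("Q1"))  # the earlier study would be flagged
```

With a relation like this in place, a tool could warn an editor the moment they try to cite a paper that later work claims to have refuted.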

NB: the first author is employed by the University of California, Irvine.

Sunday, August 05, 2018

#Citations - "Verlorene Siege" and #Wikipedia

The publication The Battle for Wikipedia: The New Age of ‘Lost Victories’? writes about debunked knowledge being used as a source in Wikipedia. Lost Victories is a book by Erich von Manstein, a German military officer who served as a witness at the Nuremberg trials, where there is strong evidence that he perjured himself, and who was later convicted of war crimes and sentenced to eighteen years in prison.

This publication is not only of academic interest. In this day and age, where fake facts and fake science are pervasive, it is a reminder that Wikipedia is a battleground where debunked sources are used to prove a non-neutral point of view.

One of the objectives of the Wikimedia Foundation is to combat fake facts and to make citations operational as a tool. The main thrust will be adding sources to Wikidata. "Verlorene Siege" was obviously present, but even though there is a large body of work debunking this book, there was nothing that referred to either Mr von Manstein or his book in a critical way.

It was easy enough to add a few individual sources, but it takes time. For the analysis of sources used in Wikipedia there are dumps containing all the citations of all Wikipedias, and Magnus has now started on a tool that initially includes real-time sources for the German and English Wikipedias and Wikispecies. Of these publications 36% are linked to Wikidata; this provides a great start, but it will take more. We need to know what papers debunked what knowledge. We need to know what papers Retraction Watch is critical of, or what the relevance of a paper is according to the Cochrane Database of Systematic Reviews, because that is how their facts are operationalised. We need to know, because that is one way to debunk fake facts.

Saturday, August 04, 2018

#Wikidata - User versus bot updates and #Scholia

These are the aggregated subjects associated with all the papers of the winners of the Fields Medal. Given that there are some 60 winners of the most prestigious award in the field of mathematics, this is not a representative reflection. That is not a problem; that is an opportunity.

I added one paper, "Singularities of linear systems and boundedness of Fano varieties". Given the title, I added "Fano variety" and "linear system" as subjects. This made no difference in the Scholia tool, and after some five minutes I asked what was happening. I was told that it takes a large interval before the data in the Toolserver get updated.

Typically, information about papers is added by bot; not so much for mathematics, but still. Mr Birkar, for instance, has only two papers in Wikidata at this time, and for the other paper no subjects are given. When you add data by hand, instant gratification, instant visibility, is important because it is a potent motivator.

The best reflection of work done in Wikidata is not given by Wikidata itself; it is given either by tools like Scholia or Reasonator or by query. Queries do give instant gratification, and that is much of their potency.

Tools have one important benefit over queries: they provide a standard layout for the information. Queries are potent, and many people contributing content to Wikidata use them in tools like PetScan. But in reality, the typical difference between one query and the next is only in the qualifiers.
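That observation suggests treating such queries as one template with the varying part as a parameter. A minimal sketch, where the property/value pairs are illustrative assumptions (P21/Q6581072 is the usual pair for "female"; the award value is a placeholder, not a real item):

```python
# Many maintenance queries share one skeleton and differ only in the
# property/value pair they filter on; the pairs below are illustrative.
TEMPLATE = """
SELECT ?item ?itemLabel WHERE {{
  ?item wdt:{prop} wd:{value} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
"""

queries = {
    "women": TEMPLATE.format(prop="P21", value="Q6581072"),
    "award winners": TEMPLATE.format(prop="P166", value="Q00000"),  # placeholder QID
}

for name, query in queries.items():
    print(name, "->", len(query), "characters of SPARQL")
```

A tool that wraps such a template only has to ask the user for the varying pair, which is exactly the standard-layout benefit described above.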

At this time the best user experience is given by tools. They often suffer from a time lag; this is of little relevance to bots, but for humans it is different.