Monday, July 30, 2018

#Wikidata - #Skills and tools needed to add #awards

Adding awards to Wikidata is one way to signal notable people who could do with some tender loving care, maybe even an article. Typically it starts with someone who received an award. This time, three fine ladies: Kay Davies, Alice Rogers and Sarah Cleaveland.

Mrs Davies received the "Croonian Medal and Lecture", an award conferred by the Royal Society. The award was already known as the "Croonian Lecture" and the Wikipedia article contains a long list of recipients. The website included a few recipients and consequently the award was positively identified.

Thanks to Reasonator, I easily navigated from Mrs Davies to the award. I noticed how few recipients were known and, using text from the article in combination with the Awarder tool, I added 250+ recipients within ten minutes. Other awards like the Harveian Oration are still missing in Wikidata.

Mrs Rogers received the "Kavli Education Medal", which has only four recipients so far. The name Kavli in combination with awards proved a bit ambiguous; finding the correct medal was the one challenge. One recipient was missing, a Mrs Margaret Brown; it was easy enough to add her as well. The mother of Mrs Rogers was said to be a very accomplished mathematician of Bletchley Park fame but details are lacking.

Sarah Cleaveland received many awards; of interest are two because they are not linked. She was the first woman to win the Trevor Blackburn Award in 2008. The other, the Leeuwenhoek Medal and Lecture in 2018, got my attention. It did not have many recipients and again, the Awarder tool made it easy to add many missing recipients. There are several red links, I did not add them this time.

When I am to add the Trevor Blackburn Award, I first have to find it. It is mentioned on several Wikipedia articles but the award and Mr Blackburn are missing. Google helps me find the website of the award. With Mrs Cleaveland there are 13 recipients. The first thing to do is add the award, link the organisation that conferred it and the web address for the award.
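As a rough sketch of what those first steps amount to in data terms, the statements can be written down as plain (subject, property, value) triples. This is only a model of the data, not the Awarder tool or any Wikidata API. The Q-ids and the web address are placeholders; the property numbers are the ones I assume apply: P1027 ("conferred by"), P856 ("official website") and P166 ("award received").

```python
# Wikidata property ids (assumed to be the relevant ones for awards).
P_CONFERRED_BY = "P1027"
P_OFFICIAL_WEBSITE = "P856"
P_AWARD_RECEIVED = "P166"


def award_statements(award, conferrer, website):
    """Statements that describe the award item itself."""
    return [
        (award, P_CONFERRED_BY, conferrer),
        (award, P_OFFICIAL_WEBSITE, website),
    ]


def recipient_statements(award, recipients):
    """One 'award received' statement per recipient."""
    return [(person, P_AWARD_RECEIVED, award) for person in recipients]


# Placeholder identifiers, not real Wikidata items.
AWARD = "Q_AWARD"
ORGANISATION = "Q_ORGANISATION"
RECIPIENTS = ["Q_RECIPIENT_1", "Q_RECIPIENT_2"]

statements = award_statements(AWARD, ORGANISATION, "https://example.org/award")
statements += recipient_statements(AWARD, RECIPIENTS)

for subject, prop, value in statements:
    print(subject, prop, value)
```

The award gets its own statements first; every recipient then only needs a single statement pointing at it.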

When you then start looking for recipients, Reasonator immediately provides an updated view. There is no need to query for its recipients; they are obvious. Just to show that you can, I added another fine lady: Mrs Karen Reed.

Sunday, July 29, 2018

#AfricaGap - #Nigeria, politics as a family business

A Wikipedia friend asked on Facebook for people to neutralise an all too promotional article on a Nigerian senator, Ademola Adeleke. I have updated his Wikidata profile based on the information in the article.

The father of Mr Adeleke, Raji Ayoola Adeleke, and his brother Isiaka both preceded him as senator.

When you google Mr Adeleke, the first thing you find is his Wikipedia article, then you cannot miss that the University of Jacksonville denied that Mr Adeleke finished his education. Consequently it is a stretch to call him Dr Adeleke.

When Wikipedia and by inference Wikidata register the education of humans, it follows that such easy scams will be more prominently displayed and known. When an article makes plain that Mr Adeleke was born with a silver spoon, it becomes easy to question his ability to represent people adequately.

Saturday, July 28, 2018

#Wikipedia, where is all that research?

Pine, a well known Wikipedian, drew attention to the registration for the 2018 "State of Wikimedia Research". Benjamin Mako Hill mentioned that a humongous number of publications about Wikipedia were published in the last year alone.

That is great.

I checked the numbers using the Scholia tool and was a bit disappointed. The total number was "only" 337 for all years together. Benjamin uses different tools; he mentioned his use of Google Scholar and indeed it shows so much more.

I was really pleased with Daniel Mietchen helping out on the subject of "probiotics" and I asked him if he could run his bot for the subjects of "Wikipedia" and "Wikidata". But never mind what he decides to do, running a bot to add key words to research is not scalable when you consider the overwhelming amount of research known to Wikidata. It is not only a matter of running it one time, it is also a matter of adding key words for any and all new research entered into Wikidata.

Given that we work in a Wiki way, this is totally acceptable. We do what we can, what takes our fancy, and slowly but surely new approaches and tools improve on the quality and quantity of the data that we have. If Scholia were a commercial enterprise it would be different; the exposure and use of data would be a primary concern.

Friday, July 27, 2018

#Wikidata - I do not use query and here is why

When I edit Wikidata, I never use queries and here is why. I do not need them. For instance, I added an award to a person because it was obvious it was missing. I had no need for a query because everything that I wanted to know about the award was visible.

When you use query, you have to use a tool, define a query, run it, maybe tune it and then analyse the results. Using my beloved Reasonator, all the queries that I need are included. This is the same award and the same person but in the standard user interface of Wikidata. It is not informative, I only use it to edit.

A person wanting to teach Wikidata asked how to structure a programme. The first thing proposed was to teach people to query. I agree that query is important and that it has its use cases, but it should not be the first introduction to Wikidata because it makes things too complicated at the start and, even worse, it is not necessary.
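For comparison, this is roughly what the query route looks like for the simple question "who received this award?". It is a minimal sketch that only builds the SPARQL text you would paste into the Wikidata Query Service; it does not run anything. I assume P166 ("award received") is the property involved, and the item id in the example is a placeholder, not the real id of any award.

```python
import re


def recipients_query(award_qid: str, language: str = "en") -> str:
    """Build a SPARQL query listing everyone with 'award received' = award_qid."""
    # Refuse anything that does not look like a Wikidata item id.
    if not re.fullmatch(r"Q[1-9]\d*", award_qid):
        raise ValueError(f"not a Wikidata item id: {award_qid!r}")
    return (
        "SELECT ?person ?personLabel WHERE {\n"
        f"  ?person wdt:P166 wd:{award_qid} .\n"
        "  SERVICE wikibase:label { "
        f'bd:serviceParam wikibase:language "{language}". '
        "}\n"
        "}"
    )


# Q123 is a placeholder id used for illustration only.
print(recipients_query("Q123"))
```

Even for this one question a newcomer has to know a property number, an item number and the query syntax before seeing a single result.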

Thursday, July 26, 2018

#Wikidata - Pushback on probiotics with citations

Recently I added a paper to Wikidata. The paper indicated its subjects; mania, the immune system and probiotics are among them. I dutifully added some of these subjects to the item for the paper and was surprised that a topic as controversial as probiotics was not linked to many, many papers. When you check out #probiotics on Twitter, you will realise that a healthy mix of fact is much needed to counter the inundation of commercial offerings you will find.

I mentioned on Twitter my surprise that there was so little to find on Wikidata about this subject. Daniel Mietchen picked this up and had a bot run that added probiotics as a topic to papers on Wikidata. The result is wonderful.

It is almost too good. We now run the risk of not seeing the forest for the trees. When you are looking for sources to cite, you want to narrow down on sources that were checked by Cochrane, and you may want to find or dismiss the papers mentioned by Retraction Watch.

The best part: this is an embarrassment of riches. With bots running and updating the topics mentioned on papers, our collection of papers gains relevance and authors are linked, giving a clue as to who might be notable enough to get a Wikipedia article. As we gain more and more data with better links to indicators of the quality of papers, we gain terrain in the battle against false facts.

Sunday, July 22, 2018

#AfricaGap - Sean Jacobs at #Wikimania

To be blunt, what Mr Jacobs is talking about is one or more steps removed from the Wikimedia reality. His story is important and indicates that a specific type of source exists and is available for study. Mr Jacobs informs on the importance of Twitter for the Zulu language.

Mr Jacobs is an academic and the reality of Zulu Wikipedia is that only a few days ago we celebrated article number 1000 for the Zulu Wikipedia. What the Zulu Wikipedia needs is high school students writing in Zulu. Writing about what is important to them, what is important to their curriculum and to their world.

Just consider what one high school could do. Now consider what ten high schools could do. Compare that with one academic or what all the Zulu students currently in university could do.

Yes, history has been written so far and it does report in a biased way. When the Zulu language is to gain a foothold in the Wikimedia world, we need many people being involved in writing first the most basic information. Once there is a basis, the sources Mr Jacobs mentions become relevant in a Zulu Wikipedia.

Saturday, July 21, 2018

#AfricaGap - A #Wikidata based watch list about an Africa reality II

When there are many Listeria lists that you follow, when you care about the development of the subject, it is wonderful to see so much activity related to Africa. As more people care to work on African politicians or "administrative territorial entities", the Listeria lists that also exist on Wikipedias in African languages will be updated as well.

When the Listeria lists become part of the main body of a Wikipedia, the politicians and entities will be found. When the info boxes as presented at the Celtic knot conference follow, slowly but surely quality content in quantity about Africa will no longer be a mirage.

Wednesday, July 18, 2018

#AfricaGap - Guinea; standing on the shoulders of giants

This map comes courtesy of the UN to Commons. It was downloaded in 2007 by Jeroen, the language on the map is French and Wikidata has much of its data in English. The names in French are mostly the same but that is for someone else to consider.

Many of the articles on "administrative territorial entities" are written by a small group of people. I want to single out Shevon Silva, the user page expresses the amount of work that went into adding stubs for so many African territories. The important thing about data is; once it is there you can change it in any way necessary.

When data gets entered into Wikidata, certain Wikipedia things are not possible; a "human settlement" is not an "administrative territorial entity". Such conflations need to be undone in Wikidata. Obviously the human settlement is located only in that administrative territorial entity and in others only by inference. Attributes like "inception date" and links to other human settlements that are part of a sub-prefecture are for someone else to add or get right. Another consideration is historic administrative territorial entities, particularly those of historic countries.

At this time it is important to celebrate what we have, morph it into a format that can be used on any and all of our projects. Once it is available in all the Wikipedias, it will generate more and more links and this will put Africa on the map.

Saturday, July 14, 2018

#AfricaGap - Where Wikipedias collide

The German and the English Wikipedia collide on the "administrative territorial entities" of the Gambia. I was told to remove entries that I made to Wikidata because they were "Falschinformationen". The German article is much better written but the English article indicates that the German information is likely to be outdated.

A discrepancy like this is obviously not best solved by insisting on "your" solution. The point that I have been making quite often is that such differences are commonplace and require proper sourcing. The obvious source will not be found on a university website; it will be found in governmental information of the Gambia.

Making information about Africa available in Wikidata makes the errors, the inconsistencies and the lack of data in the Wikipedias more visible. This is not solved by considering your "own" data to be best; it is solved by proving that information is up to date. According to the English Wikipedia, the Upper River Division is no longer; it is largely replaced by the Basse Local Government Area.

My question: what does it take for the Wikipedias to take their inconsistencies seriously?

#AfricaGap - Support for "minority" languages

Support for "minority" languages was the subject of the Celtic knot conference. I have watched some of the presentations and find that there is a lot more to supporting minority languages from a Wikidata point of view than just adding missing labels. A vital strength of any Wikipedia is found in its relations between articles and that subjects of interest may be found.

"Minority languages" is a misnomer; what we mean is that the Wikipedias are small. They have a lack of articles, structure is missing and subjects of interest are not found. Subjects have the same relations in any language and consequently lists expressing these relations can be shared using Wikidata in any language, including "minority" languages. Missing labels need not be an issue; this is expressed nicely in this list of subdivisions of Egypt, where the labels for most of them are only available in the Arabic script. A nice invite for people to add labels in the Latin and any other script.

The Welsh Wikipedia makes use of "Listeria" lists in its main space and as a consequence, all items in these lists can be found. They are available in a context, associated information may be available and they link to articles in other languages. The Welsh Wikipedia did implement the "Article Placeholder" and in this way they provide even more information for the specific subjects.

When you consider Africa and information about Africa, there is no Wikipedia that provides adequate information. The data is incomplete, unstructured and often out of date. It is easy enough to improve on the quality of the data in Wikidata and when the information is updated in many Listeria lists on many Wikipedias, the impact is great.

The lack of coverage of subjects about Africa is huge. Less than 1% of the humans we know are from Africa, and we do not have up to date information about "administrative territorial entities" like provinces and districts. In my AfricaGap project only a limited range of subjects gets some attention at this time. Obviously there is more that could be done. African cinema is one subject that is of interest to a group of Wikimedians. When they write their articles it will eventually translate to Wikidata and information about movies, actors and directors may be shown in Listeria lists in all the African language Wikipedias. This may generate interest from an African public for our projects.

There is only one purpose for Wikidata, Wikipedia and it is to find a public, a use case for the data, the articles, the information we provide. The one challenge we face is in both the quantity and quality of our articles and data.

Thursday, July 12, 2018

#AfricaGap - Considerations on the "Article Placeholder"

Having listened to a YouTube presentation on the Article Placeholder, I am seriously disappointed. There are a few statements in there that show a lack of understanding of the functionality of Reasonator. It is dismissed for all the wrong reasons and as a result there are a lot of missed opportunities.

What is missed is that Reasonator, as it is, provides superior representation in any language. It is a tool that helps with missing labels from within the tool. Missing descriptions do not need to be a problem in Reasonator; there is automated functionality that has shown its merits in many languages. Do compare the representations of Wikidata data and the structured representation will be seen to be richer, with the inclusion of maps, images and data linked to the subject in question.

What is particularly galling is that Reasonator is dismissed because "it is an external tool". Before work on the Article Placeholder started, it would have been easy enough to adopt functionality as provided by this external tool and it would not have been an external tool, an obvious argument AFTER the fact.

Where Reasonator provides texts, it is done based on little scripts. This is seen as problematic as it is seen as a drain on the community. Templates on the other hand may be a part of the Article Placeholder and they have the same problem.

For me the bottom line is not so much about the Article Placeholder but the lack of usability of Wikidata. It is only because of Reasonator that it is easy and obvious to work on the subjects I work on. I have not spent hours learning how to query; Reasonator provides me instantly with the results in any context, like the missing "Districts of Djibouti".

Monday, July 09, 2018

#AfricaGap - the Subprefectures of the Central African Republic

Even the best query is impotent when the data is not there. There were no known subprefectures of the Central African Republic when I started looking for them.

Best practice has it that any "human settlement" is located in the lowest administrative territorial entity available. It follows that the city of Baoro is in the Baoro subprefecture, which in turn is in the Nana-Mambéré Prefecture. This is nominally a Wikipedia best practice and a Wikidata best practice.
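The idea behind this practice can be sketched in a few lines: each place records only the lowest entity it is located in, and everything higher up follows by walking the chain. The little table below is hand-written from the Baoro example and is not pulled from Wikidata; on Wikidata the link itself would be the property I assume is P131 ("located in the administrative territorial entity").

```python
# Each entry points at the lowest entity that contains it (hand-written sample data).
LOCATED_IN = {
    "Baoro": "Baoro subprefecture",
    "Baoro subprefecture": "Nana-Mambéré Prefecture",
}


def administrative_chain(place, located_in):
    """Return the containing entities of a place, from lowest to highest."""
    chain = []
    seen = {place}
    current = place
    while current in located_in:
        current = located_in[current]
        if current in seen:
            # Guard against data errors such as A in B and B in A.
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        chain.append(current)
    return chain


print(administrative_chain("Baoro", LOCATED_IN))
```

Only one statement per place is needed; the prefecture never has to be stated on the city itself.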

When a Wikipedia article indicates a "human settlement" category for a subprefecture, we get it wrong in Wikidata. When we change this in Wikidata, it is still problematic when many articles consider the town and the administrative entity to be the same thing. Then again, this is Africa and who notices?

When there are multiple items by the same name and one is about the city and the other is not, it is just a matter of making one a subprefecture. For the Central African Republic, this is rather straightforward and it just takes a lot of work to get some structure in the data. At the same time there are many articles in the wrong basket. That problem is for another day.

Fixing the data for the CAR is doable. It takes someone with infinite time on his hands to fix the administrative entities for Angola. Most of the data is wrong and entities by the same name and type often exist multiple times. The queries will show anyone brave enough to work on it.
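To give an impression of the kind of check involved, here is a minimal sketch that groups items by name and type and reports every combination that occurs more than once. The sample rows are invented for illustration, with placeholder ids and names; they are not real data about Angola.

```python
from collections import defaultdict


def find_duplicates(items):
    """Group (item id, label, type) rows and keep the groups with more than one item."""
    groups = defaultdict(list)
    for item_id, label, kind in items:
        # Compare labels case-insensitively so "Example" and "example" match.
        groups[(label.casefold(), kind)].append(item_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented sample rows: (placeholder id, label, type).
SAMPLE = [
    ("Q_1", "Exampleville", "municipality"),
    ("Q_2", "Exampleville", "municipality"),
    ("Q_3", "Exampleville", "human settlement"),
    ("Q_4", "Otherplace", "municipality"),
]

for (label, kind), ids in find_duplicates(SAMPLE).items():
    print(label, kind, ids)
```

Items that share a name but differ in type, like a town and the entity named after it, are deliberately not reported; those are legitimate pairs.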

Sunday, July 08, 2018

#AfricaGap - A #Wikidata based watch list about an Africa reality

Wikimania 2018 will be in Cape Town and a lot of words will be used to express the importance of adequate coverage of everything Africa. Words do not express the extent to which Africa is lacking in coverage. My estimate is that less than 1% of all humans known to Wikidata (i.e. all humans in all Wikipedias) are African. We cannot properly say where someone was born or died because we do not know all the places of Africa, we do not know its administrative divisions and we do not know its politicians. We have not properly structured the former countries and colonies of Africa.

We do not really know about Africa.

When one guy from the Netherlands can make a noticeable difference, it is obvious what two, three or one hundred people who care about Africa can do. In the Listeria lists on several of my user pages, you find what are in effect watch lists about Africa. Every day I notice what changed about several aspects of Africa; regularly I add lists and become more aware of how limited our coverage of Africa is.

It is 13 days to Wikimania and you can make a difference by working on the subjects I follow. You can add information, you can add even more Listeria lists. The biggest difference will be made by relating all the loose ends and by curating and refactoring what is wrong.

What is the point of a Wikimania in Africa when our coverage is at this level? Obviously, a call to make up for what we have not done so far.