Friday, April 28, 2017

#Wikidata user story - The Golden Brain #Award

I have added people who won the Grawemeyer Award. This award has many categories and I concentrate on the category "psychology". The people in this category get extra attention; all the information from categories and awards is included as well.

Often people turn out to be connected through multiple awards like Mrs Anne Treisman and Mrs Leslie Ungerleider. They both received the Golden Brain Award as well.

When you read the article on the Golden Brain Award, the winners are all in a nice table. For each of them there is either a blue or a red link and for some there is only a string of text.
2015 | Okihide Hikosaka | National Eye Institute | US
Adding the missing people in Wikidata is not hard, just some additional work. For Mr Okihide Hikosaka there is enough information to add an item to Wikidata. When this is done for all the award winners, it is possible to create a list with the same information in any Wikipedia. 

By adding all this information, people who are into what I concentrate on are better connected. They have more near links to data that links to other people who are relevant in the field of psychology and my hope is that this will trigger people to give attention to missing articles and information.


Thursday, April 27, 2017

#OCLC, #VIAF and #WorldCat - I love them and, they could be even better for me

Jimmy Wales is doing his thing for proper news and it is welcome news. When it pans out it will work but people have to read what WikiTribune will bring. As always it takes education and access to information. Libraries have always been the bedrock of available information to people and the OCLC is what connects the world's libraries. So it is important for it to be as good as it can be when it is to bring more people to libraries and to reading.

The OCLC brings two programs that are important to me as a person and as a Wikimedian. They are VIAF and WorldCat. VIAF is the "Virtual International Authority File"; it is a system that brings together the information the world's libraries have in their system and aims to connect them. VIAF is largely maintained by software but there are processes to fix issues that do occur. Wikidata is connected to VIAF because it is the link to information about authors that exists in Wikipedia and Wikisource in many languages. There are bots that do find VIAF identifiers thanks to identifiers known at Wikidata and once a month Wikidata identifiers are updated in the VIAF registry. Using VIAF on its own, you will find for instance a Uilyam Şekspir.

WorldCat is where it becomes interesting to readers. For me its information is available in Dutch. Echoing a blogpost on the OCLC blog we can bring more joy to the library's website and for people who come to the library world from a Wikipedia there are opportunities. I have a profile at WorldCat; it knows my library because I entered it as one of my favourites. WorldCat assumes that it knows my location and suggests a library that is not near to me and that is not useful. By picking up on the cookie information it would not need to guess my location and could offer an easy link to my library. This will help me. What WorldCat could do is ask people if the suggested library is indeed their library.

The blogpost mentioned earlier talks about web analytics. I would absolutely love to know how many Wikipedia readers get to VIAF or WorldCat. It would be wonderful to know if we get readers connected to their libraries. When we do, the effect of improvements will show and that will motivate Wikimedians even more to get people their facts, and have us share in the sum of all knowledge.

Monday, April 24, 2017

#OSM - Districts of #Kerala

A Wikipedian asked me to blog about this map. The map is shown from within the English Wikipedia. It works really well on my mobile (an iPhone). The next step: integrating multi-layered maps in articles, and on a mobile?

For some documentation..

Thursday, April 20, 2017

#Wikidata user stories - Suggesting Henry Putnam, a great #Librarian

As software suggests what articles to write, it is relevant to understand what logic it is based on. A phenomenon like the "six degrees of separation", made popular around Kevin Bacon, has its scientific counterpart in the graph theory notion of "betweenness centrality". This is used as a basis in the research into what articles are important and what automated suggestions to make.
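To make "betweenness centrality" a little more concrete, here is a minimal sketch in plain Python of Brandes' algorithm for an unweighted, undirected graph. It is not the code used in the research mentioned above; the small graph and the links in it are made up purely for illustration. A node scores high when many shortest paths between other nodes run through it.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for an unweighted, undirected graph.

    graph maps each node to the set of its neighbours.
    Returns a dict node -> betweenness, counting each unordered pair once.
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []                                   # nodes in order of discovery
        predecessors = {node: [] for node in graph}
        paths = {node: 0 for node in graph}          # shortest paths from source
        distance = {node: -1 for node in graph}
        paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if distance[neighbour] < 0:          # first time we see it
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    paths[neighbour] += paths[node]
                    predecessors[neighbour].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in predecessors[node]:
                share = paths[pred] / paths[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                centrality[node] += dependency[node]
    # Every unordered pair was counted once from each end.
    return {node: value / 2 for node, value in centrality.items()}


# Hypothetical graph: three people who are only connected through one hub.
example_graph = {
    "Putnam": {"person_a", "person_b", "person_c"},
    "person_a": {"Putnam"},
    "person_b": {"Putnam"},
    "person_c": {"Putnam"},
}

scores = betweenness_centrality(example_graph)
```

In this toy graph every shortest path between two of the other people runs through the hub, so the hub gets the highest score and the others get zero. Adding statements that people share adds links to such a graph and shifts these scores.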

Mr Putnam is one of the more relevant librarians. He developed an eponymous classification system, continued its development as the Librarian of Congress (it is still in use), was twice president of the American Library Association and was a knight of the order of the Polar Star. When weight is applied to references to a person, all this is of relevance in the right setting.

When an article is to be written or improved, it helps when it can be suggested what it is that can be improved. By including statements in Wikidata, suggestions can be made in the local language. Facts like date of birth and death are also easy and obvious.

So when people consider a particular subject to be of universal relevance, it helps when associated subjects are well developed in Wikidata, for instance when for all the presidents of the American Library Association many facts like where they studied, where they worked and what awards they received are included. When this is done for all the people who share categories, the betweenness of many influential librarians increases. This will have its influence on what is suggested for people to do.

Wednesday, April 19, 2017

#Wikidata user stories - the sum of all #knowledge

Map showing all places English Wikipedia covers

Map showing all places GeoNames covers

They say "a picture paints a thousand words". There is no argument; English Wikipedia covers only so much. With such a lack of coverage it is impossible to understand what is missing and its relevance particularly to people who do not read English.

LSJbot has created lots of articles for the places GeoNames knows about in several Wikipedias. As a consequence, much of the missing information enters Wikidata through the backdoor. There have been some rumblings among Wikidatans that the GeoNames data is not perfect. But hey, let's make "Be bold", a Wikipedia quality, a Wikidata quality as well.

For many Wikipedians, the notion of bot generated articles is an anathema. For others the fact that there is so much that we do not cover is as problematic. The good news is that more information in Wikidata will enable us to predict what is lacking in content. We only need to acknowledge that Wikipedia is not the sum of all knowledge.. yet.

#Wikidata user story - Suggestions to #Wikipedia editors

Exciting is the #research done on "suggestions to Wikipedia editors". There is a paper and a great presentation. The bottom line is that when you know what to suggest to people; when you make it personal, the result is what you would hope. Consider, 3.2 times the number of articles created and two times more articles created than without personalised recommendations.

There is math involved, obviously, but the gist is that when suggestions are in line with previous activities, people will be triggered to do more. When you listen to the presentation, this first experiment asks people to translate from English. The assumption is that English covers more than most.

The slides of the presentation include visualisations showing the coverage of several Wikipedias. When you consider them, it becomes clear where the Wikimedia projects are challenged.

Leila Zia, the presenter, makes it clear; all this would not be possible without Wikidata. One thing where Wikidata is different from the assumptions of the research is that there is an increasing number of subjects that have no links to Wiki(m/p)edia articles at all. Many of these are connected to existing content as they share common statements, statements like "profession: soccer player" or "award received: whatever award".

When totally new subjects are to be considered, there is already plenty that might be suggested in Wikidata itself.

Monday, April 17, 2017

#Wikidata user story - #DBpedia, #death and #Federation

Federation between DBpedia and Wikidata became possible. As a consequence, the results of a query that runs on DBpedia can be linked to Wikidata.

Some time ago people at DBpedia created a wonderful query that shows differences between DBpedia and the Dutch and Greek Wikipedia. It received approval from the Dutch Wikipedia community.

With federation something much more interesting became possible; a federated query comparing Wikidata with one DBpedia at a time. When the query runs, current data from Wikidata and DBpedia is presented. When a Wikipedia associated with DBpedia changes, DBpedia may import the differences from an RSS feed and consequently running the query again will show the latest differences.

Updating information about one particular type of statement like date of death, place of death or whatever, will always be based on the current differences. Experiencing the results in this way is truly motivating. Federation is an instrument that can help us improve the quality of either federated system.
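What such a comparison boils down to can be sketched in a few lines of Python. This is not the federated query itself; it assumes the dates of death have already been fetched from both systems into two dictionaries, and the identifiers and dates below are invented for the example.

```python
def death_date_differences(wikidata, dbpedia):
    """Return the items whose date of death differs between two sources.

    Both arguments map an item identifier to an ISO date string,
    or to None when the source has no date of death.
    """
    differences = {}
    for item in sorted(set(wikidata) | set(dbpedia)):
        ours = wikidata.get(item)
        theirs = dbpedia.get(item)
        if ours != theirs:
            differences[item] = (ours, theirs)
    return differences


# Made-up sample data; real input would come from the two query services.
wikidata_dates = {"item-1": "1990-01-01", "item-2": None, "item-3": "2001-01-01"}
dbpedia_dates = {"item-1": "1990-01-01", "item-2": "1950-03-04", "item-3": "2001-01-01"}

to_check = death_date_differences(wikidata_dates, dbpedia_dates)
```

Once a difference has been resolved in either system, it simply no longer shows up the next time the comparison runs, which is what makes working from current differences so motivating.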

#Wikidata user story - #Wikipedia #diversity and diversity #research

Diversity, especially the "gender gap" is one of the best researched subjects of Wikipedia. There are many projects that have it as their goal to diminish the gap they object to.

Wikidata has the best and most up to date information about any Wikipedia. People are updating Wikidata all the time, typically its information is based on a Wikipedia.

Take gender; many a Wikipedia has a category for this so it is easy to update Wikidata based on what is in such categories. When a researcher is interested in the articles where Wikidata does not have such information, articles will be found and it is appreciated when Wikidata is updated by them as part of these activities. As a rule, the percentage of "humans" with no known gender is dropping anyway.

When a Wikipedia editor has an interest in female scientists that do not have an article in English, it is easy enough to have a query for that. Not all female scientists with or without a Wikipedia article can be found this way but it is just a matter of adding them in Wikidata. When another editor is interested in female scientists with no article in German or Kannada, it is just one change in the same query.
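As an illustration of how small that "one change" is, here is a sketch in Python that builds such a query for the Wikidata Query Service. The function name is made up for this example and the query is only one possible way to ask the question; it matches people whose occupation is exactly "scientist" (Q901), so more specific occupations such as "chemist" would need a broader pattern.

```python
def missing_article_query(language_code, limit=100):
    """Build a SPARQL query for female scientists in Wikidata
    that have no article in the Wikipedia of the given language."""
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;          # instance of: human
          wdt:P21 wd:Q6581072 ;    # sex or gender: female
          wdt:P106 wd:Q901 .       # occupation: scientist
  FILTER NOT EXISTS {{
    ?article schema:about ?person ;
             schema:isPartOf <https://{language_code}.wikipedia.org/> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
LIMIT {limit}
""".strip()


english_query = missing_article_query("en")
kannada_query = missing_article_query("kn")
```

The two queries differ only in the address of the Wikipedia that is checked; everything else stays the same.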

#Wikidata user story - the #library

The OCLC is an organisation combining most of the libraries in the world. It used to connect to the English Wikipedia but as Wikidata connects all Wikipedias, the OCLC does a better job linking to Wikidata. Through Wikidata it can link to articles about authors in any language.

For many authors the connection between VIAF, the system used by the OCLC and Wikidata is still missing. Many people are adding VIAF identifiers and once a month the data is imported and all the new data pops up.

Best practice at English Wikipedia has it that an {{authority control}} template is added in the reference section of people. When a VIAF identifier is added in Wikidata not only a VIAF identifier but also Worldcat information is shown (the example is for William Keepers Maxwell Jr.). Doing this is possible for any Wikipedia.

Now to expand on this; when a reader opts in, we could show if a book of an author is available in the local library.. What do you think?

Why #Wikidata? Because it is useful!

Wikidata was useful from the start. It provides a service to all Wikipedias and after the startup, it now provides the same service to Commons and Wikisource. It connects information about the same subject; these are the interwiki links.

The next phase was to connect these subjects. This is an internal Wikidata project and it is not really used. This data could be useful but it is not always up to date and the requirements for the primary use cases are not realistic and almost impossible to fulfil. The challenge is to provide sourced information for every statement.

The challenge is: how do we provide a use for the Wikidata data? How do we get people to actually use Wikidata, have an interest in the data and maintain what is in their interest?

Software developers create "user stories" to explain what their software is to achieve. Why not write user stories that show how Wikidata can already be used and expand the stories on how to be even more useful and usable?

Sunday, April 16, 2017

#Wikipedia - The death of Lanier Meaders

Mr Meaders was a notable potter who died in February 1998 according to one source. The English Wikipedia article however is in two minds about his death. Yes he is dead but when did he die?

According to the category he was one of the living dead for 10 years. In the text the year of his demise is correctly stated as 1998. By googling for a source another date was found.

As I am not an English Wikipedian, I do not know how to indicate sources in English Wikipedia. The date of death in Wikidata does have a reference. The question is how differences like the dates of death of Mr Meaders can be found so that we improve the consistency in the information that we provide in all of our projects.

NB the information in Wikidata on Mr Meaders is not complete.

Thursday, April 13, 2017

#Wikidata - People die; implications for another #policy approach

People die, notable people die. It is natural and it happens all the time. Many a #Wikipedia has a category for the people who died in a specific year. Such categories are what makes a wonderful tool by Pasleim tick. It shows those Wikidata items that have no date of death while a Wikipedia knows about the demise of the person involved.

This is a wonderful tool; it allows Wikidata to take care of those who died and update its data. It leaves us with another option: adding one more tool, one that checks if the date of death exists in the Wikipedias that do not have such a category.

Consider this; a date of death is relevant when you consider the "Biographies of Living People". Having complete information for people is important. So why not flip our approach to the BLP and provide tools to improve the existing information in all of our projects?

First things first; the objective is to signal the death of a person. As is the current policy, it is up to every project to do with it as it likes. What should follow is looking for sources when one is available and preferably add at least one to Wikidata for re-use.

What are the benefits? A positive approach to maintenance and an invitation to people to do something that actually matters now. It is an invitation to read the article and see what more can be done to get it into shape.

When the date for a death exists in an article, the article will be removed from the articles that need attention. There are plenty of valid approaches to this.

Improving user engagement is one of the objectives of the Wikimedia Foundation itself. I really want the WMF to include active engagement where it makes a difference and be as pro active as it can in this field. This is a positive approach and that is what we badly need.

Saturday, April 08, 2017

#WhiteHouse Fellows - Mrs Margarita Colmenares

Mrs Margarita Colmenares is a White House Fellow. A message was posted on Twitter that her article had been created and to support the message, it was easy enough to add her on Wikidata as well. The article mentioned that she was a White House Fellow and adding one layer of additional information is one way of making a person more relevant.

Adding this fellowship and adding other people who were a fellow was easy enough. The Wikipedia article referred to the website of the White House for information and when you visit its website you will be thanked for having an interest in this subject.

At a time like this it is good to consider  Its crawler worked well at some dates for other dates the message you will see is: "Got an HTTP 301 response at crawl time".

Anyway.. Together, the information at and at provide enough of a reference.

Friday, April 07, 2017

#Wikidata - #Perfection or #progress

When you consider the intention of the "BLP" or the "Biographies of Living People", you will find that it is defensive. It is the result of court cases brought against the Wikimedia Foundation or Wikipedians by living people. The result was a restrictive policy that intends to enforce the use of "sources" for all statements on living people.

The upside was fewer court cases and the downside; administrators who blindly applied this policy particularly in the big Wikipedias. Many people left, they no longer edit Wikipedia.

At Wikidata there are proponents of enforcing a BLP explicitly so that they have the "mandate" to block people when they consider them too often in violation of such a policy.

For a reality check; there are many known BLP issues in Wikidata that are not taken care of. There are tools like the one by Pasleim that make it easy to do so. There have been no external complaints about Wikidata so far but internal complaints, complaints about the quality of descriptions for instance, are easily waved away.

The implementation of a "DLP" or "Data of Living People" where "sources" are mandatory would kill much of the work done at Wikidata and will not have an effect on the existing backlog. Killing the backlog removes much of the usability of Wikidata and will prove to be even worse.

In order to responsibly consider new policies, first reflect on the current state of a project. What issues need to be addressed? What can be done to focus attention on the areas where it is most needed? How can we leverage what we know in other projects and in external sources? When it is really urgent, make a cost analysis and improve the usability of our software to support the needed progress. And yes, stop insisting on perfection; it is what you aim for. None of us is in a position to throw the first stone.

Wednesday, April 05, 2017

#Wikimedia and our #quality

In Berlin, the Wikimedia Foundation deliberated about the future. A lot of noble intentions were expressed. People went home glowing in the anticipation of all the good things they want. It is good to talk the talk and follow up and walk the walk.

A top priority for Wikidata is that it is used and useful. As it becomes more useful, quality becomes more of a priority for the people who use it. They will actively curate the data and remedy issues because they have a stake in the outcome.

So far Wikidata is largely filled with information from all the Wikipedias and this process can be improved substantially. For this to happen there is a need for more complete and up to date data. So what use can we give this data so that it gains use, and thereby gains value?

What if .. What if Wikidata could be used as an instrument to find the 4% of wiki links in Wikipedia that point to the wrong articles? With some minor changes to the MediaWiki software this can be done. This approach is described here for instance.. The beauty of this proposal is that not all the Wikipedians have to get involved, it is for those who care, for the rest it is mostly business as usual.

There are other benefits as well. When it is "required" to add a source to a statement like "spouse of", it should be or is a requirement on the Wikipedia as well. When the source is associated with the Wiki link or red link for that matter, it should be possible for Wikidata to pick it up manually or with software.

When content of Wikidata more closely mirrors information of a Wikipedia in this way, it becomes easy and obvious to compare this information with other Wikipedias. Overall quality improves, but as relevant, the assurance we can give about our quality improves.

When we consider Wikimedia for the next 15 years, I expect that we will focus on quality and prevent bias not only by combining all our resources but also by reaching out to other trusted sources. By working together we will expose a lot more fake facts.

Sunday, April 02, 2017

#Wikidata - #Quality is a #perspective.

Forget absolutes. As an absolute, quality does not exist for Wikidata. At best quality has attributes, attributes that can be manipulated, that interact. With 25,430,779 items any approach to quality will have a potentially negative quality effect when quality is approached from a different perspective.

Yet, we seek quality for our data and aim for quality to measurably improve. There are many perspectives possible and they have value, a value that is strengthened when it is combined with other perspectives.

At the Wikimedia Foundation, the "Biographies of Living Persons" or BLP has a huge impact. When you consider this policy, it is about biographies, a Wikipedia thing and this is not what Wikidata does. It is important to appreciate this as it is a key argument when a DLP "Data of Living Persons" is considered. Important is that the BLP focuses on articles for living people and its aim is to prevent lawsuits from articles that have a negative impact on living people.

Data is different, it is used differently and it has an impact in different ways.  Take for instance notability; a person may be notable and relevant because of having held an office or receiving an award. In order to complete information on the succession of an office or an award, it is therefore essential to include all persons involved in Wikidata. At the same time, when information is incomplete it can have an impact on a person as well. "you did not get that award because Wikidata does not say so".

Wikidata is incomplete and immature. Given the different perspectives on a DLP, most of them are not achievable in short order. The people who insist on a "source" for any statement will wipe most of the Wikidata statements and force it to a standstill. The people who insist on completeness have an impossible full time job for many years to come.

So what to do? Nothing is not an option but seeking ways to improve both quality and quantity is. A key value of Wikidata is its utility. The "Black Lunch Table" is one example of giving utility to Wikidata. They use Wikidata to manage the Wikipedia articles they want to write and expand on the notability of artists by including information on Wikidata. All the information helps people to write Wikipedia articles. Quality is important. Being included on the Black Lunch Table means something; artists are considered to be notable and worthy of a Wikipedia article.

Another example is using the links to authors so that people can read a book.

Given the size of Wikidata, it is impossible to get everything right in short order. When we can get people to adopt subsets of our data, these will grow. Our data will be linked. When we get to the stage where people actually object to data in Wikidata, we have improved both our quantity and quality substantially. As it is, looking at all the data, typically there is little to object to and that is in itself objectionable.

#Wikimedia - First a #strategy, then #Action

The people at Open Library have books they love to share. They are in the process of opening what they have even more.

In a previous post it was mentioned that there is a JSON document for getting information on authors like Cicero. There are many works by Cicero and today they have a JSON document in production for the books as well.

So what possible scenario is there for the readers of any Wikipedia; they check in Open Library what books there are for Cicero (or any other authors). They download a book and read it.

Where we are:
  • there is an API informing about authors and their books at Open Library based on the Open Library identifier.
  • an app can now be built that shows this information
    • this app could use identifiers of other "Sources" like Wikidata, VIAF or whatever on the assumption that Wikidata links these "Sources".
    • this app could show information based on Wikidata statements in any language using Wikidata labels.
    • this app may download the book (maybe not yet but certainly in the future)

What next:
  • investigate the JSON and see what we already can do with it
    • publish the results and iterate
  • Add more identifiers of authors known to Open Library to Wikidata
    • there are many OL identifiers in the Freebase information; they need to be extracted and a combined list of Wikidata identifiers and OL identifiers allows OL to curate it for redirects and we can then publish.
  • Raffaele Messuti pointed to existing functionality that retrieves an author ID for Wikidata and VIAF using an ISBN number.
    • Open Library knows about ISBN numbers for its books. When it runs the functionality for all the authors where it does not have a VIAF identifier it can enrich its database and share the information with Wikidata.
    • Alternatively someone does this based on exposed information at Open Library.. :)
  • We add a link to Open Library in the {{authority control}} in Wikipedia
  • We could add information for nearby libraries like they do in Worldcat [1].
  • We can measure how popular it is; how many people we refer to Open Library or to their library.
At the Wikimedia Foundation we aim to share in the sum of all knowledge. We aim to enable people to acquire information. Making this happen for people at Wikipedia, Open Library and their library is part of this mission; we just have to be bold and make it so.

Saturday, April 01, 2017

#Wikimedia - Sharing all #knowledge

It is strategy time at the Wikimedia Foundation. For me the overarching theme is: "Share in the sum of all knowledge". Ensuring that knowledge, information is available is not only an objective for us, it is an objective we share with organisations like the Internet Archive and the OCLC.

One of the activities of the Internet Archive is the "Open Library". It provides access over the Internet to books that are free to read. At Wikidata we include links for authors that are known to the Open Library, so all it takes is for a Wikipedia to have an {{authority control}} template on its authors and a link to Open Library has been provided.

When you work together, a lot can be achieved. A file with identifiers for authors has been sent to the OCLC and Open Library. The reaction is that in the JSON for these authors Open Library includes a link to both VIAF (a system by the OCLC) and Wikidata. This is the JSON for Mr Richard W. Townshend.
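To give an idea of what can be done with such a JSON document, here is a minimal sketch in Python that reads the identifiers from an author record. The record below is invented; the field names ("key", "remote_ids", "viaf", "wikidata") are an assumption about how the identifiers are exposed and should be checked against what Open Library actually returns.

```python
import json


def author_identifiers(author_json):
    """Extract the linked identifiers from an author record in JSON.

    The field names used here are assumptions for this sketch.
    Missing identifiers are returned as None.
    """
    record = json.loads(author_json)
    remote_ids = record.get("remote_ids", {})
    # The key is assumed to look like "/authors/<identifier>".
    open_library_id = record.get("key", "").rsplit("/", 1)[-1] or None
    return {
        "open_library": open_library_id,
        "viaf": remote_ids.get("viaf"),
        "wikidata": remote_ids.get("wikidata"),
    }


# Invented sample record with placeholder identifiers.
SAMPLE_AUTHOR = """
{
  "key": "/authors/OL0000000A",
  "name": "Example Author",
  "remote_ids": {"viaf": "000000", "wikidata": "Q0"}
}
"""

identifiers = author_identifiers(SAMPLE_AUTHOR)
```

With the three identifiers side by side, an app can start from any one of them and follow the link to the other two.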

The next step is to optimise the process of including identifiers for both VIAF and Open Library. What we bring in is our community. We have done a lot of work using Mix'n Match. We do add identifiers when it seems opportune and we already function as a stepping stone between Open Library and the OCLC. So when we can target attention in Mix'n Match per language, it already is a lot easier to make a match. It may be possible for the OCLC and Open Library to match authors through publications and in that way technology is a deciding factor.

In the end there is only one point to all this: share in the sum of all knowledge. We all have a part to play.

Friday, March 31, 2017

#Wikidata - concentrating on #Fulbright ?

A friend told me to concentrate on substantial awards; the Fulbright scholarship for instance. To me concentrating on 325,000+ alumni is crazy. There are too many and obviously, some of them will have turned out not to be so notable after all. I do not think Wikidata is a stamp or pokemon collection either.

When you search for Fulbright in Reasonator, there is still plenty to do. There is a "Fulbright scholarship" and a "Fulbright Program"; they are about the same thing so their content should be merged. And then there is this "Fulbright Prize"; it seems to have an article only on the Hebrew Wikipedia. There are also several items with no statements.

There is no reason for me to concentrate on all the Fulbright scholars. Given that it applies to so many people, slowly but surely more people will be tagged as such. Not only the people who can be found in categories or lists but also where it is only mentioned in an article.

A scholarship implies studying at a university. When you add a scholarship and there is no information about education, this is another aspect that needs taking care of. At some point it should become obvious; it is better to concentrate on something else.

Thursday, March 30, 2017

#Wikidata - Librarians and Mrs Carla Hayden

As a group librarians are not very visible. At the same time librarians are the people that have provided people with information before there was an Internet. In this day and age, they are still taking care that much of the published information is there for us and the generations to come.

Mrs Carla Hayden will address a Wikipedia edit-a-thon in Washington, DC hosted by the Library of Congress and the US National Archives. Mrs Hayden is the Librarian of Congress. It is always fun to update Wikidata information that is in the news.

It is amazing how little information there is for librarians. Of the "Librarians of the year", there seem to be only two with a Wikipedia article. Anyway, adding information for Mrs Hayden is a privilege and adding information on many librarians is easy to do.

What I am not sure about is if giving a lecture like the Jean E. Coleman Library Outreach Lecture may be seen as an award. Mrs Hayden gave this lecture twice.

#Quality - #DBpedia and Kappa Alpha Psi

Kappa Alpha Psi is a fraternity of students and alumni. There is a Wikipedia article in English, a Commons category and a Wikidata item.

The information about Kappa Alpha Psi at Wikidata is based on the Wikipedia article. Information was added to the items for the members. This was done because in a related item it was found that the influence of fraternities and sororities is considerable. Concentrating for a moment on Kappa Alpha Psi has a secondary quality impact on what is of primary concern but when this is done for three such organisations, it quickly affects thousands of notable people.

When people find it of interest to add information about a membership to a Wikipedia article it has some impact. Having a category helps more to make the relevance of Kappa Alpha Psi more visible. Adding this information to Wikidata is easy and it may show up in any language when membership information is part of a template.

DBpedia is a project similar to Wikidata. It harvests data from Wikipedias more consistently than Wikidata. Wikidata items are mapped to its internal items, making it possible to compare Wikidata with DBpedia.

When quality is an objective, when quality is to be improved effectively, the differences between DBpedia and Wikidata are an easy and one of the more obvious starting points. For some Wikipedias DBpedia updates are based on the RSS feed of the changes. So once a difference has been curated and changed in either Wikipedia or Wikidata, it results in an improved DBpedia entry and the desired improvement in quality.  It does not need any math to understand this.

What we need is a tool that uses these differences as input for a subset that is of interest to a Wikidata volunteer. That might be the Kappa Alpha Psi, the Black Lunch Table or whatever. Whatever can be defined with a query.

Sunday, March 26, 2017

#Wikidata - Gladys and Reginald Laubin

According to the documentation of the Capezio award, both Gladys and Reginald Laubin are awardees. The Capezio award is a dance award and it got some attention because a person of interest received the award in 2007. Wikipedia information was available until 2006.

Adding information for Mrs Laubin makes sense; she is as notable as her husband. She has her own VIAF registration and it completes the Capezio award information.

When you add an award and its awardees, some quality is expected. Adding what Wikipedia knows borrows from the sources at Wikipedia but new information is authoritative when it is from the associated website. When you then seek later information, it becomes more fuzzy; it becomes less obvious. It may not even be correct.

That is however how the cookie crumbles; like Wikipedia, Wikidata also relies on the interpretation of sources.

Friday, March 24, 2017

#Wikipedia - Professor Joseph Torgesen

The article on Professor Joseph Torgesen is a stub. The cool thing is that the information on a minimal article allows for improvements in the data at Wikidata. The author of the article included information on education and employment. This was done through categories.

Petscan was used and as a result 244 staff members of Florida State University and 107 alumni of the University of Michigan were added, including Mr Torgesen.

As Mr Torgesen is a professor and "must" publish, finding a VIAF registration was possible. Adding the {{authority control}} template to the article enriched the article. One fact is not in the article: Mr Torgesen was awarded the Samuel Torrey Orton award in 2006. This is why there was already an item in Wikidata for Mr Torgesen.

Thursday, March 23, 2017

#Wikipedia vs #Wikidata - Quality and low hanging fruit

When Wikipedia is to be the best, it has to understand and preserve its quality. When Wikidata is to be the best, it has to understand and preserve its quality. Both Wikipedia and Wikidata are wikis but their quality and how it manifests itself are utterly different. At the same time they intersect and this is where we find low hanging fruit.

In Wikidata we have "Author"s and subclasses of author. Many of them have a VIAF identifier and this means that libraries know about them. Information like VIAF is shown in the English Wikipedia when there is an {{authority control}} template. It shows nothing when there is nothing to show but it will update Wikipedia when the information is added to Wikidata.

The low hanging fruit:
  • English Wikipedia - All articles about someone who is known as an author of any kind get the template.
  • Wikidata - For all the items for someone who is known as an author of any kind we seek the VIAF identifier.
  • OCLC - All the libraries in the world will be updated with a link to Wikidata within a month. This will make it easy for a librarian to find Wikipedia articles in any language.
  • Internet Archive - It has a project called "Open Library" and it has freely licensed e-books. Wikidata includes Open Library identifiers. OCLC and OL have links combined with Wikidata identifiers. As these numbers increase, people in libraries or from Wikipedia could find authors with free books.
  • other Wikipedias - they could include VIAF and OL identifiers as well. Open Library has books in languages other than English.
We live in an interconnected world. Wikimedia quality is in not being on an island but increasing the reach and enabling our readers.

Tuesday, March 21, 2017

#Wikidata and #activism

When you care about something, you want to make sure that when you do something, it has an impact. There are many ways a difference can be made: you can protest, you can write in a blog, you can write Wikipedia articles and you can try to connect things in Wikidata.

For Wikimedians like me, sharing the sum of all knowledge is why we are involved. As knowledge is key, it is important to make sure that facts are registered and access to knowledge becomes enabled.

The problem is that it is not obvious how and where a difference can be made. When the BBC gives diversity a prominent place because of its 100 women program, it seems obvious that we will write articles about these women. It is however not the first time that the BBC runs this program. We have written articles for women celebrated in 2013, 2014, 2015 and 2016. But in what language are these articles written? How much are they read? How well connected are these women to universities, to political parties, to organisations and what countries are they from?

For a Wikimedian these are interesting questions. For an organiser of editathons they are what measures success. Is this activism? Sure. How does it affect the legitimate concern of impartiality? Not really as Wikimedia has always been about what people fancy to work on.

Saturday, March 18, 2017

#Wikidata - the #Rome Prize

The Rome Prize is given to a high number of American artists. It is awarded every year to 15 artists and 15 scholars, who stay for an extended period in Rome. The first awards were given in 1905.

The award winners are mentioned in many articles; when there is no article yet, there is a red link. New articles are written all the time so problems can be anticipated.

The problem is in names; different people bearing the same name. When new articles are written, there is no consideration for these red links. Articles are written. When an article is written for a Rome Prize winner, he or she may be included in the category for Rome Prize winners and that works well.

Some will say that Red Links are bad. They have a point. However it is all in the delivery. When there is no article, it does not follow that there is no information. The information could already be in Wikidata and I added a few statements for 2016 winners.

Authors, the #OpenLibrary, #Wikidata and libraries

The Open Library is part of the Internet Archive. It makes books available for you to read. That is awesome and that is why Open Library is a natural ally of the Wikimedia community.

At our end we can do more of the things that we do anyway and share what we do. The good news is that Wikidata has a CC-0 license. The people at Open Library can use everything that we do and they do not even have to bother to say thanks.

When we add more Open Library identifiers and VIAF identifiers to Wikidata we connect them, us and all the libraries in the world. Yes, individual libraries may have different ways of spelling an author's name but by using these connections, disambiguation slowly but surely becomes a thing of the past for Open Librarians.

What will we have in return? All the books at Open Library of these authors become available to our readers and editors. We are already in the process of adding identifiers to Wikidata for Open Library. For all the authors that have been connected, we can provide our identifiers to Open Library. This helps them with their outreach and disambiguation.

Through Wikidata more and more authors become connected to VIAF. This allows the librarians of the world to share these freely licensed books with their readers. A clear win-win situation, don't you think?

Friday, March 17, 2017

#Wikimedia - Professor Chuck Stone, Tuskegee airman and member of Alpha Phi Alpha

Professor Stone is the founding NABJ President; he was included in the National Association of Black Journalists Hall of Fame in 2004 and he received the Congressional Gold Medal from President Bush.

The description for the Wikidata item for Mr Stone is "American air force officer". This will not change; it is based on a bot that at one time decided that this would do. The automated description is: "US-American journalist (1924–2014); National Association of Black Journalists Hall of Fame and Congressional Gold Medal; member of Tuskegee Airmen, Alpha Phi Alpha, and World Policy Council ♂" and the beauty is that this is updated as more information becomes available.

When you consider the quality of the information for Mr Stone in Wikidata, today 10 statements were added to the item. He has been added to the hall of fame with many others including some people Wikipedia does not know about. The World Policy Council is connected to Alpha Phi Alpha. The data is not complete; there is more to add.

When we consider quality, most of the data was added thanks to information available in the English article of Wikipedia. Yet there is information available that could find its way from Wikidata; how do we inform Wikipedia about the people who became part of the hall of fame, for instance? Quality for Wikidata is not in single items, it is in how it connects and how it is used. With this realisation we learn from where some say Wikidata and Wikipedia fail and achieve the success that our combined data offers.

Thursday, March 16, 2017

#Wikidata - Black Art

Charles Alston is one of the artists who are of interest to the Black Lunch Table. Mr Alston died in 1977. One of his struggles was to have his art appreciated in the same way as any other art. It is why he refused to be exhibited in William E. Harmon Foundation shows, which featured all-black artists in their travelling exhibits. Alston and his friends thought the exhibits were curated for a white audience, a form of segregation which they protested. They did not want to be set aside but exhibited on the same level as art peers of every skin color.

Today is 2017 and the BLT addresses this black experience and gains the same attention for black artists by writing in Wikipedia about them. It is why many artists with a black experience gain more information in Wikidata, artists like Mr Alston. The one thing where Wikidata differs from Wikipedia is that it is all about connections. The more a person is connected, the more relevant in different settings. Mr Alston had a notable spouse, he was a founder and member of an art group, he studied and worked. All these things are easy and obvious in Wikidata.

From an artist's point of view, other things are of relevance too; what awards did he gain, what museums have work in their collection and where did he exhibit. There is as yet no obvious way to make such a claim. Like so many young men of his time, he was in the army in the 372nd Infantry Regiment but that is not quite what Mr Alston is about. This could be relevant for people who care about the military and also, the 372nd was a black experience as well.

Most articles on the English Wikipedia for a person have categories about education and work at a faculty. Adding the implied information for everyone is almost as easy as adding it for one person. It makes adding statements something of a black art, an art that looks complicated, an art that connects everything.

Tuesday, March 14, 2017

#Wikidata - Who is Eric D. Wolff?

Eric D. Wolff is one of three authors of a paper called "Original Issue High Yield Bonds: Aging Analyses of Defaults, Exchanges, and Calls". They won the 1989 Smith Breeden Prize and the Wikipedia article has a red link for Mr Wolff, no link for Mr Paul Asquith and a blue link for David W. Mullins, Jr.

The simplest thing to do is add an item for all the missing authors, connect them to the awards and be done. As they wrote a paper, it is reasonable to expect a VIAF registration and it was possible to find Mr Asquith.

The question is not if Mr Wolff is notable; he is as he won a prize. The question is how to reliably connect him and others to external sources. Making this effort improves quality for Wikidata; it is quality in action.

#Wikidata - actionable quality; Debora L. Silverman

Mrs Silverman is the 2001 winner of the Ralph Waldo Emerson Award. As Wikidata had only two statements for her, it was appropriate to add more information. The Wikipedia article is a stub but it had two categories for a university where she studied and one where she worked. Adding this fact to all the people in a category is relatively easy.

The Ralph Waldo Emerson award was given for "Van Gogh and Gauguin: The Search for Sacred Art". It makes Mrs Silverman an author and consequently there is a VIAF registration for her. Adding this has an effect when the {{Authority control}} template is available in the article. I added it to the Wikipedia stub and was pleasantly surprised with the WorldCat information from the OCLC.

It is wonderful to find such quality information provided as a consequence of having VIAF information in Wikidata. That is actionable quality!

Monday, March 13, 2017

#Wikidata #quality - is it actionable?

T. Geronimo Johnson
The Ernest J. Gaines Award for Literary Excellence is a great example to explain about Wikidata quality. The item is linked to a Wikipedia article and it has several red links. For all the red links a Wikidata item has been created and the winners for 2015 and 2016 are only known to Wikidata.

The Wikipedia article for the 2016 winner knows about the award. The article mentions the Sallie Bingham Award, an award that Wikidata does not (yet) know about. Wikidata knows about the VIAF registration for the winner; this is relevant because it means that the international libraries know about this author. The Wikipedia article mentions several universities that were attended; including them in Wikidata is easy and obvious. Doing so improves quality for both the author and for the universities involved. The quality of Wikidata is equal to or better than Wikipedia when it knows about the same or more articles than a Wikipedia category does.

Several of the winners including T. Geronimo Johnson, the 2015 winner, are "red links". The minimum needed for Wikidata is to know that he is male and the winner of the award. With a little bit of effort his VIAF identifier can be found. Consequently we know that the T. stands for Tyrone. Adding the VIAF identifier will show the Wikidata identifier in a month's time on the VIAF website and it allows for quality checks in Wikidata.

Quality for Wikidata is different from quality for Wikipedia. It is less in traditional sources and it is more in connecting to sources like VIAF. When a Wikipedia, a Wikidata and sources like VIAF are in agreement a fact is verifiable and becomes more immune to "alternative facts".

When editing Wikidata, quality is in completeness, in combining information from multiple sources, in making Mr Johnson the 2015 winner by adding a qualifier. It starts however with making an effort.

Sunday, March 12, 2017

#Wikidata - Maren Hassinger is on the "Black Lunch Table"

Maren Hassinger is a sculptor born in Los Angeles. She was awarded both the Anonymous Was A Woman Award and the Women's Caucus for Art Lifetime Achievement Award. In addition to this there is a Wikipedia article.

When you read the article, all kinds of statements are made that could reflect in the article having Wikipedia categories. In this case statements in Wikidata were made.

The outward appearance is that Wikipedia and Wikidata are two distinct projects. Wikidata however has always included data from Wikipedia and there has always been a realisation that Wikipedia in its turn could benefit from Wikidata; generating category entries should be possible for instance.

When you consider the immediate future of the Wikimedia projects, Commons will be wikidatified. One part of the information that is directly related to GLAM activities is registering the museums that include an artist in their collection. This is applicable for many artists that are part of the Black Lunch Table including Mrs Hassinger. So the question is; should we include such information in Wikidata and how should we do this?

Saturday, March 11, 2017

#Wikidata - Historical amnesia

A discussion about contemporary politics is not based on facts, nor on the interpretation of facts; it is much more based on identity and what group you belong to. It is important to politicians to frame their message and much of this framing is done through a selective use of facts and the presentation of opinion as facts.

The Wikimedia community is not about politics except where facts are concerned. Facts matter; for instance Mrs Clarissa Sligh cares about "historical amnesia", read her website and see what is meant. Mrs Sligh qualifies as far as I am concerned as a "Black Lunch Table" candidate. They are artists from the African diaspora and giving attention to them is a project that aims to lessen the diversity gap that exists.

In contemporary anti politics it is relevant that facts are available. All the Wikimedia projects are political in that they deny any singular political message their limited view on facts. It is important to overcome the bias of the demagogues and pundits and bring together information that paints a difference.

Friday, March 10, 2017

#Wikidata - dating awards

Many awards are dated using "point in time". With a query you can count them, certainly when you are WikidataFacts. Looking at a graph like this, you will see that many awards for 2016 and 2015 are still missing. Many of these awards were imported in the past and have not been updated yet.
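For those who want to try counting themselves, here is a rough sketch in Python. The query text is my own guess at how such a count could be written, not the query WikidataFacts used; it relies on P166 ("award received") and its qualifier P585 ("point in time"). Run against everything it may well time out, so in practice you would narrow it to one award or one group of people. The counting function and the sample dates are made up to show the idea.

```python
from collections import Counter
from typing import Iterable

# A guess at a query for the Wikidata Query Service; narrow it down
# (one award, one country) before running it for real.
QUERY = """
SELECT ?year (COUNT(*) AS ?awards) WHERE {
  ?person p:P166 ?statement .
  ?statement pq:P585 ?date .
  BIND(YEAR(?date) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def awards_per_year(dates: Iterable[str]) -> Counter:
    """Count award statements per year from ISO dates like '2015-01-01'."""
    return Counter(int(d[:4]) for d in dates)


# Made-up "point in time" values, standing in for real query results.
sample = ["2014-01-01", "2014-01-01", "2015-01-01", "2016-01-01"]
counts = awards_per_year(sample)
for year in sorted(counts):
    print(year, counts[year])
```

A dip in the most recent years in such a count is the signal that awards were imported once and never updated.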

It would be cool if we knew what Wikipedias have an article for the awards and would be "pinged" when new values are known.

Thursday, March 02, 2017

#Politics and 33% fewer #HIV infections in the #UK

Professor Sheena McCormack studied the efficacy of PrEP in the United Kingdom. She headed a major NHS study to ascertain how effective the drug was, and who should be given it. The study was called PROUD. Greg Owen was too late to enrol; buying the drug privately would have set him back £500 per month, money he could not afford.

He could score some of these pills but before he started, he found he was already HIV-positive. He posted his story on Facebook and was inundated with questions; what is PrEP, where can I get it. At some stage he remembered that medicines can be had from the Internet from countries where medicines are more affordable. Unbranded PrEP is available for £50 per month.

Greg informs people on his blog. Professor McCormack was instrumental in helping set up clinics that monitored the use of these unbranded medicines. It was based on the assertion that doctors are responsible for the care that they provide. Helpful friends monitor the supply and indicate what websites provide the correct substance.

The National Health Service meanwhile did not want to fund the use of PrEP in 2016. As a result more and more people became aware of PrEP and learned about the alternative. In August of 2016 the NHS lost its case in the High Court. As a result the NHS is doing a "test" for three years starting this summer for 10,000 people, ignoring the 33% fewer HIV infections because of PrEP.

When Wikimedians talk about politics, having no article on Professor McCormack and on Greg Owen is relevant. With all the publicity on this case, where is the neutral point of view in this?  It is important because it highlights the cost of medicines as a determining factor on who lives and who does not. In Europe many people can afford £50 a month but in many other countries it is out of reach to make the difference it makes in Europe. According to the United Nations, we can end the HIV/Aids epidemic by 2030 and then Mr Trump happened.

It is political because it provides clarity in a time where companies like Mylan make medical care too expensive. It provides clarity when the US government insists on taking away medical insurance from people.

It is political and all too often Wikipedia does not inform. We know that 12% of the 2500 most sold prescription drugs are not effective (source British Medical Journal) and we do not even register this on those drugs. Wikipedia is the prime source of information on medical matters and in my opinion we are negligent.

Sunday, February 26, 2017

#Wikidata - John Dalrymple Governor of British Mauritius

There is a list that has a John Dalrymple as Governor of British Mauritius. There are enough John Dalrymples to allow for a disambiguation page.

When there are multiple people with the same name, it is important to exclude the ones that do not fit. A date of birth and death is relevant. The time of having been governor is from 10 December 1818 to 6 February 1819 and that excludes most of them.
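The exclusion itself is simple enough to sketch in Python. The candidates below are invented placeholders, not the real John Dalrymples, and the minimum age of 18 is my own assumption; only the dates of the term come from the list.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    label: str
    born: Optional[date]  # None when the date is unknown
    died: Optional[date]


# Term of office as given in the list of governors.
TERM_START = date(1818, 12, 10)
TERM_END = date(1819, 2, 6)
MIN_AGE = 18  # assumption: nobody younger would have been governor


def could_have_served(c: Candidate) -> bool:
    """Keep a candidate unless the known dates rule him out."""
    if c.born is not None and c.born.year + MIN_AGE > TERM_START.year:
        return False  # not yet born, or too young
    if c.died is not None and c.died < TERM_START:
        return False  # died before the term began
    return True  # unknown dates cannot exclude anyone


# Invented candidates, purely to exercise the filter.
candidates: List[Candidate] = [
    Candidate("Candidate A", date(1700, 1, 1), date(1780, 1, 1)),
    Candidate("Candidate B", date(1770, 1, 1), date(1850, 1, 1)),
    Candidate("Candidate C", date(1810, 1, 1), None),
    Candidate("Candidate D", None, None),
]

remaining = [c.label for c in candidates if could_have_served(c)]
print(remaining)
```

Note that a candidate without dates stays on the list; dates can only exclude, they never prove that someone is the right person.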

There is only one likely candidate; the eighth Earl of Stair. There is no source but hey, it is at least likely; one source describes the governor as a general and this one certainly was one. I am however not sure.

There is a Listeria list of British Governors. When someone knows better, he can adapt Wikidata and everywhere where the list is used things will improve.

#Wikidata - Cornelius Alfred Moloney

Mr Moloney was one of the Governors-in-Chief of the Windward Islands.  As a colonial administrator for the British his tours of duty included work in Africa and in South America.

As part of his endeavours, he had an interest in forestry and even wrote a book about it. This resulted in him being known in the IPNI database of authors. All of these authors had been added so it was just a matter of merging the two items for Mr Moloney.

When you read the article on Mr Moloney, it does not mention his publication but it does mention that during his tenure at the Windward Islands there were riots because of the cost of water.

No historic event happens in a vacuum. Pertinent information becomes more and more available and this can be provided either in Wikipedia or in Wikidata. For the Windward Islands it would be that it is a "British Colony" with as its Governor-in-Chief Mr Moloney and possibly a map to make it clear what is included.

One problem with understanding many facts is that they are better understood when a context is known. As we gain more data it becomes feasible to provide it.

Saturday, February 25, 2017

#Wikidata - Jamaica and William O'Brien, 2nd Earl of Inchiquin

William O'Brien, the second Earl of Inchiquin is an interesting person. When you consider what Wikidata knew about him, much of it is missing. No members of the Irish Privy council were known, and he was not noted as being a governor of Jamaica. It is not known that a ransom was paid by the English parliament to the Turks.

When you read the Wikipedia article, it is mentioned in one sentence that he was the first governor of Jamaica but that is all there is. There is not much more except that he died in Jamaica in 1692 and that he was Vice-Admiral of the Caribbean Seas.

When you consider the bias in all this, it is found in the lack of attention for countries like Jamaica. Their history is in the people who were responsible for what is their country. When you read about the family of the earl, you may note that he married women of substance and obviously they are notable enough for inclusion in Wikidata. What is not known is if his wife accompanied him to Jamaica.

Much of the source material on the history of countries like Jamaica can be found in British archives and museums. Improving on the articles on people like Mr O'Brien and sharing this information widely will make Wikipedia more useful outside of its immediate and current interest.

Sunday, February 19, 2017

#Wikidata - The "first" president of Haiti

When people express a strong interest for a subject; when there is a chance that this subject is finally getting the attention it deserves, it is a good moment to assist, particularly when it is just a matter of concentrating on what you do anyway.

So Haiti presently has my attention. I have added the known members of the Chamber of Deputies, all six of them, and I have added the succession on most of the Presidents of Haiti. The problem here is that I do not know enough to make sense out of the early rulers. I will add the known members of the Senate.

When I am done with this, I hope to get a list of the present members of the Chamber of Deputies. It is easy enough to include them in Wikidata and this may be followed up by generating lists for use in any Wikipedia that will take it.

Lists like this are wonderful because they provide early structure. When someone adds an article, it is already linked in many places in that Wikipedia and this will make for meaningful early linking in a project. Lists of award winners, lists of politicians for a party or an office. It is all possible when you think in potential particularly when the objective is to share in the sum of all knowledge.

Saturday, February 18, 2017

#USA - The Eleanor Roosevelt Award for Human Rights

Just to confuse you; there are two awards by that name. This is about the award that was established in 1998 by the President of the United States Bill Clinton, honouring outstanding American promoters of rights in the United States. In 2010, Secretary of State Hillary Rodham Clinton revived the Eleanor Roosevelt Award for Human Rights and presented the award on behalf of President Obama.

For whatever reason, this award was not very much on the radar of Wikipedians because like with so many awards, it was not well maintained. There are only people for 2010 and there is one person, and he should be on the list for the "other" award. That award is conferred by "Jobs with Justice".

What happened after 2010? Several more years of a United States with President Obama and now, four weeks in the reign of the present incumbent, the sources for the award are gone. They were at a US Government website. Luckily the disappearance of Internet sources is a well known phenomenon; Wikipedians know how to deal with it. The question is if this award is deemed notable enough and there is the rub. It is not obvious what went missing and when the Wikipedia article is not complete, the removal of the data on the web serves its purpose.

Friday, February 17, 2017

#Sources: the Charles S. Johnson Award

Awards honour the recipients, the organisation that confers them and often the person the award is named for. The Charles S. Johnson Award is named after Charles S. Johnson. He is notable; he has his own Wikipedia article. The organisation that confers it is notable; the Southern Sociological Society has its own Wikipedia article.

The awardees, well that is a problem because there is only a partial list of people who received the award. There are many gaps in the list and sources are available to indicate why they think someone received the award.

The Wikimedia blog has a post that mentions Lillian Smith. It mentions that she gave an acceptance speech at Fisk University in 1966. But there is no source for it. This does not mean that she did not receive the award, it just means that there is no source in the article.

When sources are provided to the Sociological Society, Mrs Smith will be connected to a list of other remarkable people. All notable in their own way. It is just a matter of connecting the dots.

Thursday, February 16, 2017

#Wikidata - Who is Ann Dale

Ann Dale won the Molson Prize in 2013. In a rare twist, the English Wikipedia does not list the winners of this award but other Wikipedias do. It was therefore easy to import the list from the German Wikipedia into Wikidata.

As there is a link to the website of the award, it was easy to include the more recent winners of the award. Most of the recent winners already had a Wikipedia article so it was easy to add them.

When you disambiguate for Ann Dale using Reasonator, there was an Ann Marie Dale. There was not much known for her except for her publications. Given that it was possible to find out that she worked for the Washington University in St. Louis, it was a miss. The information on the Molson Prize website provided the answer; it was a different Mrs Ann Dale.

The research by Mrs Dale is on governance, innovation and community vitality and is designed to provide useful knowledge to Canadian decision-makers. There might be something in her work that is of interest to the Wikimedia Foundation as well.

NB Mrs Ann Dale is now registered in Wikidata. More information is left to other interested souls.