Friday, March 31, 2017

#Wikidata - concentrating on #Fulbright ?

A friend told me to concentrate on substantial awards, the Fulbright scholarship for instance. To me, concentrating on 325,000+ alumni is crazy. There are too many and, obviously, some of them will have turned out not to be so notable after all. I do not think Wikidata is a stamp or Pokémon collection either.

When you search for Fulbright in Reasonator, there is still plenty to do. There is a "Fulbright scholarship" and a "Fulbright Program"; they are about the same thing, so their content should be merged. And then there is this "Fulbright Prize"; it seems to have an article only on the Hebrew Wikipedia. There are also several items with no statements.

There is no reason for me to concentrate on all the Fulbright scholars. Given that it applies to so many people, slowly but surely more people will be tagged as such: not only the people who can be found in categories or lists but also those for whom it is only mentioned in an article.

A scholarship implies studying at a university. When you add a scholarship and there is no information about education, this is another aspect that needs taking care of. At some point it should become obvious that it is better to concentrate on something else.

Thursday, March 30, 2017

#Wikidata - Librarians and Mrs Carla Hayden

As a group, librarians are not very visible. At the same time, librarians are the people who provided people with information before there was an Internet. In this day and age, they are still taking care that much of the published information is there for us and the generations to come.

Mrs Carla Hayden will address a Wikipedia edit-a-thon in Washington, DC hosted by the Library of Congress and the US National Archives. Mrs Hayden is the Librarian of Congress. It is always fun to update Wikidata information that is in the news.

It is amazing how little information there is for librarians. Of the "Librarians of the year", there seem to be only two with a Wikipedia article. Anyway, adding information for Mrs Hayden is a privilege and adding information on many librarians is easy to do.

What I am not sure about is if giving a lecture like the Jean E. Coleman Library Outreach Lecture may be seen as an award. Mrs Hayden gave this lecture twice.

#Quality - #DBpedia and Kappa Alpha Psi

Kappa Alpha Psi is a fraternity of students and alumni. There is a Wikipedia article in English, a Commons category and a Wikidata item.

The information about Kappa Alpha Psi at Wikidata is based on the Wikipedia article. Information was added to the items for the members. This was done because in a related item it was found that the influence of fraternities and sororities is considerable. Concentrating for a moment on Kappa Alpha Psi has a secondary quality impact on what is of primary concern, but when this is done for three such organisations, it quickly affects thousands of notable people.

When people find it of interest to add information about a membership to a Wikipedia article, it has some impact. Having a category helps more to make the relevance of Kappa Alpha Psi visible. Adding this information to Wikidata is easy, and it may show up in any language when membership information is part of a template.

DBpedia is a project similar to Wikidata. It harvests data from Wikipedias more consistently than Wikidata. Wikidata items are mapped to its internal items, making it possible to compare Wikidata with DBpedia.

When quality is an objective, when quality is to be improved effectively, the differences between DBpedia and Wikidata are one of the easiest and most obvious starting points. For some Wikipedias, DBpedia updates are based on the RSS feed of the changes. So once a difference has been curated and changed in either Wikipedia or Wikidata, it results in an improved DBpedia entry and the desired improvement in quality. It does not need any math to understand this.

What we need is a tool that uses these differences as input for a subset that is of interest to a Wikidata volunteer. That might be Kappa Alpha Psi, the Black Lunch Table or whatever, as long as it can be defined with a query.
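As a minimal sketch of what such a tool could do at its core, assume the facts from both sides have already been harvested into simple sets per item. The data shape, the function name `differences` and the placeholder ids below are my own assumptions for illustration; they are not how DBpedia or Wikidata actually expose their data.

```python
from typing import Dict, Set, Tuple

# Assumed shape: item id -> set of (property, value) pairs.
Facts = Dict[str, Set[Tuple[str, str]]]


def differences(wikidata: Facts, dbpedia: Facts, subset: Set[str]) -> Dict[str, Dict[str, Set[Tuple[str, str]]]]:
    """Report, per item in the subset, the facts that only one side knows."""
    report = {}
    for item in sorted(subset):
        wd = wikidata.get(item, set())
        db = dbpedia.get(item, set())
        only_wd = wd - db  # known to Wikidata, missing in DBpedia
        only_db = db - wd  # known to DBpedia, missing in Wikidata
        if only_wd or only_db:
            report[item] = {"only_wikidata": only_wd, "only_dbpedia": only_db}
    return report


if __name__ == "__main__":
    # Placeholder ids and values, purely illustrative.
    wikidata = {"item-1": {("member of", "Kappa Alpha Psi")}}
    dbpedia = {"item-1": {("member of", "Kappa Alpha Psi"),
                          ("educated at", "Some University")}}
    for item, diff in differences(wikidata, dbpedia, {"item-1"}).items():
        print(item, diff)
```

The subset is just a set of item ids, so it could come from any query a volunteer cares about; items where both sides agree are left out of the report.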

Sunday, March 26, 2017

#Wikidata - Gladys and Reginald Laubin

According to the documentation of the Capezio award, both Gladys and Reginald Laubin are awardees. The Capezio award is a dance award and it got some attention because a person of interest received the award in 2007. Wikipedia information was available until 2006.

Adding information for Mrs Laubin makes sense; she is as notable as her husband. She has her own VIAF registration and it completes the Capezio award information.

When you add an award and its awardees, some quality is expected. Adding what Wikipedia knows borrows from the sources at Wikipedia, but new information is authoritative when it is from the associated website. When you then seek later information, it becomes more fuzzy; it becomes less obvious. It may not even be correct.

That is however how the cookie crumbles; like Wikipedia, Wikidata also relies on the interpretation of sources.

Friday, March 24, 2017

#Wikipedia - Professor Joseph Torgesen

The article on Professor Joseph Torgesen is a stub. The cool thing is that the information on a minimal article allows for improvements in the data at Wikidata. The author of the article included information on education and employment. This was done through categories.

Petscan was used and as a result 244 staff members of Florida State University and 107 alumni of the University of Michigan were added, including Mr Torgesen.

As Mr Torgesen is a professor and "must" publish, finding a VIAF registration was possible. Adding the {{authority control}} template to the article enriched the article. One fact is not in the article: Mr Torgesen was awarded the Samuel Torrey Orton award in 2006. This is why there was already an item in Wikidata for Mr Torgesen.

Thursday, March 23, 2017

#Wikipedia vs #Wikidata - Quality and low hanging fruit

When Wikipedia is to be the best, it has to understand and preserve its quality. When Wikidata is to be the best, it has to understand and preserve its quality. Both Wikipedia and Wikidata are wikis but their quality and how it manifests itself are utterly different. At the same time they intersect and this is where we find low hanging fruit.

In Wikidata we have "Author"s and subclasses of author. Many of them have a VIAF identifier and this means that libraries know about them. Information like VIAF is shown in the English Wikipedia when there is an {{authority control}} template. It shows nothing when there is nothing to show but it will update Wikipedia when the information is added to Wikidata.

The low hanging fruit:
  • English Wikipedia - All articles about someone who is known as an author of any kind get the template.
  • Wikidata - For all the items for someone who is known as an author of any kind we seek the VIAF identifier.
  • OCLC - All the libraries in the world will be updated with a link to Wikidata within a month. This will make it easy for a librarian to find Wikipedia articles in any language.
  • Internet Archive - It has a project called "Open Library" and it has freely licensed e-books. Wikidata includes Open Library identifiers. OCLC and OL have links combined with Wikidata identifiers. As these numbers increase, people in libraries or from Wikipedia could find authors with free books.
  • other Wikipedias - they could include VIAF and OL identifiers as well. Open Library has books in languages other than English.
We live in an interconnected world. Wikimedia quality is in not being on an island but increasing the reach and enabling our readers.
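
To make the Wikidata part of this list concrete, here is a hedged sketch of a query that looks for people with a writing occupation who have no VIAF identifier yet. The helper name is hypothetical, and I am assuming the usual Wikidata identifiers (P31 "instance of", Q5 "human", P106 "occupation", P279 "subclass of", Q36180 "writer", P214 "VIAF ID"); please verify them before relying on the result. The function only builds the query text; it does not contact the query service.

```python
def authors_without_viaf_query(limit: int = 100) -> str:
    """Build a SPARQL query for humans with a writer occupation and no VIAF ID."""
    if limit < 1:
        raise ValueError("limit must be positive")
    # Doubled braces are literal braces inside an f-string.
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P31 wd:Q5 ;
          wdt:P106/wdt:P279* wd:Q36180 .
  FILTER NOT EXISTS {{ ?person wdt:P214 ?viaf }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    print(authors_without_viaf_query(25))
```

The printed text can be pasted into the Wikidata Query Service. A broad query like this may be slow, so a small limit or a narrower occupation is probably wise.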

Tuesday, March 21, 2017

#Wikidata and #activism

When you care about something, you want to make sure that when you do something, it has an impact. There are many ways a difference can be made: you can protest, you can write in a blog, you can write Wikipedia articles and you can try to connect things in Wikidata.

For Wikimedians like me, sharing the sum of all knowledge is why we are involved. As knowledge is key, it is important to make sure that facts are registered and access to knowledge becomes enabled.

The problem is that it is not obvious how and where a difference can be made. When the BBC gives diversity a prominent place because of its 100 women program, it seems obvious that we will write articles about these women. It is however not the first time that the BBC runs this program. We have written articles for women celebrated in 2013, 2014, 2015 and 2016. But in what language are these articles written? How much are they read? How well connected are these women to universities, to political parties, to organisations, and what countries are they from?

For a Wikimedian these are interesting questions. For an organiser of editathons they are what measures success. Is this activism? Sure. Does it affect the legitimate concern of impartiality? Not really, as Wikimedia has always been about what people fancy to work on.

Saturday, March 18, 2017

#Wikidata - the #Rome Prize

The Rome Prize is given to a large number of American artists. It is awarded every year to 15 artists and 15 scholars; they stay for an extended period in Rome. The first awards were given in 1905.

The award winners are mentioned in many articles; when there is no article yet, there is a red link. New articles are written all the time, so problems can be anticipated.

The problem is in names: different people bearing the same name. When new articles are written, there is no consideration for these red links. When an article is written for a Rome Prize winner, he or she may be included in the category for Rome Prize winners, and that works well.

Some will say that red links are bad. They have a point. However, it is all in the delivery. When there is no article, it does not follow that there is no information. The information could already be in Wikidata, and I added a few statements for 2016 winners.

Authors, the #OpenLibrary, #Wikidata and libraries

The Open Library is part of the Internet Archive. It makes books available for you to read. That is awesome and that is why Open Library is a natural ally of the Wikimedia community.

At our end we can do more of the things that we do anyway and share what we do. The good news is that Wikidata has a CC-0 license. The people at Open Library can use everything that we do and they do not even have to bother to say thanks.

When we add more Open Library identifiers and VIAF identifiers to Wikidata, we connect them, us and all the libraries in the world. Yes, individual libraries may have different ways of spelling an author's name, but by using these connections, disambiguation slowly but surely becomes a thing of the past for Open Librarians.

What will we have in return? All the books at Open Library of these authors become available to our readers and editors. We are already in the process of adding identifiers to Wikidata for Open Library. For all the authors that have been connected, we can provide our identifiers to Open Library. This helps them with their outreach and disambiguation.

Through Wikidata more and more authors become connected to VIAF. This allows the librarians of the world to share these freely licensed books with their readers. A clear win-win situation, don't you think?

Friday, March 17, 2017

#Wikimedia - Professor Chuck Stone, Tuskegee airman and member of Alpha Phi Alpha

Professor Stone is the founding NABJ President; he was included in the National Association of Black Journalists Hall of Fame in 2004 and he received the Congressional Gold Medal from President Bush.

The description for the Wikidata item for Mr Stone is "American air force officer". This will not change; it is based on a bot that at one time decided that this would do. The automated description is: "US-American journalist (1924–2014); National Association of Black Journalists Hall of Fame and Congressional Gold Medal; member of Tuskegee Airmen, Alpha Phi Alpha, and World Policy Council ♂" and the beauty is that this is updated as more information becomes available.

When you consider the quality of the information for Mr Stone in Wikidata, today 10 statements were added to the item. He has been added to the hall of fame with many others including some people Wikipedia does not know about. The World Policy Council is connected to Alpha Phi Alpha. The data is not complete; there is more to add.

When we consider quality, most of the data was added thanks to information available in the English article of Wikipedia. Yet there is information available that could find its way from Wikidata; how do we inform Wikipedia about the people who became part of the hall of fame, for instance? Quality for Wikidata is not in single items, it is in how it connects and how it is used. With this realisation we learn from where some say Wikidata and Wikipedia fail and achieve the success that our combined data offers.

Thursday, March 16, 2017

#Wikidata - Black Art

Charles Alston is one of the artists who are of interest to the Black Lunch Table. Mr Alston died in 1977. One of his struggles was to have his art appreciated in the same way as any other art. It is why he refused to be exhibited in William E. Harmon Foundation shows, which featured all-black artists in their travelling exhibits. Alston and his friends thought the exhibits were curated for a white audience, a form of segregation which they protested. They did not want to be set aside but exhibited on the same level as art peers of every skin color.

Today is 2017 and the BLT addresses this black experience and gains the same attention for black artists by writing in Wikipedia about them. It is why many artists with a black experience gain more information in Wikidata, artists like Mr Alston. The one thing where Wikidata differs from Wikipedia is that it is all about connections. The more a person is connected, the more relevant that person is in different settings. Mr Alston had a notable spouse, he was a founder and member of an art group, he studied and he worked. All these things are easy and obvious in Wikidata.

From an artist's point of view, other things are of relevance too: what awards did he gain, what museums have work in their collection and where did he exhibit? There is as yet no obvious way to make such claims. Like so many young men of his time, he was in the army, in the 372nd Infantry Regiment, but that is not quite what Mr Alston is about. This could be relevant for people who care about the military and also, the 372nd was a black experience as well.

Most articles on the English Wikipedia for a person have categories about education or work at a faculty. Adding the implied information for everyone is almost as easy as adding it for one person. It makes adding statements something of a black art, an art that looks complicated, an art that connects everything.

Tuesday, March 14, 2017

#Wikidata - Who is Eric D. Wolff?

Eric D. Wolff is one of three authors of a paper called "Original Issue High Yield Bonds: Aging Analyses of Defaults, Exchanges, and Calls". They won the 1989 Smith Breeden Prize and the Wikipedia article has a red link for Mr Wolff, no link for Mr Paul Asquith and a blue link for David W. Mullins, Jr.

The simplest thing to do is add an item for all the missing authors, connect them to the awards and be done. As they wrote a paper, it is reasonable to expect a VIAF registration and it was possible to find Mr Asquith.

The question is not whether Mr Wolff is notable; he is, as he won a prize. The question is how to reliably connect him and others to external sources. Making this effort improves quality for Wikidata; it is quality in action.

#Wikidata - actionable quality; Debora L. Silverman

Mrs Silverman is the 2001 winner of the Ralph Waldo Emerson Award. As Wikidata had only two statements for her, it was appropriate to add more information. The Wikipedia article is a stub but it had two categories: one for a university where she studied and one for a university where she worked. Adding such a fact to all the people in a category is relatively easy.

The Ralph Waldo Emerson award was given for "Van Gogh and Gauguin: The Search for Sacred Art". It makes Mrs Silverman an author and consequently there is a VIAF registration for her. Adding this has an effect when the {{Authority control}} template is available in the article. I added it to the Wikipedia stub and was pleasantly surprised with the WorldCat information from the OCLC.

It is wonderful to find such quality information provided as a consequence of having VIAF information in Wikidata. That is actionable quality!

Monday, March 13, 2017

#Wikidata #quality - is it actionable?

The Ernest J. Gaines Award for Literary Excellence is a great example to explain Wikidata quality. The item is linked to a Wikipedia article and it has several red links. For all the red links a Wikidata item has been created, and the winners for 2015 and 2016 are known only to Wikidata.

The Wikipedia article for the 2016 winner knows about the award. The article mentions the Sallie Bingham Award, an award that Wikidata does not (yet) know about. Wikidata knows about the VIAF registration for the winner; this is relevant because it means that the international libraries know about this author. The Wikipedia article mentions several universities that were attended; including them in Wikidata is easy and obvious. Doing so improves quality for both the author and the universities involved. The quality of Wikidata is equal to or better than that of Wikipedia when it knows about the same or more articles than a Wikipedia category does.

Several of the winners, including T. Geronimo Johnson, the 2015 winner, are "red links". The minimum needed for Wikidata is to know that he is male and the winner of the award. With a little bit of effort his VIAF identifier can be found. Consequently we know that the T. stands for Tyrone. Adding the VIAF identifier will show the Wikidata identifier in a month's time on the VIAF website, and it allows for quality checks in Wikidata.

Quality for Wikidata is different from quality for Wikipedia. It is less in traditional sources and it is more in connecting to sources like VIAF. When a Wikipedia, Wikidata and sources like VIAF are in agreement, a fact is verifiable and becomes more immune to "alternative facts".

When editing Wikidata, quality is in completeness, in combining information from multiple sources, in making Mr Johnson the 2015 winner by adding a qualifier. It starts however with making an effort.
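
For those who wonder what "adding a qualifier" looks like in practice, here is a small sketch that writes one line in what I understand to be the tab-separated QuickStatements (version 1) syntax: an "award received" (P166) statement with a "point in time" (P585) qualifier at year precision (the trailing /9). The function name is hypothetical and the Q-numbers in the example are placeholders, not the real items for Mr Johnson or the award.

```python
def award_with_year(person: str, award: str, year: int) -> str:
    """Build one tab-separated line: person, P166, award, P585, year-precision date."""
    for qid in (person, award):
        if not (qid.startswith("Q") and qid[1:].isdigit()):
            raise ValueError(f"not an item id: {qid!r}")
    # /9 marks year precision in the assumed QuickStatements time format.
    date = f"+{year:04d}-01-01T00:00:00Z/9"
    return "\t".join([person, "P166", award, "P585", date])


if __name__ == "__main__":
    # Q111 and Q222 are placeholders; look up the real items first.
    print(award_with_year("Q111", "Q222", 2015))
```

The point of the qualifier is that the year belongs to the award statement itself, so the same person can hold several awards with different dates without confusion.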

Sunday, March 12, 2017

#Wikidata - Maren Hassinger is on the "Black Lunch Table"

Maren Hassinger is a sculptor born in Los Angeles. She was awarded both the Anonymous Was A Woman Award and the Women's Caucus for Art Lifetime Achievement Award. In addition to this there is a Wikipedia article.

When you read the article, all kinds of statements are made that could be reflected in Wikipedia categories for the article. In this case statements in Wikidata were made.

The outward appearance is that Wikipedia and Wikidata are two distinct projects. Wikidata however has always included data from Wikipedia and there has always been a realisation that Wikipedia in its turn could benefit from Wikidata; generating category entries should be possible for instance.

When you consider the immediate future of the Wikimedia projects, Commons will be wikidatified. One part of the information that is directly related to GLAM activities is registering the museums that include an artist in their collection. This is applicable for many artists that are part of the Black Lunch Table, including Mrs Hassinger. So the question is: should we include such information in Wikidata and how should we do this?

Saturday, March 11, 2017

#Wikidata - Historical amnesia

A discussion about contemporary politics is not based on facts, nor on the interpretation of facts; it is much more based on identity and what group you belong to. It is important to politicians to frame their message and much of this framing is done through a selective use of facts and the presentation of opinion as facts.

The Wikimedia community is not about politics except where facts are concerned. Facts matter; for instance, Mrs Clarissa Sligh cares about "historical amnesia"; read her website and see what is meant. Mrs Sligh qualifies, as far as I am concerned, as a "Black Lunch Table" candidate. The Black Lunch Table is about artists from the African diaspora, and giving attention to them is a project that aims to lessen the diversity gap that exists.

In contemporary anti politics it is relevant that facts are available. All the Wikimedia projects are political in that they deny any singular political message their limited view on facts. It is important to overcome the bias of the demagogues and pundits and bring together information that paints a difference.

Friday, March 10, 2017

#Wikidata - dating awards

Many awards are dated using "point in time". With a query you can count them, certainly when you are WikidataFacts. Looking at a graph like this, you will see that many awards for 2016 and 2015 are still missing. Many of these awards were imported in the past and have not been updated yet.
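
A count like this can be sketched as follows. The helper name is my own, and I am assuming the usual identifiers P166 ("award received") and P585 ("point in time" used as a qualifier); this is not necessarily the query WikidataFacts used. The function only builds the SPARQL text, and a count over all awards may well time out on the public query service, so restricting the years or the award is sensible.

```python
def awards_per_year_query(first_year: int, last_year: int) -> str:
    """Build a SPARQL query counting award statements per 'point in time' year."""
    if first_year > last_year:
        raise ValueError("first_year must not be after last_year")
    # Doubled braces are literal braces inside an f-string.
    return f"""
SELECT ?year (COUNT(?statement) AS ?awards) WHERE {{
  ?person p:P166 ?statement .
  ?statement pq:P585 ?date .
  BIND(YEAR(?date) AS ?year)
  FILTER(?year >= {first_year} && ?year <= {last_year})
}}
GROUP BY ?year
ORDER BY ?year
""".strip()


if __name__ == "__main__":
    print(awards_per_year_query(2000, 2016))
```

When the counts for the most recent years drop sharply compared to earlier years, that is a hint that awards were imported once and not updated since.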

It would be cool if we knew which Wikipedias have an article for the awards, and if they would be "pinged" when new values are known.

Thursday, March 02, 2017

#Politics and 33% fewer #HIV infections in the #UK

Professor Sheena McCormack studied the efficacy of PrEP in the United Kingdom. She headed a major NHS study to ascertain how effective the drug was, and who should be given it. The study was called PROUD. Greg Owen was too late to enrol; buying the drug privately would have set him back £500 per month, money he could not afford.

He could score some of these pills but before he started, he found he was already HIV-positive. He posted his story on Facebook and was inundated with questions: what is PrEP, where can I get it? At some stage he remembered that medicines can be had from the Internet from countries where medicines are more affordable. Unbranded PrEP is available for £50 per month.

Greg informs people on his blog. Professor McCormack was instrumental in helping set up clinics that monitored the use of these unbranded medicines. It was based on the assertion that doctors are responsible for the care that they provide. Helpful friends monitor the supply and indicate what websites provide the correct substance.

The National Health Service meanwhile did not want to fund the use of PrEP in 2016. As a result more and more people became aware of PrEP and learned about the alternative. In August of 2016 the NHS lost its case in the High Court. As a result the NHS is doing a "test" for three years starting this summer for 10,000 people, ignoring the 33% fewer HIV infections because of PrEP.

When Wikimedians talk about politics, having no article on Professor McCormack and on Greg Owen is relevant. With all the publicity on this case, where is the neutral point of view in this? It is important because it highlights the cost of medicines as a determining factor on who lives and who does not. In Europe many people can afford £50 a month, but in many other countries it is out of reach and cannot make the difference it makes in Europe. According to the United Nations, we can end the HIV/Aids epidemic by 2030 and then Mr Trump happened.

It is political because it provides clarity in a time where companies like Mylan make medical care too expensive. It provides clarity when the US government insists on taking away medical insurance from people.

It is political and all too often Wikipedia does not inform. We know that 12% of the 2500 most sold prescription drugs are not effective (source: British Medical Journal) and we do not even register this on those drugs. Wikipedia is the prime source of information on medical matters and in my opinion we are negligent.