Wednesday, April 30, 2014

#Wikidata - #Marwari, a language from #India

Thanks to new #MediaWiki functionality, Wikidata can support languages that are not available elsewhere. It is a consequence of a change in the policy of Wikimedia's language committee. For languages that are "eligible", which in essence means that they are recognised in ISO 639-3, people can ask for recognition of their eligibility.

The way it works is that you can add the language to your Babel template. This will make sure that labels will show in Wikidata. You can also change your user preference to rwr (Marwari), and it will then be the default language in which the labels in all statements are shown.

Reasonator is however the easiest way to work on Marwari. It will show labels in Marwari, and where these are missing it shows labels in the specified fall-back languages. This makes it convenient to add missing labels. One prerequisite however is that Widar has been set up properly.

#Wikipedia - the case for and against #categories

Wikipedia categories are lists that people maintain by adding a line at the end of an article, one article at a time. Categories can be part of categories and therefore you can have one for Paleontologists and one for Women paleontologists. In order to be a female paleontologist, you have to be in both categories.

Wikidata knows about categories, females and paleontologists. With Reasonator you will find information about all three, including what Wikidata expects to find in the categories of Paleontologists and Women paleontologists. The numbers of entries are not the same; a human with an occupation of paleontologist who is female will make it into the list presented by the Reasonator.

The case for categories is a simple one. It is the place where we harvest data from for Wikidata. Until we have something better, it is the place where we will have to come back to.

Tuesday, April 29, 2014

#Wikidata - #Brown University faculty

Gerald Guralnik died recently. He is credited as one of the discoverers of the Higgs boson and the Higgs mechanism. With someone that special, it makes sense to do something different, something extra.

At Wikidata a lot of time goes into adding the alma mater of subjects. For Mr Guralnik his time at Brown University was added. The time people work for a university or for a research institution is at least as relevant as the time when they studied.

For Brown University there is a category with many of its faculty members. Now that one university has been so well served, there is plenty of room for other universities to be served in a similar way. In Wikidata you find how to set up the query that shows the data in the Reasonator.

#Wikidata - #elections for the Lok Sabha in #India; candidates

Many people have already voted and many more will do so soon. Everybody who is a candidate in the election is still a candidate. At Wikidata we know about 581 candidates and given that there are 543 seats in the Lok Sabha, many candidates are missing.

It could be because the articles were not included in one of the categories of candidates. It could be because they do not have an article. All these candidates are notable in a very real way; elections are expensive and only candidates that have an appeal are fielded by the political parties.

To make this picture more complete several things can be done. For each candidate without a Wikidata item, an item can be created. The candidate is a human, a citizen of India and a candidate. Candidates compete in an electoral district, typically for a political party.

Yes, we need labels for all of them in all the languages of India as well for best effect.
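
The statements described above can be pictured as plain data. What follows is only a minimal sketch, not how any tool actually creates items. P31 (instance of), Q5 (human), P27 (country of citizenship), Q668 (India) and P102 (member of political party) are existing Wikidata identifiers; the function name, the two properties marked "hypothetical" and the example values are my own placeholders.

```javascript
// Sketch of a candidate item as plain data (assumed shape, for illustration only).
function buildCandidateItem(name, district, party) {
  return {
    label: { en: name },
    statements: [
      { property: "P31", value: "Q5" },    // instance of: human
      { property: "P27", value: "Q668" },  // country of citizenship: India
      { property: "P102", value: party },  // member of political party
      // The two entries below are hypothetical placeholders, not real property ids
      { property: "candidate in (hypothetical)", value: "Lok Sabha election of 2014" },
      { property: "electoral district (hypothetical)", value: district }
    ]
  };
}

// Example with placeholder values
const candidate = buildCandidateItem("Example Candidate", "Example District", "Example Party");
console.log(candidate.statements.length);
```

Labels in the other languages of India would simply be more keys next to `en` in the `label` object.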

Monday, April 28, 2014

#Wikidata - #diversity

The Premio Nacional Eugenio Espejo is the national prize of Ecuador. One of the recipients, Mr Theo Constanté, died yesterday and consequently this fact was registered at Wikidata.

It is the aim at Wikidata to provide enough diversity in the coverage of its subjects and the quality of its information. The Premio Nacional Eugenio Espejo was not really known; Mr Constanté was the first known recipient so a bit more information was added to this award.

The winners of 2013 were added as well. Only one of them was known as a "human". The award was established in 1975 and it seems obvious that when Ecuador's finest get less than sterling attention, most notable people from Ecuador do not get proper attention.

When Wikipedia, and Wikidata for that matter, are inclusive enough, when enough information is available about Ecuador, they will get more attention. More readers and, as important, more editors.

Sunday, April 27, 2014

#Wikidata - Mr J. Djanogly, MP for Huntingdon

The electoral systems of the UK, the USA and India have in common that they work with "constituencies" or "electoral districts". At election time one constituency chooses one representative, and Mr Djanogly has been the representative for Huntingdon since 2001, succeeding Mr John Major.

As you can expect, Wikidata knows about the districts and the persons involved. That does not mean that it brings things together. For neither Mr Major nor Mr Djanogly was it known that they held the office of MP, nor was it known that they represented Huntingdon.

This type of election is personal; at election time it matters who the candidate is. A candidate may run repeatedly. This is quite different in a representative system where people vote for parties and where nobody in particular represents a specific area.

Mr Major and Mr Djanogly are now registered as MP in Wikidata. The challenge is how to include the elections and representatives in constituencies like Huntingdon. Obviously a lot of work needs to be done to mark every MP for the office that he or she held.

Friday, April 25, 2014

#ToolScript - Where is that script

The best tools are the tools that are used. As they are used, they get maintained. More functionality gets added and, at some stage, it breaks the ability to add more. This is why ToolScript brings tools together; it uses the output of one tool as the input for the next. It brings us all the dead of 2014 that are NOT known as dead in Wikidata.
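
The idea of feeding the output of one tool into the next can be sketched in a few lines. This is not ToolScript itself and it does not use its API; the `pipeline` helper, the two filter steps and the three pages are hypothetical stand-ins that work on in-memory data only.

```javascript
// Each "tool" is a function that takes a list of pages and returns a list of pages.
function pipeline(...steps) {
  return input => steps.reduce((pages, step) => step(pages), input);
}

// Hypothetical sample data standing in for the members of a "2014 deaths" category
const pages = [
  { title: "Person A", item: "Q-A", dateOfDeath: "2014-01-26" },
  { title: "Person B", item: "Q-B", dateOfDeath: null },
  { title: "Person C", item: null, dateOfDeath: null }
];

// Step 1: keep only the pages that have a Wikidata item
const withItem = list => list.filter(page => page.item !== null);
// Step 2: keep only the items that lack a date of death
const missingDateOfDeath = list => list.filter(page => page.dateOfDeath === null);

// The output of step 1 becomes the input of step 2
const deadButNotMarked = pipeline(withItem, missingDateOfDeath);
console.log(deadButNotMarked(pages).map(page => page.title));
```

Because every step has the same shape, a new step can be added to the chain without touching the existing ones.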

One problem is that you have to copy a script into the ToolScript every time. This proved to be a major nuisance and as Magnus eats his own dog food, a solution was found by importing scripts from "Pastebin".

The new script in Pastebin is a big improvement over the last one. It indicates the major variables at the top: the category and the language that are considered, and the missing property in Wikidata. Even I can use it and change it for my purposes.

Other people, like Amir, could take this even further when the result of ToolScript is in the format of an API; this will enable him to develop all kinds of wonderful stuff on top of the results.

Thursday, April 24, 2014

Adrianne.. "we can edit"

Adrianne is one of the few Wikipedians who has her own articles and, deservedly so. It is sad that she had to die to get them and it is sad to see that some do not recognise her notability and want her article deleted. They cite policy. It used to be policy to "ignore all rules".

Never mind, but I do. That someone who played a professional game of soccer just once is "notable" enough for an article is proof enough for me that Wikipedia has some dumb notions of notability.

To honour Adrianne editathons are organised and everybody is invited to join. To quote the convocation:
Her work is recognized internationally as helping to encourage more women to contribute to Wikipedia to tackle the gender gap and systemic bias in its content. Wadewitz was one of the first academics to bring Wikipedia into the classroom as part of the Wikipedia Education Program, working with her students to improve Wikipedia instead of writing traditional term papers. 
As anyone can join, we have done some work adding more humans to Wikidata and indicating their sexes. As Wikidata gains more data, its data will reflect more accurately what the sex ratio is for any given Wikipedia. There are 1.45% more humans, 1.13% more men and 1.15% more women since last Sunday. It was suggested to harvest information from the Frau and Mann categories of the German Wikipedia, and they prove to be a rich resource of data.

Obviously only knowing the gender of a human is not really interesting. What is interesting is to know the sex ratio of the recipients of **Whatever** award or of the professionals in a field. Or the sex ratio of professors from 1960 to 1970. At this time Wikidata does not have enough data to have a clue. As it gains more data, the results slowly but surely become statistically significant. I am sure some statisticians are able to say when Wikidata has enough data.

I am sure Adrianne would applaud this development.

Wednesday, April 23, 2014

#Wikidata - More 2014 deaths

When you are interested in which notable people died in 2014, you will find that Wikidata is more complete than any Wikipedia. The ToolScript tool provides a wonderful opportunity to add even more people to this list.

Sadly for many people recognition as being notable is left for the moment when they die. Then again, it is a perfect moment to write a Wikipedia article as an obituary. Magnus wrote yet another script that creates an item for the articles that did not have one yet.

This is the script that finds all the items that are in need of some attention:
// Start with an empty list of Wikidata items to collect the results in
all_items = ts.getNewList('','wikidata');
// Take the Italian category for people who died in 2014 and load its Wikidata item
cat = ts.getNewList('it','wikipedia').addPage('Category:Morti nel 2014').getWikidataItems().loadWikidataInfo();
// Visit the equivalent category on every wiki that this item links to
$.each ( cat.pages[0].wd.sitelinks , function ( site , sitelink ) {
  var s = ts.getNewList(site).addPage(sitelink.title); // Page list for that site, with category
  if ( s.pages[0].page_namespace != 14 ) return ; // Not a category
  // Keep the items in that category tree that lack a date of death (P570)
  var items = ts.categorytree({language:s.language,project:s.project,root:s.pages[0].page_title,redirects:'none'}).getWikidataItems().hasProperty("P570",false);
  all_items = all_items.join(items);
} );
Yes, ToolScript comes with documentation :)

#Wikidata - Mr Ramashankar Rajbhar, member of the Lok Sabha

The elections in India are underway. They will select representatives for the constituencies for the Lok Sabha. Mr Rajbhar is an incumbent and, he represents Salempur in Uttar Pradesh.

When people are to vote, it makes sense to know who they vote for. Once the results of the elections are declared, it helps when information is available about the people who represent them.

As you may know, it takes considerable effort to write articles and for many if not all representatives there is no article in all the languages of India. For Mr Rajbhar there is only an article in English; there are no labels in any of the languages of India.

When labels are added, these politicians can be found in their own language in Wikipedia. It is a first step. To know who represents a district, someone has to add this information. Most districts are known at Wikidata; add labels and they can be found as well. Add statements about who represents the district and people will know who to write to.

Tuesday, April 22, 2014

#Wikipedia - the search for ఈలా గాంధీ in Telugu

Mrs Ela Gandhi or ఈలా గాంధీ in Telugu does not have an article yet in the Telugu Wikipedia. This does not mean that she is not of interest to the people who seek information about her.

Now that the Telugu Wikipedia has added this one line to its common.js (it is at the bottom), people will be able to find her when they search for her. At the bottom you find the search results from Wikidata.

I think Mrs Gandhi already looks smashing in the Reasonator in Telugu :) What is missing is a label for one of the political parties of South Africa.

#Export of a #Wikidata #query

AutoList is the tool where you can operate on the results of a query based on Wikidata data.

It is quite magical that you can now actually download a file with Wikidata data. In this example, you find 22 of the more than 3000 people who Wikidata knows to have died in 2014.

Downloading the data is a new feature that was created so that people do not have to copy and paste values to a spreadsheet any more. No more typos and no more copy errors and, best of all, it is so convenient.

#Wikidata - Rulan Chao Pian and the Otto Kinkeldey Award

Mrs Rulan Chao Pian was a distinguished musicologist. She was one of the first female professors at Harvard University and, she received the Otto Kinkeldey award in 1968.

Currently there is only one recipient of the Otto Kinkeldey award. There is no Wikipedia article for it and finding more recipients is not easy. Only 20 Harvard University employees are known; obviously calculating a sex ratio based on these numbers gets you an irrelevant number.

What Wikidata needs are people who are eager to provide us with the data they care for. When Sarah Stierch asks "are there lists of things I could work on", my answer is: "work on the things you know, the things you love and even, the things you get paid for".

The beauty of Wikidata is that as long as statements are verifiable, it does not matter much who adds them. I have no problem with a Harvard intern adding its professors as "employees" and adding pertinent information like when they joined Harvard or became tenured professors. I welcome people from the American Musicological Society when they add all the winners of its awards. Obviously, the Russian, the Chinese and the Indonesian counterparts of the AMS are equally welcome.

When you are involved in a certain field, please make sure that your field is well represented. With tools like Reasonator, WDQ and AutoList you can find how well it is represented at any time. When you find that additional properties are needed in Wikidata, we can discuss this. But do get involved.

Monday, April 21, 2014

#Reasonator - the premier of New South Wales

When you consider #quality, there are many approaches to it. When Mr Neville Wran died, it was the right moment to mark everyone who held the office of premier of New South Wales as such. That is the quantitative approach to quality.

Making sure that there are pictures of the places he is associated with, or indicating who preceded and succeeded him, provides more in-depth information and that is a quality as well.

It is good to pay some attention to the incumbent of a function. Mr Baird assumed the office on April 17th. The information about him got some tender love and care.

Funny is the text associated with his predecessor: "43rd and current Premier of New South Wales". It demonstrates the problem with fixed text nicely. With automated descriptions that is not much of a problem; they do the trick in any language given enough labels.

Sunday, April 20, 2014

#Wikidata - its sex ratio

In a perfect world, Wikidata knows the sex for each person where Wikipedia has pertinent information; every Wikipedia. In a perfect world you query Wikidata for the sex ratio of each Wikipedia.

As we know, the world is not perfect; Wikidata currently knows about 1,332,383 "humans", of whom 760,616 are male and 154,455 are female. This makes for 57% males, 12% females and 31% unknowns. Many items still need to be identified as human as well.
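
The percentages follow directly from the three counts. A minimal sketch of the arithmetic, using the numbers quoted above; the function name `sexRatio` is my own and rounding to whole percentages is an assumption about how the figures were produced.

```javascript
// Share of male, female and unknown among all items marked as human,
// rounded to whole percentages.
function sexRatio(total, male, female) {
  const unknown = total - male - female; // humans without a recorded sex
  const pct = n => Math.round((n / total) * 100);
  return { male: pct(male), female: pct(female), unknown: pct(unknown) };
}

// Counts as quoted for all of Wikidata
console.log(sexRatio(1332383, 760616, 154455));
// → { male: 57, female: 12, unknown: 31 }
```

Running the same function again at a later date is all it takes to track how the ratio evolves.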

With a selection like the 12,800 known Harvard alumni, we find that there are 5,359 males and 840 females. This is 42% male, 7% female and 51% unknown. Before we compiled these numbers, missing items were created for each known alumnus and all of them were marked as human and as a Harvard alumnus as well.

The problem Wikidata faces is not only the under-representation of women; it is also the lack of data about the gender of known humans. The nice thing about statistics is that now that we have some numbers, we can track how Wikidata evolves in its information about the sexes.

Saturday, April 19, 2014

#Sources for causes

What to do when #Wikidata tells you there is a problem? Keep calm and cite your sources.

For a year now, people have been pouring data into Wikidata and there is a lot of it. This data is coming from many places; among them all the Wikipedias. They do not necessarily agree on everything.

One area where it particularly makes sense to cooperate is the recently departed. Many of the people who were notable enough for an article are old and, they die. They die in droves.

As some people are described in several languages, you may find that those other Wikipedians knew about it first. So you may learn about even more deaths in the ranks. Another thing that happens is that people enter a different date... OOPS...

This is where you keep calm and cite your sources. People only die once, so this is the time to be assertive about your sources.

Wikidata is at this time happy when you sort it out, get it right and update its data accordingly. Adding sources is really appreciated but at this time we are mostly happy when you concur that we have the same data.

#Wikidata - Heroes of the Soviet Union

On the Russian Wikipedia, there are 10898 entries in the category for heroes of the Soviet Union. Only 9740 of them have a Wikidata item. With the Creator tool it is easy to add the missing 1158 items. It gently adds them one at a time.

Adding statements for over 10,000 heroes is a bit too much for the AutoList tool. There are several edits to make. First, all the people have to be marked as human and then they have to receive the recognition in Wikidata for the hero they are.

It is much better to use a bot for this. What clinches it is that many people on the Russian Wikipedia have a template with much more information than just this one award: things like dates of birth and death, places of birth and death, and other awards they have received.

The Russian Wikipedia is a really rich resource and it will be wonderful when more of its information is reflected in Wikidata.

#Wikidata - Eli Saslow, Pulitzer prize and George Polk award winner

No #Wikipedia article for Mr Saslow yet though. Some work is done on the George Polk award and, it was found that among many others, Mr Saslow was missing.

To demonstrate the potential for quality of Wikidata, missing winners were added. Some of the issues that were found were:

  • people do not have an article
  • people are not part of the George Polk awards recipients category
  • people do not have a Wikidata item
  • some of the recipients are not people
Mr Saslow is a great example of a person who you would expect to have a Wikipedia article. But given the way the community works, he will get one once someone feels the need to write it.

When journalism and sharing information are important to you, consider this: the Pulitzer Prize for Explanatory Reporting currently has only one recipient: Mr Saslow. His alma mater has three alumni; two more were added for him not to be alone. His employer, a major quality newspaper, has one employee...

But still, the fact that Wikidata does know these things demonstrates that its quality is improving.

Friday, April 18, 2014

#Wikidata - awards and politics

Edward Snowden received the Ridenhour Truth-Telling Prize. For some Mr Snowden and the Ridenhour prize may be controversial. However, it does not mean that they are irrelevant or not notable. The award has been added to Wikidata together with many of its recipients.

Several of the recipients do not have a Wikipedia article and that is fine. This may change. The Ridenhour prize was named after a Mr Ridenhour. He was a journalist and he received the George Polk Prize. This was not obvious because in the article there was no reference to the category. This has been remedied.

The world is an imperfect place and we can improve it by cherishing the people who matter. By stating the obvious, by sharing in the sum of all knowledge.

PS This is the George Polk Award Recipients category and, this is its Reasonator entry.

#Reasonator - #Taxonomy, picture this

When you are on a train, a bit bored, it helps when your railway company provides you with free Wi-Fi; mine does. It is the perfect setting to add pictures of species to Wikidata. To do this the latest tool by Magnus is really good.

The "Wikidata species images on Commons" tool is the perfect companion for those idle moments. All you do is look at pictures and decide which picture shows off a species best. When you do not like any of them, that is fine too. You also have the option to add range maps for a species.

The result of all this can be experienced in the Reasonator.

Thursday, April 17, 2014

Baglama - #statistics that make a difference, that demonstrate impact

Arguably the #GLAM partners of the #Wikimedia Foundation are its most relevant partners. They share through us a wealth of imagery and data. Among other things they help us illustrate our articles.

What they provide us with is high in numbers and high in quality. It is dropped in Commons and, it is then for the Wikimedia communities to make use of this wealth.

Baglama, one of Magnus's finest tools, has had a refresh and it shows again what impact any given collection has. It shows the page views they generate. This information is vital; how else can you prove that providing us with freely licensed media files makes a difference, gains a public? What better way to convince that we make a difference?

The Tropenmuseum for instance has much of its impact in Indonesia. Its collection is used more on the Indonesian than on the Dutch Wikipedia. 

What the Tropenmuseum collection proves is that Western museums can have a big international relevance. When they collaborate with us, they have an impact in the countries where their collection originated. This message is as relevant to the Wikimedia communities as it is to the GLAM institutions. We could and should care more about those collections that are not "local". Collections that are typically underfunded.

#Wikimedia #Commons - Mr Tadeusz Łobos

The picture that illustrates the article about Mr Łobos is licensed under the CC-by-sa license. Consequently it is a picture that should be available for use in Wikidata. It is not, because it was not uploaded to Commons.

When a picture like this one is to be made generally available, it has to be uploaded again to Commons. It is a hassle and another round of bureaucracy will decide if this picture may remain at Commons.

There are many pictures that should be generally available and are not. In a similar way there are many pictures that are considered to be not generally available and as a consequence are no longer available at all.

All media files are equal in that they are in need of the same quality of meta-data, and they are all equal in that their copyright and license situation is the same, never mind where they have been uploaded.

The consequence is that the scope of a Wikidata approach to media files should not restrict itself to Commons.

Wednesday, April 16, 2014

#Reasonator - Greene County, New York

Greene County is just another county in just another US state. In it you find towns, villages and "census designated places". The first line of contact for all kinds of administrative issues is the county.

For a long time, it was the state that was considered the right level for "is in the administrative-territorial entity"; in this case that is New York. This has been remedied and at the same time many other villages, towns and whatnot have been associated with the lowest level of administrative-territorial entity they have to deal with.

When you reason like the Reasonator, you can go up the food chain and find the state and the country that a place like Climax has to deal with. It is just one way the Reasonator makes information for you out of the Wikidata data.
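
Going up the food chain amounts to following the "is in the administrative-territorial entity" link until there is nothing left to follow. The sketch below does this on a small in-memory table; it is not how the Reasonator is implemented, and the table, the function name `administrativeChain` and the direct link from Climax to the county are simplifying assumptions of mine.

```javascript
// Illustrative data: each place points to the entity it is located in.
const locatedIn = new Map([
  ["Climax", "Greene County"],
  ["Greene County", "New York"],
  ["New York", "United States"]
]);

// Follow the links upwards and collect every entity on the way.
// The "seen" set guards against accidental loops in the data.
function administrativeChain(place, parentOf) {
  const chain = [];
  const seen = new Set([place]);
  let current = parentOf.get(place);
  while (current !== undefined && !seen.has(current)) {
    chain.push(current);
    seen.add(current);
    current = parentOf.get(current);
  }
  return chain;
}

console.log(administrativeChain("Climax", locatedIn));
// → [ 'Greene County', 'New York', 'United States' ]
```

Only the lowest level has to be stored for each place; the state and the country are derived.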

#Wikidata - an old soldier from Russia

My interest in Mr Styrov is that he died in 2014, on January 26 to be precise. In the info-box of the article that is dedicated to him, it is just one of several facts. Among them are the rank of Rear Admiral he held in the Soviet Navy and the many awards he received. Among those is the Order of Lenin.

With Mr Styrov added as someone who died in 2014, we are closer to the point where Wikidata knows all the people. When we are, we can make a report that will signal the latest people who are known to have died in the last year.

This can be of interest to people who want to know things like:
  • Do we know that this person who has an article on our Wikipedia has died
  • Fellow countrymen who died and do not have an article on our Wikipedia
  • Which persons are notable enough to have an article on our Wikipedia
When such reports become available, the data in Wikidata gains a purpose. Some may see it as getting Wikipedia to use the data by the back door, but isn't it there to be used?

#WMhack - last year's demo

At last year's hackathon, one highlight was the map showing the history of Islamic states. It makes clever use of Wikipedia and it makes information available in Arabic and English.

This year the hackathon will be in Zurich and one of the main subjects is maps and how they relate to Wikidata. The history of Islamic states is relevant in all languages and it would be cool if we could update this application to make use of Wikidata.

In this map, all the different states have an overlay. There must be a map for each state at each interval. This app shows the different rulers at a time.

When we have such overlays available to us, we can do more than show the rulers, the centres of power. We could show the battles, the wars, the conquests. They happen in between the changes of the maps.

Maps that show how areas grow and wane over time are another area where Wikidata can be a real help, if only because it links to information in so many languages.

#Wikidata, #maps and #sprites

Maps are static but many maps define something that is very much alive. Take the "raid on the Medway". It shows two flotillas and it indicates how they move. It does not show anything happening on the other side of this battle. It does not show positions.

A sprite is a two-dimensional image or animation that is integrated into a larger scene. It can be put on a map. Picture this, the same map of the raid on the Medway and two little sprites moving in time along the indicated lines.

Technically it is not much of a challenge. It becomes interesting when it starts moving on a real map, an OpenStreetMap for instance. Add the moments when we have images that show scenes of a battle, an occurrence and we get something that becomes relevant.

Technically it seems doable. It seems like a challenge that brings together the parts that already exist. It will be really exciting when it brings Wikidata, Commons, OpenStreetMap together. It will help us explain about events. It is part of sharing the sum of all knowledge.

Tuesday, April 15, 2014

#Wikidata - The dead at the #Arabic #Wikipedia

As the #quality of Wikidata can be measured, it is important to be inclusive when what is measured are the people who died in 2014. People die in every country like this gentleman from Saudi Arabia.

According to the article, he died Thursday, 12 May 1435 AH, that is the same as 13 March 2014. Google translate transliterates the title as "Zaid bin Mohammed portal". That is enough reason not to use the transliteration as the title for the item.

When you read the article it is about Mr bin Mohammed and does not give the impression of a portal page. That is for someone who understands the Arabic Wikipedia to fix.

Sunday, April 13, 2014

#Wikidata - ประนอม รัชตพันธุ

When #Quality is the objective and when quality is to be measured, it helps when there is something that demonstrates how Wikidata provides quality. The 2014 deaths provide a great opportunity; currently we know about 2108 people who died.

Making this list complete is not always that easy. The lady whose portrait you see has an article on the Thai Wikipedia; it indicates that she died and, with Google Translate, I find the following:
Born October 1, 2457
Province, Thailand
Died 17 January 2557 (99 years).
Thailand Nationality
The Thai dates have to be converted to Gregorian dates; it is great that there is functionality around that helps with the necessary conversion. The question is whether our Thai users can enter their dates in the Thai format.
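
The conversion itself is simple: years in the Thai (Buddhist Era) calendar run 543 ahead of Gregorian years. A minimal sketch, with function names of my own choosing; it converts the year only and, as far as I know, dates in January to March before 1941 need extra care because the Thai year used to start on 1 April.

```javascript
// Buddhist Era years are 543 ahead of Gregorian years.
// Assumption: the date falls in a period where both calendars start the year on 1 January.
const BUDDHIST_ERA_OFFSET = 543;

function buddhistEraToGregorianYear(beYear) {
  return beYear - BUDDHIST_ERA_OFFSET;
}

// Age in whole years at the time of death, from { year, month, day } objects.
function ageAt(birth, death) {
  const hadBirthday = death.month > birth.month ||
    (death.month === birth.month && death.day >= birth.day);
  return death.year - birth.year - (hadBirthday ? 0 : 1);
}

// The dates quoted above: born October 1, 2457 and died 17 January 2557
const born = { year: buddhistEraToGregorianYear(2457), month: 10, day: 1 };
const died = { year: buddhistEraToGregorianYear(2557), month: 1, day: 17 };
console.log(born.year, died.year, ageAt(born, died));
// → 1914 2014 99
```

The result agrees with the "(99 years)" in the translated info-box.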

Saturday, April 12, 2014

#Wikidata - George Bookasta

Mr Bookasta has an article on the English Wikipedia. Until recently there was no item for him on Wikidata. With a new tool by Magnus, the Creator, items were added for all articles that are in the categories of people who were born or who died after 1850.

Since then two interlanguage links were added for Mr Bookasta. Thanks to a query we can track who is added to the category of 2014 deaths and is not known to be dead in Wikidata.

As all the people who died in 2014 have their deaths noted, it is only a matter of keeping up. The article for Mr Bookasta is small but includes many bits of information that can be added to Wikidata. One thing that I did not add is that he was also a big band leader.

Friday, April 11, 2014

#Wikidata and naval history

The answer of what a GLAM hopes to find in Wikidata can be surprising. "We have several paintings of the Raid on the Medway and, we would like to find data about that battle and be able to place our paintings as illustrations in the sequence of events".

One Dutch admiral, Michiel de Ruyter, has a painting hanging near paintings of the battle and it would be great when he can be associated with these events as well.

When you have a look at the info-boxes of the battles that were fought as part of the second Anglo-Dutch war, you will find the commanders of the opposing sides. You will find that the treaty of Breda ended this war.

The question is very much how to include all these facts in Wikidata. When we do and when we include information on the many historic events that have an item, we become extremely valuable to our partners.

Wednesday, April 09, 2014

#Quality - the dead as we know them

Both the German and the English Wikipedia have a category that includes everyone that died in a particular year. Currently they include 724 and 1530 people. Obviously people died that do not have an article in either Wikipedia.

All the dead registered in Wikidata for 2014 account for 1724 people. It demonstrates some quality because the death of more notable people is known than in either or both Wikipedias.

There are some Chinese and some Russian people among them. Some people from Spain, Chile and Argentina. From India, the Cook Islands and Sweden. Many notable people from these and other countries are missing.

When they are added, they will show up on the date of their death. When you look at today, April 9 2014, there are currently two people known to have died. One is from Serbia, the other is from Trinidad and Tobago. This number will grow. Showing it in the Reasonator is already a good thing. A next step would be to create a tool that will inform all projects about the people that have died in the last year.

Quality is measured by comparison; it becomes valuable when the information can be acted upon.

Tuesday, April 08, 2014

#Wikidata - Nordic Children's Book Prize

The Nordic Children's Book Prize has been awarded to people from Iceland, the Faroe Islands, Denmark, Sweden and Norway.

The article for the prize exists in many languages including English. They all list the winners for the prize and the list is incomplete for all of them but not necessarily in the same way.

Wikidata knows about 23 winners including the 2013 winner. She was known in Wikidata, but there is no Wikipedia that identified her as the winner of this prize.

It would be nice when all the winners were known to Wikidata and, when there is an image for all of them.

#Quality - #Wikipedia vs #Wikidata

Probably the most divisive issue in both Wikipedia and Wikidata is quality. It is because of expectations and insistence on what "everybody" should do.

Both projects are Wikis and as far as I am concerned, the argument was decided when Nupedia got its early grave. The clincher was when research proved that Wikipedia is as good as its competition.

Quality in a Wiki world is comparative, not absolute. You can compare Wikidata to each Wikipedia and you can compare Wikidata to all Wikipedias. Wikidata knows about more "items of knowledge" than any Wikipedia. Every Wikipedia includes articles that are not yet represented in Wikidata and, even when they are, many statements are still waiting to be made in Wikidata.

To create an environment where a Wikipedia can use Wikidata for its information, there are a few prerequisites, considerations:
  • at least every article needs to be connected to a Wikidata item
  • all the data needs to be available, preferably in Wikidata only
  • the information should be presentable in the language of the Wikipedia
This will all work when there is one shared understanding, one shared ambition: to share in the sum of all knowledge. It restricts what a community can decide, it directs what best practices are and it defines where tools are needed to support best practices.

Wikidata is a game changer and we as a community are slow in coming to understand its implications. From a development point of view there is an obvious geographical divide. MediaWiki development at the application level does not consider Wikidata; its architecture ignores it. Wikidata development is first and foremost development of its infrastructure. There is no other option, but as a consequence the tooling is mostly fragmented. Many in our communities consider Wikidata a service project and, while it is, it is becoming so much more. Catering to ill-formed arguments and sentiments of the diverse communities will not serve Wikidata at all, and it will not serve those communities either. What works is for everybody to work on the Wikidata data that is important to him or her and to appreciate the considerations that make Wikidata the information platform for us all.

Saturday, April 05, 2014

#Wikipedia - Esmonde and Larbey

Esmonde and Larbey are two British authors. They were a British television comedy script writing duo from the 1960s to the 1990s, creating popular situation comedies such as Please Sir! and The Good Life.

Mr Esmonde died in 2008 and Mr Larbey died recently on March 31. Such a combination of characters may make sense in an article, but for Wikidata they have to be two distinct items. How else can you identify Mr Larbey as one of the recently dearly departed?

Given that Wikipedia has them as one single article, Wikidata identifies Mr Esmonde and Mr Larbey as part of this duo.

Friday, April 04, 2014

#Wikidata - Bartolomeo Pepe

Mr Pepe is the first Italian who has the OpenPolis property. #Reasonator shows a link to an external website where information about Italian politicians can be found. Mr Pepe is a member of the senate of the Republic.

The obvious thing to do is to link all existing articles of Italian politicians to OpenPolis. After that there are many "opportunities":

  • add an item in Wikidata for missing Italian politicians
  • add statements that complete the information about them
  • seek images to illustrate their articles and items
It will be fun to see what they will come up with when they have added all the missing information in Wikidata. Adding the OpenPolis property acknowledges that an item or an article is about an Italian politician, but it does not provide information in a way that can be understood in languages other than Italian.

#Reasonator - Tancred, Prince of Galilee

This fine man from Normandy has an extended family. When you look at his "family tree", the software stops when it says: "11464 people loaded, 9 queries to go".

What amazes me is the sheer amount of effort that has gone into adding all these relations. As far as I am aware, this is all done by hand. Tancred is not the only example with too many relatives to show.

When you look at the inline information in the Reasonator, you use the same functionality that can also be shown externally. When Magnus is to look at this bug, there are a few considerations:
  • he has to know about the bug
  • he has to have the time to fix the bug
  • it has to fit in his priorities
Magnus has a huge capacity to do all kinds of everything, but he also has a day job. Given all that, he does not scale :)

Thursday, April 03, 2014

#Wikidata - what #sources ?

Some people say that Wikidata needs sources: "without sources we are just not credible".

A fine argument, and it rings familiar. However, take this item as an example; it is bulked up with sources. It has five sources, and the second statement that is informative, the "date of death", is added to the fact that this is actually a "human".

The two statements that are informative do not have a "source", but they are useful all the same. This human is from Sierra Leone, he used to be called "prof" and he is a writer. Possibly you will find this in the five sources. When you don't, there is always the Wikipedia article to read.

#Wikidata - killing off zombies

I was brave, I killed off many of the zombies of the class of 2014. It felt good but then, new zombies showed up.

I killed off hundreds of them and, more are coming. There are the ones of last year and, the year before last. There are the many, many more from before 1900.

It is overwhelming.. How do I get away from it all? Do I need a bicycle because it runs quiet, a motorcycle because it is fast, a zeppelin...

Actually, I may need an army of drones that mindlessly looks up the data of the dead and expires them all.. Tirelessly, one at a time..

When such drones have worked their way through them all, we have a fighting chance. A chance to know the dead stay properly dead. It will give us the quality time to remember them for who they once were.

When we expire at Wikidata the ones who are dead, we inform the projects that care about these people. If they are on a watch-list, they will show up as dead. That is one way of keeping on top of the zombie menace and it is the best way of improving quality all around.

Wednesday, April 02, 2014

#Wikidata - #Zombie alert

"Humans" have a tendency to die. A zombie is a human who is not convinced that he died. In Wikidata, many human who are dead are not known to be dead. These are the zombies class of 2013 and this is the class of 2014.

It is an act of kindness to lay them to rest. To do this, you add their "date of death". When you have the quality and the patience to diminish the ranks of the zombies, I am sure that we will all sleep better knowing that they can rest in peace.
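For those who like to see it spelled out, here is a minimal sketch in Python of how zombies could be found. The property and item identifiers are the ones Wikidata uses (P31 "instance of", Q5 "human", P570 "date of death"). Everything else is an assumption for illustration: the simplified item layout, the made-up item ids and the function name `find_zombies` are hypothetical, not how any existing tool does it.

```python
HUMAN = "Q5"            # Wikidata item for "human"
INSTANCE_OF = "P31"     # Wikidata property "instance of"
DATE_OF_DEATH = "P570"  # Wikidata property "date of death"


def find_zombies(items, deaths_category):
    """Return the ids of humans that a Wikipedia lists as dead
    while Wikidata has no date of death for them."""
    zombies = []
    for item in items:
        claims = item["claims"]
        is_human = HUMAN in claims.get(INSTANCE_OF, [])
        has_death_date = bool(claims.get(DATE_OF_DEATH))
        if is_human and item["id"] in deaths_category and not has_death_date:
            zombies.append(item["id"])
    return zombies


# Made-up items in a simplified layout (an assumption, not the real Wikidata JSON).
items = [
    {"id": "X1", "claims": {"P31": ["Q5"], "P570": ["2014-03-31"]}},
    {"id": "X2", "claims": {"P31": ["Q5"]}},
    {"id": "X3", "claims": {"P31": ["Q5"]}},
]
# Ids of the articles found in a "2014 deaths" category (also made up).
deaths_2014 = {"X1", "X2"}

zombies = find_zombies(items, deaths_2014)
print(zombies)
```

In this made-up data only X2 is a zombie: X1 already has its date of death and X3 is, as far as any Wikipedia knows, still alive.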

Tuesday, April 01, 2014

#Wikipedia - searching for ദേവദാസ് ഗാന്ധി

This screen shot needs some explaining. Netha added the label in Malayalam for Devdas Gandhi. As a consequence everybody can find Mr Gandhi written in Malayalam on all the wikis that participate in the "extended Wikidata search". The screen shot demonstrates this for the Tamil Wikipedia.

Using #Reasonator to label #Wikidata - Devdas Gandhi or ദേവദാസ് ഗാന്ധി

When you add labels to Wikidata, the items become available in *YOUR* language. Given that a label-a-thon will take place to support Malayalam, we are preparing for the event. A label-a-thon has three main messages:
  1. once you are set-up, it is quick and easy
  2. you can do a lot really quickly
  3. you make a difference
To set things up, you
  • create your user page on Wikidata
  • on it you add your #Babel information
  • in Reasonator you set your "personal settings"
  • finally you authorise Widar to edit on your behalf in Wikidata
You can search in Reasonator in any language; in the screenshot we search for "Gandhi". The results show in Malayalam. When you hover over something that is red underlined, you have the option to add the missing label. Netha just did Devdas Gandhi and she is happy to report that all the labels now show in the search results.

Arguments about #Wikidata quality

A lot of jaw movement has gone into discussing Wikidata quality. However, the jaw movements that are really relevant involve pudding.

You can create a fine looking pudding, sprinkles on top, but it is all about how it tastes. It is all about how well it is actually made. The nicest thing that can be said about Wikidata quality is that it is improving. Everybody involved knows how much room in any direction there is for improvement.

When actual use is the proof of the pudding, we should concentrate on actual use cases. The "Authority control" template for instance gets additional information from Wikidata. For this to function correctly, a Wikidata item has to exist. It has to have one or more links to external sources like VIAF or the GND.

Recently attention was given on Facebook to a tool that shows coats of arms on a map. For this to function, a Wikidata item has to exist that has statements for the geocoordinate and for the image of the coat of arms. It looks good but it would be cool if new items popped up when they become available.
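The requirement for such a use case can be written down as a simple check. This is a minimal sketch in Python and not how the map tool itself works: P625 ("coordinate location") and P94 ("coat of arms image") are the Wikidata properties involved, while the simplified item layout, the made-up items and the function name `split_by_readiness` are assumptions for illustration.

```python
COORDINATE_LOCATION = "P625"  # Wikidata property "coordinate location"
COAT_OF_ARMS_IMAGE = "P94"    # Wikidata property "coat of arms image"
REQUIRED = (COORDINATE_LOCATION, COAT_OF_ARMS_IMAGE)


def split_by_readiness(items, required=REQUIRED):
    """Split items into those usable on the map and those still missing statements."""
    ready = []
    incomplete = {}
    for item in items:
        missing = [prop for prop in required if not item["claims"].get(prop)]
        if missing:
            incomplete[item["id"]] = missing
        else:
            ready.append(item["id"])
    return ready, incomplete


# Made-up items in a simplified layout (an assumption, not the real Wikidata JSON).
items = [
    {"id": "X1", "claims": {"P625": [(52.0, 5.0)], "P94": ["Arms of X1.svg"]}},
    {"id": "X2", "claims": {"P625": [(48.0, 11.0)]}},
    {"id": "X3", "claims": {}},
]

ready, incomplete = split_by_readiness(items)
print("ready for the map:", ready)
print("still missing statements:", incomplete)
```

The second list is the interesting one; it tells anyone willing to help which statements to add.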

When Autolist is able to show the people who died in 2013, it is wonderful that it can also show for whom we do not know a date of death in Wikidata. This allows anyone to improve on the available data and add dates of death.

When people add to the network of data that is Wikidata, they are cooking the pudding. They are improving the quality. Everything else is academic, a distraction ... jaw movements.