Thursday, October 23, 2014

#Wikipedia - One size does not fit all


In Wikipedia we are used to seeing our readers as one big group. They all read the same article, they all get the same info-boxes and they all get the same categories. It is a reasonable approach when Wikipedia is only a pile of text without data to separate out potential differences in interest.

One obvious consequence is that reasonable expectations decide what is shown and what it looks like. When there are too many categories, they no longer get attention. So what categories should be shown? The problem is that this "one size fits all" approach shows too much for some and too little for others.

Thanks to Wikidata it is possible to allow for preferences. For many categories Wikidata knows what they are about; they show for instance humans and their alma mater, their sports club, their gender... When our public has the option to choose what category of category they are interested in, there is no longer a "need" to choose what categories to keep. It is just a matter of choosing which categories to show by default.

Any and all other categories of categories are then selectable by the reader.
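
A minimal sketch of how such a preference could work, assuming every category carries a "kind" label (alma mater, sports club, gender) derived from Wikidata. The category names, the default set and the function name are made up for illustration; this is not how MediaWiki does it.

```python
# Hypothetical default: the kinds of category shown when the reader chose nothing.
DEFAULT_KINDS = {"alma mater"}

def visible_categories(categories, chosen_kinds=None):
    """Return the category names whose kind the reader wants to see.

    categories   -- list of (category name, kind) pairs
    chosen_kinds -- kinds picked by the reader, or None to use the defaults
    """
    kinds = DEFAULT_KINDS if chosen_kinds is None else set(chosen_kinds)
    return [name for name, kind in categories if kind in kinds]

# Made-up categories for one article about a person.
article_categories = [
    ("Alumni of Example University", "alma mater"),
    ("Example FC players", "sports club"),
    ("Example women writers", "gender"),
]

print(visible_categories(article_categories))
print(visible_categories(article_categories, {"sports club", "gender"}))
```

The article keeps all its categories; only what is shown changes with the reader's choice.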
Thanks,
      GerardM

Tuesday, October 21, 2014

#Wikidata - Thank you Magnus


Mr A. H. Halsey is the first person who can be put to rest now that the ToolScript works again. Mr Halsey was a sociologist; he died on 14 October 2014.

Thank you Magnus, you are wonderful.
Thanks,
     GerardM

Monday, October 20, 2014

#Charkop - a Vidhan Sabha constituency

Data about politics and politicians regularly finds its way to Wikidata. When an item gets my attention, I often add all associated items to Wikidata as well. Charkop is a constituency in Maharashtra; according to an associated category there are many more.

Given that the software I use is broken at this time, I can blog about one dilemma.

Charkop is a Vidhan Sabha constituency; it is part of the Mumbai North Lok Sabha constituency. The question is whether Charkop "is in the administrative territorial entity" of Mumbai North or Maharashtra.
Thanks,
      GerardM

#Google - Let us #share in the sum of all #knowledge

Dear Google, in our own ways, we share the aspiration to share in the sum of all knowledge. We are really happy to share everything we have with you. Our licenses are designed to share widely.

Dear Google, could you please help us make sure that our Labs webservices survive your bots? What we do not want is for your bots not to run. What we want is for our webservers to serve our own needs first and use all the spare capacity for you. As it is, our software dies.

We really want you to have our data, and there are several other ways whereby you can get all our data anyway. For this reason please help us with our software so that we can continue to share the sum of all our available knowledge with you.
Thanks,
     GerardM

Sunday, October 19, 2014

#Wikidata - P1472, the #Commons #Creator #Template

The work of many artists is represented in Commons. Having great information available for all of them is a Herculean job. Having all that information and more available in all the languages that are supported by the Wikimedia Foundation is very much an aspiration. Once Commons is wikidatified, all information needs to be understood in all our languages.

France Prešeren is one of 13,481 people who currently have a Creator template and are known as such in Wikidata. All the data in those templates can be harvested and included in a Wikidata item. For all the templates NOT known in Wikidata, an item can be found or created to make them known in Wikidata as well.
A lot is already known about Mr Prešeren in Wikidata and much of that data can be expressed in multiple languages. The same can be said for the Creator template itself; as you can see, the template already shows its labels in multiple languages. With Wikidata we can show the information in all our languages as well.

Realising this will introduce the Commons community to Wikidata in a positive way and remove one obstacle that needs to be overcome during the wikidatification of Commons.
Thanks,
      GerardM

Saturday, October 18, 2014

Bringing #Wikidata to #Commons, one step at a time

There is this big project that is to bring structured data to the 23,422,581 media files that make up one of the biggest resources of freely usable media files.

It is to bring many different benefits to the users of Commons. To accomplish this, many steps have to be taken. Many of these steps can already be taken and will indicate why this project is done and what its benefits are.

Take for instance Mr Daniel Havell. He is an English engraver born in Reading. There is no Wikipedia article about him but there is information about him in Wikidata. It includes all the information that is in his "Creator" template and the category about him on Commons.

Having such information for all the "Creators" on Wikidata is easy and obvious. Having all those templates refer to Wikidata builds an anticipation of things to come. Next steps are making sure that the information looks good on Wikidata and is informative. Currently the best we can offer is by showing the information in Reasonator.

Using tools like Reasonator for now establishes that the WMF and the Wikidata team appreciate all the efforts that promote the use of Wikidata and accept them as indicative of the type of information it will have to bring.

This can all be done today. No waiting is necessary and it makes data from Commons available in multiple languages. This is Mr Havell in Russian. Bringing the benefits of Wikidata to Commons today helps. It brings awareness to our public of the inherent benefits. It allows them to comment and get involved slowly but surely. It will prevent a "big bang" announcement of "this is it, take it or leave it". It will even bring more information in more languages to Commons sooner rather than later.
Thanks,
      GerardM

Sunday, October 12, 2014

#MediaWiki is about sharing the sum of all #knowledge

The organisational structure of the Wikimedia Foundation has been completed with the hiring of Mr Damon Sicore. In his first IRC #Wikimedia-Office chat the ugly head of Wikipedia centrism was found to be alive and well.

Mr Sicore made some important statements: "The most urgent issue seems to be software quality and shipping what we say we are going to ship, on time." and also "this urgency is compounded by the fact that we must be able to compete in mobile".

Wikidata is firmly part of us sharing in the sum of all knowledge and it is increasingly important at that. So far Wikidata was mostly about linking Wikipedia articles about the same subject. Increasingly, available data is used in info-boxes. Once the wikidatification of multimedia files happens, Wikidata needs to become editable from mobile phones and it needs to be easy and obvious in any and all languages.

Currently it is neither easy nor obvious in any language.

This is not to say that it is not possible to make it increasingly easy and obvious in all languages. It is important because it is a requirement if the wikidatification of multimedia files is to succeed. This is however only one use case where improved usability of Wikidata is essential for us to continue to share the sum of all the data we have available to us.

The extent to which Wikidata will make a difference is only one challenge for Mr Sicore. There are many more he faces. I wish him well because his success is our success.
Thanks,
      GerardM

#Wikidata - the maintenance of #awards

Mrs Kizer died. She won several awards. One of the awards she won was the Robert Frost medal, another award was the Theodore Roethke Memorial Poetry Prize. Two other awards, the John Masefield Memorial Award and the Borestone Award are not linked in the article yet.

The funny thing with awards is that they have a habit of being awarded regularly. This has several consequences:
  • you can predict how many winners there may have been
  • you can predict when the next winner is likely to be known
Given that many awards are not maintained as well as for instance the Nobel Prize or the Pulitzer Prize for Poetry, it should not be that hard to produce something that lists all the awards that have no winner yet for a given year. Wikidata already provides most of the main elements; these are all the awards for instance and it shows how many Wikipedias have an article for them.

By adding a statement about the frequency of the award it becomes possible to find the awards that were not awarded in a given year. It will stimulate adding awards, it can be the basis for a tool that shows lists of winners on Wikipedias and it would stimulate me to indicate that Mrs Kizer won the Pulitzer Prize for Poetry in 1985.
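
As a rough sketch of the idea, assuming every award has a first year, a frequency expressed as an interval in years and a list of known winners per year. The awards, winners, class and function names below are invented for illustration; none of this is real Wikidata data or an existing tool.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Award:
    name: str
    first_year: int
    interval_years: int  # hypothetical "frequency" statement: 1 = yearly, 2 = every two years
    winners_by_year: Dict[int, List[str]] = field(default_factory=dict)

def is_due(award: Award, year: int) -> bool:
    """True when the award is expected to be given in this year."""
    if year < award.first_year:
        return False
    return (year - award.first_year) % award.interval_years == 0

def expected_editions(award: Award, up_to_year: int) -> int:
    """Predict how many times the award should have been given so far."""
    if up_to_year < award.first_year:
        return 0
    return (up_to_year - award.first_year) // award.interval_years + 1

def awards_missing_winner(awards: List[Award], year: int) -> List[str]:
    """List the awards that are due in a year but have no recorded winner."""
    return [a.name for a in awards if is_due(a, year) and not a.winners_by_year.get(year)]

# Invented sample data.
sample = [
    Award("Example Annual Prize", 2000, 1, {2013: ["A. Winner"]}),
    Award("Example Biennial Medal", 2010, 2, {2014: ["B. Winner"]}),
    Award("Example Triennial Award", 2010, 3),
]

print(awards_missing_winner(sample, 2014))
print(expected_editions(sample[0], 2014))
```

An award that skips a year or changes its rhythm would need more than a single interval, so this only covers the regular case.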
Thanks,
     GerardM

Thursday, October 09, 2014

#Wikidata - #Statistics are a #data game


The Wikidata statistics are a marvel. They exist in their own little corner of the Wikiverse and rely on the dumps that are regularly produced. When everything is fine, a refresh is generated automatically. Some crazy people find them of interest and go over the numbers trying to understand what is happening. Every now and again, they are amazed or appalled.

Recently the dumps that are available in JSON changed their format in the midst of a dump. The resulting hodgepodge of data made the statistics unrealistic. Magnus was on a holiday. Yes, he has a real life, so it took a bit of time before he reasoned his way out of the mess.

It is wonderful that our community has people like Erik Zachte and Magnus Manske. They spend so much time and effort in providing us with meaningful statistics. It is important to remember that they rely on underlying data and it is their skills that ensure that the data remains comparable over time.

NB Currently 56.83% of the Wikidata items have 0, 1 or 2 statements. :)
Thanks,
       GerardM