Tuesday, September 30, 2014

#Wikipedia - #Category: #Cholmondeley Awards

A poet is proverbially poor. For most poets it is not really an occupation but more of an aspiration to live off the high art of poetry. To celebrate great authors and great poets, organisations like the Society of Authors recognise them with awards. Money may be involved, but the prestige and the awareness of the public are what make the real difference for a poet.

The Cholmondeley Award is to "honour distinguished poets" and yes, there is some money to be had; £8000 to be exact. There is still a category for the distinguished ladies and gentlemen who were awarded in the past. It is up for deletion because £8000 is considered chicken feed and also because awards are supposed to be in a list, not a category, per WP:OC#AWARD.

It was a revelation for me that there is something like "over categorisation" in the first place. It means that categories are deleted. Wikipedia is a law unto itself and given "consensus", categories will be deleted. It is sad that all the work that went into the categorisation is deleted with those categories. It is even worse when the implied information is not first saved to Wikidata.

Categories distinguish themselves from lists because they can be found on every article that is categorised. Changes to article names are automatically reflected while Wikipedia lists are ... static. In Wikidata, lists can be defined up to a point and funnily enough this is not done for lists but it is done for categories.
The information from the category has been saved with AutoList2. Given that Wikidata often knows about more recipients of awards than any and all Wikipedias individually, the current emphasis on lists is silly. Wikidata will do a better job.

Friday, September 26, 2014

#Wikidata - Paul Konerko; #Chicago White Sox

Mr Konerko is one of 1,669 humans Wikidata knows about as being or having been a player of the Chicago White Sox. According to the White Sox website, Mr Konerko is very much celebrated; his gear can be bought at a charity auction and the fans are asked to vote for him so that he may win the Roberto Clemente Award. Mr Konerko is about to retire from the game.

With some regularity "members of a sports team" information is added to players past and present. They can be of any team or any sport. When this is done in an automated way, this is done based on information in an "info-box" or in a "category". At this time we do not have the technology to get the information from a "list".

Importing all this information into Wikidata is laborious. That is no problem because it presents all kinds of new opportunities. For instance when a "human" is added to such a category, we can imply that he or she IS a member of that sports team. Another opportunity occurs when a person gains an article in another language; in that case we already know to what category he or she may be added.

As this process of incrementally adding more people from categories continues, Mr Konerko will be recognised not only as a White Sox player but also as a Los Angeles Dodgers and a Cincinnati Reds player among others.
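
The inference from a category to a "member of sports team" statement can be sketched in a few lines of Python. This is only an illustration and not how any existing tool works: "P54" is the Wikidata property for "member of sports team", but the team identifiers and the function name are placeholders made up for this example.

```python
# Hypothetical mapping from a Wikipedia category to the statement it implies.
# "P54" is Wikidata's "member of sports team" property; the team identifiers
# below are placeholders, not real Wikidata item numbers.
CATEGORY_IMPLIES = {
    "Category:Chicago White Sox players": ("P54", "Q_WHITE_SOX"),
    "Category:Los Angeles Dodgers players": ("P54", "Q_DODGERS"),
    "Category:Cincinnati Reds players": ("P54", "Q_REDS"),
}


def implied_statements(categories, existing=frozenset()):
    """Return the (property, value) pairs implied by an article's categories
    that are not yet present on the item."""
    implied = []
    for category in categories:
        statement = CATEGORY_IMPLIES.get(category)
        if statement and statement not in existing and statement not in implied:
            implied.append(statement)
    return implied


# Categories as they might appear on an article; the last one implies nothing.
konerko_categories = [
    "Category:Chicago White Sox players",
    "Category:Los Angeles Dodgers players",
    "Category:Cincinnati Reds players",
    "Category:Living people",
]
already_known = {("P54", "Q_WHITE_SOX")}
new_statements = implied_statements(konerko_categories, already_known)
print(new_statements)
```

Only the statements that are still missing are returned, so running this again after a new category has been processed adds nothing twice.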

Tuesday, September 23, 2014

#Wikimedia #Leadership and #Community

Sue wrote a wonderful blogpost. The subject is "leadership development" and it is a great read; it rings true and, at the end, the summation lists very much the kind of issues the Wikimedia Foundation has to contend with.

The two things that stand out for me are:
  • How to create a strong, shared work culture without accidentally turning into a monoculture that doesn’t tolerate people who don’t fit
  • How to create an environment that enables the effectiveness of creative, talented people who have depression, ADD/ADHD and/or Asperger’s.
The notion of a "monoculture" is really pervasive; WMF IS Wikipedia, it seems, and it takes another organisation to bring Wikidata into the fold. Projects like Wiktionary, Wikisource and even Commons are, like Wikidata, primarily seen as additional and there to provide the service that does not fit Wikipedia.

It is wonderful that Sue acknowledges how important the people who suffer from one or another mental health issue are. It is a great step forward as it acknowledges the relevance of this group. A next step would be some research into this phenomenon. I am convinced that the number of people involved will prove to be surprisingly high for those who are not aware of this or prefer to ignore it.

I am confident that Lila is just as aware of the issues Sue blogged about. My hope is that our community will become more aware of the leadership issues the Wikimedia Foundation and its communities bring.

#Wikimedia #Labs - when is enough enough

Wikimedia Labs is a success. It is used a lot. It is increasingly stable. It grows quite quickly and it has its own staff, who are knowledgeable and happy to work with anyone and help where needed.

It is a great project that increasingly provides important and useful services to Wikimedia projects. Some services stand on their own to the benefit of the wider open source and content world.

It is a happy story where growth is limited by the availability of hardware and support. With more hardware, with support available around the clock, the sky is the limit.

Unbounded success is great but it will on occasion overtake the projects it supports. As the demands on the projects grow, the infrastructure changes. Great best effort prevents most mishaps from happening but occasionally services do become unavailable. That is not a drama, it is the way of things.

Lately there was another snafu; it partially affected the work that I do. I noticed that the issue was actively taken care of. I sent a mail to Magnus about it. In the end the issue was resolved and the services I use are back in full swing. It took a few days, there was no drama. That is success.

Saturday, September 20, 2014

#Wikidata - Hal Newton; #baseball player

Mr Newton died in 2014 and, as his demise was only recently picked up, it came to my attention only now. I added his dates of birth and death. I skimmed his article and added two clubs he played for by hand; the Calgary Stampeders and the Toronto Argonauts. For a third club there was a category, so I added all of the players of the Hamilton Tiger-Cats using AutoList2.

Then it hit me. Adding all this data to Wikidata is very similar to collecting baseball cards. Sometimes you get one card at a time, sometimes you get a whole collection. All of them have to be fitted in so that they can be displayed most advantageously.

In that sense Wikidata is very much the box where all the cards live. Reasonator is like a display book; it shows off what there is and it is where you get the best impression of what is still lacking.

Thursday, September 18, 2014

#Wikipedia - To bot or not to bot II

In a mail it was announced that the Lsjbot finished adding plants to the Swedish Wikipedia. Several classes of subjects have so far been added and it had quite a surprising effect on the vitality of the Swedish Wikipedia:
"We are also gladdened by the hard numbers. Reader accesses show a healthy increase even from our already high number. And a trend of a slight decrease of editors has now turned into an increase. We can not say for certain why and it could be temporary but we believe the botgenerated articles has a part of this positive development".
These hard numbers referred to fly in the face of all the pundits who claim the opposite. Evidently, Wikipedia works best when it does what it is meant to do; share in the sum of all knowledge. There is no sharing when no information at all is provided for "esthetical" or whatever reasons.

We can argue about the best way of providing additional information and it is good that a door is kept open for Wikidata to play a role. In the end, both for all the Wikipedias and for Wikidata, it is about priorities and I agree with Anders that the quality of the data is key in this. In addition, the priority of Wikidata should be much more centred on what we do it all for; making information available to people, not so much to machines.

Wednesday, September 17, 2014

#Wikidata - Ted Belytschko, engineer

Mr Belytschko was an #engineer. He did a really good job; he was awarded medals and other awards in his lifetime. Several of these awards have been added in Wikidata.
When you check out these awards, you will find that Mr Belytschko, who is known to have been awarded these medals, is the only known recipient. It is quite obvious that in reality this is not the case. As more people are known to be recognised as engineers, more appreciation will exist for this really important occupation.

Thursday, September 11, 2014

#Mortality - You make your choice and, then you die

#Vaccination is a serious subject. It allows you to protect yourself and your loved ones from deadly diseases like whooping cough. A report by the Hollywood Reporter has it that an outbreak of whooping cough can be expected soon, with deadly consequences, in the Hollywood area.

When people make their own choices, they have to live with the consequences. I really wonder if the people who leave themselves vulnerable to such deadly diseases understand that they are also responsible for those they infect in turn. When they fall ill, they expect to receive treatment at no extra cost. They expect that their health plan pays for it but, should it?

With whooping cough there is sufficient research about the efficacy of the vaccines and their risks. For Ebola, experimental vaccines are used. Like whooping cough, it is a deadly disease. I wonder if these people will refuse an experimental Ebola vaccine when the disease arrives in their backyard.

Wednesday, September 10, 2014

#Wikidata - great progress on the number of statements

It is obvious; with 15,703,625 items, it takes a lot of effort to improve the quality of Wikidata. Statements are added all the time to specific items and that reflects well on the quality of those items but the underlying health of Wikidata is probably best expressed in its statistics.

The latest statistics show that we can finally say that "50% of our items have none, one or only two statements". Arguably not much can be said about these items and consequently these items are not informative. The trend however is wonderful; it shows in the graph; slowly but surely Wikidata is gaining data and by inference becomes more informative and useful.

The next challenge for us will be to be able to say "50% of our items have none, one, two or only three statements". That will be a happy day.
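
The milestone can be read as a simple calculation: find the smallest number k such that at least half of all items have k statements or fewer. A minimal sketch in Python, where the function name and the two sample lists are made up for illustration and are not real Wikidata statistics:

```python
from collections import Counter


def half_threshold(statement_counts, share=0.5):
    """Smallest k such that at least `share` of the items have k or fewer statements."""
    if not statement_counts:
        raise ValueError("no items")
    histogram = Counter(statement_counts)
    total = len(statement_counts)
    running = 0
    for k in sorted(histogram):
        running += histogram[k]
        if running >= share * total:
            return k


# Invented samples: number of statements per item, before and after more data is added.
before = [0, 1, 1, 2, 2, 2, 3, 4, 5, 9]
after = [0, 1, 2, 3, 3, 3, 4, 4, 5, 9]
threshold_before = half_threshold(before)
threshold_after = half_threshold(after)
print(threshold_before, threshold_after)
```

In these invented samples the threshold moves from two to three statements, which is the kind of shift the next milestone describes.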

#Wikidata - Eberhard Schlotter

Mr Schlotter died on September 8th. During his life many of his achievements were celebrated with awards. According to Wikidata he received, among others, the Q2571516; this award was named after Mr Q2573984, a noted German sculptor.

When you look at the information for the award, Reasonator will provide you with sufficient information as can be seen in the screenshot. Mr Loth is named by Reasonator as well.

My hope is that when Wikidata gets its UI makeover, a good look will be given at the functionality of Reasonator and that its killer features, like showing labels where they are available, will be adopted. A machine may make sense out of Q2571516 but for me a text like "Wilhelm-Loth-Preis" is much more informative.
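
Showing labels where they are available comes down to a lookup with a fallback. The sketch below is not Reasonator's code; it is a small Python illustration with a hand-made label table. The label for Q2571516 is the one quoted above, the label for Q2573984 is an assumption inferred from the name of the prize, and Q99999999 stands in for an item without any label in the table.

```python
import re

# Hand-made label table for illustration. The first label is quoted in the post;
# the second is an assumption inferred from the name of the prize.
LABELS = {
    "Q2571516": {"de": "Wilhelm-Loth-Preis"},
    "Q2573984": {"de": "Wilhelm Loth"},
}


def show_labels(text, languages):
    """Replace Q-identifiers with the first available label, keeping the
    identifier itself when no label exists in any preferred language."""
    def replace(match):
        qid = match.group(0)
        labels = LABELS.get(qid, {})
        for language in languages:
            if language in labels:
                return labels[language]
        return qid
    return re.sub(r"\bQ\d+\b", replace, text)


sentence = "He received the Q2571516, named after Mr Q2573984; see also Q99999999."
rendered = show_labels(sentence, ["en", "de"])
print(rendered)
```

The reader asks for English first and German second; because only German labels exist here, the German ones are shown, and the unknown identifier is left as it is rather than hidden.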

#Wikipedia & #Wikidata - #Authority control

One objective for Wikidata is to include information to be used in articles of Wikipedia. One template in particular provides a great example of this ability; it is the "Authority control" template. The rather technical information that it provides points to external sources where you will find information about the same item.

All these external sources have their own purpose and, in essence, you may find additional information and hooks to functionality about all kinds of everything. It is for instance quite possible to inform you about the availability of a book in *YOUR* library in several countries, like the Netherlands.

Authority control is not restricted to the English Wikipedia. As you can imagine, it is called differently in many other languages; Normdaten is for instance the name in German. All these templates face the challenge of keeping up with all the new external sources that are added all the time at Wikidata. The English version only supports some 12 sources while Reasonator knows about more than 200 external sources.

For all these sources statements are added all the time and new sources are added on an almost weekly basis. This makes it obvious why it is best to opt for Wikidata when external sources are to be added in an article. The template makes it easy; it shows everything from Wikidata that it currently supports.

Monday, September 08, 2014

#Wikidata - The #Ukrainian - #Russian #War 2014

Soldiers die in Ukraine. With depressing regularity they appear in the ToolScript I use to find people who died in 2014. I think many of them can be found in this category. Currently there are 294 entries.

The Ukrainian Russian war is just one of the wars that are happening today. Sadly most other conflicts are not documented with the same quality of detail. Consider the cultures of Iraq and Syria that are being wiped away including their people. All the "little" wars in Africa. These conflicts are huge in their impact on people and cultures.

We hardly know them.

Saturday, September 06, 2014

#Wikidata introduces some media files from #Commons

At Wikidata we use images. They are included as statements and they show up, for instance, in Reasonator, where they prove once again that a picture paints a thousand words. In a roundabout way this shows one aspect of what "Wikidatification of media files" can do for us.

Magnus did it again.

His new tool, "Wikidata Commons Search", shows what we can do already based on the limited amount of data that is available. Thanks to the labels in Wikidata we CAN search and find "things" in any language. These things can be any kind of media file, as the example shows so well.
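
Why labels make searching in any language possible can be shown with a toy example in Python. This is not how the tool itself is built; the item names, labels, file names and the function below are all invented for illustration.

```python
# Invented data: labels per item and language, and media files tagged with items.
LABELS = {
    "item:horse": {"en": "horse", "nl": "paard", "de": "Pferd"},
    "item:cat": {"en": "cat", "nl": "kat", "de": "Katze"},
}
DEPICTS = {
    "Horse_in_field.jpg": ["item:horse"],
    "Kitten_and_foal.jpg": ["item:cat", "item:horse"],
    "Sleeping_cat.jpg": ["item:cat"],
}


def search_media(term):
    """Find media files tagged with any item that has `term` as a label in any language."""
    wanted = term.casefold()
    items = {
        item
        for item, labels in LABELS.items()
        if any(label.casefold() == wanted for label in labels.values())
    }
    return sorted(name for name, tagged in DEPICTS.items() if items.intersection(tagged))


dutch_results = search_media("paard")
german_results = search_media("Katze")
print(dutch_results)
print(german_results)
```

The files are tagged once with an item; every label that is added to that item in another language makes the same files findable in that language without touching the files.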

Thursday, September 04, 2014

#Wikidata & #Commons - It is all about the #presentation

Commons is to be Wikidatified. There was an hour-long chat about it. It is quite unstructured and it is all about details. At the start a reference is made to the initial document. The motivation for the whole exercise as stated is utterly disappointing:
"The Structured Data project is a proposal to store and retrieve information for media files in machine-readable format on Wikimedia Commons and other sites, so they are easier to view, search, edit, curate and use."
I could not care less that machines can read the format. What matters is that people, not machines, are able to find media files in their own language. When people upload a media file, that media file should be available to the whole of our community so that the file will actually be used.

When the Wikidatification of media files is done to help PEOPLE, not machines, objectives can be formulated that are "must have" for all of us. We can have objectives where 50% achievement is a big thing. Consider an eight year old in Whatever country. They speak Whatever. This eight year old enters "phifflesticks"; this is Whatever for "horse". He or she can choose from 50% of all the media files we have about horses. Now THAT is a big achievement. It is 100% better than we are able to do today.

Presented like this, when it takes Wikidatification to make this happen, there is a clear objective. It is obvious why we should do this.

If Wikidata and the Wikidatification of Commons are to be well presented, they need appealing objectives. Objectives that provide sufficient reasons to allow for the stress and upheaval that is needed to make this happen. Obvious objectives enable clear goals, for instance:
  • all the subjects that children learn about in the first two years of primary school have labels in the language of the child
  • the number of downloaded images is measured and the downloads are to increase by 100% in a year
  • new files are given tags that enable finding and using these files
  • tags are added both manually and automatically
  • tags are gaining more labels in the languages we support
What we do is for humans. This needs to be obvious. When it is, a chat is not about retaining categories and templates. In the end they are not relevant at all. Finding pictures of a kitty, a doggie or a horsie is.