Sunday, October 15, 2017

#Wikidata - motivation; thank you #Magnus

I added a Baratunde A. Cola to Wikidata because he won the Alan T. Waterman Award. This month a Wikipedia article was written and I wanted to add some data to the item.

I did not because functionality that is key to me was broken. A new property was added and all the work that I had done on categories no longer showed in Reasonator. There was no willingness to consider the consequential loss of functionality and the result was a dip in my motivation.

Wikidata is important to me and I asked Magnus if he would help out and change Reasonator. He did.

Now I have added information to Mr Cola based on his categories. It matters that a category like this one reflects all the people known to have played in the Vanderbilt Commodores football team.

The issue is that at Wikidata, we have lost sight of these collaborative aspects. Everybody does his own thing and we hardly consider why. It is why user stories are so important; they tell you why something is done and what the benefit is.  In the end without a benefit there is no reason to do it.

Thursday, October 12, 2017

#Wikisource - the proof of the pudding

A user story for Wikisource could be: As Wikisourcerers we transcribe and format books so that our public may read these books electronically.

The proof of the pudding is therefore in the people who actually read the finished books.  To future proof the effort of the Wikisourcerers, it is vital to know all the books that are ready for reading. It is vital to know this for books in any and all languages supported.

There are two issues:
  • The status of the books is not sufficiently maintained in all the Wikisources
  • There is no tool that advertises finished books
To come to a solution, existing information could be maintained in Wikidata for all Wikisources in a similar way as done for badges. With the information in Wikidata a queries can be formulated that shows the books in whatever language, by whatever author.

Currently there are Wikisources that do not register this information at all. This does not prevent us from making the necessary steps towards a queriable solution. After all adding missing badges at a later date only adds to the size of the pudding, not to the proof of the pudding.

Tuesday, October 10, 2017

#Wikipedia discovers #OpenLibrary

On Facebook, Dumisani Ndubane posted his discovery of Open Library:
I just discovered that The Internet Archive has a book loan system, which gives me access up to 5 books for 14 days. So I have a library on my laptop!!! This is awesomest!!!
And it is. Anybody can borrow books from the Open Library (is is part of the Internet Archive). What Dumisani did not know at the time is that there are books in other languages to be found as well.

Dumisani found out by accident; he googled for an ebook called "Heart of darkness" by Joseph Conrad. What Dumisani did not know at the time is that the Open Library includes books in many languages. His next challenge: find the books in Xitsonga, and tell his fellow Wikipedians about it.

Wednesday, October 04, 2017

#Wikimedia - A user story for libraries

The primary user story for libraries is something like: As a library we maintain a collection of publications so that the public may read them in the library or at home .

Whatever else is done, it is to serve this primary purpose. In the English Wikipedia you will find at the bottom for many authors a reference to WorldCat. WorldCat is to entice people to come to their library.

It does not work for me.

My library is in Almere and, I have stated in my profile in WorldCat that I live in Almere, I have indicated that my local library is my favourite. WorldCat indicates that the Peace Palace Library is nearby.. It isn't.

When it does not work for me, it does not work for other people reading Wikipedia articles and consequently it needs to be fixed. So what does it take to fix WorldCat for the Netherlands; for me. WorldCat is used for a wordwide public and all the libraries of the world may benefit when WorldCat gets some TLC.

Monday, October 02, 2017

#Wikipedia - A user story for WikipediaXL: an end to the Cebuano issue

The user story for #Wikimedia is something like: As a Wikimedia community we share the sum of all knowledge so that all people have this available to them. 

As an achievable objective it sucks. The sum of all knowledge is not available to us either. To reflect this, the following is more realistic: As a Wikimedia community we share the sum of all knowledge available to us so that all people have this available to them.

When all people are to be served with the sum of all knowledge that is available to us, it is obvious that what we do serve depends very much on the language people are seeking knowledge in. What we offer is whatever a Wikipedia holds and this is often not nearly enough.

To counter the lack of information, bots add articles on subjects like "all the lakes in Finland". This information is not really helpful for people living in the Philipines but it does add to the sum of available information in Cebuano.

The process is as follows: an external database is selected. A script is created to build text and an infobox for each item in the database. This text is saved as an article in the Wikipedia. From the article information is harvested and it is included in Wikidata. One issue is that when the data is not "good enough", subsequent changes in Wikidata are not reflected in the Wikipedia article.

Turning the process around makes a key difference. An external database is selected. Selected data is merged into Wikidata. This data is used to generate only new article texts that are cached in all languages that have an applicable script. As the quality of the data in Wikidata improves, the cached articles improve.

With Wikipedia extended in this way, WikipediaXL, we become more adept at sharing the sum of our available knowledge. With caching enabled in this way, any language may benefit from all the data in Wikidata. It is considered important to consider the quality of new data. Data may come from a reputable source or from a source we collaborate with on the maintenance of the data. What is to be preferred is for another blogpost.