Thursday, May 30, 2013

#WMhack - produce a list that shares a #Wikidata attribute

At the Amsterdam hackathon it became clear that Wikidata can be used as a powerful tool to improve Wikipedia. The idea of a hackathon is that you go home with new hacks, and I want to share a really nice one. It is a list: a list that shows every known link within Wikidata to a given topic.

The cool thing is that it is the standard MediaWiki "what links here" functionality.

When you copy the "following pages" for this item, you can use the "text to columns" functionality to get something that you can use. With search and replace you can build a string that looks like this..
Many such strings together produce a list, provided the Lua module Available is present on your wiki.
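The copy, "text to columns" and search-and-replace steps can also be scripted. What follows is a minimal Python sketch, not the procedure used at the hackathon: it pulls the Q numbers out of pasted text and fills a template once per item. The pasted sample and the template `"* {qid}"` are hypothetical; substitute whatever string the module on your wiki expects.

```python
import re

# Matches Wikidata item identifiers: a capital Q followed by digits.
QID_PATTERN = re.compile(r"\bQ[1-9]\d*\b")


def extract_qids(pasted_text):
    """Return the distinct Q numbers in the text, in order of first appearance."""
    seen = []
    for qid in QID_PATTERN.findall(pasted_text):
        if qid not in seen:
            seen.append(qid)
    return seen


def build_strings(qids, template):
    """Fill a caller-supplied template once per Q number.

    The template is a hypothetical placeholder, not the real string format.
    """
    return [template.format(qid=qid) for qid in qids]


# Hypothetical pasted lines; the real "what links here" layout may differ.
pasted = "Q111 (links)\nQ222 (links)\nQ111 (links)"

lines = build_strings(extract_qids(pasted), "* {qid}")
print("\n".join(lines))
```

Duplicates are dropped so that an item linking to the topic more than once appears only once in the list.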

When you see the Q numbers, there is no article about the subject. When you see a name in red, the label exists in Wikidata but there is no article registered for this subject. When you see a link in blue, there is a label and there is an article.
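The three cases can be written down as a small decision function. This is a Python sketch rather than the Lua module itself. It assumes entity records shaped like the Wikidata JSON export (`labels` and `sitelinks` keyed by language and site), that a Wikipedia's site key is the language code plus `wiki`, and that the Q number is the fallback when neither a label nor an article is known. The sample items are made up.

```python
def display_state(entity, lang):
    """Classify how an item would show up in the list for one language."""
    label = entity.get("labels", {}).get(lang, {}).get("value")
    title = entity.get("sitelinks", {}).get(lang + "wiki", {}).get("title")
    if title:
        return ("blue", title)        # an article is registered
    if label:
        return ("red", label)         # a label exists, but no article
    return ("q-number", entity["id"])  # nothing known in this language


# Made-up items for illustration only.
with_article = {
    "id": "Q9001",
    "labels": {"en": {"language": "en", "value": "Alpha"}},
    "sitelinks": {"enwiki": {"site": "enwiki", "title": "Alpha"}},
}
label_only = {
    "id": "Q9002",
    "labels": {"en": {"language": "en", "value": "Beta"}},
    "sitelinks": {},
}
bare = {"id": "Q9003", "labels": {}, "sitelinks": {}}

for item in (with_article, label_only, bare):
    print(display_state(item, "en"))
```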

This is a genuine hack that is really useful to complete articles on the same set of Wikidata items in any and all Wikipedias.

#WMhack - #SignWriting in #Wikimedia #Incubator

One visitor at the Amsterdam hackathon was Stephen E Slevinski Jr. He was a man with a mission. His mission was to get American Sign Language on the Incubator. As you can see in the picture below, he succeeded in his mission.

Several members of the language committee knew that it was possible for Stephen to succeed. We also knew that once this hurdle was taken, any and all sign languages could ask for their own Wikipedia. There have been requests for languages like Danish Sign Language in the past as well.

Now that we have the technical ability to show a sign language, the old reason not to express "eligibility" for these languages has gone. Any and all sign languages that have an ISO 639-3 code will find that there is nothing to prevent them from working towards a Wikipedia or any other project in their language.

I want to thank a few people. First, all the people at the Amsterdam hackathon who helped Stephen make this dream come true. Secondly, I want to thank Stephen, Valerie and all the other people at the SignWriting Foundation for their continued belief that their languages and their culture deserve their own Wikipedias.

Sunday, May 26, 2013

#WMhack - Speedy deletion

Ah well, #Wikipedia and its speedy deletion process... According to its rules, "This criterion does not apply to pages in the user namespace, nor does it apply to valid but unused or duplicate templates". As you can see, this page is obviously part of my user page.

Wikidata has information on a specific group of articles that may or may not exist on this Wikipedia.

PS The issue has been resolved; the page is no longer in line for immediate deletion... pffff

#WMhack - Dates for #Wikidata

At a hackathon you may preview what is about to happen. Dates for Wikidata is something that has been eagerly awaited. Given that Wikipedia has articles on many events, it is obvious that Wikidata is one place where assertions on these events should end up.

The most obvious dates are the date of birth or death of a person, or the date a ruler started his rule and the date it ended.

There are many issues that I can think of about how dates are to be used. But the fact that dates are finally being tested and are likely to arrive in a week's time is a reason to be happy.

#WMhack - About #Gerrit, #GIT and #SVN

When there is still so much noise about Git and Gerrit more than a year after the transition from SVN, you can wonder if the move from SVN was the right one to make.

At the Amsterdam hackathon there was expert help available for everyone who wanted or needed to learn more about using Git and Gerrit, and as I was curious about the answer, I asked the question.

The most important reason to move was Git itself. With Git, many new functionalities became available that made the MediaWiki developers more productive. Gerrit is something of a pain, but it is usable and it is improving. So much so that the conversation about dropping Gerrit has subsided.

When you wonder about the question "was it worth it", you also have to consider the TCO, the total cost of ownership. I asked whether the increased productivity had to be offset against the cost of one person supporting Gerrit. I learned that SVN also required a lot of tweaking. The CodeReview extension was written by the Wikimedia staff.

A hackathon isn't a hackathon if it is not for showing off things that are in the works. GitBlit is, as I understand it, one tool that is intended to make things easy. More about GitBlit here.

Saturday, May 25, 2013

#WMhack - #Wikidata and #Lua to the rescue

When you want to know if all the articles on a specific subject have been written, you need something that actually helps. Wikidata helps because it knows about subjects and knows if there is an article in a language.

The Lua code shown below is "trivially easy" to create ... when you know Wikidata and Lua. For me it is a life saver; it allows me to build a list and use it on any Wikipedia.

It shows the Wikidata label used in the local language, and it shows if there is an article in the local Wikipedia. Really powerful stuff if you need this functionality.

The next challenge is to create the Lua module on every Wikipedia.
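To make the idea concrete outside a wiki: the Python sketch below answers the question "which articles still need writing?" for one language. It is not the Lua module mentioned in this post. It assumes entity records in the shape of the Wikidata JSON export, treats every site key ending in `wiki` as a Wikipedia (a simplification), and uses made-up items.

```python
def missing_articles(entities, lang):
    """List items lacking an article in `lang`.

    Each result is (item id, local label or None, languages that do have one).
    """
    site = lang + "wiki"
    report = []
    for entity in entities:
        sitelinks = entity.get("sitelinks", {})
        if site in sitelinks:
            continue  # the article exists, nothing to do
        label = entity.get("labels", {}).get(lang, {}).get("value")
        # Simplification: strip the "wiki" suffix to get a language code.
        others = sorted(key[:-4] for key in sitelinks if key.endswith("wiki"))
        report.append((entity["id"], label, others))
    return report


# Made-up items for illustration only.
entities = [
    {"id": "Q9001", "labels": {"nl": {"value": "Alfa"}},
     "sitelinks": {"enwiki": {"title": "Alpha"}, "nlwiki": {"title": "Alfa"}}},
    {"id": "Q9002", "labels": {"nl": {"value": "Voorbeeld"}},
     "sitelinks": {"enwiki": {"title": "Example"}}},
    {"id": "Q9003", "labels": {}, "sitelinks": {}},
]

for row in missing_articles(entities, "nl"):
    print(row)
```

The list of other languages is included because an existing article elsewhere is a natural starting point for a stub.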

#WMhack - Historic maps .. more empires

Conferences are also places where you make friends. Fareh showed me an application where you can see how Islamic states evolved over time. It is really powerful as it visualises history I did not learn about in school, in my country.

For much of the background information it relies on the Arabic language Wikipedia. We discussed how Wikidata could provide information in other languages as well. I for one do not understand enough Arabic. The user interface of the application might be translated as well.

#WMhack - Building empires

One demonstration that impressed me a lot is this animated map with border fortifications along the border of the Roman empire.

Thursday, May 23, 2013

#Wikipedia articles do not always fit #Wikidata needs

At #Wikidata you can add all kinds of nifty attributes to all kinds of subjects. But it can be VERY confusing.

When you read, for instance, the Wikipedia article about the Umayyad Caliphate, you will find illustrations for two distinct understandings of the term.

There is a map that shows the area that was the Umayyad Caliphate, and there is an illustration showing the genealogy of the Umayyads. These concepts are related but they are not the same thing. I could argue that all the soldiers, civil servants, artisans and scientists defined the Umayyad Caliphate as much as the Caliphs themselves did.

The Umayyads are also a family that includes Caliphs, Emirs and many others including spouses, princes and princesses. Some of them lived during the Umayyad Caliphate, others during the Emirate of Cordoba.

In Wikidata the term "Umayyad Caliphate" is now used to indicate that some people are part of a "Noble family". There should be something like "Umayyad family". It is easy to create such an entry. Should terms like these be the "red links" for new Wikipedia articles?

Tuesday, May 21, 2013

Most of the data in #Wikidata is curated

I got into an argument. Wikidata, it was said, should not exist because its data is overwhelmingly in need of curation. It is an argument I have seen before and I positively hate it. First of all, it is not true, and second of all, many people just do not get what Wikidata is about.

The bulk of the information in Wikidata is replacing the old interwiki data. Good riddance to old, unwieldy and hard-to-maintain data. Everybody who used to be involved in the interwiki connections is now involved in Wikidata. This means that there is an existing community doing the same old thing but in a more efficient way.

Much of the information that is accumulating in Wikidata is data imported from other sources. When the German Wikipedia has an article on a particular person, it is linked to an external source identifying this person with a number. This external source has a lot of data that may find its way into Wikidata, data that has been researched in the past for its validity. When data is imported from such trusted sources, it saves our community from having to add that data by hand.

The data in Wikidata is licensed with a CC0 license. Anybody may use it. As data not present in the other sources finds its way into Wikidata, there will be people who feel strongly that this data has to be present and has to be correct. When you feel strongly about a specific category of knowledge, you can organise a workshop to add data and find sources to back up what you claim to be true.

This is what I do.

I strongly believe that having information boxes use data that is not from Wikidata is foolish. The only valid argument for them is that Wikidata does not include all the attributes that are necessary. This argument is becoming more and more irrelevant as time goes on.

#Wikidata has a #homonym problem

You do not solve disambiguation in Wikidata with a "disambiguation page". It cannot be done because Wikidata supports all languages and each language has different homonyms.

The obvious solution: add a description.

When descriptions are added, you can select the right homonym among others. When descriptions are added, you know what or who the subject is.
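A small Python sketch of why descriptions do the job of a disambiguation page. It uses simplified records (language code mapped directly to a string, not the full Wikidata JSON), and the item identifiers are made up.

```python
def homonyms(items, lang, label):
    """Return (item id, description) for every item carrying this label."""
    matches = []
    for item in items:
        if item.get("labels", {}).get(lang) == label:
            description = item.get("descriptions", {}).get(lang, "")
            matches.append((item["id"], description))
    return matches


# Simplified, made-up records: the same English label, different subjects.
items = [
    {"id": "Q9101", "labels": {"en": "Mercury"},
     "descriptions": {"en": "planet"}},
    {"id": "Q9102", "labels": {"en": "Mercury"},
     "descriptions": {"en": "chemical element"}},
    {"id": "Q9103", "labels": {"en": "Venus"},
     "descriptions": {"en": "planet"}},
]

for qid, description in homonyms(items, "en", "Mercury"):
    print(qid, "-", description or "(no description yet)")
```

Because the lookup is per language, each language gets its own set of homonyms without any shared disambiguation page. An item with an empty description is exactly the case where a reader cannot choose.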

When descriptions are added, you are halfway towards implementing a dictionary like OmegaWiki. What OmegaWiki has in addition to Wikidata is information about the word itself: whether it is a noun and, if so, what gender it has and what its plural, diminutive etc. are. OmegaWiki also has verbs and other words that typically do not get a Wikipedia article.

Guess what, Wikidata has already implemented half the features of OmegaWiki. It is not as tough to add the other half.

Sunday, May 19, 2013

Sources from #Islam will benefit #Wikidata

With currently over 12,000,000 "items" registered in Wikidata, most of them registered by bots, there is a monumental task waiting to be undertaken: adding sources to all this information stated as fact.

In Wikipedia it is policy that facts stated in an article need to be supported by sources. It is also a matter of principle that Wikipedia cannot be considered a source itself. Stating that something is true because Wikipedia says so is good for more than a smile. Oh, and there is not one Wikipedia, there are over 280 Wikipedias; enough reasons to snicker.

One of the things people are taught in Islamic schools is that they should rely on the original sources. When something of a religious nature is stated, it should be supported by what can be read in the original sources of the Islamic faith. Many people and subjects that have to do with Islam have found their way into Wikidata as facts that are not sourced. They include genealogical information like the one shown above or similar information about Q9458.

Having sourced information is important because some information present in Wikipedia is certainly wrong. Having incorrect information in Wikidata is even worse because it may present information used in 280 Wikipedias.
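Finding the statements that still need a source is something a script can help with. The Python sketch below assumes an entity record in the shape of the Wikidata JSON export, where `claims` maps a property to a list of statements and a sourced statement carries a non-empty `references` list. The sample item and its statement identifiers are made up.

```python
def unsourced_statements(entity):
    """Return (property, statement id) for each statement without references."""
    missing = []
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            # A missing or empty "references" list both count as unsourced.
            if not statement.get("references"):
                missing.append((prop, statement.get("id")))
    return missing


# Made-up item: one sourced statement, one without any reference.
entity = {
    "id": "Q9001",
    "claims": {
        "P22": [{"id": "Q9001$a", "references": [{"snaks": {}}]}],
        "P569": [{"id": "Q9001$b"}],
    },
}

print(unsourced_statements(entity))
```

A list like this could serve as the worksheet for a workshop: each row is one fact waiting for someone who knows the relevant source.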

Bringing together people who know the relevant sources and who are willing to learn about Wikidata and edit its information is something you can do in a workshop. Organising a Wikidata workshop together with a mosque brings two novelties: as far as I am aware there have been no workshops organised around Wikidata, and organising a wiki workshop with a mosque is something I have not heard about either.

Tuesday, May 07, 2013

#Lua template wanted for use of #Wikidata in #Wikipedia

Denny said some magical words to me:

  • you can convert a list of Wikipedia wiki links to links to Wikidata using Lua
  • you can check if there is an article on THIS Wikipedia
  • you will show the Wikidata data when there is no article
He even said that it is not hard to do and provided this as a pointer.
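The three points above can be sketched in Python as follows. This is not Denny's pointer and not a Lua module. The two lookup tables are hypothetical stand-ins for what a real module would ask Wikidata, the item identifiers are made up, and the `d:` prefix is assumed to be the interwiki prefix for Wikidata on your wiki.

```python
import re

# [[Target]] or [[Target|shown text]]
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]*))?\]\]")


def fall_back_to_wikidata(wikitext, title_to_qid, local_titles):
    """Point links without a local article at the matching Wikidata item."""
    def replace(match):
        target = match.group(1).strip()
        text = match.group(2) or target
        if target in local_titles:
            return match.group(0)   # article exists here: keep the link
        qid = title_to_qid.get(target)
        if qid is None:
            return match.group(0)   # unknown to Wikidata: leave the red link
        return "[[d:{0}|{1}]]".format(qid, text)

    return WIKILINK.sub(replace, wikitext)


# Hypothetical list and lookup tables.
wikitext = "* [[Alpha]]\n* [[Beta|the second]]\n* [[Gamma]]"
title_to_qid = {"Alpha": "Q9001", "Beta": "Q9002"}
local_titles = {"Alpha"}

print(fall_back_to_wikidata(wikitext, title_to_qid, local_titles))
```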

I am preparing a Wikidata workshop and I would be REALLY pleased if this list were available doing all the song and dance mentioned above. There are so many other lists that could benefit from this as well.

I am convinced that such functionality will motivate people to write stubs and articles on subjects that are important to them. I am also convinced that it is a powerful incentive to create data that accomplishes things like this.

Thursday, May 02, 2013

#Wikipedia lists could fall back to #Wikidata

I have been playing with Wikidata and it is really good fun. I find many uses for it, and some of them have to be tweaked a bit to be even better. Take for instance a list of popes. There is a list on the English Wikipedia with an article for each of them. There are so many popes that it is obvious that many Wikipedias do not have the list, and certainly not articles on all of these popes.

Wikidata could come to the rescue. When a list is made up of values available in Wikidata and when the links to articles fall back to Wikidata, we are able to provide relevant information, and we have the perfect opportunity to suggest to our readers that they write a stub or an article.

In effect such a list allows us to provide improved information by using the strength of Wikidata in any of the languages we support. When you think about such a list, it is not much different from an info-box.

Wednesday, May 01, 2013

#Wikipedia red links should link to #Wikidata

When a subject does not "merit" an article in Wikipedia, it does not follow that the same subject is not of value in Wikidata. Consider for instance the son of a famous person who died at the age of three. He completes the list of all the children of that person, something that is definitely "a good thing" in Wikidata.

It is certainly true that many subjects that "merit" an article do not have an article. There may be an article in one language and not in another. Wikidata has the option to add a label for such an article as a placeholder, and while the article has not been written, it can show nicely red on a disambiguation page. The point here is that it is legitimate for lists and disambiguation pages to have red links. When such lists are completed in Wikidata, they can easily be translated to other languages and provide basic information that may be of interest.

The red links in the list of Muhammeds are both kings of the Sayfawa dynasty. Sadly, the red links do not even cover all the kings called Muhammed who are part of the Sayfawa dynasty. Just consider what would happen when disambiguation lists are presented from Wikidata; it makes it easier to start articles because relevant information may be available thanks to work done in another language.

When you consider the options, all "red links" could be known to Wikidata. As a result you can complete all lists without having to write articles, and you will deal with disambiguation issues sooner rather than later. And why have wikilinks when integrity is better maintained in Wikidata anyway?