Saturday, October 05, 2013

10 #Wikidata questions for Lydia

Lydia is the new project leader for Wikidata. She has already done a great job communicating for the Wikidata project and, with much of Wikidata well established, communication will increasingly be the critical success factor. The Wikidata team works together well so I think she will do really well. Enough reasons to ask her some questions. I hope you will enjoy them as much as her answers.

Wikidata exists, it is being used. Do we know how much it is used?
Unfortunately not. Of course we get to see some great uses but not the whole picture. I love seeing all the different ways I know of in which the Wikipedias are already using Wikidata. For example, the Occitan Wikipedia builds whole infoboxes based on data from Wikidata. Or the English Wikipedia, which compares its local IMDb identifiers against those in Wikidata and puts pages into a maintenance category if they do not match, so a human can look into it and fix it. But this is only inside the big Wikimedia projects. At the same time people are building third-party tools that use data from Wikidata. The most recent and very impressive one is the Wikidata tempo-spatial display. We are seeing more and more of these pop up and I am looking forward to what other useful, cool and even crazy things people will come up with.
You want people to trust Wikidata data. How do you envision this working?
Wikidata has just started and we're seeing people add a lot of data to it for all the world to use. This is fantastic. At the same time we need to make sure that the data in Wikidata is reasonably trustworthy. I say reasonably trustworthy because in the end this is just like Wikipedia. It's not 100% perfect but we're all making a huge effort to keep it as accurate and correct as possible and people have come to rely on this. Wikidata is in a somewhat better situation here than Wikipedia, though, for a few reasons. First of all, since the data is structured and machine-readable, it is much easier for a computer to find inconsistencies and alert an editor about them. One example would be that a political office is said to be held by X but X is said to be an animal. Now there were probably a few cases where a political office was held by an animal but in general things like this should be flagged for an editor to reexamine. And there are many such cases that could be checked. A few such checks are already in place. The other advantage Wikidata has is that it will be much easier to verify a given data point against an external source once that source is given in Wikidata. And the third advantage Wikidata has is that it will be watched by potentially a lot more people. Changes in Wikidata show up on the watchlists of all editors who are watching the corresponding page on their Wikipedia as well as in the recent changes of that Wikipedia. So all in all we are in a pretty good position. However we are not there yet. More tools will need to be developed, existing ones improved and most importantly people will need to spend time adding sources to statements in Wikidata.
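The kind of consistency check Lydia describes could look roughly like the sketch below. It is only a toy illustration in Python over a small in-memory dictionary: the item names, the property names "office held by" and "instance of", and the function find_suspect_officeholders are all made up for this example and are not Wikidata's real identifiers or tools.

```python
# Toy, in-memory stand-in for Wikidata statements. The item names and
# property names are illustrative assumptions, not real Wikidata IDs.
ITEMS = {
    "mayor_of_exampletown": {"office held by": ["alice", "rex"]},
    "alice": {"instance of": ["human"]},
    "rex": {"instance of": ["animal"]},
}


def find_suspect_officeholders(items, expected_type="human"):
    """Return (office, holder) pairs whose holder is not marked as expected_type.

    Holders with no "instance of" statement at all are flagged too, since
    the check cannot confirm what they are.
    """
    suspects = []
    for item_id, statements in items.items():
        for holder in statements.get("office held by", []):
            holder_types = items.get(holder, {}).get("instance of", [])
            if expected_type not in holder_types:
                suspects.append((item_id, holder))
    return suspects


# The check only flags statements; a human editor decides what to do.
for office, holder in find_suspect_officeholders(ITEMS):
    print(f"Please review: {office} is said to be held by {holder}")
```

Note that the sketch only reports the mismatch and changes nothing, which matches the idea that the rare legitimate case (an animal that really did hold office) is left for an editor to judge.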
When you talk about the user experience of Wikidata, what are the limits of this user experience as far as you are involved?
I want Wikidata to be a joy to use and I want it to be easy to use. At the same time experienced editors need to be able to navigate the site and complete their tasks quickly and efficiently. We will have to find the right balance there over and over again. The same thing goes for all the missing features we still have to develop or roll out - queries and the numbers datatype for example. I will put more emphasis on user experience but moving the project forward on a feature level is also very important.
One obvious target for Wikidata is to include all the information contained in info boxes. How far are we from Wikidata being able to serve most subjects that have info boxes?
I think the big missing piece of the puzzle is the numbers datatype. We're not too far from rolling out a first version of it. Once we are able to also deal with units we are well on our way to that target.
You want people to better understand Wikidata. Is that not even more complicated than understanding templates and info-boxes?
They should absolutely not need to understand everything about Wikidata. This would not work. But for those who interact with Wikidata regularly it should be easy to understand what is going on - not in detail but the bigger picture. To get there we need to improve a few things in the user interface but we will also need to adapt our help pages to be less technical.
What can people do to make Wikidata useful in their language?
Wikidata is inherently a multilingual project. It allows you to use the site in your language and will show you the data it has in your language. Have a look, for example, at item Q2 in German, English and Spanish. To be able to do this Wikidata needs to know the names of all these things in the different languages. That's what we call "labels". These labels are really important in all kinds of places in Wikidata as they make it possible to refer to things by their name instead of the identifier we gave them - Q2 in the example above. So the best way to make Wikidata more useful in a language is to enter a lot of labels and descriptions in that language. The special pages Entities without Label and Entities without Description are there to help with this, as is the Terminator tool, which sorts them by how often they are used to increase effectiveness. This is especially important for the smaller languages. If you speak several languages you should also add a babel box to your user page on Wikidata like I have done on mine. Once Wikidata knows which languages you speak this way, it will show you the labels and descriptions for a given thing in these languages as well and will let you complete them in case they are still missing. Over time Wikidata will become more and more useful in all languages we support. One of Wikidata's major goals is supporting smaller Wikipedias. This is one of the most important steps on the way to get us there.
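To make the role of labels a little more concrete, here is a minimal Python sketch of the idea. It is an illustration only and not how Wikidata is implemented: the dictionary layout, the item "Q_example" and the functions display_name and missing_labels are assumptions made up for this post. Q2 is the item for Earth.

```python
# Toy data: each item has labels keyed by language code.
# "Q_example" is a made-up item that so far only has an English label.
ITEMS = {
    "Q2": {"labels": {"en": "Earth", "de": "Erde", "es": "Tierra"}},
    "Q_example": {"labels": {"en": "example thing"}},
}


def display_name(item_id, items, languages):
    """Return the first available label in the user's preferred languages.

    Falls back to the bare identifier when no label exists in any of them.
    """
    labels = items[item_id]["labels"]
    for lang in languages:
        if lang in labels:
            return labels[lang]
    return item_id


def missing_labels(items, lang):
    """List the items that still need a label in the given language."""
    return sorted(i for i, data in items.items() if lang not in data["labels"])


print(display_name("Q2", ITEMS, ["de", "en"]))         # label in German
print(display_name("Q_example", ITEMS, ["de", "en"]))  # falls back to English
print(missing_labels(ITEMS, "de"))                     # items lacking a German label
```

The fallback to the bare identifier is the situation the interview warns about: without a label in your language, all you see is something like Q2.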
People like Magnus and Denny have visualised data that exists in Wikidata. How important do you think visualisation is?

Visualisations are crucial for Wikidata. They are a way for us humans to make sense of the vast amount of data. They allow us to see patterns. They allow us to see where we are missing information. They allow us to see where we have outliers in the data that need closer examination. (Belgium has the information that it shares a border with Australia? Probably not when you look at it on a map...) These things are a lot easier to spot and make sense of when you have a nice visual representation of them. But they also show us how far we have already come and give us a sense of achievement. Look at this GIF for example. It shows the progress of adding geocoordinates to Wikidata over the first days this was possible. And last but not least, visualisations are of course beautiful and fun.
What can be done to help people be effective in Wikidata?
Make it easy for them to get started and understand the basic concepts quickly. Make it easy for them to find like-minded people for example in the task forces. Keep the number of rules low. Create more tools like Terminator and more Database reports to make it easy to find the areas that need more work.
Do you consider that Wikidata is a project in its own right or is it beholden to the Wikipedias, particularly the biggies?
Wikipedia is definitely the most important use-case for a long time to come. However as we're already seeing now Wikidata's data is of use for many many parties outside Wikimedia as well. This will only increase. It is definitely a project in its own right.
What is your dearest memory of Denny as your predecessor?
I have many dear memories. When we met for the first time a few years ago it was in a small room at our university in Karlsruhe to discuss Semantic MediaWiki and its community and development. It struck me that both Denny and Markus (the two founders of Semantic MediaWiki) just got it. They understood what it means to build a community and develop a project in the open with all its benefits and drawbacks. A very rare trait, I can tell you. Since then we've built this amazing project and an incredible community has gathered around it. Along the way we've been to Wikimania in Washington, D.C. and Hong Kong and many other events. At each of those events we've met incredible people who are passionate about what we're doing and willing to help. I'll never forget that and I'm looking forward to more of it to come.
