Saturday, March 16, 2019
When we tell the world about the most popular articles in Wikipedia, it is important to realise that we do not inform what the most popular subjects are. We could, but so far we don't. The most popular subjects is the sum of all traffic of all Wikipedia articles on the same subject. Providing this data is feasible; it is a "big data" question.
We do have accumulated data for the traffic of articles on all Wikipedias, we can link the articles to the Wikidata items. What follows is simple arithmetic. Powerful because it will show that English Wikipedia is less than fifty percent of all traffic. That will help make the existing bias for English Wikipedia and its subjects visible particularly because it will be possible to answer a question like: "What are the most popular subjects that do not have an article in English?" and compare those to popular diversity articles.
search functionality that helps finding information. It uses Wikidata and it supports automated descriptions.
Now consider that this tool is available on every Wikipedia. We would share more information.With some tinkering, we would know what is missing where. There are other opportunities; we could ask logged in users to help by adding labels for their language to improve Wikidata. When Wikidata does not include the missing information, we could ask them to add a Wikidata item and additional statements, a description to improve our search results.
This data approach is based on the result of a process; the negative results of our own Search and it is based on active cooperation of our users. At the same time, we accumulate negative results of search where there has been no interaction, link it to Wikidata labels and gain an understanding of the relevance of these missing articles. This fits in nicely with the marketing approach to "what it is that people want to read in a Wikipedia".
Saturday, March 09, 2019
This approach does not consider cultural differences, it does not consider what is topical in a given "market". To find an answer to the question: what do people want to read, there are several strategies. One is what researchers do: they ask panels, write papers and once it is done there is a position to act upon. There are drawbacks;
- you can only research so many Wikipedias
- for all the other Wikipedias there is no attention
- the composition of the panels is problematic particularly when they are self selecting
- there are no results while the research is being done
The objective of a marketing approach is centered around two questions:
- what is it that people are looking for now (and cannot find)
- what can be done to fulfill that demand now
The data needed for this approach; negative search results. People search for subjects all the time and there are all kinds of reasons why they do not find what they are looking for.. Spelling, disambiguation and nothing to find are all perfectly fine reasons for a no show.
The "nothing to find" scenario is obvious; when it is sought often, we want an article. Exposing a list of missing articles is one motivator for people to write. Once they have written, we do have the data of how often an article was read. When the most popular new articles of the last month are shown, it is vindication for authors to have written popular articles. It is easy, obvious and it should be part of the data Wikimedia Foundation already collects.. In this way the data is put to use. It is also quite FAIR to make this data available.
For the "disambiguation" issue, Wikidata may come to the rescue. It knows what is there and, it is easy enough to add items with the same name for disambiguation purposes. Combine this with automated descriptions and all that is requires is a user interface to guide people to what they are looking for. When there is "only" a Wikidata item, it follows that its results feature in the "no article" category.
The "spelling" issue is just a variation on a theme. Wikidata does allow for multiple labels. The search results may use of them as well. Common spelling errors are also a big part of the problem. With a bit of ingenuity it is not much of a problem either.
Marketing this marketing approach should not be hard. It just requires people to accept what is staring them in the face. It is easy to implement, it works for all the 280+ language and it is likely to give a boost to all the other Wikipedias but also to Wikidata.