Saturday, January 12, 2008

Providing information when there is little or none

The aim of Wikipedia is to provide encyclopedic information. The aim of the Wikimedia Foundation is to provide information. Consequently the goal of the WMF is broader than the goal of Wikipedia. The implications are not often considered. Another issue is that the information is provided in a wiki. This means that information is not necessarily complete and correct, and also that information provided in one language can be, and typically is, substantially different in another.

When the aim is to provide information to the people of this world, it stands to reason that we want to provide the best information available. Sadly we are not able to provide the same quality of information to all people in all languages, because the quality will never be the same in all Wikipedias. Given this, the first order of business should be how we can provide the best available information to people. When people are looking for information in a Wikipedia, they are looking for information in their language. When there is no article, they draw a blank. This is the space where a lot of improvements are possible. This is an issue that is particularly relevant to the less resourced languages.

The first thing to appreciate is that a person often knows more than one language. This combination of languages can be anything. The challenge is to have a graceful fallback to information that is less informative or of lower quality, until the point where we provide pointers to information in another source. The way you can move from an article in one Wikipedia to one in another Wikipedia is by way of the "interwiki links". The information can be seen as lexical in nature with a twist. Disambiguation of homonyms is a requirement. The least information in an article is a stub. A stub can contain an "info box" and some lines of text. A stub can be written by hand in the wiki and it can be put on the wiki by a bot. A text can be translated by hand and by machine. All these things bring their own issues. Key to the understanding is that it takes an article in order to have an interwiki link.
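The graceful fallback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ARTICLES` table, the language codes and the function name `find_article` are all hypothetical, and a real wiki would look this up through its interwiki links rather than a hard-coded dictionary.

```python
# Hypothetical sample data: for each concept, the language editions
# that have an article and the title of that article.
ARTICLES = {
    "horse": {"en": "Horse", "nl": "Paard"},
}


def find_article(concept, reader_languages, articles=ARTICLES):
    """Return (language, title) for the first language the reader knows
    that has an article on the concept, or None when there is none.

    reader_languages is ordered by the reader's preference.
    """
    available = articles.get(concept, {})
    for lang in reader_languages:
        if lang in available:
            return lang, available[lang]
    # Nothing found: the caller should point to another source instead.
    return None


# A reader who prefers Frisian, then Dutch, then English.
result = find_article("horse", ["fy", "nl", "en"])
```

The order of `reader_languages` carries the reader's preference, so the fallback moves step by step from the preferred language to the next best one, and only returns nothing when no known language has an article.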

To reduce things to the least information we want to provide, you are left with the concept, a definition and links to information in other languages. Translations of the concept can be linked to Wikipedia articles in the languages of the translation. At this level we can provide information that is not language dependent; for instance a photo of a horse is a picture of a horse in any language. When a concept is related to other concepts, we can show these relations, with the concepts preferably in the language of the reader.

Encyclopedic articles come in different states of development. A well written article on a subject in one language may be not much more than a stub in another, or not exist at all. Articles can be translated by hand or by machine and in this way information can be provided. A translation provides a better start for reading and editing than a stub.

Stubs, bot created articles and translations are seen by some as problematic while others see their value. They give rise to a constant amount of sniping with new arguments or old arguments presented as new. They distract from what we are about; we are about providing the best information we can. There is no such thing as "the" Wikipedia as there are many. Consequently the quality standards that apply to one should not be applied to another. On an intellectual level this is understood by most, but people regularly find new "problems".

One recurring theme is the number of articles; people feel offended when a project has too many bot-created articles or machine translated articles. It is felt that it is unfair; it denies the value of all the human effort that went into their projects. In many ways the arguments are similar to the ones about "Final version" and consequently the solution can be similar. When bot created articles and MT articles go into a separate namespace, they are not counted as articles. Basic information is provided, and these articles can be improved with "interwiki links" and provide a route to information in other languages. When these articles are expanded or proofread by a person, they can be moved into the main namespace.
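To make the namespace idea concrete, here is a small Python sketch. It is not how MediaWiki counts articles; the namespace names `main` and `generated`, the `Page` class and both functions are hypothetical and only illustrate the rule that pages outside the main namespace stay out of the article count until a person has reviewed them.

```python
from dataclasses import dataclass

MAIN = "main"
GENERATED = "generated"  # hypothetical namespace for bot-created and MT pages


@dataclass
class Page:
    title: str
    namespace: str


def article_count(pages):
    """Count only the pages that live in the main namespace."""
    return sum(1 for page in pages if page.namespace == MAIN)


def promote(page):
    """Move a page to the main namespace once a person has expanded
    or proofread it."""
    page.namespace = MAIN


pages = [Page("Horse", MAIN), Page("Paard", GENERATED)]
count_before = article_count(pages)  # the generated page is not counted
promote(pages[1])
count_after = article_count(pages)   # now both pages are counted
```

The generated page is available to readers the whole time; only its place in the statistics changes when it is promoted.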

The benefit of this proposal is that we will provide more information in more languages. Most of the arguments of the exclusionists have a reasonable reply and the work people put into the creation of more information in their language has found a place.

Thanks,
GerardM
