When you look at Wikidata, it is very much like the
proverbial glass. It is not even half full, but it is filling rapidly. Unlike a
liquid, every new bit of data becomes part of a web of data. In that respect it is more
like a telephone network, where every new endpoint makes the network more
valuable to its users.
Like Wikipedia, Wikidata aspires to help people gain access
to the sum of all knowledge. It already knows about more subjects than any
Wikipedia, but what it knows is incomplete, sometimes even wrong, and often not
accessible to everyone. Many of the Chinese villages and towns only have
Chinese labels. Most of the locations in the USA do not refer to the lowest
level of administrative unit they are in. For the towns and cities of many
other countries we do not have any data at all; at best we may know that they
exist.
Like Wikipedia, Wikidata is a work in progress. Many
articles are stubs, and for many items there are no statements. Bots work
from lists to generate more Wikipedia stubs.
Some clever programming uses all kinds of substitution to create a
readable text, and often the data in the list is the basis for an info-box as
well. That is fine; when you can do that, more power to you.
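To give an idea of what such substitution looks like, here is a minimal sketch in Python. It is not the code any particular bot uses; the template sentence, the field names and the sample row are all made up for illustration.

```python
from string import Template

# Hypothetical sentence template; the field names are assumptions,
# not taken from any real bot or from Wikidata itself.
STUB_TEMPLATE = Template("$name is a $kind in $country.")

def make_stub(row):
    """Fill the sentence template from one row of the list."""
    return STUB_TEMPLATE.substitute(row)

def make_infobox(row):
    """Reuse the same row as simple key/value lines for an info-box."""
    return "\n".join(f"{key}: {value}" for key, value in row.items())

# A made-up row standing in for one entry of a bot's list.
row = {"name": "Exampleville", "kind": "village", "country": "Examplestan"}
print(make_stub(row))
print(make_infobox(row))
```

The same row feeds both the sentence and the info-box, which is why one list can produce many stubs.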
Creating such texts is something that can be done once or
multiple times. Magnus wants the Reasonator to produce a better narrative. The
beauty of this approach is that more text, and even better text, will be generated when
more information becomes available. Better texts will also become available
when the routines that generate the texts become more clever.
For now the improved narrative is in English only. Improving
the software is an iterative process. The first iteration is just to have
something that works; the second iteration is where language-specific code is
separated so that code for a different language can be used when it becomes
available.
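One way to picture that second iteration is a small sketch like the one below. It is only an assumption about how such a separation could look, not how the Reasonator is actually written; the function names, the item fields and the fallback to English are all hypothetical.

```python
def describe_en(item):
    """English routine: says more when the item holds more data."""
    parts = [f"{item['label']} is a {item['kind']}."]
    if "population" in item:
        parts.append(f"It has {item['population']} inhabitants.")
    return " ".join(parts)

# Language-specific routines are kept apart from the shared code.
# Only English exists here; other languages would be added to this table.
RENDERERS = {"en": describe_en}

def describe(item, lang="en"):
    """Use the routine for the requested language, else fall back to English."""
    renderer = RENDERERS.get(lang, RENDERERS["en"])
    return renderer(item)

# Made-up item; asking for a language without a routine falls back to English.
print(describe({"label": "Exampleville", "kind": "village"}, lang="nl"))
```

Adding a language then means writing one more routine and registering it, without touching the rest.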
It will be interesting to see how popular this functionality
will be. For language technology students or professionals it will be a fun
project to have a stab at this kind of stub language. What will be interesting
is what resources they will ask for in order to make their text really
expressive.
Thanks,
GerardM