Tuesday, June 17, 2014

#Wikidata - about splitters and lumpers

It would be really good if people who propose to split a property, or to merge properties, first read the Wikipedia article about splitters and lumpers.

Wikidata has its splitters and its lumpers; in a recent drama, the splitters want to do away with "is part of". The idea is that because so many different things can be "part of" something else, the distinction should be made in the property used as well.

There are 299+1 arguments why this is not the best of ideas.

Wikidata is very much intended as a multilingual project. All these nuanced versions of "is part of" assume that the other 299 languages are able to express the same "finer points". It should be obvious that this is not the case, and having the same label for properties that are meant to be different is an extraordinarily bad idea. To give you a clue, in some languages a verb does not have a present or past tense.

The other argument negates the need for all this precision. Wikidata is proving to be really good at connecting to external sources. These sources typically have the same or similar properties, and the wish to map these properties between sources has been expressed often before. Once this has been done, those who are so eager to have these finer points can replace the Wikidata properties with the equivalent properties as maintained elsewhere.
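To make that last point a bit more concrete, here is a minimal sketch of what such a mapping could look like. It is only an illustration: the table, the function name and the placeholder item names are all made up for this example. P361 is Wikidata's "part of", and the two external properties are shown purely as possible counterparts, not as an agreed mapping.

```python
# Hypothetical mapping table: one general Wikidata property mapped to
# properties maintained by external sources. P361 is Wikidata's "part of";
# the external URIs are illustrative counterparts, not an official mapping.
EQUIVALENT_PROPERTIES = {
    "P361": [
        "http://purl.org/dc/terms/isPartOf",
        "https://schema.org/isPartOf",
    ],
}


def translate_statement(subject, prop, value, mapping=EQUIVALENT_PROPERTIES):
    """Re-express one statement with every external property mapped to `prop`.

    Returns an empty list when no mapping is known for the property.
    """
    return [(subject, external, value) for external in mapping.get(prop, [])]


# Q_CHAPTER and Q_BOOK are placeholder item names, not real Wikidata items.
statements = translate_statement("Q_CHAPTER", "P361", "Q_BOOK")
for statement in statements:
    print(statement)
```

The general property stays as it is in Wikidata; anyone who needs a finer vocabulary applies the mapping on their own side.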
