Friday, December 27, 2019

The value of incomplete data - Fellows of the Ecological Society of America

This post is about understanding data in Wikidata: what you can and cannot do with incomplete data. It is not so much about the Ecological Society of America.

The most recent work started with the news of a new Wikipedia article. Prof Cottingham is a 2015 fellow of the ESA, and there is a category for fellows. Adding her and other missing fellows to Wikidata showed that for one fellow there was no Wikipedia article. At the time there were 90 known fellows, and for only two of them was it known when they became a member.

I expected that new fellows would be known to Wikidata not just as an "author string" but as an "item". I added 14 of the 2019 cohort and found this not to be the case. I then looked up the known fellows on the ESA webpage and added their dates to Wikidata, because I wondered whether it is particularly the older fellows who are represented in Wikipedia.

While adding the dates, I added many alternate names to aid disambiguation, removed one item and found two false friends: fathers mistaken for their sons. When I was done, I had a good impression of the data on the website. Even though I do not have the full numbers, I feel confident in my belief that it is the older ecology and ecologists that are represented in Wikipedia.

When you scrutinize the list of fellows, you will find that it includes "Early Career Fellows". They are "elected for advancing the science of ecology and showing promise for continuing contributions" and they take part for a limited amount of time. Programs like these are known from all over the world and from many science organisations. This time I did not spend time on them, but from previous experience I can safely say that "promising" is putting it mildly.

Wikidata is a wiki and as such the work that I did is of value even though it is incomplete. I did not add all the missing fellows, for instance. The ESA is very much an organisation for America (check the employment of its fellows), yet it takes pride in global attention and solicits membership fees from all over the world. It takes a lot of additional data to assess whether its subject matter is biased towards America and in what way.

For many of the fellows I added, there are papers with "author strings" waiting to be linked to an author. The same can be said for the fellows that are still missing. The ESA can be compared to other ecological organisations, but how to deal with the differences takes a completely different understanding. It takes more data to make this possible, but the data does not need to be complete; that is the beauty of averages.
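The point about averages can be illustrated with a small sketch. It is not based on the actual ESA or Wikidata numbers: the records are synthetic, the 40% share is made up, and the names `share_with_article` and `make_population` are hypothetical. It also assumes that the records you happen to have are a random sample of all records, which real Wikidata coverage may well not be.

```python
import random


def share_with_article(records):
    """Return the fraction of records flagged as having a Wikipedia article."""
    if not records:
        raise ValueError("no records to average over")
    return sum(1 for r in records if r["has_article"]) / len(records)


def make_population(size, true_share, rng):
    """Build purely synthetic records; these are NOT real fellows."""
    n_with = round(size * true_share)
    population = [{"has_article": i < n_with} for i in range(size)]
    rng.shuffle(population)
    return population


rng = random.Random(2019)  # fixed seed so the run is repeatable

# Assumption: 1000 made-up people, 40% of whom have an article.
population = make_population(1000, 0.4, rng)

# "Incomplete data": only 200 of the 1000 records are known to us.
sample = rng.sample(population, 200)

full_share = share_with_article(population)
partial_share = share_with_article(sample)

print(f"share in the complete data:   {full_share:.3f}")
print(f"share in the incomplete data: {partial_share:.3f}")
```

With a random sample, the share in the incomplete data should land close to the share in the complete data. When what is missing is not random, for instance when older fellows are more likely to be present, the average shifts, and that shift is itself the kind of bias this post is about.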
Thanks,
       GerardM
