Friday, November 29, 2019

It is not a list when it is the result of a query

A list is a presentation of data. When a list is maintained manually, the list IS the data; when the list is the result of a query, it REPRESENTS the data.

The difference is quite important. With a query, changing what information is shown means changing the definition of the query, and picking up changed data is a matter of re-running the query. Changing the information in a manual list is a lot of work and therefore there is no integrity in the data itself; it is always potluck what quality the data is.

In the Wikipedia world, Listeria is king of the queried lists. For some its use is controversial but things are changing for the better. Projects like Women in Red use Listeria a lot; their work is possible because people add notable women to Wikidata. The queries work on the basis of awards, professions and nationality, enabling volunteers to write the articles they care to write. This works because once an article is written, its subject is automagically removed from the lists.
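A minimal sketch of that difference, in Python, with invented in-memory data standing in for Wikidata. The item fields and the `red_list` function are hypothetical and say nothing about how Listeria itself is implemented; the point is only that the list is recomputed from the data rather than edited by hand.

```python
# Hypothetical stand-in for Wikidata items; field names are illustrative only.
items = [
    {"name": "Scientist A", "gender": "female", "award": "Award X", "has_article": False},
    {"name": "Scientist B", "gender": "female", "award": "Award X", "has_article": True},
    {"name": "Scientist C", "gender": "male", "award": "Award X", "has_article": False},
    {"name": "Scientist D", "gender": "female", "award": "Award Y", "has_article": False},
]


def red_list(data, award):
    """The 'query': women with the given award who still lack an article."""
    return sorted(
        item["name"]
        for item in data
        if item["gender"] == "female"
        and item["award"] == award
        and not item["has_article"]
    )


before = red_list(items, "Award X")

# A volunteer writes the article; only the data changes, never the list itself.
items[0]["has_article"] = True

after = red_list(items, "Award X")
print(before, after)
```

Re-running the same query after the data changed is all it takes; nobody has to remember to edit the list.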

On the English Wikipedia consensus has it that manual lists are to be preferred. However, empirically, automated lists are of better quality {{REF}}, and as data in Wikidata does not suffer from "false friends", even the support for "red links" is vastly superior.

There is no point in anecdotal evidence about which is best. When the English Wikipedia has only a black link for Stephen Fleming on its page for the Spearman medal, it is an obvious starting point for a new item on Wikidata that is more than just a person who won the Spearman medal. It then becomes a target for lists of the special interest groups who aim to cover "their" subject matter well.

The next stage of the acceptance of lists relies on the realisation that "consensus" does not serve us well, particularly when it trumps established facts. That realisation will serve us well in politics and in what Wikimedia projects could be.
Thanks,
      GerardM

Wednesday, November 27, 2019

Please let us support #Science at @Wikidata

When the BBC informs us about reforestation in Ethiopia, it is Dr Tewolde-Berahan who informs the BBC's Justin Rowlatt about the work that is done in preparation for planting trees.

It is a humorous piece of information that gets the message across: you can plant where trees were absent for generations and make the (local) climate change.

Consider: you now want to seriously know more about reforestation in Ethiopia. Where do you go? Wikipedia, in all its magnificence, is rooted in its articles and thereby dated. Through its references, however, there are links to its authors, to many more authors and their publications. In this way every article has its concept cloud, and it could be translated into a Scholia for an article.

The current Scholias are themselves already a rabbit hole that leads in many directions, and a Scholia for an article would be something different again. The article links to subjects and has its papers and, by inference, authors; they may link to newer papers, more papers, contradicting papers. They may lead to scientists who research similar notions for another locality. Why not reforest Spain or France? When reforestation is possible in Ethiopia, what would be different to make this unfeasible in Europe?

And all this becomes possible when you consider Wikipedia as the jumping-off point in any and all directions, not just within Wikipedia.
Thanks,
     GerardM

NB I know there are two fellows of the Ethiopian Academy of Science related to this subject. Who are they and how are they connected to Dr Tewolde-Berahan?

Thursday, November 14, 2019

@wikidata - I don't scale, help me scale

At Wikidata there is always more to do and as a volunteer you make the biggest impact when you concentrate on specific subjects. I do not scale enough to do everything I would like to do.

There are a few areas where I aim to make a difference; of particular concern is where we do not represent a body of knowledge/information in Wikidata. At this time I favour scientists, particularly women, young scientists and scientists from Africa.

To make my work scale, I tweet and blog. I latch on to the great work done by Dr Jess Wade. She writes articles on well-deserving scientists and I aim to add value for those scientists on Wikidata. Typically I add professions, alma maters, employers and awards. In addition I add "authorities" like ORCiD, Google Scholar and VIAF. This is important because it enables the linking of scholarly papers already in Wikidata or known at ORCiD. I can more or less keep up with Jess, and I happily add information for any and all scientists I come across on Twitter.

While doing this I learned of the Global Young Academy and as a side project started adding scientists who are members of the GYA or one of its affiliated organisations to Wikidata. I am so pleased I got into contact with Robert Lepenies. Robert is happy with the opportunity that a Scholia provides for an organisation like the GYA, for him and for all the young scientists involved. We collaborated on completing the lists on many Wikipedias; Robert added many scientists to Wikidata and is now battling to keep the pictures of these young scientists on Commons...

What is crucially important for me is that Robert advocates an open ORCiD profile to scientists worldwide so that they may have their Scholia. Both Robert and I do not scale, and what would help us most is an easy and obvious way for any scientist to start a process that will include all their papers from ORCiD, will update the known co-authors and will instruct them in what they can do to enrich their Wikidata / ORCiD / Scholia profile even more.

I am now working on African scientists and yes, I would appreciate some help.
Thanks,
     GerardM

PS my wife would like this scale to be enough for me

Tuesday, November 12, 2019

Instant gratification at @Wikidata

As I write this, it is 11:46am; at 09:26am I added papers to Prof. Hafida Merzouk. The edits are picked up by Reasonator but not by Scholia. In a similar way, edits are not picked up by Listeria.

Instant gratification is now a thing of the past. The work done at Wikidata may eventually be picked up in a Scholia or Listeria, but it is not funny. Can I tweet about the things I find or have done when Wikidata no longer reflects the relevant changes?

This may sound trivial, but it does mean that there is no longer a timely way for me to look back at my work.

Instant gratification motivates and it is a factor in maintaining quality. We are losing it.
Thanks,
      GerardM

Saturday, November 09, 2019

Put (modern) #science of #Africa on the map

A young African scholar commented that the info on websites of African scholarly organisations was all about their past. There is a point to recognizing those who did good and consequently making it obvious that the science of today is rooted in the past.

African scientists, like any other scientists, have a place in Wikidata with their affiliations, papers, co-authors and also with their scholarly advisors. My proposal is for all scholars to check if they are on Wikidata and to check if their doctoral thesis is on Wikidata. Then add their doctoral advisor to their own item and, in return, add themselves as a doctoral student on the advisor's item.
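The reciprocal check can be sketched in a few lines of Python. P184 (doctoral advisor) and P185 (doctoral student) are the Wikidata properties involved; the item identifiers, the `claims` structure and the `missing_reciprocals` function are invented for illustration, and real data would come from Wikidata rather than a dictionary.

```python
# P184 (doctoral advisor) and P185 (doctoral student) are Wikidata properties;
# the item ids and claims below are invented for illustration.
claims = {
    "Q_student_1": {"P184": ["Q_advisor"]},
    "Q_student_2": {"P184": ["Q_advisor"]},
    "Q_advisor": {"P185": ["Q_student_1"]},
}


def missing_reciprocals(data):
    """Return (advisor, student) pairs where P184 is set but P185 is not."""
    missing = []
    for student, props in data.items():
        for advisor in props.get("P184", []):
            students_of_advisor = data.get(advisor, {}).get("P185", [])
            if student not in students_of_advisor:
                missing.append((advisor, student))
    return sorted(missing)


todo = missing_reciprocals(claims)
print(todo)
```

Each pair in `todo` is a statement that still needs to be added on the advisor's item.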

Do not forget to include where you studied and for what university you work(ed). Check if your ORCiD profile includes trusted organisations like CrossRef that will update your profile when appropriate. When many of you do this at Wikidata you will be surprised what the impact will be.
Thanks,
      GerardM

Friday, November 08, 2019

Bias in @Wikidata and a SMART approach

When quality was presented at the WikidataCon, it was rated from 1 to 5. This approach has its own bias because it does not consider what may not be there. What is not there can be made visible using assumptions like "a university has more than one employee" (employees include professors) and "every country has at least one university".
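As a rough sketch of how such assumptions make gaps visible, the Python below checks both of them against invented sample data. The country and university names, the employee counts and the `find_gaps` function are all hypothetical; real numbers would have to come from a Wikidata query.

```python
# Invented sample data: country -> {university: number of known employees}.
universities_by_country = {
    "Country A": {"University 1": 12, "University 2": 1},
    "Country B": {"University 3": 0},
    "Country C": {},
}


def find_gaps(data):
    """Flag what the two assumptions say must be missing."""
    # Assumption: every country has at least one university.
    countries_without_university = sorted(c for c, unis in data.items() if not unis)
    # Assumption: a university has more than one employee.
    universities_lacking_staff = sorted(
        name
        for unis in data.values()
        for name, employees in unis.items()
        if employees <= 1
    )
    return countries_without_university, universities_lacking_staff


no_university, thin_universities = find_gaps(universities_by_country)
print(no_university, thin_universities)
```

Anything the check reports is not an error in the existing data but a pointer to data that is not there yet.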

The bias in Wikidata starts with the way it is mostly used and consequently how it is taught. People are shown what Wikidata looks like, immediately followed up with training in the use of query and the use of tools. At every level it takes considerable skill to make use of Wikidata. The first hurdle to overcome is to understand the data in a single item. When your language is not English you are toast. This is Cape Town in Newari and this is a useful presentation using Reasonator. With Reasonator the information is easy to digest and adding missing labels is just one click away.

The second hurdle is knowing what bias it is you want to remedy. For a known bias like the gender gap, the Women in Red have lists of missing Wikipedia articles. A Wikidata gap is expressed by the absence of data. Listeria lists are great at that. These are all the universities of Africa. If you do not get the extent of what we miss, you have some thinking to do. When you apply this principle to the science of Africa, you find a lot of lists and the biggest issue remains: missing lists.

When you tackle a missing subject like I did for the "Affiliates of the African Academy of Sciences", you will find a source as a reference for the group and a reference on every affiliate. To ensure that the data is relevant and actionable, I added all of them and linked them to ORCiD and/or Google Scholar, enabling SourceMD to link them to their papers. I added nationality because this may trigger inclusion on the Women in Red lists and, when it was obvious, I added employers so that they may be included as scholars on African University lists.

When we as a movement want to fight bias, we have to consider the use of lists, and particularly Listeria lists, to show the developments of a subject. With lists available on many Wikipedias, it becomes possible to gain traction on what we miss. This approach is distinctly different as it acknowledges the need for more support for item-based editing and it makes the point that missing data is a quality issue that needs to be addressed as a fundamental issue.
Thanks,
      GerardM

Thursday, November 07, 2019

@Wikipedia talks about @Wikidata

"WD is unreliable. WP:V and WP:RS are completely ignored (from any editors). International NPOV is a problem too." It is so SMART that the best I can do is ignore it. Then again, it is an open invitation to talk about Wikipedia. There is no single Wikipedia; there are over 300 Wikipedia language editions, so even the acronyms are lost on me as there is no one Wikipedia to rule them all.

So forget about acronyms and let's talk Wikidata and by inference raise issues, particularly for the English Wikipedia, where appropriate. First, Wikidata includes more items than there are subjects raised in any and all Wikipedias. Its quality can be considered in many ways and verifiability is largely ensured because of the association with other "authorities" about a subject. Thanks to the increased use of open data, it is possible to verify that specific statements are shared, increasing the likelihood that they are correct. For some information, like for scientists who are members of the AAS Affiliates Programme, we have/may have references to the authoritative source. Such references may be on a project or on an item level; either way they make verifiability easy and obvious.

Wikidata has an issue with all kinds of gaps in its coverage. For many African countries no universities are known, and there are hardly any scholars associated with them. Thanks to Listeria functionality we can monitor if and when data is added. Many Wikipedias do not have such tools because of the aversion to Wikidata by some. At the same time projects like Women in Red rely on Listeria lists, and by inference Wikidata, to know what to work on.

Tools like Reasonator and Listeria generate lists and, when you compare them with Wikipedia lists, the quality is measurably better. I published frequently in the past about the Polk award. In its lists Wikipedia has a likely error rate of six percent. When they fudge the record by not linking at all, the quality of a Wikidata list is even better because Wikidata is much better at linking items than Wikipedia is at linking red links. There is a solution; it just requires a willingness by Wikipedians to cooperate.

I understand what is meant by "international NPOV" and it is where Wikidata is by definition better than an individual Wikipedia. By definition because Wikidata represents data from ALL Wikipedias. Thanks to the people of DBpedia, there is a potential to highlight where Wikipedias differ and it is more likely that the fruit of their labour will enrich Wikidata than Wikipedias.

So a Wikidatan walks into a bar..
Thanks,
       GerardM