This was done in the past by a different tool. It was a drama because Wikidata is NOT a relational database. The problem is that you cannot create an item and be certain that it will be unique. There are, however, plenty of tricks available to ensure that new items are unique.
The easiest trick is to have an option in the tool to create all the missing papers known for a given author, one author at a time and from Scholia. It makes use of results from a batch process that runs once a week. Cheap, cheerful and highly effective.
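As a rough sketch of this trick, assume the weekly batch leaves behind a set of identifiers for papers that already have an item, and that papers are compared by an external identifier such as a DOI. Neither assumption comes from the tool itself, and the function name `missing_papers` is made up for illustration.

```python
def missing_papers(author_papers, existing_ids):
    """Return the papers of one author that have no item yet.

    author_papers: list of (identifier, title) pairs known for the author.
    existing_ids:  identifiers already present, taken from the weekly batch
                   results (an assumption about what the batch provides).
    """
    # Compare case-insensitively, as DOIs are not case sensitive.
    seen = {identifier.lower() for identifier in existing_ids}
    missing = []
    for identifier, title in author_papers:
        key = identifier.lower()
        if key not in seen:
            # Remember it, so a repeated entry in the input is not created twice.
            seen.add(key)
            missing.append((identifier, title))
    return missing
```

Only what comes back from such a check would be created, which is what keeps the new items unique as long as the weekly results are reasonably fresh.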
Then there is a need for another batch process. For every "author string" that includes an ORCiD identifier, it looks for an existing author with that identifier and changes the "author string" into an "author". The link to the ORCiD identifier is removed because it is implicitly part of the author. This process can run once a week.
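The logic of this batch process could look something like the sketch below. It works on plain Python data rather than on Wikidata itself; the class names, the function `link_known_authors` and the lookup table `authors_by_orcid` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthorString:
    paper: str                    # item id of the paper (illustrative)
    name: str                     # the author as plain text
    orcid: Optional[str] = None   # ORCiD identifier attached to the string, if any


@dataclass(frozen=True)
class AuthorLink:
    paper: str    # item id of the paper
    author: str   # item id of the author; the ORCiD lives on that item


def link_known_authors(author_strings, authors_by_orcid):
    """Split author strings into those that can become authors and the rest.

    authors_by_orcid maps an ORCiD identifier to an existing author item id.
    """
    linked, remaining = [], []
    for entry in author_strings:
        author = authors_by_orcid.get(entry.orcid) if entry.orcid else None
        if author is None:
            # No ORCiD, or no author item for it yet: leave the string alone.
            remaining.append(entry)
        else:
            # The ORCiD is dropped here; it is implied by the author item.
            linked.append(AuthorLink(entry.paper, author))
    return linked, remaining
```

The `remaining` list still contains the strings with an ORCiD identifier but no author, and those are the ones that need a new author item.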
A second batch process, also running once a week, looks for "author string"s with ORCiD identifiers that have no corresponding author. It generates a list of ORCiD identifiers with their associated "author string"s and creates one new item for each ORCiD identifier, uniquely identified by it.
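A minimal sketch of how that list could be built, again on plain Python data. The function name `plan_new_authors` and the shape of the input are assumptions for illustration. Grouping by ORCiD identifier is what guarantees one new item per identifier, however many "author string"s mention it.

```python
from collections import defaultdict


def plan_new_authors(author_strings, authors_by_orcid):
    """Group author strings by ORCiD for identifiers that have no author yet.

    author_strings:   iterable of (orcid_or_None, name) pairs.
    authors_by_orcid: ORCiD identifiers that already have an author item.
    Returns a dict of ORCiD identifier -> sorted list of name variants;
    each key stands for exactly one author item to create.
    """
    names_by_orcid = defaultdict(set)
    for orcid, name in author_strings:
        if orcid and orcid not in authors_by_orcid:
            names_by_orcid[orcid].add(name)
    return {orcid: sorted(names) for orcid, names in names_by_orcid.items()}
```

The name variants collected per identifier are kept so that whoever or whatever creates the item has something to choose a label from.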
Obviously, new authors make it useful to run the first batch process again.
These batches could run exclusively for an author processed by Orcid-scraper, making this tool and Scholia more powerful and up to date.
Thanks,
GerardM
