Together with CSUC (Consorci de Serveis Universitaris de Catalunya), OCLC organized a symposium for library professionals in Spain engaged in metadata practices around authorities and identifiers. The goal was to encourage strategic thinking about the transition towards an interlinked ecosystem of bibliographic entities. Thirty-five staff from CSUC member institutions, the National Library of Spain and the University of the Basque Country participated in this event in Barcelona on 3 December 2019.
I introduced the theme of the day, inviting participants to discuss the role of identifiers in interconnecting the data silos within their institutions and with the outside world.
Bridging the silos
My colleague Karen Smith-Yoshimura gave a lively and entertaining keynote. She explained how library practices are shifting from authority control to the use of identifiers, and how this shift addresses two challenges: 1) disambiguating and controlling names more expeditiously and 2) making library data more web friendly. As became clear from her presentation, the need for disambiguation and control is growing as creators and researchers increasingly adopt multiple identities and have multiple identifiers assigned to them. We are seeing identity hubs such as ISNI emerge to address identity management across domains, aggregating names from many different types of resources, not just libraries. Wikidata is another major source for cross-linking identifiers, which some libraries have started using. Karen ended her talk with wise advice: “We will continue to work in an environment with lots of different identifiers. But the more you use identifiers that link to one or more ‘identifier hubs’ the more your data can be exposed and re-used.”
Examples from the field in Spain
In a series of four “lightning talks”, guest speakers from Spain presented examples of how identifiers are used across silos to address name disambiguation and identity management needs in their institutions.
Anna Rovira and Rosa Fabeiro (Universitat de Barcelona) explained how they are using the library’s local authority file as the place where the complete records of the university’s researchers are kept. For this subset of the file, all necessary information is collected from external sources such as ORCID, Dialnet, VIAF and Scopus, and is used to complete the authority records. This helps the library to validate the records that researchers submit to the Research Information System (RIS) and the institutional repository (IR). As a result, the authority file now contains records of individuals who are represented with their publications in the IR but not necessarily in the library’s catalogue. The authority records have a system-generated identifier, which will need to become a URI if the authority file is to be used outside the silo of the cataloging system.
Ricardo Santos (Biblioteca Nacional de España) shared some lessons learned in moving from authorities to identifiers, based on the national library’s experience with datos.bne.es and its efforts to make its data more discoverable on the Web. He highlighted some of the major differences, such as “all identifiers are equal, while with authorities, there is a ‘preferred string’”, and the importance of linking to identifier hubs like VIAF for better data exchange. Ricardo ended his talk by raising awareness about the need to identify everything, not just persons and organizations.
Andoni Calderón (Universidad Complutense) gave an overview of the university’s researcher portal and its data sources. He compared the coverage of Scopus, ORCID, ResearcherID, Google Scholar and Dialnet. As it turns out, Dialnet is the largest source of data about the university’s researchers. Dialnet is the research portal for Spanish and Latin American universities and contains metadata, full texts and metrics for journals, theses and conference proceedings, with good coverage of the Social Sciences and Humanities. Andoni concluded that the Dialnet researcher identifier is therefore useful for the university’s CRIS and researcher profiles.
Mari Fe Rivas (Universidad del País Vasco/Euskal Herriko Unibertsitatea) explained how her library is using ORCID as the place to manage the identities of the university’s researchers. They feed all information found in the internal sources (CRIS and IR) and the external ones (Scopus, CrossRef, Dialnet, etc.) into a researcher’s ORCID record and that is where they carry out data cleaning, deduplication, disambiguation and data control.
The group discussions amplified the message from the lightning talks: in Spain the demand for making researchers’ data more visible is growing, and it is driven by policy mandates (e.g. from the Catalan Government) and by researchers themselves. Much effort is going into enriching authority data, using other data sources, and linking with identifier hubs. But which hub to choose from the Babel of identifiers? And while all this work is happening behind the scenes, it needs to be better aligned and made more visible in order to win commitment from university leadership to invest in the necessary resources. Collaboration in the context of REBIUN, the network of Spanish university libraries, could help address some of the common issues and lead to guidelines, best practices, and more efficient workflows.
Recommended reading: Authorities and Identifiers Annotated Bibliography
Titia van der Werf is a Senior Program Officer in OCLC Research based in OCLC’s Leiden office. Titia coordinates and extends OCLC Research work throughout Europe and has special responsibilities for interactions with OCLC Research Library Partners in Europe. She represents OCLC in European and international library and cultural heritage venues.