Third English round table on next generation metadata: investing in the utility of authorities and identifiers

Thanks to George Bingham, UK Account Manager at OCLC, for contributing this post as part of the Metadata Series blog posts.

As part of the OCLC Research Discussion Series on Next Generation Metadata, this blog post reports back from the third English-language round table discussion, held on March 23, 2021. The session was scheduled to facilitate a UK-centric discussion with a panel of library representatives from the UK with backgrounds in bibliographic control, special collections, collections management, metadata standards and computer science – a diverse and engaged discussion group.

Mapping exercise

Map of next-gen metadata projects (third English session)

As with other round table sessions, the group started by mapping next generation metadata projects that participants were aware of onto a 2×2 matrix characterizing the application area: bibliographic data, cultural heritage data, research information management (RIM) data, and “Other” for anything else. The resulting map gave a useful overview of some of the building blocks of the emerging next generation metadata infrastructure, focussing in this session on the various national and international identifier initiatives – ISNI, VIAF, FAST, LC/NACO authority file and LC/SACO subject lists, and ORCID – and metadata and linked data infrastructure projects such as Plan-M (an initiative, facilitated by Jisc, to rethink the way that metadata for academic and specialist libraries is created, sold, licensed, shared, and re-used in the UK), BIBFRAME and OCLC’s Shared Entity Management Infrastructure.

The map also raises interesting questions about some of the potential or actual obstacles to the spread of next generation metadata:

What to do about missing identifiers? How to incorporate extant regional databases and union catalogs into the national and international landscape? How “open” are institutions’ local archive management systems? Who is willing to pay for linked data?

Contributing to Library of Congress authorities

The discussion panel agreed that there is a pressing need for metadata to be less hierarchical, which linked data delivers, and that a collaborative approach is the best way forward. One example is the development of the UK funnel for NACO and SACO, which has reinforced the need for a more national approach in the UK. The funnel allows the UK Higher Education institutions to contribute to the LC name and subject authorities using a single channel – rather than each library setting up its own channel. Because they work together as a group to make their contributions to the authority files, the quality and the “authority” of their contributions are significantly increased.

Registering and seeding ISNIs

One panelist reported on a one-year trial with ISNI for the institution’s legal deposit library, as a first step into working with linked data. It is hoped that it will prove to be a sustainable way forward. There is considerable enthusiasm and interest for this project amongst the institution’s practitioners, a vital ingredient for a successful next generation metadata initiative.

Another panelist expanded on several ongoing projects with the aim of embedding ISNI identifiers within the value chain and getting them out to where cataloguers can pick them up. For example, publishers are starting to use them in their ONIX feeds to enable them to create clusters of records. Also, cataloging agencies in the UK are being supplied with ISNI identifiers so that they can embed them at source in the cataloging-in-publication (CIP) metadata that they supply to libraries in the UK.

Efforts are also under way to systematically match ISNI entries against VIAF entries, and to provide a reconciliation file to enable OCLC to update the VIAF with the most recent ISNI. These could then be fed through to the Library of Congress, who can then use these to update NACO files.

With 6 million files to update, this is a perfect example of a leading-edge, dynamic next generation metadata initiative that will have to overcome the considerable challenge of scalability if it is to succeed at a global level.
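To make the matching step more concrete, here is a minimal Python sketch of how a reconciliation file might be assembled. It is not the workflow used by ISNI or OCLC: the record layout, the field names (`isni`, `viaf_id`, `lccn`), the choice of a shared LC control number as the match key, and the sample IDs are all assumptions made for illustration. The one standards-based part is the check character, which ISNI computes with ISO 7064 MOD 11-2.

```python
def normalize_isni(raw):
    """Remove spaces and hyphens and uppercase a trailing 'x'."""
    return raw.replace(" ", "").replace("-", "").upper()


def isni_check_char(base15):
    """Return the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(raw):
    """True if raw is 15 digits followed by the correct check character."""
    isni = normalize_isni(raw)
    if len(isni) != 16 or not isni[:15].isdigit():
        return False
    return isni_check_char(isni[:15]) == isni[15]


def build_reconciliation(isni_records, viaf_records):
    """List VIAF clusters whose stored ISNI differs from the ISNI record.

    Assumption: both sides carry the same source control number ('lccn'),
    which is used here as the match key.
    """
    viaf_by_lccn = {v["lccn"]: v for v in viaf_records if v.get("lccn")}
    rows = []
    for rec in isni_records:
        isni = normalize_isni(rec["isni"])
        if not is_valid_isni(isni):
            continue  # skip identifiers that fail the checksum
        cluster = viaf_by_lccn.get(rec.get("lccn"))
        if cluster is None:
            continue  # no matching cluster under this match key
        current = normalize_isni(cluster.get("isni") or "")
        if current != isni:
            rows.append(
                {"viaf_id": cluster["viaf_id"], "old_isni": current, "new_isni": isni}
            )
    return rows


# Hypothetical sample data; the control numbers and VIAF IDs are invented.
# The first ISNI string is used only because it passes the checksum.
isni_records = [
    {"isni": "0000 0001 2103 2683", "lccn": "n-demo-1"},
    {"isni": "0000 0001 2103 2684", "lccn": "n-demo-2"},  # wrong check character
]
viaf_records = [
    {"viaf_id": "viaf-demo-1", "lccn": "n-demo-1", "isni": None},
    {"viaf_id": "viaf-demo-2", "lccn": "n-demo-2", "isni": None},
]

rows = build_reconciliation(isni_records, viaf_records)
for row in rows:
    print(row)
```

Only the first sample record produces a row, because the second fails the checksum. At the scale mentioned above, a real process would also need fuzzy name matching, human review of ambiguous cases, and batching, none of which this sketch attempts.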

Challenges faced by identifiers

The discussion moved on to the other challenges faced by identifier schemes. It was noted that encouraging a more widespread collaborative approach would rely on honesty amongst the contributors. There would need to be built-in assurances that the tags/data come from a trusted source. Would the more collaborative approach introduce too much scope for duplicate identifiers being created, and too many variations on preferred names? Cultural expectations would have to be clearly defined and adhered to. And last but by no means least is the challenge of providing the resources needed to scale up to a national and international scope.

Obstacles in moving towards next generation metadata

Participants raised concerns that library management systems are not keeping pace with current discussions on next generation metadata or with real world implementations, to the extent that they may be the biggest obstacle in the move towards next generation metadata. It was recognized that moving to linked data involves a big conceptual and technical leap from the current string-based metadata creation, sharing and management practices, tools and methodologies.
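The gap between string-based and identifier-based practice can be illustrated with a small Python sketch. The records, name forms, and `example.org` URIs below are invented for illustration and do not come from any real catalog; the point is only that grouping by name string and grouping by identifier can give different answers.

```python
from collections import defaultdict

# Hypothetical records: two name forms for one person, and one name
# string shared by two different people.
records = [
    {"title": "First work", "creator": "Smith, Jane",
     "creator_uri": "https://example.org/person/1"},
    {"title": "Second work", "creator": "Smith, J. (Jane)",
     "creator_uri": "https://example.org/person/1"},
    {"title": "Third work", "creator": "Smith, Jane",
     "creator_uri": "https://example.org/person/2"},
]


def group_titles(records, key):
    """Group record titles by the value stored under the given key."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["title"])
    return dict(groups)


by_string = group_titles(records, "creator")
by_identifier = group_titles(records, "creator_uri")

print(by_string)
print(by_identifier)
```

Grouping by string splits the first person's works across two headings and merges the third work with the wrong person, while grouping by identifier keeps the two people apart. Systems built around string matching have to be reworked to store, display, and exchange such identifiers, which is part of the leap described above.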

Progress can only be made in small steps, and there is still much work to be done to demonstrate the benefits of next generation metadata, a prerequisite if we are to complete the essential step of gaining the support of senior management and buy-in from system suppliers.

If we don’t lead, will someone else take over?

Towards the end of the session, a brief discussion arose around the possibility (and danger) of organizations outside the library sector “taking over” if we can’t manage the transition ourselves. Amazon was cited as already being regarded as a good model to follow for metadata standards, despite what we know to be its shortcomings: it does not promote high quality data, and there are numerous problems concealed within the data that are not evident to non-professionals. These quality issues would become very problematic if they were allowed to become pervasive in the global metadata landscape.

“Our insistence on ‘perfect data’ is a good thing, but are people just giving up on it because it’s too difficult to attain?”

About the OCLC Research Discussion Series on Next Generation Metadata

In March 2021, OCLC Research conducted a discussion series focused on two reports:

The round table discussions were held in different European languages and participants were able to share their own experiences, get a better understanding of the topic area, and gain confidence in planning ahead.

The Opening Plenary Session opened the forum for discussion and exploration and introduced the theme and its topics. Summaries of all eight round table discussions are published on the OCLC Research blog, Hanging Together. This post is preceded by the posts reporting on the first English session, the Italian session, the second English session, the French session, the German session, and the Spanish session.

The Closing Plenary Session on April 13 will synthesize the different round table discussions. Registration is still open for this webinar: please join us!

Titia van der Werf

Titia van der Werf is a Senior Program Officer in OCLC Research based in OCLC’s Leiden office. Titia coordinates and extends OCLC Research work throughout Europe and has special responsibilities for interactions with OCLC Research Library Partners in Europe. She represents OCLC in European and international library and cultural heritage venues.