The OCLC Research Library Partnership Metadata Managers Focus Group met in January 2025 to explore community cases for the use and re-use of linked data URIs found in MARC records. Members were invited to bring examples of work they are doing or have seen that activates linked data augmentations to MARC records in their discovery systems and integrations.
The round-robin of three separate discussions included 38 participants from 25 RLP institutions in 5 countries:
British Library | Rutgers University | University of Illinois at Urbana-Champaign |
Cleveland Museum of Art | Smithsonian Institution | University of Kansas |
Cornell University | Tufts University | University of Leeds |
National Gallery of Art | University of Arizona | University of Pittsburgh |
National Library of Australia | University of Calgary | University of Southern California |
National Library of New Zealand | University of California, Los Angeles | University of Sydney |
New York University | University of California, Riverside | University of Tennessee, Knoxville |
Princeton University | University of Chicago | Virginia Tech |
Radboud University | Yale University |
These discussions explored the technical and social challenges of transitioning to next-generation metadata formats. This blog post synthesizes the key topics that came up in the discussions, concerning URIs in MARC, the use of AI, and more general aspects of getting started with linked data in library metadata.
Background
Like many complex systems, our transition to the next generation of metadata has both technical and social challenges. Over the past decade, the metadata community has been on a journey to transform the way we work from “strings” of description in MARC records toward knowledge graph structures that identify different entities and their relationships. Building a graph structure is a bit like building an arch: you need all the pieces in place for it to be self-supporting. For that reason, we’ve also had to deploy scaffolding to temporarily support our transition. One piece of this scaffolding is “linky MARC,” which extends the MARC format to include linked data URIs. Emerging from the PCC Task Group on URIs in MARC, the inclusion of these URIs facilitates the move toward next-generation metadata formats such as BIBFRAME.
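To make “linky MARC” a little more concrete, here is a minimal Python sketch of a heading field that carries URIs alongside its text label. It assumes the convention from the PCC task group's work, in which subfield $0 holds an authority record identifier or URI and subfield $1 holds a real-world object URI. The field is modeled as a plain dict rather than with any MARC library, and the example.org URIs and the helper names `extract_uris` and `label` are made up for illustration.

```python
# Minimal sketch: a MARC field modeled as a plain dict (not a real MARC library).
# The URIs below are placeholders on example.org, not real identifiers.
field_100 = {
    "tag": "100",
    "indicators": ("1", " "),
    "subfields": [
        ("a", "Austen, Jane,"),
        ("d", "1775-1817"),
        ("0", "https://example.org/authority/n0000001"),  # authority record URI
        ("1", "https://example.org/entity/person-1"),     # real-world object URI
    ],
}

# Subfields assumed to carry identifiers rather than display text.
URI_SUBFIELDS = {"0", "1"}


def extract_uris(field):
    """Return (subfield code, URI) pairs for $0 and $1 values that are HTTP URIs."""
    return [
        (code, value)
        for code, value in field["subfields"]
        if code in URI_SUBFIELDS and value.startswith(("http://", "https://"))
    ]


def label(field):
    """Join the non-URI subfields into the text string a catalog would display."""
    return " ".join(
        value for code, value in field["subfields"] if code not in URI_SUBFIELDS
    )
```

The point of the sketch is that the “string” and the “thing” travel together: `label(field_100)` still gives the familiar heading, while `extract_uris(field_100)` gives a system something it can follow into a knowledge graph.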

As part of OCLC’s linked data strategy, we have added more than 500 million WorldCat Entities URIs to WorldCat records for five entity types: Persons, Organizations, Places, Events, and Works. Enriching WorldCat bibliographic records with WorldCat Entities URIs establishes a bridge between MARC data and linked data, providing a starting point for connecting data across local systems and workflows and for using linked data functionality, such as in local discovery systems. These URIs and associated WorldCat Entities data are free for anyone to retrieve and use, through our website or via API, under a Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license. (Meridian subscribers also have creating/editing privileges.)
While generative AI may be pulling at our attention, it’s important to remember that linked data also emerged from an earlier era of symbolic artificial intelligence systems. Developers of the semantic web needed to present factual metadata statements and a mechanism for machines to interpret them in reasoning activities. In current conversations about AI, this has largely been replaced by the kinds of stochastic prediction methods offered by large language models. However, the work the community has done on linked data over the last decade remains significant. A recent paper highlighted the strengths of knowledge graphs (aka linked data), search engines, and large language models in addressing specific user needs. The authors suggest that integrating these technologies could lead to more effective solutions and emphasize the importance of considering the diverse information needs of users. They believe that research on the combination and integration of these technologies, “with users in mind,” will be “particularly fruitful.”
The January 2025 Metadata Managers Focus Group round-robin session focused on the following discussion prompts:
- What do you want to do with linked data URIs being added to MARC records?
- What are you doing that activates linked data augmentations to MARC records in your discovery systems and integrations?
The discussions revealed that members are navigating many issues:
- While linked data is of interest, regular work and Library Service Platform (LSP) migrations leave little room for experimentation or even adoption. Linked data is neither a driver for moving systems nor a support for the move in any way.
- Metadata managers need tools that support seamless workflows, with a focus on data quality and trust in features and sources.
- While few attendees are yet using OCLC URIs, the need for linked data entities in repositories and cultural heritage collections is clear.
- Generative AI and machine learning present opportunities for creating and improving linked data. But metadata managers are also concerned about maintaining awareness of when AI has been used in metadata workflows.
- Participants wish to better understand what users want and need from linked data, especially as generative AI shakes up how researchers interact with our data.
Little time to think about linked data
While all participants were eager to learn about the benefits of URIs in MARC, most shared that there was little time to think about linked data. The immediate focus for many of them was the implementation of a new library service platform or the preparation for one. The latter would require prepping existing metadata for migration to the new platform in addition to keeping up with regular workflows, leaving little room for anything else.
This raises an important question about how the presence of linked data URIs in the metadata could make platform migrations smoother in the future.
Participants also shared that, at this point in time, linked data is not yet a driver for switching LSPs.
URIs: Little experience, much curiosity
Few attendees had a clear vision for how they would use OCLC URIs in their systems and workflows, and how linked data features in their chosen LSP would take advantage of them. Also, there was some confusion about why OCLC is adding URIs, and under which license terms these could be used.
However, participants noted the need to provide linked data entities for institutional repositories, data repositories, and cultural collections, and felt that WorldCat Entities could be useful in this area. Person and organization entities were of particular interest, less so subject or classification URIs.
Current linked data tools not yet convincing
A more modern LSP could be an opportunity to use MARC-encoded URIs in new cataloging workflows or user-facing discovery features. However, several participants noted the challenges of working with currently available linked data features. They are seeking seamless tools but finding disjointed environments that aren’t ready for production use.
For example, URIs could be used to expand information on a particular entity and expose that information to users in a knowledge card. This can be an easy configuration change in systems where this is a feature. According to our attendees who are exploring this, metadata quality is still a significant part of making such a feature accurate and useful. Anecdotally, they shared examples where Person cards pointed to the wrong entity or contained incorrect information. Since these cards may rely on external sources, such as Wikidata, correcting them requires the skill and knowledge to update those sources. Even then, it may take time for those changes to propagate to local environments and become visible to end users.
Organizations that have adopted open-source discovery layers may need to dedicate local resources to develop workflows and interfaces that take advantage of linked data URIs.
Even where linked data features are available, if the underlying metadata is not aligned, users may still need to navigate between silos to find what they are looking for. A common example is that it is not always possible to look across both monographs and article-level publications for the same person. Even when ORCID and ISNI persistent identifiers are available, they may not be integrated into a single entity. When expanding beyond just bibliographic materials to include art and special collections, similar challenges remain when entity information hasn’t been unified.
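One way to picture what “integrated into a single entity” could mean is a small clustering step: descriptions that share any identifier are treated as the same person. The Python sketch below does this with a union-find structure. It is only an illustration under simple assumptions: the record keys and identifier strings are invented placeholders, the function name `cluster_by_shared_ids` is hypothetical, and real entity reconciliation would need far more evidence than a shared identifier.

```python
def cluster_by_shared_ids(records):
    """Group record keys that share at least one identifier (union-find)."""
    parent = {key: key for key in records}

    def find(key):
        # Walk up to the cluster root, shortening the path as we go.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    first_seen = {}  # identifier -> first record key that carried it
    for key, identifiers in records.items():
        for identifier in identifiers:
            if identifier in first_seen:
                # Same identifier seen before: merge the two clusters.
                parent[find(key)] = find(first_seen[identifier])
            else:
                first_seen[identifier] = key

    clusters = {}
    for key in records:
        clusters.setdefault(find(key), set()).add(key)
    return sorted(sorted(group) for group in clusters.values())


# Placeholder data: the identifier strings are invented, not real ORCID or ISNI values.
records = {
    "monograph-author": {"isni:0000-hypothetical-1"},
    "article-author": {"orcid:0000-hypothetical-2"},
    "repository-profile": {"isni:0000-hypothetical-1", "orcid:0000-hypothetical-2"},
    "unrelated-person": {"isni:0000-hypothetical-3"},
}

clusters = cluster_by_shared_ids(records)
```

In this toy data, the monograph and article descriptions never share an identifier directly. They only come together because a third description carries both the ISNI and the ORCID, which is the kind of bridging record that is often missing in practice.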
This lack of linked data integration with current systems, tools, and workflows is a barrier to adoption, as is the lack of trust in existing linked data features or sources.
Participants also want to better understand user needs to inform their work. None of our participants actively pursued systematic studies of how their users take advantage of entity information or browse, and there was a consensus that user studies are needed in this area. In addition, much of what we’ve previously learned about users that informs current discovery experiences is being shaken up by access to interactive AI chatbots. We flagged this topic as something to explore in a future session.
Opportunities and concerns about AI-generated metadata
An ongoing concern among metadata managers is the role that AI will play in future cataloging and metadata workflows, especially as we move toward linked data approaches. During our discussion, metadata managers wanted to better understand how we can leverage AI’s advantages responsibly, in part by considering how best to inform others (catalogers, users, etc.) about the provenance of metadata records.
Balancing AI opportunities with quality
Participants with access to emerging AI cataloging assistants shared how these tools can help bootstrap brief records that are augmented in later human-centered workflows. Especially given that both general and special collections face significant backlogs (see our previous discussion about this in Keeping up with next-generation metadata for special collections and archives), having an AI assistant do the initial work of transcribing information from print would save time. Whether we can trust that these agents produce records of sufficient quality remains an area of study and interest.
Needs for consistent AI data provenance
A significant question that emerged in our conversation about the use of AI is how we are providing data provenance statements and at what level of granularity they are needed. For the agents mentioned above and in other early AI-driven MARC workflows, a consensus has developed around using a 588 Source of Description note to indicate that AI was used to create the description.
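As a rough illustration of record-level provenance, the Python sketch below builds a 588 field and checks whether a record carries one that mentions AI. The note wording, the `marker` string, and the function names are all hypothetical, since the discussion did not settle on standard text. Fields are plain dicts rather than objects from a MARC library, and blank indicators are printed as `#`.

```python
# Hypothetical note wording; no standard phrasing is implied.
AI_NOTE = "Description generated with AI assistance; not yet reviewed by a cataloger."


def make_588(note_text):
    """Build a 588 Source of Description note with blank indicators."""
    return {"tag": "588", "indicators": (" ", " "), "subfields": [("a", note_text)]}


def format_field(field):
    """Render a field in a common human-readable MARC display style."""
    indicators = "".join("#" if ind == " " else ind for ind in field["indicators"])
    subfields = " ".join(f"${code} {value}" for code, value in field["subfields"])
    return f"{field['tag']} {indicators} {subfields}"


def has_ai_provenance(record_fields, marker="AI assistance"):
    """True if any 588 $a in the record contains the (assumed) marker text."""
    return any(
        field["tag"] == "588" and marker in value
        for field in record_fields
        for code, value in field["subfields"]
        if code == "a"
    )


record = [
    {"tag": "245", "indicators": ("1", "0"), "subfields": [("a", "Example title")]},
    make_588(AI_NOTE),
]
```

A free-text note like this works at the level of the whole record. It cannot say which individual fields an AI tool produced, which is one reason the question of granularity came up.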
Because linky MARC enables us to record a URI alongside textual labels, these URIs can also serve as a guide to data provenance. Participants noted that an LC/NACO Authority File record contains a great deal of information about how the authority was established and provides human-readable information that catalogers can use to make judgments about whether to trust it. This kind of trust can easily be extended to the linked data entities and URIs at id.loc.gov. How, then, can we express our confidence in different linked data sources in a way that replicates catalogers’ professional judgments, especially if those judgments were made by an AI agent? When presented with repeated entities that use different URIs, how do we choose among them? As one participant suggested, there may not be one single answer to this question. Rather, we may need to develop application profiles that express different levels of trust for different environments and use cases.
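The idea of profiles with different levels of trust can be sketched in a few lines of Python. In this hypothetical example, each profile ranks URI sources by hostname, and the preferred URI for an entity is the one from the highest-ranked source. The profile names, the scores, and the identifiers in the sample URIs are invented for illustration, and treating id.oclc.org as the WorldCat Entities hostname is an assumption. A real application profile would likely consider much more than the hostname.

```python
from urllib.parse import urlparse

# Hypothetical trust rankings per environment (higher means more trusted).
PROFILES = {
    "cataloging": {"id.loc.gov": 3, "id.oclc.org": 2, "www.wikidata.org": 1},
    "discovery": {"www.wikidata.org": 3, "id.oclc.org": 2, "id.loc.gov": 1},
}


def choose_uri(uris, profile_name):
    """Return the URI whose source ranks highest in the profile, or None."""
    if not uris:
        return None
    profile = PROFILES[profile_name]
    # Sources missing from the profile get a score of 0 (untrusted).
    ranked = [(profile.get(urlparse(uri).hostname, 0), uri) for uri in uris]
    best_score, best_uri = max(ranked, key=lambda pair: pair[0])
    return best_uri if best_score > 0 else None


# Sample URIs for one entity; the identifiers at the end are made up.
candidates = [
    "https://www.wikidata.org/entity/Q0",
    "https://id.loc.gov/authorities/names/n0000000",
    "https://id.oclc.org/worldcat/entity/E0",
]
```

With these made-up rankings, the same three URIs resolve to the id.loc.gov URI under the “cataloging” profile and to the Wikidata URI under the “discovery” profile, which is one way to read the suggestion that there may not be a single answer.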
Final thoughts
This blog post started by talking about how we’re building a linked data ecosystem that requires some scaffolding to support emergent data structures and workflows. While metadata managers are shepherding their organizations onto new platforms, LLM-powered AI tools are also disrupting how we think about discovery and meeting user needs. Just as Google search upturned our concept of the OPAC and ushered us into an era of modern discovery layers, AI is challenging us to reimagine what we are doing with linked data. An example of this is the Library of Congress’ “Modern MARC” approach, which will further extend “linky” MARC by including additional identifiers and embracing linked data modeling choices that are better suited for translation into knowledge graphs. Beyond displaying knowledge cards in user interfaces, knowledge graph structures can fulfill their original promise by minimizing LLM hallucinations through metadata trusted by the humans in the loop.
Writing this blog post was a collaborative effort. Many thanks to all those who contributed, in particular my colleagues Rebecca Bryant, David Heimann, Erica Melko, Mercy Procaccini, Merrilee Proffitt and Chela Weber.

Dr. Annette Dortmund leads OCLC’s European product management and research concerned with next-generation metadata solutions for libraries and other cultural heritage institutions, with a particular focus on persistent identifiers in scholarly communication and library linked data. She also coordinates and supports European research and engagement programs for the OCLC Research Library Partnership.