This is the fourth post in a short blog series on what we learned from the OCLC RLP Managing AI in Metadata Workflows Working Group. This post was co-authored by Rebecca Bryant and Annette Dortmund.
Artificial intelligence (AI) holds significant potential for improving metadata workflows, offering tools to enhance efficiency, improve discovery, and address long-standing challenges in libraries. Yet, as with any transformative technology, AI adoption requires thoughtful consideration of its limitations, ethical implications, and impact on professional practice. The key is finding the right balance—one that leverages AI’s capabilities while maintaining the quality and professional standards that libraries depend on.
From April to June 2024, the OCLC Research Library Partnership (RLP) convened the Managing AI in Metadata Workflows Working Group. This working group brought together metadata managers to explore how AI could be integrated into cataloging, special collections, and institutional repository workflows. Across these discussions, librarians and archivists expressed both enthusiasm and caution about AI adoption, and a set of cross-cutting themes emerged—insights that extend beyond specific workflows and highlight the opportunities and challenges of responsible AI adoption in libraries.
This blog post—the final of a four-part series—synthesizes key themes, including the critical importance of metadata quality, the need for ethical standards and transparency, the evolving roles of metadata professionals, and the responsibility to adopt sustainable AI practices. These insights, combined with emerging best practices from organizations like OCLC, point toward a future where AI enhances rather than replaces human expertise in metadata work.
Quality and reliability of metadata is essential
A fundamental theme across all discussions was the critical importance of metadata quality. Working group participants consistently stated that creating records using AI is counterproductive if resources are not accurately described or if users are misdirected. This emphasis on quality isn’t a barrier to AI adoption—it’s a framework for responsible implementation.
Several quality considerations emerged repeatedly:
- Hallucinations that introduce false information into catalog records
- Inconsistent outputs from identical inputs, undermining reliability
- Confidence scores that don’t reliably reflect the quality of AI-generated content
- Entity recognition failures where AI-generated results might look syntactically correct but fail to identify the right person, place, or organization
However, these challenges are proving to be drivers of productive innovation rather than insurmountable barriers. OCLC’s approach to AI-powered de-duplication in WorldCat demonstrates how quality concerns can be addressed through hybrid approaches that combine AI efficiency with human expertise. OCLC has worked closely with the cataloging community to help validate its machine learning model’s understanding of duplicate records in WorldCat. To date, OCLC has removed more than 9 million duplicate records from WorldCat as a result of this AI model, which it continues to test and refine. The process includes conservative decision-making protocols and human oversight for complex cases, showing how AI can scale quality work rather than compromise it.
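The shape of such a conservative protocol can be sketched in a few lines. This is a minimal illustration, not OCLC’s actual model: the similarity function, field names, and thresholds are all hypothetical stand-ins for a far more sophisticated system. The key idea is that the model only acts autonomously on high-confidence matches and routes borderline pairs to a cataloger.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds -- illustrative only, not OCLC's actual values.
AUTO_MERGE = 0.95   # act autonomously only on very clear duplicates
REVIEW = 0.80       # route borderline pairs to a human cataloger

def similarity(a: dict, b: dict) -> float:
    """Toy record similarity: mean of fuzzy title and author matches."""
    fields = ("title", "author")
    scores = [SequenceMatcher(None, a.get(f, ""), b.get(f, "")).ratio()
              for f in fields]
    return sum(scores) / len(scores)

def dedup_decision(a: dict, b: dict) -> str:
    """Conservative protocol: merge only clear duplicates, send
    ambiguous pairs to human review, otherwise keep both records."""
    s = similarity(a, b)
    if s >= AUTO_MERGE:
        return "merge"
    if s >= REVIEW:
        return "human_review"
    return "keep_both"
```

The design choice worth noting is the asymmetry: a missed duplicate is recoverable, while a wrong merge damages the catalog, so the autonomous threshold sits deliberately high and everything uncertain falls to human judgment.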
These developments are driving productive conversations about human oversight processes, quality control checkpoints, and training approaches that help staff effectively evaluate AI outputs—and that are already yielding practical solutions.
Contextual and cultural knowledge gaps exist
One of the most significant limitations identified by the working group involves AI’s current struggle with contextual and cultural knowledge. Participants noted practical challenges, such as AI transcription systems converting “MARC” to “Mark” or “nomen” to “Newman” in recordings with technical terminology. More broadly, AI systems often lack the deep contextual understanding needed for community-specific terminology or cultural nuances that don’t appear in general training databases.
Rather than viewing these as permanent limitations, the library community is actively addressing them. These challenges highlight an important opportunity: the need for more specialized, task-specific AI tools rather than general-purpose models. OCLC’s experiments with subject analysis and classification prediction demonstrate this approach in action. By grounding AI models in high-quality library metadata—specifically WorldCat data—OCLC is developing tools that understand library contexts better than general-purpose models.
This specialized approach also reinforces the continuing value of librarians’ and archivists’ deep collections knowledge and cultural expertise, positioning AI as a tool that extends rather than replaces professional judgment.
Evolving professional roles and skills: Enhancement, not replacement
Participants expressed genuine interest in AI as a tool for increasing efficiency and freeing metadata specialists from repetitive work to focus on more complex and specialized tasks. At the same time, thoughtful questions emerged about professional development and skill maintenance in an AI-enhanced environment.
Key considerations include how to ensure that new professionals develop foundational skills traditionally gained through tasks like brief record creation—skills that become essential for effectively evaluating AI outputs later in their careers. Experienced catalogers wondered whether spending more time reviewing than creating might impact their ability to identify subtle errors or handle complex materials that require human insight.
These discussions highlight the importance of designing AI implementations as enhancements to human expertise rather than replacements, ensuring that professional development pathways remain robust while leveraging AI’s potential to handle volume and routine tasks. OCLC’s approach exemplifies this philosophy. OCLC’s AI de-duplication project, for instance, doesn’t eliminate human oversight but refocuses it where expertise matters most. As noted by Bemal Rajapatirana, “This approach to de-duplication is not about reducing the role of people—it’s about refocusing their expertise where it matters most. Catalogers can focus on high-value work that connects them to their communities instead of spending hours resolving duplicate records.”
Real-world library examples already demonstrate this potential. The University of Calgary Library successfully redirected 1.5 FTE of staff time to more strategic, higher-level tasks following the implementation of its AI chatbot, showing how AI can create space for the uniquely human aspects of library work rather than diminishing professional roles.
Ethical considerations and standards: Building transparency into practice
Working group members identified several important ethical considerations, with data provenance and transparency emerging as particularly crucial. Participants emphasized the need to track when and how AI contributes to metadata, both for quality control purposes and transparency.
For example, in one case study, an AI tool was given a finding aid and asked to supply personal name headings verified against the LC Name Authority File. The headings it returned looked correctly formulated (e.g., “Bukowski, Charles, 1920-1994,” with dates added), and the tool even claimed they had been verified, but they were not the correct authorized headings (the authorized form is “Bukowski, Charles”). In cases like this, provenance information indicating that a heading was AI-contributed could trigger human review for quality control.
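This kind of provenance-triggered review can be sketched simply. The snippet below is a hypothetical illustration, not a real authority-control API: the hard-coded authorized form and the function name are placeholders. It shows the logic described above, where any AI-contributed heading that does not exactly match an authorized form lands in a review queue rather than going straight into the record.

```python
# Hypothetical snapshot of authorized forms (here, one LC NAF example).
# In practice this check would run against the full authority file.
AUTHORIZED = {"Bukowski, Charles"}

def review_queue(proposed_headings, source="AI"):
    """Return headings needing cataloger review: any AI-contributed
    heading whose exact string is not a known authorized form.
    Headings from human catalogers pass through untouched."""
    if source != "AI":
        return []
    return [h for h in proposed_headings if h not in AUTHORIZED]
```

Under this sketch, the plausible-looking but unauthorized “Bukowski, Charles, 1920-1994” would be flagged for a cataloger, exactly because the provenance metadata marked it as AI-contributed.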
OCLC has responded to community questions about data provenance for AI-generated metadata by updating WorldCat documentation and providing guidance through programs like AskQC Office Hours. OCLC’s Bibliographic Formats and Standards (BFAS) now includes instructions for recording AI-generated metadata in bibliographic records in section 3.5. Readers may also find it useful to consult the August 2025 AskQC Office Hours session.
Questions also arose about the lifecycle of AI-generated metadata: When does AI-generated content become simply “cataloger-reviewed content,” similar to copy cataloging workflows? How do we balance transparency with practical workflow considerations? These discussions reflect the library community’s commitment to responsibly working through the practical implications of new technologies.
Environmental awareness and responsibility
Participants expressed concerns about AI’s environmental impacts, indicating a preference for less energy-intensive solutions when they prove similarly effective. Rather than viewing this as a barrier, metadata managers identified a need for accessible information about the environmental impact of different AI applications, enabling informed decision-making and meaningful conversations with their teams about responsible implementation choices.
OCLC’s approach to AI development reflects this environmental consciousness. The WorldCat de-duplication model is designed to be computationally efficient, reducing unnecessary resource use while maintaining high-quality results. As Rajapatirana explains, “by optimizing AI’s footprint, we ensure that de-duplication remains cost-effective and scalable for the long term.” This environmental consciousness reflects the library community’s broader commitment to sustainability and responsible technology adoption, suggesting opportunities for training and information sharing about library AI energy impacts.
Conclusion
The concerns and opportunities described in this blog post reflect a community that is actively thinking through the implications of an emerging technology, rather than simply adopting it. The clearly articulated need for specialized AI tools, quality frameworks, and ethical guidelines is driving innovations that address current limitations.
Working group participants’ emphasis on maintaining professional expertise while leveraging AI’s capabilities suggests a thoughtful approach to technology integration that preserves what makes library work valuable while enhancing its impact.
The RLP Managing AI in Metadata Workflows working group provided the opportunity for metadata managers to identify important implications for AI usage in metadata workflows. This blog series distills those insights, and we hope that these observations will offer useful guidance to the library community as it collectively navigates technological change.
NB: As you might expect, AI technologies were used extensively throughout this project. We used a variety of tools—including Copilot, ChatGPT, and Claude—to summarize notes, recordings, and transcripts. These were useful for synthesizing insights for each of the three subgroups and for quickly identifying the types of overarching themes described in this blog post.
Rebecca Bryant, PhD, previously worked as a university administrator and as community director at ORCID. Today she applies that experience in her role as Senior Program Officer with the OCLC Research Library Partnership, conducting research and developing programming to support 21st century libraries and their parent institutions.