
Hanging Together

the OCLC Research blog


German round table on next-generation metadata: Formats, contexts and deficits

March 26, 2021 - by Annette Dortmund
OCLC metadata discussion series

As part of the discussion series on Next Generation Metadata, this blog post reports back from the German-language round table discussion held on the morning of March 10, 2021. (A German translation of this post is available here.)

Participants from Germany, Switzerland, and Hungary represented national libraries, state libraries, university libraries, and special libraries; combined, they had backgrounds in metadata and collection development, open access and automated subject indexing, metadata concepts and entity management – all the ingredients for a lively and varied discussion.  

Mapping exercise

Map of next-gen metadata projects (German session)

As in all other round table discussions, taking stock of projects in the region was the first step. It resulted in a map mural of projects that showed strong activity in the quadrant of bibliographic data and some additional activity in the other quadrants: research information management (RIM), scholarly communications, and cultural heritage.

Formats and contexts

The “MARC21 –> BIBFRAME” note on the map immediately sparked a general discussion about the suitability of data “formats” in different contexts. While there was agreement that BIBFRAME was more suitable and flexible than MARC, it, too, has its limitations. To actually exchange data, agreements need to be in place (and adhered to!) on how the standard is used. And bridging different types of data is not one of BIBFRAME’s strengths.  

As one participant noted: 

The separation of title and authority data is no longer valid, as in the future all types will be part of one big graph. 

Participants noted the need to create gangways between different data sources. New platforms need to be modular and scalable enough to accommodate the particularities of the various participating institutions. Moving away from authority files toward identity management allows libraries to link, for example, research data with classic library data. Other libraries create cross-references to other systems, as in the coli-conc project, or enrich their catalogs with links to additional information from external sources. Customers want to find information, regardless of where it is and where it comes from. Meaningful links can be created without introducing new rules and building a complex new infrastructure.

The nascent Hungarian National Library Platform (also shown on the map) focuses on a graph model that stores triples rather than MARC data. That way, the data is not tied to a specific format and the platform can serve multiple sectors, while exchange formats can still be created as needed to accommodate specific needs.
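
The principle of storing format-neutral statements and deriving an exchange format only on request can be sketched in a few lines of Python. This is an illustration of the general idea, not the Hungarian platform's actual design: the identifiers (`ex:book1`, `ex:person1`), the predicate names, and the helpers `describe` and `to_flat_record` are all hypothetical.

```python
from collections import defaultdict

# Format-neutral statements: (subject, predicate, object).
# All identifiers and predicate names are made up for illustration.
TRIPLES = {
    ("ex:book1", "title", "An Example Title"),
    ("ex:book1", "creator", "ex:person1"),
    ("ex:book1", "language", "hu"),
    ("ex:person1", "name", "Example Author"),
}


def describe(subject, triples):
    """Collect all predicate -> values pairs stated about one subject."""
    description = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            description[p].append(o)
    return {p: sorted(values) for p, values in description.items()}


def to_flat_record(subject, triples):
    """Derive a flat exchange record on demand, resolving creators to names."""
    record = describe(subject, triples)
    creators = []
    for creator_id in record.get("creator", []):
        # Fall back to the bare identifier if no name is stated.
        creators.extend(describe(creator_id, triples).get("name", [creator_id]))
    if creators:
        record["creator"] = creators
    return record


if __name__ == "__main__":
    print(to_flat_record("ex:book1", TRIPLES))
```

The stored triples never change shape; only the export function would differ for each exchange format a sector asks for.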

Another relevant project in this area, listed in the research information management quadrant of the map, is Metagrid, a project that links data from the digital humanities with other data, including authority files such as the GND. However, authority files never have enough historical detail or the fine-grained information that historians would need, which again emphasizes the need for creating gangways between data sources so that communities can benefit from one another’s work. We cannot all do everything, a participant warned.

Library-specific formats still have their role in specific contexts. National libraries publishing national bibliographies need to do so following a reliable set of rules, even though these very rules might become obsolete in other contexts.

At the same time, library data finds itself next to data of a very different kind. One example is the tax-funded Swiss e-government data initiative, the E-government Schweiz portal: all data that is not confidential has to be made available to all citizens. Library data is published next to weather data and the like; it is published as RDA data, and these triples can be used for any application. There is no way to foresee what users might one day do with this data, perhaps in combination with other data, which is also very exciting!

How can we integrate the libraries’ unique assets and strengths into the linked data world?  

Auto-indexing needs language tagging 

Another theme that emerged quite strongly was that of automated subject indexing and resulting data requirements.  

Current metadata has serious deficits in machine-readability. For example, author keywords, abstracts, and similar elements are needed in the metadata records to enable auto-indexing. This calls for a shift in how data is handled, which types of data are needed, how data is stored, and how it is typed.

Multilingualism is another big challenge in this context. Current authority data is modelled to have one preferred language. Future authority data needs to be modelled more flexibly, as in Wikidata, where a term has labels in more than one language (as in the example of FIFA mentioned during the session).
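
To make the modelling difference concrete, here is a minimal sketch of an authority entity with no single preferred language, in which labels are keyed by language code and a consumer picks the one it needs. The `Entity` class, the identifier `ex:org1`, and the label strings are hypothetical and are not copied from Wikidata or any authority file.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """An authority entity whose labels are keyed by language code."""
    identifier: str
    labels: dict = field(default_factory=dict)  # e.g. {"de": "...", "en": "..."}

    def label(self, preferred, fallbacks=("en",)):
        """Return (label, language) for the first language that has a label."""
        for lang in (preferred, *fallbacks):
            if lang in self.labels:
                return self.labels[lang], lang
        return None, None


# Hypothetical entity with illustrative labels.
org = Entity(
    identifier="ex:org1",
    labels={
        "en": "International Football Federation",
        "de": "Internationaler Fußballverband",
        "fr": "Fédération internationale de football",
    },
)

if __name__ == "__main__":
    print(org.label("de"))  # label available in the requested language
    print(org.label("hu"))  # no Hungarian label, so the fallback is used
```

Returning the language alongside the label matters here: when a fallback is used, the caller still knows which language the string is actually in.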

For auto-indexing, all metadata elements need to be language-coded so that it is obvious to machines, not just users, which language is used for a given element or string. Librarians sometimes think that indicating the language of the document should be sufficient but that is not the case. This is both a coordination and a staffing problem.  

Automatic language-detection scripts are part of the solution, but they come with a certain fuzziness, participants noted. Maybe, one participant suggested:

If we can get automatic subject tagging to work well, librarian staff could be freed up for language tagging.
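
One way to work with the fuzziness of automatic detection is to accept a guess only above a confidence threshold and route everything else to staff. The sketch below illustrates that division of labour with a deliberately naive stopword counter standing in for a real detector. The stopword lists, the function names, and the 0.8 threshold are assumptions for illustration, not a recommendation from the round table.

```python
# Tiny stopword lists; a real detector would use a trained model.
STOPWORDS = {
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "eine"},
    "en": {"the", "and", "is", "not", "a", "an", "of", "in"},
    "fr": {"le", "la", "les", "et", "est", "un", "une", "des"},
}


def guess_language(text):
    """Return (language, confidence) from stopword hits; confidence is in [0, 1]."""
    words = text.lower().split()
    hits = {lang: sum(w in sw for w in words) for lang, sw in STOPWORDS.items()}
    total = sum(hits.values())
    if total == 0:
        return None, 0.0  # nothing to go on, e.g. a one-word title
    best = max(hits, key=hits.get)
    return best, hits[best] / total


def tag_or_review(text, threshold=0.8):
    """Tag the string automatically only when the guess is confident enough."""
    lang, confidence = guess_language(text)
    if lang is not None and confidence >= threshold:
        return {"value": text, "lang": lang, "status": "auto"}
    return {"value": text, "lang": None, "status": "needs review"}


if __name__ == "__main__":
    for s in ("Die Bibliothek ist nicht das Ende der Welt",
              "the cat und die maus",
              "Faust"):
        print(tag_or_review(s))
```

Short strings and mixed-language strings are exactly the cases that fall below the threshold, which is where human language tagging would be spent.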

Scaling the effort could also be beneficial. Currently, auto-indexing initiatives are often just local, and cooperation with library networks can be slow and tedious, participants observed. Cooperating internationally has its benefits, especially when cooperating with those much further ahead. The Finnish National Library, for example, develops solutions in this area and provides them for local deployment.  

Linked data efforts, too, should not be limited to local or regional scales but should, if possible, take place at the national level with strong links to an international infrastructure. At least in Germany, many initiatives are traditionally linked to library networks and are thus regional in scale, which can sometimes be a barrier to scaling up, one participant observed.

Librarians need to revisit their understanding of their role.  

Often, when discussing next generation metadata topics, it comes down to priorities. Can we re-use more of the data generated upstream, by publishers, producers, and universities, without spending much time on creating it again in our libraries, to free up staff for other work? A difficult topic to raise with cataloguers, though, participants felt.

And it is not just cataloguers … As a profession, we need to challenge and question positions of libraries which often do not have a broad perspective, one participant suggested. Administration is often slow and sluggish. The library world has not changed that much in the past ten years, unlike other sectors.  

Finally, participants agreed: let us get rid of the “project” concept and instead acknowledge that this is an ongoing effort which needs appropriate staffing, permanent positions, and sufficient financial resources! In the realm of next generation metadata at least, we should no longer be working on a “project” basis.

About the OCLC Research Discussion Series on Next Generation Metadata   

In March 2021, OCLC Research conducted a discussion series focused on two reports:  

  1. “Transitioning to the Next Generation of Metadata”
  2. “Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project”.

The round table discussions were held in different European languages, and participants were able to share their own experiences, get a better understanding of the topic area, and gain confidence in planning ahead.

The Opening Plenary Session opened the forum for discussion and exploration and introduced the theme and its topics. Summaries of all eight round table discussions are published on the OCLC Research blog, Hanging Together. This post is preceded by the posts reporting on the first English session, the Italian session, the second English session, and the French session.

The Closing Plenary Session on April 13 will synthesize the different round table discussions. Registration is still open for this webinar: please join us!  

Annette Dortmund

Dr. Annette Dortmund is a Senior Product Manager and Research Consultant at OCLC. Her work focuses on library roles and needs in the realm of non-traditional metadata, as related to research support, scholarly communications or knowledge work. She is also interested in system and social interoperability. Based in Germany, her interest is predominantly in European developments and trends.


© 2020 OCLC || ISSN 2771-4802