In trying to make our metadata work harder, as we have with WorldCat Identities, WorldCat Genres, and other such projects, OCLC Research spends a lot of time looking at what our collective metadata holds. And frankly, in some ways I think it holds less than you might expect.
For example, a while back my colleague Karen Smith-Yoshimura, as part of the work she was leading to “gather evidence to inform changes in MARC metadata practices”, produced a scatterplot of the number of times various MARC elements appear in WorldCat records. The vast majority of elements fell to the bottom of the chart, at a very low occurrence rate. But as with any scatterplot, the point is to identify the outliers. The outliers in this case are the elements that appear in a large number of records: what might be considered “core” elements, used to describe the vast majority of library-owned material.
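To make the idea concrete, here is a minimal sketch of how one might find such outliers. It is not the analysis behind the scatterplot. It assumes each record has already been reduced to a plain list of MARC tags, counts a tag once per record rather than once per occurrence, and uses made-up sample data. The names `tag_coverage` and `core_tags` and the 0.75 cutoff are hypothetical choices for illustration only.

```python
from collections import Counter


def tag_coverage(records):
    """Return {tag: share of records that contain the tag at least once}."""
    counts = Counter()
    for tags in records:
        # set() so a repeated field (e.g. several 650s) counts once per record
        counts.update(set(tags))
    total = len(records)
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def core_tags(records, threshold=0.75):
    """Return the sorted tags that appear in at least `threshold` of the records."""
    coverage = tag_coverage(records)
    return sorted(tag for tag, share in coverage.items() if share >= threshold)


# Made-up sample: each inner list holds the MARC tags present in one record.
sample_records = [
    ["040", "100", "245", "260", "300", "650"],
    ["040", "245", "260", "300", "500", "700"],
    ["040", "245", "260", "300", "650", "650"],
    ["040", "245", "246", "300"],
]

print(core_tags(sample_records, threshold=0.75))
```

Where to draw the cutoff is a judgment call. On a real corpus the shape of the distribution, with a few tags near the top and a long tail near zero, matters more than any single threshold.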
Those “outliers” can be categorized according to three general purposes:
- Provenance and Identity: identifiers (e.g. ISBN, OCLC number) and cataloging source (040)
- Elements useful for discovery: title statement (245), personal names (100, 700) and subject (650)
- Elements useful for understanding and evaluation: publication statement (260), physical description (300), and notes (500)
That’s it. In a nutshell, you have the very core of bibliographic description as defined by librarians over the last century or so. Are all other MARC elements useless? That’s not necessarily what I’m suggesting, although I do believe the evidence calls into question the utility of a number of MARC elements. What I’m really trying to say is that if you want to know what librarians feel is useful or important in bibliographic description for the vast majority of library-owned content, you only have to look at the evidence. It’s no more and no less than what is described above.
Image courtesy of Jeff Kubina, Creative Commons Attribution-ShareAlike 2.0 Generic License