What do MARC descriptions of archival materials really look like?

Taking Our Pulse showed that 44% of the archival materials held by the surveyed institutions had no online record. Colleagues worldwide are working hard to improve this sorry situation. In the meantime, I hope you agree with me that doing the work as effectively and efficiently as possible is essential.

At the same time, our professional community is developing new data structures for storing and communicating library and archival descriptions to meet 21st-century needs. This includes enabling improved discovery through the promise of linked open data. As we move forward in that work, what can we learn from past practice? As my OCLC Research colleague Karen Smith-Yoshimura showed in her 2010 report on MARC field usage, fewer than thirty of the more than two hundred fields in MARC21 have been used in 10% or more of all WorldCat records. It makes one wonder: should we be carrying all those data elements forward? Do we need such granular data structures if so many bits are little-used? Would simpler approaches improve workflow efficiencies without sacrificing effectiveness for identification and discovery of unique materials?

These are among the questions lurking behind my current project to look at data element usage in the four million WorldCat records that describe archival materials. Some others:

How do we define “archival materials” for purposes of such a study?
Is archival use of MARC accurate and fulfilling its potential?
How does archival description differ across types of material?
Are archival materials usually described as collections?
Does the archival control byte (Leader 08) capture all archival descriptions?
How often is DACS specified as the content standard?
To what extent have DACS minimum requirements been met?
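Several of these questions come down to reading single bytes of the MARC leader. As a rough illustration only (not the method used in the study), here is a minimal Python sketch that pulls out Leader 06 (type of record), Leader 07 (bibliographic level) and Leader 08 (type of control) from a 24-character leader string. The function name and the sample leaders are hypothetical; the code values ("p" for mixed materials, "c" for collection, "a" for archival control) follow the MARC 21 bibliographic format.

```python
# Illustrative sketch only: the function name and sample leaders are made up.
# Code meanings follow the MARC 21 bibliographic Leader definition.

def describe_leader(leader: str) -> dict:
    """Pick out the leader bytes most relevant to archival description."""
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader is always 24 characters long")
    return {
        "record_type": leader[6],                 # Leader 06, e.g. "p" = mixed materials
        "bib_level": leader[7],                   # Leader 07, e.g. "c" = collection, "m" = monograph/item
        "is_collection": leader[7] == "c",
        "archival_control": leader[8] == "a",     # Leader 08: "a" = archival, blank = not specified
    }


# A mixed-materials collection under archival control (hypothetical leader).
collection = describe_leader("01234npcaa2200000 a 4500")

# A single graphic item with no archival control specified (hypothetical leader).
item = describe_leader("00000nkm a2200000 a 4500")
```

In this sketch, `collection["archival_control"]` is true and `item["archival_control"]` is false, which is the distinction behind the question of whether Leader 08 captures all archival descriptions.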

And, referring back to my comments above, the bonus question: What implications for “next-gen” cataloging do the data suggest?

Last Thursday I presented an OCLC Research Library Partnership work-in-progress webinar to offer a first look at the data, solicit feedback on how I’ve approached the analysis, and lob a few tentative recommendations. Take a look at the slides and the recording.

First, the overall profile of the dataset by broad types of material: Visual materials are represented by the largest number of records (1.4 million), followed by mixed (1.3 million) and textual materials (800,000). More than 300,000 recordings are included, over 120,000 music scores, and much smaller quantities of several other types of material.

Here are a few of the data points I find interesting:

The record type byte (Leader 06) is used incorrectly in some significant ways.
Archival descriptive standards are specified in only 20% of records.
Twenty-five percent of mixed materials are described as items, as are up to 95% of materials in other formats.
Some format-specific note fields are greatly underutilized.
Archival control is specified in only 28% of records.
Cataloging practices reveal format-specific silos.
One-third of records link (856) to digital content.
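Figures like these are shares of records in which a given field or code appears at least once. The following Python sketch shows one way such a tally could be computed. It assumes a deliberately simplified, hypothetical record shape (a dict with a "tags" list) rather than real MARC parsing, and the function names and sample data are invented for illustration.

```python
from collections import Counter

# Illustrative sketch only: records are simplified to {"tags": [...]} and the
# sample data below is invented, not drawn from WorldCat.

def tag_usage(records):
    """Return {tag: share of records that contain the tag at least once}."""
    counts = Counter()
    for record in records:
        # set() so a repeated field (e.g. two 856s) counts once per record
        counts.update(set(record["tags"]))
    total = len(records)
    return {tag: n / total for tag, n in counts.items()} if total else {}


def widely_used(usage, threshold=0.10):
    """List the tags present in at least `threshold` of all records."""
    return sorted(tag for tag, share in usage.items() if share >= threshold)


sample = [
    {"tags": ["245", "856"]},
    {"tags": ["040", "245", "856", "856"]},
    {"tags": ["245"]},
]
usage = tag_usage(sample)
common = widely_used(usage, threshold=0.5)
```

With this toy sample, 245 appears in every record, 856 in two of three, and 040 in one of three, so only 245 and 856 pass a 50% threshold. The same counting, run with a 10% threshold, is the kind of measure behind the "used in 10% or more of records" finding mentioned earlier.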

What do these data suggest for the likely success of archival descriptions to connect users with materials? How should practice change going forward?

I’d love to hear your feedback after you take a look at the webinar outputs. A report will be published early in 2016, so please get in touch soon.

And, in the meantime, I wish you and yours a joyful holiday season!

Jackie Dooley

Jackie Dooley retired from OCLC in 2018. She led OCLC Research projects to inform and improve archives and special collections practice.

2 Comments on “What do MARC descriptions of archival materials really look like?”

Stephanie Bennett says:

December 9, 2015 at 12:01 pm

I’m surprised that so few records describe whole collections and that only a third of records link to digital content, although perhaps I shouldn’t be. And it was fun (and a little nerve-wracking) to go back and look at collections’ catalog records. I haven’t listened to the recording yet, but I’d love to hear more about the silos that affect cataloging.

As far as what else I’d want to know: what limits what folks put into records? Is it possible that it’s a software problem or are there other limitations? For example, why do less than half of DACS-described mixed materials collections include dates in the 245? That seems like an easy win. You probably have that question too, though, and it would be hard to get at.

Lise Summers says:

December 8, 2015 at 1:45 pm

One suggestion would be to look at the way in which MARC records conform to the ICA standards, ISAD(G) in particular, rather than a national std like DACS.
