In July and August, RLG Programs conducted a survey among 18 RLG partners we had selected because they had “multiple metadata creation centers” on campus that included libraries, archives, and museums and had some interaction among them. Our objective was to gain a baseline understanding of current descriptive metadata practices and dependencies. The survey is the first project in our program to change metadata creation processes.
We received 88 responses in all. We expect to issue a report analyzing the responses next month, but the preliminary look is intriguing indeed.
First, we wanted to have a variety of perspectives represented, even within one institution. The responses are spread among those who characterized their immediate work environments as digital library production (38%), archival collections processing and library technical services (33% each), museum collection descriptions (19%), and institutional repositories (14%). Three-quarters of respondents deal with both published and unpublished materials. The respondents describe information resources that are reformatted into digital form, born digital, and analog, in that order. The types of materials described include still images, text, moving images, audio, cultural objects, computer files, Web sites, maps, and natural history objects.
With that kind of diverse representation, it is no surprise that the systems in use were also diverse. Seventy-six respondents listed the tools they used to create metadata. Guess how many tools were named? Over 270 in total, 88 different ones. And the most common? A custom system. Besides an integrated library system, the tool most frequently cited was MS Access. In several cases, a single institution used more than a dozen different tools.
That’s fine as long as systems output standard formats. But as the old saw says, “the great thing about standards is that there are so many to choose from”, and more than a dozen are used, with MARC, Encoded Archival Description, and Qualified Dublin Core the leaders. For the data content standards, 80% use the Anglo-American Cataloging Rules, with DACS (Describing Archives: A Content Standard) second at 39%. Among those that characterized themselves as processing archival collections, DACS is used by almost 60%. Respondents reported using more than a dozen different controlled vocabularies – and almost half build and maintain one or more local thesauri. There was strong support for user-supplied tagging in addition to controlled vocabulary; a small minority (less than 10%) thought user-supplied tags obviated the need for controlled vocabularies.
Just over a third of the respondents do not create any MARC metadata; for those that do, the most common way to expose their MARC metadata to others outside the institution is through a Z39.50 server, with OAI-PMH (Open Archives Initiative – Protocol for Metadata Harvesting) a distant second. About 40% make at least some of their non-MARC metadata available to OAI harvesters. Slightly more than that expose their metadata to search engines.
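For readers unfamiliar with what "available to OAI harvesters" means in practice, here is a minimal sketch of how a harvester forms an OAI-PMH request. The `ListRecords` verb and the `oai_dc` metadata prefix come from the protocol itself; the base URL, the set name, and the function name are hypothetical and stand in for whatever a given institution actually exposes.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the repository's OAI-PMH endpoint (hypothetical here).
    metadata_prefix names the format requested; oai_dc (unqualified
    Dublin Core) is the one every OAI-PMH repository must support.
    set_spec optionally limits the harvest to one set.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, for illustration only.
print(build_list_records_url("https://repository.example.edu/oai"))
```

A harvester would fetch that URL over HTTP and parse the XML response; that step is left out here because it depends on the repository being harvested.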
Just under half of the respondents are able to keep up with additions to the information resources and collections they describe, but almost 90% reported backlogs. Almost half estimated that more than 30% of their collections are not adequately described and are unlikely to be described without additional resources, funding, or both. Thirty-five percent reported ways that they generate some metadata automatically.
We were also interested in seeing the degree to which the different metadata creation centers within a campus worked together. Large majorities reported that there were other units within their institutions describing the same or similar types of materials and that staff worked with these other units. But there is less sharing when it comes to technical infrastructures, discovery environments, descriptive strategies, and metadata creation guidelines.
Is your interest piqued? Stay tuned – we’ll tell you here when the report is available!