
Hanging Together

the OCLC Research blog


Descriptive Metadata for Web Archiving: Read the reports!

March 27, 2018 · September 4, 2020 - by Jackie Dooley

What’s the most widely shared, top-priority web archiving issue across the OCLC Research Library Partnership? We conducted a survey two years ago to explore web archiving needs across the Partnership, and the lack of appropriate descriptive metadata guidelines for archived websites came out on top.

In response, we established the OCLC Research Library Partnership Web Archiving Working Group and recently published three reports as the outcomes of the group’s work.

Our preliminary research confirmed that guidelines would indeed be helpful to encourage consistency of practice. Although a variety of library and archival standards were in use, none addressed the array of conundrums presented by this type of resource, such as these:

  • Should a website owner be identified as the author (or creator), publisher, and/or subject? What about the institution that archives a site or collection of sites?
  • Which types of date are the most meaningful to users? Those designating the start and/or end dates of a site’s existence? The dates that sites in a collection were captured? Dates reflected in the content? The date shown on a page? And how can the meaning of any particular date be made clear?
  • How best should the extent of the resource be expressed to be both meaningful to users and efficient for busy metadata creators? Is the RDA default value “1 online resource” meaningful, or would a statement that includes “website” be an improvement?
  • Would it be useful to blend characteristics of archival and bibliographic description in descriptions of archived sites or collections? Many institutions already do so.
  • Which URL(s) should be included in a descriptive record?
  • Do existing approaches take into account the needs of users?

These and many other questions arose as we studied the relevant standards, compiled institutional guidelines, and examined numerous extant bibliographic and archival records in multiple discovery environments, including ArchiveGrid, WorldCat, and Archive-It. In every context, we found wide variation in both the data elements chosen and the nature of their content. We concluded that web-specific recommendations for descriptive metadata would be helpful, and active outreach to various specialist communities confirmed this.

Our recommended data elements and content guidelines are described in the first report, together with introductory text in which we describe the characteristics of bibliographic and archival description, address issues particular to live and archived websites, and discuss aspects of collection-level and item-level approaches.

We also realized that it would be key to keep the needs and perspectives of users top of mind as we did our research. But what are their needs? We therefore compiled and abstracted more than sixty readings from the web archiving literature that address descriptive metadata issues (at least in part). These are summarized in a second report, which includes an introductory narrative summarizing our analysis and an abstract of each reading.

We also studied other types of metadata that pertain to digital resources, which include technical and administrative metadata in addition to descriptive, and found that a significant gray area exists among them. We heard from colleagues that they needed a basic understanding of existing web harvesting tools to understand their metadata-related functionality. To address this need, we analyzed eleven tools in a third report, with particular attention to the extent to which descriptive metadata can be extracted.

Two more posts this week will go into more depth about each of the reports. As always, we would be delighted to have your feedback on the work.

Jackie Dooley

Jackie Dooley retired from OCLC in 2018. She led OCLC Research projects to inform and improve archives and special collections practice.

© 2024 OCLC || ISSN 2771-4802