Hanging Together

the OCLC Research blog


Services built on usage metrics

September 30, 2015 - by Karen Smith-Yoshimura

That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by Corey Harper of New York University and Stephen Hearn of the University of Minnesota. They had posited that in an environment oriented more toward search than toward browsing indexes, new kinds of services will rely on non-bibliographic data, usage metrics, and data analysis techniques. Metrics can describe usage data (such as how frequently items have been borrowed, cited, downloaded, or requested) or bibliographic data (such as where, how, and how often search terms appear in the bibliographic record). Some kinds of usage data are best collected on a larger scale than most catalogs provide.

These usage metrics could be used to build a wide range of library services and activities. Among the possible services noted: collection management; identifying materials for offsite storage; deciding which subscriptions to maintain; comparing the citations in researchers’ publications with what the library is not purchasing; improving relevancy ranking; personalizing search results; offering recommendation services; and measuring the impact of library usage on research or student success. What if libraries emulated Amazon with “People who accessed <this title> also accessed <these titles>” or “People in the same course as you are accessing <these titles>”?
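As a thought experiment, here is a minimal sketch of how such a co-access recommender could work, assuming circulation records have already been reduced to (user, title) pairs; the sample data and function names are hypothetical:

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical circulation log reduced to (user_id, title) pairs.
checkouts = [
    ("u1", "Intro to Stats"), ("u1", "R for Data Analysis"),
    ("u2", "Intro to Stats"), ("u2", "R for Data Analysis"),
    ("u2", "Linear Models"),  ("u3", "Intro to Stats"),
]

# Group titles by user, then count how often each pair of titles
# was accessed by the same person.
titles_by_user = defaultdict(set)
for user, title in checkouts:
    titles_by_user[user].add(title)

co_access = defaultdict(int)
for titles in titles_by_user.values():
    for a, b in combinations(sorted(titles), 2):
        co_access[(a, b)] += 1

def recommend(title, top_n=3):
    """Titles most often accessed by people who also accessed `title`."""
    scores = {(b if a == title else a): n
              for (a, b), n in co_access.items() if title in (a, b)}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend("Intro to Stats"))  # ['R for Data Analysis', 'Linear Models']
```

A production service would of course need to de-identify users and aggregate at scale, which is exactly where the privacy and infrastructure challenges discussed below come in.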

Harvard Library Innovation Lab’s StackLife aggregates such usage data for library titles as the number of checkouts (broken down by faculty, graduate students, and undergraduates, with faculty checkouts weighted differently), the number of ILL requests, and how frequently the title is placed on course reserve, and then assigns a “Stack Score” to each title. A search on a subject then displays a heat-map graphic in which higher scores appear in darker hues, which can serve as a type of recommender service. The StackLife example inspired other suggestions for possible services, such as combining holdings and circulation data aggregated across multiple institutions, or even across countries, with Amazon sales data, and weighting scores when the author is affiliated with the institution. A recent Pew study found that personal recommendations were the dominant source of book recommendations. Could libraries capture and aggregate faculty and student recommendations mentioned in blogs and tweets?
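A sketch of how a weighted score of this kind might be computed; the weights and field names below are illustrative assumptions, not StackLife’s actual formula:

```python
# Assumed weights: faculty checkouts count more, per the StackLife
# description above; the exact values here are invented for illustration.
WEIGHTS = {
    "faculty_checkouts": 3.0,
    "grad_checkouts": 2.0,
    "undergrad_checkouts": 1.0,
    "ill_requests": 2.5,
    "course_reserves": 4.0,
}

def stack_score(usage: dict) -> float:
    """Weighted sum of the usage signals recorded for one title."""
    return sum(weight * usage.get(signal, 0)
               for signal, weight in WEIGHTS.items())

title_usage = {"faculty_checkouts": 4, "undergrad_checkouts": 10,
               "ill_requests": 2, "course_reserves": 1}
print(stack_score(title_usage))  # 3*4 + 1*10 + 2.5*2 + 4*1 = 31.0
```

Scores like these can then be bucketed into the darker and lighter hues of the heat map described above.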

The University of Minnesota conducted a study[i] to investigate the relationships between first-year undergraduate students’ use of the academic library, academic achievement, and retention. The results suggested a strong correlation between using academic library services and resources (particularly database logins, book loans, electronic journal logins, and library workstation logins) and higher grade point averages.
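For a sense of what such an analysis involves at its simplest, here is a sketch of computing a usage-to-GPA correlation; the numbers are invented, and the actual study used far richer data and methods:

```python
from statistics import correlation  # Python 3.10+

# Invented per-student counts of library logins and end-of-term GPA.
logins = [0, 2, 5, 8, 12, 15, 20, 25]
gpas = [2.1, 2.4, 2.8, 3.0, 3.2, 3.1, 3.5, 3.6]

# Pearson's r: +1 is a perfect positive relationship, 0 is none.
# A high r indicates correlation, not causation.
print(round(correlation(logins, gpas), 2))
```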

Some of the challenges raised in the focus group discussions included:

Difficulties in analyzing usage data: The different systems and databases libraries use present challenges in both gathering and analyzing the data. A number of focus group members are interested in visualizing usage data, and at least a couple are using Tableau to do so. Libraries have managed, with some difficulty, to harvest citations and measure which titles are available in their repositories, but it is even more difficult to demonstrate which resources would not have been available without the library. The variety of resources also means that the people who analyze the data are scattered across the campus in different functional areas. Interpreting Google Analytics to determine patterns of usage over time and the effect of curricular changes is particularly difficult.

Aggregating usage data across campus: Tools that allow selectors to choose titles to send to remote storage based on circulation data and classification range (to assess the impact on a particular area of the stacks) can be hampered when storage facilities use a different integrated library system.

Anonymizing data to protect privacy: Aggregating metrics across institutions may help anonymize data but hinders analysis of performance at an individual institution. Anonymizing data may also prevent breaking down usage metrics by demographics (e.g., professors vs. graduate students vs. undergraduates). Even when demographic data is captured as part of campus logins, libraries cannot know the demographics of people accessing their resources who are not affiliated with the institution.
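One common way to handle this trade-off is to pseudonymize user identifiers and suppress any demographic cell smaller than a minimum size; a minimal sketch, with an assumed threshold and invented data:

```python
import hashlib
from collections import Counter

K_THRESHOLD = 5  # assumed minimum cell size before a count is reported

def pseudonymize(user_id: str, salt: str = "rotate-this-secret") -> str:
    """One-way hash so events can be linked without storing raw IDs."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:12]

# Invented usage events: (user_id, demographic group).
events = [("u1", "faculty"), ("u2", "grad"),
          ("u3", "undergrad"), ("u4", "undergrad"), ("u5", "undergrad"),
          ("u6", "undergrad"), ("u7", "undergrad")]

anon_events = [(pseudonymize(u), group) for u, group in events]

# Report per-group counts, suppressing cells below the threshold.
counts = Counter(group for _, group in anon_events)
report = {group: (n if n >= K_THRESHOLD else "<suppressed>")
          for group, n in counts.items()}
print(report)  # {'faculty': '<suppressed>', 'grad': '<suppressed>', 'undergrad': 5}
```

Suppression protects the small cells (here, faculty and graduate students) at the cost of exactly the per-group analysis the focus group wanted, which is the tension described above.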

Difficulties in correlating library use with academic performance or impact: Some focus group members questioned whether it was even possible to correlate library use with academic performance. (“Are we busting our heads to collect something that doesn’t tell us anything?”) On the other hand, we can at least start making some decisions based on the data we do have, and perhaps libraries’ concern with being “scientific” is not warranted.

Data outside the library’s control: Much usage data lies outside the library’s control (for example, with Google Scholar and Elsevier). Only vendors have access to electronic database logs. Relevance ranking for electronic resources licensed from vendors is a “black box.”

Inconsistent metadata: Inconsistent metadata can dilute the reliability of usage statistics. Examples cited included the same author represented in multiple ways, varying metadata due to changes in cataloging rules over time, and different romanization schemes used for non-Latin script materials. The biggest issue is that most libraries’ metadata comes from external sources, so the library has no control over its quality. The low quality of metadata for e-resources from some vendors remains a common issue; misplaced identifiers for ebooks were cited as a serious problem. Focus group members have pointed vendors to the OCLC cross-industry white paper, Success Strategies for Electronic Content Discovery and Access, without much success. Threats to cancel a subscription unless the metadata improves turn out to be empty when the library’s own selectors object. Libraries do some bulk editing of the metadata, for example reconciling name forms with the LC name authority file (or outsourcing this work) and adding language and format codes in the fixed fields; a sketch of such reconciliation appears below. The best sign of a “reliable vendor” is that it gets its metadata from OCLC. It is important for vendors to view metadata as “community property.”
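As an illustration of the bulk name reconciliation just mentioned, here is a minimal sketch that matches variant headings against a small local snapshot of authorized forms; the normalization rules and sample headings are assumptions for illustration, and real reconciliation would run against the full LC name authority file:

```python
import unicodedata

# Tiny invented snapshot mapping normalized variants to authorized forms.
AUTHORITY = {
    "tolstoy, leo, graf, 1828-1910": "Tolstoy, Leo, graf, 1828-1910",
    "tolstoi, lev, 1828-1910": "Tolstoy, Leo, graf, 1828-1910",
}

def normalize(name: str) -> str:
    """Fold case, strip diacritics and trailing punctuation for matching."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.lower().strip(" .")

def reconcile(heading: str) -> str | None:
    """Return the authorized form for a variant heading, if any."""
    return AUTHORITY.get(normalize(heading))

# A romanization variant and trailing punctuation both normalize away.
print(reconcile("Tolstoĭ, Lev, 1828-1910."))  # Tolstoy, Leo, graf, 1828-1910
print(reconcile("Unknown, Author"))           # None
```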

[i] Krista M. Soria, Jan Fransen, and Shane Nackerud, “Stacks, Serials, Search Engines, and Students’ Success: First-Year Undergraduate Students’ Library Use, Academic Achievement, and Retention,” The Journal of Academic Librarianship 40 (2014): 84-91. doi:10.1016/j.acalib.2013.12.002

Karen Smith-Yoshimura

Karen Smith-Yoshimura, senior program officer, worked on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements. She retired from OCLC in November 2020.

