Thresholds for Discovery

I’m pleased to report that we have an article in the most recent Code4Lib Journal! The article, “Thresholds for Discovery: EAD Tag Analysis in ArchiveGrid, and Implications for Discovery Systems,” is based on an analysis of how EAD (Encoded Archival Description) is used in the ArchiveGrid corpus. We go beyond that to look at EAD-in-use …

Library Authorities Alternatives

Without delving into the dysfunctional politics that has led to the shutdown of most U.S. Government services, we thought it might be helpful to our library and archives colleagues to point out some alternatives to various library authority sources that have been shuttered at the Library of Congress. In some cases (for example, LCSH vs. …

ISBNs in WorldCat

Recently a question came up on the BIBFRAME list about ISBNs, and how many of them were in MARC records. This is just the kind of question that OCLC Research is uniquely placed to answer, so I quickly wrote some simple Perl code to run as a Hadoop streaming job to find out. It was …

“Cataloging Unchained”

Lorcan Dempsey (VP of Research at OCLC) has long said that we need to “make our data work harder.” And for years that is exactly what OCLC Research has been doing. So when I was asked to speak on data mining at the OCLC European, Middle East, and African Regional Council Meeting in Strasbourg, France, …

Top Corporate Names in WorldCat

As I explained earlier, I have been investigating how MARC has been used over the last several decades. Curious about the contents of the 110 $a (corporate names), I parsed it, and the top 30 headings are listed below. Keep in mind a few things, however: Entities can be put together in …

Two Huge Linked Data Announcements

This week we have announced two major initiatives that are now providing significant library linked data resources to the world. First was the announcement yesterday that all of the 23rd Edition of the Dewey Decimal Classification has been released on the web as linked data. From the announcement: All assignable classes from DDC 23, the …

Five Easy Pieces

I seem to have acquired an obsession. This obsession manifests itself in various ways, but one clear way is that I can’t seem to stop thinking about some of the findings from my colleague’s work that resulted in the publication “Implications of MARC Tag Usage on Library Metadata Practices.” Chief among them, in my view, …
