OCLC Control Numbers – Lots of them; all public domain

For the last few years I have been part of a group of OCLC staff charged with articulating data sharing practices that are consistent with the WorldCat Rights and Responsibilities for the OCLC Cooperative. We’ve made good progress towards openness while making expectations and practices more regular and consistent. The recommendation to use the ODC …


ISBNs in WorldCat

Recently a question came up on the BIBFRAME list about ISBNs, and how many of them were in MARC records. This is just the kind of question that OCLC Research is uniquely placed to answer, so I quickly wrote some simple Perl code to run as a Hadoop streaming job to find out. It was …


“Cataloging Unchained”

Lorcan Dempsey (VP of Research at OCLC) has long said that we need to “make our data work harder.” And for years that is exactly what OCLC Research has been doing. So when I was asked to speak on data mining at the OCLC European, Middle East, and African Regional Council Meeting in Strasbourg, France, …


Top Corporate Names in WorldCat

As I explained earlier, I have been doing some investigations into how MARC has been used over the last several decades. Curious about the contents of the 110 $a (corporate names), I parsed it and the top 30 headings are listed below. Keep in mind a few things, however: Entities can be put together in …


Top Topics in WorldCat

As I’ve described in a series of posts recently (“Adventures in Hadoop”, four so far), I’ve been having fun on our new compute cluster. Well, maybe “fun” isn’t exactly the right term for diving into the depths of the MARC format, but hey, librarians have to get their kicks somehow. Anyway, I’ve been doing some …


Registering researchers in authority files

Last month we launched a new task group of OCLC Research Library Partner staff and others who are involved in uniquely identifying authors and researchers in ways that can be shared in a linked data environment. We were spurred by institutions’ need to uniquely identify all their researchers to measure their scholarly output, a factor in reputation …


Two Huge Linked Data Announcements

This week we have announced two major initiatives that are now providing significant library linked data resources to the world. First was the announcement yesterday that all of the 23rd Edition of the Dewey Decimal Classification has been released on the web as linked data. From the announcement: All assignable classes from DDC 23, the …
