“Cataloging Unchained”

Lorcan Dempsey (VP of Research at OCLC) has long said that we need to “make our data work harder.” And for years that is exactly what OCLC Research has been doing. So when I was asked to speak on data mining at the OCLC European, Middle East, and African Regional Council Meeting in Strasbourg, France, …

Top Corporate Names in WorldCat

As I explained earlier, I have been doing some investigations into how MARC has been used over the last several decades. Curious about the contents of the 110 $a (corporate names), I parsed that subfield; the top 30 headings are listed below. Keep in mind a few things, however: Entities can be put together in …

Two Huge Linked Data Announcements

This week we have announced two major initiatives that are now providing significant library linked data resources to the world. First was the announcement yesterday that all of the 23rd Edition of the Dewey Decimal Classification has been released on the web as linked data. From the announcement: All assignable classes from DDC 23, the …

Five Easy Pieces

I seem to have acquired an obsession. This obsession manifests itself in various ways, but one clear way is that I can’t seem to stop thinking about some of the findings from my colleague’s work that resulted in the publication Implications of MARC Tag Usage on Library Metadata Practices. Chief among them, in my view, …

FAST on the street

I’m pleased to say that today OCLC Research released FAST (Faceted Application of Subject Terminology) as linked data under an Open Data Commons Attribution license. FAST has been a multi-year project of OCLC Research in collaboration with the Library of Congress. The FAST authority file is an enumerative, faceted subject heading schema derived from the …

The tail of the COMET (Project)

Today the University of Cambridge released the final dataset from its COMET (Cambridge Open METadata) project. The final dataset contains more than 600,000 records derived from OCLC’s WorldCat, available as both MARC 21 and RDF triples under an Open Data Commons Attribution License (ODC-BY). All the previous datasets released, as well as this one, have …
