The Most Used English Title Words in WorldCat

This is another installment in my continuing series of eclectic, peripatetic, and yes, let’s just say it: “pathetic” data investigations. The most recent identified the top countries of publication for WorldCat records. For whatever reason, I got it into my head to determine which English words appear the most in the main title of WorldCat …

Visualizations of MARC Usage

As part of my work to reveal exactly how the MARC standard has been used over the several decades it has existed (available at “MARC Usage in WorldCat”), I’ve always wanted to produce visualizations of the data. Recently, with essential help from my colleagues JD Shipengrover and Jeremy Browning, I was able to do exactly …

Metadata for digital objects

That was the topic discussed recently by OCLC Research Library Partners metadata managers. It was initiated by Jonathan LeBreton of Temple, who noted the questions staff raised when describing voluminous image collections, such as: Do we share the metadata even if it would swamp results? What context can be provided economically? What are others doing …

Multilingual WorldCat represented by translations

Great works are translated: the cream of the world’s cultural and knowledge heritage is shared by being translated. Many of these works are represented by bibliographic records in WorldCat. A group of us working on Multilingual WorldCat projects have been focusing on data mining WorldCat for works and all translations associated with them, identifying the translator for …

ISBNs in WorldCat

Recently a question came up on the BIBFRAME list about ISBNs, and how many of them were in MARC records. This is just the kind of question that OCLC Research is uniquely placed to answer, so I quickly wrote some simple Perl code to run as a Hadoop streaming job to find out. It was …
