Archive for December, 2010

OCLC Research 2010: Reports and webinars

Friday, December 31st, 2010 by Melissa

As 2010 winds down, we have reflected on what we’ve worked on or created in this mini blog series. You can see a rundown of highlights here.

As an OCLC Research senior communications officer, I have the good fortune of working with a brilliant group of people who provide me with an abundance of great outputs to communicate. It is my pleasure to blog about the busy and productive year we’ve had in OCLC Research, and how we’ve shared our findings in numerous reports and articles. In 2010 we published more than ten reports on an array of topics, ranging from recommendations on using cameras in reading rooms and greening interlibrary loan practices to strategies for mitigating research libraries’ risks and the implications of the 2009 OCLC Research Survey of Special Collections and Archives. You can find our complete list of reports here.

In addition, articles by our research scientists and program officers on topics from improving virtual reference to better understanding today’s researchers and research libraries were included in 19 professional journals and newsletters. You can find this list of publications here.

We also held 23 webinars, including six in our TAI CHI Series, which were attended by over 1400 RLG Partners and other library professionals. The topics covered ranged from understanding mobile development to managing collections in the networked environment. In case you missed any, recordings of all of these webinars are available on our Web site and in iTunes.

As we look back on our accomplishments of 2010, we also look forward to sharing a successful and productive 2011 year with you. All of us in OCLC Research wish you a wonderful new year.

Want even more? Check out a three-page summary of our accomplishments over the last five years.

OCLC Research 2010: Blue Ribbon Task Force on Sustainable Preservation and Access

Thursday, December 30th, 2010 by Brian

2010 marked the conclusion of work of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. Formed in 2007, the Task Force was an international group convened to examine the issue of economic sustainability in a digital preservation context. Membership included experts from across the digital preservation community, including the public sector, the private sector, cultural heritage, and academia, and reflected a range of expertise, including librarians, archivists, computer scientists, and economists.

The Task Force produced two substantial reports, which provide:

  • the first comprehensive study of the economics of digital preservation;
  • a clear definition of the conditions that must be met to achieve economic sustainability in a digital preservation context;
  • practical, actionable recommendations for achieving economic sustainability, based on detailed analysis of both the economic environment in which preservation decision-making takes place, and the attributes of digital preservation as an economic activity;
  • a list of priorities for near-term action;
  • a strong foundation to catalyze additional work on economically sustainable digital preservation.

In addition to publishing their final report this year, the Task Force organized a symposium in Washington, DC, with approximately 100 participants. The April symposium provided the community with a public forum to react to and discuss the Task Force’s findings and recommendations; to assemble panels of experts representing the four digital preservation contexts discussed in the report, and hear their thoughts on how the Task Force’s recommendations might be implemented; and to inspire ideas for future work, building on the foundation provided by the Task Force. A similar event was sponsored by JISC in London in May. Both events helped embed the Task Force’s work into the international digital preservation community. Additionally, the work of the Task Force was recognized by being included on the short list for the 2010 Digital Preservation Award.

OCLC Research 2010 – Lorcan named Miles Conrad Lecturer

Wednesday, December 29th, 2010 by Jim

As 2010 winds down, we are reflecting on what we’ve worked on or created in a mini blog series. You can see a rundown of highlights here.

Our colleague and boss, Lorcan Dempsey, both leads the OCLC Research effort and serves as Chief Strategist for OCLC. He’s been in this role at OCLC since 2001 following distinguished service in UK libraries and at JISC. It’s coming on to ten years and his impact on librarianship has continued to increase over time. We think he’s one of the most influential and creative thinkers about library matters, information supply and discovery and the impact of the networked environment on both. If you’re reading this blog, you’ve no doubt read Lorcan’s blog (now on hiatus) and perhaps follow him on Twitter.

As it turns out, we’re not the only ones who have such a high opinion of Lorcan’s thinking and his impact. We were proud that in 2010 he was named the Miles Conrad Lecturer by the National Federation of Advanced Information Services (NFAIS). (Corrected 5 January: an earlier version of this post expanded NFAIS incorrectly. My apologies.) Lots of us get invited to lecture and do conference keynote presentations, but this is a big one and an indisputable honor. They’ve been naming distinguished lecturers for the NFAIS annual gatherings since 1968 in honor of one of the key individuals responsible for founding the organization. It’s a roster that includes giants such as Gene Garfield, Roger Summit, Marty Cummings, and Carlos Cuadra, and, back in 1976, Fred Kilgour, the founder of OCLC.

His lecture is not yet up on the NFAIS web site. (Update, 5 January: the slides are here [pdf].) When it appears we’ll blog about it. In the interim, for some of Lorcan’s current thinking, I’d suggest you review this presentation, called an Environmental Glance, from the June 2010 RLG Partnership Annual Meeting. If you’re dying to see Lorcan in action, there is an .mp4 of a presentation he gave in Trondheim, Norway, at a conference on Emerging Technologies in Academic Libraries, called The network has reconfigured whole industries. What will it do to academic libraries? It’s not professionally filmed, but all the content is there, delivered in Lorcan’s inimitable (really) style.

OCLC Research 2010: Pulse taking

Monday, December 27th, 2010 by Ricky

As 2010 winds down, we are reflecting on what we’ve worked on or created in a mini blog series. You can see a rundown of highlights here.

My colleague Jackie wrangled an incredibly long survey of an incredibly large sample of special collections and archives and synthesized it in an incredibly interesting (and long) report, Taking Our Pulse: The OCLC Research Survey of Special Collections and Archives. You’ll want to study it in all its detail, but for those pressed for time, Jackie distills the significant outcomes in a blog post that gives bulleted significant findings and bulleted recommendations. [She also blogged about what surprised her among the findings.] You might also like to read some background about this activity, including how it relates to the ARL study done in 1998.

What’s next? We’re deciding which of the recommendations we’ll take on and subtly suggesting to others those they might take on. All in the service of making those wonderful collections accessible and preserved for the ages.

If you’re itching for even more, check out this 3-page summary of the last five years of the RLG Partnership and OCLC Research.

OCLC Research 2010: Classify and WorldCat Genres

Friday, December 24th, 2010 by Merrilee

As 2010 winds down, we’d like to call attention to some of the things we’ve worked on or created this year. You can see a rundown of highlights here.

I hate those end of year “10 best” lists. For me, each list represents a number of [books, CDs, movies, apps, restaurants] that I once again failed to get to in the current year and probably won’t in the next. I also hate being told what I should [read, listen to, watch, play with, eat].

But I love WorldCat Genres, which is a great way to browse and discover fiction (or movies) based on my own tastes and preferences. For example, I love autobiographical fiction, because it’s usually bittersweet and sometimes dishy. Browsing in WorldCat Genres, I can see some newer books that are in this genre that look tempting, as well as some old favorites, and related movies. I like this way of constructing my own lists, based on similarities in the WorldCat data.

And then there’s Classify. Classify is an experimental web service that reveals the classification (Dewey Decimal Classification, Library of Congress Classification, or National Library of Medicine Classification) that has been assigned across a FRBR work set. A good example is a book I’m reading now, Christopher McDougall’s Born to Run. You’ll see that, at least for DDC, the classifications mostly cluster on one class number, though a couple of other class numbers have also been assigned.

Additionally, Classify reveals the FAST subject headings for the FRBR work set.

So what?

So this is a person-friendly prototype for what is actually a web service. Imagine farming a portion of your cataloging workflow off to a web service. If there’s overwhelming agreement on classification (say, 90% of the items that have a class number share the same one), then the class number is assigned automagically. If there’s variance, a human intervenes and makes a decision. There is also an opportunity to use the provided subject terms.
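The triage rule described above is easy to picture in code. Here is a minimal sketch, with the 90% threshold and the data shapes assumed for illustration (this is not the actual Classify service logic):

```python
from collections import Counter

AGREEMENT_THRESHOLD = 0.90  # assumed cutoff for "overwhelming agreement"

def triage_class_number(class_numbers):
    """Given the class numbers assigned across a FRBR work set,
    auto-assign when holdings overwhelmingly agree; otherwise
    flag the record for human review."""
    numbers = [n for n in class_numbers if n]  # ignore items with no class number
    if not numbers:
        return None, "review"
    candidate, count = Counter(numbers).most_common(1)[0]
    if count / len(numbers) >= AGREEMENT_THRESHOLD:
        return candidate, "auto-assign"
    return candidate, "review"

# Example: 19 of 20 holdings agree, so the class number is auto-assigned
print(triage_class_number(["796.42"] * 19 + ["613.71"]))
```

In a real workflow, the "review" outcome would route the record to a cataloger along with the candidate class number and its share of the holdings.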

Classify helps to harness the wisdom of the crowds, the decisions of lots of catalogers, as represented in WorldCat.

Another cool tool to put under the Christmas tree.

You can find out more about the Classify project and more about what makes WorldCat Genres tick on our website.

And if you are thirsty for more, you can check out a three-page summary of our accomplishments over the last five years.

OCLC Research 2010: VIAF now includes corporate names

Thursday, December 23rd, 2010 by Merrilee

As 2010 comes to a close, we’d like to call attention to some of the things we’ve worked on or created this year. You can see a rundown of highlights here.

VIAF (or the Virtual International Authority File) is another one of the cool projects developed by our Dublin-based colleagues in OCLC Research. Since 2003, OCLC has been working with the Library of Congress, the Deutsche Nationalbibliothek, and the Bibliothèque nationale de France to produce a merged resource combining authority files from 18 participants. I don’t monitor this project closely, but it seems like new participants are joining all the time. VIAF does not privilege one authoritative form over another; instead it unites all forms so that the most appropriate form can be identified.

Just last month VIAF was expanded to include corporate (and conference) names in addition to personal names. As Thom explains in this blog post, corporate names are more difficult to match across the files than personal names. But at this point the project team has such finely honed algorithms and magical matching powers that they were able to fold corporate names into the mix. If you want to see an example of how complex corporate names can get, take a look at this entry for the United Nations (with 165 different alternate forms of the name, in addition to the twelve authoritative headings).
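A toy example shows why name matching is hard and why VIAF’s algorithms have to be finely honed. Simple string normalization catches trivial variants but fails completely across languages (this sketch is emphatically not VIAF’s actual matching algorithm; the name forms are illustrative):

```python
import re
import unicodedata

def normalize(name):
    """Crude normalization: strip accents and punctuation, lowercase,
    and ignore word order."""
    name = unicodedata.normalize("NFKD", name)
    name = "".join(c for c in name if not unicodedata.combining(c))
    words = re.findall(r"[a-z0-9]+", name.lower())
    return " ".join(sorted(words))

def cluster_forms(forms):
    """Group name forms whose normalized keys collide."""
    clusters = {}
    for form in forms:
        clusters.setdefault(normalize(form), []).append(form)
    return clusters

forms = [
    "United Nations",
    "Nations Unies",
    "UNITED NATIONS.",
    "Naciones Unidas",
]
print(cluster_forms(forms))
```

The English forms collapse into one cluster, but the French and Spanish forms stay separate: uniting all 165 alternate forms of "United Nations" takes evidence beyond the strings themselves, such as shared identifiers and linked bibliographic records.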

Resources like VIAF, which are machine derived but built on the very human labor of catalogers worldwide, will help to power multilingual searches, thus enabling improved discovery of materials, regardless of where they are held or where they are cataloged.

For more information on VIAF, see the project page.

More? Check out a three-page summary of our accomplishments over the last five years!

OCLC Research 2010 – Cloud Library

Thursday, December 23rd, 2010 by Jim

The Cloud Library project (see the exposition that follows for a quick reductive overview of the idea) got a lot of attention and had a big impact in the research library community this past year.

[Image: dark foreground and clouds, mountains highlighted: “Heaven’s Peak,” Glacier National Park, Montana.]

My colleague, Constance Malpas, is the principal intellectual engine driving this effort. She’s shaped the opportunities, divined the evidence to support first steps and generally been a tireless participant in the discussions and action planning that have sprung up around this opportunity. She’s busy now finalizing a report which will be available in January, 2011. We’ll blog about its release.

What’s the idea? In the same way that cloud computing offers resources and applications on demand without the user having to operate and own the underlying assets, the cloud library project posited that it is now possible for academic libraries to rely on access to needed book and journal assets rather than manage them as locally-resident and managed physical items.

The entry point for exploring this possibility was to reconsider the relation between a library’s physical book collection, off-site storage repositories, and emerging digital text aggregations. Could a library change its local print inventory by relying on a digital aggregator to supply a digital version of the text, and offer a print volume when necessary through arrangements with one or more existing storage facilities? We found willing exemplars of each player: New York University as a customer, ReCAP as a storage facility willing to supply, and the Hathi Trust as a digital text aggregator willing to offer access to an electronic version of the book.

Constance examined the overlap between NYU’s collection, the holdings of ReCAP and the rapidly growing database of digital texts being built by the Hathi Trust from the digital copies received by participants in the Google Books Library Project and other digital copies of books. It was tough to find out what was in ReCAP as those holdings haven’t been transmitted to OCLC and it was difficult to keep up with the Hathi Trust as the corpus was growing so rapidly. Nevertheless the analyses were done and have been repeated on a regular basis.

The results of this initial effort have been discussed in many forums. A nice summary by Constance is in this presentation from the RLG Partnership Annual Meeting of 10 June 2010. The key findings were that 30% or more of the local collection was already present in the digital aggregation, that 75% of the mass-digitized texts were already ‘backed up’ in one or more of the storage repositories, and that, unfortunately, only a very small percentage of the intersection was in the public domain. Because the Google Books settlement was delayed, this last point meant that there was no way to use the Hathi Trust as more than a digital preservation structure. Nevertheless, even without an e-book delivery opportunity, a large amount of library space could be freed up by a willingness to rely responsibly on delivery from remote sources (that is, with contractual understanding, with proper reassurances about preservation, with Hathi acting as another preservation format, and with appropriate coordination across multiple repositories willing to supply). In the NYU case it meant more than 700,000 volumes could be relegated and managed differently.
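At its core, the overlap analysis is a series of set intersections over record identifiers (OCLC numbers, say). A minimal sketch, with made-up identifier sets standing in for the real NYU, ReCAP, and Hathi Trust holdings:

```python
def overlap_report(local, storage, digital):
    """Compute the collection overlaps underlying the cloud library analysis.

    local, storage, digital: sets of record identifiers (e.g. OCLC numbers)
    for the local print collection, the storage repository, and the
    digital aggregation, respectively.
    """
    digitized_locally_held = local & digital
    backed_up = digital & storage
    return {
        # share of the local collection already in the digital aggregation
        "local_in_digital": len(digitized_locally_held) / len(local),
        # share of the digitized texts also held in print storage
        "digital_in_storage": len(backed_up) / len(digital),
        # relegation candidates: locally held, digitized, and in storage
        "relegation_candidates": digitized_locally_held & storage,
    }

# Toy example with small identifier sets
report = overlap_report(
    local={1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    storage={2, 3, 4, 11, 12},
    digital={2, 3, 4, 5, 12, 13},
)
print(report["local_in_digital"])  # 4 of the 10 local items are digitized
```

The real analysis was harder than the set arithmetic suggests, for exactly the reasons noted above: the ReCAP holdings weren’t in WorldCat, and the Hathi Trust corpus kept growing between runs.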

This finding prompted further work to look at other ARL library collections. The fascinating and, we hope, motivating finding was that the percent duplication held across nearly the entire ARL group. That is, more than 30% of most libraries’ collections were already duplicated in the mass-digitized aggregation, and this was expected to grow to 50% over the next three years. This means that most ARL libraries would benefit in similarly substantial ways by moving from reliance on their local print collections to reliance on storage-repository supply. Constance gave a very nice presentation (ppt) on these findings, along with recommendations about where change could start and things to stop doing, at the October 2010 ARL meeting.

The overwhelming evidence that this project has assembled coupled with extraordinary budget pressures have resulted in genuine action plans at individual research libraries as well as plans for group responses. Requirements for the necessary shared infrastructure are being articulated and pilot efforts are being launched. The ‘cloud library’ seeded by OCLC Research efforts is precipitating change. (Block that metaphor.)

OCLC Research 2010: Well-Intentioned Practices

Wednesday, December 22nd, 2010 by Merrilee

As 2010 winds down, we are reflecting on what we’ve worked on or created in a mini blog series. You can see a rundown of highlights here.

Is copyright making you blue
And you don’t know what to do
Take advantage of others’ tactics
And put in place Well-Intentioned Practice!

I want to give a shout out to the National Library of Australia for what has become an annual display of talent and imagination. Each year the staff performs for their holiday party, and they share with the rest of us on YouTube. The results are funny and toe-tapping. This year’s theme was “Puttin’ on the Writs,” an homage to the trials and tribulations of adhering to copyright law.

National Library of Australia. We feel your pain. And we’ve been moved to do something about it. In the US. For unpublished materials.

Following on the heels of Shifting Gears, we began to realize what a barrier copyright law presents to those working with unpublished materials. We convened an advisory group. We held an event. Out of this came a document called Well-intentioned practice for putting digitized collections of unpublished materials online (we call it WIP). WIP encourages institutions to take a risk management approach (rather than apply item by item assessment).

WIP has been a success and has been endorsed by numerous organizations and individuals. And we’ve just learned that we’ll have a session focusing on Well-Intentioned Practices at the Society of American Archivists meeting in 2011. While WIP is based on US copyright law, as a risk management approach it may work in other situations.

We’ve written about WIP in the past. Here are two previous posts on this topic.

And if you haven’t seen it, here’s Puttin’ on the Writs in its full glory.

If you want even more, look at this summary of our accomplishments over the last five years. Only three pages!

OCLC Research 2010: mapFAST

Tuesday, December 21st, 2010 by Merrilee

As 2010 winds down, we’d like to call attention to some of the things we’ve worked on or created this year. You can see a rundown of highlights here.

I’ve spent some time playing with mapFAST, which is a mashup between Google Maps and the FAST Geographic subject headings. This is a really neat way to explore the intersection between a geographic area and publications of all sorts.
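Under the hood, this kind of map mashup is essentially a nearest-neighbor lookup: given a point on the map, find the geographic subject headings closest to it. Here is a minimal sketch using the haversine distance; the heading records and coordinates below are invented for illustration and are not the actual FAST dataset or the mapFAST API:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + \
        cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def nearest_headings(headings, lat, lon, k=3):
    """Return the k geographic subject headings nearest to (lat, lon)."""
    return sorted(headings,
                  key=lambda h: haversine_km(lat, lon, h["lat"], h["lon"]))[:k]

# Hypothetical headings near Garden Grove, California
headings = [
    {"label": "Garden Grove (Calif.)",  "lat": 33.7743, "lon": -117.9380},
    {"label": "Anaheim (Calif.)",       "lat": 33.8366, "lon": -117.9143},
    {"label": "Santa Ana (Calif.)",     "lat": 33.7455, "lon": -117.8677},
    {"label": "San Francisco (Calif.)", "lat": 37.7749, "lon": -122.4194},
]
for h in nearest_headings(headings, 33.7743, -117.9380):
    print(h["label"])
```

Each nearby heading then becomes a map marker linking into a WorldCat search on that heading, which is what turns a point on the map into a browsable set of publications.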

I grew up in Garden Grove, California, so I used that as a launching pad for exploration. I like that the mapFAST display shows Garden Grove and its environs (note how close to Disneyland it is!). Browsing through the WorldCat results, there are of course numerous city planning documents, but also some interesting surprises. There’s a master’s thesis based on survey data collected in the Garden Grove elementary schools, dating from close to the time I was a student. There’s also quite a bit on the Garden Grove Community Church, which I found curious until I realized that’s the old name for the Crystal Cathedral (home of the Hour of Power broadcasts). I was also able to find links to images of the “new” Crystal Cathedral during construction.

In addition to links to WorldCat, mapFAST also offers links to Google Books. I was surprised at how much content is available, almost too much even for the most ardent Garden Grove enthusiast.

You can find out more about mapFAST here, and more about FAST here.

And if you are thirsty for more, you can check out a three-page summary of our accomplishments over the last five years.

What do we mean when we say “born digital”?

Thursday, December 16th, 2010 by Merrilee

The phrase “born digital” is just as vexed as “digital libraries.” Taken all together, the “born digital” universe is all encompassing, and too much for any one professional or institution to tackle. No wonder librarians and archivists throw up their hands in frustration! It’s just too much.

Fortunately, my colleague Ricky Erway has written an essay, Defining Born Digital [pdf], that seeks to define the discrete components of the born digital landscape, which are:

  • Digital photographs
  • Harvested web content
  • Digital manuscripts
  • Electronic records
  • Static data sets
  • Dynamic data
  • Digital art
  • Digital media publications

If reading a four-page document is too daunting, you might also enjoy this rather humorous treatment of the subject, featured on our new YouTube channel.

I think that by better defining which corner of this vast universe we are working on (or seeking to understand), we will feel less overwhelmed and be more likely to make progress.

I’m interested in your reactions. How did we do? Did we miss anything? Does this help you?