New program, archives and special collections

Tuesday, September 30th, 2008 by Merrilee

We’ve had a lot going on here in the archives and special collections department. Some of this has gone out through other channels, but I don’t believe any of it has gone out on the blog.

We’ve announced a new program of work around “effectively disclosing archives and special collections.” There are seven new projects in the program — something for everyone. We’ll be blogging about each of these in turn, but I encourage you to take a look at the various projects and let us know what you find interesting and useful.

Jackie Dooley has been hired on as a consulting archivist. Jackie will be pitching in on the new program and will also be making recommendations as to what OCLC can do better for archives and special collections. Jackie is not fully on board yet, but even in her part-time capacity, she is adding a lot of energy and ideas to the mix. I’m grateful that she’s joined us.

We’ve done a number of interviews with those from the archives and special collections world (all links go to MP3 files): Richard Ovenden (Oxford), Alice Schreyer (University of Chicago), Jackie Dooley (UCI/OCLC), Dennis Meissner (Minnesota Historical Society), and Mark Dimunation (Library of Congress). So far. We’d like to do a lot more of these. Suggestions? Who do you want to hear from?

All of this taken together reflects both the old and the new. RLG has always looked after the needs of archives and special collections, and now we have new capacities and a new focus. We hope you will help us to accomplish great things!

Beyond the Silos of the LAMs

Monday, September 29th, 2008 by Günter

The last couple of weeks have been busy and gratifying for the team working on library, archive and museum convergence issues. We’ve now published our final report capping a year-long investigation into LAMs in a campus environment – you’ll find it here (.pdf: 334K/59 pp). We tried out the title “Beyond the Silos of the LAMs” during various presentations and meetings at SAA, and word percolated all the way into Facebook. Our friend Chris Freeland from the Missouri Botanical Garden posted: “Eagerly awaiting the “Silos of the LAMs” (Libraries, Archives, Museums) RLG paper! Best. Title. Ever.” We can only hope that the reviews of the content will be equally enthusiastic.

Incidentally, last week also brought good news from institutions we’ve visited as part of our investigation. The Smithsonian announced that they’ll put the 137 million objects in their (upwards of) two dozen collecting units online. As you’ll be able to read in our report, the Smithsonian workshop led to recommendations for the “creation of an internal single point of access to all Smithsonian collections information for staff” and “comprehensive digitization and access…for unencumbered photographic collections.” The Smithsonian was the first institution we’ve visited, and in the time that has passed since their October 2007 workshop, they’ve upgraded their aspirations from an internal to a public system, and from a single format (photography) to institution-wide digitization. I was thrilled to hear about this development, and hope that our colleagues at SI can find the funding to make rapid progress on this lofty goal.

From Yale, we’ve heard about the creation of a new Office of Digital Assets and Infrastructure (ODAI) to support an integrated campus-wide architecture for access to Yale library, archive, museum collections, as well as faculty research output. Meg Bellinger, who was the bearer of these glad tidings, will be the director of this new office. One of the recommendations from our Yale workshop was the creation of an entity which would have among its responsibilities “planning a shared information architecture for cross-collection services (such as digital preservation and integrated access to collections information).” It’s nice to think that our workshop played a small role in shoring up the support for this important new office as a nexus of collaboration on campus.

Next year, you’ll have the opportunity to hear from many of the workshop participants in person. The Committee on Archives, Libraries and Museums (CALM) has endorsed a series of panel presentations at ALA, SAA and AAM during 2009, which will give workshop participants from University of Edinburgh, Princeton University, the Smithsonian Institution, the Victoria and Albert Museum, and Yale University a platform to share the progress their institutions have made in aligning the efforts of their collecting units to provide a better experience for their respective audiences. Until then, you’ll have to make do with the report!

What could be more special?

Friday, September 19th, 2008 by Ricky

I’ve just read the minutes from a recent meeting of the Lot 49 group, which was formed to address issues related to moving image digitization. [Here's a link to notes about the inaugural meeting in July 2007.] The need to be in Dublin, OH last week precluded my being there, but reading the minutes has led me to reflect on how motion and sound fit into Jen’s and my diatribe, Shifting Gears: Gearing Up to Get Into the Flow (about digitizing special collections for access).

Our major premise is that, in cases where we will preserve the original, we ought to think about digitization for access rather than for preservation. In this way, we can get more special collections digitized and accessible, thereby increasing the demand and, hopefully, funding for our collections. The alternative, investing in time-consuming, expensive processes, risks special collections becoming marginalized in the midst of the vast quantity of books on-line.

By using the phrase “special collections” we meant to draw attention to digitization of non-book materials, but we hadn’t given a lot of thought specifically to motion and sound. One way in which motion and sound are different than other non-book formats is that the delivery of access copies requires a significantly compressed file, usually sacrificing a lot of quality. Another difference is that the premise that we would most often be preserving the original doesn’t always apply to motion and sound media.

The first objective, always, is stabilization of the content, then provision of access. With motion and audio, sometimes the original is digital (e.g., much current audio) and we can derive an access copy from it. If the original is in a stable analog format (e.g., preservation-quality film), then we can digitize for access. If the original is unstable and needs to be reformatted, there are two possibilities: a) when the best option is to reformat onto another analog medium (e.g., going from nitrate to safety film), we would subsequently create a digital access copy, or b) when the best reformatting option is digital (e.g., going from magnetic tape to digital audio), we’ll want to retain all the quality possible when digitizing, and then derive an access copy.

But let’s not get ahead of ourselves: a lot of motion and sound in our collections hasn’t even been cataloged. [Maybe the next round of CLIR/Mellon Hidden Collections grant funding should be inundated with proposals to describe hidden motion and sound collections.] Until we have a good sense of the nature and size of the problem, we won’t be effective in addressing it. [And if you have any ideas about how to survey backlogs, get in touch with Merrilee, who is launching a project to assess archival backlog survey methods.]

First describe ‘em, then stabilize ‘em, and then by all means, make them accessible.

The future of museums – the massively multiplayer forecast

Wednesday, September 17th, 2008 by Günter

I’ve just received an announcement from AAM about a Massively Multiplayer Forecasting Game they’ll use to shape our thinking about the museum of the future. (If you’re a member of AAM, no doubt you’ve received this message as well.) Here’s a copy-and-paste from the announcement:

Players are encouraged to “imagine out loud” how their families, their local communities, their professions, or their extended social networks might respond to the game scenarios. They build websites from the future, keep blogs from the future, upload podcasts from the future, make videos from the future, develop research wikis from the future, and host discussion forums from the future. In short, they persuasively record, discuss, and debate the details of how they imagine their own personal futures might play out within the game parameters. In Superstruct, we’ll show you the world as it might look in 2019—and you’ll show us what it’s like to live there.

And here’s what the announcement says about some of the scenarios the game will present players with:

It’s 2019. Your museum is contacted by the Department of Homeland Security and informed that an international group touring your museum was exposed, on their flight to the U.S., to the latest deadly strain of Respiratory Distress Syndrome. You are instructed to lock down the museum and shelter staff and visitors in place while government authorities determine whether anyone is infected. Are you prepared to deal with this? Other snapshots from 2019: Is your museum ready to help your community cope with an influx of climate refugees? How will your operations change in the face of soaring energy prices or collapse of the food production and distribution system? Your museum depends on its website to deliver information and attract visitors, but your content has been corrupted repeatedly in the past few months by hackers attempting to undermine the credibility of your museum. How do you adapt?

I’m struck by the fact that these scenarios are completely focused on quasi-catastrophic threats from the outside. I’m personally more worried about the threats to the future of museums from the inside, so to speak. I hope they’ll add this scenario as well:

“It’s 2019, and your museum’s digitization initiatives are still paralyzed by copyright concerns, the tantalizing vision of a revenue stream from licensing digital images, and the notion that providing online access to digital images of its collection equals an unbearable loss of control. Your visitorship is dwindling – since they can’t find much of your content online, people have stopped believing that there’s much worth seeing on your walls. How do you adapt?”

The latest indicator that this scenario is becoming less likely: the Smithsonian just announced that they’ll put the 137 million-object collection of their two dozen (give or take) collecting units online. Kudos to our SI colleagues and their new Secretary G. Wayne Clough!

WorldCat and the iPhone

Tuesday, September 16th, 2008 by Jim

My long-time colleague, Bruce Washburn, won’t brag about it but I will. He took advantage of the WorldCat Search API to build a very nice WorldCat app for the iPhone. He’s talked about (and Lorcan noted) how designing for mobile devices is like writing a haiku – you have to figure out the essence and then feature it in the most intuitive and expressive way. I think his app captures the real essence and utility of WorldCat.
If you’ve got an iPhone and you read this blog then you should have it.

Late summer harvest: fallen fruit

Monday, September 15th, 2008 by Constance

A map to fallen fruit in Sherman Oaks, CA.

I have often heard colleagues in the library community describe JSTOR print back files as ‘low-hanging fruit’ for distributed print archiving and other cooperative management schemes. This characterization, based on the relative ubiquity of the JSTOR digital archive in academic libraries and its high degree of use, belies the difficulty that many libraries face in changing local print management strategies. While much of the scholarly use of content represented in the JSTOR archive has migrated from print to online formats, and despite the presence of multiple dark and dim archives that replicate the digital archive in print format, there is little evidence that library managers are prepared to adopt a changed approach to managing widely duplicated print collections, of which JSTOR print back files are a singular example. The fruit may be easily within reach, but it is left on the tree.

This brings to mind a poem — ubiquitous in its own right, as it is visible in Google Books [in The Poet's Companion] and held by more than 480 libraries [in The October Palace] — by Jane Hirshfield [WorldCat Identity], a poet based in the San Francisco Bay Area. It was once (ca. 1998) part of the circulating collection of the New York City public transit system, in the Poetry in Motion poster series. In few words, it describes the singular desirability of ‘The Groundfall Pear’:

It is the one he chooses,
Yellow, plump, a little bruised
On one side from falling.
That place he takes first.

This nicely expresses the vulnerability and sweetness of fallen fruit.

I have been looking recently at the distribution of library holdings for titles in JSTOR, Portico and a variety of other collections and have been struck by the degree to which aggregate library investment in print serials has clustered around a relatively limited number of titles. It is not especially surprising that journal titles in the JSTOR collections are among the most widely held in WorldCat; they were selected for inclusion in JSTOR based on their value to the academic community, which is reflected in their broad distribution in college and university libraries. What is surprising, or at least perplexing (as noted above), is that the increasing scope and success of the JSTOR archive as a core scholarly resource and the increasing pressure on library space over the past decade haven’t resulted in any significant change in print collection management.

Ten years on, and the JSTOR print back files are still among the most widely held journals in the system. Median library holdings for the print versions of these titles are several orders of magnitude greater than holdings for other journals, the content of which (being less widely distributed in print or electronic form) may represent a much greater preservation risk. If the widely distributed JSTOR print back file collections represent the low-hanging fruit for cooperative print management, then the even more heavily duplicated print titles represented in both the Portico and JSTOR digital archives represent something akin to groundfall.

By my estimate, there are nearly 250 print serial titles that are represented in both the JSTOR and Portico digital archives. (There are more than a thousand titles in the JSTOR digital archive and more than eight times that number in Portico.) These are journals in which many libraries have invested not once or twice but at least three times, by purchasing the original print-only format, licensing the digitized back file in JSTOR, subscribing to the prospective dual print and digital formats, contributing to the long-term preservation of the electronic content (digitized and born-digital) in the Portico digital archive and electing to retain the original print format. One might reasonably ask if at least some of this investment might be redirected to secure the long-term preservation and / or digital conversion of other highly-valued but less widely held titles. A coordinated effort to thin the heavily duplicated titles could free up space for the preservation of rarer material, thereby ensuring broader coverage of the scholarly record and the continuing enrichment of aggregate holdings.

How great is the opportunity here? 250 journal titles may not sound like much — many colleges and universities hold thousands, even tens of thousands of serials — but if you factor in the aggregate holdings for titles represented in both JSTOR and Portico, the potential impact looks fairly significant. On average, titles in this group of journals are held in print format by more than 700 libraries; some are held by thousands of libraries. Considering the relatively low duplication rates for print journals in general (roughly 9 holdings per title), and the robustness of the combined JSTOR and Portico infrastructure, this ‘groundfall’ has the makings of a respectable harvest. If one assumes a conservative estimate of 40 bound volumes per title (the average physical extent of print titles in JSTOR, by my rough count), this amounts to something like 7 million volumes, or a million linear feet of shelf space that might be repurposed. It remains to be seen if research institutions are prepared to reap this harvest.
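For readers who want to check the back-of-envelope figures above, here is a rough sketch of the arithmetic in Python. The title, holdings and volume counts are the rounded estimates quoted in this post. The volumes-per-linear-foot value is not stated anywhere above; it is an assumption inferred from the two totals (7 million volumes, about a million linear feet), and the function names are made up for illustration.

```python
# Rounded estimates quoted in the post (not measured data).
TITLES_IN_BOTH_ARCHIVES = 250   # print serials represented in both JSTOR and Portico
AVG_HOLDING_LIBRARIES = 700     # average number of libraries holding each title in print
VOLUMES_PER_TITLE = 40          # conservative estimate of bound volumes per title

# ASSUMPTION: not given in the post; implied by "7 million volumes"
# being equated with "a million linear feet".
VOLUMES_PER_LINEAR_FOOT = 7


def estimate_duplicated_volumes(titles, libraries, volumes_per_title):
    """Total bound volumes held across all libraries for the overlapping titles."""
    return titles * libraries * volumes_per_title


def estimate_linear_feet(volumes, volumes_per_foot):
    """Shelf space occupied by the given number of volumes."""
    return volumes / volumes_per_foot


volumes = estimate_duplicated_volumes(
    TITLES_IN_BOTH_ARCHIVES, AVG_HOLDING_LIBRARIES, VOLUMES_PER_TITLE
)
linear_feet = estimate_linear_feet(volumes, VOLUMES_PER_LINEAR_FOOT)

print(f"{volumes:,} volumes, about {linear_feet:,.0f} linear feet of shelving")
```

Changing any one input (say, 30 volumes per title instead of 40) shows how sensitive the headline number is to these rough averages.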

(Looking for an image to accompany this post, I stumbled over an interesting project to map fallen fruit around Los Angeles. The organizers describe it as an ‘activist art’ project:

“Public Fruit” is the concept behind the Fallen Fruit, an activist art project which started as a mapping of all the public fruit in our neighborhood. We ask all of you to contribute your maps so they expand to cover the United States and then the world. We encourage everyone to harvest, plant and sample public fruit, which is what we call all fruit on or overhanging public spaces such as sidewalks, streets or parking lots. ["What is Fallen Fruit?"]

The next time I’m in LA, I’ll know where to go to find the public pomegranates. Sherman Oaks looks to have a ready supply.)

Hacking Away on the Thin Ice of a New Day

Friday, September 12th, 2008 by Roy

I suppose it was inevitable. OCLC has been thinking differently lately, and this is just one more example. Besides opening up access to a tremendous amount of data and services to software access via Grid Services, we’re also trying to help developers in libraries, museums, archives, and beyond use these services productively.

Our latest method of doing this is sponsoring a two-day Hackathon in New York City, co-sponsored by NYPL Labs. We will be providing power, wi-fi, food, and a t-shirt (see image) to anyone who wants to come and mess around with APIs and see what damage we can do.

We’re charging a $30 “nuisance fee” to discourage folks from signing up and not coming, since we’re limited in the number of participants we can accommodate, but frankly if that would prevent you from coming then let’s talk. It isn’t about the money. We’ve also put up a wiki page where people can set up ride and room sharing.

If this event is successful, and we have every reason to believe it will, then we may take it on the road. If you’d like to see an event like this in your area, drop me a line. I hope to see you there!

A league table of journals

Tuesday, September 2nd, 2008 by John

The Australian government is revising its research assessment system, and is in the process of setting up ERA, Excellence in Research for Australia. This new system was an early commitment of the Labor Government elected in November of last year, and is replacing the Research Quality Framework (RQF) which the previous Government had started to develop in 2006, and which was intended to carve up AU$600 million in block grant research funding. That system did not reach fruition, despite (and partly because of) being very costly. The new system is designed to benchmark Australian research better within an international context, and is – for the moment – not intended to lead to a ranking-based carve-up of the research funding pot, though that option has been left in for the future.

One of the first outputs of the new process is a single ranked list (which can be downloaded here) of over 21,000 journals, broken down into 157 subject groups classified according to the Australian and New Zealand Standard Research Classification. Each journal is graded on a 4-tier system, A* (top 5 %), A (next 15%), B (next 30%) and C (next 50%). The allocation of journals to these bands has been made by learned societies and disciplinary bodies. There has already been a lot of concern expressed by academics in Australia, arguing that the journals list is not sufficiently widely representative, is too Anglophone, will skew publication practices among academics, and is invidious because a percentage banding system ignores intrinsic quality for the sake of convenience. In an open letter to Senator Kim Carr, an international group of philosophers states ‘The problem is not that judgments of quality in research cannot currently be made, but rather that in disciplines like Philosophy, those standards cannot be given simple, mechanical, or quantitative expression. Publisher and journal rankings are no substitute for direct assessment of a scholar’s work by knowledgeable peers.’
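To get a feel for what the percentage banding means in absolute terms, here is a small illustrative sketch. It assumes a round total of 21,000 journals and applies the four percentages to the list as a whole; in practice the bands may well be applied within each subject group, so treat the counts as approximate. The function name is hypothetical.

```python
# Band shares as described above: A* top 5%, A next 15%, B next 30%, C next 50%.
BANDS = [("A*", 5), ("A", 15), ("B", 30), ("C", 50)]


def band_sizes(total_journals, bands=BANDS):
    """Approximate number of journals per band for a list of the given size."""
    if sum(share for _, share in bands) != 100:
        raise ValueError("band shares must add up to 100 percent")
    return {name: total_journals * share // 100 for name, share in bands}


# ASSUMPTION: 21,000 is a round stand-in for "over 21,000 journals".
sizes = band_sizes(21_000)
for name, count in sizes.items():
    print(f"{name}: about {count:,} journals")
```

On these assumptions only about a thousand titles can carry the top grade, which helps explain why the allocation has drawn so much comment.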

Metrics used in the evaluation of research are often discounted by academics as being too crude, while administrators favour them for being objective and so capable of supporting resource decisions to advance policy objectives. The previous Australian government was influenced by the UK’s adoption of metrics of research excellence which has been credited with improving its economic performance. It would appear that governments are prepared to tolerate a degree of hand-wringing by academics and game-playing by their managers if it leads to increased global competitiveness. The nobler aims of the Academy have no compelling match for tables and rankings.

Journal rankings have been made over many years, though normally not in such a visible and comprehensive way as in a list of over 21,000 journals – a total which is larger than that of both Thomson Reuters’ ISI Web of Knowledge and Elsevier’s Scopus, each of which has around 15,000 journals. Indeed, Ulrich’s Periodicals Directory estimates that the total number of peer-reviewed journals is around 22,000, so this Australian list must be almost complete. What is perhaps more interesting, in light of the fact that both the Australian and UK governments want to make extensive research assessment exercises as lightweight as possible, is the Pareto Principle in this context. Thomson Reuters itself admits that ‘A core of 3,000 … journals accounts for about 75% of published articles and over 90% of cited articles’. So why consider 21,000?

It will be interesting to see whether the Australian list becomes a reference point for researchers, and is used internationally – or whether the criticisms increase. Without the justification from metrics of the citation impact factor, which governs the ISI ranking, there is a danger that a peer-reviewed list could be dismissed as immediately out-of-date. Small publishers could well protest at the injustice it represents to their up-and-coming titles which have not been graded A or A*. Larger publishers will surely seek to consolidate their top-ranked journals and invest in those with the potential to join them, possibly cutting others loose as they do so.

Thus rankings based on metrics could once again lead to the research community, for lack of sufficiently strong better judgement, behaving in a Jekyll and Hyde manner, as characterised by Jean-Claude Guédon in his 2001 essay In Oldenburg’s Long Shadow: Librarians, Research Scientists, Publishers, and the Control of Scientific Publishing, conforming to higher aspirations one moment, and the next giving in to the grubbier demands of gamesmanship within a system which is largely out of their control.

(And, just in case you were wondering, the A* Library & Information Studies journals you want to be published in are, according to this list: Journal of Documentation, Library Quarterly, Library Trends, Management Science, Annual Review of Information Science and Technology, Journal of Information Science, MIS Quarterly, Library & Information Science Research, Information Systems Research, Information Research-An International Electronic Journal, School Library Media Research and Journal of the American Society for Information Science and Technology).