Archive for December, 2012

OCLC Research 2012: Welcome new colleagues!

Monday, December 31st, 2012 by Merrilee

This is the final posting in a short series, looking back on just some of what’s happened in OCLC Research during 2012.

I think that 2012 must have been a banner year for new colleagues in OCLC Research, or maybe it just seems that way. I’ve already mentioned Max, but here are a few more.

We started off the year by welcoming Titia van der Werf. Titia works in our Leiden office, and focusses much of her attention on European partners and projects. She is also a welcome addition to the Mobilizing Unique Materials team.

Our European team was further bolstered by Shenghui Wang, who joined us in May. Like Titia, Shenghui works out of our Leiden offices. The focus of her work is on text and data mining, deepening our strengths in this area.

We are lucky to have one of OCLC’s Diversity Fellows working with us this year: Julianna Barrera-Gomez (based in our Dublin office) is working with Lynn Silipigni Connaway and Ixchel Faniel on a variety of projects. We’re fortunate to have these talented young people working with us during their time at OCLC!

And speaking of new ideas, we had two colleagues who joined us in September and October for long visits. Takashi Shimada joined us as an OCLC Research Fellow. Taka (as he graciously allows us to call him) came to us from Keio University and spent time in both San Mateo and Dublin learning about activities within the OCLC Research Library Partnership, and helped us gain a better understanding and appreciation of the issues faced by Japanese research libraries. Simone Kortekaas from Utrecht University spent three weeks of her sabbatical in our Dublin offices, both learning and sharing. We welcome visits like this (whether long or short) because they help us know how our work can make an impact in a real-world setting.

As we close out 2012, we look forward to 2013 and all we’ll learn during the coming year. We’ll be sharing it here with you, so stay tuned. We wish you a happy, productive, and peaceful year!

OCLC Research 2012: ArchiveGrid

Thursday, December 27th, 2012 by Merrilee

This is the fourth posting in a short series, looking back on just some of what we’ve done in the last year.

ArchiveGrid is both a discovery system for an aggregation of archival collection descriptions and a research sandbox, where we can experiment with both tightly and loosely structured data and try out interface design and emerging technologies. The ArchiveGrid team has done a lot in 2012; here are just some highlights.

Although connections to ArchiveGrid from smartphones and tablets make up a relatively small percentage of overall use (currently about 11%), it is double what it was a year ago and is expected to grow. So we developed a new ArchiveGrid web interface that used responsive web design principles, letting the system adapt to a wide range of devices. The new interface was developed and tested over the summer, demonstrated at RBMS and SAA, and launched in October.

Around 75 new contributors of mostly EAD, but also PDF and HTML finding aids, joined ArchiveGrid this year, helping grow the index to a record 1.8 million collection descriptions from WorldCat and from crawler sites that institutions host. In February, the Northwest Digital Archives gave researchers another access point to noted Pacific Northwest archival and manuscript collections by contributing to ArchiveGrid its aggregated finding aids from 36 colleges, universities, libraries, museums, and historical societies in Oregon, Washington, Alaska, Idaho, and Montana. In March, shortly before St. Patrick’s Day, National University of Ireland – Galway joined ArchiveGrid as our first contributor from Ireland, with 164 finding aids harvested and indexed.

The team gathered data via a survey that went out in spring to archives and special collections researchers. The purpose of the survey was to update our findings from previous user studies about these researchers. We also wanted to find out how Web 2.0 technology had changed how archives and special collections research is done. Surprisingly, we spotted a shift in who archives and special collections researchers are, with “unaffiliated scholars” – those who are not genealogists, faculty, or graduate students – making up nearly a quarter of the total number of survey respondents. We also noted a smaller-than-expected role for social media in archives and special collections research and a simultaneous need for archivists and librarians to embed themselves online where the researchers are and give help that most researchers say they trust. Ellen Ast presented results from the survey at the June RBMS meeting in San Diego. Look for more next year!

The ArchiveGrid team has been busy promoting ArchiveGrid in various venues: at SAA and at regional conferences for archives professionals who may not attend SAA. I presented on ArchiveGrid at the Society of Southwest Archivists / Council of Intermountain Archivists meeting in May (via Skype, which was a new experience for me!) and Bruce Washburn led a 90-minute discussion about ArchiveGrid at the Mid-Atlantic Regional Archives Conference in October, leading to a flurry of new and potential ArchiveGrid contributors.

Capping the year, OCLC Research secured an intern for 2013, Marc Bron, who will develop a WorldCat and ArchiveGrid data mapping system in order to improve name-based discovery. Bron is a doctoral student from the Netherlands and will work in the San Mateo research office.

Led by Ellen Ast, the ArchiveGrid team launched a companion blog at the beginning of the year as a new venue for project team members to write about ArchiveGrid, our research activities around archival research and discovery, and developments in archives and special collections. The blog tracks new contributors and index growth, announces system developments, explains how we build and maintain our system, summarizes activity at conferences attended, highlights collections, and notes current events relevant to our target audience: archives and special collections practitioners, users, and aficionados. If you want to continue to follow ArchiveGrid in action, keep up with us all year round by following our blog!

OCLC Research 2012: Happy Holidays!

Friday, December 21st, 2012 by Merrilee

Taking a break from our end of year summary, which will continue next week after Christmas. Until then, happy holidays from us to you!


OCLC Research 2012: Born Digital

Thursday, December 20th, 2012 by Merrilee

This is the third posting in a miniseries of blog postings, looking back on what we’ve done in the last year. More to come!

One of the findings from our 2010 survey of special collections and archives in the US and Canada was that dealing with “born-digital” materials is one of the most challenging issues facing special collections. This is nothing new, but we realized that it was time to move past the “deer in the headlights” phase we seem to be in and move towards practical solutions based on emerging practice.

This year, Ricky Erway teamed up with Jackie Dooley and a crackerjack team of experts to push forward on born-digital solutions. The result is our Demystifying Born Digital project area, and two reports: You’ve Got to Walk Before You Can Run: First Steps for Managing Born-Digital Content Received on Physical Media, and Swatting the Long Tail of Digital Media: A Call for Collaboration.

You’ve Got to Walk is a gem of a report, informed by the group of practitioners who advise the Demystifying project. Its simple advice is encouraging and practical. When we took a big stack of copies to the Society of American Archivists meeting, they were snapped up. This paper inspired the Jump In initiative: SAA’s Manuscripts Repositories section put out a challenge for archivists to take the Jump In pledge and take some of those first steps outlined in the report. Results will be discussed at next year’s meeting in August. We are of course delighted that this report has inspired action and look forward to hearing about the outcomes.

Swatting the Long Tail is a call for action more than it is a report. It calls for collaboration on transferring digital content from unstable physical media, and challenges the community to come up with an ecology of service providers.

More reports are in the works, and we’re looking forward to seeing what other action our work encourages, as well as what inspiration we can take from the community.

OCLC Research 2012: and the winner is…

Wednesday, December 19th, 2012 by Merrilee

We are doing a miniseries of blog postings to reflect on some of our accomplishments in 2012. This posting is the second in the series.

Each year, OCLC Research staff are honored in various ways. This year is no exception and in fact we seem to have had a bumper crop.

In March, Ixchel Faniel won the iConference Award for her paper “Managing Fixity and Fluidity in Data Repositories.” The paper was co-authored with University of Michigan School of Information Professor Elizabeth Yakel and two doctoral students, Morgan Daniels and Kathleen Fear. This is one of the many contributions that Ixchel is making to help us understand data repositories and digital curation.

In May, colleagues Lynn Silipigni Connaway and Patrick Confer won RUSA’s 2012 Reference Service Press Award for their article “‘Are We Getting Warmer?’: Query Clarification in Live Chat Virtual Reference.” Lynn and Patrick co-authored the article with research colleagues Marie L. Radford, Susanna Sabolcsi-Boros, and Hannah Kwon of Rutgers, the State University of New Jersey.

You can hold your applause for Lynn, because in November she won the ALISE/Bohdan S. Wynar Research Paper Competition for her article “Not dead yet! A longitudinal study of query type and ready reference accuracy in live chat and IM reference,” to be published in Library & Information Science Research. Lynn and Marie have done a lot to improve our understanding of chat reference (and in my opinion have done much to underscore the value of basic customer service in libraries).

In October, our colleague Jeff Young was honored as the 2012 Kent State University SLIS Alumnus of the Year, an award given to a graduate who has made a significant contribution to the profession. Jeff was selected because of his important work using Linked Data to increase the presence and discoverability of library data and materials on the web. One of these days, Jeff should get a special award for helping to explain linked data to his colleagues, but we haven’t gotten our act together yet.

Research colleagues also continue to be your “friends in high places”: Lynn was elected to the ASIS&T Board of Directors; Brian Lavoie was elected to the Dryad Data Repository Board of Directors; Eric Childress was invited to join the NISO Content and Collection Management Committee; and of course Jackie Dooley began her term as president of the Society of American Archivists (we still do get to see Jackie from time to time, although most of her blogging these days is over at Off the Record).

Finally, OCLC Research received an award of a different kind — funding! In June, JISC extended funding for the project “Visitors and Residents: What Motivates Engagement with the Digital Information Environment?”. On our end, the work is being led by none other than Lynn Silipigni Connaway, who is working with David S. White from the University of Oxford. This project helps expand our transnational knowledge base about students and technology.

Congratulations to everyone, and best wishes for continued success in the new year.

OCLC Research 2012: Wikipedia and Libraries

Tuesday, December 18th, 2012 by Merrilee

At the end of 2012, we are doing a miniseries of blog postings to reflect on some of the year’s high points. This posting is the first in the series. Watch for updates!

2012 has been a great year for me, because I’ve had the privilege of seeing a project I’ve been passionate about for some time come to life: exploring the connection between Wikipedia and Libraries. Around this time last year I began making connections with the Wikipedia GLAM community, and exploring the idea of OCLC Research hosting a Wikipedian in Residence. We were fortunate enough to receive organizational support for this idea and, with help from folks in the Wikipedia community, to craft a position description and bring Max Klein into our team in OCLC Research. Having Max working with us has been terrific and not just because of his Wikipedia skills.

Since we’ve had Max on board, we have attended Wikimania, held not one but two Wikipedia Loves Libraries events, hosted two successful webinars attended by more than 500 librarians, and done countless videos (okay, I counted them up and there are at least 8). And then there was the Open Access Wikipedia Challenge on P2PU. Oh, and VIAFbot, which brought authority control templates and VIAF links to thousands of articles on the English language Wikipedia.

Earlier this month, I presented a breakout session at CNI (along with Sara Snyder, from the Archives of American Art) on the connection between Wikipedia and Libraries. The session was well attended but more importantly, there was a lot of interest and excitement about the connection between Wikipedia and libraries. I’m very pleased that Max’s term has been extended, so he can help us explore some of those possibilities. So as we close out a successful and productive year, I look forward to another year of highlights in this area.

Want to know more? View all the HangingTogether blog posts on this topic!

Managing print books: A mega-problem?

Wednesday, December 12th, 2012 by Constance

This research note was co-authored by Brian Lavoie and Constance Malpas.

Opportunity cost seems to be the watchword for print book collections these days. The staff, physical space, and other resources consumed by print-centric collections and services are badly needed to support new priorities in library services, such as deeper user engagement and closer alignment with changing research and learning practices. In the face of evidence of declining print book usage, combined with an ever-expanding array of digital alternatives, it is not difficult to imagine a future where “bookless” libraries are the norm.

But this may be premature. Few libraries are prepared to pack up their print books and send them to off-site high-density storage. On several highly-publicized occasions, plans to reduce local print book inventory have met vigorous opposition – witness the recent firestorm at the New York Public Library. In short, print collections pose a dilemma for libraries: they are assets too valuable to dispose of, yet sinking in priority vis-à-vis other aspects of the library service portfolio. The phrase “managing down print”, increasingly common in print management discussions, neatly captures the dueling imperatives: the need to allocate resources away from managing print book collections, but to do so in a gradual, orderly way. So the search is on for the golden mean: a viable print management strategy that can at once leverage more value out of the legacy print investment, and lower maintenance costs. This question is far from settled, but the contours of the solution are becoming apparent. First, future print management strategies are likely to be collaborative, with print books increasingly viewed as a shared asset to be managed cooperatively. Second, the scale of cooperation receiving the most attention, in terms of both planned and implemented solutions, is at the regional level.

This is not to suggest that the rest is a mere matter of detail: for example, the policy and technical infrastructures needed to support a regional strategy for cooperative print management are still in early stages of development. In the meantime, we can speculate on what a network of cooperatively-managed regional print book collections might look like. The OCLC Research report Print Management at “Mega-scale”: A Regional Perspective on Print Book Collections in North America explores a new geography of print book collections based on the concept of mega-regions. Mega-regions are geographical areas defined on the basis of economic integration and other forms of interdependence. The mega-regions framework has the benefit of basing regional boundaries on a substantive underpinning of shared traditions, mutual interests, and the needs of a common constituency.

In the report, we combine WorldCat data with an operationalization of the mega-region concept by urbanist Richard Florida to produce a network of twelve mega-regional print book collections – i.e., the collective print book holdings of all libraries in each region – corresponding to the twelve North American mega-regions identified by Florida (see figure below). We explore the salient characteristics of the mega-regional collections individually and as a group, and synthesize these characteristics into a set of stylized facts. The stylized facts are then used to explore the implications of a regionally-based, cooperative print strategy across a wide spectrum of issues, including access, management, and preservation.

Viewing print book collections as a cooperatively-managed regional resource yields benefits on both the supply side and the demand side. On the demand side, aggregating the print holdings of many institutions into a single collective collection creates a resource of greater scope and depth than any single local collection. Exposing this collective collection to users around the region – or even beyond – may amplify or even create demand for print books that experience little or no local use. On the supply side, regional coordination could streamline print management and reduce costs. Opportunities emerge for collaboration and coordination in collecting and retention decisions – for example, by diminishing excessive duplication and sharing collecting priorities across many institutions.

While our application of the mega-regions framework to print management is speculative, evidence does suggest that the organization of library stewardship is being reconfigured on a new supra-institutional, regional basis. The Western Regional Storage Trust, a cooperative effort to archive print journals held by many Western (and even Midwestern) US libraries, is one among many examples. Some of these initiatives, like the CIC Shared Print Archive or the ASERL Print Journal Archive, have the potential – if not the explicit intent – to deliver benefit at mega-regional scale: CIC member libraries are distributed across the expansive CHI-PITTS region and ASERL’s membership is concentrated in CHAR-LANTA. It will be interesting to see if these natural experiments in redistributing print preservation responsibilities across broad geographies result in a richer collective resource, undergirded by a robust federation of preservation commitments, or a differently fragmented set of regional collections.

In the coming year, we’ll have an opportunity to extend our mega-regions analysis by taking a demand-side view of the North American print book collection. We’ll be working with partner libraries in the CIC (notably the Ohio State University) to examine how inter-lending data might be combined with supply-side holdings data to inform a regional print management strategy for retrospective monographic collections in CHI-PITTS. Here’s a thumbnail sketch of the regional resource, excerpted from our project proposal:

In aggregate, the print book resource held in CHI-PITTS libraries amounts to more than 40% of print book titles in North America. About 16% of these titles are unique to the region, i.e. not duplicated in any of the other eleven mega-regional collections. The remainder constitutes a significant preservation “backstop” for other North American libraries: 50-92% of titles held by other individual mega-regions are duplicated in CHI-PITTS libraries. Thus, investments in the preservation of print books in the CHI-PITTS region can deliver significant benefit to libraries throughout North America. Conversely, there are relatively few regional collections that duplicate a significant share of the CHI-PITTS collection, which means that the burden of print preservation responsibilities (and investments) will be largely shouldered by institutions within the region. Since less than a fifth of the print books in the region are held by academic research libraries – traditionally viewed as the institutions with the greatest stake in print preservation – it seems apparent that networks like the CIC will have an important role to play in rationalizing regional print preservation priorities and investment.
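For readers who like to see the arithmetic, the "unique to the region" and "duplicated in" percentages quoted above are, at heart, set comparisons between regional collections. Here is a minimal Python sketch of that idea. The title identifiers and holdings are invented toy data, and the function names are our own for illustration; the analysis behind the report draws on WorldCat data and may well differ in its details.

```python
def unique_share(holdings, region):
    """Fraction of `region`'s titles that no other region holds."""
    own = holdings[region]
    # Pool the titles held by every region except the one of interest.
    others = set().union(*(t for r, t in holdings.items() if r != region))
    return len(own - others) / len(own)


def duplicated_in(holdings, region):
    """For each other region, the fraction of its titles also held in `region`."""
    own = holdings[region]
    return {
        r: len(titles & own) / len(titles)
        for r, titles in holdings.items()
        if r != region
    }


# Hypothetical toy data: region name -> set of title identifiers.
holdings = {
    "CHI-PITTS": {"t1", "t2", "t3", "t4", "t5"},
    "SO-CAL": {"t1", "t2", "t6"},
    "CHAR-LANTA": {"t2", "t3"},
}

print(unique_share(holdings, "CHI-PITTS"))
print(duplicated_in(holdings, "CHI-PITTS"))
```

In this toy example, two of the five CHI-PITTS titles appear nowhere else, so the unique share is 40%, while the other two regions find two-thirds and all of their titles duplicated in CHI-PITTS.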

The CIC is an interesting test case for this sort of project, since all libraries in the consortium are partners in the HathiTrust Digital Library, a shared digital repository. By our reckoning, a third or more of the titles held by CIC member libraries are already “backed up” by digital preservation copies in HathiTrust.  Yet from a regional perspective, the situation is strikingly different:  we estimate that less than a fifth of the print books in CHI-PITTS are duplicated by HathiTrust. The collective preservation burden therefore remains significant even in a region with comparatively robust cooperative library infrastructure.

In regions where shared library infrastructure is less developed or less integrated, the challenges may be even greater.  Take Southern California, for example.  We estimate that the regional print book resource in the SO-CAL mega-region amounts to just under 10 million titles with about 40 million library holdings (i.e. holdings set by libraries in the region).  While much smaller in size than the CHI-PITTS collection, the SO-CAL collection represents an important regional asset and a significant stewardship concern for academic libraries in the area.  As elsewhere, these libraries are individually and collectively reassessing the opportunity costs of managing local print inventory and considering “above the institution” solutions.  Not surprisingly, smaller academic libraries look to larger research-intensive institutions as partners in the preservation enterprise and potential providers of shared infrastructure.

The University of California system, with five large research libraries and a high-density storage facility in the SO-CAL region, is an obvious focus of attention. But the infrastructure developed to support a statewide research university system with a global brand cannot simply be extended to serve all other libraries in the region. There is no shared governance model for the regional library resource, which is distributed across hundreds of public and private institutions. And there is no business model currently in place that would enable libraries to opt in to “preservation by proxy” arrangements. Yet, progress is being made. A group of library leaders from academic libraries and consortia in and around Southern California will meet later this week to begin what is certain to be a long conversation about a regional print management strategy. Bob Kieft, a long-time supporter of (and sometime agitator for) collaborative collection management, has organized the meeting, which will be hosted by UCLA. It’s impossible to predict what the outcomes of the discussion might be – there is certainly no recipe for success in regional print management – but it is unquestionably an important first step in addressing what is increasingly a “mega” problem.