Archive for August, 2007

From hand-crafted to mass digitized

Wednesday, August 29th, 2007 by Günter

Ricky, Jennifer and I just got back from the joint RLG Programs / SAA sponsored forum “Digitization Matters.” We spent the day at the Newberry Library with a sell-out crowd of 200 archivists, discussing how digitization for special collections can be ramped up to the kind of throughput that would qualify as “mass digitization,” as opposed to spending our effort on hand-crafted projects that generate small numbers of high-quality digital files with very granular descriptions. We asked our speakers to be bold and provocative, and we asked our audience to be open-minded. Here’s the complete list of suggestions our speakers discussed with a lively audience, ordered by session. Read them now, and then stay tuned for the mp3 files of the talks and discussions!

Emphasis on Access

Susan Chun (presented by Michael Jenkins)

  • Treat digitization and cataloguing of collections as operational activities. This means:
      • Survey future activities, build operational budgets, and allocate permanent staff.
      • Funders should favor building permanent organizational capacity over short-term projects.
  • Make content available, then make changes based on use. This means:
      • Track users and uses of content.
      • Treat digitization and cataloguing as an iterative process.

Sam Quigley

  • Include digitization in initial records processing, i.e., don’t get further behind.
  • Develop rapid production scanning using “prosumer” equipment and automation.
  • Consider cutting back on resolution and detailed metadata for faster production.
  • Investigate voice recognition software for handwritten documents, in lieu of OCR.
  • Investigate joining or forming a consortium for storage, Web delivery, and digital preservation.

Selection Decisions

Barbara Taranto

  • Digitize on demand.
  • Engage archivists in public service discussions.
  • Flatten hierarchy of discovery.
  • Digitize comprehensively.
  • Avoid discussions of audience.

Sharon Farb

  • Digitize what best supports and reflects your mission.
  • Digitize what users want and use.
  • Integrate digitization into all workflows and user services.
  • Collaborate with users/IT/curators/archivists/librarians throughout all stages of the digitization process, from planning to implementation.


Bill Landis

  • ‘Boutique’ vs. ‘Mass’: Explore digitization options other than those with which we’ve gotten comfortable over the past decade or so.
  • Good enough is good enough: Embrace archival control, organization, and the descriptive metadata that flows from that collection management strategy.
  • Rose-colored lenses: Know the difference between interpretation and access, and how that impacts our description and digitization work.
  • Know your limitations: Aim to influence, not control, dissemination and use of digital facsimiles of material from our collections.

James Eason

  • Decide what you are: a Museum? a Picture Library? or an Archives?
  • 1a. Be an Archives.
  • Lose your conception of a Photograph as an Individual Work.
  • Describe only aggregations of photos, and only in the broadest terms.
  • Only consider investing more in description when both these conditions are met:
      • Extremely high value item (historical or artifactual value)
      • Part of a heterogeneous body of material
  • Look for and experiment with emerging technologies that support added description from users and external experts.

Public/Private Partnerships

James Hastings (presented by Ricky Erway)

  • Archival access is no longer about ingress into buildings. Think of the potential exponential increase in use when people no longer have to walk through doors.
  • Digitization and online access are far more expensive than most realize. Most archives and manuscript collections cannot afford to do it all themselves.
  • Archival institutions can still have control of projects, standards, and principles when partnering with for-profit organizations pursuing their own goals.
  • To achieve preservation and access goals, require partners to digitize entire series or collections. Avoid “cherry-picking.”

Terminologies Services

Monday, August 27th, 2007 by Günter

Merrilee and I just sent a short strawman document for an upcoming invitational meeting on terminologies services to a very eclectic group of catalogers, digital librarians, visual resources curators, folksonomists, archivists and art librarians. We’ve had help from our OCLC Research colleagues Diane Vizine-Goetz and Andrew Houghton in preparing the document, and they’ll also join us for the meeting hosted by the Metropolitan Museum on September 12th. While the bulk of the strawman contains use cases for discussion and prioritization during the meeting, I thought the introductory paragraph gives a good idea of why we’re investing effort in this area:

Different communities and descriptive strategies share a common need to unambiguously identify a place or a person and to provide access points through subject terms or keywords. Activities surrounding the use of terminology resources could be raised to the network level through a series of services that support a range of activities including metadata creation, search formulation and optimization, and management of terminology resources as “local authorities” on the network. These network services can leverage the combined expertise and investment in description across libraries, archives, museums and visual resources to produce more authoritative records in a less expensive fashion. Terminologies services become the powerful building-blocks for entire records, especially if they provide access to authority records. When describing rare and unique materials, terminologies services could provide a measure of copy-cataloging economy by providing ready access to authoritative chunks of records.

We’ve asked all meeting participants to call three colleagues to help them think through the ideas presented and whittle away at our use-cases, so maybe you’ll get drafted!

We’ll make the full document, as amended, expanded or trimmed by group consensus, available after the meeting.

NYARC: One for all, and all for one?

Wednesday, August 22nd, 2007 by Günter

Brian and I are working with four New York City art museum libraries on a collection analysis of their joint bibliographic holdings. The institutions in question:

We have (actually, in all fairness I should say: Brian has) completed a preliminary analysis, and now we are starting to focus on evaluating the implications of the data together with our four project partners. Constance has also been invaluable in speculating about the meaning hidden behind the raw figures, and what kinds of conclusions they support. While we’re not ready to go public with the full set of data or with our interpretation of it, I thought I’d share some background on the project to whet your appetite for a forthcoming publication this fall. And I’ve thrown in some teaser data just because I could.

Under the auspices of a planning grant sponsored by the Mellon Foundation, the four art museum libraries listed above formed the New York Art Resources Consortium (NYARC) to explore deep collaboration. (Three of the four have already announced a joint ILS [pdf] project.) We engaged NYARC in the analysis project to supply the art museum libraries with the business intelligence they need to make informed decisions about the nature of their collaborative efforts. The analysis determines the size of the collective NYARC collection, the extent of holdings overlap as well as uniquely held materials. The project also compiles statistics about specific types of materials the consortium holds a special interest in, such as auction catalogs, exhibition catalogs and serials. A comparison of the NYARC holdings to a set of three local research libraries (New York Public Library, New York University and Columbia University), as well as a west-coast peer institution (Getty Research Institute) provides additional context for the findings.

And here’s the teaser data from the analysis:

Size & Holding Overlap

  • Aggregate Collection: 962,290 unique titles
  • Holdings Overlap: One percent of titles are held by all four libraries; 83 percent of titles are held by only one library (these figures exclude the libraries’ auction catalog holdings as captured in SCIPIO)
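
For anyone curious how overlap figures of this kind can be derived, here is a minimal sketch in Python. It is not the method used for the NYARC analysis; it simply assumes each library's holdings have already been reduced to a set of title identifiers. The library names, identifiers and the `overlap_profile` helper are all invented for illustration.

```python
from collections import Counter


def overlap_profile(holdings):
    """Map 'number of holding libraries' to the share of unique titles.

    `holdings` maps a library name to the set of title identifiers it holds.
    """
    # Count how many libraries hold each distinct title.
    holders_per_title = Counter()
    for titles in holdings.values():
        holders_per_title.update(set(titles))

    total = len(holders_per_title)
    if total == 0:
        return {}

    # Tally titles by the number of libraries that hold them.
    by_holder_count = Counter(holders_per_title.values())
    return {
        n: by_holder_count.get(n, 0) / total
        for n in range(1, len(holdings) + 1)
    }


# Toy data only: these names and identifiers are hypothetical.
toy_holdings = {
    "Library A": {"t1", "t2", "t3", "t4"},
    "Library B": {"t1", "t5"},
    "Library C": {"t1", "t2", "t6"},
    "Library D": {"t1", "t7", "t8"},
}

# Size of the aggregate collection (unique titles across all libraries).
aggregate_size = len(set().union(*toy_holdings.values()))
profile = overlap_profile(toy_holdings)

print(aggregate_size)
print(profile[4])  # share of titles held by all four libraries
print(profile[1])  # share of titles held by only one library
```

In practice the hard part is deciding when two records describe the same title; the sketch sidesteps that by assuming matching identifiers.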

General Characteristics

  • Auction Catalogs: 14 percent of the NYARC collection
  • Collections (mainly vertical files): 12 percent of the NYARC collection
  • Serials: 2 percent of the NYARC collection

Beyond the NYARC

  • 39 percent of the NYARC collection is unique compared to OCLC WorldCat
  • 66 percent of the NYARC collection is unique compared to New York Public Library, New York University and Columbia University

Much more detail will emerge once we’ve written this up for publication!

LAMs – A work in progress

Monday, August 20th, 2007 by Günter

I continue to spend a lot of my time moving forward our project of investigating library, archive and museum relationships in campus environments – we’ve settled on a short-list of institutions to visit (I’ll disclose the list once all visits are confirmed), and Ricky, Diane Zorich and I are continuing to do some hard thinking about how we’d like to spend our day with a group of LAM professionals. We’re also getting some perspective from community thought leaders like Chris Batt, Margaret Hedstrom, Cliff Lynch and Bob Martin – we talked to Chris last week, and are looking forward to a call with Bob, Cliff and Margaret tomorrow.

I am starting to draft the scene-setting presentation for our visits, and my goal is to introduce language which can be used as a “tool” during the day’s discussion. Here’s one of the tools I think we’ll try to use:

LAMs are operating under specific circumstances which might be more or less conducive to integration and collaboration. These circumstances could be characterized along the lines of Lawrence Lessig’s modalities of constraint [pdf]: law, norms, market, and architecture. The list below extrapolates these constraints to the kinds of forces which might influence LAM behavior:

  • Law – e.g. what your administration tells you (mandate)
  • Norms – e.g. what your community’s rules dictate (work culture / tradition)
  • Market – e.g. what your audience is telling you (users) / your bottom line (funding)
  • Architecture – e.g. what your physical and technological infrastructure allows and supports (infrastructure)

This list can be used to analyze the present environment of an institution, and to identify the level at which interventions would most effectively bring about desired changes towards better integration among LAMs. I hope that introducing this perspective will allow us to focus on how to effect desired change rather than be stifled by unfavorable circumstances.

OAICatMuseum – Coming soon to a server near you!

Thursday, August 16th, 2007 by Günter

As those who have read this blog for a while will know, I am working with a sizeable group of museums on the issues around sharing descriptions and digital content using OAI-PMH and CDWA Lite XML. While OAI-PMH has been used in the library community since the initial release of the protocol in 2001, the technology received a good bit of attention in the museum world when the Getty Trust and ARTstor used OAI-PMH to transfer digital images from the Getty Museum and the Getty Research Institute to the ARTstor Digital Library in 2006.

The software used for this prototype was a modified version of OCLC Research’s open source software OAICat. Yesterday, we released OAICatMuseum to a number of RLG Programs Partners for beta-testing. This son/daughter-of-OAICat was inspired by the Getty modifications to the original software, and has been tailored to the needs of the museum community. We are trying to make implementation of OAI-PMH as easy as possible, and to that end, OAICatMuseum supports CDWA Lite XML records out-of-the-box, and generates the OAI-PMH mandated DC records on the fly.
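
To give a flavor of what a harvester would ask such a server, here is a small Python sketch that builds OAI-PMH `ListRecords` requests. The `ListRecords` verb, the `metadataPrefix` and `set` arguments, and the `oai_dc` prefix come from the OAI-PMH specification. Everything else is an assumption made for illustration: the base URL is made up, the `cdwalite` prefix is a guess at what a repository might advertise (a real harvester would check the repository's `ListMetadataFormats` response), and `list_records_url` is a hypothetical helper, not part of OAICatMuseum.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # 'set' is an optional OAI-PMH argument for selective harvesting.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; substitute the base URL of a real repository.
BASE_URL = "http://museum.example.org/oaicatmuseum/OAIHandler"

# Dublin Core, which every OAI-PMH repository must support.
dc_url = list_records_url(BASE_URL, "oai_dc")

# CDWA Lite; the prefix string here is an assumption, not a fixed value.
cdwa_url = list_records_url(BASE_URL, "cdwalite")

print(dc_url)
print(cdwa_url)
```

The two URLs differ only in the metadata prefix, which is the point: the same records can be served in either format, depending on what the harvester requests.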

I expect that we’ll be ready to share this software with the world-at-large this fall after we’ve made modifications based on the feedback of the working group!

The RLG Union Catalog – the last record*

Friday, August 10th, 2007 by Jim

As you know, the last year has seen an enormous amount of effort invested in the integration of the RLG Union Catalog (still called RLIN by lots of its long-time users; here’s the Internet Archive page describing the Research Libraries Information Network) with OCLC’s main database, WorldCat. You know this because many of you had to plan for changes in your workflow and establish new practices to take advantage of the features afforded by the enriched WorldCat.

I’m proud of the integration team based here in Mountain View and those in Dublin. My RLG colleagues ensured that their decades of work in building the union catalog were honored by fully investing themselves in successfully migrating the data to WorldCat. At the same time my new Dublin colleagues made this effort their first priority and took seriously the need to execute it with minimal disruption to our users. Everyone did a great job!

Throughout this effort my RLG Programs colleague, Karen Smith-Yoshimura, was indefatigable and ubiquitous. She is finally ready to step back from these service responsibilities and give her energy to a variety of programmatic concerns like our projects related to Renovating Descriptive and Organizing Practices. She pointed out to me that on August 6, 2007 the very last record* was loaded into the RLG Union Catalog.

And just by the way, OCLC has long celebrated each million-record milestone by awarding a Gold Record to the institution that contributed the record. I think that the folks out here in the California office have a claim on a few of the gold records between the 77th and 82nd million-record milestones, since a lot of those records came from the RLG Union Catalog loads into WorldCat. I’ve always wanted some gold records on our office walls…

So the very last record loaded into the RLG Union Catalog came from Library and Archives Canada and looks like this in all its tagged wonderfulness:

001 ONCGCN2007904206-F

007 vd·nv··un

008 070723s2007····bcc---········s···vneng··

016 ·· ‡a20079042066

020 ·· ‡a9780973835410 :‡c$59.95

035 ·· ‡a(CaOONL)20079042066

040 ·· ‡aCaOONL‡beng‡cCaOONL

055 ·3 ‡aRS164‡bM84 2007

082 0· ‡a615/.321‡222

100 1· ‡aMulders, Evelyn,‡d1959-

245 10 ‡aWestern herbs for eastern meridians & five element theory‡h[videorecording] :‡bworkshop DVD /‡cby Evelyn Mulders.

260 ·· ‡aLake Country, B.C. :‡bTri Lite Production,‡c2007.

263 ·· ‡a0707

650 ·0 ‡aHerbs‡xTherapeutic use.

650 ·0 ‡aMedicine, Chinese.

650 ·0 ‡aEnergy psychology.

I’ve stared at it a lot trying to read into it some serendipitous meaning. At the end of the day, I think it’s just a bib record. We’re proud of it just like all the others.
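
For readers less used to tagged displays, here is a minimal Python sketch that takes a single line of the record above apart into tag, indicators and subfields. It assumes the display conventions seen in this post (a space after the tag, `·` for a blank, `‡` before each subfield code); `parse_field` is a hypothetical helper written for this illustration, not a real MARC library, and it does not handle actual MARC transmission format.

```python
def parse_field(line, delimiter="‡", blank="·"):
    """Split one tagged display line into its parts (illustrative only)."""
    tag, _, rest = line.partition(" ")
    if tag < "010":
        # Control fields (001-009) carry no indicators or subfields.
        return {"tag": tag, "value": rest}

    indicators, _, data = rest.partition(" ")
    # Each chunk starts with a one-character subfield code.
    subfields = [
        (chunk[0], chunk[1:]) for chunk in data.split(delimiter) if chunk
    ]
    return {
        "tag": tag,
        "indicators": indicators.replace(blank, " "),
        "subfields": subfields,
    }


control = parse_field("001 ONCGCN2007904206-F")
dewey = parse_field("082 0· ‡a615/.321‡222")
title = parse_field(
    "245 10 ‡aWestern herbs for eastern meridians & five element theory"
    "‡h[videorecording] :‡bworkshop DVD /‡cby Evelyn Mulders."
)

print(control)
print(dewey["subfields"])
for code, value in title["subfields"]:
    print(code, value)
```

Note how the 082 line reads once parsed: subfield `a` holds the class number and subfield `2` holds the edition number, which is easy to misread when the two run together in the display.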

* it’s really the last ‘batch’ loaded record; on August 31, 2007 somebody will input the last online record…