Something New For Something Old (part 3)

December 7th, 2008 by Merrilee

The third day of the PACSCL conference started with a panel on digitization.

Max Evans argued “both sides” of digitization (while reminding us that we should never trust a “double minded man”). The case against digitization is cost. Early digitization projects that Max was involved with ran to $33/image, with most of the funds spent on metadata creation. Max estimates that there are 30 billion pages of archival materials in the United States, and it would be too costly to digitize all of them. While at the National Historical Publications and Records Commission (NHPRC), Max helped the commission reexamine its previous policy of not supporting digitization projects, on the grounds that digitization can be an important tool in promoting democratic access to collections for a wide audience. Grant recipients for NHPRC’s digitization programs are getting costs down to one or two dollars per image; “they get it,” said Max. These projects are creating minimal or no metadata at the item level, relying instead on collection-level metadata. (In stating the “pro” digitization argument, Max is in essence repeating much of what was in Shifting Gears [pdf].) Max is interested in digitization on demand and digitization in response to other well-defined criteria. In the end, I thought Max did better justice to the argument for digitization, so perhaps he is not so double minded after all.

Joshua Ranger spoke about a project at the University of Wisconsin at Oshkosh where they experimented with a minimum metadata approach. Costs were cut significantly (to 35 cents a page), but so were user satisfaction and the ability to complete tasks, compared with a control collection. Graduate students were more critical of the approach than undergrads, and suggested that in order to be useful, even more metadata than in the control collection would be needed. However, when users were presented with the cost tradeoffs (more stuff online for less money, but with less metadata), the experimental collection was considered sufficient. Josh suggested several avenues that could be pursued in response to issues raised in the experiment. These included improving searching, providing a way to visually gauge the size and context of a scanned folder, providing graphical browsing, inserting a “patch code” during scanning so that documents can be isolated from one another, and getting user-supplied transcriptions for documents that cannot be OCRed. Finally, Josh said that each collection’s needs should be considered on their own merits, taking into account what tradeoffs are acceptable.

Mary Rephlo from the National Archives and Records Administration (NARA) talked about the archives’ goal of making 1% of all holdings available online by 2012, and their plan to leverage strategic partnerships in meeting that goal. NARA has articulated several principles that all partnership agreements must share, such as digitization of full series or file segments in order to maintain an archival view of collections (and to avoid cherry picking), free access to digitized content in the reading rooms, unrestricted rights to the content (including metadata) after a period of time, and payment by the partner of all reasonable direct costs. In order to maintain balance, it is important to recognize that partner needs differ, series needs differ, and that partners are in competition with one another. (Mary’s presentation hit several points that were raised in the Good Terms paper.)

I’ll wrap this up tomorrow (before heading off to the CNI task force meeting where I’ll do even more reporting).
