Large-scale digitization of special collections: legal and ethical issues (part 3)

The symposium, The Legal and Ethical Implications of Large-Scale Digitization of Manuscript Collections, had two panels: Reconciling Modern Archival Practices and Ethics with Large-Scale Digitization; and a second on legal issues (Orphan Works, Fair Use, and Risk Management). For whatever reason, I had the great good fortune to moderate both of these panels.

First of all, my hat is off to the speakers on each of these panels. Each panel had five speakers, we only had two hours, and we wanted to reserve the bulk of the time for discussion. This left each speaker with 15 minutes (for the first panel) or 7 minutes (for the second panel, where speakers were addressing multiple topics). Each speaker was covering a pretty big area, so it was kind of like speed dating, both for the presenters and the audience. I was prepared to get out my moderator's whip, but fortunately all of the panelists were extremely well-prepared and kept to time. It was really a model of collaboration.

In this posting, I’m going to cover the first panel, Reconciling Modern Archival Practices and Ethics with Large-Scale Digitization (almost all of the remarks in this panel have been contributed to the Wiki).

Barbara Aikens from the Archives of American Art (Smithsonian) kicked off the panel by talking about the role of the processing archivist, and the tensions between More Product, Less Process (MPLP) practices and digitization. Barbara speaks from a deep well of experience, as the AAA has been digitizing collections for a long time. Before collections are scanned, they get a once-over from the processing archivists to ensure that the collection is scan-worthy — that is, that it does not have sensitive or private materials, or avant-garde artistic materials that do not help in understanding the life or sensibilities of the artist. This is, of course, a subjective activity. Barbara does not feel that pre-digitization processing is in conflict with MPLP, which calls for a range of appropriate practices. If a collection has been prioritized for digitization, it is worth the additional effort. It is worth noting that AAA's procedures are still "lite" in terms of digitization: no item-level metadata, but processing at the series and folder level.

Bill Landis (Yale Manuscripts and Archives) spoke about the importance of working proactively with donors to get their help in dealing with privacy and copyright issues. We already have enough collections that we need to unravel, and it would be better to develop up-front plans for managing incoming collections that include how and when to make them accessible in and out of the reading room. Rather than taking on collections that present us with more doubt and risk than reward, we need to have donors help us identify and mitigate risk, and help us make plans for when collections can be made accessible. He also highlighted the importance of creating educational opportunities so that archivists can be more comfortable with third-party privacy issues (and suggested that Aprille's presentation is a great start for the type of education we all need — I second that!). We should be creating professional frameworks for dealing with collections, rather than reacting on an ad hoc basis.

Max Evans (LDS Church History Department) talked about his commitment to accelerated processing and digitizing during his time as head of the National Historical Publications and Records Commission (NHPRC). He has long believed in digitization on demand. Max recently left NHPRC and returned to an institution, where implementing his theoretical practices has been challenging. The LDS Church History Department has reviewed materials for sensitive, private, or confidential content before making a collection available. This review has been done at the item level, and Max would like to change that. Two of these categories (sensitive and confidential) can be handled categorically, by looking at the function and provenance of the records; classification can be applied at the collection or component level, not at the item level. Private materials are more difficult to deal with — they can be scattered throughout collections. Max doesn't have a good solution, but he has a bad one, which is to follow the standard applied to US Census records as a starting point and work backward from there. This would mean starting with a 72-year closure period and conducting access review on request. At his institution, this would mean opening up materials that have not previously been made available and bringing requests for review into the mix.

Tom Hyry (Beinecke Library, Yale University) talked about the promise and peril of digital records. He was mostly talking about born-digital records, but also speculated about cases where we have a full digital representation of the item (not just a scan and metadata related to that item, but the content of the item itself). On the one hand, fully digital records are a researcher's dream: you can search the text, and furthermore you can DO things with the text — apply humanities-computing-style processing to bodies of materials to find patterns, make connections, etc. On the other hand, fully digital records increase the risk that archival institutions will expose private and sensitive information. But with digital records, we may also be able to develop methods to screen for private or sensitive materials.

Finally, Dan Santamaria (Princeton University Archives) started out with an amusing personal story about confronting bureaucracy that ended with "you don't understand, it takes a very long time because we have thousands of documents to manage." While this got a laugh, this is increasingly the answer we are giving to patrons — and in an era when more and more services are online, they are not buying it. Dan's compelling talk focused on the ethical obligations we have to make collections accessible — not just to the elite few who can come into our reading rooms, but to as broad a public as possible. We must not become entangled by fear, but find ways to fulfill our ethical obligations. He reminded us that along with the duty to protect outlined in the Archivists' Code of Ethics, we also have the duty to make accessible.


Some themes that came up in the discussion:

How do we know what's appropriate? So much of this is contextual (as outlined in Aprille's presentation). We must also acknowledge that there are differences in how private academic institutions, public academic institutions, and federally funded institutions can deal with problematic materials.

The 72-year cutoff is problematic and shouldn't be adopted wholesale. It may be appropriate for an institutional archivist, but the collecting archivist who works with a variety of donors is in a much different situation. The "review on request" component of Max's proposal is very important.

How much online content has drawn fire? Not much.

We need a shift in thinking from small collection level record/big finding aid to big collection level record/small finding aid. Archivists are big on context, but not everyone who uses our collections is as in love with context as we are.

How do we get to democratic access without going down the “let’s just put the easy stuff up first” road again?

We are a diverse country with diverse opinions and diverse collections and diverse institutions. Frameworks will help, but will not solve everyone’s issues.