Large-scale digitization of special collections: legal and ethical issues (part 1)

February 24th, 2009

I was fortunate enough to attend a symposium hosted by the Southern Historical Collection, University of North Carolina, Chapel Hill. The title of the symposium pretty much sums up many of my own interests and concerns these days: Legal and Ethical Implications of Large-Scale Digitization of Manuscript Collections. The meeting was quite meaty and worthwhile, with more than I can possibly tackle in a single posting. So I will be reporting on the conference serially, and speculating along the way about future directions and implications.

Some background
Archives and special collections are certainly doing a lot of good work to increase the “flow” of archival materials, both in terms of picking up the pace of describing collections and in terms of getting collections online. This is reflected both in the More Product, Less Process report and in our own Shifting Gears paper. There is an acknowledgment that digitizing collections, in whole or in part, gives greater access to collections. Some people use the term “democratizing” in reference to digitizing collections. However, there are still barriers to providing this more democratic access to collections, particularly for collections created in the twentieth century. Issues include legal considerations: not only copyright, but also, in some cases, compliance with US regulations such as HIPAA (Health Insurance Portability and Accountability Act) and FERPA (Family Educational Rights and Privacy Act). Our various states and jurisdictions have differing rules regarding how and when personnel records may be made public. I’m confident that in the European Union and beyond there is a similarly mystifying web of law, policy, and regulation that conspires to dampen the intentions of the most well-meaning and public-spirited archivist. Beyond the law, we must also grapple with what is ethical. Can putting documents online damage reputations, or hurt feelings? Will digitizing collections create a chilling effect on donations of materials to special collections? Providing a democratizing level of access to primary source material that documents the twentieth century, while balancing legal and ethical issues, is challenging at best.

When we were putting together our own symposium, Digitization Matters, we ruled two issues out of scope: copyright and the mechanics of digitization. We felt that legal issues merited a separate lengthy forum and discussion. I also think that discussions around copyright get complicated and scary, and tend to paralyze institutions into non-action. Furthermore, copyright issues are irrelevant for a large body of materials.

Now, 18 months later, institutions are moving ahead at a rapid pace with planning and carrying out digitization. We can no longer defer discussions around copyright. And so, this is a long-winded way of saying that this symposium and the ensuing discussions come at a perfect time. I hope to summarize the presentations and discussions from that day, and point you to the symposium wiki, where many of the presentations and other materials have already been posted.

Extending the Reach of Southern Sources
Before I dive into the meat of the symposium, it’s worth saying a few words about UNC’s Mellon-funded project, “Extending the Reach of Southern Sources: Proceeding to Large-Scale Digitization of Manuscript Collections.” (Some of this may be folklore, so please accept this as a good story if I am getting the details wrong.) In the careful-what-you-wish-for category, UNC applied to Mellon to fund digitization of the Southern Historical Collection: all of it. Mellon, in the person of Don Waters, countered by asking UNC to restructure their request to address all the issues they would need to consider in developing a plan to digitize their collections. Those issues include setting priorities (taking into consideration the needs of scholars), copyright and ethics, and sustainability. Thus far, the project has focused considerable effort on gathering information in workshops with scholars, symposia such as the one I attended, and a good deal of fact-finding from those in the community. I am particularly looking forward to the reports from the scholars’ workshops. Findings will be held up against the holdings of the collection; the SHC is developing a matrix that will help to structure digitization priorities against a backdrop of need, resources, and risk.

Kicking it off: what about third-party privacy?
Aprille Cooke McKay kicked off the conference with a talk on third-party privacy issues. The talk addressed mostly legal, but also ethical issues, and brought in some field reports that we can look to in assessing risk. Aprille started out by referring to the SAA Code of Ethics, which says (in part): “Archivists protect the privacy rights of donors and individuals or groups who are the subject of records.” We have an ethical obligation, but also a legal obligation to uphold any duty of confidentiality, particularly when expressed in a donor agreement (a contract between the donor and the institution).

That’s a clear-cut case of what must be done, but what to do in the case where the donor breached their legal duty by donating stolen or classified materials? In the case of Brown & Williamson v. Regents of California, a judge ruled that the materials stolen from Brown & Williamson by “Mr. Butts” and made available as part of the Tobacco Archive (now the Legacy Tobacco Documents Library) are of such public value that they didn’t need to be returned (or taken down once digitized). In any case, the breach is the donor’s liability, not the repository’s. There is little case law in this area.

Part of the reason that archives are so seldom (visibly) sued for disclosing private information is that paper documents are obscure: if there is “tale telling,” it’s done by those who use the documents (scholars or journalists). In a digital environment, however, documents are less obscure, and we must establish standards of care, which are currently lacking. Here, Aprille said what I think was the key phrase for the rest of the conference: what would a reasonable archivist do?

The presentation also covered other areas of interest (or that should be of interest to reasonable archivists). What defines “private”? How might the passage of time soften the definition of what is reasonably private? How does the law view privacy? (It is not covered by US federal law, and state laws generally do not favor the overly sensitive.) What constitutes defamation? (Here again, documentation showing what the reasonable archivist would do is needed.) Libel (also a matter of state law) has a short statute of limitations and a low win rate for plaintiffs. She also explored measures that would help to mitigate risk, such as “aging” material, creating good takedown policies, being respectful of complaints, and developing a contingency fund to cover litigation. There are not a lot of great case studies in this area, and we need to be alert for them. Some brave institution could go forward on behalf of the community and create case law.

My takeaways from this presentation were twofold. First, in many ways we have been protected by the relative obscurity of paper documents held in discrete physical locations. Second, I think that in many cases our current practices would not hold up to scrutiny where protecting privacy is concerned, and that conversations about what to digitize and what not to digitize will cause us to reflect on what we collect and give access to in the reading room as well as on the screen. To return to Aprille’s query, what would a reasonable archivist do? That is the question we all need to face. This reflection is a good thing for the archival community, and I look forward to the continued dialog.

And, as for the title of this post, we clearly need some shorthand for this problem. New acronyms, anyone? LSD:LE?

2 Responses to “Large-scale digitization of special collections: legal and ethical issues (part 1)”

  1. Jessica Sedgwick Says:

    Excellent post! I loved Aprille Cooke McKay’s talk on third-party privacy – I thought it was very clear and well-organized, and frankly promising. Privacy is an issue that has come up in a number of collections I’ve worked with, and I must say, I’m not 100% convinced that the “what would a reasonable archivist do” mindset gets me very far in terms of balancing ethics of access against ethics of privacy protection. I think you can find “reasonable” archivists all along the spectrum. Other considerations or approaches that came up in the symposium that I think were a little more potent for me personally were plain-old risk management, and coming back to “whether the definite good outweighs the potential bad.” This is not to harsh on the reasonable archivist, but I really wonder, what would he or she do? Would we ever have an answer to that question?

  2. Merrilee Says:

    Thanks, Jessica, for the comments. I also loved Aprille’s presentation and think she (and it) should be cloned. I agree that reasonable archivists are all over the spectrum, and hope that conversations like this one will help us to begin to find balance and develop guidelines that define the middle ground.

    One thing that I think came out in Aprille’s presentation — the risks for running aground with third-party privacy are pretty low, or at least seem to be in our current paper-based environment. I think we can mitigate those risks by working with donors (getting ahead to a future blog posting) and categorizing sensitive materials in the meantime.

    I would like for there to be more balance — right now, the scales do seem to be tipped in favor of risk aversion, which I fear is to the detriment of researchers and ultimately our missions.