Digitization and access in archives

In my last Archives Month post, I reflected on the shift in practice related to allowing researchers to use cameras in special collections reading rooms, and what this has meant for users. Today, I’d like to focus on another area of practice that has shifted significantly in the relatively recent past: digitization. We heard a lot about digitization and a desire for online access to digital collections from all the users we surveyed and interviewed during our research for the Building a National Finding Aid Network (NAFAN) project. Some of this is synthesized in our summary of findings report for the project. But today I’d like to spend a little more time pulling together digitization-related threads from our user research, as well as reflecting a bit on where we’ve been and where we’re going in digital collections work.

Digitization facilitates access

Color photograph of a person setting up a manuscript for digitization using an overhead camera and copy stand.
Digitisation of a Dunhuang manuscript in the IDP UK studio from Wikimedia Commons

In our pop-up survey of 3,300 archives users, we asked respondents about their preference for accessing archival material online versus in person. The largest share (42.7%) indicated that they preferred online access but were willing to use materials in person. Roughly a quarter of the respondents (23.6%) indicated they had no preference between online and in-person access. Smaller, nearly equal shares said they were only interested in online access (14.4%) or preferred accessing materials in person (14.7%). In our interviews with archival researchers, being able to access digitized material online was the feature most frequently mentioned by participants as enabling them to see and use archival collections.

Viewing archives in person often requires researchers to travel outside their home city, state, region, and sometimes country. Interview participants described competing responsibilities and limited time and funds to dedicate to travel as major challenges to accessing collections. One Family History Researcher explained, “[what’s] frustrating is knowing that there’s something that you can only see if you travel to the library and knowing that I don’t have the time or the financial wherewithal to get there.” An Academic Researcher explained that they may be able to get funding or justify travel to an archive for their research and publication work, but for their teaching work, it was not feasible. “If I’m using archival materials for teaching purposes … either proximity or being available digitally matters, right? It has to be at [home university] so we can get the real thing, or it has to be digitized, and there’s kind of no in-between there.” Participants identified online access as helping to alleviate barriers or challenges presented by travel, expense, or time required to do in-person research.

Conversely, many participants cited a lack of digitized content as a barrier to access. “[Digitized items will] turn up when I’m messing around, but ordinarily, the aggregators just list the shelf numbers and box numbers and a description, they don’t show the images, which doesn’t help me at all. … I can’t reach through the screen and grab box 13.” (Personal Interest Researcher). Another participant described how the lack of digitized material limits their family history research: “I get frustrated because I have a lot of relatives in [Town] and those newspapers aren’t online. So I keep periodically checking for that, hoping they’ll digitize. But I don’t think they’re doing anything. … I just feel like I’m super limited to what’s available online.” (Family History Researcher)

Another participant who does extensive research for their job described their research process, the vital role digitized content plays in it, and how lack of digitized content impacts their work. “I would start to rack my brain and think, ‘Okay, what is digitized out there? What books are digitized, what newspapers are digitized, what magazines are digitized?’ And of course, it’s much more difficult to work with special collections, for the most part, because that’s not all digitized. … It’s wonderful to come across that finding aid or that container list. But I tell you, if it ends there, it’s no help to me because it’s not like I’m going to travel to [repository on the opposite coast]. For me, it really needs to be digitized, unless it’s a [home city] repository. And so therefore, my research does not often include a lot of special collections items. It includes a lot of newspapers through ProQuest, it includes Archive.org, Google Books, it includes the HathiTrust. It’s limited to those types of sources.” (Professional Researcher)

Digitization on demand

In addition to existing online access to digital collections, we also heard from interview participants about the importance of digitization on demand services. “There’s definitely a lot of records that are not available online. And in that case, usually I can find where they are stored and go through that process then of contacting a researcher or the archivist that might be able to scan that and send it to me.” (Family History Researcher) Researchers also highlighted the importance of archival websites in supporting their access by making it easy to find digitization request forms or easily understand how to request reproductions of materials.

We also heard from participants that archives’ focus on digitization and online access to materials during pandemic closures enabled their research. A faculty researcher explained, “COVID was wonderful for the willingness of archivists all around the country to say, ‘Oh, of course we can do that,’ and send me 20 scans of something with almost no effort on my part.” Another participant shared their gratefulness that, “because of the pandemic, I think [archival institutions] put a lot of their stuff online and I was able to … access stuff that I don’t know if … I would’ve normally been able to access.” (Artist/Creative Researcher)

The good stuff

Something I found interesting in our interviews was that some participants perceived a difference in value to their research between material available online and what could only be used in person. This was voiced by a handful of researchers across user types.

One family history researcher said that they used material available online in their research but felt that “most of the best information is in the archive and not electronically available.” Another family history researcher explained, “As a genealogist … you’re able to get the skeleton of the story through resources such as Ancestry.com and FamilySearch. Because they give you the vital statistics of people, when they were born, when they were married or died … but they don’t tell you the stories which you’ll find in newspapers, which you’ll find in court cases, you’ll find in land records and things like that. So that’s where archival material really comes into play for me … the narratives that I’ve been able to create really have only been able to come about as I’ve gone into archives and libraries and moved away from what’s readily available online.”

Some of the faculty and professional researchers valued the relative inaccessibility of material that isn’t online. They sought to discover new sources in their subject area that hadn’t yet been analyzed and interpreted, and thought materials only available in the reading room were less likely to have been used by others in their own publications. One Faculty Researcher lamented that some of those discoveries became more broadly available through digitization on demand requests. “I have found these amazing things that I’ve dug up, and then they give them to me, then they make them accessible to the public. Which is fantastic, but on the other hand, there’s something about historians, we want to be the first, we want to be like, ‘This is mine.’”

Digitization costs and expectations

Talk of digitization can occasionally lead to some grumpiness from archivists, because of unrealistic expectations that archives and special collections should be able to “digitize everything.” Another aspect of our interviews that I found interesting was how many participants were aware of the resource intensity of digitization. “I know that’s really expensive, and I know that that’s a challenge … we need to give more money to organizations to digitalize their written collections because it’s just so much easier to use.” (Artist/Creative Researcher)

Similarly, they voiced realistic (in my estimation) expectations of repositories’ capacity to digitize their collections. “Of course, it’s a nice idea to think it could be digitized and be available online. … It’s not like they’re going to digitize their whole collection and put it online, I don’t think.” (Faculty Researcher) In the interviews we asked participants, if they had a magic wand, what their ideal way to discover and access archives would look like, and many brought up a desire for digitized materials. Even armed with magic powers, some participants remained modest in their desires, wishing only that more materials would be digitized, while others were more expansive. “Honestly, if I had a magic wand, everything, every archive would have everything digitized … All of those would be fully searchable, totally transcribed. If they’re in a different language, they would be translated in English for me. [chuckle] Yeah, I think that would be the dream type of database of having everything digitized, transcribed, translated and searchable.” (Family History Researcher)

Evolution of digitization practice

The focus of digitization work in archives has matured from small, bespoke, and highly curated projects at the outset of the 21st century to the large-scale, ongoing workflows of today. From 2007 to 2011, the OCLC Research Library Partnership led a variety of work that aimed to accelerate that maturity cycle and support the field in scaling up digitization in service of providing better access to rare and unique collections. Ricky Erway and Jennifer Schaffner’s Shifting Gears: Gearing up to Get in the Flow (2007) was a provocation intended to “compel us to temper our historical emphasis on quality with the recognition that large quantities of digitized special collections materials will better serve our users.” In 2009 and 2010, we did work to introduce balance in rights management, designed to ease the burden of overly restrictive or risk-averse approaches to rights management. In 2011, Rapid Capture: Mass Digitization of Special Collections looked at capture methods and workflows to support scaling up digitization, and Scan and Deliver: Managing User-initiated Digitization in Special Collections and Archives presented strategies to help institutions build infrastructure and policy to address the increasing demand from researchers for user-driven digitization.

It’s been interesting to read through these reports, most of which I haven’t looked at for many years. Much of what they suggest has become common practice, and some of it didn’t totally pan out. And some of it remains relevant to challenges we haven’t quite figured out yet. So where are the challenges to digitization now?

Certainly, resourcing is one of them, as our interview participants recognized. Our Total Cost of Stewardship communication tool suite includes a digitization project assessment template that can help institutions articulate and advocate for the full costs associated with digitizing materials and making them available online.

Another facet of digitization work that we still collectively struggle with is dealing with copyright, which of course is also tied to resourcing in that it requires humans with expertise to do rights assessment work. The RLP recognized and tried to address this challenge with our Well-intentioned practice for putting digitized collections of unpublished materials online (2010). Thirteen years later, I think this work still proves relevant and useful. And some things it outlines, like using take-down policies as a strategy to manage risk, have become commonly accepted practice. The document also details ways to prospectively work with donors to make future digitization work less burdensome, including suggesting Creative Commons licensing in deeds of gift. Our more recent RLP Works in Progress Webinar Radical Access—Leveraging Creative Commons Licenses to Open up Archives is a terrific primer that explains the options available using Creative Commons licensing and offers strategies for explaining open licenses to donors, negotiating for them, articulating them in archival descriptions, and helping researchers make sense of them.

Institutions are also incorporating digitization into strategies to address diversity, equity, and inclusion goals, taking a corrective stance when prioritizing what collections to digitize. Two recent RLP Works in Progress webinars share approaches and lessons learned from such efforts. This wasn’t for you yesterday, but it will be tomorrow—Digitization policy to counteract histories of exclusion shares work at Louisiana State University to frame digitization as anti-racist action. Slavery, abolition, emancipation, and freedom—Primary sources from Houghton Library shares nuts and bolts workflows and useful lessons from the year-long Houghton effort to prioritize the digitization of their African American history holdings. Both webinars are clear-eyed and candid accounts of their work and I’d recommend them to anyone trying to make a similar effort in their own institution.

The NAFAN user research made me think more about digitization than I honestly had in quite some time. It reminded me of, and reaffirmed, just how important digitization work is in building toward our professional value of providing equitable access to the collections archives hold in trust for the public.
