A recent article in Library Journal (Google Books vs. BISON) has had a number of us talking, and has reminded us of a presentation that Constance and I gave at CNI last year. I’ve had intentions of turning the presentation into a proper paper, because the information and our reasoning are, I think, quite pertinent to the question: what’s the likely impact of mass digitization on library collections? The answer is not what you might expect (and not what’s outlined in the LJ BISON article).
One of the assumptions put forth in the BISON article is that increased discoverability (via Google Books) will result in increased accessibility. Searchers will find texts via Google Books, and Google Books will likewise serve up the books in digital form. Right? Wrong.
In our CNI presentation, we looked at WorldCat for some measures of just how much material might be classed as out of copyright and hence available for full-text presentation in Google Books (or in any other system). This represents a very small fraction of library-owned content. We also considered how scholars are actually interacting, or could potentially interact, with those older texts (for the most part, materials published before 1923, and some portion of materials published to 1962 in the United States). Without delving into the specifics of the presentation (and what we hope to cover in our article), our conclusion was that public domain full text is inadequate to support current scholarly practice. The inadequacy of Google Books as a supplier of texts is compounded by Google’s well-known conservative stance on what qualifies as public domain. So while these books (and articles, if you expand the view to include Google Scholar) are much more discoverable, the content is not available online without purchase or, in the case of journals, without authentication, which would be provided by the library.
We find, in many of our discussions with Programs Partners, that there is a real yearning for a repeat of the “Anatomy of Aggregate Collections” study (otherwise known as the Google 5 study). It’s thought that “Google 5 plus me” or “Google 5 plus me and my friends” will allow institutions to get a better handle on the total volume of material scanned, which then will enable institutions to manage their print holdings differently. We think this is not where the answer lies. We will only be able to make use of this information when we can disclose something about the availability and preservation status of those digitized texts.
The BISON article says,
If Google Books is scanning old materials and also getting new content from publishers, this leaves relatively little for small to medium-sized academic libraries to contribute…. [W]hat will happen to the library’s role in preservation, cataloging, and circulation? Will Google and Google Books lead to the extinction of academic research collections as we know them?
What this misses, I think, is the main point. Without the library, without “the stuff,” there is no delivery chain. Books and e-resources, while indexed by Google Scholar and Google Books, are held by libraries. Because of copyright and licensing agreements, Google cannot deliver this material. The fact is, these monographs are discoverable, but not available, online. As long as this continues to be the case, this much-increased discoverability without equal accessibility will put greater pressure on delivery of print holdings for some time to come.
If you are dying to look at our presentation, I’ve loaded it into SlideShare.
Oh, and PS. If any of you are still expecting users to use Boolean operators, take a page out of the BISON study and cut it out right now. I say this with all the love in the world, knowing that you actually want your users to be successful when searching your catalog.