A recent article in Library Journal (Google Books vs. BISON) has had a number of us talking, and has reminded us of a presentation that Constance and I gave at CNI last year. I’ve had intentions of turning the presentation into a proper paper, because the information and our reasoning are, I think, quite pertinent to the question: what’s the likely impact of mass digitization on library collections? The answer is not what you might expect (and not what’s outlined in the LJ BISON article).

One of the assumptions put forth in the BISON article is that increased discoverability (via Google Books) will result in increased accessibility. Searchers will find texts via Google Books, and Google Books will likewise serve up the books in digital form. Right? Wrong.

In our CNI presentation, we looked at WorldCat for some measures of just how much material might be classed as out of copyright and hence available for full-text presentation in Google Books (or in any other system). That material represents a very small fraction of library-owned content. We also considered how scholars are actually interacting, or could potentially interact, with those older texts (for the most part, materials published before 1923, and some portion of materials published to 1962 in the United States). Without delving into the specifics of the presentation (and what we hope to cover in our article), the short version is that public domain full text is inadequate to support current scholarly practice. The usefulness of Google Books as a supplier of texts is further compromised by Google’s well-known conservative stance on what qualifies as public domain. So while these books (and articles, if you expand the view to include Google Scholar) are much more highly discoverable, the content is not available online without authentication (in the case of journals), which the library would provide, or without purchase.

We find, in many of our discussions with Programs Partners, that there is a real yearning for a repeat of the “Anatomy of Aggregate Collections” study (otherwise known as the Google 5 study). It’s thought that “Google 5 plus me” or “Google 5 plus me and my friends” will allow institutions to get a better handle on the total volume of material scanned, which then will enable institutions to manage their print holdings differently. We think this is not where the answer lies. We will only be able to make use of this information when we can disclose something about the availability and preservation status of those digitized texts.

The BISON article says,

If Google Books is scanning old materials and also getting new content from publishers, this leaves relatively little for small to medium-sized academic libraries to contribute…. [W]hat will happen to the library’s role in preservation, cataloging, and circulation? Will Google and Google Books lead to the extinction of academic research collections as we know them?

What this misses, I think, is the main point. Without the library, without “the stuff,” there is no delivery chain. Books and e-resources, while indexed by Google Scholar and Google Books, are held by libraries. Because of copyright and licensing agreements, Google cannot deliver this material. The fact is, these monographs are discoverable, but not available, online. As long as this continues to be the case, this much-increased discoverability without equal accessibility will put greater pressure on delivery of print holdings for some time to come.

If you are dying to look at our presentation, I’ve loaded it into SlideShare.

Oh, and PS. If any of you are still expecting users to use Boolean operators, take a page out of the BISON study and cut it out right now. I say this with all the love in the world, knowing that you actually want your users to be successful when searching your catalog.

  1. I drew a different conclusion from the article’s remarks about Boolean operators. It’s true that most people won’t use Boolean operators most of the time, but some people will use some Boolean operators some of the time. In particular, Google uses OR, so some people will be familiar with that. Also, Google’s Advanced Search (which is nothing of the sort) shows users how to construct a query using OR. On the other hand, no one is going to use “AND NOT” when Google uses a minus sign.

    As for accessing materials located by Google Book Search, you’re absolutely right that the library and not Google is the repository and delivery mechanism. For the user to move more or less seamlessly between Google Books and their library account(s), either their Google profile has to include a pointer to their library, or the library discovery application has to search Google Books. In the former case, the browser can hold the user’s credentials; there’s no need to pass them to Google. This would be a nice application for OpenID, if OPACs began to support it.

    For that matter, I’d love to be able to link my WorldCat and library accounts.

  2. I’m really glad that people are doing this kind of work, and that many libraries are starting to integrate GBS into their catalogs. However, I thought the analysis in this article was pretty simplistic. Of course Google Books returns more results than a local OPAC. It 1) likely has a much larger database than BISON and 2) is searching full-text rather than metadata. More interesting, I think, would be an analysis comparing Google Books to another full-text system. Then you’d be comparing apples to apples.

  3. Thanks, Nancy. I should say that for those who study the Victorian era, Google Books (along with licensed resources such as EEBO and ECCO) is probably a great thing. I’ll take a look at the blog. Thanks for the pointer!

  4. Merrilee, have you and your colleagues looked at the blog of Miriam Burstein, a professor in one of the upstate SUNY colleges? She’s a Victorianist working on religious literature and frequently posts about her frustrations with Google Books, which holds the promise of fabulous material for her but sometimes is just maddening. See here, for example. There may be other scholars out there with similar anecdotal reports.

