Archive for October, 2008

Google settlement, OCA, and “lending” digital books

Friday, October 31st, 2008 by Dennis
This article, from earlier this week in the San Jose Mercury News, deserves a wider audience:
 
http://www.mercurynews.com/business/ci_10839879?nclick_check=1
 
Thanks be to my OCLC Research colleague Ricky Erway for pointing it out to me.
 
The Open Content Alliance was meeting in San Francisco earlier this week when Google’s settlement with authors and publishers was announced. As you’ve no doubt heard by now, the settlement means that, in the foreseeable future, libraries will be able to buy subscriptions from Google so that their patrons can view (on special terminals) the full content of books digitized by Google, with authors and publishers sharing in the proceeds.
 
Breakout groups at the OCA meeting spent several hours after the announcement exploring an alternate vision. How can libraries freely offer up temporary digital versions of out-of-print but in-copyright books to any user anywhere, with the digital surrogate treated in every way possible as if it were an interlibrary loan?
 
How could this be done from a technical standpoint?
 
What sorts of agreements, if any, would have to be in place with publishers and authors?  Or would such “lending” of “temporary” digital copies be covered under fair use?
 
What sorts of limitations and “due diligence” would be required of libraries to protect themselves from charges of copyright infringement?
 
Would making the digital process mirror, as closely as possible, the process with a physical book be enough to protect the digitizing/lending library? For instance, holding the physical copy aside until the “temporary” digital copy expires (a toy sketch of this idea appears at the end of this post)?
 
Are there some sensible best practices that a group of libraries could come up with and test? 
 
I’m thinking yes…
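As a thought experiment on that “mirror the physical process” question, here is a minimal sketch in Python of lending logic in which only one copy circulates at a time: issuing the temporary digital copy sequesters the physical volume, and the surrogate expires on its own. Everything here (the class, method, and field names, the 14-day loan period) is a hypothetical illustration, not a description of any real system.

```python
# Hypothetical model of a "digital interlibrary loan" that mirrors a physical
# loan: while the temporary digital copy is out, the physical copy is held
# aside, so only one "copy" of the book circulates at any given time.

from datetime import datetime, timedelta

class DigitizedBook:
    def __init__(self, title, loan_period_days=14):
        self.title = title
        self.loan_period = timedelta(days=loan_period_days)
        self.due = None  # None means no digital copy is out; physical is on the shelf

    def lend_digital(self, now=None):
        """Issue a temporary digital copy and hold the physical copy aside."""
        now = now or datetime.now()
        if self.due is not None and now < self.due:
            raise RuntimeError(f"'{self.title}' is on digital loan until {self.due}")
        self.due = now + self.loan_period
        return self.due

    def check_expiry(self, now=None):
        """If the loan period has elapsed, the digital copy expires and the
        physical copy goes back on the shelf. Returns True when available."""
        now = now or datetime.now()
        if self.due is not None and now >= self.due:
            self.due = None
        return self.due is None

book = DigitizedBook("An out-of-print monograph")
print("Digital copy due:", book.lend_digital())
print("Physical copy back on shelf?", book.check_expiry())  # False until day 14
```

The point of the exercise is that the invariant (never more copies in circulation than the library owns) is easy to state and enforce in software; whether enforcing it would amount to fair use is exactly the open question above.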
 

The Google Book Search settlement – a summary

Wednesday, October 29th, 2008 by Jim

Our colleague Arnold Arcolio entertained himself by reading over the actual Google settlement documents. I thought you’d welcome this thoughtful summary.

Share your thoughts on whether his understanding is correct and what implications it might have for us and others.

Quoting from Arnold’s email of this morning to me:
“My understanding of the settlement between Google, AAP, and the Authors Guild comes mostly from reading the Google press release, the books rightsholders settlement, and especially the proposed Notice of Class Action.

It seems to me that the basic operating distinction is not between books which are in copyright and books which are not (a matter of fact, though sometimes hard to determine) but rather between books which are in print and books which are not (which Google will determine and publishers or authors can dispute; the Book Rights Registry is the organization that hears and arbitrates these disputes).

A books database, perhaps what Paul Courant refers to in his excellent post – The Google Settlement: From the Universal Library to the Universal Bookstore – as “the product,” will include all out of print books, whether in copyright or not, except those explicitly withdrawn by rights holders. It will include no in print books except those explicitly contributed by rights holders. Google may offer subject-based subsets of it. Google will produce and operate whatever system provides discovery and use. Use means gazing and page printing; institutions which subscribe for access and individuals who purchase access to individual items may also copy, paste, and annotate. A “research corpus” version with just about the same content as the product will support qualified researchers; it isn’t clear that they’d be from outside Google. The research projects that are offered as representative all have to do with making Google Book Search better.

It seems to me that a settlement serves Google better than a victory would, because it doesn’t open the way for similar efforts on whatever scale by others under a clarified definition of fair use. It does sound, however, as if the Book Rights Registry, an agency that collects and distributes payments, should support “similar programs that may be established by other providers.”

Out of print books, including those in copyright, can be previewed online more fully than they are now. This supports increased discovery and selection for out of print books. This generates chances for Google to sell fuller access to individual consumers. Authors and publishers will be compensated for purchases of books in copyright. For others?

Out of print books, including those in copyright, can be “viewed in full” (gazed at) at designated computers in US public and university libraries. Google cannot be charged with making these books any less accessible than they were. In fact, more books are available. However, they are in as illiquid a form as ever. This generates chances to sell fuller access to items to consumers, or subscriptions to institutions. “The Public Access service will provide the same access to Books as Google offers in the institutional subscriptions, except that users will not be able to copy/paste or annotate any portions of a Book. At public libraries that are able to charge for printing, and at all libraries at higher educational institutions, users will be able to print from the Public Access terminals for a per-page fee.”

Consumers can purchase “online access” to many in-copyright books.

Institutions (including academic institutions) can subscribe. Will this “subscription for online access” add “copy/paste” and “annotate any portions” functionality?

“Upon Registry approval, Public Access terminals may be made available for a viewing and per-page printing fee at commercial businesses such as copy centers, which will share those fees with Google and the Rightsholders.” And while “users will be able to print from Public Access terminals for a per-page fee,” will the terminals provide any other functionality?

Contributing institutions may be permitted to make “non-display uses” of books.

Other parts of the notice were particularly interesting to me as well.”

A web-based Collections Management System

Monday, October 27th, 2008 by Günter

I recently stumbled upon an announcement for NZMuseums, a website run by National Services Te Paerangi, itself a department of the Museum of New Zealand Te Papa Tongarewa. NZMuseums brings together collections information from museums across New Zealand. As of today, the system lists 383 museums and contains digitized objects (albeit sometimes only a handful) for 54 collections. It boasts a clean interface and allows you to tag items – you can add and remove tags, and they immediately become available (or unavailable) for searching.

While all that is nice and good (it’s actually more than nice and good!), what really caught my attention is the system’s architecture: in order for the often very small museums to be able to contribute, NZMuseums partnered with Vernon Systems to deploy their brand-new eHive system. In essence, eHive is the first web-based collections management system I am aware of (and you should feel free to contradict me if I’m wrong).

In a recent podcast interview [mp3] I did with Ken Hamma, he singled out the cost of ownership of technology as a key issue for museums, and he mentioned open source and web-based systems as a possible way forward. His math:

“A museum thinks about having a collections-management system. It goes out and licenses one for between $600 and $120,000, pays 11, 12, 13 percent maintenance year after year. But that’s only the beginning of the costs. Once you’ve got that thing, you need to be able to support it on servers. You need to be able to provide access. You need a network.”

The price point for eHive (numbers taken from the eHive factsheet [pdf]): it starts at “free” for 100MB of storage and 200 images, and tops out at $800 per year for 25GB and 50,000 images. No surprise that this was a good fit for NZMuseums and its quest to bring the many small museums of New Zealand online.
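To make Hamma’s math concrete, here is a back-of-the-envelope sketch in Python comparing the two models over five years. Only the $600–$120,000 license range, the 11–13% maintenance rate, and the $800-per-year eHive top tier come from the text above; the mid-range license figure, the hosting estimate, and the five-year horizon are assumptions for illustration.

```python
# Rough total-cost-of-ownership comparison: traditional licensed
# collections-management system vs. a hosted, web-based one.
# All inputs are illustrative except where noted in the comments.

def licensed_tco(license_fee, maintenance_rate, annual_hosting, years):
    """One-time license, then yearly maintenance plus local servers/support."""
    return license_fee + years * (license_fee * maintenance_rate + annual_hosting)

def hosted_tco(annual_subscription, years):
    """Flat yearly subscription; no local infrastructure."""
    return years * annual_subscription

years = 5
license_fee = 30_000        # assumed mid-range of Hamma's $600-$120,000
maintenance_rate = 0.12     # middle of the quoted 11-13%
annual_hosting = 5_000      # assumed servers, network, staff time

print(f"Licensed system over {years} years: ${licensed_tco(license_fee, maintenance_rate, annual_hosting, years):,.0f}")
print(f"eHive top tier over {years} years:  ${hosted_tco(800, years):,.0f}")
```

Under these (admittedly crude) assumptions the licensed system comes to $73,000 over five years against $4,000 for eHive’s top tier; the exact figures matter far less than the order-of-magnitude gap, which is Hamma’s point.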

LC-Flickr: updating the catalog

Wednesday, October 22nd, 2008 by Günter

In the context of John MacColl’s guest blog on Karen Calhoun’s Metalogue, I was reminded of the stats from the LC-Flickr project pertaining to changes LC made to its own catalog, prompted by insightful Flickr comments.

When I last updated my Flickr slides for a class at Syracuse University, I found 174 records containing the word “flickr” in an all-text-fields search of LC’s Prints and Photographs Online Catalog. The records in that set usually contain a credit such as “Source: Flickr Commons project” for information which has been added, as in this instance.

The same search today yields a whopping 4,256 records – which is quite close to the entire set of images LC has on Flickr (4,615 as of today). Upon closer inspection, I found that many of these records don’t contain a change to the substance of the record – however, they now do have a useful pointer to a discussion about the photograph on the Flickr site, and that’s why my search retrieved them. For an example, see this record, which includes the following language: “Additional information about this photograph might be available through the Flickr Commons project at http://www.flickr.com/photos/library_of_congress/2369119062”. On Flickr, one can then follow a playful discussion about dating the photograph.

Interestingly enough, these links to Flickr aren’t added programmatically – an item which doesn’t have comments on Flickr doesn’t seem to receive the link. See for yourself – the LC equivalent of this Flickr image does not contain the pointer in the LC record, since there were no comments on the image in Flickr.

It looks like LC continues to update its records based on Flickr user feedback, and it is also creating links so that people searching the LC catalog exclusively don’t miss out on the oftentimes rich discussion on Flickr. A search for “Source: Flickr Commons” yields 509 exact-phrase hits, the portion which most likely represents actual updates to the catalog.

The family business?

Friday, October 17th, 2008 by Jim

The Association of Research Libraries meeting that just concluded concentrated its presentations on leadership and career development. In particular, it celebrated ten years of the Leadership and Career Development Program and featured the current cohort of Research Library Leadership Fellows.

The opening keynote was given by Bernadette Gray-Little, Executive Vice Chancellor and Provost at the University of North Carolina at Chapel Hill, who spoke about diversity in higher education and research institutions. In response to a question seeking advice about how to recruit appropriate talent into research librarianship, she observed that our profession has a particular problem because

“people don’t aspire to research librarianship until much later in their life and career. You need to make the profession more visible and more interesting much earlier.”

During the reception I asked a number of attendees how they’d come to the profession. The most common answer was that they’d been exposed to it because a relative had been a librarian. (That’s true for me.)

Think firefighters and the police force. I wonder if the profession is really just a ‘family’ business.

P.S. One of the former participants in the Leadership and Career Development Program now has one of the most interesting position titles I’ve heard. Jeannie An is Director, 21st Century Fluencies / Liaison Program at McMaster University Library.

RLG Partners Among Early Grid Services Adopters

Thursday, October 9th, 2008 by Roy

At the prompting of my colleague Constance Malpas, I pulled the list of current users of OCLC Grid Services [PDF] to see how many RLG Partner institutions have already signed up. I found 15, some of them already heavy users, such as the University of Michigan.

I was gratified to see a mix of institution types, with the Getty Research Institute and the American Museum of Natural History joining libraries like Yale University, NYU, and New York Public. By my rough calculation (hey, I wasn’t a math major, so no guarantees), over 15% of those signed up are RLG Partner institutions. Considering that RLG Partners comprise significantly less than 15% of OCLC members, I think it’s a pretty good showing.

I also want to point out that those institutions that jump in early (that would be now) can help shape the services we deploy and how we deploy them. From that perspective, the significant percentage of RLG Partner institutions means they will have a disproportionate say in how things develop. That’s just as well, since they are also the institutions most likely to have the developer bandwidth to integrate these services into their local service arrays.

For more information, as well as a link to the form to fill out to get a key for your institution, see the WorldCat Developer’s Network.

EAD tools project

Wednesday, October 8th, 2008 by Merrilee

Last week, I wrote about the backlogs project. Thanks to those who contacted me with suggestions and pointers. I think I’m close to having a fully populated working group, and I’m excited by the institutions and projects that will be represented.

The next project I’d like to tell you about is the EAD tools project. Have you noticed the rather large number of EAD tools that have cropped up over the last two or three years? Well, I have. Don’t get me wrong, I think it’s a good thing that there are so many tools. But the very fact that we have so many makes it difficult to compare one to another. Some of the tools are just about EAD creation; others go into the realm of collection management.

The current project description says that we’ll be looking at open source tools, but colleagues in Programs and Research have persuaded me that we shouldn’t rule out commercial tools. It will be the first happy chore of the working group to come up with the rationale and to scope the exercise.

I have a few people who have expressed interest already, but if you are interested in participating or just in more details, please let me know.

The repository that isn’t there

Monday, October 6th, 2008 by John

Several European countries have moved in the direction of a national architecture for research information management. They do this for several reasons. One is the need for efficient means of representing research performance to their governments, in exchange for government-provided research funding. Another is to make their research visible in a coherent way for public relations purposes. And a third is to support knowledge transfer and commercialization strategies, which feature ever more prominently in rationales for government research funding.

In Europe, it sometimes seems that the countries with the best-developed national architectures for research are those whose institutions are least neurotic about being assessed for reward or punishment in the form of the division of a large research budget. Or, to put it another way, the UK has not yet developed its architecture very coherently. One country which has put a very integrated national architecture in place in recent years, however, is Ireland. And because its infrastructure is well developed and coherent, the role of repositories within it is well thought out, based on the need for open reporting, using systems which are intuitive enough to accept and manage the required content without the strain which can be so obvious in the UK.

At the recent innovative Repository Fringe meeting in Edinburgh, several of whose presentations are commendably made available in video format via Google Video, Niamh Brennan, Programme Manager for Research Information Systems & Services in Trinity College Dublin – one of our European Partners – revelled in the opportunity both to reveal how well-developed the Irish research architecture is, and in the Fringe’s encouragement to theatricality:

Friends, Romans (or Athenians of the North), I have come to bury the repository, not to praise it. The evil that repositories do lives after them – the good is oft interrèd with their bones; The noble Dorothea [Salo, from the University of Wisconsin, the opening keynote speaker] hath told us that the repository is ambitious; it has tried to do so many things, and has failed; if it were so, it was a grievous fault.

In a paper entitled CRIS Cross: the Repository in the Research Information System, she describes a nationally coherent research information infrastructure which is being built in Ireland. In this system, the humble institutional repository is so invisible to the researchers whose work is deposited in it that it might almost be dead, for all that it intrudes into their consciousness. What is visible is its Research Support System (confusingly known as RSS), in which researcher images and CV details are accompanied by publication citations drawn from the repository. Some of the information in turn feeds expertiseireland, the island’s national research expertise portal. Records in the repository have been created in a variety of ways, including by purchase from Thomson Reuters. Academics feed the RSS, unaware that the repository sits behind the system. There is just the occasional hint that there might be some other system lurking in the shadows, as in the invitation to upload the full text.
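A minimal sketch of that flow, assuming hypothetical record fields and function names (the post names the systems, not their interfaces): the repository holds the records, the RSS presents citations on researcher profile pages without exposing the repository, and a subset of the profile data feeds expertiseireland.

```python
# Hypothetical sketch of the Irish architecture described above. The
# repository is the invisible store; the Research Support System (RSS)
# is the researcher-facing view; expertiseireland gets a derived feed.
# All names and fields are illustrative, not taken from any real system.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class RepositoryRecord:
    author: str
    citation: str
    full_text_url: Optional[str] = None  # the occasional invitation to upload full text

@dataclass
class ResearcherProfile:
    name: str
    cv_url: str
    publications: list = field(default_factory=list)

def build_rss_profile(name, cv_url, repository):
    """The RSS view: citations drawn from the repository; the researcher
    never sees the repository itself."""
    pubs = [r for r in repository if r.author == name]
    return ResearcherProfile(name, cv_url, pubs)

def feed_expertise_portal(profiles):
    """Derived feed for the national expertise portal."""
    return [(p.name, len(p.publications)) for p in profiles]

repo = [RepositoryRecord("N. Brennan",
                         "CRIS Cross: the Repository in the Research Information System")]
profiles = [build_rss_profile("N. Brennan", "https://example.org/cv/brennan", repo)]
print(feed_expertise_portal(profiles))  # [('N. Brennan', 1)]
```

The design point the sketch tries to capture is that the repository is plumbing: researchers interact only with the profile layer, which is why Brennan can joke about burying it.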

All of this is consistent with the strategy which the Irish government has been funding since 1998 to improve the country’s competitive position as a knowledge-based economy. Universities have benefited from significant investment in research, and libraries have had funding which has allowed the development of a national e-resource portal, IReL, and a network of digital repositories with a common portal, IReL-Open. The commitment to Open Access research publication is proclaimed by the government itself. In recent months, the Irish Higher Education Authority has introduced a deposit mandate which Peter Suber applauds, though not as much as he did the earlier mandate by the Irish Research Council for Science, Engineering and Technology, which he praised, as Niamh Brennan puts it, as ‘possibly the best mandate in the world’.

Niamh Brennan and an attentive scholar in party mode

Her presentation fits with a general sense across this Repository Fringe event that libraries are tired of being embarrassed about their ‘failed’ repositories (David de Roure of Southampton University, in his closing keynote, suggests we call them ‘datacentres’ and lose the connotations of repositories as places ‘where things go to die’). They would rather look at the bigger picture which Ireland clearly exemplifies, and see – or rather not see – repositories as essential components better kept out of sight. Had the Fringe participants turned up to bury the repository? ‘Are we here for a wake?’ asks Niamh. ‘Maybe it’s a wake in which the body has disappeared; maybe it’s Finnegan’s Wake?’ The busts of the famous Edinburgh scientists and scholars watching attentively in their party hats seemed to symbolize this new spirit of defiance.

“Backlog” survey project

Thursday, October 2nd, 2008 by Merrilee

A few days ago, I said we’d write a little about each of our new projects in the archives and special collections program. The first one I’ll talk about is the Assess Archival Backlog Survey Tools Project.

The Greene-Meissner / “More Product, Less Process” report has caused archival institutions to look at backlogs differently. With this new focus, it’s a good time to revisit how we assess collections and, if possible, to identify some best practices so that we aren’t all reinventing the wheel one by one by one. I’ve used the term backlogs, but it may not be just backlogs we consider. I’m putting the group together now (and have already had a few volunteers), so if this is something that interests you, please send me an email. If there are projects or efforts you think the group should be aware of, feel free to pass that information along as well.

Also, please note that almost all of these projects have cumbersome names and are officially open to renaming!

The need for “copyright evidence”

Wednesday, October 1st, 2008 by Merrilee

Love it or hate it, the Shawn Bentley Orphan Works Act of 2008 died in the US House of Representatives this week. The bill will not come up for discussion until later, possibly after the general elections in November (and, the way things are going, I think likely not until the next Congress). This is the most recent attempt to provide a legal framework for those who want to use works defined as “orphaned” – works whose copyright owner cannot be easily located.

With various forms of proposed and pending legislation on the horizon for some time now, librarians and others have been operating in the breach, tracking down what evidence they can in hopes that some sort of safe harbor will be established for those who have made what has been termed a “reasonable effort.” To support these efforts (and to help cut down on redundancy), OCLC has released a pilot Copyright Evidence Registry. I’ve been working off and on with the team that created the CER for some time, and it has been great to see the project moving forward. Our own Copyright Investigation Summary Report (issued in January) provided background information to the team members and to the larger community.

If you are interested in participating, there’s information on the website about how to sign up for the pilot. Kudos to colleagues for getting this out the door.