and we’re in for the next round

Wednesday, October 26th, 2005 by Jim

Last night at the launch event for the Open Content Alliance, RLG was announced as a partner.

    RLG is contributing the RLG Union Catalog as a source for data to OCA participants. (press release)

These 48 million titles will help to describe the digitized library materials that will be contributed, to keep track of what is digitized, and to help inform choices about what should be digitized. As the work goes forward the information in the Union Catalog will be augmented with information about copyright status to help guide selection and scanning efforts.

As I said at the event, we’re happy to be an early contributor. The union catalog that our research library, archive and museum members built over the course of the last thirty years has been one of the great boons to scholarship and learning. The ability to discover the materials of research has been one of the most powerful enabling capacities in our community.

It seems only fitting that The Union Catalog, one of the great products of the cooperation and social contracts in our community, should enable successful collaboration on the next grand generational challenge – building out the collective digital library to enable a great research and learning experience for scholars, students and citizens. We hope the OCA becomes the focus for a huge collaborative, multi-year, massively distributed, intelligently and appropriately coordinated effort to achieve this goal. We’re ready to help make it happen.

The future has to start…

Wednesday, October 26th, 2005 by Jim

Yesterday I participated along with two other RLG colleagues in an invitational day-long workshop sponsored by the Open Content Alliance (OCA) and hosted by one of the founding participants, the Internet Archive. This working session was capped by an evening reception at which Brewster Kahle shared his vision of the kind of “open library” that the Alliance could create. The event was attended by over 200 guests and included announcements of the next round of contributing participants. The big news was that Microsoft and Yahoo! have provided financial backing for book digitization and agreed to the principles of the Alliance.

That these two huge rivals could come together in support of this effort drew renewed press attention to the possibilities and opportunities created by mass digitization of library materials. I think it breaks open a logjam that has developed over the last few years in the library, archive and museum community. The pile-up of all our pilot projects along with our early digital service efforts and the presence of the huge Google Print boulder has made it difficult to focus our vision and think positively about ways to work towards it. For those in attendance this was a very happy occasion. It felt like the start of something very real and important. It felt like the forward flow was beginning.

The workshop
The founding participants reached out to other OCA contributors and experts to try and “answer the question of what technologies are needed so that we can move book scanning and online distribution forward.” The group of about 30 worked in smaller groups on the topics of

1. Scanning and image processing
2. Copyright, collections and cataloging/metadata
3. Formats, tools, and interfaces
4. Governance and organization

The goal was to put some specific shape on the challenges and issues associated with doing mass digitization at industrial throughput levels and set a target for where the OCA could and should be in these areas one year from now. I was impressed with the thinking reported out by the scanning and imaging group and the formats, tools, and interfaces group, both of which offered up very sensible baseline requirements and architectures. I think that was due to the influence of John Kunze of the California Digital Library and Herbert van de Sompel of Los Alamos National Laboratory, who took the respective leads. They avoided the kinds of group urges that emerge in these settings to overbuild and honor every desire as a requirement.

The cataloging and metadata group had an easier time of it, given that the need for high-quality, consistent metadata for description, discovery and coordination is relatively obvious and something with which this community is familiar and comfortable.

The governance and organization group, in which I participated, struggled. I think some of that may have arisen simply from the use of the word governance. That immediately took the conversation down paths that were fraught – speculations about a membership organization, executives, advisory councils, etc. At the end of the day, however, the group agreed that the most important thing was to have a clear articulation of the short list of principles to which any participant in the OCA would have to subscribe. The founding members asserted a starter set of those principles, which are published on the OCA website. Nevertheless, it’s already clear that those need to be expanded and clarified. I don’t want to usurp the work that needs to go forward, but a good example of a principle that might need to be in place is that all contributed content is available for bulk downloading and re-hosting in other environments. This principle might be required for the vision of re-usability and re-purposing of library content to be fully realized.

In any case, a proper final set of principles would inform how much organizational structure is really necessary and what other supporting mechanisms have to be created. Rick Prelinger of the Internet Archive was charged with drafting up an approach to be reviewed by the founding participants and on which the community would be consulted. I was much in favor of this approach. It seemed to me that defining the endpoints was necessary before one could fill in the rest. The vision of the OCA is pretty clear and the principles that would have to inform contribution and creation of digitized content flow pretty straightforwardly from that vision. This is a formulation that says you can’t have a particular desired end state if you don’t start from certain consistent foundation characteristics. If you want that then you have to start with this. What fills between principles and vision are the administrative practices, operational processes and technical requirements that implement those principles to achieve that end state. And those can then be designed to be suitable to the purpose. We expect this to be fastened down over the next thirty days or so.

The launch
You’ll find lots of blog entries about the launch elsewhere. If you want to have the flavor of the evening I’d suggest you go to the Open Library web site and turn the pages on the Open Library volume there. That’s what Brewster did.

Archivists’ Toolkit

Tuesday, October 25th, 2005 by admin

Last week, I attended an advisory board meeting for the Archivists’ Toolkit project. The AT aims to provide open source software that will help archives of all sizes manage and describe archival collections. The software is being built modularly, and the draft specifications are online for anyone to look at now – although as a result of the advisory board’s advice, the specifications will be undergoing revision.

A lot of very good work has been done on the project since the first meeting I attended last October. I’m impressed with the work that the archivist analysts have done in terms of pulling together the specifications, and with the programming team. I’m looking forward to getting my hands on a working version of the software soon!

Why am I excited about this project? First of all, I think the AT can help to bring the archival community together in a similar way to EAD. EAD has given archivists not only a means for encapsulating their collection descriptions, but also something to gather around, to talk about, a platform for sharing ideas. EAD has been a great gathering point for the descriptive side of the archival community. I think the AT has the potential to bring the archival community together over descriptive practice, processes, workflow, and policies. I don’t think the AT will privilege a particular workflow, but I think that when archivists share a common tool, they will talk with one another about how they are using it and share ideas.

The AT will output EAD, MARC21 in XML, EAC, and METS. As someone who works for a data provider, I’m eager to see content generated in a uniform way. I’m also pleased that the project will take RLG up on our offer to incorporate the RLG EAD Report Card.

Who took that picture?

Wednesday, October 19th, 2005 by Günter

A long commute of listening to NPR makes for a person who has a lot to contribute at dinner parties, and it also helps with the occasional blog entry. Here’s an item from yesterday’s All Things Considered: in an interview with Purdue University Professor Edward Delp, I learned that every printer, scanner and digital camera leaves a unique signature which allows the documents created with the device to be tracked back. Astonishing. Not only of interest for forensic experts, but potentially also for archivists (provenance, anybody?) and folks thinking about digital preservation.

Showing up

Friday, October 14th, 2005 by Jim

About a month ago my colleague Günter wrote a post about the way in which our communities interact with industry. He wondered

“How do we communicate that the use of standards would actually give a product a competitive advantage in the marketplace? How do we communicate that our interest in long-term preservation isn’t so special – we’re not the only ones who’d like to see digital images accessible for generations.”

Well, one answer is to appear at industry conferences. I came across a presentation slot dedicated to Preservation Metadata Implementation Strategies (PREMIS) on the agenda of the upcoming 11th Digital Asset Management Symposium. Just where we need to be. The agenda reinforced for me that our concerns are shared and that advertising agencies, broadcast and media companies have some very powerful motivations to position their assets for long-term use.

P.S. The conference web site features a thumbnail photo of our friend, Murtha Baca, of the Getty Trust, that models LA attitude.

The more things change…

Friday, October 14th, 2005 by Jim

Today is the last day at RLG for one of the community’s best advocates and long-time leaders. Linda West, who has managed RLG’s Member Programs and Services for nearly a decade, is winding things up and heading back east. She’s going with our greatest good wishes and lots of sadness that we won’t have her as a colleague on a day-to-day basis. See the synopsis of her RLG career and plans here. Linda was an expert in library management, an accomplished leader in the world of cataloging and very thoughtful about the broad enterprise that we used to call “technical processing”. In reflecting on her many contributions I unearthed a 1992 report commissioned by RLG and chaired by Linda – Technical processing in large research libraries: seeking a new paradigm. Mountain View, Calif.: Research Libraries Group, 1992.

Unfortunately the text isn’t web-accessible right now but we hope to make it available soon. (It’s available here now as a PDF: 1992 tech processing in large research libraries.) In looking it over I was struck by the dramatic changes in thinking that have occurred, the prescience of the work plan that was put forward, and how much of what was desired back then has been accomplished.

I was also struck by the persistence of the core challenge – this kind of descriptive metadata continues to be very costly to produce and its utility difficult to measure. And while we’ve changed the economics of describing our redundant print collections we have an enormous distance to traverse before the economics of describing the unique and special in libraries, museums and archives will be positively affected.

We’ll keep at it even while Linda moves on – I know she’ll be watching.

Metaphors and Visions

Wednesday, October 5th, 2005 by Günter

If you like big metaphors and glowing visions of our technological future, I recommend that you read this article on Web 2.0. Even if you don’t enjoy that sort of thing, I still think this is a fine and instructive read – it provides a solid idea of how a number of fairly recent technologies and services such as virtual clipping, blogs and social bookmarking create a completely new information ecology. In moving from Web 1.0 to Web 2.0, the author asserts, each piece of information can be “analyzed, repackaged, digested, and passed on down to the next link in the chain.” Good-bye isolated bits of information, welcome to the intricate fabric of re-woven threads of ideas and content.

I believe that this discussion has direct bearing on the collaboration between museums, libraries and archives. All cultural heritage institutions hold fascinating strands of information, but no matter how great each institution’s individual collection, it only unfolds its fullest meaning-generating potential once it gets inter-woven with other strands of content from other peer institutions. But the ultimate aggregation of all cultural heritage content is only the beginning of the Web 2.0 journey – what happens if this massive amount of content now meets up with other existing or emerging data pools?

Some pie-in-the-sky scenarios: What if you could see for each artefact which university class the item was taught in and how many students chose to write papers on it? What if you could automatically reconstruct from auction house and dealer records when and where the artefact was sold, and for what price? (I’m getting side-tracked now, but what if you could track this data against the valuation of the stock market?) What if you could automatically reconstruct from museum exhibition records when, where and for how long a piece has been on display, and find the exhibition catalog at a library to boot? With all of these data-points, a whole new history of the appreciation, status and value of a particular piece of artwork over time could be written.

Maybe these scenarios aren’t particularly likely, and their merit for teaching and learning could be debated as well. However, the point I’m trying to make is that we now have the technology for truly mind-boggling connections between pools of data – it’s time for us as a community to envision where we’d like to go with these capabilities.

Ning: a playground for social applications

Tuesday, October 4th, 2005 by Merrilee

Ning, a new free service that helps to enable…well, the FAQ says it best.

Ning is a free online service (or, as we like to call it, a Playground) for people to build and run social applications. Social “apps” are web applications that enable anyone to match, transact, and communicate with other people.

Okay, the apps that have been built now are kind of lame, but there is lots of potential for the LAM community to use a social application platform, er, playground. Museum directory? Standard registry (people who liked this standard also used…)?

Found on BoingBoing. Reported to be the first product of a Marc Andreessen startup.

Open Content Alliance scans while Google hits the airwaves

Monday, October 3rd, 2005 by Merrilee

By now you probably know that our Neighbors to the North have applied to provide Wi-Fi – for free – to our other neighbors to the north, the city of San Francisco. San Francisco launched its wireless initiative a few months back in hopes of addressing the digital divide. Without getting into the digital divide, it’s pretty clear that the entirety of San Francisco, rich, poor, and in-between, has a lot to gain from this initiative. Those who stand to lose, potentially, are the “for pay” and very much wired telecom and cable service providers, including Comcast, SBC/Yahoo!, and Verizon.

When I think of who might gain from this project, I think of course of the academic researchers and students based at San Francisco’s large and small colleges and universities. I think of faculty and scholars who work or study at one of the Bay Area’s many other institutions of higher learning (think UC Berkeley and Stanford University), but who live or spend time in San Francisco. I also think of high school students, who can be mobile with their laptops and other devices. For the poor, I think of the $100 laptop being designed at MIT. Even though this computer was designed for use in developing countries, it certainly has application in the U.S.

And what might these users be looking at? In addition to licensed and purchased materials, they could be looking at out-of-copyright materials scanned courtesy of the Open Content Alliance. The OCA is an alliance that includes the European Archive, the Internet Archive, the National Archives (UK), University of California, University of Toronto, and Yahoo! San Franciscans might also have access to a wealth of material made available through the European Commission. Oh yeah, and there’s also that Google Print for Libraries project.

Librarian as genius

Monday, October 3rd, 2005 by Merrilee

So, many of us can now claim that we know a genius. Last month, the MacArthur Foundation announced its most recent fellows, the recipients of the so-called “genius award.” On that list is Terry Belanger, founder of the Rare Book School at the University of Virginia.

It’s interesting looking over the list of current and past recipients, which include a genius luthier, a genius fisherman, and a genius farmer. The American Library Association claims that Belanger is only the second librarian to receive a MacArthur fellowship. The way the list is organized, it would be difficult to pull out winners from the library, archival, or museum world, but a little searching and poking turned up no archivists. Not surprisingly, many museum people make the list, but almost all are curators. A possible exception is David Wilson (MacArthur class of 2001), founder of the Museum of Jurassic Technology in California. Since the Jurassic is sometimes considered a museum, sometimes not, this may be a toss-up.

Librarians are often portrayed in mass media as meek and mild-mannered, spinsters who famously shush disruptive patrons. In the 21st century, librarianship is not for the risk averse. Belanger is not a shy and retiring sort, and describes himself as “one of the noisier members of a considerable group of people who have worked for a very long time to help ensure that the future is not deprived of the past.” His dedication is legendary. He began to develop a program for rare book studies at Columbia University in the mid 1970s. When the School of Library Service at Columbia University packed up shop in 1992, he took the rare book program on the road to the University of Virginia. The Rare Book School has contributed to the education and continuing education of thousands of librarians. RBS has been run on a shoestring; hopefully this award will help bring attention not only to Belanger and the University of Virginia, but also to the importance of collecting, describing, maintaining, and preserving the unique, special, and rare materials in libraries, archives, and museums.