Across the US Thanksgiving holiday I was privileged to give the concluding talk at a symposium on library, archive, and museum digitization sponsored by the Keio University Media Center (libraries). They have been RLG members since 2002 and have regularly invited me to keynote these gatherings. There is much more interaction and coordinated activity among the libraries, museums and archives than there was even last year. It’s impressive. I suspect that Keio’s decision to become a Google Book Search partner will spur even more activity. (They’ve indicated that they’ll be cooperating with the WorldCat e-synchronization project.)

The symposium was surprisingly informative (considering that I had to hear it through an interpreter, who was very good even in the swirl of professional jargon) and well attended, with over 110 participants. Highlights for me were:

- Remarkable progress at the National Archives of Japan (Digital Archive here) and the Museum of Modern Art in collaborative digitization and joint disclosure (although they don’t know that word).

- Good examples of an approach to letting communities build their own archive and curate it later from a Keio Digital Media staffer (the 100th anniversary of Japanese immigration to Brazil).

- An example of an archive creating an interim product for scholars: the Tatsumi Hijikata butoh archive at Keio commissioning one of the master’s students to video each of his circa 2000 named dance forms. More about this in a separate post.

- And an example of new scholarship from the history professor responsible for the Silk Roads project.

- My presentation (Mass Digitization and Special Collections) was well received but anticlimactic, since the audience had received a copy of my slides plus a translated version of my speaking notes. The big hit was the map of the 200 most successful web sites arrayed on the Tokyo subway grid.

There was lots of evidence that Japanese institutions have begun to regard systematic digitization as an intrinsic part of their mission. That was heartening.

I’m sure you all know the feeling: you’re looking forward to the brand spanking New Year 2008, but you feel a moment of panic when you realize that after the ball has dropped in Times Square, you may not have enough Metadata on your calendar. “Spotlight on Metadata” to the rescue!

I’ll let its creator Glen Wiley, Metadata Librarian at Cornell, explain things in his own words:

I get frustrated with trying to keep up with professional development events related to metadata. As a result, I’ve created a Google calendar of metadata events that I hear about. Please have a look at the calendar at

The calendar is mostly made up of metadata-related events from D-Lib Meetings, Conferences, and Workshops. I also post events that come across listservs (Metadatalibrarians, AUTOCAT, DLF, DSpace, EAD, JISC-Repositories, METS, Web4lib, XML4lib, etc.). If you hear of any national or international events that you don’t see on the calendar, please e-mail me at .

Picked up on the Metadatalibrarians mailing list. Thanks, Glen! It’s great to have one place to check what’s going on!

During our recent internal Programs and Research summit meeting, we re-visited various portions of the PAR work agenda, looking for opportunities to add, revise, or perhaps even delete. I participated in discussions having to do with the Management Intelligence section of the agenda, which covers work aimed at gathering, mining, and analyzing data sources in support of library decision-making and context-setting needs. Management Intelligence extends over a wide range of current and prospective projects in PAR, but the scope and rationale of this work can be summarized with a couple of simple themes: Aggregate – Analyze – Generalize, which summarizes the sort of work we do in Management Intelligence; and Context – Evidence – Patterns, which summarizes why we do it.

1. What we do: Aggregate – Analyze – Generalize

Aggregate: The more institutions, collections, and individuals over which we aggregate data, the richer the context against which decision-making can be placed; issues can be characterized and understood; and patterns can be discerned and extrapolated. Much of PAR’s data-mining work flows around the aggregated bibliographic and holdings data in WorldCat. But our WorldCat-based work will be complemented with a new emphasis on aggregating other forms of data, such as circulation data, ILL transactions, virtual reference queries, click-through patterns, and e-usage data.

Analyze: Value is released from data by analyzing and leveraging it in innovative ways that support a variety of needs. Management Intelligence will prioritize work aimed at providing our Partners with the information and evidence required to support the directions in which they are moving, in areas such as digitization, shared print storage, and deeper forms of collaborative collection management.

Generalize: An important aspect of the work in Management Intelligence will be to identify and pursue opportunities to change library practice and improve existing services and processes. In pursuing these goals, we will prioritize forms of analysis that can be “generalized” into standard methodologies applicable across a variety of contexts – for example, by converging on sets of standard questions to be asked of the data in particular decision-making scenarios.

2. Why we do it: Context – Evidence – Patterns

Context: Cultural heritage institutions must endeavor to understand, and where appropriate, seize, opportunities created by trends, technologies, and other factors shaping the information environment. Therefore, an area of priority will be to pursue work that supplies empirical context for a range of general issues impacting libraries, archives, museums, and the wider information landscape. Such work will inform community-wide dialog on these issues, and help participants channel discussions in productive directions.

Evidence: Decision-making is increasingly data-driven. As more and more library services and usage migrate to online environments, the ease with which data can be captured, aggregated, and leveraged to support decision-making and planning will only increase. Looking ahead, we will prioritize work aimed at cultivating an “evidence-based” approach to library decision-making. In doing so, we will address issues like characterizing what an “evidence base” looks like in various library decision-making contexts, and developing clusters of questions that draw on well-defined evidence bases to inform key decision-making processes.

Patterns: As we collect, aggregate, and analyze data about library collections and user behavior, patterns begin to emerge, illuminating the shape of aggregated collections, the research and learning habits of library users, as well as other features of the overall library landscape. As these patterns emerge, we gain a better understanding of the system-wide characteristics of library collecting and usage activity. This intelligence can inform libraries’ thinking on ways to optimize the system-wide supply and demand for library materials, and in particular, how to reduce supply-side cost while improving demand-side accessibility.

Since our Programs colleagues joined us last year, the opportunities for work in the area of Management Intelligence have expanded dramatically. It is sometimes difficult to draw together all the strands of current and future work being undertaken in this area. However, the dual themes of Aggregate – Analyze – Generalize, and Context – Evidence – Patterns are a useful way to think of this work as a cohesive whole, as well as a roadmap for the kinds of work we will be prioritizing in the future.

Continuing our tradition of previous years (okay, RLG Programs was in on it last year), OCLC Programs and Research is pleased to “send” this holiday card your way:

The card showcases a new technology from Adobe called “Flex.” Clicking on a word in the tag cloud will show results in a pie chart. Select a piece of the pie to show results of queries into, WorldCat Identities, and the Dewey Browser. The words in the tag cloud were generated using a super-secret algorithm which pulls holiday-related words out of thin air.

Enjoy the card and the virtual pie. Something fun to mark the holidays from all of us in OCLC Programs and Research.

Recently I realized that I’m spending almost as much time in “professional listening” as I am doing “professional reading”. So many interviews, Webcasts, TED talks, Google Tech Talks, and the like!

So I was intrigued indeed when I read about Searching Video Lectures, a tool from MIT that leverages decades of speech-recognition research to convert audio into text and make it searchable, as reported in MIT’s Technology Review of November 26, 2007.

I tried out the Lecture Browser. It currently has only 200 publicly available lectures, but still! As an astronomy buff, I was thrilled to zero in on professors’ insights about Hubble images (retrieved easily by a keyword search on “Hubble”). Definitely a fun tool to play with.

Then I thought about oral history projects I’ve known. Think of all the recordings of interviews with individuals who provide insight to our history, culture, and perspectives. The ones I know about have a MARC record about the interview with a very brief summary of the topics covered (usually with associated subject headings), the media used (e.g., “sound tape reel”), and a note that a transcript is available. But in the Brave New World, imagine what it would be like for researchers to type a few keywords and pull up both the transcript where the keywords appear and the spot in the audio where the topic is discussed.
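For the technically curious, here is a rough sketch of what that kind of lookup could look like once a transcript has been broken into timestamped passages. It is only an illustration under my own assumptions: the `Segment` class, the `search_transcript` and `format_timestamp` helpers, and the sample interview text are all made up for this example, and none of it reflects how the MIT Lecture Browser actually works.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One passage of a transcript (hypothetical structure for illustration)."""
    start_seconds: float  # offset into the recording where the passage begins
    text: str             # transcript text for the passage


def search_transcript(segments, query):
    """Return (start_seconds, text) for each segment containing every query word.

    Matching is a simple case-insensitive substring test, not real
    linguistic search.
    """
    words = query.lower().split()
    if not words:
        return []
    hits = []
    for seg in segments:
        lowered = seg.text.lower()
        if all(word in lowered for word in words):
            hits.append((seg.start_seconds, seg.text))
    return hits


def format_timestamp(seconds):
    """Render an offset in seconds as HH:MM:SS for display next to a hit."""
    minutes, secs = divmod(int(seconds), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


# Invented sample data standing in for an oral history transcript.
interview = [
    Segment(0.0, "I grew up on a farm outside of town."),
    Segment(95.5, "We moved to the city when the cannery opened."),
    Segment(312.0, "The cannery strike changed everything for my family."),
]

for start, text in search_transcript(interview, "cannery strike"):
    print(format_timestamp(start), text)
```

Each hit carries the offset where its passage begins, so a player could jump straight to that spot in the audio while showing the matching transcript text alongside it.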