Archive for August, 2008

Webinar on WorldCat Search API

Friday, August 22nd, 2008 by Roy

Thanks to my colleague Merrilee for announcing the WorldCat Search API. I guess I was so busy planning a Hackathon in November to help folks use it that I forgot to blog about it. More on the Hackathon later. But something coming up next week on Wednesday is a webinar on the API:

You are invited to attend an RLG partner webinar, “Using the WorldCat Search API” on Wednesday, August 27 at 4 PM Eastern Daylight Time (1 PM Pacific Time). In this webinar, Senior Program Officer Roy Tennant and Consulting Software Engineer Bruce Washburn will provide an overview of WorldCat Search API features, demonstrate some current applications that use the API, and offer a chance to ask and answer questions.

Just launched, the WorldCat Search API provides OCLC libraries with new ways of taking advantage of the WorldCat database and features. With the API, you can build WorldCat search results, metadata, and links to library catalogs into your own systems. Supporting common search protocols like OpenSearch and SRU, and delivering data in standard formats like RSS, Atom, Dublin Core and MARC, the API is ready to be applied to a wide array of applications.

You may add this webinar to your calendar.

To join the online WebEx meeting:
1. Go to
3. Enter the meeting password: partnerwebinar
4. Click “Join Now”.

To hear the audio component of this presentation, call:
USA Toll free: (866) 487-5722
International Toll: +1 (281) 540-4983
and provide participant code 6147615231 to join the call.

If this is your first time using WebEx, please log on a few minutes early to download the required software.

For WebEx technical support, call:
US/Canada Toll free: (866) 299-3239
International Toll: +1 (408) 435-7088

This webinar will be recorded and made available on the RLG Programs Web site in late August.

This was announced on both the RLG Systems Heads list and the general RLG Announce list, but here it is in case you aren’t on one of those lists. And if you are not part of the RLG Partnership, well then, let me be explicit: you’re welcome too.
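For anyone who has not used SRU before, a search request is just a URL carrying a handful of standard parameters. Here is a minimal sketch in Python of how such a URL might be put together. The endpoint address and the `build_sru_url` helper are made up for illustration and are not the actual WorldCat Search API address or client; the parameter names (`operation`, `version`, `query`, `maximumRecords`) come from the SRU 1.1 specification, and `dc.title` is an index from the Dublin Core context set for CQL. Whether a given server supports that index, or needs extra parameters such as an access key, is something to check in its documentation.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, for illustration only; not the real WorldCat address.
BASE_URL = "https://api.example.org/sru"


def build_sru_url(cql_query, max_records=10, base_url=BASE_URL):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query string."""
    params = {
        "operation": "searchRetrieve",  # standard SRU operation name
        "version": "1.1",               # SRU protocol version
        "query": cql_query,             # the search, expressed in CQL
        "maximumRecords": max_records,  # cap on records returned
    }
    # urlencode takes care of escaping spaces and quotes in the query.
    return base_url + "?" + urlencode(params)


url = build_sru_url('dc.title = "moby dick"')
print(url)
```

Fetching that URL from a real SRU server would return an XML response containing the matching records, which is the part the webinar will cover properly.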

WorldCat API

Thursday, August 21st, 2008 by Merrilee

My colleagues Roy Tennant and Bruce Washburn have invested significant effort in the WorldCat API, and now it’s out (Roy has been so busy, he hasn’t had time to blog about it here!). The API is available to all OCLC member libraries that maintain a cataloging subscription.

The API was announced earlier this month and the amazing Mark Matienzo has already rolled up his sleeves and put out a “pre-pre-alpha Python module.” Talk about grabbing the snake by the tail….

Where should the data live? What’s its specific gravity?

Wednesday, August 20th, 2008 by Jim

One of the areas we targeted for attention back in 2006, as we set the work agenda with our new Research colleagues, was the nature of interactions with the Integrated Library System (ILS). It was clear that external systems were more often calling on and relying on services embedded deep in the ILS, yet there were no standard services offered. We thought it would be interesting to work with the RLG Partnership to design and define an ILS Service Layer – an agreed-upon set of services for an ILS to expose. Our initial focus was going to be on connecting a discovery environment (e.g. WorldCat, Google Scholar, etc.) to ILS functionality (e.g. get availability).

Two things reduced our investment in this project. The advent of WorldCat Local made this into a very important product focus, since WorldCat Local doesn’t work very well without that well-understood service layer. And in more or less the same time frame, the DLF convened a Task Group to recommend standard interfaces for integrating the data and services of the Integrated Library System (ILS) with new applications supporting user discovery. We volunteered to be part of that working group but, because of the OCLC service offering, were invited only to comment when ILS vendor input was solicited.

In any event our colleague, Janifer Gatenby, had done a very thorough job of looking at the data in the ILS in preparation for this work. She gave a lot of thought to where this data ought to reside for maximum effectiveness. In the current network environment many data types that are supported at the local level might be both more efficiently managed and usefully invoked if they resided at the group or network level. She’s codified her thinking and her recommendations in a paper just published in Ariadne.

Her paper, The Networked Library Service Layer: Sharing Data for More Effective Management and Co-operation, is sensible, accessible and deserves consideration as we re-configure library services and the management processes that support them.

She particularly focuses on where the data should reside and specifies some of those levels:

* Globally sharable data (e.g. bibliographic metadata, holdings, issue level holdings, suppliers, statistics, reference query-and-answer pairs)
* Data that can be shared within one or more co-operatives to which the library belongs (e.g. selection / rejection decisions, weeding reasons)
* Local data that are not shared (e.g. budgets, invoice details, some user information)

In our discussions we’ve taken to talking about this as the ‘specific gravity’ of a service or its associated data.

This is the kind of thinking that pushes us to consider what it will really take to put our services into the user’s workflow.

Editing note: As soon as I hit the publish button for this post I noticed that Lorcan had also written about Janifer’s paper in a post titled Data at the Network Level. I was pleased to see that he seized on the same excerpt I quoted above.

More Access is Better

Tuesday, August 19th, 2008 by Roy

One of my RLG colleagues today brought us a question from an institution that was considering their options for what to do with a large mass of digitized content they were planning to create. The question was basically this: would they be better off just making it accessible themselves and letting the search engines guide people there, or join up with a large aggregation such as the World Digital Library?

This is certainly a good question, and one worth considering carefully, since it is a foundational question that has numerous ramifications down the road. But all things being equal (and they aren’t necessarily, so stay tuned for more on this), more access is better.

That is, I would put my eggs neither in a “local only” basket nor in a “one big aggregation” basket, but in both if at all possible. In other words, retain control over your own stuff, but also syndicate it out to other places such as the World Digital Library, if that floats your boat, and other places that make sense as well. The one sticking point here is that you will need to know what is required to play well with those other locations and factor that into your planning. So if I were them, I would find out what the World Digital Library would want from me, as I would with any other aggregator I wanted to play with. Then I would do something locally that allows me to easily spin out the various versions required. I think this model provides more flexibility and sustainability going forward than relying on any single solution, no matter what it is. Plus I would get the added benefit of being in many places at once.

So as I alluded to above, all things are often not equal, and here are some of the differentiating factors. It will likely take additional effort to make your metadata and/or content comply with the needs of aggregators. Depending on your local situation, this could require a significant investment (although I would argue that if it does you were probably planning on doing something locally that is not as flexible as it should be). Another is that if the aggregator, such as the World Digital Library, wishes to host the content as well (and not just the metadata), then you will have split usage statistics. But before deciding you can’t handle being in more places at once I would urge careful consideration of the benefits and drawbacks. The easier it is for people to find your content the more it will be valued. People can’t appreciate stuff they can’t even find.

Mid August news round up

Friday, August 15th, 2008 by Merrilee

Because it’s Friday (and because I have a cold!), this is just a round up of bits I’ve been meaning to blog about. They are piling up, and I figure it’s better to get out even a little bit on each, rather than try to find the time to blog about each one in depth.

Jennifer and I did our webinar yesterday (Assessing the impact of special collections) — I blogged about this earlier this month. I wanted to let you know that our slides are in Slideshare. The webinar itself will be posted later in the month, after some vacations. I’ll have more to say about the discussion, and the results of the poll later. For those of you who took the poll, thanks very much!

There was an interesting story on NPR about reCAPTCHA. Last summer, I blogged about our use of reCAPTCHA for validating the comments on this blog. Over time, these small efforts have added up. The total is something like 1.3 billion words, enough text to fill more than 17,600 books. Beneficiaries have been the New York Times and the Internet Archive / Open Content Alliance. So comment away: you are helping to do great things.

Finally, I was sad to hear that the Party Copyright Blog was shutting down — but even more disturbed that it subsequently vanished. Fortunately, the “voice of the people” was heeded, and the blog was mostly restored. It’s a reminder of the fleeting nature of information on the web and the importance of preserving valuable resources.


Thursday, August 14th, 2008 by John

The description of our newly launched Workflows in Research Assessment Program (WRAP) is now up on our website. WRAP fits within our Supporting New Modes of Scholarship theme. We will be looking at new and changing workflows for libraries in support of research assessment by institutions. In countries with strong research assessment programmes, such as the UK (whose Research Excellence Framework is developing) and Australia, where research assessment is done nationally as a way of dividing up large sums of government funding, libraries have been involved for some years in providing the bibliographic data upon which assessment is partly based. With the arrival of digital repositories, their role has become potentially stronger, but also in some ways more difficult. The Open Access agenda, for example, can conflict with institutional requirements for selective showcasing of their research output, and for rigorous identification of institutional affiliation (in the UK, as Lorcan Dempsey pointed out recently, even senior university managers have become interested in authority control and identifiers). In territories where national research funding is less of a driver, there are nonetheless pressures growing on institutions to manage the information about their own research production more effectively and efficiently, in support of improvements to assessment at individual level (tenure decision-making), and in other competitive contexts, such as public relations and knowledge commercialisation.

Constance attended the ARL Library Assessment Conference in Seattle last week. She drew our attention to a very relevant presentation by Patricia Brennan of Thomson Reuters which considers the library’s role in the assessment of research in universities. It looks at the drivers for research assessment, including reputation management, funding and tenure review. It touches also upon institutional rankings and league tables, and it attempts to identify roles for the library in this increasingly important activity.

One of her slides revealed the fascinating diversity in international approaches to research assessment:

I was reminded of a slide I’d seen a couple of years ago by the late Professor Sir Gareth Roberts, President of Wolfson College, Oxford, on the future shape of the UK’s Research Assessment Exercise. Professor Roberts had been involved in advising the (then) Australian Department of Education, Science and Training (DEST) on the introduction of its assessment framework, and he produced this sparser slide to show international approaches, with the horizontal axis here measuring ‘Influence on Funding Decision’ rather than ‘Adoption of Metrics’.

It is interesting to see how attention to this issue has extended to include the many countries which now feature in Patricia Brennan’s more recent chart, and also how Australia has moved into the upper right quadrant, suggesting that research assessment has intensified there in recent years. The Netherlands is also in a different quadrant, though due to the different horizontal measure: it is a world leader in research metrics and its universities are intensively reviewed, but there is as yet no strong correlation between these features and research funding decisions, though there is some evidence that this is under review.

Work on library workflows in research development and publishing has been undertaken in several places in recent years. We are particularly interested in the Mellon-funded Multi-Dimensional Framework study undertaken by the University of Minnesota Libraries in 2006, which identified ‘primitives’ in research behaviours around which new support services could be built by libraries. With the focus on research assessment intensifying for universities in many countries, what new behaviours are evident, and are the associated workflows yet optimised? If the library is becoming a research publisher, what workflows follow from that? How does the library’s control of information on research outputs integrate with other institutional research information held in university data marts and data warehouses in business intelligence contexts? We are currently assembling the expertise from within our Partnership and beyond to help us analyse these workflows – as they are currently, and as they may become over time.

LIBER, Equality, Fraternization

Friday, August 8th, 2008 by John

This year’s annual LIBER conference was held in Istanbul’s Koç University – a private university on the outskirts of the city. Ricky and I attended, and joined European librarians in jovial mood. A lot of attention in the European research library community at present is on Europeana, the European digital library which is being built collaboratively with the assistance of large-scale digitisation funding from member states. Elisabeth Niggemann, Director General of the Deutsche Nationalbibliothek and Chair of the Conference of European Librarians, updated the conference on its progress to date. A final prototype is expected in November this year, which will provide access to 2m items. When finally launched to the public, it should have 6m items. But there are some concerns about the pace of development. Elisabeth Niggemann felt that the current snail’s pace of digitisation, being done only as project money is released bit-by-bit by European national governments, is not sustainable. And the attempt to portalise European digital content is already straining under the Google effect. Europeana began life as a European response to the Google Books project, but there is a growing acceptance now that European digital content needs to be found in Google services. Elisabeth Niggemann remarked in the Q&A that ‘Europeana is just a method to attract the Googles of this earth.’

Ricky gave a presentation on mass digitisation of special collections material, based on the Shifting Gears work which she and Jen Schaffner reported last year. She gave a lucid account of the arguments for a quantity approach. Her paper went down well, and is to be published in LIBER Quarterly. Other European conferences are likely to feature Programs speakers on the same topic, and I will lead a workshop on it at the RLUK autumn meeting in Leeds.

Among other highlights was a presentation from the academic perspective by Professor Sijbolt Noorda (Chairman of the Dutch Association of Universities). We thought digitisation would save us money; now we know it’s about investment. Universities should be paying for their own libraries’ investments in this area, and not leaving it to external bodies. He went on to talk about the role of digitisation in the management of university reputations, making a bold statement which I noted because it provides a useful epigraph for our newly launched Workflows in Research Assessment program: ‘The essential game of each university is the reputation game; the essential game of each researcher is competition.’ Publishers make possible that game, but publishing is changing and becoming something universities do – with library involvement. We must be bolder in this, and prepared to experiment more in collaborative ways.

There was a session on web archiving. Fifteen European national libraries are now harvesting their national domains, with a further nine at the planning stage. The view of the Bibliothèque Nationale de France (BnF) is that legal deposit treats all publications as equal. Thus they sample rather than select. French legislation means that they don’t need advance permission before harvesting – though access is only available from within the BnF. In the UK, by contrast, permission is required. This approach, confessed John Tuck of the BL, is not sustainable. Less than 1% of the total domain would be captured in 10 years.

Our OCLC EMEA colleague Janet Lees presented on ‘Moving Metadata Upstream: Early Outcomes from the OCLC Next Generation Cataloguing Pilot’. She located the origins of the ‘Next Generation’ work in the previous generation, with the vision of Fred Kilgour, founder of OCLC. The pilot has a couple of mantras: ‘Metadata should be acquired early and once. Metadata should be made to work harder and smarter’.

A presentation which also gave me heart was that of Dr Ralf Schimmer of the Max Planck Digital Library, who updated us on SCOAP3, the initiative in high energy physics (HEP) which is seeking to convert the whole commercial journal literature to author-pays open access. A lot of funding has now been obtained from funders worldwide, and the collaborative effort is inspiring. Consortial regions and entire countries are being asked to assess their collective spend on HEP journals, and to divert it to the initiative. A tender will be issued, and publishers who respond will be obliged to unbundle their HEP subscriptions from their other offerings. If successful, this will be an important breakthrough in allowing the consumer to influence the terms of the offer. It would also offer an example to other disciplines which could open some powerful new directions in scholarly publishing along the lines proposed by Professor Noorda.

Sadly, I had to leave the conference early, and so missed the session on metrics and research assessment. I also missed the conference dinner which took place on the Bosphorus. For more details on either, ask Ricky!

Libraries and archives: cows of a different color?

Friday, August 8th, 2008 by Merrilee

Two interesting (and contrasting) blog postings popped into my feed reader today.

The first blog posting is from colleague Andrew Pace (the second in his “sacred cow” series). (Why don’t you have a “cow” category, Andrew? Or a “have a cow” category, for that matter.) For those of you who don’t like clicking links, I’ll summarize: circulation rules are ridiculously complicated, and putting restrictions — on what categories of users can use which materials, when, and for how long — serves no one.

I read this and I thought, well, special collections don’t function that way. Hmmm.

Five minutes later, I read the second blog posting from the Accidental Archivist. Again, to summarize, North American archivists are very welcoming and helpful to users of all stripes. He goes on to contrast North American behavior with European behavior. I have little experience in the European archival theater, so I can’t comment. If you have some observations, please leave him a comment!

I don’t think I have anything smart to say about the differences between libraries and archives in this regard. Maybe it’s that academic libraries’ constituencies are more clearly defined. Special collections (even in an academic context) consider their audience to be the scholars who are interested in the “stuff.” If they have bothered to turn up at your doorstep, they belong.

EAD @ 10 and the future of archival description

Friday, August 8th, 2008 by Merrilee

As I’ve mentioned previously, I’ve been involved in planning the upcoming EAD @ 10 symposium. The full agenda has been posted and registration is now open. RLG Partners can register for free; email me for details. For the rest of you, early bird registration closes on Wednesday the 13th.

I’ve been helping to coordinate the “Into the Future” portion of the program, panel discussions on what the next 10 years of archival description will bring. We have 5 great speakers confirmed: Mary Elings, Mark Matienzo, Michelle Light, Jeanne Kramer-Smyth and Kathy Wisser.

Yesterday, Kris Kiesling and I had a great phone call with most of the panelists, and in a little more than 30 minutes we had raised several “big questions” about what will influence the future of archival description. I’m going to list these below, and invite you to add your questions to the comments here, or to the (unofficial) SAA Wiki page for EAD @ 10. Panelists will each address a few of these questions, so if you have a question about the future for our crystal-ball-gazers, let’s hear it!

Factors that might influence the future of archival description.

  • What will be the impact of MPLP/Greene-Meissner? If we are moving to a model of less processing up front, and perhaps more iterative processing down the line (“just-in-time description?”), what will be the impact on description or descriptive workflow?
  • Specifically thinking about EAD, that standard was developed to support a range of descriptive practices. With 10 years of encoding under our belts, can we imagine identifying what people actually do in terms of markup? Determining what descriptive elements are most useful from both an end user and a management point of view? Using this information to move to a much tighter version of EAD?
  • As different metadata creation centers (libraries, archives, museums, digital library production, institutional repository, etc.) come into closer contact with one another, what will be the influence of these communities on one another?
  • Will some aspects of descriptions (names, places, subject terms, etc.) become networked? If so, what will be the impact on archival description? Would networked elements of description allow for some automatic identification of terms, rather than implicit markup?
  • Will archival description move away from a document-centered model? What are the possibilities for moving to other models of representation? Shifting away from hierarchical structures and towards relational structures, for example.
  • What will be the impact of user contributed metadata?
  • What will be the influence of electronic records?
  • Will metadata creation tools lead us in new directions, or do they simply model current practices?
  • It can be argued that EAD has changed archival descriptive practice. What standards will change our practices going forward?
  • It can be argued that technology (specifically, the advent of the web) has changed our approach towards archival descriptive practice. What new technologies will change our practice going forward?
  • How will our evolving understanding of end user needs shape archival descriptive practice?

Programs viennent à Paris!

Thursday, August 7th, 2008 by John

RLG Programs will run its first European Partner Meeting in Paris on 5-6 November 2008. We felt that, although several of our European Partners are represented at our Annual Meetings, for many of them a venue in Europe is more realistic. Paris was chosen not simply for the obvious reasons, but also because its transport links are excellent, and – for our many UK Partners – it is of course one of the nicer cities available via a short rail journey. The Meeting will be a more compact version of the Annual Meeting, developing and reporting on some of the new areas which were surfaced in Philadelphia, but running as a plenary sharing of activities and ideas through a single day.

We will begin, however, by getting to know each other better at an Evening Reception on Wednesday 5 November, in order to allow a full day of presentations and discussion on Thursday 6th. Both Jim Michalko and Lorcan Dempsey will attend and provide the broad context in the first session. We will then work in plenary through a number of key theme areas, with presentations from Program Officers, and also from some of our European Partners. Delegates who wish to digest the messages of our Meeting along with some of the city’s other charms may then decide to stay on for le week-end.

Naturally, we hope for a strong attendance from our European Partners – but Partners from any country are welcome to register if they wish to join us! Details are on our website.