Archive for March, 2008

Copyright Investigation Summary Report now available

Friday, March 28th, 2008 by Merrilee

I am pleased to announce (at last!) that the RLG Partner Copyright Investigation Summary Report (pdf) is now available. This report summarizes interviews conducted between August and September 2007 with staff at RLG Partner institutions. Interviewees shared information about how and why institutions investigate and collect copyright evidence, both for mass digitization projects and for items in special collections.

I am so grateful to those who shared their experience and wisdom with us.
Participating partner institutions and staff include:

Cornell University (Peter Hirtle); Emory University (Lisa Macklin); New York Public Library (Tom Lisanti); Rutgers University (Grace Agnew); Stanford University (Mimi Calter); University of California, Los Angeles (Angela Riggio); University of Michigan (Judy Aronheim); University of Texas at Austin (Dennis Dillon).

LC and Flickr – 3 months later

Thursday, March 27th, 2008 by Günter

We had the good fortune today to talk to Helena Zinkham, Michelle Springer and some additional staff members from the 12-person team at LC that worked on the LC-Flickr project. We were also joined by George Oates, who shepherded the collaboration from the Flickr side. The conversation highlighted a number of interesting facets of the collaboration which I hadn’t fully appreciated yet, and I thought they’d be worth sharing:

  • In a very elegant way, Flickr solves the authority conundrum of exposing collections content to social process. No need to worry if some comments or tags are misleading, arbitrary or incorrect – it’s not happening on your site, but in a space where people know and expect a wide variety of contributions. On the other hand, LC selectively reaps the benefit of these contributions. Over 100 cataloging records have been changed through input from the Flickr community.
  • Identifying and siphoning off the information of use to LC is a time-consuming and laborious process. While Flickr offers a number of ways to look at user interactions with the content, LC has started building its own database, which pulls in information through the Flickr API for more convenient evaluation. Social tagging in this framework doesn’t mean letting others catalog your collections for you – it really means offering up materials for a conversation which you have to follow closely to extract the bits worth bringing back.
  • We had an interesting discussion about what I’m tempted to call the “absorbency” of Flickr. The 3k+ images LC posted in the prototype seemed a reasonably easy chunk of material for the Flickr community to process, that is, to tag and discuss. (In some instances, images have actually reached their Flickr-imposed limit of 75 tags.) The group speculated that a larger upload of images might have led to a less thorough review of the photographs, and this thinking also seems to have influenced LC’s decision to keep updating their Flickr stream 50 images at a time. George commented that Flickr has made 1000 Flickr friends through the project so far, and 50 images at a time probably seem delightful to them, while tens of thousands at a time might be overwhelming.
  • At a pace of 50 images per week, exposing all of the photographs in the Bain collection (50k) on Flickr will take about 20 years, but I think that piece of math may miss the point: from the conversations I noted a much greater interest in deep engagement with the presented material than in comprehensiveness. The evidence suggests that this deep engagement has been achieved – see, for example, the discussion surrounding these two photographs. Those with the desire and need to see all of Bain can always do that on the LC website – Flickr complements this offering by turning parts of the collection into conversation-starters. LC staff seemed so impressed with the value of the interactions on Flickr that they felt linking back out to the Flickr pages from the catalog was as important as bringing back salient corrections and updates into the catalog.
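
The "about 20 years" figure is easy to check with a back-of-the-envelope calculation. The short Python sketch below is only a rough check, not an LC projection: it assumes exactly 50,000 images and an upload in every one of 52 weeks per year, and the variable names are illustrative.

```python
# Rough check of the upload-pace arithmetic. The figures are the round
# numbers quoted in the post; the names are illustrative assumptions.
BAIN_IMAGES = 50_000      # approximate size of the Bain collection ("50k")
IMAGES_PER_WEEK = 50      # upload pace on Flickr mentioned in the post
WEEKS_PER_YEAR = 52       # assumes an upload in every week of the year

weeks_needed = BAIN_IMAGES / IMAGES_PER_WEEK
years_needed = weeks_needed / WEEKS_PER_YEAR

print(f"{weeks_needed:.0f} weeks, or about {years_needed:.1f} years")
```

Under those assumptions the pace works out to 1,000 weeks, a little over 19 years, which is consistent with the rounded figure above.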

    For LC, Flickr is still a prototype – commitments on a policy level will be discussed after the prototype has been thoroughly evaluated. For Flickr, working with cultural institutions seems to be becoming a way of life. George commented that she has about eight more cultural institutions ready to be launched over the next eight months, ranging from very large to very small. There will be new and different things to be learned from the next launches – how will the material fare without the boost the LC-Flickr project enjoyed as the groundbreaking initiative? I’m looking forward to continuing the conversation with our LC colleagues, and I’ll be watching out for those next cultural heritage collections on Flickr…

    Our podcasts + webinars = PARcasts

    Thursday, March 27th, 2008 by Merrilee

    As some on the blogosphere have already discovered, we have not only started a new podcast series, but also a series of webinars. The live webinars are offered for RLG Program partners, but we’re making them available on our website after the fact.

    Taken together, the podcasts and webinars are called … PARcasts (PAR = Programs and Research, the OCLC division we are part of). Cute, huh?

    The first webinar is titled “Out of the Stacks and onto the Desktop: Rethinking Assumptions about Access and Digitization”. It combines reflections on the Good Terms paper (Kaufman and Ubois) and the Shifting Gears (pdf) paper (Erway and Schaffner) with thoughts on how libraries and archives can make (and already are making) progress in digitizing collections and making them available to the broadest audiences.

    The great thing about the webinar was not only hearing Jen and Ricky present their ideas, but especially the questions and discussion that came after. Here are some of the things that came up in discussion:

  • Bring together benchmarking information on what it takes to do digitization economically.
  • How to balance rights issues with creating broad access? Guidelines in this area will be helpful. (I actually think that the SAA Working Group on Intellectual Property may have some helpful guidance for those of us in the US.)
  • Establish a clearinghouse for digitization projects underway so information and expertise can be widely shared.

    I must say that I am pretty dubious about the effectiveness of clearinghouses (I’ve seen many well-launched efforts fail for lack of consistent upkeep), but we’re all open to suggestions as to how this could be achieved.

    We are also interested in your thoughts and reactions, so please share them here or via email.

    What keeps you awake at night?

    Monday, March 24th, 2008 by Merrilee

    RLG Programs has launched a new podcast series called “What Keeps You Awake at Night?” We are interviewing movers and shakers from libraries, archives, and museums and asking them what they find exciting or nerve-wracking. We also ask who is doing good work in advancing or combating the issue. But we don’t want you to toss and turn, so we are attempting to keep the interviews short. Fifteen minutes is the ideal, but we can live with 20 minutes or so.

    Our first offering is now up, an interview with Mark Dimunation from the Library of Congress. Those of you who know Mark know how passionate and articulate he is. Mark talks about the value of the physical artifact in an increasingly digital world. He also tells a tale of murder in the library.

    When we have a feed up for our podcasts, I’ll let you all know.

    Internet Archive bookscanning photos

    Thursday, March 20th, 2008 by Merrilee

    My brother-in-law is a freelance photographer for (if they were smart they would just hire him). He came up to visit in January and asked if I would take him to see some “cool, geeky things.” So we visited the Internet Archive in San Francisco and their book scanning operation in Richmond, California. The photo gallery from that visit is finally up online. Although he doesn’t mention the Open Content Alliance, and the book scanning in question is not free (the end product is), I think it’s a pretty good treatment of the operation. And the photos are great.

    NLA Unveils “Library Labs”

    Thursday, March 20th, 2008 by Roy

    The National Library of Australia (an RLG partner institution) has announced a “Library Labs” space to “let our colleagues know what we are doing, to invite comments, questions and feedback and to provide a space for discussion and collaboration” (from Warwick Cathro in an emailed announcement). Cathro continues:

    “We have started to redevelop our digital library services using a service-oriented architecture and open source software solutions where these are functional and robust. We are also aiming to take a common (“single business”) approach to collection management, discovery and delivery.

    We are interested in forming a community of Australian business analysts and developers who are working on similar problems and who are interested in interoperable, standards-based solutions. We are also interested in working with colleagues at an international level to provide prototypes and testbeds for new and emerging standards.

    The wiki space features:

    • three search prototypes which we have developed, with a fourth prototype (for newspapers) to be released soon
    • reports relating to these prototypes
    • a report on our vision for a logically integrated “national metadata store” and a new discovery service based on that store
    • our draft digital library service framework
    • information on some of our related standards activities.”

    A wonderful effort, and something that is well worth spending some time with to see what we can all learn from what they’re doing and provide input and feedback. For our part, this work ties in well with our “Modeling New Service Infrastructures” theme that I head up. My visit to NLA last fall was quite instructive and I look forward to further interactions with NLA staff over shared issues.

    Value undeniable; price unknown

    Monday, March 17th, 2008 by John

    Oscar Wilde’s Lady Windermere’s Fan is the source of the famous definition of a cynic: ‘A man who knows the price of everything and the value of nothing.’ In an address to OCLC Members’ Council last November, Stephen Abram, Vice President of Innovation at SirsiDynix, ridiculed libraries for rejecting the increased business which can derive from empowering users. He quoted an example of a library in which interlibrary loan staff threw up their hands in horror at the extra workload which accompanied a 700% increase in interlibrary loan following patron-initiated requesting. Was this cynicism, or fear? To refuse to introduce a major service improvement on the grounds that the price is too high could represent either cynicism or timidity. Whichever reason applied in this case, Abram excoriated library leaders who allowed their services to be impaired by these attitudes. But on the other hand, we can always improve services if the price tag is not a problem. Having to live within our means requires a dedication to service improvement on a reasonable and costed basis.

    In our current professional climate it can be tempting to take refuge in those services whose prices we know, even when we can see the value in new services which we cannot yet cost. When it comes to the areas of new library collaborative activity where RLG Programs is engaged, libraries can sometimes feel as though they know the value of everything but the price of nothing. That can induce a helplessness which stops them in their tracks. Our ‘value proposition’ seems obvious, and we are convinced of the value of the work we are doing with our Partners in our thematic areas – developing the ‘collective collection’, renovating bibliographic descriptive practices, advancing new service architectures and promoting new modes of learning and scholarship. But what is the price of making some of these changes?

    For example, as libraries pursue aggregation, won’t there be responsibilities to the collective collection beyond just providing its content? ‘The librarian’, ‘the curator’ or ‘the archivist’ will be the professional source from which a researcher browsing digital collections will want to find the answer to certain questions which arise as they work with the materials on the web. With more specialist materials – rare books, archives and museum objects, attracting serious researchers and scholars – the likelihood of enquiries increases. Who is going to step up to this challenge? Do we expect that the library which has digitised a special collection is going to provide the reference expertise to go with it once it is being hit by thousands of people via Google? This point was made to us by somewhat anxious library staff in a recent meeting at the University of Aberdeen. And if some of those questions relate not just to the collection which the holding library has provided, but to other digitised collections which are complementary, do they even have the expertise to provide an answer? Arrangements which provide articulated specialist reference support will increasingly be required. The collective collection implies collective curation, interpretation and enquiry support. We need collaborations of human resource to accompany our collaborative collections, and therefore new business models to resource this.

    There are many other examples at the present time, where, though they may have no problem in seeing the value, even bold libraries will hesitate before agreeing to pay the price to achieve it, if they don’t know what it is. In other words, a pricelist is required, and producing it will be complex and challenging, requiring political as well as economic skills. In RLG Programs we will be tackling this question more and more in our projects and programs in the future. As a community we know we cannot turn back from this task, but it can seem a huge and frightening one. This is a moment when we require leadership which encourages and supports us to stick with the dynamic of change – more easily faced collaboratively – and continue to reject the stock responses of both cynicism and timidity.

    How many WorldCat MARC tags are REALLY used?

    Wednesday, March 12th, 2008 by Karen

    Answer: Not many. And there are a lot of MARC tags that are rarely used. I recently analyzed the occurrence frequency of MARC tags in a December 2007 WorldCat snapshot prepared by our Office of Research colleagues. At that time WorldCat comprised 96,174,586 records and 1,210,107,485 holdings. The results don’t differ substantially from those Bill Moen presented from his extensive research on MARC designations at the 2006 RLG Members Forum: More, Better, Faster, Cheaper, even though WorldCat comprised just 56 million records at the time he had done his research.

    Of the total 226 MARC tags represented in WorldCat:

    • Only 27 tags occur in 10% or more of WorldCat records.
    • 52 tags occur in 1% – 9% of WorldCat records.
    • 147 tags occur in less than 1% of all WorldCat records.

    In other words, 65% of all tags in WorldCat occur in less than 1% of all records. Bill Moen reported that, of 167 unique fields identified in his research, 66% occurred in less than 1% of all records. The record counts have changed, but the percentages have hardly moved.
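
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes each band's share of the 226 tags from the counts listed above. The band labels and the function name are illustrative assumptions; only the three counts come from the analysis.

```python
# Tag counts per frequency band, as reported above for the December 2007
# WorldCat snapshot. The band labels are illustrative.
tag_bands = {
    "10% or more of records": 27,
    "1% to 9% of records": 52,
    "less than 1% of records": 147,
}

total_tags = sum(tag_bands.values())  # should match the 226 tags reported


def band_shares(bands):
    """Return each band's share of all tags, as a percentage."""
    total = sum(bands.values())
    return {label: 100 * count / total for label, count in bands.items()}


shares = band_shares(tag_bands)
for label, share in shares.items():
    print(f"{label}: {tag_bands[label]} tags ({share:.1f}% of all tags)")
```

The three counts sum to 226, and the last band comes out at roughly 65% of all tags, matching the figure quoted above.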

    The distribution of MARC tags in WorldCat looks like this.

    I also looked at the tags weighted by the 1.2 billion WorldCat holdings attached to the records where these MARC fields appear, which approximates the number of items represented. The results are similar:

    • 35 tags occur in 10% or more of WorldCat items.
    • 48 tags occur in 1% – 9% of WorldCat items.
    • 143 tags occur in less than 1% of WorldCat items.

    Some tags are used more often by specific communities. For example, non-Latin script records are more likely to use uncontrolled subject terms (653 field, used in 18% of non-Latin script records) compared to the rest of WorldCat (4.27%). Vendor-supplied ordering data (in the OCLC-specific 938 field) occurs in more than half of all WorldCat items, although it is present in only 6% of all WorldCat records. Although form/genre terms in the 655 field occur in only 4.15% of all WorldCat records, they occur in more than half of mixed material records (53.15%), 26.53% of visual material records, and 15.77% of integrated resource records.

    Still, 40 tags occur in fewer than 1,000 records in a WorldCat database of over 96,000,000 records. Tags we can forget about since no one is using them anyway?

    Open Library Developers Meeting

    Tuesday, March 11th, 2008 by Roy

    I’ve been remiss in not quickly reporting on the Open Library Developers Meeting I attended on Friday, February 29. The purpose of the meeting was to “introduce developers to the Open Library framework so that other library developers can start building on it, as well as discussing some outstanding issues.” They listed potential issues as: designing the schema, designing the APIs, library participation, FRBRizing, reconciling different data sources, representing journals, and picking identifiers.

    As it turned out, given the ad hoc nature of the agenda (jotted on a flipchart moments before the speaking started), not all of these topics were addressed. The meeting began with a few run-throughs of existing infrastructure and technologies by Open Library staff (about six strong, funded by the Internet Archive and a state grant), followed by break-out sessions. Break-out session topics included user experience, book viewer options, identifiers, catalog data and merging, and Wikipedia links.

    There were approximately 35 people in attendance, including Open Library staff and Brewster Kahle. Libraries represented ran the gamut from the Library of Congress and OCLC to the Missouri Botanical Gardens and Oregon State University. Others came from organizations such as Wikimedia and Creative Commons.

    Those of you who weren’t there but are curious can always watch the movie.

    SAA RLG roundtable, sign on and sign up

    Monday, March 10th, 2008 by Merrilee

    Those of you who are members of the Society of American Archivists will already know that the association has put a lot of work into their website and electronic communications infrastructure. For many years, RLG has used our own Primary Sources list to communicate about the RLG Roundtable at SAA. Since you need to be an RLG Program Partner to subscribe to Primary Sources, this has been a not-so-great solution. But now there’s a dedicated listserv for the RLG Roundtable at SAA, and a way to officially join the Roundtable if you are interested. If you are interested in details, read on. If not, this concludes this public service announcement.

    Individuals may have one of two roles in SAA Roundtables. You may join the discussion list (open to anyone), and if you are an SAA member, you can become a member of the roundtable.

    In either instance, you must first create an SAA profile. SAA members and non-members alike may create a profile. If you would like to create a profile, or if you are not certain whether you have an SAA profile, check this webpage.

    Listing yourself on the roundtable roster: SAA members and non-members alike may subscribe to an unlimited number of roundtable email discussion lists. Roundtable list participants may not vote for or serve as officers of the roundtable but do have the option of listing themselves as list participants in the online roundtable roster. This may be done by visiting the roundtable home page.

    Joining the roundtable: If you are an SAA member, you may officially join two roundtables. Roundtable members are eligible to vote for and serve as officers of the roundtable and may subscribe to roundtable email discussion lists (see below).

    SAA members typically choose their two official roundtables via the membership application and/or dues renewal form. Members may also submit updates by email. Roundtable members will automatically be listed as such in the online roundtable roster.

    Subscribing to the listserv: Whether or not you are listed in the roundtable roster as a roundtable member or list participant, you may subscribe to the roundtable listserv.

    Questions? Please contact me.