Archive for the 'Architecture and standards' Category

Economics of Scholarly Production: Supplemental Materials

Wednesday, August 25th, 2010 by Constance

At the Spring CNI Taskforce meeting last April, Karen Wetzel (Standards Program Manager at NISO) announced a new piece of work related to “supplemental materials” in journal articles. In the scientific literature, it is not uncommon for articles to be accompanied by a secondary set of figures, data, and documentation of experimental protocols that aren’t considered part of the core content. Karen reported that thought-leaders from a variety of sectors had expressed concerns about the expense that publishers incur in managing this material, as well as the additional work that it creates for editorial staff and authors. Libraries were included in a long list of potential stakeholders, as potential curators of this supplemental material.

A central concern is that scholarly citation and reuse of this kind of supporting material is limited by the absence of identifiers, bibliographic metadata etc. Read the rest of this entry »

Breaking Open the ILS Silos

Friday, August 20th, 2010 by Roy

In 2007-2008, the Digital Library Federation (DLF) convened a Task Group to recommend standard interfaces for integrating the data and services of the Integrated Library System (ILS) with new applications supporting user discovery. The group produced a report with recommendations in December 2008. After that not much happened.

In February 2010, at the Code4Lib Conference, Karen Coombs (the OCLC Developer Network manager) and I brought together some of the people who had been on that task group, as well as other interested parties who were at the conference. At this ad hoc meeting we agreed that we were ready to take this work to the next stage. The next stage, we felt, was to actually create a middleware layer that we could collaboratively maintain. Read the rest of this entry »

Next-Gen Harvesting

Thursday, February 4th, 2010 by Roy

Metadata harvesting (collecting metadata from others and aggregating it in a collection) is not new. Although there are any number of ways to do this, the OAI-PMH protocol for metadata harvesting is often used and has been around for years. It defines a small set of actions that allows anyone to discover what sets of metadata are available for harvesting from a digital repository and which metadata formats are offered, and then to select and download those records. Thousands of repositories worldwide support it, sometimes even unknowingly, because many repository applications such as DSpace and ePrints come with OAI-PMH support out of the box.
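To make those actions concrete, here is a minimal sketch of the three kinds of request a harvester would send. The verbs (ListSets, ListMetadataFormats, ListRecords) and the argument names come from the OAI-PMH specification; the base URL, the set name "photographs", and the helper function `oai_request_url` are made up for illustration and do not refer to any real repository or library.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for a verb and its arguments."""
    # "from" is a reserved word in Python, so callers pass from_ instead
    # and it is renamed to the protocol's "from" argument here.
    if "from_" in arguments:
        arguments["from"] = arguments.pop("from_")
    params = {"verb": verb}
    # Skip any argument the caller left unset.
    params.update({k: v for k, v in arguments.items() if v is not None})
    return f"{base_url}?{urlencode(params)}"


# 1. Which sets of metadata does the repository offer?
list_sets_url = oai_request_url(BASE_URL, "ListSets")

# 2. Which metadata formats can it deliver?
list_formats_url = oai_request_url(BASE_URL, "ListMetadataFormats")

# 3. Download records from one (hypothetical) set in simple Dublin Core,
#    limited to records changed since a given date.
list_records_url = oai_request_url(
    BASE_URL,
    "ListRecords",
    metadataPrefix="oai_dc",
    set="photographs",
    from_="2010-01-01",
)

for url in (list_sets_url, list_formats_url, list_records_url):
    print(url)
```

The sketch only builds the URLs; fetching them and handling errors is left out. The `from` argument is the piece that lets a harvester ask only for records changed since its last visit.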

This has led to a world in which there are metadata aggregators and even aggregators of aggregators. It has also led to potential confusion and difficulty. Records that are picked up from their “native” location and indexed and displayed elsewhere may not be depicted as the creator of that metadata intended. They also may not be refreshed in a timely fashion, thereby potentially leading to records that are out-of-date persisting in various corners of the Internet.

This is why when my colleagues on the services side of the house announced the WorldCat Digital Collection Gateway I sat up and took notice. This heralds a new world in which those being harvested can exert some control over not only how frequently their records are updated, but also how those records are depicted in the aggregation — in this case, WorldCat. Through a simple web-based interface, you can provide your OAI-PMH base URL, have the Gateway test harvest some records, view how those records would display in WorldCat, and change the mapping if you wish. Another benefit is that your records will then appear in all of the places WorldCat is syndicated.

A pilot project to test the Digital Collection Gateway was just announced, beginning March 1, and we are seeking volunteers to try it out and provide feedback. During the pilot you will be asked to:

  • Attend a two-hour webinar reviewing the use of the Gateway
  • Upload a minimum of 500 metadata records to WorldCat
  • Offer feedback and input on your experience with the Gateway to our support and product teams so we can improve the tool and workflows

If you would like to help us create a next-generation harvesting infrastructure, in which you control your metadata more than ever before, email us at oaister@oclc.org.

ORCID and ISNI: Author, Swineherd, Taxman, Alcohol Researcher

Saturday, January 30th, 2010 by Jim

At recent meetings I attended in Washington D.C. there was significant hallway discussion about the Open Researcher Contributor Identification (ORCID) initiative. Given the science orientation of the meetings this initiative to resolve the problem of name ambiguity and attribution in scholarly publication was particularly welcomed. As you’ll see if you visit the ORCID site this is early days for this pre-competitive multi-publisher effort whose goal is to establish

“an open, independent registry that is adopted and embraced as the industry’s de facto standard.” Their mission is “to resolve the systemic name ambiguity, by means of assigning unique identifiers linkable to an individual’s research output, to enhance the scientific discovery process and improve the efficiency of funding and collaboration.”

The effort was convened not long ago by Thomson Reuters and Nature Publishing, with the first meeting in November 2009. The roster of participants is impressive and the continued involvement of Elsevier made those with whom I talked hopeful that this would be as successful an effort as CrossRef has been. A recent editorial in Nature, “Credit where credit is due” (pdf), is quite to the point about the implications of success.

My colleagues, Thom Hickey and Janifer Gatenby, have been involved. OCLC has much to contribute here given Thom’s leadership of the Virtual International Authority File (VIAF) effort and Janifer’s in the development of the International Standard Name Identifier (ISNI). The scope of ORCID is narrower than ISNI as the latter is intended for the identification of “identities used publicly by parties involved throughout the media content industries in the creation, production, management, and content distribution chains.” This goes across all fields of creative activity not just science. As Janifer said,

“ISNI could become a cross domain identifier so that a researcher who also plays in a rock band (and wants it known that he is one and the same) can be identified.”

Read the rest of this entry »

The Straight Dope on OAIster

Monday, September 21st, 2009 by Roy

As many of you are probably aware, OCLC and the University of Michigan announced last January that OCLC was taking over the OAIster aggregation of metadata harvested from OAI-compliant repositories. The University of Michigan was no longer able to support it, and was looking for assistance in sustaining this valuable community resource. As Kat Hagedorn remarked in regard to our agreement, “Hosting anything of this size quickly got out of hand for UM Libraries, and it took us a long time to realize it. Besides, greater access for more folks? Sounds win-win to me, as long as it’s continuously freely available.” [reported by Dorothea Salo]

I have heard lots of questions since we started contacting contributors with the most recent phase of the transfer plan, so the purpose of this post is to bring everyone up to date on why we are doing this, where things are, and what we hope to accomplish in the future. Read the rest of this entry »

Context for Metasearch

Friday, August 28th, 2009 by Jennifer

Last Friday the Encoded Archival Context (EAC) standard for archival authorities was released to the international community for review. Warning: an EAC record is not your grandmother’s MARC authority record. EAC is a companion standard to Encoded Archival Description (EAD), yet now seems to be useful well beyond the world of archives.

Managing collections archivally requires archivists to create comprehensive descriptions of corporate bodies, persons and families. Who would know better the context of records and creators than the archivists with the stuff in their hands? And who knew that this contextual information would be exactly what folks want to share when Networking Names [pdf]? With EAC we can link the creators, the context and the stuff. EAC goes one step further, facilitating the exchange of authoritative contextual information across many domains.

It turns out EAC is useful infrastructure for metasearch. At our RLG Annual Meeting, Warwick Cathro demonstrated the National Library of Australia’s prototype “one-search” service. Here one can discover everything - pictures, books, archives, newspaper articles, music, etc. - by and about a creator. The Australians have used EAC to collate dispersed, silo-ed information. (Just search the Christian name “Nellie” and watch it go! Hats off to Basil Dewhurst and his team.) Read the rest of this entry »

Networking names

Friday, May 1st, 2009 by Karen

Our Networking Names report has just been published! I was pleased to see this morning a number of tweets announcing it – or echoing other tweets.

I blogged last November about names touching everything soon after the Networking Names Advisory Group met at the Met. The fifteen members of the advisory group have spent the time since refining fourteen use case scenarios, those that they were most knowledgeable about - academic libraries and scholars, archivists and archival users, and institutional repositories. These use case scenarios envisioned how different communities could benefit from aggregating information about persons and organizations, corporate and government bodies, and families, and making it available on a network level. From the use case scenarios we derived the functions and attributes of what would be needed for a “cooperative Identities Hub”.

Some of the components of a cooperative Identities Hub exist or are being developed. We wanted to articulate the characteristics of a gateway to all forms of names authorized or used in other contexts without preferring one form of name over another and that would use social networking to tap expertise in all communities. We envisioned a switch for users or their machine applications to extract relevant information for re-use in their own contexts and enable contributions from different sources.  These are objectives we can all strive towards.

We’re looking at ways to amplify this work. Feel free to post your comments or reactions here in the meantime.

I am deeply grateful to all the RLG Partner staff who contributed to the report – a very talented group to work with: Grace Agnew (Rutgers), Laura Akerman (Emory), Genevieve Clavel (Swiss National Library), Joan Cobb (Getty Research Institute), Michele Crump (U. Florida), Amanda Hill (U. Manchester/UK Names Project), Deborah Kempe (Frick), Amy Lucker (New York University), Dennis Meissner (Minnesota Historical Society), Suzanne Pilsk (Smithsonian), Michael Rush (Yale), Jon Shaw (U. Pennsylvania), Laura Smart (CalTech), Daniel Starr (Metropolitan Museum of Art), Bob Wolven (Columbia).

Analysis Methodology for Museum Data

Wednesday, April 29th, 2009 by Günter

In a previous post, I’ve shared some background about the data analysis phase of our Museum Data Exchange Mellon grant, and posted some of the questions our museum participants wanted to have answered. In the meantime, we have created a spreadsheet [pdf] which captures our ideas to date of what questions we may want to ask of the 850K CDWA Lite XML records from 9 museums. Note that the methodology captured by this spreadsheet lays out a landscape of possibilities - it is not a definitive checklist of all the questions we will answer as part of this project. Only as we get deeper into the analysis will we know which questions are actually tractable with the tools we have at hand. I’d appreciate any thoughts on additional lines of inquiry we could pursue with our analysis, or other observations!

Read the rest of this entry »

Museum Data Exchange: Tools for Sharing

Monday, April 13th, 2009 by Günter

Like all good things in life, this took a little longer to see the light of day than I had thought it would, which means I am doubly delighted to announce: we have now officially released the suite of tools generated through the Mellon-funded Museum Data Exchange project. You’ll find a lot of informative detail in this announcement. Here’s what it all boils down to: Museums now have access to COBOAT and OAICatMuseum 1.0 software.

  • COBOAT is a metadata publishing tool developed by Cognitive Applications Inc. (Cogapp) that transfers information between databases (such as collections management systems) and different formats. As configured for this project, COBOAT allows museums to extract standards-based records in the Categories for the Description of Works of Art (CDWA) Lite XML data format out of Gallery Systems TMS, a leading collection management system. Configuration files allow COBOAT to be adjusted for extraction from different vendor-based or homegrown database systems, or locally divergent implementations of the same collections management system. COBOAT software is now available on the OCLC Web site under a fee-free license for the purposes of publishing a CDWA Lite repository of collections information at www.oclc.org/research/software/coboat/default.htm.

  • OAICatMuseum 1.0 is an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) data content provider supporting CDWA Lite XML. It allows museums to share the data extracted with COBOAT using OAI-PMH. OAICatMuseum was developed by OCLC Research and is available under an open source license online at www.oclc.org/research/software/oai/oaicatmuseum.htm.
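For readers wondering what sits on the other side of such a data content provider, here is a rough sketch of how a harvester might read a ListRecords response. The element names and the namespace are those of the OAI-PMH 2.0 specification; the sample response, its identifiers, and the function `summarize_list_records` are invented for illustration, and the CDWA Lite payload that would normally sit inside each record's metadata element is left out. This is not code from OAICatMuseum itself.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Invented, trimmed-down response; real responses also carry
# responseDate, request, and a metadata element per record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:museum.example.org:1001</identifier>
        <datestamp>2009-04-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:museum.example.org:1002</identifier>
        <datestamp>2009-04-13</datestamp>
      </header>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def summarize_list_records(xml_text):
    """Return (identifier, datestamp) pairs and the resumption token, if any."""
    root = ET.fromstring(xml_text)
    headers = [
        (
            header.findtext("oai:identifier", namespaces=OAI_NS),
            header.findtext("oai:datestamp", namespaces=OAI_NS),
        )
        for header in root.iterfind(".//oai:record/oai:header", OAI_NS)
    ]
    # An empty or missing resumptionToken means the list is complete.
    token = root.findtext(".//oai:resumptionToken", default=None, namespaces=OAI_NS)
    return headers, (token or None)


headers, token = summarize_list_records(SAMPLE_RESPONSE)
print(headers)
print(token)
```

When a token comes back, the harvester repeats the ListRecords request with that token to fetch the next batch, and stops once no token is returned.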

Read the rest of this entry »

Repositories and library cultures

Tuesday, March 10th, 2009 by John

When is a repository not a repository? When it’s an OPAC? Are OPACs in reality a species of repository, however reluctantly, given that the genus is usually used with a specific application in mind - one which is a newcomer to the library world whose value is still not convincingly proven?

In the UK, JISC is about to award a tender for a study on The links between library OPACs and repositories in Higher Education Institutions. The invitation to tender states:

Repositories and OPACs … share various features and requirements. Both depend for their efficiency upon accurate metadata. Both provide a primary service to the home institution but also provide services to external users, for example in enabling access to content for a user from another institution. Various items of content may be accessible both through the library OPAC and through the repository, sometimes in different versions (e.g. a preprint in a repository and a published journal article under licence in an OPAC).

Its terms of reference include:

  • survey the extent to which repository content is in scope for institutional library OPACs, and the extent to which it is already recorded there;
  • examine the interoperability of OPAC and repository software for the exchange of metadata and other information;
  • list the various services to institutional managers, researchers, teachers and learners offered respectively by OPACs and by repositories;
  • make recommendations for the development of possible further links between library OPACs and institutional repositories, identifying the benefits of such links to various stakeholder groups.

Reading this reminded me that the University of Edinburgh has recently announced the introduction of an Open Access publication mandate. The Library will continue to run its Edinburgh Research Archive (ERA) open access repository alongside a new, closed, Publications Repository (PR), which will support research assessment and profiling. As the criteria for institutional deposit proliferate, the mandate document includes a FAQ section to answer researchers’ concerns. One is:

What about research outputs which are not journal articles? The PR and ERA can accept most research output types including books, book chapters, conference proceedings, performances, video, audio etc. In some cases – for example books not available electronically – the PR/ERA will hold only metadata, with the possibility of links to catalogues so that users can find locations….

Read the rest of this entry »