OCLC Research mini-symposium on the discovery and use of open (digitized) collections

Annapaola Ginammi and Antoine Isaac presenting at the OCLC Research mini-symposium

On June 19th we held the third OCLC Research mini-symposium in Leiden. This time the topic was: “The discovery and use of open (digitized) collections.” The event attracted both library professionals from the Netherlands and OCLC staff members from across Europe.

The format was an intensive three-hour session with speakers addressing key aspects of the theme: 1) access to content in digital libraries (image interoperability: IIIF), 2) discovery of this content (the GDDN project) and 3) use, with a real-world example of a digital scholarship workflow for philosophers (the CatVis project).

Introducing the theme

I introduced the theme by presenting findings from the OCLC Open Content survey through the lens of the “collective collection” perspective – which looks at digitized materials as part of the collective collection held by libraries around the world, and the trend towards system-wide attention and collaboration for the management of this global resource.

Access to content in digital libraries

Shane Huddleston, Product Manager for OCLC’s cloud-based digital repository CONTENTdm, talked about OCLC’s ongoing investment in the IIIF standard. The research and product team made good progress in co-developing experimental APIs that improve the online presentation, retrieval, and enrichment of images across repositories without heavy file transfers. Shane also described efforts to derive and reconcile linked data from the text-based Dublin Core metadata.

Antoine Isaac, R&D Manager for Europeana, explained in more detail how the IIIF community works and how IIIF is being implemented within the Europeana network of content providers. Using the example of a 6.5-meter-long parchment scroll from Switzerland, he demonstrated how IIIF allows contributors to control the way their content can be viewed. He also explained how Europeana is looking beyond OAI-PMH to adopt IIIF and linked data (schema.org) as a way forward.
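To make the "without heavy file transfers" point concrete, here is a minimal sketch of how a client could address parts of a very large image through the IIIF Image API. The URL pattern (identifier/region/size/rotation/quality.format) comes from the IIIF Image API specification; the server address, the identifier and the helper names are hypothetical, and this is not taken from CONTENTdm or Europeana code.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build an IIIF Image API request URL.

    `base` and `identifier` are placeholders for a real image service.
    The size keyword "max" follows Image API 3.0; 2.x services commonly use "full".
    """
    ident = quote(identifier, safe="")  # identifiers must be percent-encoded
    return f"{base.rstrip('/')}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


def tile_regions(width, height, tile):
    """Split an image of width x height pixels into 'x,y,w,h' region strings."""
    regions = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            w = min(tile, width - x)   # last column may be narrower
            h = min(tile, height - y)  # last row may be shorter
            regions.append(f"{x},{y},{w},{h}")
    return regions


if __name__ == "__main__":
    # Hypothetical service and identifier, for illustration only.
    base = "https://example.org/iiif"
    for region in tile_regions(2500, 1000, 1024):
        print(iiif_image_url(base, "scroll/1", region=region, size="512,"))
```

A viewer that requests only the regions currently on screen never has to download the full master file, which is what makes an object like a very long scroll practical to browse online.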

Discovery of digitized collections

Paul Gooding, lecturer in Information Studies at Glasgow, described the GDDN’s goal to investigate the utility and feasibility of a global registry (or dataset) of digitized texts for digital scholarship. The intended dataset would consist of metadata only. Paul expanded on the possible usefulness of such an instrument for academics. One use case, which is currently being tested, is the matching of similar full-texts of the same work (different copies/versions/editions). He also mentioned interest in use cases to support library workflows, such as digitization and preservation management.

Use of digitized collections

The CatVis team described the challenges and possibilities of digital text-based research in Philosophy. First, scanned books are gathered for corpus-building, and the corpus is then cleaned and pre-processed. During the analysis stage, semantic similarity clustering and data-visualization tools are applied to the corpus. Rob Koopman, architect at OCLC, shared his eye-opening experience with bulk downloads from content providers, revealing how inaccessible open digitized content can be. Annapaola Ginammi, researcher at the Faculty of Humanities in Amsterdam, reminded us how demanding scholarly disciplines are when it comes to using OCRed content.
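The clean, pre-process and cluster stages described above can be sketched roughly as follows. This is not the CatVis implementation: it swaps in plain bag-of-words cosine similarity and a greedy threshold clustering as a simple lexical stand-in for semantic similarity, and every function name and the threshold value are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Clean OCR text: rejoin words hyphenated at line breaks, lowercase, tokenize."""
    text = re.sub(r"-\s*\n\s*", "", text)  # "sci-\nence" -> "science"
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(passages, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed passage is similar enough."""
    vectors = [Counter(preprocess(p)) for p in passages]
    clusters = []  # each cluster is a list of passage indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no cluster matched, start a new one
    return clusters


if __name__ == "__main__":
    passages = [
        "The science of being qua being",
        "Being qua being is the sci-\nence studied",
        "Triangles have three angles",
    ]
    print(cluster(passages))
```

Even this toy version shows why OCR quality matters so much for scholarly use: a word broken across a line, or a misread character, changes the token counts that every later similarity score depends on.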

Need for ongoing investment

During the Q&A session, participants wondered about the need for ongoing investment in the area of discovery and use of digitized collections. The speakers stressed that concerted efforts and innovative technologies are necessary to maximize the value and use of digitized materials and their related metadata.

You can find the slides and the recording of the session here: https://www.oclc.org/research/events/2019/061919-oclc-mini-symposium-discovery-use-open-collections.html

Stay updated on future Research events in Leiden

Register your interest here: https://oc.lc/symposium or contact Titia van der Werf.