This is the second of three posts about the workshop.
Part 1 introduced the Evolving Scholarly Record framework. This part summarizes the two plenary discussions.
Natasa illustrated the diversity and complexity of digital research information by comparing it to a rainbow and asking, “How do we preserve a rainbow?” She began with the question, “How can we support the reuse of scientific data, tools, and resources to facilitate new scientific discoveries?” We need to take a sociological point of view because scientific discovery is a social enterprise within communities of practice, and the information takes a complex journey from the lab to the paper, evolving en route. When teams consist of distributed scientists, notions of ownership and sharing are challenged. We need to be attuned to the interplay between technology and collaborative practices as it affects the information artifacts.
Natasa encouraged a shift in thinking from the record to the ecology, as she shared her study of the artifact ecology of a particular nanotechnology endeavor. That ecosystem has electronic lab books, includes tools, ingests sensor data, and incorporates analysis and interpretation. It provides context for understanding the data and other artifacts, but scientists want help linking these artifacts and overcoming limitations of physical interaction. They want content extraction and format transformation services. They want to create project maps and overviews to support their work and to convey meaning that guides third-party reuse of the artifacts. Preservation is not just persistence; it requires a connection with the contemporary ecosystem. A file and an application can persist and still be completely unusable. They need to be processed and displayed to be experienced, and this requires preserving them in their original state and virtualising the old environments on future platforms. She acknowledged the challenges in supporting research, but implored libraries to persevere.
Herbert took a web-focused view, saying that not only is nearly everything digital, it is nearly all networked, which must be taken into account when we talk about archiving. His presentation reflected thinking in progress with Andrew Treloar of the Australian National Data Service. Herbert highlighted the “collect” and “fix” roles and how archives will obtain the materials. He used Roosendaal and Geurts’s functions of scholarly communication to structure his talk: Registration (the claim, with its related objects), Certification (peer review and other validation), Awareness (alerts and discovery of new claims), and Archiving (preserving over time), emphasizing that there is no scholarly record without archiving. The four functions were integrated in print journal publishing, but they are now disaggregated and distributed among many entities.
Herbert then characterized the future environment as the Web of Objects. Scholarly communication is becoming more visible, continuous, informal, instant, and content-driven. As a result, research objects are more varied, compound, diverse, networked, and open. He discussed several challenges this presents to libraries. Archiving must take into account that objects are often hosted on common web platforms (e.g., GitHub, SlideShare, WordPress), which are not necessarily dedicated to scholarship. We archive only 50% of journal articles, and these tend to be the easy, low-risk titles. “Web at Large” resources are seldom archived. Today’s approach to archiving focuses on atomic objects and loses context. We need to move toward archiving compound objects in various states of flux, as resources on the web rather than as files in file systems. He distinguished between recording (short-term, no guarantees, many copies, and tied to the scholarly process) and archiving (longer-term, guarantees, one copy, and part of the scholarly record). Curatorial decisions need to be made to transfer materials from the recording infrastructures to an archival infrastructure through collaborations, interoperability, and web-scale processes.
Part 3 will summarize the breakout discussions.