The OCLC Research Library Partnership’s community exploration of research data management (RDM) continued with the second installment of a three-part webinar series based on our Realities of Research Data Management project. The topic was the role of different incentives in catalyzing the development of university RDM service offerings, as discussed in our recent Realities of RDM report. The webinar also featured some findings on researcher behaviors and practices around data management – in particular, data sharing and re-use – by our OCLC Research colleague Ixchel Faniel. As with the first webinar in the series, this webinar was followed up by a round of discussions in the Partnership’s RDM Interest Group, where we took a deeper dive into the webinar topics. This post synthesizes the discussions by Interest Group members.
First, a little more about the Interest Group. The group includes more than 80 individuals, representing nearly 50 Research Library Partnership member institutions in nine countries. Participants are distributed across a range of RDM roles, and represent both strategic and practitioner perspectives. The Interest Group is an opportunity for participants to interact with OCLC Research staff and each other, sharing experiences about RDM services and practices, and pooling knowledge about the current state – and future evolution – of RDM. Interest Group discussions are catalyzed by the topics covered in the accompanying webinar series, but are free-ranging and flexible to accommodate participants’ interests.
Like our Interest Group discussions following the first webinar, our latest discussions drew participants from North America, Europe, and the Asia-Pacific region. The starting point for our discussion was a model from the penultimate report in the Realities of RDM report series, in which we depicted four categories of incentives driving universities to acquire RDM capacity.
We discussed how these incentives operated in different university contexts. A number of participants indicated that compliance with mandates from funders, government agencies, and national directives was a strong initial driver in incentivizing the acquisition of RDM capacity at their institutions; however, some also noted that more recently, meeting publisher requirements for data availability as a condition of publication has been a key source of researcher requests for RDM support. Participants agreed that tracking the increasingly complex web of mandates and compliance requirements from funders, the public sector, and now publishers is challenging. While the job of monitoring the appearance and ongoing evolution of data mandates from a variety of sources often falls to the university library, staff also engage with other campus units, including the Research Office and even individual academic departments, to keep abreast of the latest developments. Several participants noted that their institutions track mandates through DMPTool, an open-source online tool supporting the creation of data management plans.
Institutional interest in data and data management can be a strong driver for acquiring RDM capacity. We discussed the ways institutional strategy in the RDM space was articulated in university data policies. The responses varied considerably, with many participants indicating that their university had an institutional data policy in place, while others either had a draft policy under consideration or none at all. For participants whose university did have a data policy, feelings were mixed in regard to its effectiveness. For example, one participant indicated that the data policy helped align the university’s stance on data management with that of research funders, and served notice that the university had certain expectations of its researchers in regard to data management. Similarly, another participant observed that a data policy can clarify data management expectations, as well as signal institutional interest in, and commitment to, this area. But one participant noted that although their university had a data policy, there was little in the way of enforcement built into it; in practice, it took the form of a set of suggestions for good data management rather than a statement of requirements. It was additionally noted that university data policies in Europe were perhaps more advanced than elsewhere, due to more extensive national reporting requirements for research outputs.
Much of our discussion focused on the class of incentives represented in the figure by researcher demand for RDM services, and in particular, what universities were doing to engage with researchers around data management needs and requirements. Several participants noted that engagement with researchers often begins by providing support for creating data management plans, while others mentioned the provision of data storage capacity as another good way to gain entry into the researcher’s data management workflow. An interesting strand of discussion took place around the idea that much of the current RDM support being offered tends to cluster around the beginning and end of the data management process – that is, with data management plans at the beginning of the research process, and then the storage and sharing of final data once the project is complete. Looking ahead, a key question for many RDM practitioners is how to extend RDM services into the “valley” between these endpoints.
We had a spirited conversation around the problem of incentivizing researchers to engage in good data management practices. For example, several participants noted that while researchers often embraced the provision of data storage capacity, it was much more difficult to motivate them to provide adequate documentation for the data sets stored there, or to indicate the period of retention needed. Helping researchers see that data management is a worthy investment of their time is crucial: for example, at one institution, staff try to engage researcher interest by providing dedicated trainers to help researchers use REDCap (an electronic data capture system used extensively in academia for collecting clinical data), and show them how the system improves their data management practices. It was generally agreed that a key obstacle to motivating researchers to optimize their data for re-use – such as providing adequate documentation and metadata to support understanding and discovery of the data – is the lack of good data re-use stories that illustrate the tangible benefits of observing good data practices.
In addition to these topic areas, participants offered many wide-ranging comments about RDM services. For example, we learned that one university initiated a pilot project with departments in the life sciences in which RDM librarians sat in on project review boards and offered feedback on data management plans. Another participant noted the potential for drawing lessons on re-use from the sharing of software in academic circles, where there is a prevailing ethos that encourages code to be shared and built upon through services like GitHub. As we have seen with software, establishing a “culture of sharing” is very important in encouraging data re-use.
Several participants noted that while scarcity of resources can sometimes put interactions with other campus units on a competitive footing, it can also facilitate collaboration based on a mutual need to stretch limited resources for RDM service provision. The intersection between research ethics, privacy, and data management was mentioned by several participants as a growing concern, with researchers sometimes required to address ethical issues around data storage and sharing in their data management plans, or as part of a project’s ethics review. The European General Data Protection Regulation (GDPR), which went into effect in 2018, has amplified the need to address privacy issues in data storage and re-use. And finally, participants noted again and again how important it was to engage closely with researchers to understand how data management aligned with their daily workflows – as one participant noted, it is difficult to understand researcher needs without getting into their office!
These are just some of the highlights from a wide-ranging and informative discussion with RDM professionals within the Research Library Partnership. We thank all who participated in the discussion!
Brian Lavoie is a Research Scientist in OCLC Research. He has worked on projects in many areas, such as digital preservation, cooperative print management, and data-mining of bibliographic resources. He was a co-founder of the working group that developed the PREMIS Data Dictionary for preservation metadata, and served as co-chair of a US National Science Foundation blue-ribbon task force on economically sustainable digital preservation. Brian’s academic background is in economics; he has a Ph.D. in agricultural economics. Brian’s current research interests include stewardship of the evolving scholarly record, analysis of collective collections, and the system-wide organization of library resources.