What is the ideal future vision of an open science ecosystem supporting FAIR data? What are the challenges in getting there? These were the topics of the second installment of the OCLC/LIBER discussion series on open science, which brought together an international group of participants with a shared interest in the FAIR principles. The discussion series, which runs from September 24 through November 5, illuminates key topics related to the LIBER Open Science Roadmap. The discussion series and the Roadmap share the goal of informing research libraries as they envision their roles in an open science landscape.
The first discussion in the series addressed the topic of scholarly publishing; a summary of the discussion highlights can be found here. In the second discussion, the focus was FAIR research data. FAIR is a set of broadly articulated principles describing the foundations of “good data management”, aimed at those who produce, publish, and/or steward research data sets, and serving as a set of guideposts for leveraging the full value of research data in support of scholarly inquiry. FAIR research data – that is, data that is findable, accessible, interoperable, and reusable – is seen as an important component of a broader open science ecosystem.
What does an ideal future look like for FAIR research data?
The discussion led off with one participant noting that “open science is just science done the right way”, emphasizing that the FAIR principles, and other aspects of open science, are elements in service of a broader vision of scientific progress unencumbered by barriers to access and communication. In an ideal world, adherence to FAIR would mean that research outputs like data, as well as software and metadata, would be equally available to both humans and machines as part of a cooperative effort among stakeholders in the scholarly process – including vendors. However, as one participant noted, at present this is “just a dream”.
Several participants noted that application of the FAIR principles must take into account the priorities and needs of a diverse set of communities. For example, one participant mentioned the CARE principles (Collective Benefit, Authority to Control, Responsibility, Ethics), which complement FAIR by addressing the interests of Indigenous Peoples. Another participant noted that the idea of “accessible” data must extend to users with physical limitations – all should have access to data in ways that are both “natural and easy”.
As we unpacked the notion of an ideal future vision for FAIR research data, discussion centered around several broad themes:
Library as data steward: At times, the library is overlooked as a campus partner in data management. Yet a great deal of the expertise needed to bring about FAIR data is already embedded in the librarian’s skill set. Moreover, librarians are well-placed to advocate for and raise awareness among researchers about the principles of open science that underpin FAIR. In an ideal world, researchers and other stakeholders will recognize that the library is an integral partner in bringing the FAIR principles to life as part of a broader shift to open science practices.
Standardization and specialization: Data management practices vary across disciplines, with different research cohorts solving data management issues – including the application of the FAIR principles – in different ways. This is often the result of ad hoc, insular approaches to data management, and, ideally, opportunities will be found to consolidate practices, standards, and protocols across disciplines where possible. But data stewards must also understand and support necessary differences in data management practices across disciplinary settings, such as specialized description standards for research data.
Researchers as partners: The ideal of FAIR data cannot be achieved without the active support of data producers. Researchers must therefore understand the importance of making their data available in ways that adhere to the FAIR principles, as well as have access to the resources necessary to put these principles into action, such as dedicated funding for data management. Ideally, data management training will become a standard component of educating future researchers, with learning communities forming to share curricula and best practices for instruction. All of this would be facilitated by the emergence of FAIR data “champions” among influential researchers, who would serve as models for good data management practices both as colleagues and mentors, as well as potential collaborative partners for the library in spreading awareness about the FAIR principles.
Staffing up and skilling up: In an ideal world, more library staff will be involved in supporting open science initiatives, including FAIR research data. However, the composition of staff skill sets will evolve: in particular, the complexities of increasingly sophisticated data management services will require the skills of individuals who are formally trained as data stewards. In addition, more cooperation across campus stakeholders in the provision of data services will occur, with partnerships between the library and units such as Campus IT and the Research Office. Close liaison with researchers will also help data librarians understand the nuances of data management needs in specific disciplines.
What are the main challenges in achieving this future?
After identifying how participants envisioned the FAIR principles operating in a future open science ecosystem, the conversation moved on to what challenges stood in the way of achieving that future, and how the library community can work together to overcome those challenges. A real-time, online poll among participants yielded their collective view of the top barriers to attaining FAIR research data in an open science environment:
Lack of rewards/incentives for researchers: As noted above, achieving FAIR data requires the active support of researchers. However, too many researchers lack sufficient incentives to allocate scarce time and resources to data management activities. Participants noted a number of ways that the library might address this problem, including gathering evidence on the incentives gap and raising awareness about it among campus leadership, as well as presenting evidence to researchers on the potential benefits – such as reputation enhancement – of making their data FAIR. Libraries can also join data management communities, such as Dryad or ICPSR, and provide funds and staffing to support these memberships.
Participants noted that smaller libraries with fewer resources could team up and act collectively in supporting researchers. National-scale systems for rewarding researchers for making their data available can also be helpful. Several participants observed that both top-down and bottom-up incentives are needed to create the appropriate reward structures, and that libraries can play a key role by bringing together the right campus stakeholders to create the right incentives. One participant emphasized that top-down support is essential for reward structures to work; however, another participant advised libraries not to be idle while waiting for top-down support to materialize: instead, get started right away by building data management services that can attract top-down support.
One participant suggested that funds could be re-allocated from collection budgets for demonstration projects that show the value of FAIR data. This elicited mixed reactions from others, but nevertheless highlighted an important point: increased activity in new service areas like data management will likely involve resource trade-offs with more traditional library budget lines like collections. This means that library involvement in new initiatives must be carefully considered – as one participant put it, are we sure it is the library’s role to try to develop incentives for FAIR data?
Culture change: A shift to open science in general, and to FAIR research data in particular, involves changes in attitudes, practices, and priorities for all stakeholders. Effecting this “culture change” will require a collective effort across campus, including but not limited to the library. Libraries need to support other campus units in navigating these changes, but they also need to support each other.
Changing the attitudes, and achieving the buy-in, of influential stakeholders at all levels – from campus leadership to senior researchers – is an essential step toward shifting the culture around data management. For example, one participant noted that the principal investigators (PIs) on a research project set the tone for the entire team. If the PIs treat data management requirements as merely “a box to tick”, junior members will likely do so as well; however, if PIs truly embrace the importance of making research data FAIR, they can instill a similar attitude throughout the rest of the project team. This highlights the importance of reaching out to “influencers” as a first step in bringing about culture change.
Several participants noted the Carpentries training model – in particular, the Data Carpentry version – as a possible means of promoting FAIR data and, equally important, cultivating the skills needed to support it. The Carpentries model has the dual advantage of providing a dynamic learning community for researchers, along with a strong emphasis on “training the trainer” instruction.
Skill-building: As libraries become more deeply embedded in research support services like data management, the skill set needed to support these services will continue to evolve. In response, as one participant observed, “we need to create a community of data librarians and train them!” One approach is to ensure that data management skills are included in library school curricula – several participants felt that training of future librarians is sometimes too oriented toward “classic” library topics. For current librarians, participants suggested that it is important to obtain buy-in from library staff, in the form of a critical mass of staff acknowledging the importance of acquiring data management skills.
One participant noted that while it is important to train data librarians, it is equally important to retain them. Opportunities exist for librarians with data management skills to move on to other domains or industries. What can be done to keep them engaged in the library world?
Much of the discussion of this challenge focused on how libraries could act collectively to fill the skills gap in data management. One participant pointed out that many academic libraries are small, and do not have the resources to cultivate the full range of skills needed to address the diverse data management requirements encountered across disciplinary settings. A possible solution might be to develop coordinated specializations within groups of libraries, which could then be shared as a collective resource. Examples of pooling expertise in this way include the Data Curation Network in the United States, as well as the network of Dataverse communities.
The discussion continues …
Our conversation about the FAIR principles, and data management generally, was an instructive example of the power of collective wisdom. The participants brought a multiplicity of national and institutional backgrounds to bear on the questions we discussed, and the result was an illuminating exploration of both the challenges and opportunities for libraries in making FAIR data a robust component of the open science ecosystem. Please join us for more blog posts as we continue to collect community perspectives on the seven focus areas of the LIBER Open Science Roadmap.
Brian Lavoie is a Research Scientist in OCLC Research. He has worked on projects in many areas, such as digital preservation, cooperative print management, and data-mining of bibliographic resources. He was a co-founder of the working group that developed the PREMIS Data Dictionary for preservation metadata, and served as co-chair of a US National Science Foundation blue-ribbon task force on economically sustainable digital preservation. Brian’s academic background is in economics; he has a Ph.D. in agricultural economics. Brian’s current research interests include stewardship of the evolving scholarly record, analysis of collective collections, and the system-wide organization of library resources.