RIM: notes and thoughts about the euroCRIS meeting in Amsterdam

Lorcan’s recent blog post on “Research information management systems – a new service category?” has drawn the attention of some of the euroCRIS board members and so I was invited to attend their strategic management meeting in Amsterdam, 11-12 November 2014.

The meeting brought together 1) Research Managers and Administrators (they form a vibrant profession of their own: see also EARMA), 2) University Librarians (repository managers and data librarians working for Research Support) and 3) vendors of CRIS systems (Elsevier, Thomson Reuters, CINECA, etc.). There were ca. 140 attendees from across Europe (including UK, NL, France, Italy, Spain, Greece, Germany, Scandinavia, Belgium, Serbia, Czech Republic).

euroCRIS gets a rich variety of stakeholders to the podium

euroCRIS is an organization that is “dedicated to the development of Research Information Systems and their interoperability”. It maintains the CERIF metadata standard for CRIS systems and acts as a forum for stakeholders of RIM (see membership). It has strong support from the EC, which recommends/mandates the use of CERIF. Not surprisingly, the theme of the meeting was interoperability and standards in Research Information (RI).

Much used adaptation of a Swedish cartoon to visualize infrastructure incompatibility –  presented to euroCRIS meeting by Ed Simons, November 2014.

The introduction to the theme was a co-presentation by euroCRIS President (Ed Simons, Nijmegen University) together with David Baker (CASRAI) and Josh Brown (ORCID) – demonstrating the will of euroCRIS to advance interoperability through strategic partnerships with stakeholders in the field.

What impressed me most was the breadth of the RI-domain and its stakeholders’ ecosystem: in his presentation Ed listed funders, researchers, research managers and administrators, peer-reviewers and research evaluators, libraries, etc., and all the presentations during the two-day meeting reflected that same broad perspective. Even though not all stakeholders were represented at the meeting, they were clearly regarded as interlocutors and invited to the podium.

A maturity issue?

COAR, EARMA, EUNIS and JISC – all “strategic partners” of euroCRIS – gave their view on RI interoperability, after a brief introduction of their organization.

Friedrich Summann, from the University Library of Bielefeld and representing COAR, highlighted the interoperability issues between CRIS and IR-systems. Concerning publication metadata, which is the common denominator of the data held in these systems, he noted there is very little exchange taking place. CRIS-systems do not expose CERIF-data and they generally do not support harvesting protocols (except for Pure, which supports OAI-PMH). He observed 3 trends around the perceived “dichotomy” between the CRIS and the IR-system: 1) using the CRIS as an IR, 2) using the IR as a CRIS and 3) combining both the CRIS and the IR in a symbiotic relationship. He touched lightly on the different purpose of each system: for the IR, visibility and OA; for the CRIS, research information management – which seemed to justify an evolution to a symbiotic ecosystem instead of a standards-driven integrated system. During the meeting, the increasing complexity of the emerging RI-infrastructure, with many more different systems than just the CRIS and IR being tied in, became evident as each presentation had a slide similar to this one (this one was not presented … but I like it, because it is prototypical).

There were a lot of slides with bullet lists of needs. It was clear that in all use cases, researchers are important users of CRIS and IR systems because they have to supply the research information. There is a strong awareness that the systems should be simple and easy for them to use. Another mantra was: “Researchers should not be required to supply the same information more than once”. Nevertheless, it was equally clear from the presentations that the researchers are not the end-users for whom the systems are designed and whose needs were listed on the slides. The needs come from the research managers, the funders, the government policies and mandates, the research assessment exercises, etc., and those needs have not been sorted out. The vendors at the meeting politely, but repeatedly, asked for robust use cases. It reminded me of what Ed Simons said at the beginning of the meeting: “There are no standard use cases in the RI-domain: we are still growing our own vegetables”. This was exactly the feeling I had after these two days: the RI-domain is not mature yet and it has no chance to mature because it keeps expanding at the rate of the universe’s expansion.

A nice example of how overwhelming, and at the same time exciting, RI-developments are becoming for libraries in the UK was given in Anna Clements’ presentation. Anna is from the University Library of St Andrews and wears many different hats: board member of euroCRIS, chair of the CASRAI working group on data management planning and chair of the Pure UK strategy group. She explained that since the last UK research assessment in 2008 (the Research Assessment Exercise, predecessor of the Research Excellence Framework, REF), huge investments have been made in CRIS-systems – for example, in linking publications to project information. She anticipates that the next assessment-driven CRIS-development stage will require investments in linking datasets to articles and funding. At St Andrews, they are re-designing the use of their CRIS system to support new REF-requirements and they are currently contemplating integrating the deposit of the long tail of small datasets in the CRIS. For this new workflow they will also need a data repository with access storage and archive storage (she mentioned Arkivum) and a “data librarian” to assist researchers with the deposit process and the provision of good metadata.

John Donovan (EARMA Chair and Head of Research at Dublin Institute of Technology in Ireland) gave an intriguing short talk. He showed an endless list of sources from which research information was collected (Research support pre- and post-award; Research Finance; Graduate school; Ethics and Integrity; Structured Postdoc training; Research HR; Research awareness raising; etc.) and then he said: “we collect information from so many different sources, that it is completely unsustainable”. John is currently interested in what makes research sustainable in new, small universities. His perspective may be somewhat biased, but he still raises a legitimate issue: it seems the fever of registering data has overtaken the need to be informed. However, the next day, Julia Lane was going to give us the big data perspective of RI and remind us that the scientific approach will push us further down the road of RI.

Is RI being taken over by Science?

The keynote by Julia Lane (Senior Managing Economist, American Institutes for Research) stretched the policy perspective to its logical extreme, introducing the need for a “science of science policy” to answer the big questions: “How much does a nation spend on science?” and “What is the return on investment?” It is about making science metrics more scientific and gathering scientific evidence to better understand the effects of funding research. To this end Julia and her team developed the STAR METRICS program.

STAR METRICS conceptual framework, presented to euroCRIS meeting by Julia Lane, November 2014.

They are looking at the process of how funding creates output: funding goes to institutions that employ and provide infrastructure to people who, with their knowledge and skills, produce outputs. Her team collects and analyzes data around this process (grant funds, HR records, financial transaction records, awards data, email, publications, blogs, etc.) – the data are not standardized but can be combined and mined – giving interesting results that help unpack how research is being done. They use external sources as well (Census Bureau data, LinkedIn data, etc.). In this way they can link the data to where people get jobs and start up businesses, and to workforce growth in the proximity of scientific hubs. Their findings confirm that the majority of the impact of funded research is regional. They also observe that the vast majority of knowledge transmission is through human interactions and clearly not through papers and publications. If social networks are a major vehicle for knowledge transfer, we should start understanding (and measuring!) how people interact. That starts to sound creepy to me.

The presentations by the university and government representatives giving a policy perspective, René Hageman (VSNU-Dutch Association of Universities) and Geert van Grootel (Flemish Government, Department of Economy, Science and Innovation), hinted at what policy makers dream about in terms of getting a 360-degree view of RI. But their thinking was confined within the safe boundaries of the CRIS. Or the FRIS – in Flemish speak. “When a research project goes into execution, then the data automatically goes into FRIS. FRIS will continually monitor KPIs.”

A word of caution from the evaluation and benchmarking perspective

Paul Wouters (CWTS) was the perfect speaker to question the KPI-rush and to give us a scientific critique of research evaluation methods. He quoted Peter Dahler-Larsen (The Evaluation Society): “Evaluation has become a profession in itself”. Data has become input for “evaluation machines” – to make stuff auditable. The trend towards mechanisation of control and standardization leads to less variety and diversity of scientific discovery practices. He argued that academics need to be in the driver’s seat and ask themselves: how can we monitor our research? How can we profile ourselves to attract the right students and staff? How should we divide funds? What is our scientific/societal impact? Instead of being “just” data-suppliers and subject to evaluation, they need to become full partners in the emerging RIM landscape.

The RDM perspective

There were more presentations, giving the perspectives of several other stakeholder communities: the funders, the libraries, the data archives. Surprisingly, there were few attendees representing the RDM community. DANS (a national research data archiving institution in the Netherlands), who hosted the euroCRIS meeting in Amsterdam, was the notable exception. Peter Doorn’s presentation was interesting because it showed how DANS is adapting its mission and ambitions to the changing landscape of stewardship opportunities. Peter described the mission of DANS as “to provide permanent access to Research Information”. A major part of their focus is still RDM, but they are moving into the broader RIM space. Concerning RDM, which he defined as “how you organize/curate the data during the research project and afterwards”, he mentioned explicitly that for DANS the focus of stewardship during research is new. A noteworthy shift.

Re-reading Lorcan’s blog on RIM

After attending the euroCRIS meeting, I re-read Lorcan’s blog and its title makes much more sense to me now. Indeed there are many signals that this is an emerging new service category. There are many vendors out there, signaling that there is a market for RIM-systems. They are looking for robust use cases to develop their products and services, but the RI-space seems to be elusive, as it continues to expand and new demands and needs keep piling up. The sources for collecting data keep diversifying and their numbers keep growing. RIM is moving into the data science domain and this opens up new perspectives. It also raises the question of whether it is necessary to register data anew when it is already sitting in other systems. Data aggregation and data mining seem to be able to provide the business intelligence policymakers and funders are seeking.

Libraries are engaged in RIM, in Europe more so than in the U.S., because of national government and EC policies towards Open Access and Open Data and the drive to register data that informs the impact of such policies. What struck me, though, was that the euroCRIS meeting presentations touched on standardization and interoperability issues in a way reminiscent of the library automation meetings (ELAG-like) conducted 20 years ago: promoting layered architecture models, the full implementation of standards, the need for evangelists to persuade governments to impose standards, etc. Libraries can help jump-start the RIM-discussion and OCLC could certainly contribute (there are many potential areas: aggregation, extracting knowledge from data, name authorities and name disambiguation, etc.).