Data Stewardship and Policy
To move forward and improve research data management (RDM) policy and implementation, the impacts of current and past approaches need to be well understood. Qualitative interviews were carried out with both heads of research data services and data stewards to get a better understanding of consultation and the impact of data stewardship. The results of this work can also be used as an informative reference point for the current state of data stewardship policy and how it’s being implemented.
The questions focused on two key aspects of policy: awareness and implementation. On awareness, most researchers are likely aware that the policy exists, but how well they know its content is not known. On implementation, data stewards reported progress but offered a number of suggestions for improvement, including ways to raise awareness.
Data stewardship has been a growing part of modern research for many years now. Concerns over integrity and reproducibility have driven this almost as much as the growing need for large-scale data handling [1]. The FAIR guiding principles for research data stewardship were first put forward in 2014 and serve as the core concepts behind data stewardship. FAIR stands for findability, accessibility, interoperability and reusability, and it is a cornerstone of modern research policy that all work carried out should follow these principles.
Figure 1: Table of the FAIR principles explained [1]
Whilst the FAIR principles have quickly gained traction, they do not specify how the principles should be achieved, a task that would be very difficult considering the variability of needs across fields and even between faculties. As part of the push to improve data quality standards and support the data structures in research, new policies have been developed and implemented [2]. The research data management policy at TU Delft has been set up in a decentralised format: a core template for RDM policy was created and shared across faculties so that unique needs can be built into policy to each faculty’s specification. The policy sets out four goals, aiming to cultivate:
- Best practice for ensuring scientific arguments and results are reproducible in the long term.
- Better exposure of academic work of researchers at TU Delft leading to recognition of quality of the research process as a whole.
- Responsible management of research data, including the safe storage of research data and protection of intellectual capital developed by scientists across TU Delft.
- Improved practices for meeting the demands of funders and publishers with respect to research data management and sharing.
Along with these goals, the template and university-wide framework set out a series of responsibilities for the various stakeholders, aimed at implementing and progressing towards these goals.
To gauge the impact of data stewardship, the Data Stewardship Coordinator set out to qualitatively analyse research data policy, training and consultation across faculties and to generate an informative report to be used as a reference moving forward. For investigating the policy aspect, it’s important to understand the initial expectations, how aware stakeholders are of policy, how policy is implemented and what kinds of feedback are received. Here, we asked questions on these topics to both the data stewards and the heads of research services who led development of the policy.
A set of semi-structured qualitative interviews was carried out with the heads of research data services and the data stewards to ask questions about RDM policy. The interviewees were selected based on their roles in developing policy and their continued roles in implementation. Interviews typically lasted 40-60 minutes and were not audio or video recorded; instead, notes on responses and key points were taken during the interviews.
Areas of Data Stewardship
Policy visibility and awareness
When asked how all the relevant individuals are made aware of the policy, the interviewees explained that the policy was developed in collaboration with stakeholders. This involved discussion of the demands, the motivations and the intended implementation of policy.
It was also noted, however, that no checks have yet been done to measure how aware stakeholders currently are. There are multiple channels through which individuals can be made aware of the policies, but by far the most important is the support and consultations provided by data stewards. Whilst awareness of the policy is important, the heads of department made it clear that policy is intended to act as a context for the work carried out by support staff and data stewards.
When asked about the expectations of the policy, interviewees described a number of both short- and long-term goals, with the overarching aim of driving a culture change in how data is managed, through policy acting as a context for the support given. An example given was how policy now requires PhD students to complete data management plans (DMPs) for their work in advance. Students therefore engage with the available courses and with data stewards, leading to more requests for data steward support. This in turn leads to increased awareness of the challenges and importance of good RDM. One of the goals in the policy is to cultivate reproducible science. Reproducibility relies on data and code being reusable and accessible, two key principles pushed by both training courses and data stewards, hence the indirect path from policy to reproducibility.
Policy and Data Stewards
Although the heads of research services had an integral role in designing policy, it is the data stewards who take on much of the responsibility for implementing policy and communicating its goals to researchers. Therefore, to analyse the impact of data stewardship, it is important to gather the data stewards’ insights, feedback, and observations over time in the context of policy.
Role and Utility
We asked the data stewards what role the policy plays in their work and how useful it is. There was agreement on the role of policy as a resource to refer to and to engage researchers with, with some even providing an abridged version as part of teaching. It was also mentioned that informing researchers of how to act in compliance with policy adds legitimacy to RDM activities. As for how useful and important the policy is to the data stewards’ activities, responses were mixed. Most data stewards reported the policy to be very useful for the reasons mentioned above. The few who didn’t find it that useful said it was a matter of frequency: they only rarely use it, for reference in niche cases.
The policy sets out several goals and responsibilities. Many of the most impactful of these rely on the data stewards, so their opinion of the system is invaluable for gauging progress, difficulties, and appropriateness. When asked whether the current setup was ideal, the data stewards suggested a few possible improvements. These suggestions mainly addressed a lack of guidance documentation and a lack of researcher awareness of policy.
We also asked how effective the data stewards thought the current implementation had been in moving towards the goals set out in policy. The data stewards all reported large changes over time in researcher RDM culture. In contrast to five years ago, DMPs are now in place, archiving is common and publishing data along with peer-reviewed articles is standard practice.
Considering the progress made and how rapidly practices can change, we asked the data stewards how policy is updated, how frequently, and whether updates are currently needed. The general opinion was that policy doesn’t need to be changed at the moment and that the currently agreed interval between document updates is too short. The policy template, and therefore each faculty policy too, is broad and open enough for minor faculty changes to be made when needed, and it is unlikely that big changes will be needed for a while. No explicit process has been set up for large updates yet because no changes have been needed.
Awareness and Feedback
As important as the data stewards’ implementation of the policy is, it is also key for the researchers themselves to be aware of the policies they are responsible for following. Most data stewards have told the researchers they’ve talked to about the policy documents but are unsure how much the researchers have read. PhD students are typically more aware. As for feedback on the policy, few data stewards reported any; those that did reported requests for more detailed guidelines and general complaints about the ‘heaviness’ of the processes.
Considering progress and the impact of data stewardship in the context of policy is somewhat complex as the goals are both broad and long term. The goals also require multiple stakeholders to adhere to their responsibilities and implement policy. Here we’ve talked to a few stakeholders about how data stewardship has helped progress these goals.
For gauging the impact here, we asked the data stewards about how researcher culture has changed. Generally, they reported large change over the past few years with a number of key points raised. The introduction of DMPs has played a key role, now compulsory for PhDs, and has pushed a culture shift from the ground up where RDM skills are sought after. General changes include things like archiving now being common, data now being published alongside research papers, and data management/sharing commonly being considered important. Researchers are generally happy that guidelines and support are in place with data stewards agreeing that the current communication and training strategy is indeed working.
The role of data stewards in engaging with researchers cannot be overstated: it remains an open and concerning question how aware researchers are of the policy documents, and without the data stewards’ work it seems unlikely that this much of a culture change would have been possible.
When talking to the data stewards about the way policy is currently implemented, the general opinion was that the overall communication strategy and training are working to drive progress. The main roadblock to progress has been senior staff not fully embracing the policy, whereas PhD students and new researchers are often very receptive. There were comments that the full goals of the policy haven’t yet been reached and that the policy could perhaps be even more ambitious. The main suggestion put forward by data stewards was to create supplementary guidance to accompany policy in the form of a manual, similar to the one produced for software policy.
Other suggestions relating to current implementation were ways to make researchers more aware of policy by finding ways to teach or present it, especially to new staff. Another suggestion to promote implementation of the policy was to push every PI or lab to implement their own data management strategy, so that every researcher has, and is aware of, a workflow tailored to their specific needs.
1. Boeckhout, M., Zielhuis, G. A. & Bredenoord, A. L. The FAIR guiding principles for data stewardship: fair enough? Eur. J. Hum. Genet. 26, 931–936 (2018).
2. Arend, D. et al. From data to knowledge – big data needs stewardship, a plant phenomics perspective. Plant J. 111, 335–347 (2022).