Every month the Digital Competence Centre at the University of Twente organises a thematic session that focuses on a hot research topic. In May, the session was all about data and what researchers should do with it after their research project is complete.
Ask the audience
The session kicked off with a short survey to ask researchers about their current research data practices. Attendees were asked to answer the following question: ‘What do you currently do with the data after your research?’
Their responses revealed that whilst some researchers currently use a data repository (16%) and/or an external hard drive within group facilities (20%) to store their research data, the majority (59%) use a personal device or shared network drive.
The remainder of the session explored the benefits and limitations of these storage solutions, and focused on best practices for creating data that is findable, accessible, interoperable and reusable (the FAIR principles).
A data steward’s perspective
Why should researchers think about what to do with their data after their research project is complete?
Data should be kept after the research project is complete for validation and verification purposes. It’s important to make sure that others can trust the experimental results. In addition, preserving data in a secure and accessible location makes it available for reuse, which can increase the impact of the research.
Reuse doesn’t mean that data has to be reused by researchers working outside of a university. Often, the researcher or a colleague from within the same research group can benefit from reusing the data.
It is important that researchers keep these considerations in mind from the beginning of their research project. They can do so by using open and sustainable file formats, for example.
What do others expect?
In addition to the requirements for research verification and data reuse, archiving data for long-term preservation is also required by many funding bodies, publishers and research institutions around the world.
Funders, such as NWO, expect researchers to preserve their data for at least 10 years in a trusted repository, and preferably with open access.
Publishers, such as Springer, encourage researchers to share and cite research data in their publications. Some journals require a data availability statement, which tells the reader where the data associated with a paper can be found and under what conditions it can be accessed. In some cases, peer review of the data is also required.
Additionally, research institutions may have their own data policies and guidelines in place, so researchers should always check these requirements before archiving their data.
Which data should be preserved?
In principle, a package consisting of the data, tools for collecting and analysing the data, and the analysis syntax/code should be archived together with the results.
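One way to keep such a package verifiable over time is to store a checksum manifest alongside it. The sketch below is a minimal illustration, not a repository requirement: the folder layout, the `MANIFEST.json` file name and the function names are all assumptions made for this example.

```python
import hashlib
import json
from pathlib import Path

# Hypothetical file name for the manifest; not prescribed by any repository.
MANIFEST_NAME = "MANIFEST.json"


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir: Path) -> dict[str, str]:
    """Map each file in the package (relative path) to its checksum."""
    return {
        path.relative_to(package_dir).as_posix(): sha256_of(path)
        for path in sorted(package_dir.rglob("*"))
        if path.is_file() and path.name != MANIFEST_NAME
    }


def write_manifest(package_dir: Path) -> Path:
    """Write the manifest as JSON inside the package and return its path."""
    manifest_path = package_dir / MANIFEST_NAME
    manifest_path.write_text(json.dumps(build_manifest(package_dir), indent=2))
    return manifest_path
```

Calling `write_manifest(Path("my_package"))` on a folder that holds, say, `data/`, `code/` and `results/` subfolders (an assumed layout) records one checksum per file. Re-running `build_manifest` later and comparing the result with the stored file shows whether anything has changed since archiving.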
Where can data be preserved? And, does all data need to be publicly available?
Research data can be preserved in trusted repositories (e.g. 4TU.ResearchData, DANS Easy), institutional archives (e.g. the University of Twente data archive, Areda) and/or group storage (e.g. a University of Twente network drive).
Archiving your data in a trusted repository gives you the opportunity to (openly) share your data with the world, and your data will receive a persistent identifier (e.g. a DOI), which enables citation of the data. However, not every dataset can be made openly available. Restrictions may apply because of privacy, commercial interests, patents, data owned by others, public security or political interests, for example.
A researcher’s perspective
Kostas completed his PhD on ‘Hand Neuro-Motor Characterization and Motor Intention Decoding in Duchenne Muscular Dystrophy’. Although the practice of publishing data was not common within his research field, he published four datasets related to his PhD.
Currently, his dataset published in 4TU.ResearchData, ‘Raw data collected for the study of: Characterization of forearm high-density electromyograms during wrist-hand tasks in individuals with Duchenne Muscular Dystrophy’, has 175 views and 75 downloads.
What were his motivations to publish data?
Kostas decided to share his data to show that he is convinced of his results and not afraid of making the data available to others. He believes that researchers should not fear criticism and should accept that they cannot always be 100% right.
For example, when he submitted a manuscript for peer review, a reviewer checked Kostas’s data and found a mistake. Although the review process took slightly longer as a result, Kostas believes that sharing his data during the review was worthwhile: he had time to correct the mistake and could publish his research results with confidence.
Another important reason why Kostas shares his data is that his research uses data collected from human participants. He appreciates that collecting data from humans places a burden on the study participants as well as the researchers. It demands considerable preparation time, resources and effort, and requires, for example, the recruitment of suitable participants, informed consent and ethical approval.
Kostas believes that, by sharing his data, he enables other researchers to reuse it without duplicating efforts or recruiting more human participants. This reduces the time, effort and resources spent by researchers and participants.
Choosing a repository
The session closed with demonstrations of how to upload data to the 4TU.ResearchData and DANS data repositories. The demonstrations allowed researchers to follow the process of publishing a dataset live, with time for Q&A.
- Watch the recording of the 4TU.ResearchData repository demonstration by 4TU.ResearchData community manager, Connie Clare (the demo starts at 41 min 20 s).
- Watch the recording of the DANS repository demonstration by University of Twente data steward, Qian Zhang.