Definition
“Research data” are defined as factual records (numerical scores, textual records, images and sounds) used as primary sources for scientific research, and that are commonly accepted in the scientific community as necessary to validate research findings. A research data set constitutes a systematic, partial representation of the subject being investigated. This term does not cover the following: laboratory notebooks, preliminary analyses, and drafts of scientific papers, plans for future research, peer reviews, or personal communications with colleagues or physical objects (e.g. laboratory samples, strains of bacteria and test animals such as mice).
Definition from ‘Principles and Guidelines for Access to Research Data from Public Funding’, OECD, 2007
This definition is fairly restrictive and can be extended to the following categories (proposed by INIST – CNRS Institute for Scientific and Technical Information):
- Observational data: data captured in real time, usually unique and therefore impossible to reproduce.
- Experimental data: data obtained from laboratory equipment, which is often reproducible but sometimes costly.
- Computational or simulation data: data generated by computer or simulation models, which are often reproducible if the model is properly documented.
- Derived or compiled data: data resulting from the processing or combination of ‘raw’ data, often reproducible but costly.
- Reference data: a collection or accumulation of small datasets that have been peer-reviewed, annotated and made available.
Research data can therefore take a wide variety of forms: images, numerical data, text, video, source code, etc.
This video covers a wide range of issues relating to data management in a research project, so don’t hesitate to watch it – in less than 5 minutes you’ll understand everything that’s at stake!
Data lifecycle

FAIR principles
The ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in the journal Scientific Data in 2016.
The acronym ‘FAIR’ stands for Findable, Accessible, Interoperable and Reusable. The principles set out a basis for sharing data that meets these four criteria.
Major funding bodies, including the European Commission, encourage researchers to make their data FAIR in order to ensure the integrity and increase the impact of their research investments.
Guidelines for FAIR data management in Horizon 2020
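As a rough illustration of what the four FAIR headings can mean in practice, the sketch below checks a dataset description for a few metadata fields. The field names, their grouping under each principle and the placeholder identifier are assumptions made for this example only; they are not part of the FAIR principles themselves or of any particular repository's schema.

```python
# Hypothetical field names, loosely grouped under the four FAIR headings.
# This grouping is an illustrative assumption, not an official checklist.
FAIR_FIELDS = {
    "findable": ["identifier", "title", "keywords"],
    "accessible": ["access_url"],
    "interoperable": ["file_format"],
    "reusable": ["license", "description"],
}


def missing_fair_fields(record):
    """Return {principle: [field names]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        missing = [name for name in fields if not record.get(name)]
        if missing:
            gaps[principle] = missing
    return gaps


# A made-up record; the identifier and URL are placeholders, not real resources.
example_record = {
    "identifier": "doi:10.0000/placeholder",
    "title": "River temperature measurements",
    "keywords": ["hydrology", "temperature"],
    "access_url": "https://repository.example.org/datasets/placeholder",
    "file_format": "text/csv",
    "description": "Hourly readings from three stations.",
    # "license" is deliberately left out to show a reported gap.
}

print(missing_fair_fields(example_record))
```

A check like this only confirms that the fields are filled in. Whether the data really is findable or reusable still depends on the quality of the metadata and on where the data is deposited.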
Whenever you work with research data or start a new research project, you should consider the following aspects:
- Storing and sharing research data with collaborators. It’s a good idea to estimate the size of the data collected or produced during the project and to think about where the data will be stored. It is also important to think about the level of security required for accessing your data and about regular backups.
- Organizing and documenting research data. Even a subject as obvious as the organization and documentation of research data requires careful planning, with the key question being: will I be able to find and understand my data in a few years’ time?
- Opening up research data. Providing access to research data is becoming general practice as a way to validate scientific results and make science fully transparent (open). What’s more, funding organizations’ open data requirements are multiplying every year. It is therefore important to plan the process of opening up research data in advance.
- Preserving research data. Consideration must be given to what will happen to the data when the project is over. The availability of data after the research project may be important not only immediately after the project, but also in 20 to 30 years’ time. It is therefore important to preserve research data and ensure access to it.
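Two of the aspects above, estimating the size of the data and being able to verify it after backup or long-term storage, lend themselves to a short script. The following is a minimal sketch, assuming the project data sits in a single directory; the function names are hypothetical and the approach (file sizes per extension, SHA-256 checksums per file) is one common option among others.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def summarize_dataset(root):
    """Return (total_bytes, {extension: bytes}) for all files under root."""
    total = 0
    by_ext = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            size = path.stat().st_size
            total += size
            # Files without an extension are grouped under "(none)".
            by_ext[path.suffix.lower() or "(none)"] += size
    return total, dict(by_ext)


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, POSIX style) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Calling `summarize_dataset("my_project/data")` (a hypothetical path) gives a first figure for storage planning. Saving the output of `build_manifest` alongside the data makes it possible to rebuild the manifest later and compare the two to detect files that were lost or altered.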
This interactive DoRANum presentation aims to explain each item of the FAIR principles in a simplified way.