The FAIR data principles
- Findable: how do you find the data?
- Accessible: how do you gain access to the data?
- Interoperable: are the data and metadata interoperable?
- Reusable: is it possible for others to use the data in the future?
FAIR is an acronym for Findable, Accessible, Interoperable, and Reusable. The FAIR data principles state that it should be possible to find research data, that there should be information about how to gain access to them, that they should be compatible with other data, and that it should be possible to reuse them. The FAIR data principles are an integral part of the work within open science and describe some of the most central guidelines for good data management and open access to research data. SND strives to make data in the national research data catalogue as compliant as possible with the FAIR criteria, but as a researcher, you also play an important part in this work.
Findable: how do you find the data?
To make research material compliant with the FAIR data principles, other researchers must be able to find it. When you present the data in a relevant research data catalogue and describe them with thorough and relevant metadata (see below), you make it possible for others to find them. In the data description form in the SND system DORIS, you can enter metadata about your research project and its data. You know the project data well, so you are the best person to describe their various aspects. When you have submitted your data description, SND or another data repository that complies with the FAIR data principles can give the data a persistent identifier (PID), meaning a permanent and unique digital ID, which makes it easier to cite the data correctly so that they can be found.
Metadata
Metadata are structured information about the collected data material. This information describes the material on various levels, for example where and by whom it was created, on which occasions and with which methods the data were collected, and what a variable means and which values it can take. Metadata are not the same as documentation; what distinguishes metadata is that they are structured in a way that makes them readable by both humans and computers.
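To illustrate what "structured" can mean in practice, here is a minimal sketch in Python. The field names and values are hypothetical and chosen only for illustration; they are not the schema used in DORIS or any other catalogue. The point is that the same record can be read by a person and processed by a program.

```python
import json

# Hypothetical metadata record. The field names are assumptions made for
# this example and do not follow any particular metadata standard.
record = {
    "title": "Example survey on commuting habits",
    "creator": "Example University, Department of Sociology",
    "collection_method": "web questionnaire",
    "collection_period": {"start": "2021-03-01", "end": "2021-05-31"},
    "variables": [
        {
            "name": "age",
            "description": "Age of respondent in years",
            "type": "integer",
            "allowed_values": {"min": 18, "max": 99},
        },
        {
            "name": "mode",
            "description": "Main mode of transport to work",
            "type": "categorical",
            "allowed_values": ["car", "bus", "bicycle", "walking"],
        },
    ],
}


def describe_variable(metadata, name):
    """Return the description of a named variable from a metadata record."""
    for variable in metadata["variables"]:
        if variable["name"] == name:
            return variable["description"]
    raise KeyError(f"No variable named {name!r} in the metadata")


# Serialise to JSON text (readable by humans) and parse it back
# (readable by computers) without losing any information.
serialised = json.dumps(record, indent=2, ensure_ascii=False)
restored = json.loads(serialised)
```

A free-text description could carry the same facts, but a program could not reliably look up what a variable means or which values it may take.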
Accessible: how do you gain access to the data?
The next part of the FAIR data principles concerns access to the data. Making data material accessible is not the same as sharing the data freely so that everyone can access and use them. If the material contains sensitive personal data, or special category data, for example, a confidentiality assessment needs to be made before the material can be released to anyone. Metadata, however, are not sensitive, so even if the data cannot be made freely accessible, you can use metadata to show that the material exists and under which conditions it may be accessed and reused.
Interoperable: are the data and metadata interoperable?
To comply with the I in FAIR, both data and metadata need to follow accepted standards. The main responsibility for this rests with the organisation that makes the data accessible (for instance SND). However, it also means that you, as a researcher, should use standardised ways to enter information such as dates, time periods, and geographic coordinates, that you select widely adopted scientific vocabularies to describe categories, and that you code variables according to an accepted standard. If possible, you should save the data in a widely used file format that is supported by common operating systems and can be opened in several programs, or use software that can export data in such file formats when the project is finished.
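As a rough sketch of what standardised entry can look like, the Python helpers below write dates and time periods in ISO 8601 notation, check coordinates given in decimal degrees, and export rows to CSV, a widely supported plain-text format. The choice of ISO 8601, decimal degrees, and CSV is an assumption made for this example, and the function names are hypothetical; your discipline or repository may recommend other standards.

```python
import csv
import io
from datetime import date, datetime


def to_iso_date(text, fmt="%d/%m/%Y"):
    """Parse a date written in a local format and return it as YYYY-MM-DD."""
    return datetime.strptime(text, fmt).date().isoformat()


def to_iso_period(start, end):
    """Return a time period as an ISO 8601 interval, 'start/end'."""
    if end < start:
        raise ValueError("The period ends before it starts")
    return f"{start.isoformat()}/{end.isoformat()}"


def check_coordinates(latitude, longitude):
    """Validate decimal-degree coordinates and round them to 5 decimals."""
    if not -90 <= latitude <= 90:
        raise ValueError(f"Latitude out of range: {latitude}")
    if not -180 <= longitude <= 180:
        raise ValueError(f"Longitude out of range: {longitude}")
    return round(latitude, 5), round(longitude, 5)


def to_csv(rows, fieldnames):
    """Write a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Example use with made-up values.
iso_day = to_iso_date("03/07/2021")
period = to_iso_period(date(2021, 3, 1), date(2021, 5, 31))
lat, lon = check_coordinates(57.70887, 11.97456)
table = to_csv([{"date": iso_day, "latitude": lat, "longitude": lon}],
               ["date", "latitude", "longitude"])
```

Written as "03/07/2021", a date can be read as either 3 July or 7 March; the ISO 8601 form removes that ambiguity for both human readers and software.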
Reusable: is it possible for others to use the data in the future?
Finally, it should be possible to reuse the data. Some conditions for reusability are that the data are described with sufficient and relevant metadata, that both humans and computers can read the metadata, and that there is clear information about, for instance, the scientific purpose of the study, the context for the data collection as well as which equipment and software were used for the data collection and analysis. You should also clearly specify the conditions for how the data may be used.
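One way to check these conditions before submitting a data description is a simple completeness test. The sketch below is only an illustration: the list of required fields is an assumption derived from the conditions named above, and the field names are hypothetical rather than taken from any repository's actual requirements.

```python
# Hypothetical field names, based on the reuse conditions described above.
REQUIRED_FIELDS = (
    "purpose",             # scientific purpose of the study
    "collection_context",  # context for the data collection
    "equipment",           # equipment used for collection and analysis
    "software",            # software used for collection and analysis
    "terms_of_use",        # conditions for how the data may be used
)


def missing_fields(metadata):
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Example use with a made-up, incomplete record.
draft = {
    "purpose": "Study commuting habits among employees",
    "software": "   ",
    "terms_of_use": "Access after approved request",
}
gaps = missing_fields(draft)
```

An empty result means every listed field has some content; it says nothing about whether that content is sufficient, which still requires human judgement.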