Glossary
A glossary of terms used in this handbook. This list is not exhaustive and may be expanded.
Personal data
Personal data means information that can directly or indirectly identify a natural person, meaning a living individual.
Directly identifying information includes names, personal identity numbers, e-mail addresses, and IP addresses. Indirectly identifying information, sometimes called quasi-identifiers, is information that on its own is not sufficient to identify a living individual but that may do so in combination with other information (e.g., age, income, and municipality). Identifying individuals by combining information this way is known as re-identification.
Note that personal information about deceased individuals generally does not count as personal data from a GDPR perspective.
Sensitive personal data
Sensitive personal data includes information relating to a person's:
- Racial or ethnic origin
- Political opinions
- Religious or philosophical beliefs
- Trade union membership
- Health
- Sex life or sexual orientation
- Genetic data
- Biometric data for the purpose of uniquely identifying an individual
Collecting or conducting research using sensitive personal data requires ethical approval.
Personal data processing
Processing refers to any handling of and operations on personal data, including:
- collection
- storage
- publication
- disclosure
- archiving
- erasure or destruction
Examples include collecting personal data through interviews, surveys, or administrative registers.
Pseudonymised vs. anonymised data
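Pseudonymised research data have had their direct identifiers removed or replaced, but can still be linked to an individual with the help of additional information, such as a code key, and therefore still count as personal data. Anonymised research data, on the other hand, are fully de-identified and cannot be linked to an individual – not even via additional data sources – and therefore no longer count as personal data.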
Code key
A code key is used to separate direct identifiers from research data by replacing them with serial numbers or similar placeholders. The mapping, or code key, between identifiers and serial numbers is stored separately. This way, data can be analysed and interpreted without revealing identities, and the privacy of the research subjects is protected. However, as long as a code key exists, the data are considered pseudonymised, not anonymised, as the code key is an example of an additional data source that could be used to connect data to individuals. That means that the data are still considered personal data and fall under data protection regulations.
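As a minimal sketch of the idea described above, the following Python example replaces a direct identifier with a serial number and returns the code key as a separate object. The function name `pseudonymise`, the field names, the `P0001` numbering format, and the sample records are all invented for illustration and are not prescribed by this handbook.

```python
def pseudonymise(records, id_field):
    """Replace the direct identifier in each record with a serial number.

    Returns two objects: the pseudonymised records and the code key
    (a mapping from identifier to serial number).
    """
    code_key = {}
    pseudonymised = []
    for record in records:
        identifier = record[id_field]
        # Give each new identifier the next serial number.
        if identifier not in code_key:
            code_key[identifier] = f"P{len(code_key) + 1:04d}"
        # Copy every field except the direct identifier.
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["participant_id"] = code_key[identifier]
        pseudonymised.append(cleaned)
    return pseudonymised, code_key


# Made-up sample data, for illustration only.
records = [
    {"name": "Anna Andersson", "age": 34, "score": 7},
    {"name": "Bo Berg", "age": 51, "score": 4},
    {"name": "Anna Andersson", "age": 34, "score": 9},
]

research_data, code_key = pseudonymise(records, "name")
print(research_data)  # records without names, safe to analyse
print(code_key)       # must be stored separately from research_data
```

In this sketch, `research_data` still contains indirect identifiers such as age, and as long as `code_key` exists the result is pseudonymised rather than anonymised.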
Re-identification
Re-identification refers to the process of combining indirect identifiers (e.g., age, income, and municipality) to identify individuals. This can be done using variables or indirect identifiers within the original dataset or by combining information from the original dataset with additional data sources such as registers or information on social media. The risk of re-identification can be reduced by recoding variables – for example, by grouping ages or incomes into larger intervals or brackets, or by using larger geographic units, such as county instead of municipality.
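The recoding described above can be sketched in Python as follows. The helper names (`age_bracket`, `recode`), the ten-year bracket width, the field names, and the small municipality-to-county lookup table are assumptions made for this example; a real project would choose brackets and geographic units based on its own risk assessment.

```python
# Illustrative subset of a municipality-to-county lookup.
MUNICIPALITY_TO_COUNTY = {
    "Göteborg": "Västra Götaland",
    "Borås": "Västra Götaland",
    "Malmö": "Skåne",
    "Lund": "Skåne",
}


def age_bracket(age, width=10):
    """Group an exact age into an interval such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def recode(record):
    """Return a copy of the record with coarser indirect identifiers."""
    recoded = {k: v for k, v in record.items()
               if k not in ("age", "municipality")}
    recoded["age_group"] = age_bracket(record["age"])
    recoded["county"] = MUNICIPALITY_TO_COUNTY[record["municipality"]]
    return recoded


# Made-up record, for illustration only.
print(recode({"age": 34, "municipality": "Lund", "score": 7}))
```

Recoding lowers the risk of re-identification at the cost of detail, so the bracket width is a trade-off between privacy and analytical value.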
Official document (allmän handling)
A document is considered official if it is held by a public authority and has either been received or drawn up by that authority. Research data held by public institutions in Sweden are generally classified as official documents.
Read more about official documents.
Research principal (forskningshuvudman)
A research principal is the government agency or a natural or legal person in whose organisation research is conducted. The research principal has the ultimate responsibility for ensuring that research is conducted in accordance with good research practice. Collaborative research projects with multiple parties may have multiple research principals. The concept is defined in the Act on Responsibility for Good Research Practice (SFS 2019:504) and the Ethical Review Act (SFS 2003:460).
Read more about research principals.
Data controller (personuppgiftsansvarig)
A data controller is the natural or legal person, public authority, agency or other body which, alone or jointly with others, determines the purposes and means of the personal data processing. The data controller is responsible for how research data containing personal information are processed. In Swedish publicly funded research, the data controller is typically the research principal – for example, the university.
Data processor (personuppgiftsbiträde)
A data processor processes personal data on behalf of the data controller. The data processor can be a natural or legal person, public authority, agency or other body, such as a research infrastructure or company that helps collect or analyse research data on the controller’s behalf. The data processor is always external to the data controller’s organisation and operates under a specific mandate to process personal data for the data controller’s organisation. This mandate is always regulated by a Data Processing Agreement (personuppgiftsbiträdesavtal).
When two universities collaborate in research, each university is typically the data controller for the data it manages.
Data Protection Officer (dataskyddsombud)
Public authorities and universities acting as data controllers are required to appoint a Data Protection Officer (DPO). The DPO’s role is to inform and advise on the General Data Protection Regulation (GDPR), provide guidance on personal data processing, and monitor the organisation’s compliance with the GDPR. The DPO must be involved in all data protection impact assessments and acts as the contact point for the Swedish Authority for Privacy Protection (IMY).
Statistical disclosure control
Statistical disclosure control refers to a set of methods used to ensure that individual-level data cannot be inferred from a dataset. It also includes techniques for assessing risks of re-identification and information loss when anonymising or pseudonymising data.
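One simple way to assess re-identification risk is to count how many records share each combination of indirect identifiers, an idea commonly known as k-anonymity. The Python sketch below is only one of many possible checks and is not necessarily the method recommended in this handbook; the function names, field names, and sample records are invented for the example.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count the records sharing each combination of indirect identifiers."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def smallest_group_size(records, quasi_identifiers):
    """Return k, the size of the smallest group (0 for an empty dataset)."""
    return min(group_sizes(records, quasi_identifiers).values(), default=0)


def risky_records(records, quasi_identifiers, k=2):
    """Return the records whose combination occurs fewer than k times."""
    sizes = group_sizes(records, quasi_identifiers)
    return [r for r in records
            if sizes[tuple(r[q] for q in quasi_identifiers)] < k]


# Made-up sample data, for illustration only.
records = [
    {"age_group": "30-39", "county": "Skåne"},
    {"age_group": "30-39", "county": "Skåne"},
    {"age_group": "50-59", "county": "Skåne"},
]
fields = ["age_group", "county"]

print(smallest_group_size(records, fields))
print(risky_records(records, fields, k=2))
```

A record that is alone in its group is the easiest to single out. Further recoding or suppression of such records reduces that risk but increases information loss.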
Read more: Handbok i statistisk röjandekontroll (PDF) (Handbook on statistical disclosure control; in Swedish).
Encryption
Encryption is the process of converting information into a code or cipher that cannot be read without a decryption key. Encryption algorithms range from simple ones (e.g., shifting letters three steps ahead in the alphabet) to highly secure algorithms that cannot in practice be broken even with supercomputers – unless the key is known. For research data containing personal information, encryption is considered a form of pseudonymisation – the encrypted data are pseudonymised and the decryption key is additional information that can be used to identify individuals in the data.
In research, encryption is often used for data at rest, meaning data that are not in active use.
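The simple letter-shifting cipher mentioned above can be sketched in a few lines of Python. It is included only to show what a key does: the function names are invented for this example, the key is the number of steps (here 3), and a cipher this simple is trivial to break and must never be used to protect real research data.

```python
import string

LOWER = string.ascii_lowercase
UPPER = string.ascii_uppercase


def shift_letters(text, shift):
    """Shift each letter A-Z/a-z by `shift` steps, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in LOWER:
            result.append(LOWER[(LOWER.index(ch) + shift) % 26])
        elif ch in UPPER:
            result.append(UPPER[(UPPER.index(ch) + shift) % 26])
        else:
            # Spaces, digits, and other characters are left unchanged.
            result.append(ch)
    return "".join(result)


def encrypt(text, key=3):
    return shift_letters(text, key)


def decrypt(text, key=3):
    return shift_letters(text, -key)


secret = encrypt("Research data")
print(secret)           # Uhvhdufk gdwd
print(decrypt(secret))  # Research data
```

Anyone who knows the key can reverse the shift. In the same way, a decryption key acts as additional information that links encrypted data back to individuals.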