Scores of responses by doctors and ChatGPT on the Swedish family medicine specialist exam

SND-ID: 2023-311. Version: 1. DOI: https://doi.org/10.5878/j8jh-5128

Citation

Creator/Principal investigator(s)

Rasmus Arvidsson - University of Gothenburg, Institute of Medicine, School of Public Health and Community Medicine

Ronny Gunnarsson - University of Gothenburg, Institute of Medicine, School of Public Health and Community Medicine orcid

Artin Entezarjou - University of Gothenburg, Institute of Medicine, School of Public Health and Community Medicine

David Sundemo - University of Gothenburg, Institute of Medicine, School of Public Health and Community Medicine orcid

Carl Wikberg - University of Gothenburg, Institute of Medicine, School of Public Health and Community Medicine orcid

Research principal

University of Gothenburg - Institute of Medicine rorId

Description

These scores were compiled as part of a study which compared ChatGPT’s performance with that of real doctors on the Swedish family medicine specialist exam.

The dataset contains scores from zero to ten for the cases from exam years 2017-2022. For more details, see README.txt.

Data contains personal data

No

Language

Method and outcome

Population

Anonymous responses from SFAM's specialist exam in general medicine 2017-2022 and responses from ChatGPT to the same cases.

Time Method

Study design

Observational study

Description of study design

ChatGPT’s scores were compared with those of real doctors using cases from the Swedish family medicine specialist exam.

Sampling procedure

Mixed probability and non-probability
1. Randomly selected doctor responses - a single response was selected randomly for each case.
2. Top-tier doctor responses - a response for each case chosen by the exam reviewers as an example of a very good response.
3. ChatGPT responses - responses provided by ChatGPT-4 (August 3, 2023 version).

Time period(s) investigated

2017 – 2022

Data format / data structure

Data collection

Data collection 1

  • Mode of collection: Simulation
  • Description of the mode of collection: Exam questions submitted as prompts to ChatGPT-4
  • Time period(s) for data collection: 2023-08-23 – 2023-08-23
  • Data collector: University of Gothenburg
  • Instrument: ChatGPT-4 (Other)

Data collection 2

  • Mode of collection: Educational measurements and tests
  • Description of the mode of collection: SFAM's specialist exam in general medicine
  • Data collector: The Swedish Association of General Practice (SFAM)
Geographic coverage

Geographic spread

Geographic location: Sweden

Administrative information

Responsible department/unit

Institute of Medicine

Topic and keywords

Research area

Other medical engineering (Standard för svensk indelning av forskningsämnen 2011)

General practice (Standard för svensk indelning av forskningsämnen 2011)

Other medical and health sciences not elsewhere specified (Standard för svensk indelning av forskningsämnen 2011)

Publications

Copyright

Copyright is retained for the example case in the README file. See LICENSE.txt.

License

Other

Versions

Version 1. 2024-03-08

DOI: https://doi.org/10.5878/j8jh-5128

Published: 2024-03-08
Last updated: 2024-03-08