Najafali, Daniel ; Galbraith, Logan G. ; Camacho, Justin M. ; Stoffel, Victoria ; Herzog, Isabel ; Moss, Civanni ; Taiberg, Stephanie L. ; Knoedler, Leonard

Class in Session: Analysis of GPT-4-created Plastic Surgery In-service Examination Questions

Najafali, Daniel, Galbraith, Logan G., Camacho, Justin M., Stoffel, Victoria, Herzog, Isabel, Moss, Civanni, Taiberg, Stephanie L. and Knoedler, Leonard (2024) Class in Session: Analysis of GPT-4-created Plastic Surgery In-service Examination Questions. Plastic and Reconstructive Surgery - Global Open 12 (9), e6185.

Publication date of this full text: 04 Oct 2024 17:03
Article
DOI for citing this document: 10.5283/epub.59336


Abstract

Background:

The Plastic Surgery In-Service Training Examination (PSITE) remains a critical milestone in residency training. Successful preparation requires extensive studying during an individual’s residency. This study focuses on the capacity of Generative Pre-trained Transformer 4 (GPT-4) to generate PSITE practice questions.

Methods:

GPT-4 was prompted to generate multiple-choice questions for each PSITE section and to provide answer choices with detailed rationales. Question composition was analyzed using readability metrics, and question quality was also assessed. Descriptive statistics were used to compare GPT-4-generated questions with the 2022 PSITE.

Results:

The overall median Flesch–Kincaid reading ease for GPT-4-generated questions was 43.90 (versus 50.35 for the PSITE, P = 0.036). GPT-4-generated questions contained significantly fewer sentences (mean 1 versus 4) and words (16 versus 56) and a lower percentage of complex words (3 versus 13) than 2022 PSITE questions (P < 0.001). When GPT-4-generated questions were evaluated for each examination section, the highest median Flesch–Kincaid reading ease was on the core surgical principles section (median: 63.30, interquartile range [54.45–68.28]) and the lowest was on the craniomaxillofacial section (median: 36.25, interquartile range [12.57–58.40]). Most readability metrics were higher for the 2022 PSITE than for GPT-4-generated questions. Overall question quality was poor for the chatbot.
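
For orientation, the reading-ease score reported above is conventionally computed with the Flesch formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word), where higher scores indicate text that is easier to read. The following is a minimal sketch of that formula only. The abstract does not say which software the authors used, and the function names and the vowel-group syllable counter here are assumptions made for illustration, so its scores will not match those of dedicated readability tools exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables with a crude vowel-group heuristic (an assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent 'e' (e.g. "made"), but not "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Return the Flesch reading-ease score for a piece of English text."""
    # Sentences are split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters, optionally joined by apostrophes or hyphens.
    words = re.findall(r"[A-Za-z]+(?:['-][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )


if __name__ == "__main__":
    # Hypothetical question stem, not taken from the study.
    stem = "Which flap is most appropriate for coverage of this defect?"
    print(round(flesch_reading_ease(stem), 2))
```

Because short stems contain few sentences and words, a single long or polysyllabic term shifts the score considerably, which is worth keeping in mind when comparing one-sentence questions with multi-sentence ones.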

Conclusions:

Our study found that GPT-4 can be adapted to generate practice questions for the 2022 PSITE, but its questions are of poor quality. The program can offer general explanations for both the correct and incorrect answer options but was observed to generate false information and poor-quality explanations. Although trainees should navigate with caution as the technology develops, GPT-4 has the potential to serve as an effective educational adjunct under the supervision of trained plastic surgeons.



Details

Document type: Article
Journal or periodical title: Plastic and Reconstructive Surgery - Global Open
Publisher: PRS Global Open
Volume: 12
Issue or chapter number: 9
Page range: e6185
Date: 19 September 2024
Institutions: Medicine > Centers of the Universitätsklinikum Regensburg > Zentrum für Plastische-, Hand- und Wiederherstellungschirurgie
Identifier (DOI): 10.1097/GOX.0000000000006185
Dewey Decimal Classification: 600 Technology, medicine, applied sciences > 610 Medicine
Status: Published
Peer reviewed: Yes, this version has been peer reviewed
Created at the University of Regensburg: Yes
URN of the Regensburg University Library: urn:nbn:de:bvb:355-epub-593361
Document ID: 59336
