ChatGPT, the popular software that generates text using artificial intelligence, came close to the score required to pass a demanding medical exam in the United States, according to a study published Thursday.
California start-up OpenAI launched the chatbot last November, and it has been causing a stir ever since. Easy to use, it produces texts (essays, articles or even poems) on demand.
For the study, published in the journal PLOS Digital Health, researchers from the company AnsibleHealth tested the software's performance on an exam that medical students in the United States must take, and which tests them in several areas (scientific knowledge, clinical reasoning, bioethics, etc.).
Called the USMLE (United States Medical Licensing Examination), this exam is divided into three parts: the first is taken after about two years of study, the second after four years, and the third is required to become a doctor.
Close to the margin of success
ChatGPT was tested on 350 of the 376 questions posted on the USMLE site that were part of the June 2022 exam. The image-based questions had to be removed.
They were presented in three formats: open-ended questions (“What would this patient’s diagnosis be given the information presented?”), multiple-choice questions without justification (“What is the most appropriate next follow-up step among the following?”) and multiple-choice questions with justification (“What is the most likely reason for the patient’s nocturnal symptoms? Explain your reasoning”).
Two reviewers graded the answers and a third adjudicated discrepancies between them. The software answered between 52.4% and 75% of the questions correctly. The score required to pass the exam is generally around 60%. “ChatGPT is close to the margin of success,” the study concludes.
Future help for doctors?
Some outside experts have criticized the method used. The researchers could have introduced a degree of anonymization by mixing human responses with those of the robot, said Nello Cristianini, a professor of artificial intelligence at the University of Bath in the United Kingdom.
Still, he called the work “part of a series of exciting new developments in the field of artificial intelligence.”
According to Lucía Ortiz de Zárate, a researcher at the Autonomous University of Madrid, this study demonstrates “the potential of AI in the medical field.” “It can be of great help to physicians when making diagnoses and prescribing treatments,” she said.
In late January, another study showed that ChatGPT could pass exams at a US law school, even though it finished last in the class.
Source: BFM TV
