Last year, the release of ChatGPT, whose developer, OpenAI, is backed by Google’s rival Microsoft, sparked a race among tech giants in the burgeoning field of AI.
Healthcare is one area where the technology has already made tangible progress, with some algorithms shown to read medical scans better than humans. Google introduced its dedicated artificial intelligence tool for medical questions, called Med-PaLM, in a preprint article in December. Unlike ChatGPT, it has not been released to the general public.
First AI to pass the test
Google claims that Med-PaLM was the first large language model, an artificial intelligence technique trained on large amounts of human-generated text, to pass the USMLE (US Medical Licensing Exam).
Passing this exam is required to practice medicine in the United States. The passing score is approximately 60%. In February, a study found that ChatGPT performed quite well on the exam.
In a new peer-reviewed study published Wednesday in the journal Nature, Google researchers said Med-PaLM scored 67.6% when answering USMLE-style multiple-choice questions.
Results “encouraging but inferior to humans”
These results are “encouraging, but still lower than those in humans,” the study says. To identify and reduce so-called “hallucinations,” the term for an obviously wrong answer given by an AI model, Google said it has developed a new evaluation benchmark.
Karan Singhal, a Google researcher and lead author of the new study, told AFP that his team had tested a newer version of the model. Med-PaLM 2 reportedly scored 86.5% on the USMLE exam, beating the previous version by nearly 19 percentage points, according to a non-peer-reviewed study published in May.
According to the Wall Street Journal, Med-PaLM 2 has reportedly been in testing at the prestigious Mayo Clinic research hospital in the United States since April. Any tests done with Med-PaLM 2 will not be “clinical, patient-oriented or likely to harm patients,” Karan Singhal said. Rather, the model will be tested for “administrative tasks that can be automated relatively easily, with little risk,” he added.
Source: BFM TV
