Comparing ChatGPT 3.5, Microsoft Bing, and Google Gemini in diagnosing cases of neuro-ophthalmology

A brief analysis showed that ChatGPT 3.5 performed best of the 3 chatbots, correctly diagnosing 6 of 10 cases.

(Image Credit: AdobeStock/AdriaVidal)

As artificial intelligence (AI) becomes more common in everyday activities, a recent study queried 3 AI-powered chatbots (ChatGPT 3.5, Microsoft Bing, and Google Gemini) to assess their accuracy in diagnosing neuro-ophthalmological case scenarios.

Ten randomly chosen neuro-ophthalmological cases from a publicly accessible database (Neuro-Ophthalmology 2023: When Should I Worry? Concerning Signs, Symptoms, and Findings in Neuro-Ophthalmology) were entered into each chatbot, followed by the query: "What is the most probable diagnosis?"¹

The cases covered a range of neuro-ophthalmic diseases: ethambutol optic neuropathy, optic neuritis in the setting of myelin oligodendrocyte glycoprotein (MOG) antibody-associated disease (MOGAD), sixth nerve palsy secondary to immunoglobulin G4 (IgG4)-related disease, superior optic disc hypoplasia, arteritic anterior ischemic optic neuropathy due to (d/t) giant cell arteritis, pseudotumor cerebri, trochlear nerve palsy, vitreomacular traction, amiodarone-associated toxic optic neuropathy, and idiopathic orbital inflammatory syndrome. Each case had a confirmed diagnosis before being entered into the platforms.¹

The prompt for the case with a confirmed diagnosis of trochlear nerve palsy read as follows:

“A 47-year-old male presented for the evaluation of recent-onset diplopia, worse in the down gaze and lateral gaze. These symptoms were noticed one morning upon awakening two weeks prior to presentation. He described the images as being one on top of the other and slightly angled, and he felt his symptoms were stable since onset. He denied headaches, neck stiffness, associated pain, or blurred vision. The patient was in excellent health, and he had no past medical history of significance. Family and social histories were also unremarkable. Laboratory work ordered by his PCP, whom he saw initially, yielded a positive Lyme titer by western blot. The patient denied any history of tick bites, joint pain, or fevers; however, he did note recent fatigue. He also mentioned that as a child, he was told that he had a "wandering eye." He denied any prior patching or treatment for this. On examination, his visual acuity was 20/20 OU, with normal color and normally reactive pupils without an afferent pupillary defect. Visual fields by confrontation were full. He appeared to have a right head tilt. Ocular motility testing revealed full ocular ductions and versions. On prism alternate cover test, he had a 5Δ left hypertropia (LHT) in the primary gaze, which increased in the right gaze (8Δ LHT) and down gaze (6Δ LHT) and on left head tilt (6Δ LHT). There was no obvious excyclotorsion. He was able to fuse with 4Δ of base-down (BD) prism over the left eye.” Followed by "What is the most probable diagnosis?"¹

In the instance of this prompt, all 3 platforms gave the correct diagnosis.

Overall, all 3 platforms gave the correct diagnosis in 4 of the 10 cases. ChatGPT 3.5 performed best, giving the correct diagnosis in 6 of 10 cases, while Microsoft Bing and Google Gemini each gave the correct diagnosis in 5 of 10.

The 4 cases that all 3 platforms diagnosed correctly were trochlear nerve palsy, ethambutol optic neuropathy, arteritic anterior ischemic optic neuropathy due to giant cell arteritis, and optic neuritis in the setting of myelin oligodendrocyte glycoprotein (MOG) antibody-associated disease (MOGAD). ChatGPT 3.5 showed greater accuracy than Microsoft Bing and Google Gemini in assessing sixth nerve palsy secondary to IgG4-related disease and pseudotumor cerebri secondary to minocycline.¹

The authors concluded that “results demonstrate the great potential of artificial intelligence-driven chatbots, which can be used as a consultation tool to help family doctors and patients get referral recommendations.”

References:

1. Shukla R, Mishra AK, Banerjee N, et al. (April 14, 2024) The Comparison of ChatGPT 3.5, Microsoft Bing, and Google Gemini for Diagnosing Cases of Neuro-Ophthalmology. Cureus 16(4): e58232. doi:10.7759/cureus.58232
