June 24, 2024
Mount Sinai Study: ChatGPT Deemed Fit to Practice Medicine, Researchers Say


A recent study by medical researchers at the Icahn School of Medicine at Mount Sinai examined the potential of artificial intelligence (AI) chatbots to serve as autonomous practitioners of evidence-based medicine (EBM). The preprint, published on arXiv, offers insights into how well large language models (LLMs), including ChatGPT 3.5 and 4, perform medical tasks according to EBM protocols.

The Mount Sinai team tested various off-the-shelf, consumer-facing LLMs, using prompts such as "you are a medical professor" to evaluate their ability to suggest evidence-based treatments for a set of test cases. Among the models tested, ChatGPT 4 was the most successful, achieving an accuracy rate of 74% across all cases and outperforming ChatGPT 3.5 by approximately 10%.

The study concludes that LLMs, particularly ChatGPT 4, can function as autonomous practitioners of evidence-based medicine. The researchers highlight the models’ capacity to interact with healthcare system infrastructures and perform patient management tasks in adherence to guidelines.

Evidence-based medicine relies on the application of lessons learned from previous cases to guide the treatment trajectory for similar cases. However, the complexity of medical decision-making often leads to information overload for clinicians, making the process challenging to manage.

The researchers propose that LLMs can help mitigate this overload by handling tasks typically performed by human medical experts, such as ordering and interpreting investigations or issuing alarms. They describe LLMs as versatile tools capable of understanding clinical context and generating possible downstream actions.

While the study emphasizes the potential benefits of AI chatbots in evidence-based medicine, it also acknowledges challenges. LLMs, including foundation models like ChatGPT, generate new text with each query, raising concerns about occasional fabrication, or "hallucination." The researchers report minimal hallucinations during testing but do not detail techniques for mitigating them at scale.

The paper's suggestion that AI models exhibit reasoning and approach artificial general intelligence (AGI) is likely to spark debate, as there is no consensus among computer scientists on whether LLMs possess reasoning capabilities. The study also neither defines AGI nor addresses the ethical considerations of introducing unpredictable automated systems into clinical workflows.

Despite the promising benchmarks, questions linger about the practical benefits of general chatbots like ChatGPT in a clinical EBM environment compared to existing approaches or bespoke medical LLMs trained on curated, relevant data. As the AI community continues to explore the intersection of technology and healthcare, the study prompts further discussions on the role of AI in evidence-based medical practices.

Image: Wallpapers.com

Disclosure Statement: Miami Crypto does not take any external funding or support to bring crypto news to its readers. We have no conflicts of interest when writing news stories on Miami Crypto.


