
Chatbot Health Advice: Study Reveals Limitations

May 5, 2025

The Rise of AI Chatbots in Self-Diagnosis

With long wait times and rising costs straining healthcare systems, a growing number of people are turning to AI-powered chatbots such as ChatGPT for medical self-diagnosis. According to recent surveys, roughly one in six American adults now consults a chatbot for health advice at least once a month.

Risks Associated with Over-Reliance on Chatbots

Placing too much trust in a chatbot's answers can be risky, however. According to a recent Oxford-led study, a key reason is that people struggle to know what information to give a chatbot in order to get the best possible health recommendations.

“The research demonstrated a breakdown in communication occurring in both directions,” explained Adam Mahdi, Director of Graduate Studies at the Oxford Internet Institute and a co-author of the study, in an interview with TechCrunch. “Individuals utilizing these chatbots did not demonstrate improved decision-making compared to those who relied on conventional methods like internet searches or personal judgment.”

Study Methodology and Findings

The study recruited around 1,300 people in the U.K. and presented them with medical scenarios written by a panel of doctors.

Participants were then asked to identify potential health conditions in the scenarios and to use both chatbots and their own methods to decide on a course of action, such as seeing a doctor or going to the hospital.

The models tested were GPT-4o, the default model powering ChatGPT, along with Cohere’s Command R+ and Meta’s Llama 3, which previously underpinned Meta’s AI assistant. The study found that the chatbots not only made participants less likely to identify a relevant health condition, but also more likely to underestimate the severity of the conditions they did identify.

Communication Challenges and Response Quality

Mahdi said participants often omitted key details when querying the chatbots, and the answers they received were frequently hard to interpret. “[T]he responses they received frequently combined good and poor recommendations,” he stated. “Existing methods for evaluating AI chatbots fail to capture the intricacies of human-computer interaction.”

Tech Companies and AI in Healthcare

These findings emerge as technology companies increasingly promote AI as a means of enhancing health outcomes.

Apple, for example, is reportedly developing an AI tool that can offer advice on exercise, diet, and sleep. Amazon is exploring the use of AI to analyze medical databases for “social determinants of health,” and Microsoft is helping build AI systems to triage messages that patients send to their care providers.

Cautious Perspectives and Recommendations

However, as previously reported by TechCrunch, both healthcare professionals and patients hold mixed opinions regarding the readiness of AI for high-stakes medical applications.

The American Medical Association recommends that physicians not use chatbots such as ChatGPT to assist with clinical decisions, and major AI companies, including OpenAI, caution against making diagnoses based on their chatbots’ outputs.

“We recommend prioritizing trusted sources of information when making healthcare decisions,” Mahdi emphasized. “Current evaluation methods for chatbots do not reflect the complexity of interacting with human users. Similar to clinical trials for new medications, these systems should undergo real-world testing before widespread implementation.”

#chatbots #health advice #AI #healthcare #study #limitations