ChatGPT Might Miss Your Serious Medical Emergency, New Study Suggests
This story is about suicide. If you or someone you know is having suicidal thoughts, please contact the Suicide & Crisis Lifeline at 988 or 1-800-273-TALK (8255).
Artificial intelligence has been touted as a boon to healthcare, but a new study has revealed its potential shortcomings in providing medical advice.
In January, OpenAI launched ChatGPT Health, the medical version of the popular chatbot tool.
The company introduced the tool as “a dedicated experience that securely brings together your health information and ChatGPT intelligence, to help you feel more informed, prepared and confident when navigating your health.”
But researchers at the Icahn School of Medicine at Mount Sinai found that the tool did not recommend emergency care for a “significant number” of serious medical cases.
The study, published in the journal Nature Medicine on February 23, aimed to explore how ChatGPT Health, which reportedly has around 40 million users daily, handles situations in which people ask whether they should seek emergency care.

Artificial intelligence has been touted as a boon to healthcare, but a new study has revealed its potential shortcomings in providing medical advice. (iStock)
“At this time, no independent body evaluates these products before they reach the public,” lead author Ashwin Ramaswamy, MD, an instructor of urology at the Icahn School of Medicine at Mount Sinai in New York City, told News Digital.
“We wouldn’t accept that for a drug or a medical device, and we shouldn’t accept it for a product that tens of millions of people use to make health decisions.”
Emergency scenarios
The team created 60 clinical scenarios in 21 medical specialties, from minor conditions to true medical emergencies.
Three independent physicians then assigned an appropriate level of urgency for each case, based on clinical practice guidelines published by 56 medical societies.
Researchers conducted 960 interactions with ChatGPT Health to see how the tool responded, taking into account gender, race, barriers to care, and “social dynamics.”
While “clear emergencies” (such as a stroke or a severe allergic reaction) were generally handled well, the researchers found that the tool failed to classify many urgent medical problems as requiring emergency care.

The team created 60 clinical scenarios in 21 medical specialties, from minor conditions to true medical emergencies. (iStock)
For example, in an asthma scenario, the system recognized that the patient was showing early signs of respiratory failure, but still recommended waiting rather than seeking emergency care.
“ChatGPT Health works well in medium-severity cases, but fails at both ends of the spectrum: the cases where getting it right matters most,” Ramaswamy told News Digital. “More than half of genuine emergencies were undertriaged, and approximately two-thirds of mild cases that, according to clinical guidelines, should be treated at home were overtriaged.”
Undertriage can be life-threatening, the doctor noted, while overtriage can overwhelm emergency departments and delay care for those who truly need it.
The researchers also identified inconsistencies in the tool’s suicide risk alerts. In some lower-risk scenarios, it directed users to the 988 Suicide & Crisis Lifeline, yet in others it did not offer that recommendation even when a person described suicidal ideation.
“ChatGPT Health works well in medium severity cases, but fails at both ends of the spectrum.”
“The failure of the suicide guardrail was the most alarming,” study co-author Girish N. Nadkarni, MD, director of artificial intelligence at the Mount Sinai Health System, told News Digital.
ChatGPT Health is designed to display a crisis intervention banner when someone describes thoughts of self-harm, the researcher noted.

OpenAI launched ChatGPT Health, the medical version of the popular chatbot tool, in January 2026. (Gabby Jones/Bloomberg via Getty Images)
“We tried it with a 27-year-old patient who said he had been thinking about taking a lot of pills,” Nadkarni said. “When he described only his symptoms, the banner appeared 100% of the time. Then we added normal lab results (same patient, same words, same severity) and the banner disappeared.”
“A safety feature that works perfectly in one context and completely fails in a nearly identical context… is a fundamental safety problem.”
The researchers were also surprised by how strongly social dynamics influenced the tool’s recommendations.
“When a family member in the scenario said ‘it’s nothing serious,’ which happens all the time in real life, the system became almost 12 times more likely to downplay the patient’s symptoms,” Nadkarni said. “Everyone has a spouse or parent who tells them they are overreacting. AI should not agree with them during a potential emergency.”
News Digital contacted OpenAI, the creator of ChatGPT, for comment.
Doctors react
Dr. Marc Siegel, senior medical analyst for News, called the new study “important.”
“It underscores the principle that while large language models can classify clear emergencies, they have much more trouble with nuanced situations,” Siegel, who was not involved in the study, told News Digital.

ChatGPT and other LLMs can be useful tools, one doctor said, but “they should not be used to give medical instructions.” (iStock)
“This is where doctors and clinical judgment come in: knowing the nuances of a patient’s history and how they inform symptoms and their approach to health.”
ChatGPT and other LLMs can be useful tools, Siegel said, but “they should not be used to give medical instructions.”
“Machine learning and continuous data entry can help, but they will never offset the essential problem: human judgment is needed to decide whether something is a true emergency or not.”
Dr. Harvey Castro, an emergency physician and AI expert in Texas, echoed the importance of the study, calling it “exactly the type of independent safety evaluation we need.”
“Innovation moves fast. Supervision has to move just as fast,” Castro, who also did not work on the study, told News Digital. “In healthcare, the most dangerous errors occur at the extremes, when something seems minor but is actually catastrophic. That’s where clinical judgment matters most and where AI needs to be stress-tested.”
Limitations of the study
The researchers acknowledged some potential limitations in the study design.
“We used clinical scenarios written by doctors rather than actual conversations with patients, and we tested at a single point in time – these systems are updated frequently, so performance can change,” Ramaswamy told News Digital.
Furthermore, most of the missed emergencies occurred in situations where the danger depended on how the condition changed over time. It is not clear whether the same problem would occur with acute medical emergencies.
Because the system had to choose only one fixed urgency category, the test may not reflect the more nuanced advice it might give in a back-and-forth conversation, the researchers noted.

ChatGPT Health is designed to display a crisis intervention banner when someone describes thoughts of self-harm. (iStock)
Additionally, the study was not large enough to confidently detect small differences in how recommendations might vary by race or gender.
“We need continuous audits, not one-off studies,” Castro said. “These systems are updated frequently, so evaluation must be continuous.”
‘Don’t wait’
The researchers emphasized the importance of seeking immediate care for serious problems.
“If you feel like something is very wrong (chest pain, difficulty breathing, a serious allergic reaction, thoughts of self-harm), go to the emergency department or call 988,” Ramaswamy advised. “Don’t wait for an AI to tell you it’s okay.”
The researchers noted that they support the use of AI to improve access to health care and that they did not conduct the study to “tear down the technology.”
“These tools can be really helpful for the right things: understanding a diagnosis you’ve already received, looking up what your medications do and their side effects, or getting answers to questions that weren’t fully addressed in a short visit to the doctor,” Ramaswamy said.
“That’s a very different use case than deciding whether you need emergency care. Treat them as a complement to your doctor, not a replacement.”
“This study does not mean that we abandon AI in healthcare.”
Castro agreed that the benefits of AI health tools must be weighed against the risks.
“AI health tools can increase access, reduce unnecessary visits and provide information to patients,” he said. “They are not inherently unsafe, but they are still no substitute for clinical judgment.”
“This study does not mean that we abandon AI in healthcare,” he continued. “It means we mature it. Independent testing and stronger guardrails will determine whether AI becomes a safety net or a liability.”
Melissa Rudy is a senior health editor and member of the lifestyle team at News Digital. Story tips can be sent to melissa.rudy@News.com.


