Breaking: Government report on ER misdiagnoses has a “fatal flaw,” internal analysis found.
A report that has sent shock waves through the US medical world relied on flimsy methods for key outcomes, a concern that peer reviewers and technical experts raised prior to publication.
Inside Medicine is a daily newsletter where I bring you to the frontline of medicine and public health. Sometimes, like today, we break some news. I hope you find it useful and I appreciate your support…
The New York Times reported yesterday that a newly released federal government study concluded that up to 250,000 people die in the United States annually due to misdiagnoses made in emergency rooms.
However, in a large document obtained by Inside Medicine that is not yet public, one expert contributing to an internal review of the report prior to its publication found a “fatal flaw” in the methodology behind some of the most crucial and eye-catching findings. Other reviewers and technical experts raised additional major concerns, which the study authors did not fully address prior to the report’s release. The technical expert concerned about the “fatal flaw” wrote that the results were “Headline grabbing, yes, but this is at best gravely misleading, given the concerns....”
Emergency medicine organizations have already pointed out major problems in the report. One thing not yet pointed out is that the magnitude of the findings fails every whiff test imaginable. If the findings of the report were somehow true, that would mean that 8.6% of all deaths in the United States—that is, 250,000 out of 2.9 million deaths (2019, the last pre-pandemic year)—were caused by mistakes and misses in ERs. That’s preposterous on its face.
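To spell out that back-of-the-envelope arithmetic (both figures are the ones cited above, not new data):

$$\frac{250{,}000\ \text{claimed ER misdiagnosis deaths}}{2{,}900{,}000\ \text{total US deaths in 2019}} \approx 0.086 = 8.6\%$$

Nearly one in every twelve deaths in the country, attributed to ER diagnostic error alone.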
Below, you’ll find the internal review in PDF form. It contains reviewer comments (left column) and the author replies (right column). Overall, I would characterize the reviewer comments as a mix: supportive yet confused, concerned, and (in some instances) openly admitting to being out of their depth. On one hand, a reviewer praises the report. But a paragraph later, the same reviewer finds vast problems that render the findings impossible to interpret. Such concerns did not keep those same findings off the front page of major newspapers.
For example, “Peer Reviewer #1” was largely supportive of the report, but worried that the conclusions and recommendations might be wrong. That’s kind of a major problem. The reviewer wrote: “[Emergency Departments] have often been criticized for the overuse of diagnostic tests and an emphasis on diagnostic error has the potential to increase testing among low-risk patients, leading to increased costs, identification of incidentalomas [random findings of unclear meaning, which are usually benign] which adversely impact patient wellbeing, radiation exposure, etc.” This is the correct concern to bring up. It gets lip service in the paper. It is the crux of the problem. But it’s not really dealt with.
Indeed, later, this same reviewer writes about the perils of over-diagnosis. For example, if a patient with a virus is admitted for IV antibiotics that they do not need (rather than sent home to get better), dangerous complications, such as secondary infections like C. difficile colitis, can occur.
The theme that emerges is that this report focuses on diagnostic misses without acknowledging that the opposite—over-diagnosing, or admitting more patients to the hospital for further tests and treatments in needle-in-the-haystack searches—comes with harms of its own. Remember: if over-diagnosis causes more harm than the benefit of catching a rare missed case, the system has failed overall.
This paper seems to have forgotten that. This paper seems to believe that a carefully reasoned risk-benefit analysis is akin to a missed diagnosis. As the first reviewer notes, “[Over-diagnosis] is an important component of misdiagnosis…I feel this section that introduces overcalls should be more balanced throughout the manuscript.” For their part, the authors basically ignored the reviewer comments whenever those comments got too close to rendering any of their conclusions null and void. Instead, they responded with window dressing rather than further analysis.
That same reviewer also noted that leaping from European data to a US estimate of mortality and disability “caused” by ER errors is a massive problem. And yet, that’s exactly how the 250,000-deaths-per-year number was generated. “US estimates are made on limited…data primarily from Europe…” the reviewer noted. Indeed, in Europe, emergency medicine is not even a recognized specialty in every country. While many European nations have excellent ERs, some are still on the learning curve. Comparing European ERs to US ones (our ERs are staffed by seasoned, board-certified emergency physicians and other experts in emergency care) is like drawing conclusions about the US soccer team from data on the French team’s performance. “These results should be more tempered with acknowledgment of the limitations of these numbers,” the reviewer wrote. And yet, as more than one reviewer predicted, the headline findings—whether true or not—were too delicious for the media to pass up.
Another reviewer pointed out that “hindsight bias” was not considered in the report. For example, consider what happens when an ER discharges a patient who has been ruled out for a heart attack. What happens if that patient goes on to have a fatal heart attack in the next 30 days? Is that a missed diagnosis? No. It’s a failure to have a crystal ball, even after the patient has passed all the tests designed to identify such cases. The only way to catch such rare events (the theory would seem to be) would be to hospitalize hundreds, thousands, or even tens of thousands of such patients. But doing that would cause more harm than good, because a small number of those patients would die of complications from overly aggressive invasive procedures, or from hospital-acquired infections, falls, and so on. In fact, the American Heart Association (and emergency medicine organizations) have protocols designed to make sure that ERs do not admit too many patients, specifically so that we do not do more harm than good.
Indeed, to save one life from a heart attack that was destined to occur (but had not yet occurred—which, again, this report categorizes as a “missed diagnosis” rather than a failure to be clairvoyant), ERs would have to admit hundreds or thousands of patients to the hospital. But the invasive procedures used to prevent those deaths are dangerous in and of themselves. Coronary artery perforations occur in around 1 in 250 cardiac catheterizations, and when they do, the mortality rate is 7.5% (albeit trending lower in recent years). What if, on balance, that approach caused more harm than good? Suddenly, ERs that had been sending these patients home would no longer be chastised for having made “misdiagnoses,” but rather hailed for having protected patients from rare but potentially devastating downstream errors that are hard to avoid once the journey down the invasive path has begun.
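As a rough illustration of that danger, using only the figures just cited (about 1 perforation per 250 catheterizations, and a 7.5% mortality rate when perforation occurs):

$$\frac{1}{250} \times 0.075 = 0.0003, \ \text{or about 3 deaths per 10,000 catheterizations from perforation alone}$$

And that is before counting hospital-acquired infections, bleeding, falls, and the other hazards of admission.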
Another example: Imagine that, after reading this report, 50 patients get admitted to hospitals for cardiac stenting who otherwise would have been sent home by ER doctors (because they were not having a heart attack or showing any signs of an impending one). One in 50 of them would be expected to have a serious complication, such as death, stroke, myocardial infarction, arrhythmia, or hemorrhage, because of the aggressive approach. That’s 1 serious complication for every 50 patients, simply as a result of ERs becoming more aggressive and admitting more patients (i.e., being scared to miss a heart attack) in response to believing this report’s findings. But only 1 out of many hundreds of patients (and maybe 1 among thousands) would have had an equally bad outcome because they were sent home. In this example, doing less (i.e., ignoring this report’s “findings”) would have saved more lives, on net.
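To put that hypothetical side by side (the 1-in-50 complication rate is from the example above; 1 in 500 is my stand-in for “one out of many hundreds,” purely for illustration):

$$\text{Admit 1,000 extra patients: } 1{,}000 \times \tfrac{1}{50} = 20 \ \text{serious complications}$$
$$\text{Send those 1,000 home: } 1{,}000 \times \tfrac{1}{500} = 2 \ \text{comparably bad outcomes}$$

On those illustrative numbers, the aggressive strategy causes roughly ten harms for every harm it could even theoretically prevent.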
And it is not just heart attacks. If the idea is that ER doctors miss too many strokes, consider the fact that many patients actually receive clot-busting medications even though they never had a stroke. That’s because neurology protocols recommend giving clot-busting drugs even if there is no evidence of a stroke on a CT scan. Yes, some of these patients are indeed having strokes. But some have other conditions that merely mimic a stroke. One out of 250 patients with a stroke mimic experiences symptomatic bleeding in the brain as a side effect of clot-busting medications they never even needed. So, if this report wants ERs to treat even more patients as having strokes without evidence of a stroke on CT scans, its authors need to prove that doing so helps more patients than it hurts.
And that’s the problem. This report seems unfamiliar with the idea that what we seek in medicine is net benefit. This report counts only the misses, but none of the saves ERs routinely make by following evidence-based medicine developed by emergency physicians, cardiologists, neurologists, and other experts working together. This report seems to think that abiding by the principle of balancing risks and harms is somehow synonymous with medical error.
To get a sense of how absurd this is, let’s pause for a special note about Peer Reviewer #3, who seems to have no self-awareness. It’s honestly astounding. He acknowledges—and I’m playing the odds on gender here—that “I am not a methodologist but have participated in several systematic reviews.” He then goes on to say, unironically, “The rigor of the methodology used is obvious, and the quality of the review is first rate in every respect.” Is this a joke? That’s like me saying, “I have no idea how cars work, but I have inspected the repairs just completed, and Carla, here, is one amazing auto mechanic. She’s first rate!” We are in a clown car, folks.
I end with a comment from yet another technical expert who reviewed the paper and provided comments: “The methodology overall is excellent—however, given the substantial heterogeneity of the studies, the variable definitions for diagnostic error, varying inclusion/exclusion criteria, and varying outcome measures make the robust analytic methods completely moot.”
In other words, these folks built an amazing space shuttle. However, it very likely can’t fly and will probably crash.
Here is the internal review and here is the link to the federal government’s report. Please add your comments and questions below!
Thanks to Dr. Joshua D. Niforatos for the rapid literature review.
Inside Medicine is written five days per week by Jeremy Faust, MD, MS, a practicing emergency physician, public health researcher, and writer. He blends his frontline clinical experience with original and incisive analyses of emerging data to help people make sense of complicated and important issues.
Did you like this? Please share it and follow me on Twitter, Instagram, and Facebook!