Algorithm That Detects Sepsis Cut Deaths by Nearly 20 Percent

2022-08-02

Hospital patients are at risk of a number of life-threatening complications, especially sepsis—a condition that can kill within hours and contributes to one out of three in-hospital deaths in the U.S. Overworked doctors and nurses often have little time to spend with each patient, and the condition can go unnoticed until it is too late.

Academics and electronic-health-record companies have developed automated systems that send reminders to check patients for sepsis, but the sheer number of alerts can cause health care providers to ignore or turn off these notices. Researchers have been trying to use machine learning to fine-tune such programs and reduce the number of alerts they generate. Now one algorithm has proved its mettle in real hospitals, helping doctors and nurses treat sepsis cases nearly two hours earlier on average—and cutting the condition’s hospital mortality rate by 18 percent.

Sepsis, which happens when the body’s response to an infection spirals out of control, can lead to organ failure, limb loss and death. Roughly 1.7 million adults in the U.S. develop sepsis each year, and about 270,000 of them die, according to the Centers for Disease Control and Prevention. Although most cases originate outside the hospital, the condition is a major cause of patient mortality in this setting. Catching the problem as quickly as possible is crucial to preventing the worst outcomes. “Sepsis spirals extremely fast—like in a matter of hours if you don’t get timely treatment,” says Suchi Saria, CEO and founder of Bayesian Health, a company that develops machine-learning algorithms for medical use. “I lost my nephew to sepsis. And in his case, for instance, sepsis wasn’t suspected or detected until he was already in late stages of what’s called septic shock..., where it’s much harder to recover.”

But in a busy hospital, prompt sepsis diagnosis can be difficult. Under the current standard of care, Saria explains, a health care provider should take notice when a patient displays any two out of four sepsis warning signs, including fever and confusion. Some existing warning systems alert physicians when this happens—but many patients display at least two of the four criteria during a typical hospital stay, Saria says, adding that this can give warning programs a high false-positive rate. “A lot of these other programs have such a high false-alert rate that providers are turning off that alert without even acknowledging it,” says Karin Molander, who is an emergency medicine physician and chair of the nonprofit Sepsis Alliance and was not involved in the development of the new sepsis-detection algorithm. Because of how commonly the warning signs occur, physicians must also consider factors such as a person’s age, medical history and recent lab test results. Putting together all the relevant information takes time, however—time sepsis patients do not have.
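
To make the false-positive problem concrete, here is a minimal sketch of the kind of two-of-four rule Saria describes. The specific criteria names and thresholds are illustrative assumptions for the example, not the actual clinical definition and not Bayesian Health's code.

```python
# Minimal sketch of a rule-based sepsis screen of the kind described above.
# The criteria names and thresholds are illustrative assumptions, not the
# actual standard-of-care definition and not Bayesian Health's code.

def simple_sepsis_screen(vitals: dict) -> bool:
    """Alert when any two of four warning signs are present."""
    warning_signs = [
        vitals.get("temperature_c", 37.0) > 38.3,   # fever
        vitals.get("heart_rate", 80) > 90,          # rapid heart rate
        vitals.get("respiratory_rate", 16) > 20,    # rapid breathing
        bool(vitals.get("confusion", False)),       # altered mental state
    ]
    return sum(warning_signs) >= 2

# A post-operative patient with a mild fever and a briefly elevated heart
# rate already trips the rule, which is why alerts this coarse fire far
# more often than sepsis actually occurs.
print(simple_sepsis_screen({"temperature_c": 38.6, "heart_rate": 95}))  # True
```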

In a well-connected electronic-records system, known sepsis risk factors are available but may take time to find. That’s where machine-learning algorithms come in. Several academic and industry groups are teaching these programs to recognize the risk factors for sepsis and other complications and to warn health care providers about which patients are in particular danger. Saria and her colleagues at Johns Hopkins University, where she directs the Machine Learning and Healthcare Lab, began work on one such algorithm in 2015. The program scanned patients’ electronic health records for factors that increase sepsis risk and combined this information with current vital signs and lab tests to create a score indicating which patients were likely to develop septic shock. A few years later, Saria founded Bayesian Health, where her team used machine learning to increase the sensitivity, accuracy and speed of their program, dubbed Targeted Real-Time Early Warning System (TREWS).
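
The published papers describe the model itself; purely as a rough illustration of the general idea, the sketch below computes a score by weighting chart-history factors together with current vitals and labs. The feature names, weights and logistic form are hypothetical assumptions, not the TREWS model.

```python
import math

# Hypothetical illustration of a risk score that combines chart history
# with current vitals and labs. Feature names and weights are invented for
# the example; TREWS itself is a more sophisticated, clinically validated
# model.

def sepsis_risk_score(features: dict, weights: dict, bias: float = -4.0) -> float:
    """Return a probability-like score in (0, 1) from a logistic model."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

weights = {                  # hypothetical learned coefficients
    "recent_surgery": 0.9,
    "immunosuppressed": 1.2,
    "lactate_mmol_l": 0.6,
    "heart_rate_z": 0.5,     # standardized heart rate
    "temperature_z": 0.4,    # standardized temperature
}

patient = {"recent_surgery": 1, "immunosuppressed": 0,
           "lactate_mmol_l": 3.1, "heart_rate_z": 1.8, "temperature_z": 1.2}

print(f"risk score: {sepsis_risk_score(patient, weights):.2f}")
```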

More recently, Saria and a team of researchers assessed TREWS’s performance in the real world. The program was incorporated over two years into the workflow of about 2,000 health care providers at five sites affiliated with the Johns Hopkins Medicine system, covering both well-resourced academic institutions and community hospitals. Doctors and nurses used the program in more than 760,000 encounters with patients—including more than 17,000 who developed sepsis. The results of this trial, which suggest TREWS led to earlier sepsis diagnosis and reduced mortality, are described in three papers published in npj Digital Medicine and Nature Medicine late last month.

“I think that this model for machine learning may prove as vital to sepsis care as the EKG [electrocardiogram] machine has proved in diagnosing a heart attack,” Molander says. “It is going to allow the clinician to go from the computer..., trying to analyze 15 years’ worth of information, to go back to the bedside and reassess the patient more rapidly—which is where we need to be.”

TREWS is not the first program to demonstrate its value in such trials. Mark Sendak, a physician and population health and data science lead at the Duke Institute for Health Innovation, works on a similar program developed by Duke researchers, called Sepsis Watch. He points out that other machine-learning systems focused on health care—not necessarily those created for sepsis detection in particular—have already undergone large-scale trials. One groundbreaking test of an artificial-intelligence-based system for diagnosing a complication of diabetes was designed with input from the U.S. Food and Drug Administration. Other programs have also been tested in multiple different hospital systems, he notes.

“These tools have a valuable role in improving the way that we care for patients,” Sendak says, adding that the new system “is another example of that.” He hopes to see even more studies, ideally standardized trials that involve research support and guidance from external partners, such as the FDA, who don’t have a stake in the results. This is a challenge because health care trials of machine-learning systems, including the new TREWS studies, are extremely difficult to design. “Anything that takes an algorithm and puts it into practice and studies how it’s used and its impact is phenomenal,” he says. “And doing that in the peer-reviewed literature—massive kudos.”

As an emergency room physician, Molander was impressed by the fact that the AI does not make sepsis decisions on behalf of health care providers. Instead it flags a patient’s electronic health record so that when doctors or nurses check the record, they see a note that the patient is at risk of sepsis, as well as a list of reasons why. Unlike some programs, the alert system for TREWS does not prevent “the clinician from doing any other further work [on the computer] without acknowledging the alert,” Molander explains. “They have a little reminder there, off in the corner of the system, saying, ‘Look, this person is at higher risk of decompensation [organ failure] due to sepsis, and these are the reasons why we think you need to be concerned.’” This helps busy doctors and nurses prioritize which patients to check on first without removing their ability to make their own decisions. “They can choose to disagree because we don’t want to take autonomy away from the provider,” Saria says. “This is a tool to assist. This is not a tool to tell them what to do.”

The trial also gathered data on whether doctors and nurses were willing to use an alert system such as TREWS. For instance, 89 percent of its notifications were actually evaluated rather than dismissed automatically, as Molander described happening with some other systems. Health care providers’ willingness to check the program could be because TREWS cut the high rate of false sepsis-warning notifications by a factor of 10, according to a press release from Bayesian Health, reducing the barrage of alerts and making it easier to distinguish which patients were in real danger. “That’s mind-blowing,” Molander says. “That is really important because it allows providers to increase their trust in machine learning.”

Building trust is important, but so is collecting evidence. Health care institutions would not be likely to accept machine-learning systems without proof they work well. “In tech, people are much more willing to adopt new ideas if they believe in the thought process. But in medicine, you really need rigorous data and prospective studies to support the claim to get scalable adoption,” Saria says.

“In some sense, we are building the products while also building the evidence base and the standards for how the work needs to be conducted and how potential adopters need to be scrutinizing the tools that we’re building,” Sendak says. Achieving widespread adoption for any algorithmic alert system is challenging because different hospitals may use different electronic-records software or may already have a competing system in place. Many hospitals also have limited resources, which makes it difficult for them to assess the effectiveness of an algorithmic alert tool—or to access technical support when such systems inevitably require repairs, updates or troubleshooting.

Still, Saria hopes to use the new trial data to expand the use of TREWS. She says she is building partnerships with multiple electronic-records companies so she can incorporate the algorithm into more hospital systems. She also wants to explore whether machine-learning algorithms could warn about other complications people can experience in hospitals. For instance, some patients must be monitored for cardiac arrest, heavy bleeding and bedsores, which can impact health during hospital stays and recuperation afterward.

“We’ve had a lot of learning around what ‘AI done right’ looks like, and we’ve published significantly on it. But what this is now showing is AI done right actually gets provider adoption,” Saria says. By incorporating an AI program into existing records systems, where it can become part of a health care provider’s workflow, “you can suddenly start chopping your way through all these preventable harms in order to improve outcomes—which benefits the system, benefits the patient and benefits the clinicians.”
