Key challenges for delivering clinical impact with artificial intelligence.

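Publication information

DOI: 10.1186/s12916-019-1426-2
PMID: 31665002
Journal: BMC Medicine
Impact factor: 8.3
JCR quartile: Q1
Year of publication: 2019
Citations: 681
Keywords: algorithms, artificial intelligence, evaluation, machine learning, regulation
Publication type: Journal Article; Research Support, Non-U.S. Gov't
ISSN: 1741-7015
Pages: 195
Volume (issue): 17(1)
Authors: Christopher J Kelly, Alan Karthikesalingam, Mustafa Suleyman, Greg Corrado, Dominic King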

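One-sentence summary

This article examines the main challenges in translating artificial intelligence (AI) in healthcare from research into clinical practice, including technical implementation, sociocultural barriers, and data bias, and stresses that robust clinical evaluation and appropriate regulation are needed to ensure patient safety and effective innovation. It argues that balancing the pace of innovation against potential risk, improving the interpretability of AI algorithms, and identifying and mitigating algorithmic bias are key to the successful use of AI in medicine.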

Abstract

BACKGROUND Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice.

MAIN BODY Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes.

CONCLUSION The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.

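Key research questions

  1. When bringing AI into clinical use, how can the clinical applicability of an algorithm be effectively evaluated and validated?
  2. Which successful deployments can serve as reference cases for AI in healthcare?
  3. How should dataset shift and potential discriminatory bias be handled in medical AI applications?
  4. Which strategies can encourage healthcare professionals to accept and use AI systems?
  5. How can a balance be struck between rapidly advancing AI technology and the regulation needed to ensure patient safety?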

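Core insights

Background and objectives

Research on artificial intelligence (AI) in healthcare is advancing rapidly, with potential applications demonstrated across many areas of medicine. However, examples of these techniques being successfully translated into clinical practice remain limited. This article explores the main challenges and limitations of AI in healthcare and considers the steps needed to move these potentially transformative technologies from research into clinical practice.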

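Main methods / materials / study design

The article reviews the existing literature to summarise the current state of AI in healthcare and the challenges it faces, focusing on the following areas:

  • Challenges for AI systems: limitations intrinsic to the science of machine learning, logistical difficulties in implementation, and sociocultural barriers to adoption.
  • Clinical evaluation standards: randomised controlled trials (RCTs) are emphasised as the gold standard for evidence generation, although running them in practice may not always be appropriate or feasible.
  • Suitability of performance metrics: metrics should capture real clinical applicability and be understandable to the intended users.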

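Key results and findings

  1. Few clinical deployments: although AI has shown promise in many areas, it is still rarely used in real clinical practice.
  2. Algorithmic bias and unfairness: AI systems can reflect societal biases, leading to less accurate predictions for some groups.
  3. Need for clear performance evaluation standards: existing metrics, such as the area under the ROC curve, do not necessarily reflect clinical applicability.
  4. Complexity of human-computer interaction: the interpretability and transparency of AI systems are critical for clinical use.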
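
As an illustration of finding 3, the sketch below shows one reason a discrimination metric alone may not convey clinical applicability: at a fixed operating point, the positive predictive value (PPV) depends strongly on disease prevalence. This example is not taken from the article. The sensitivity, specificity, and prevalence values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Bayes' rule: probability of disease given a positive prediction."""
    true_pos = sensitivity * prevalence                   # P(positive and diseased)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # P(positive and healthy)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point, assumed for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# The same model applied in two hypothetical settings.
ppv_screening = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.01)
ppv_referral = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.20)

print(f"PPV at 1% prevalence:  {ppv_screening:.1%}")   # 8.3%
print(f"PPV at 20% prevalence: {ppv_referral:.1%}")    # 69.2%
```

Under these assumed numbers, roughly 11 of every 12 positive predictions in the low-prevalence setting would be false alarms, even though sensitivity and specificity are unchanged. A metric expressed in this form may be easier for clinicians to interpret than an area under the ROC curve.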

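Main conclusions / significance / novelty

AI has transformative potential in healthcare, but its safe and effective clinical translation faces multiple challenges. Robust clinical evaluation is needed, using metrics that go beyond technical accuracy to measure the effect of AI on quality of care, variability among healthcare professionals, and patient outcomes. In addition, developing algorithms with a global perspective and ensuring they apply to diverse populations is a key direction for future research.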

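Limitations and future directions

  • Limitations: the article relies mainly on a literature review and lacks empirical data to support some of its points.
  • Future directions
    • Further prospective studies are needed to better understand the real-world utility of AI systems.
    • Develop explainable AI suited to clinical settings, to strengthen clinicians' trust in AI decisions.
    • Establish adaptable regulatory frameworks that can keep up with rapid change in clinically deployed AI systems.

| Research direction | Description |
|---|---|
| Prospective studies | Evaluate how AI systems perform in real clinical practice |
| Explainable AI | Increase the transparency and understandability of AI decisions |
| Regulatory frameworks | Regulatory mechanisms that adapt to the rapid development of AI technology |
| Bias identification and correction | Identify bias in algorithms and develop corresponding corrective measures |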
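
One simple way to start on the bias identification direction listed above is to report a performance metric separately for each patient subgroup and look at the gap between groups. The sketch below is a minimal illustration of that idea, not a method described in the article. The function name `sensitivity_by_group`, the group labels, and the toy records are all hypothetical.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Return per-group sensitivity (true positive rate).

    `records` is an iterable of (group, y_true, y_pred) tuples with
    binary labels, where 1 means the condition is present or predicted.
    Groups with no positive cases are omitted from the result.
    """
    true_pos = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in records:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_pos[group] += 1
    return {g: true_pos[g] / positives[g] for g in positives}


# Toy data, entirely made up for illustration.
records = [
    ("A", 1, 1), ("A", 1, 1), ("A", 1, 1), ("A", 1, 0), ("A", 0, 0),
    ("B", 1, 1), ("B", 1, 0), ("B", 1, 0), ("B", 1, 0), ("B", 0, 0),
]

per_group = sensitivity_by_group(records)
gap = max(per_group.values()) - min(per_group.values())

print(per_group)  # {'A': 0.75, 'B': 0.25}
print(gap)        # 0.5
```

A large gap such as this one would be a prompt for further investigation (for example, whether group B is under-represented in the training data) rather than proof of bias on its own. In practice, the same breakdown would usually be repeated for other metrics and with confidence intervals.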
