DOI: 10.1145/3313831.3376718
research-article
Open Access
Honorable Mention

A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy

Published: 23 April 2020

ABSTRACT

Deep learning algorithms promise to improve clinician workflows and patient outcomes. However, these gains have yet to be fully demonstrated in real-world clinical settings. In this paper, we describe a human-centered study of a deep learning system used in clinics for the detection of diabetic eye disease. From interviews and observation across eleven clinics in Thailand, we characterize current eye-screening workflows, user expectations for an AI-assisted screening process, and post-deployment experiences. Our findings indicate that several socio-environmental factors impact model performance, nursing workflows, and the patient experience. We draw on these findings to reflect on the value of conducting human-centered evaluative research alongside prospective evaluations of model accuracy.

Supplemental Material

a589-beede-presentation.mp4 (mp4, 40.8 MB)

    • Published in

      CHI '20: Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems
      April 2020
      10688 pages
      ISBN: 9781450367080
      DOI: 10.1145/3313831

      Copyright © 2020 Owner/Author

      Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 6,199 of 26,314 submissions, 24%
