Artificial Intelligence Might Help Clinicians Answer Patient Questions
By Martin S. Lipsky, MD
Chancellor, South Jordan Campus, Roseman University of Health Sciences, South Jordan, UT
SYNOPSIS: Researchers evaluated the ability of ChatGPT to answer patient questions posed in an online forum. The authors found the chatbot generated high-quality, empathetic answers. These results suggest artificial intelligence assistants might help draft responses to patient questions.
SOURCE: Ayers JW, Poliak A, Dredze M, et al. Comparing physician and artificial intelligence chatbot responses to patient questions posted to a public social media forum. JAMA Intern Med 2023;183:589-596.
The COVID-19 pandemic led to the wider use of virtual healthcare.1 One consequence was an expansion of electronic patient messages, with each question adding two to three minutes of administrative work for physicians.2 Ayers et al explored whether ChatGPT, a new generation of artificial intelligence (AI) technology, could draft responses that physicians could edit, perhaps alleviating some administrative burden.
Using questions from a public social media forum (Reddit’s r/AskDocs), the authors randomly selected 195 exchanges from October 2022 in which a verified physician had responded to a question. Researchers generated chatbot responses by entering each original question into a fresh ChatGPT session. The original question, along with the anonymized and randomly ordered physician and chatbot responses, was evaluated in triplicate by a team of licensed healthcare professionals. Evaluators chose “which response was better” and judged both “the quality of information provided” (very poor, poor, acceptable, good, or very good) and “the empathy or bedside manner provided” (not empathetic, slightly empathetic, moderately empathetic, empathetic, or very empathetic). The authors ordered mean outcomes on a 1 to 5 scale and compared chatbot and physician responses.
Of the 195 questions and responses, evaluators preferred chatbot responses to physician responses in 78.6% of the 585 evaluations (95% CI, 75.0%-81.8%). The proportion of responses rated as good or very good quality was higher for the chatbot than for physicians (chatbot: 78.5%; 95% CI, 72.3%-84.1%; physicians: 22.1%; 95% CI, 16.4%-28.2%). This amounted to a 3.6 times higher prevalence of good or very good quality responses for the chatbot. Chatbot responses also were rated significantly more empathetic than physician responses, with a 9.8 times higher prevalence of empathetic or very empathetic responses for the chatbot.
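The reported 3.6-fold difference follows directly from the two quality percentages above. A minimal sketch, using only the point estimates given in the study (not its confidence-interval calculations):

```python
# Prevalence ratio of "good or very good" quality ratings,
# chatbot vs. physicians, from the point estimates in Ayers et al.
chatbot_good = 78.5    # % of chatbot responses rated good or very good
physician_good = 22.1  # % of physician responses rated good or very good

prevalence_ratio = chatbot_good / physician_good
print(round(prevalence_ratio, 1))  # 3.6 — matching the reported 3.6-fold difference
```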
The authors concluded ChatGPT generated quality and empathetic responses to patient questions posed in an online forum. They suggested further study in clinical settings to see if AI might improve message responses, lower clinician burnout, and improve patient outcomes.
COMMENTARY
The FDA considers AI and machine learning (ML) a branch of computer science, statistics, and engineering that uses algorithms or models to perform tasks and exhibit behaviors such as learning, making decisions, and making predictions without specific programming.3 ChatGPT is a natural language processing tool driven by AI technology that can answer questions and assist with tasks, such as composing emails and essays. In the Ayers et al study, an independent review panel rated the quality of ChatGPT responses to questions as better than physician answers.
I was not especially surprised by this result, but I was astonished the panel rated chatbot responses as empathetic or very empathetic almost 10 times more often than physician responses. I am not sure whether this represents a feather in AI’s cap or a black eye for physicians. One explanation might be that questions posed by an anonymous source do not generate the level of empathy that a known patient’s question might. I choose to believe physicians would be more empathetic than ChatGPT in responding to their familiar patients.
Ayers et al suggested that in an environment where electronic medical records and documentation plague physicians, AI offers a technological means to alleviate physician burden. In an editorial, Li et al asked what physician would not welcome help with drafting progress notes, summarizing the literature, completing insurance forms, and responding to messages.4 Additional studies of AI in primary care demonstrated its promise for managing many routine documentation tasks and shortening the time physicians spend on them.5,6 Other investigators found that AI can help with triage and assess procedural skills, and that ChatGPT can perform at or near the passing threshold on all three parts of the United States Medical Licensing Examination.7-9
I teach an online public health class. After hearing so much about AI, I decided to explore how ChatGPT answered several of my study questions. The program responded quickly to each question, and most answers were as good as (or perhaps better than) my answer key. However, the chatbot responses also included several incorrect references, as well as answers that were incomplete or, at best, average. While it often was impressive, my admittedly unscientific trial suggests chatbot responses still need careful review and references need verification.
AI reminds me of other technological transitions. For example, I remember bringing a slide rule to class and teachers forbidding calculators. Today, many students would not recognize a slide rule, and teachers accept calculators as quicker and more accurate. Despite this, a calculator does not replace the skills needed to enter information correctly and interpret the outcome accurately. I am optimistic that AI will be a useful tool, but I do not see it as replacing physicians. Instead, it likely will reduce elements of practice that exacerbate burnout while enhancing data summarization and augmenting the application of evidence to patient care.
While new applications should be researched carefully, AI already is used in imaging and drug discovery. Other areas where AI might be effective include improving diagnostic accuracy, image analysis, and prognosis predictions.10 However, the tool should be employed cautiously, since AI is subject to data bias (it can only be as good as the data entered), to overfitting, and to the fact that it identifies predictors, not causes. Ultimately, if AI can create more time to spend with patients and improve care, it seems likely that with the proper cautions, this AI tool will be a good thing.
REFERENCES
1. Zulman DM, Verghese A. Virtual care, telemedicine visits, and real connection in the era of COVID-19: Unforeseen opportunity in the face of adversity. JAMA 2021;325:437-438.
2. Holmgren AJ, Downing NL, Tang M, et al. Assessing the impact of the COVID-19 pandemic on clinician ambulatory electronic health record use. J Am Med Inform Assoc 2022;29:453-460.
3. U.S. Food & Drug Administration. Artificial intelligence and machine learning (AI/ML) for drug development. Content current as of May 16, 2023.
4. Li R, Kumar A, Chen JH. How chatbots and large language model artificial intelligence systems will reshape modern medicine: Fountain of creativity or Pandora’s box? JAMA Intern Med 2023;183:596-597.
5. American Academy of Family Physicians. Using an AI assistant to reduce documentation burden in family medicine. Evaluating the Suki assistant. November 2021.
6. American Academy of Family Physicians. AI assistant for clinical review to reduce burden and improve quality and value-based care outcomes. Evaluating the Navina assistant. October 2022.
7. Ellertsson S, Hlynsson HD, Loftsson H, Sigurðsson EL. Triaging patients with artificial intelligence for respiratory symptoms in primary care to improve patient outcomes: A retrospective diagnostic accuracy study. Ann Fam Med 2023;21:240-248.
8. Igaki T, Kitaguchi D, Matsuzaki H, et al. Automatic surgical skill assessment system based on concordance of standardized surgical field development using artificial intelligence. JAMA Surg 2023; Jun 7:e231131. doi: 10.1001/jamasurg.2023.1131. [Online ahead of print].
9. Kung TH, Cheatham M, Medenilla A, et al. Performance of ChatGPT on USMLE: Potential for AI-assisted medical education using large language models. PLOS Digit Health 2023;2:e0000198.
10. Obermeyer Z, Emanuel EJ. Predicting the future - Big data, machine learning, and clinical medicine. N Engl J Med 2016;375:1216-1219.