Artificial Intelligence in Managing Messages from Patients

I ran across another interesting article in the JAMA Network about Artificial Intelligence (AI) and how health care organizations manage messages from patients to doctors and nurses. The shorthand for this in the article is “in-basket burden.” Health care workers respond to a large number of patients’ questions, and it can lead to burnout. Some organizations are testing AI by letting it draft replies to patients. The results of the quality improvement study were published in a paper:

English E, Laughlin J, Sippel J, DeCamp M, Lin C. Utility of Artificial Intelligence–Generative Draft Replies to Patient Messages. JAMA Netw Open. 2024;7(10):e2438573. doi:10.1001/jamanetworkopen.2024.38573

One of the fascinating things about this is the trouble we have naming AI’s misinformation problem. We tend to use a couple of terms interchangeably: hallucination and confabulation. Whatever you call it, the problem interferes with communication between health care workers and patients.

Dr. English describes the interference as a “whack-a-mole” issue, meaning that every time they think they’ve got the hallucination/confabulation problem licked, the AI comes up with another case of miscommunication.

Just for fun, I did a web search trying to find out whether “hallucination” or “confabulation” fits the AI behavior better. Computer experts tend to use “hallucination,” and neuropsychologists seem to prefer “confabulation.” I think this community chat site gives a pretty even-handed discussion of the distinction. I prefer the term “confabulation.”

Anyway, there are other substantive issues with how using AI drafts for patient messaging affects communication. I think it’s interesting that patients tend to think AI is more empathetic than medical practitioners. As Dr. English puts it: “This GPT is nicer than most of us,” and “And ChatGPT, or any LLM, isn’t busy. It doesn’t get bored. It doesn’t get tired.” The way that’s worded made me think of a scene from a movie:

OK, so I’m kidding—a little. I think it’s important to move carefully down the path of idealizing AI. I think back to the recent news article about humans teaching AI how to lie and scheme. I remember searching the web with the question “Can AI lie?” and getting a reply from Gemini, because I have no choice about whether or not it gives me its two cents. I’m paraphrasing, but it said essentially, “Yes, AI can lie, and we’re getting better with practice.”

I like Dr. English’s last statement, in which she warns us that AI can be a fun tool, but one clinicians need to regard with healthy skepticism. It may say things you might be tempted to gloss over or even ignore, like:

“I’ll be back.”

Dirty Deepfakes

I saw an article about how unreliably humans detect digital deepfakes in audio and video productions (Mai KT, Bray S, Davies T, Griffin LD. Warning: Humans cannot reliably detect speech deepfakes. PLoS One. 2023 Aug 2;18(8):e0285333. doi: 10.1371/journal.pone.0285333. PMID: 37531336; PMCID: PMC10395974.).

I was a little surprised. I thought I was pretty good at detecting the weird cadence of AI speech patterns. Maybe not.

And some experts are concerned about AI’s ability to mimic written and spoken grammar while it continues to make things up (called “hallucinations”). In fact, some research shows that AI can display great language skills but can’t form a true model of the world.

And the publisher of the book (“Psychosomatic Medicine: An Introduction to Consultation-Liaison Psychiatry”) that my co-editor, Robert G. Robinson, MD, and I wrote 14 years ago is still sending me requests to sign a contract addendum that would allow the text to be used by AI organizations. I think I’m the only one who gets the messages, because they’re always sent to me and Bob, as though Bob lives with me or something.

Sometimes my publisher’s messages sound like they’re written by AI. Maybe I’m just paranoid.

Anyway, this reminds me of a blog post I wrote in 2011, “Going from Plan to Dirt,” which I re-posted last year under the title “Another Blast from the Past.” The situation is slightly different now, but the post still applies. Simply put, I don’t think AI can distinguish plan from dirt, and sometimes it makes up the dirt.

And if humans can’t distinguish what AI produces from what humans produce, where does that leave us?