Does ChatGPT Provide Higher Quality and More Empathetic Responses to Patient Questions Compared to Physician Responses?
BACKGROUND AND PURPOSE:
- Artificial intelligence (AI) chatbots such as ChatGPT can generate responses to patient questions
- Ayers et al. (JAMA Internal Medicine, 2023) evaluated the ability of ChatGPT to provide quality and empathetic responses to patient questions
METHODS:
- Cross-sectional study
- Dataset
- Patient questions from a public social media forum (Reddit’s r/AskDocs)
- Interventions
- Chatbot responses
- Generated by entering the original question into a fresh session (without prior questions having been asked in the session)
- Physician responses
- Original physician replies posted to the forum
- Study design
- Responses were anonymized and randomly ordered, and were then evaluated in triplicate by a team of health care professionals
- Evaluators judged
- Quality: Very poor | Poor | Acceptable | Good | Very good
- Empathy: Not empathetic | Slightly empathetic | Moderately empathetic | Empathetic | Very empathetic
- Primary outcomes
- Mean ratings on an ordinal 1-to-5 scale, compared between chatbot and physician responses
RESULTS:
- 195 questions and responses
- Percentage of chatbot responses preferred by evaluators: 78.6% (95% CI, 75.0 to 81.8)
- Compared with chatbot responses, physician responses were significantly
- Shorter (P<0.001)
- Physician: 52 (IQR, 17 to 62) words | Chatbot: 211 (IQR, 168 to 245) words
- Lower in quality (P<0.001)
- The proportion of responses rated as good or very good quality (≥ 4) was higher for chatbot than physicians
- Chatbot: 78.5% (95% CI, 72.3 to 84.1)
- Physicians: 22.1% (95% CI, 16.4 to 28.2)
- Chatbot responses were also evaluated as significantly more empathetic than physician responses (P<0.001)
- The proportion of responses rated empathetic or very empathetic (≥4) was higher for chatbot than for physicians
- Chatbot: 45.1% (95% CI, 38.5 to 51.8)
- Physicians: 4.6% (95% CI, 2.1 to 7.7)
CONCLUSION:
- Compared to physician responses, chatbot responses to patient-posed questions had a 3.6 times higher prevalence of being good or very good quality, and a 9.8 times higher prevalence of being empathetic or very empathetic
- Limitations included
- Quality and empathy measures were not pilot tested or validated
- Study evaluators were coauthors, which could introduce bias
- Another significant limitation, as stated by the authors, was the lack of clinical context
- "The main study limitation was the use of the online forum question and answer exchanges. Such messages may not reflect typical patient-physician questions. For instance, we only studied responding to questions in isolation, whereas actual physicians may form answers based on established patient-physician relationships."
Learn More – Primary Sources:
- Ayers JW, et al. Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum (JAMA Internal Medicine, 2023)