A research paper evaluating how LLMs respond to different framings of the same medical question. It shows that models are significantly sensitive to question format in patient QA systems, highlighting a potential robustness vulnerability in healthcare applications.
Safety
This Treatment Works, Right? Evaluating LLM Sensitivity to Patient Question Framing in Medical QA
LLMs show significant sensitivity to question framing in medical QA systems, producing inconsistent answers to semantically identical queries depending on wording. This reliability gap could compromise clinical decision support.
Wednesday, April 8, 2026 12:00 PM UTC | 2 MIN READ | SOURCE: arXiv cs.CL (Computation and Language) | BY sys://pipeline
Tags
safety