High-Throughput Phenotyping of the Symptoms of Alzheimer Disease and Related Dementias Using Large Language Models: Cross-Sectional Study

Published in JMIR AI, 2025

Recommended citation: Cheng, Y., Malekar, M., He, Y., Bommareddy, A., Magdamo, C., Singh, A., Westover, B., Mukerji, S. S., Dickson, J., & Das, S. (2025). High-Throughput Phenotyping of the Symptoms of Alzheimer Disease and Related Dementias Using Large Language Models: Cross-Sectional Study. JMIR AI, 4(1). https://doi.org/10.2196/66926

We developed an automated system that uses fine-tuned large language models (LLMs) to extract dementia-related symptoms from clinical records across seven domains: memory, executive function, motor, language, visuospatial, neuropsychiatric, and sleep. The system was trained and tested on electronic health records from Massachusetts General Hospital and evaluated against both traditional regular expression-based methods and brain MRI biomarkers. The LLM-based approach achieved areas under the receiver operating characteristic curve (AUROCs) of 0.97 to 0.99 per symptom domain, and it substantially outperformed the regular expression-based approach in predicting diagnoses of Alzheimer disease and related dementias (ADRD) (AUROC 0.83 vs. 0.59). Brain imaging validation confirmed that the extracted symptom patterns aligned with expected neurological findings. For example, smaller hippocampal volume was associated with memory impairments (OR 0.62, 95% CI 0.46–0.84; P=.006), and reduced pallidum size was associated with motor impairments (OR 0.73, 95% CI 0.58–0.90; P=.04). These results demonstrate the potential of LLM-based phenotyping tools to support clinical decision-making and enable large-scale dementia research.
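To make the comparison above more concrete, the following is a minimal sketch of what a regular expression-based baseline and an AUROC calculation might look like. It is not the study's implementation: the keyword patterns, the function names `flag_domains` and `auroc`, and the example notes are all hypothetical and chosen only for illustration. Only the seven domain names come from the summary above.

```python
import re

# Hypothetical keyword patterns, one per symptom domain named in the summary.
# The expressions actually used in the study are not given here.
DOMAIN_PATTERNS = {
    "memory": r"\b(forget\w*|memory (loss|problems?)|misplac\w*)\b",
    "executive function": r"\b(poor judgment|trouble planning|disorgani[sz]ed)\b",
    "motor": r"\b(tremors?|falls?|shuffling gait)\b",
    "language": r"\b(word[- ]finding|aphasia)\b",
    "visuospatial": r"\b(gets? lost|trouble navigating)\b",
    "neuropsychiatric": r"\b(hallucinat\w*|agitat\w*|apath\w*)\b",
    "sleep": r"\b(insomnia|sleep disturbances?)\b",
}
COMPILED = {
    domain: re.compile(pattern, re.IGNORECASE)
    for domain, pattern in DOMAIN_PATTERNS.items()
}


def flag_domains(note):
    """Return the set of symptom domains whose pattern matches the note text."""
    return {domain for domain, rx in COMPILED.items() if rx.search(note)}


def auroc(labels, scores):
    """Compute AUROC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With these hypothetical patterns, `flag_domains("Patient is forgetful and reports insomnia.")` returns the memory and sleep domains, but `flag_domains("Denies memory loss.")` also returns memory, because plain keyword matching ignores negation. That kind of context insensitivity is one plausible reason a keyword baseline could trail a fine-tuned LLM, although the summary above does not attribute the gap to any specific cause.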