Unsupervised Deep Learning of Electronic Health Records to Characterize Heterogeneity Across Alzheimer Disease and Related Dementias: Cross-Sectional Study
Published in JMIR Aging, 2025
Recommended citation: West, M., Cheng, Y., He, Y., Leng, Y., Magdamo, C., Hyman, B. T., Dickson, J. R., Serrano-Pozo, A., Blacker, D., & Das, S. (2025). Unsupervised Deep Learning of Electronic Health Records to Characterize Heterogeneity Across Alzheimer Disease and Related Dementias: Cross-Sectional Study. JMIR Aging, 8(1). https://doi.org/10.2196/65178
We applied unsupervised deep learning to electronic health records (EHRs) from 3,454 memory clinic patients at Massachusetts General Hospital to identify subtypes of Alzheimer disease and related dementias (ADRD). Using both structured ICD diagnostic codes and large language model-derived embeddings of clinical notes, we identified patient clusters with distinct clinical profiles. Two ADRD subtypes appeared consistently across both data types: one characterized by psychiatric manifestations, with higher female prevalence (1.59×), and another characterized by cardiovascular and motor complications, with higher male prevalence (1.75×). These findings show that combining EHR data modalities can reveal clinically meaningful disease heterogeneity, with potential applications to precision medicine in dementia care.
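
This summary does not say how the sex prevalence multipliers (1.59× and 1.75×) were computed. As a minimal sketch of one plausible reading, the snippet below takes the share of a group inside a cluster and divides it by that group's share among all other patients. The function name `prevalence_ratio`, the toy labels, and the choice of comparison group are assumptions for illustration only, not the study's method or data.

```python
def prevalence_ratio(labels, sexes, cluster, group="F"):
    """Share of `group` inside `cluster` divided by its share outside it.

    Hypothetical helper: the comparison baseline (all other patients)
    is an assumption, not taken from the paper.
    """
    inside = [s for l, s in zip(labels, sexes) if l == cluster]
    outside = [s for l, s in zip(labels, sexes) if l != cluster]
    if not inside or not outside:
        raise ValueError("cluster and its complement must both be non-empty")

    p_in = inside.count(group) / len(inside)
    p_out = outside.count(group) / len(outside)
    if p_out == 0:
        raise ValueError("group absent outside the cluster; ratio undefined")
    return p_in / p_out


# Toy example (made-up data): cluster assignments and recorded sex per patient.
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
sexes = ["F", "F", "F", "M", "F", "M", "M", "M", "F", "M"]

ratio = prevalence_ratio(labels, sexes, cluster=0)
print(f"Female prevalence ratio for cluster 0: {ratio:.2f}x")
```

Under this reading, a value above 1 means the group is over-represented in the cluster relative to the rest of the cohort. Other definitions, such as a female-to-male ratio within the cluster, would give different numbers.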
