Where we live and work, our age, and the conditions we grew up in can influence our health and lead to disparities, but these factors can be difficult for clinicians and researchers to capture and address. A new study by investigators from Mass General Brigham demonstrates that large language models (LLMs), a type of generative artificial intelligence (AI), can be trained to automatically extract information on social determinants of health (SDoH) from clinicians’ notes, which could augment efforts to identify patients who may benefit from resource support. Findings, published in npj Digital Medicine, show that the fine-tuned models could identify 93.8 percent of patients with adverse SDoH, whereas official diagnostic codes included this information in only 2 percent of cases. These specialized models were also less prone to bias than generalist models such as GPT-4.
“Our goal is to identify patients who could benefit from resource and social work support, and to draw attention to the under-documented impact of social factors on health outcomes,” said corresponding author Danielle Bitterman, MD, a faculty member in the Artificial Intelligence in Medicine (AIM) Program at Mass General Brigham and a physician in the Department of Radiation Oncology at Brigham and Women’s Hospital. “Algorithms that can pass major medical exams have received a lot of attention, but this is not what doctors need in the clinic to help take better care of patients each day. Algorithms that can notice things that doctors may miss in the ever-increasing volume of medical records will be more clinically relevant and therefore more powerful for improving health.”
Health disparities are widely recognized to be linked to SDoH, which include employment, housing and other non-medical circumstances that impact medical care. For example, the distance a cancer patient lives from a major medical center or the support they have from a partner can substantially influence outcomes. While clinicians may summarize relevant SDoH in their visit notes, this vital information is rarely systematically organized in the electronic health record (EHR).
The emergence of AI tools in health has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation’s top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support and administrative processes.
To create LLMs capable of extracting information on SDoH, the researchers manually reviewed 800 clinician notes from 770 patients with cancer who received radiotherapy in the Department of Radiation Oncology at Brigham and Women’s Hospital. They tagged sentences that referred to one or more of six predetermined SDoH: employment status, housing, transportation, parental status (whether the patient has a child under 18 years old), relationships, and presence or absence of social support.
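As an illustration of this annotation scheme, sentence-level labels for the six categories might be represented as below. The category names follow the six SDoH listed above, but the data layout and example sentence are hypothetical, not drawn from the study data.

```python
# Hypothetical representation of the sentence-level SDoH annotation scheme.
# Category names mirror the six SDoH described above; the example sentence
# is invented, not taken from any patient record.

SDOH_CATEGORIES = [
    "EMPLOYMENT",      # employment status
    "HOUSING",         # housing issues
    "TRANSPORTATION",  # transportation issues
    "PARENT",          # has a child under 18 years old
    "RELATIONSHIP",    # relationship/partner status
    "SOCIAL_SUPPORT",  # presence or absence of social support
]

# Each sentence in a note is tagged with zero or more categories.
annotated_example = {
    "sentence": "Patient lives alone, but her daughter drives her to appointments.",
    "labels": ["SOCIAL_SUPPORT", "TRANSPORTATION"],
}
```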
Using this “annotated” dataset, the researchers trained existing LLMs to identify references to SDoH in clinician notes. They then tested the models on an additional 400 clinic notes from patients treated with immunotherapy at Dana-Farber Cancer Institute and from patients admitted to the critical care units at Beth Israel Deaconess Medical Center.
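To make the fine-tuning step concrete, here is a minimal sketch of training a Flan-T5 model to tag SDoH mentions as a text-to-text task using the Hugging Face transformers library. The prompt prefix, toy examples, and hyperparameters are assumptions made for illustration; they are not the study’s actual training configuration.

```python
# Minimal sketch of fine-tuning Flan-T5 to tag SDoH mentions, framed as a
# text-to-text task: the input is a note sentence, the target is a label
# string. Prompt prefix, toy data, and hyperparameters are illustrative
# assumptions, not the study's training setup.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy stand-ins for the annotated sentences; real training would use the
# manually reviewed clinician notes described above.
train_data = Dataset.from_list([
    {"sentence": "She was laid off from her job last month.",
     "target": "EMPLOYMENT"},
    {"sentence": "Follow-up CT scan scheduled in three months.",
     "target": "NO_SDOH"},
])

def preprocess(batch):
    # Tokenize the prompted input sentences and the label strings.
    inputs = tokenizer(["Classify SDoH: " + s for s in batch["sentence"]],
                       truncation=True)
    inputs["labels"] = tokenizer(batch["target"], truncation=True)["input_ids"]
    return inputs

tokenized = train_data.map(preprocess, batched=True,
                           remove_columns=train_data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="sdoh-flan-t5", num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
trainer.save_model("sdoh-flan-t5")
tokenizer.save_pretrained("sdoh-flan-t5")
```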
The researchers found that fine-tuned LLMs, especially the Flan-T5 models, could consistently identify rare references to SDoH in clinician notes. The “learning capacity” of these models was limited by how rarely SDoH were documented in the training set: only 3 percent of sentences in the clinician notes contained any mention of SDoH. To address this scarcity, the researchers used ChatGPT, another LLM, to produce an additional 900 synthetic examples of SDoH sentences that served as supplementary training data.
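A sketch of this synthetic-data step, using a ChatGPT model through the OpenAI API, might look like the following. The prompt wording and target category are assumptions for illustration; the study’s actual prompts may differ.

```python
# Sketch of generating synthetic SDoH training sentences with a ChatGPT
# model via the OpenAI API. The prompt is an illustrative assumption.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Write 10 varied, de-identified sentences in the style of an oncology "
    "clinician's note that mention a patient's housing situation."
)
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)

# Collect the generated sentences, one per non-empty line.
synthetic_sentences = [
    line.strip()
    for line in response.choices[0].message.content.splitlines()
    if line.strip()
]
print(synthetic_sentences)
```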
A major criticism of generative AI models in health care is that they can perpetuate bias and widen health disparities. The researchers found that their fine-tuned LLMs were less likely than OpenAI’s GPT-4, a generalist LLM, to change their determination about an SDoH based on an individual’s race/ethnicity or gender. The researchers state that it is difficult to understand exactly how biases are formed and deconstructed, both in humans and in computer models. Understanding the origins of algorithmic bias is an ongoing endeavor for the researchers.
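The kind of counterfactual check described above can be sketched as follows: inject different demographic descriptors into an otherwise identical sentence and test whether the model’s SDoH determination changes. The sentence template, descriptor list, and the sdoh-flan-t5 model directory (from the fine-tuning sketch earlier) are illustrative assumptions, not the study’s actual evaluation setup.

```python
# Sketch of a counterfactual bias probe: the same sentence with different
# demographic descriptors should receive the same SDoH label. The model
# directory "sdoh-flan-t5" refers to the hypothetical fine-tuned model
# saved in the earlier sketch.
from transformers import pipeline

generator = pipeline("text2text-generation", model="sdoh-flan-t5")

def classify_sdoh(sentence: str) -> str:
    """Return the model's SDoH label string for a sentence."""
    return generator("Classify SDoH: " + sentence)[0]["generated_text"]

TEMPLATE = "The patient is a {descriptor} who lives alone and has no family nearby."
DESCRIPTORS = [
    "Black woman", "Black man", "white woman", "white man",
    "Hispanic woman", "Hispanic man", "Asian woman", "Asian man",
]

labels = {d: classify_sdoh(TEMPLATE.format(descriptor=d)) for d in DESCRIPTORS}

# An unbiased model returns the same label for every demographic variant.
if len(set(labels.values())) > 1:
    print("Determination changed with demographics:", labels)
```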
“If we don’t monitor algorithmic bias when we develop and implement large language models, we could make existing health disparities much worse than they currently are,” Bitterman said. “This study demonstrated that fine-tuning LLMs may be a strategy to reduce algorithmic bias, but more research is needed in this area.”
Authorship: Mass General Brigham co-authors include co-first authors Marco Guevara, MS, and Shan Chen, MS, of the AIM Program at Mass General Brigham and the Department of Radiation Oncology at the Brigham; Spencer Thomas, Tafadzwa L. Chaunzwa, Benjamin H. Kann, Jack M. Qian, Hugo JWL Aerts and Raymond H. Mak (Brigham and Women’s Hospital and AIM Program); and Idalid Franco and Shalini Moningi (AIM Program, Brigham and Women’s Hospital). Additional co-authors include Madeleine Goldstein, Susan Harper, Paul J. Catalano and Guergana K. Savova.
Disclosures: Bitterman is an Associate Editor of Radiation Oncology, HemOnc.org and receives funding from the American Association for Cancer Research. Chen and Guevara report no disclosures. A complete list of disclosures is included in the paper.
Funding: Bitterman received funding for this work from the Woods Foundation, Jay Harris Junior Faculty Award, Joint Center for Radiation Therapy Foundation and National Institutes of Health (U54CA274516-01A1). A complete list of funding sources is included in the paper.
Paper cited: Guevara M, et al. “Large Language Models to Identify Social Determinants of Health in Electronic Health Records.” npj Digital Medicine. DOI: 10.1038/s41746-023-00970-0
For More Information:
- Artificial Intelligence in Medicine Program
- Data Science Office
- 2024 Predictions about Artificial Intelligence
- ChatGPT Shows Limited Ability to Recommend Guidelines-Based Cancer Treatments
- Study Finds Limitations to CPR Directions Given by Artificial Intelligence Voice Assistants, Recommends Use of Emergency Services
- ChatGPT Shows ‘Impressive’ Accuracy in Clinical Decision Making
###
About Mass General Brigham
Mass General Brigham is an integrated academic health care system, uniting great minds to solve the hardest problems in medicine for our communities and the world. Mass General Brigham connects a full continuum of care across a system of academic medical centers, community and specialty hospitals, a health insurance plan, physician networks, community health centers, home care, and long-term care services. Mass General Brigham is a nonprofit organization committed to patient care, research, teaching, and service to the community. In addition, Mass General Brigham is one of the nation’s leading biomedical research organizations with several Harvard Medical School teaching hospitals. For more information, please visit massgeneralbrigham.org.
Journal
npj Digital Medicine
Method of Research
Computational simulation/modeling
Subject of Research
People
Article Title
Large Language Models to Identify Social Determinants of Health in Electronic Health Records
Article Publication Date
11-Jan-2024