News Release

BU researchers develop computational tools to safeguard privacy without degrading voice-based cognitive markers

Peer-Reviewed Publication

Boston University School of Medicine

(Boston)—Digital voice recordings contain valuable information that can indicate an individual’s cognitive health, offering a non-invasive and efficient method for assessment. Research has demonstrated that digital voice measures can detect early signs of cognitive decline by analyzing features such as speech rate, articulation, pitch variation and pauses, which may signal cognitive impairment when they deviate from normative patterns.
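
The kinds of acoustic features described above can be computed directly from a recording with standard open-source tools. The sketch below is a rough illustration rather than the study's pipeline: it uses the librosa library to estimate pitch variation and a crude pause count from a hypothetical file "voice_sample.wav"; the silence threshold is an arbitrary assumption.

```python
# Illustrative sketch only (not the study's pipeline): estimate pitch variation
# and a crude pause count. File name and thresholds are hypothetical assumptions.
import numpy as np
import librosa

y, sr = librosa.load("voice_sample.wav", sr=16000)  # hypothetical input file

# Track the fundamental frequency (pitch); summarize variation as its standard deviation.
f0, voiced_flag, voiced_probs = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C7"), sr=sr
)
pitch_variation_hz = np.nanstd(f0)

# Crude pause detection: count transitions into low-energy (silent) frames.
rms = librosa.feature.rms(y=y)[0]
silent = (rms < 0.01).astype(int)  # 0.01 is an illustrative threshold
pause_count = int(np.sum(np.diff(silent) == 1))

print(f"Pitch variation: {pitch_variation_hz:.1f} Hz, estimated pauses: {pause_count}")
```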

However, voice data introduces privacy challenges due to the personally identifiable information embedded in recordings, such as gender, accent and emotional state, as well as more subtle speech characteristics that can uniquely identify individuals. These risks are amplified when voice data is processed by automated systems, raising concerns about re-identification and potential misuse of data.

In a new study, researchers from Boston University Chobanian & Avedisian School of Medicine have introduced a computational framework that applies pitch-shifting, an audio processing technique that raises or lowers the pitch of a recording, to protect speaker identity while preserving the acoustic features essential for cognitive assessment.
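
Pitch-shifting itself is available in common audio libraries. The snippet below is a minimal sketch of the idea using librosa, not the authors' implementation; the input file and the four-semitone shift are illustrative assumptions.

```python
# Minimal pitch-shifting sketch (not the study's code); the file names and the
# 4-semitone shift are illustrative assumptions.
import librosa
import soundfile as sf

y, sr = librosa.load("voice_sample.wav", sr=16000)

# Raise the pitch by 4 semitones while leaving the duration unchanged.
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=4)

sf.write("voice_sample_pitch_shifted.wav", y_shifted, sr)
```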

“By leveraging techniques such as pitch-shifting as a means of voice obfuscation, we demonstrated the ability to mitigate privacy risks while preserving the diagnostic value of acoustic features,” explained corresponding author Vijaya B. Kolachalama, PhD, FAHA, associate professor of medicine.

Using data from the Framingham Heart Study (FHS) and DementiaBank Delaware (DBD), the researchers applied pitch-shifting at different levels and incorporated additional transformations, such as time-scale modification and noise addition, to alter vocal characteristics in recorded responses to neuropsychological tests. They then assessed speaker obfuscation via the equal error rate and diagnostic utility via the classification accuracy of machine learning models distinguishing cognitive states: normal cognition (NC), mild cognitive impairment (MCI) and dementia (DE).
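
The additional transformations and the privacy metric can be sketched in a few lines as well. The example below is a hedged illustration, not the published pipeline: time-scale modification and noise addition via librosa and NumPy, and an equal error rate computed with scikit-learn from made-up speaker-verification scores.

```python
# Hedged illustration of the additional transforms and the equal error rate (EER);
# the stretch factor, noise level, and verification scores are made-up assumptions.
import numpy as np
import librosa
from sklearn.metrics import roc_curve

y, sr = librosa.load("voice_sample.wav", sr=16000)

# Time-scale modification: play back 10% faster without changing pitch.
y_stretched = librosa.effects.time_stretch(y, rate=1.1)

# Noise addition: mix in low-level Gaussian noise.
y_obfuscated = y_stretched + 0.005 * np.random.randn(len(y_stretched))

# Equal error rate from speaker-verification trials: labels mark same-speaker pairs,
# scores are similarity values a verification model would output (hypothetical here).
labels = np.array([1, 1, 0, 0, 1, 0])
scores = np.array([0.90, 0.45, 0.50, 0.20, 0.70, 0.30])
fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
idx = np.nanargmin(np.abs(fnr - fpr))
eer = (fpr[idx] + fnr[idx]) / 2
print(f"EER: {eer:.2f}")  # a higher EER means speakers are harder to re-identify
```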

Using the obfuscated speech files, the computational framework differentiated NC, MCI and DE with 62% accuracy on the FHS dataset and 63% accuracy on the DBD dataset.

According to the researchers, this work contributes to the ethical and practical integration of voice data in medical analyses, emphasizing the importance of protecting patient privacy while maintaining the integrity of cognitive health assessments. “These findings pave the way for developing standardized, privacy-centric guidelines for future applications of voice-based assessments in clinical and research settings,” adds Kolachalama, who is also an associate professor of computer science, an affiliate faculty member of the Hariri Institute for Computing, and a founding member of the Faculty of Computing & Data Sciences at Boston University.

These findings appear online in Alzheimer's & Dementia: The Journal of the Alzheimer's Association.

This project was supported by grants from the National Institute on Aging’s Artificial Intelligence and Technology Collaboratories (P30-AG073104 and P30-AG073105), the American Heart Association (20SFRN35460031), Gates Ventures, and the National Institutes of Health (R01-HL159620, R01-AG062109, and R01-AG083735).

Note to Editors:

V.B.K. is a co-founder and equity holder of deepPath Inc. and CogniScreen, Inc. He also serves on the scientific advisory board of Altoida Inc. R.A. is a scientific advisor to Signant Health and Novo Nordisk.

