Manuel Eichelberger and Simon Tanner, two ETH doctoral students, store data in music. This means, for example, that background music can contain the access data for the local Wi-Fi network, and a mobile phone's built-in microphone can receive this data. "That would be handy in a hotel room," Tanner says, "since guests would get access to the hotel Wi-Fi without having to enter a password on their device."
To store the data, the two doctoral students and their colleague, Master's student Gabriel Voirol, make minimal changes to the music. The researchers state that, in contrast to other scientists' attempts in recent years, their new approach allows higher data transfer rates with no audible effect on the music. "Our goal was to ensure that there was no impact on listening pleasure," Eichelberger says.
Tests the researchers have conducted show that in ideal conditions, their technique can transfer up to 400 bits per second without the average listener noticing the difference between the source music and the modified version. Because realistic conditions require a degree of redundancy to guarantee transmission quality, the practical transfer rate is more likely to be around 200 bits - or around 25 letters - per second. "In theory, it would be possible to transmit data much faster. But the higher the transfer rate, the sooner the data becomes perceptible as interfering sound, or data quality suffers," Tanner adds.
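To put these rates in perspective, the short Python sketch below works through the arithmetic. The two bit rates come from the figures quoted above; the 8 bits per letter is an assumption inferred from "200 bits - or around 25 letters", and the function names are hypothetical, chosen only for this illustration.

```python
IDEAL_BITS_PER_SECOND = 400      # quoted rate under ideal conditions
REALISTIC_BITS_PER_SECOND = 200  # quoted rate once redundancy is added
BITS_PER_LETTER = 8              # assumption: 200 bits/s over 25 letters/s


def letters_per_second(bits_per_second, bits_per_letter=BITS_PER_LETTER):
    """Convert a bit rate into letters per second."""
    return bits_per_second / bits_per_letter


def transfer_seconds(message, bits_per_second=REALISTIC_BITS_PER_SECOND):
    """Estimate how long an ASCII message takes to transmit at the given rate."""
    payload_bits = len(message.encode("ascii")) * BITS_PER_LETTER
    return payload_bits / bits_per_second


if __name__ == "__main__":
    print(letters_per_second(REALISTIC_BITS_PER_SECOND))
    print(transfer_seconds("hotel-guest-2019"))
```

Under these assumptions, a 16-character Wi-Fi password amounts to 128 bits and would take roughly two thirds of a second at the realistic rate, which is why short access credentials are a natural fit for the technique.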
Dominant notes hide information
The researchers from ETH Zurich's Computer Engineering and Networks Laboratory use the dominant notes in a piece of music, overlaying each of them with two marginally lower and two marginally higher notes that are quieter than the dominant note. They also make use of the harmonics (one or more octaves higher) of the strongest note, inserting slightly lower and higher notes here, too. It is these additional notes that carry the data. While a smartphone can receive and analyse this data via its built-in microphone, the human ear doesn't perceive the additional notes.
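The overlay described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the researchers' implementation: the detuning offsets, the relative loudness of the side tones, the number of harmonics and the on/off keying of bits are all hypothetical choices, since the article gives no concrete values or modulation scheme.

```python
import math

# Hypothetical parameters for illustration; the article does not specify them.
OFFSETS = (-0.04, -0.02, 0.02, 0.04)  # two slightly lower, two slightly higher tones
OCTAVES = (1, 2, 4)                   # the dominant note and harmonics one and two octaves up
RELATIVE_LEVEL = 0.1                  # side tones at 10% of the dominant amplitude


def carrier_frequencies(dominant_hz):
    """Return the side-tone frequencies around the dominant note and its harmonics."""
    return [dominant_hz * octave * (1 + offset)
            for octave in OCTAVES
            for offset in OFFSETS]


def side_tones(dominant_hz, dominant_amp, bits):
    """Map bits to quiet side tones: a 1 switches a tone on, a 0 leaves it out.

    On/off keying is an assumption of this sketch, not the authors' modulation.
    """
    carriers = carrier_frequencies(dominant_hz)
    if len(bits) > len(carriers):
        raise ValueError("more bits than carriers in this frame")
    return [(freq, dominant_amp * RELATIVE_LEVEL)
            for freq, bit in zip(carriers, bits) if bit]


def synthesize(dominant_hz, dominant_amp, bits, duration_s=0.05, sample_rate=44100):
    """Return samples of the dominant note plus the data-carrying side tones."""
    tones = [(dominant_hz, dominant_amp)] + side_tones(dominant_hz, dominant_amp, bits)
    n = int(duration_s * sample_rate)
    return [sum(amp * math.sin(2 * math.pi * freq * i / sample_rate)
                for freq, amp in tones)
            for i in range(n)]
```

With these illustrative settings, a single 440 Hz dominant note offers twelve carrier slots per frame: four around the note itself and four around each of the two harmonics. A real encoder would first have to detect the dominant notes in the music rather than receive them as an argument.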
"When we hear a loud note, we don't notice quieter notes with a slightly higher or lower frequency," Eichelberger says. "That means we can use the dominant, loud notes in a piece of music to hide the acoustic data transfer." It follows that the best music for this kind of data transfer has lots of dominant notes - pop songs, for instance. Quiet music is less suitable.
To tell the decoder algorithm in the smartphone where it needs to look for data, the scientists use very high notes that the human ear can barely register: they replace the music in the frequency range 9.8-10 kHz with an acoustic data stream that carries the information on when and where across the rest of the music's frequency spectrum to find the data being transmitted.
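As a rough illustration of replacing one frequency band with a data stream, the Python sketch below clears the 9.8-10 kHz bins of a short audio frame and writes bits into them. Only the band limits come from the text above; the frame size, the one-bit-per-bin on/off keying, the naive DFT and all function names are assumptions made for this example, and the article does not describe how the researchers actually encode this control information.

```python
import cmath
import math

SAMPLE_RATE = 44100
FRAME = 441               # assumed frame size; bin spacing is 44100 / 441 = 100 Hz
BAND_HZ = (9800, 10000)   # control band named in the article


def dft(samples):
    """Naive discrete Fourier transform; adequate for one short frame."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse transform, returning the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def band_bins():
    """DFT bin indices that fall inside the control band (98, 99 and 100 here)."""
    lo, hi = (round(f * FRAME / SAMPLE_RATE) for f in BAND_HZ)
    return list(range(lo, hi + 1))


def replace_band(frame, bits, level=0.05):
    """Clear the control band, then write one bit per bin (assumed on/off keying)."""
    spectrum = dft(frame)
    for i, k in enumerate(band_bins()):
        bit = bits[i] if i < len(bits) else 0
        value = level * FRAME / 2 if bit else 0.0
        spectrum[k] = value
        spectrum[FRAME - k] = value  # mirror bin keeps the output signal real
    return idft(spectrum)


def read_band(frame, level=0.05):
    """Recover the bits by thresholding the magnitude of each control-band bin."""
    spectrum = dft(frame)
    threshold = level * FRAME / 4
    return [1 if abs(spectrum[k]) > threshold else 0 for k in band_bins()]
```

A production decoder would use a fast Fourier transform and would have to cope with loudspeaker, room and microphone distortion; this version only shows the principle of swapping the music in a narrow band for data.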
From the loudspeaker to the mic
The transmission principle behind this technique is fundamentally different from the well-known RDS system as used in car radios to transmit the radio station's name and details of the music that is playing. "With RDS, the data is transmitted using FM radio waves. In other words, data is sent from the FM transmitter to the radio device," Tanner explains. "What we're doing is embedding the data in the music itself - transmitting data from the loudspeaker to the mic."
Reference
Eichelberger M, Tanner S, Voirol G, Wattenhofer R: Imperceptible Audio Communication. 44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, 12-17 May 2019