These audio samples were recovered by VibSpeech from captured narrowband vibration signals. The experimental setup is described in Section 7.1.
Raw (NB) refers to the raw narrowband signal derived from the mmWave signals.
Reconstructed (WB) refers to the wideband audio reconstructed by VibSpeech.
Ground-Truth refers to the original audio played by the loudspeaker.
| Sample | Raw (NB) | Reconstructed (WB) | Ground-Truth | Text |
|---|---|---|---|---|
| #1 | | | | Go Do You Hear |
| #2 | | | | We Are Glad To Welcome His Gospel |
| #3 | | | | At This Moment The Whole Soul Of The Old Man Seemed Centred In His Eyes Which Became Bloodshot |
| #4 | | | | One Minute A Voice Said |
These samples demonstrate the performance of VibSpeech when the attacker does or does not have a short utterance from the victim for speaker embedding extraction.
Raw (NB) refers to the raw narrowband signal derived from the mmWave signals.
Reconstructed (#0) refers to the audio reconstructed without the speaker embeddings of Spk-A (female).
Reconstructed (#1) refers to the audio reconstructed with the speaker embeddings of Spk-A. Audio used for speaker embedding extraction: Audio#1
Reconstructed (#2) refers to the audio reconstructed with the speaker embeddings of Spk-A. Audio used for speaker embedding extraction: Audio#2
Ground-Truth refers to the original audio played by the loudspeaker.
| Speaker | Raw (NB) | Reconstructed (#0) | Reconstructed (#1) | Reconstructed (#2) | Ground-Truth | Text |
|---|---|---|---|---|---|---|
| Spk-A | | | | | | It's Delightful To Hear It In A London Theatre |
Raw (NB) refers to the raw narrowband signal derived from the mmWave signals.
Reconstructed (#0) refers to the audio reconstructed without the speaker embeddings of Spk-B (male).
Reconstructed (#1) refers to the audio reconstructed with the speaker embeddings of Spk-B. Audio used for speaker embedding extraction: Audio#1
Reconstructed (#2) refers to the audio reconstructed with the speaker embeddings of Spk-B. Audio used for speaker embedding extraction: Audio#2
Ground-Truth refers to the original audio played by the loudspeaker.
| Speaker | Raw (NB) | Reconstructed (#0) | Reconstructed (#1) | Reconstructed (#2) | Ground-Truth | Text |
|---|---|---|---|---|---|---|
| Spk-B | | | | | | There Was Nothing Said About The Sort Of Accommodation Which Would Be Provided |
These audio samples were recovered by applying VibSpeech to IMU-measured vibration signals (Section 8). The experimental setup is shown in Figure 23.
| Sample | Raw (NB) | Reconstructed (WB) | Ground-Truth | Text |
|---|---|---|---|---|
| #1 | | | | Go Do You Hear |
| #2 | | | | We Are Glad To Welcome His Gospel |
| #3 | | | | At This Moment The Whole Soul Of The Old Man Seemed Centred In His Eyes Which Became Bloodshot |
| #4 | | | | It Is Obviously Unnecessary For Us |