Personal information inference from voice recordings: User awareness and privacy concerns

Jacob Leon Kröger; Leon Gellrich; Sebastian Pape; Saba Rebecca Brause; Stefan Ullrich

Authors: Jacob Leon Kröger (Weizenbaum Institute for the Networked Society, Technische Universität Berlin, Germany), Leon Gellrich (Universität Potsdam, Germany), Sebastian Pape (Goethe Universität, Frankfurt, Germany), Saba Rebecca Brause (Weizenbaum Institute for the Networked Society, TU Berlin, Germany), Stefan Ullrich (Weizenbaum Institute for the Networked Society, TU Berlin, Germany)

Volume: 2022
Issue: 1
Pages: 6–27
DOI: https://doi.org/10.2478/popets-2022-0002

Abstract: Through voice characteristics and manner of expression, even seemingly benign voice recordings can reveal sensitive attributes about a recorded speaker (e.g., geographical origin, health status, personality). We conducted a nationally representative survey in the UK (n = 683, 18–69 years) to investigate people’s awareness of the inferential power of voice and speech analysis. Our results show that, while awareness levels vary between different categories of inferred information, there is generally low awareness across all participant demographics, even among participants with professional experience in computer science, data mining, and IT security. For instance, only 18.7% of participants are at least somewhat aware that physical and mental health information can be inferred from voice recordings. Many participants have rarely (28.4%) or never (42.5%) even thought about the possibility of personal information being inferred from speech data. After a short educational video on the topic, participants express only moderate privacy concern. However, based on an analysis of open text responses, unconcerned reactions seem to be largely explained by knowledge gaps about possible data misuses. Watching the educational video lowered participants’ intention to use voice-enabled devices. In discussing the regulatory implications of our findings, we challenge the notion of “informed consent” to data processing. We also argue that inferences about individuals need to be legally recognized as personal data and protected accordingly.

Keywords: privacy, voice recording, speech, microphone, voice assistant, smart speaker, inference attack

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license.