Privacy-Preserving Membership Queries for Federated Anomaly Detection

Authors: Jelle Vos (Delft University of Technology), Sikha Pentyala (University of Washington Tacoma), Steven Golob (University of Washington Tacoma), Ricardo Maia (University of Brasília), Dean Kelley (University of Washington Tacoma), Zekeriya Erkin (Delft University of Technology), Martine De Cock (University of Washington Tacoma, Ghent University), Anderson Nascimento (University of Washington Tacoma)

Volume: 2024
Issue: 3
Pages: 186–201
DOI: https://doi.org/10.56553/popets-2024-0074

Abstract: In this work, we propose a new privacy-preserving membership query protocol that lets a centralized entity privately query datasets held by one or more other parties to check if they contain a given element. This protocol, based on elliptic curve-based ElGamal and oblivious key-value stores, ensures that those 'data-augmenting' parties only have to send their encrypted data to the centralized entity once, making the protocol particularly efficient when the centralized entity repeatedly queries the same sets of data. We apply this protocol to detect anomalies in cross-silo federations. Data anomalies across such cross-silo federations are challenging to detect because (1) the centralized entities have little knowledge of the actual users, (2) the data-augmenting entities do not have a global view of the system, and (3) privacy concerns and regulations prevent pooling all the data. Our protocol allows for anomaly detection even in strongly separated distributed systems while protecting users' privacy. Specifically, we propose a cross-silo federated architecture in which a centralized entity (the backbone) has labeled data to train a machine learning model for detecting anomalous instances. The other entities in the federation are data-augmenting clients (the user-facing entities) who collaborate with the centralized entity to extract feature values to improve the utility of the model. These feature values are computed using our privacy-preserving membership query protocol. The model can be trained with an off-the-shelf machine learning algorithm that provides differential privacy to prevent it from memorizing instances from the training data, thereby providing output privacy. However, it is not straightforward to also efficiently provide input privacy, which ensures that none of the entities in the federation ever see the data of other entities in an unencrypted form. 
We demonstrate the effectiveness of our approach in the financial domain, motivated by the PETs Prize Challenge, which is a collaborative effort between the US and UK governments to combat international fraudulent transactions. We show that the private queries significantly increase the precision and recall of the otherwise centralized system and argue that this improvement translates to other use cases as well.

Keywords: federated learning, anomaly detection, ElGamal encryption, oblivious key-value stores, differential privacy

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.