ScrambleDB: Oblivious (Chameleon) Pseudonymization-as-a-Service

Anja Lehmann

ScrambleDB: Oblivious (Chameleon) Pseudonymization-as-a-Service

Authors: Anja Lehmann (IBM Research – Zurich)

Volume: 2019
Issue: 3
Pages: 289–309
DOI: https://doi.org/10.2478/popets-2019-0048

Abstract: Pseudonymization is a widely deployed technique to de-sensitize data sets by consistently replacing identifying attributes with non-sensitive surrogates. However, all existing solutions are impractical to deploy in settings where data is accumulated from distributed sources: they either require sharing the same secret key with all sources, or rely on a fully trusted service to consistently compute these pseudonyms. Further, the consistency of pseudonyms, which is required to maintain the data’s utility, comes with inherent and severe privacy limitations. This paper solves the key management and privacy challenges by introducing oblivious pseudonymization-as-a-service. Therein, the pseudonymization is outsourced to a central, yet fully oblivious entity, i.e., the service neither learns the sensitive information nor the pseudonyms it produces. Further, to obtain better privacy we no longer require pseudonyms to be computed consistently and instead introduce a dedicated join procedure. When data is stored at rest, all data is pseudonymized in a fully unlinkable manner. Only when certain subsets of the data are needed, the linkage is established through a controlled and nontransitive join operation. We formally define the desired security properties in the UC framework and propose a generic protocol that provably satisfies them. The core of our scheme is a 3-party oblivious and convertible PRF, which we believe to be of independent interest.

Keywords: pseudonymization, privacy, OPRF

Copyright in PoPETs articles are held by their authors. This article is published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license.