Growing synthetic data through differentially-private vine copulas

Authors: Sébastien Gambs (UQAM), Frédéric Ladouceur (Ericsson Montréal), Antoine Laurent (UQAM), Alexandre Roy-Gaumond (UQAM)

Volume: 2021
Issue: 3
Pages: 122–141
DOI: https://doi.org/10.2478/popets-2021-0040

Abstract: In this work, we propose a novel approach to data synthesis based on copulas, which are interpretable and robust models used extensively in the actuarial domain. More precisely, our method, COPULA-SHIRLEY, is based on the differentially-private training of vine copulas, a family of copulas that can model and generate data of arbitrary dimension. The COPULA-SHIRLEY framework is simple yet flexible: it can be applied to many types of data while preserving utility, as demonstrated by experiments conducted on real datasets. We also evaluate the protection level of our data synthesis method through a membership inference attack recently proposed in the literature.

Keywords: Synthetic data, Copulas, Differential privacy, Privacy evaluation.

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution-NonCommercial-NoDerivs license.