DyPS: Dynamic, Private and Secure GWAS

Authors: Túlio Pascoal (SnT, University of Luxembourg), Jérémie Decouchant (FSTM, University of Luxembourg), Antoine Boutet (University of Lyon, INSA Lyon, Inria, CITI), Paulo Esteves-Verissimo (KAUST - Resilient Computing and Cybersecurity Center (RC3); work partly performed while this author was with the University of Luxembourg)

Volume: 2021
Issue: 2
Pages: 214–234
DOI: https://doi.org/10.2478/popets-2021-0025

Abstract: Genome-Wide Association Studies (GWAS) identify the genomic variations that are statistically associated with a particular phenotype (e.g., a disease). The confidence in GWAS results increases with the number of genomes analyzed, which encourages federated computations where biocenters would periodically share the genomes they have sequenced. However, for economic and legal reasons, this collaboration will only happen if biocenters cannot learn each other’s data. In addition, GWAS releases should not jeopardize the privacy of the individuals whose genomes are used. We introduce DyPS, a novel framework to conduct dynamic privacy-preserving federated GWAS. DyPS leverages a Trusted Execution Environment to secure dynamic GWAS computations. Moreover, DyPS uses a scaling mechanism to speed up the releases of GWAS results according to the evolving number of genomes used in the study, even if individuals retract their participation consent. Lastly, DyPS tolerates up to all-but-one colluding biocenters without privacy leaks. We implemented and extensively evaluated DyPS through several scenarios involving more than 6 million simulated genomes and up to 35,000 real genomes. Our evaluation shows that DyPS updates test statistics with a reasonable additional request processing delay (11% longer) compared to an approach that would update them with minimal delay but would leave 8% of the genomes unprotected. In addition, DyPS can release the same amount of aggregate statistics as a static release (i.e., one made at the end of the study), but can produce up to 2.6 times more statistical information during earlier dynamic releases. We also show that DyPS can support a larger number of genomes and SNP positions without any significant performance penalty.

Keywords: Federated GWAS, Genomic privacy, Dynamic workload, Collusion resistance

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 license.