Privacy-preserving Multiple Sequence Alignment Scheme for Long Gene Sequence

Authors: Yatong Jiang (Beihang University), Tao Shang (Beihang University), Jianwei Liu (Beihang University)

Volume: 2025
Issue: 1
Pages: 236–249
DOI: https://doi.org/10.56553/popets-2025-0014

Abstract: Multiple sequence alignment of genes is crucial for genomic data analysis and forms the basis for studying the biological significance of genomic data. The digitization of genomic data allows collaborative analysis on cloud platforms, improving the efficiency and precision of genomic research. However, gene sequences contain sensitive information, and unauthorized access poses a risk of privacy leakage. Balancing privacy, accuracy, and efficiency in multiple sequence alignment for long gene sequences remains a challenge. In this paper, we propose a distributed privacy-preserving multiple sequence alignment scheme for long sequences based on secure multi-party computation. Our scheme includes a method for segmenting long sequences to achieve partially distributed computing and a privacy-preserving method for calculating the edit distance among subsequences using secret sharing. The scheme consists of a distributed computing phase and an aggregate computing phase, and it improves efficiency by skipping the alignment of repeated subsequences. Our proposed scheme achieves accurate and efficient privacy-preserving alignment for long gene sequences.
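
The building blocks named in the abstract (segmentation, edit distance, secret sharing, and skipping repeated subsequences) can be illustrated with a minimal Python sketch. This is not the paper's protocol: the edit distance is computed in the clear here, whereas the scheme computes it under secure multi-party computation. The fixed-length segmentation, the modulus, the three-party setting, and all function names are assumptions made for illustration. The sum of per-segment distances is also only a rough proxy, not the true edit distance of the full sequences.

```python
import secrets

MODULUS = 2**61 - 1  # assumed share modulus, chosen only for illustration


def segment(seq, length):
    """Split seq into consecutive chunks of at most `length` characters."""
    return [seq[i:i + length] for i in range(0, len(seq), length)]


def edit_distance(a, b):
    """Classic Levenshtein distance, computed in the clear (no privacy)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def share(value, parties=3):
    """Additively share `value` among `parties` modulo MODULUS."""
    parts = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    parts.append((value - sum(parts)) % MODULUS)
    return parts


def reconstruct(parts):
    """Recover the shared value by summing all shares modulo MODULUS."""
    return sum(parts) % MODULUS


def shared_total_distance(x, y, length, parties=3):
    """Return per-party shares of the summed segment-wise distances.

    Assumes x and y have equal length. Repeated segment pairs are
    computed once and cached, loosely mirroring the idea of skipping
    repeated subsequence alignments.
    """
    cache = {}
    totals = [0] * parties
    for sx, sy in zip(segment(x, length), segment(y, length)):
        if (sx, sy) not in cache:
            cache[(sx, sy)] = edit_distance(sx, sy)
        # Each party adds its share locally; no single share reveals the value.
        for p, s in enumerate(share(cache[(sx, sy)], parties)):
            totals[p] = (totals[p] + s) % MODULUS
    return totals
```

Calling `reconstruct(shared_total_distance("ACGTACGT", "ACGAACGA", 4))` recombines the per-party totals into the summed segment distance. Each individual share on its own is a uniformly random value modulo `MODULUS`.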

Keywords: Multiple Sequence Alignment, Secure Multi-Party Computation, Cloud Platform, Privacy-preserving, Distributed computing

Copyright in PoPETs articles is held by their authors. This article is published under a Creative Commons Attribution 4.0 license.