Multiparty Private Set Intersection Cardinality and Its Applications



INTRODUCTION
Secure multi-party computation (MPC) allows a set of parties to jointly carry out a distributed computation while ensuring correctness, privacy, and more. In this work, we study Private Set Intersection Cardinality (PSI-CA), a special case of MPC that allows multiple parties to compute the intersection size of their private sets without revealing additional information. PSI itself has been motivated by many real-world applications such as contact discovery [26]. Over the last several years, PSI has become truly practical, with extremely fast cryptographically secure implementations [11,22,35,38]. In the two-party setting, PSI with post-processing (a.k.a. circuit-based PSI), and especially PSI-CA, has recently drawn more attention with several applications, such as measuring the effectiveness of online advertising [25], limiting the spread of Child Sexual Abuse Material (CSAM) [7], and private contact tracing related to COVID-19 [6,15,18]. However, the state-of-the-art PSI-CA is only efficient in the two-party setting [15,18,25]. This work considers a natural generalization to the multi-party setting, which opens the opportunity for richer applications, like the two we showcase below. The state-of-the-art protocol for PSI-CA in the multi-party setting [10] relies on secret-shared computation [12], which might not scale well to a large number of parties. In this work, we present a scalable protocol for PSI-CA in the multi-party setting under the assumption that a particular subset of parties refrains from collusion. This is a reasonable assumption for real-life applications, especially when performance is critical enough that a weaker security guarantee is an acceptable trade-off. For example, in the COVID-19 heatmap computation with multiple mobile network operators, a large cloud computing company (e.g., Amazon AWS) can play the role of the server, and the health service run by the government can play the role of $P_1$. Statutes and regulations imposed upon large companies and governments reduce the likelihood of collusion among these parties. In practice, such participants can be chosen to play the role of the server or the leaders (i.e., $P_1$, $P_2$, or $P_n$, depending on the protocol) when invoking the multi-party protocols we propose.
Moreover, we present a new protocol, called DotProd, in which $n$ parties compute the sum of element-wise products of their binary vectors without revealing any additional information. Mathematically, suppose party $P_i$ holds the $N$-element vector $v_i$; then the parties obtain $\sum_{j=1}^{N} \prod_{i=1}^{n} v_i[j]$, where $v_i[j]$ is the $j$-th element of the vector $v_i$. Note that in the two-party case, the computation is exactly the dot product $v_1 \cdot v_2$. We demonstrate the efficiency of our protocols through two real-world applications: a COVID-19 heatmap computation based on PSI-CA and association rule learning (ARL) based on DotProd.
In the rest of this section, we present the related work on PSI-CA and its applications, followed by a technical overview of our protocols and their results. Section 2 presents the necessary preliminaries. Section 3 introduces two novel cryptographic gadgets, namely Server-Aided Shuffled OPRF and Server-Aided OPPRF. We discuss our PSI-CA protocols in Section 4 and explore practical applications in Section 5. Lastly, in Section 6, we evaluate the performance of our PSI-CA protocols and compare them with existing approaches.

State-of-the-Art for PSI Cardinality
Private Set Intersection Cardinality (PSI-CA) is a variant of PSI in which the parties learn the intersection size and nothing else. In this work, we also focus on server-aided PSI-CA constructions. By "server-aided", we refer to cases where the parties perform the PSI-CA computation with the help of one or more semi-honest cloud servers. To the best of our knowledge, this work proposes the first special-purpose PSI-CA protocols from symmetric-key techniques that work in the multi-party setting.
We start by discussing PSI-CA works in the two-party setting. Clearly, one can use circuit-based PSI [36] to implement PSI-CA. However, this generic solution is expensive due to the secure computation inside the circuit. Among the special-purpose two-party PSI-CA constructions, the works [13,25] extend the classic DH-based PSI protocol [32] to support two-party PSI-CA by having the sender shuffle the PRF values of its items before returning them to the receiver. Epione [43] also proposed a protocol that is suited to the unbalanced, client-server setting, in which the server has a large database of $n_1$ items and the client has a small database of $n_2$ items. The protocol, however, requires $O(n_1 + n_2)$ expensive public-key operations (group exponentiations). To improve the efficiency of two-party PSI-CA on the client's device, Catalic [18] proposes a delegated system in which the client (i.e., the PSI-CA receiver) can shift most of its PSI-CA computation to multiple untrusted servers while preserving privacy. However, the Catalic system requires at least two non-colluding cloud servers and incurs a heavy computation and communication cost. Based on an oblivious switching network (OSN), [21] proposes a two-party PSI-CA (so-called OSN-based PSI-CA) that is better than the circuit-based PSI-CA protocol [36] in terms of communication cost and running time in the WAN setting. However, it has both communication and computation complexity $O(m \log m)$ for a set size $m$ due to the expensive OSN construction. Dittmer et al. [15] introduce a variant of two-party PSI-CA (so-called weighted PSI-CA) in which each token of the client has an associated secret weight. The weighted PSI-CA is based on cheap Function Secret Sharing (FSS) constructions [8,9], so it is efficient on both the client's and the server's side. However, their construction assumes that there exist two non-colluding servers, each holding an identical input set. In Section 4, we show that a more straightforward version of the server-aided PSI [27] yields the fastest PSI-CA protocol in the two-party setting.
A multi-party PSI-CA protocol was first proposed by Kissner and Song [28]. The protocol of [28] is based on oblivious polynomial evaluation, which is implemented using additively homomorphic encryption. The basic idea is to represent a dataset as a polynomial whose roots are its elements, and to send the homomorphic encryptions of the coefficients to the other parties so that they can evaluate the encrypted polynomial on their inputs. The protocol of [28] has quadratic computation and communication complexity in both the size of the dataset and the number of parties.
Mohassel et al. [34] proposed a PSI-CA protocol that operates on secret-shared data in the honest-majority three-party setting. This differs from the setting in this paper, as we consider any number of parties and do not require the input to be in secret-shared form. However, one can extend the protocol of [34] to support multi-party PSI-CA by having all the parties secret-share their inputs to the three parties of [34], which then jointly compute the final output. We discuss the extension and compare the performance of our protocol with that of [34] in Section 6.2.
Chandran et al. [10] proposed an efficient PSI (not PSI-CA) protocol, which can be extended to circuit-based PSI. Hence, one could combine their extended protocol with a circuit that computes the size of the intersection to obtain a protocol for PSI-CA. At its technical core, [10] is built on the $n$-party secret-sharing functionalities introduced by [12]. Their use of a generic secure computation protocol for the specific problem of PSI-CA makes the extended protocol less attractive. In addition, [10] requires 10 interaction rounds, while our server-less multi-party PSI-CA protocol needs 4 rounds. We compare the performance of our protocols and [10] in Section 6.2.
Very recently, Fenske et al. [19] proposed an efficient, maliciously secure multi-party PSI-CA protocol in the outsourcing setting. Their approach makes use of $k$ servers, with the assumption that at least one of the servers does not collude with the other participants. When $k = 1$, their protocol is comparable to our server-aided one, as both require a non-colluding server. However, our protocol outperforms theirs in this scenario, as we employ symmetric-key operations while [19] relies heavily on additively homomorphic encryption. For instance, for $n = 8$ and $m = 2^{16}$, our semi-honest protocol computes PSI-CA within 3 seconds. In contrast, the malicious protocol [19] requires approximately 2 hours for $n = 5$ and $m = 30000$, as indicated in their Figure 10. This illustrates the running-time advantage of our approach. Note that the protocol of [19] only works in the server-aided setting, whereas in this work we also propose a way to work without such an entity, which we call the "server-less" setting.

Secure Dot Product and Its Applications
The dot product plays a key role in machine learning and data analysis tasks. Its implementation in a privacy-preserving setting remains expensive, as it requires either generating Beaver triples [5] or using fully homomorphic encryption (FHE). There is a long list of results for secure computation of the dot product, or of linear algebra in general [1,4,14,24,42,46,47]. For the applications that we consider in this paper, namely the COVID-19 heatmap and ARL, a dot product of sparse vectors is sufficient. Many algorithms for linear algebra operations, like matrix multiplication, leverage a priori knowledge that the operands are sparse, and sometimes these algorithms can even be computed securely without degrading their asymptotic complexity. Most of the above works, however, do not address the problem of the dot product in a setting where the vectors are sparse. The most relevant works to ours are [4,16,41,44,45].
To the best of our knowledge, Vaidya and Clifton [44] were the first to study secure computation of the scalar product of two $N$-element vectors in the two-party setting and its application to privacy-preserving association rule learning (ARL). Their dot product protocol relies heavily on public-key operations, and requires four communication rounds, communication complexity of $O(N)$, and computation complexity of $O(N^2)$.
Their follow-up work [45] is based on PSI, which makes the complexity dependent only on $t$, where $t$ is the upper bound on the Hamming weight of the vectors. They also propose a protocol for the multi-party setting, which requires a commutative one-way hash function so that the input from each party can be encrypted under a common set of keys. The resulting ciphertexts are the same if the original values are the same. Although efficient, their protocol introduces an undesirable leakage; specifically, it leaks the items in the intersection (rather than only their sum). Moreover, their protocol is insecure when the input domain is relatively small (e.g., of size $2^{30}$), as one party could easily perform a brute-force attack [37]. To handle the latter security issue, [16] studied two-party ARL and proposed a solution via PSI that is built on Goldwasser-Micali encryption [23] and Oblivious Bloom Intersection [17]. Their protocol still leaks the items in the intersection, and is much more expensive than the protocol we present in this paper. In addition, they did not consider an extension to the multi-party case.
Recently, Bampoulidis et al. [4] studied COVID-19 heatmap computation and proposed a secure dot product based on homomorphic encryption (HE) with several optimizations. However, the number of required HE operations is $O(N)$ (regardless of the Hamming weight of the vectors), which makes their protocol expensive. Schoppmann et al. [41] present efficient two-party protocols for several common sparse linear algebra operations, including sparse matrix-vector multiplication. The main building block of their protocols is a new functionality, the Read-Only Oblivious Map (ROOM). Using ROOM, the cost of secure matrix-vector multiplication depends only on the number of non-zero entries, instead of the operands' size. However, in all three ROOM constructions the parties invoke generic secure computation in order to obtain a secret-shared output. We compare the performance of our protocol to a ROOM-based dot product in Section 6.1.

Our Results and Techniques
Our PSI-CA Approach:
We present a new multi-party PSI-CA protocol paradigm under the assumption that a particular subset of the parties does not collude. We offer two variants of our protocol. The first protocol relies on a non-colluding semi-honest server that has no input. It is optimized for the number of communication rounds between parties; that is, the protocol leverages a star network topology, where parties mostly communicate with the server. The second protocol removes the need for a server by reducing the problem of $n$-party PSI-CA to the problem of server-aided $(n-1)$-party PSI-CA with the use of a semi-honest party $P_n$ who may have an input. The base case with $n = 2$ can be instantiated efficiently by the two-party server-aided PSI protocol of Kamara et al. [27]. However, [27] is only for PSI itself (not PSI-CA). We simplify their PSI protocol and present the server-aided two-party PSI-CA in Section 4.1.
The main building blocks of our multi-party PSI-CA protocols are the oblivious key-value store (OKVS) data structure [22] and/or the Oblivious Programmable PRF (OPPRF) [30]. To this end, we propose a very simple and efficient protocol for server-aided OPPRF, which we believe to be of independent interest. Our server-aided OPPRF is based on a two-party server-aided shuffled OPRF, a functionality we formally define in Section 3.1.
We provide an implementation of the server-aided and server-less variants of our PSI-CA approach for $n > 2$. To the best of our knowledge, this is the first 'special-purpose' implementation of multi-party PSI-CA from symmetric-key techniques that does not rely on generic secure computation. We find that multi-party PSI-CA is practical, by evaluating our protocols in settings with sets of a million items and 16 parties. The main reason for the efficiency of our protocol is its reliance on fast symmetric-key primitives. This is in contrast with prior multi-party PSI-CA protocols, which require expensive public-key operations for each item [28] or computation on secret-shared data [10].
Interestingly, the server-less PSI-CA variant is about 10× faster than the server-aided one. We consider the colluding model in the semi-honest setting, which is introduced in detail in Section 2.1. The two variants, however, offer different security guarantees. Specifically, the former (server-less PSI-CA) is secure in the presence of an adversary who may passively corrupt any subset of $\{P_3, \ldots, P_n\}$ or one of $P_1$, $P_2$, or $P_n$ (i.e., $P_1$, $P_2$, and $P_n$ are non-colluding). The latter (server-aided PSI-CA) is secure in the presence of an adversary who may passively corrupt any strict subset of $\{P_1, P_3, \ldots, P_n\}$ or $\{P_2, P_3, \ldots, P_n\}$ (i.e., $P_1$ and $P_2$ do not collude), or passively corrupt the cloud server C.
In some sense, one may view the server-less variant as a multi-server-aided PSI-CA in which the servers have private inputs. Hence, we can use our efficient server-aided OPPRF (instead of the two-party OPPRF [30]) in the server-less PSI-CA protocol, which may explain why it achieves better performance. In the server-less variant, we assign the non-colluding party $P_1$ the role of the server in the server-aided OPPRF protocol.
The security model employed in this work deviates from the commonly known concept of "threshold security". Rather, we adopt a specific but sufficiently general access structure, in which a designated subset of parties does not collude. Although this approach differs from the conventional notion of threshold security, we believe it can be used as a stepping stone toward achieving security in the 'standard' threshold access structure.
Note that in practice, a server-aided model can be reasonable: when performance is critical, it often makes sense to accept a weaker security guarantee in exchange for efficiency. For example, in the federated learning setting, there is a server and many clients, where the server helps train a machine learning model for the benefit of the clients. In this work, we motivate our protocols with two real-world applications in which using a non-colluding but semi-honest server makes complete sense. For example, in the COVID-19 heatmap computation, an established company (e.g., Google or Apple) can play the role of the server.

Our Multi-party Dot-Product of Binary Vectors (DotProd):
We propose a new protocol for computing the sum of element-wise products of $n$ sparse binary vectors (so-called multiple dot product, DotProd). Let us begin with the simpler case, where $n = 2$, known as the secure dot product. One would expect a solution for a dot product of $N$-element vectors to incur a communication overhead of at least $O(N)$, for the very fact that the parties need to first input those elements (which usually involves some sort of encryption or secret sharing of each element). In this work, we show that the communication and computation complexity is independent of $N$ and can be reduced to $O(t)$, where $t$ is the upper bound on the Hamming weight of the vectors. This improvement is significant when the vectors are sparse (i.e., $t = o(N)$).
For an $N$-element binary vector $v$ we define $\mathsf{idx}(v) = \{ i \in [N] \mid v[i] = 1 \}$ to be the set of non-zero indices in $v$. Suppose the receiver $P_0$ and the sender $P_1$ hold $N$-element sparse binary vectors $v_0$ and $v_1$, respectively. The vectors are sparse, with the number of non-zero elements bounded by $t = o(N)$. As a very simple warm-up, we consider a non-secure dot product computation with communication cost $O(t)$. Given the input vector $v_0$, the receiver computes $S_0 = \mathsf{idx}(v_0)$, and the sender computes $S_1 = \mathsf{idx}(v_1)$. The sender then sends $S_1$ to the receiver, who computes the dot product $v_0 \cdot v_1$ by computing the intersection $S_0 \cap S_1$ and outputting its cardinality $|S_0 \cap S_1|$.
The main advantage of the above solution is that it removes the dependency on the length of the vectors, which matters most when the input vectors are sparse. To compute $v_0 \cdot v_1$ securely, the parties run a private set intersection cardinality protocol (PSI-CA) where $P_0$ inputs $S_0$ and $P_1$ inputs $S_1$. This idea, however, has received little attention due to the large overhead required to compute PSI-CA. We then extend DotProd to the multi-party case. Given an input vector $v_i$, party $P_i$ computes $S_i = \mathsf{idx}(v_i)$. It is easy to see that the sum of element-wise products of the vectors is equal to the size of the intersection of the index sets, namely, $\sum_{j=1}^{N} \prod_{i=1}^{n} v_i[j] = \left| \bigcap_{i=1}^{n} S_i \right|$. We implement the multi-party DotProd using our multi-party PSI-CA.
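The identity between the multi-party dot product and the size of the intersection of the index sets can be checked in the clear with a short sketch. This is only an illustration of the arithmetic, not of the secure protocol; the function names below are ours and do not come from the paper.

```python
from functools import reduce


def idx(v):
    """Return the set of positions j with v[j] == 1."""
    return {j for j, bit in enumerate(v) if bit == 1}


def multi_dot_dense(vectors):
    """Reference computation: sum over j of the product over i of v_i[j]."""
    length = len(vectors[0])
    return sum(1 for j in range(length) if all(v[j] == 1 for v in vectors))


def multi_dot_sparse(vectors):
    """Same value, computed as the size of the intersection of the index sets."""
    index_sets = [idx(v) for v in vectors]
    return len(reduce(set.intersection, index_sets))


# Three parties, each holding a 6-element binary vector.
vectors = [
    [1, 0, 1, 1, 0, 1],
    [1, 1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0, 1],
]
result_dense = multi_dot_dense(vectors)
result_sparse = multi_dot_sparse(vectors)
```

The sparse version touches only the non-zero positions, which is why its cost follows the Hamming weight rather than the vector length. In the secure setting, the plain intersection would be replaced by a PSI-CA invocation.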

Applications of PSI-CA and DotProd:
We show that our PSI-CA and DotProd techniques can be used to implement and improve the performance of several privacy-preserving applications. More specifically, we consider two running examples: COVID-19 heatmap computation and association rule learning (ARL).
In the COVID-19 heatmap problem, we consider a scenario where the Department of Health and Human Services (HHS) wants to learn which areas carry a higher chance of infection, without learning the travel routes of infected individuals. The heatmap can be implemented by computing the vector-matrix multiplication $v^\top M$, where $v$ and $M$ are as follows: $v$ is a binary vector of size $N$, held by the HHS, such that $v[i] = 1$ if the $i$th user has tested positive for COVID-19 and $v[i] = 0$ otherwise; and $M = (m_1, \ldots, m_L) \in \mathbb{Z}_2^{N \times L}$ is a user-location matrix, held by a network operator, such that the $i$th element of the column vector $m_j$ indicates whether the $i$th user has recently visited the $j$th location: $m_j[i] = 1$ if so, and $m_j[i] = 0$ otherwise. Clearly, $h = v^\top M$ is an $L$-element vector whose $j$th element is equal to the number of users who have tested positive and recently visited the $j$th location. Bampoulidis et al. [4] propose different optimizations on HE to implement a secure dot product, which still requires $O(N)$ independent multiplications (regardless of the Hamming weight of the vectors). In the heatmap example above, we observe that the vector $v$ is sparse because the proportion of diagnosed individuals per day among all $N$ subscribed individuals is small (e.g., 0.01-1% would be a large percentage [2]). Similarly, the matrix $M$ is also sparse due to people's localized travel habits. In Section 5.2, we apply our DotProd protocol to compute the COVID-19 heatmap. In addition, [4] only supports a two-party computation between the HHS and a single network provider. In real-world scenarios, there are many network providers. We modify the two-party PSI-CA protocol [27] to support heatmap computation between the HHS and multiple network providers without revealing additional information.
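As a plaintext illustration of why sparsity helps here, the sketch below computes the heatmap both as a dense vector-matrix product and as one set-intersection size per location. The data and function names are made up for the example; in the secure setting each intersection size would come from a PSI-CA (DotProd) invocation rather than from a local set operation.

```python
def heatmap_dense(v, M):
    """Reference: h[j] = sum_i v[i] * M[i][j] for an N x L binary matrix M."""
    num_locations = len(M[0])
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(num_locations)]


def heatmap_sparse(positive_users, visitors_per_location):
    """h[j] = size of (positive users) intersected with (recent visitors of location j)."""
    return [len(positive_users & visitors) for visitors in visitors_per_location]


# Toy data: N = 6 users, L = 4 locations.
v = [0, 1, 0, 0, 1, 0]          # users 1 and 4 tested positive
M = [
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]

# Sparse representations: index sets instead of full-length vectors.
positive_users = {i for i, bit in enumerate(v) if bit}
visitors_per_location = [
    {i for i in range(len(M)) if M[i][j]} for j in range(len(M[0]))
]

dense_result = heatmap_dense(v, M)
sparse_result = heatmap_sparse(positive_users, visitors_per_location)
```

Both computations give the same heatmap, but the sparse one only handles the users who tested positive and the users who actually visited each location.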
Second, we study association rule learning (ARL) as an application of DotProd. ARL is a rule-based machine learning method that is used to discover rules/relations of the type $(X \Rightarrow Y)$ between variables $X$, $Y$ in databases. As a typical example, in the sales database of a supermarket, a rule/relation {onions, potatoes ⇒ burger} indicates that if a customer buys onions and potatoes together, they are likely to also buy hamburger meat. In market design, such information can be used as the basis for decisions about product placements, promotional pricing, and more. However, the ARL training process requires a large transaction database, which may be collected from different sources. Thus, it is highly desirable to maintain the privacy of each source. We study a common ARL training algorithm, called Apriori [3,40], and adapt it to the privacy-preserving setting. Most steps in Apriori can be computed locally, except a step in which the parties compute a confidence score based on how many transactions across the joint database contain all attributes/items in both $X$ and $Y$. This step can be implemented by computing a sum of bit-wise products of multiple binary vectors. We apply multi-party DotProd to ARL to make its learning process privacy-preserving.
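The counting step can be sketched in the clear as follows. We assume, for illustration only, that each item is represented by the set of transaction IDs that contain it (the index set of its binary column); the item names and numbers are invented, and the standard definition of confidence as supp(X ∪ Y) / supp(X) is used. In the privacy-preserving version, the intersection sizes would be obtained through DotProd.

```python
def support_count(columns, items):
    """Number of transactions that contain every item in `items`."""
    return len(set.intersection(*(columns[item] for item in items)))


def confidence(columns, antecedent, consequent):
    """Confidence of the rule antecedent => consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = support_count(columns, antecedent + consequent)
    return both / support_count(columns, antecedent)


# Transaction IDs in which each item occurs (hypothetical data).
columns = {
    "onions": {1, 2, 3, 5},
    "potatoes": {2, 3, 5, 6},
    "burger": {3, 5, 7},
}

supp_x = support_count(columns, ["onions", "potatoes"])
supp_xy = support_count(columns, ["onions", "potatoes", "burger"])
rule_confidence = confidence(columns, ["onions", "potatoes"], ["burger"])
```

Here the rule {onions, potatoes ⇒ burger} holds in two of the three transactions that contain both onions and potatoes.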

Security Model
Secure computation allows mutually untrusted parties to jointly compute a function on their private inputs without revealing any additional information. There are two classical security models: the colluding model, which considers a single monolithic adversary that captures the possibility of collusion between the dishonest participants; and the non-colluding model, which considers independent adversaries, each capturing the view of one dishonest party. There are also two adversarial models that are usually considered. In the semi-honest (passive) model, the adversary is assumed to follow the protocol, but may try to learn information from the protocol transcript. In the malicious (active) model, the adversary follows an arbitrary polynomial-time strategy to learn additional information. This paper introduces two variants of PSI-CA in the semi-honest model, each providing distinct security guarantees. First, the server-aided variant of PSI-CA ensures security in the presence of an adversary who may passively corrupt any subset of $\{P_1, P_3, \ldots, P_n\}$ or $\{P_2, P_3, \ldots, P_n\}$ (i.e., $P_1$ and $P_2$ do not collude), or passively corrupt the server C. Second, the "server-less" protocol guarantees security in the presence of an adversary who may passively corrupt any subset of $\{P_3, \ldots, P_n\}$ or passively corrupt one of $P_1$, $P_2$, or $P_n$.

Oblivious Key-Value Store (OKVS)
A Key-Value Store (KVS) consists of two algorithms: i) Encode takes as input a set of key-value pairs $(k_i, v_i)$ from the key-value domain $\mathcal{K} \times \mathcal{V}$, and outputs an object $D$ (or, with negligible probability, an error indicator ⊥); ii) Decode takes as input an object $D$ and a key $k$, and outputs a value $v$.
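We say that a KVS is oblivious if, for all key sets $K_1$, $K_2$ of size $m$ and all PPT adversaries $\mathcal{A}$, the adversary cannot distinguish an encoding of the keys in $K_1$ from an encoding of the keys in $K_2$ when the associated values are chosen uniformly at random. In other words, if the values $v_i$ are chosen uniformly, then the output of Encode hides the choice of the keys $k_i$. The Oblivious Key-Value Store (OKVS) [22] is given in Experiment 2.2.1, where $\mathcal{A}$ is an arbitrary PPT algorithm.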
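To make the Encode/Decode interface concrete, here is a minimal sketch of one well-known (though not the most efficient) instantiation: polynomial interpolation over a prime field. It is not necessarily the construction used in this paper or in [22]; the field size, function names, and the assumption that keys and values are integers below the prime are ours.

```python
P = 2**61 - 1  # a Mersenne prime; keys and values are integers in [0, P)


def encode(pairs):
    """Interpolate the polynomial of degree < n through the n key-value pairs.

    Returns the coefficient list (lowest degree first), or None as the error
    indicator when two keys coincide.
    """
    keys = [k % P for k, _ in pairs]
    if len(set(keys)) != len(keys):
        return None
    coeffs = [0] * len(pairs)
    for i, (xi, yi) in enumerate(pairs):
        basis = [1]   # coefficients of prod_{j != i} (x - x_j)
        denom = 1     # prod_{j != i} (x_i - x_j)
        for j, (xj, _) in enumerate(pairs):
            if j == i:
                continue
            shifted = [0] * (len(basis) + 1)
            for d, c in enumerate(basis):
                shifted[d] = (shifted[d] - c * xj) % P
                shifted[d + 1] = (shifted[d + 1] + c) % P
            basis = shifted
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for d, c in enumerate(basis):
            coeffs[d] = (coeffs[d] + c * scale) % P
    return coeffs


def decode(coeffs, key):
    """Evaluate the stored polynomial at `key` using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * key + c) % P
    return acc


pairs = [(3, 10), (7, 99), (11, 5)]
D = encode(pairs)
decoded = [decode(D, k) for k, _ in pairs]
```

In this instantiation the map from the $n$ values to the $n$ coefficients is a bijection for any fixed set of distinct keys, so uniformly random values yield uniformly random coefficients whichever keys were used. Decoding a key that was never encoded simply returns some field element.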

Oblivious PRF (OPRF) and Programmable PRF (OPPRF)
An oblivious PRF (OPRF) [20] is a two-party functionality in which the sender obtains a PRF key $k$ and the receiver obtains the PRF outputs $F(k, q)$ on its queries $q$, without the sender learning the queries. An oblivious programmable PRF (OPPRF) functionality was introduced by [30]. It is similar to the plain OPRF functionality, except that it allows the sender to initially provide a set of points $\mathcal{P}$ which will be programmed into the PRF. Functionality 2 presents a simple version of the OPPRF defined in [30]. For a comprehensive and in-depth understanding of OPPRF, we refer the reader to [30,35].
Parameters: A PRF $F$, an upper bound $n_1$ on the number of points to be programmed, and a bound $n_2$ on the number of queries.
Behavior: Wait for points $\mathcal{P} = \{(x_1, y_1), \ldots, (x_{n_1}, y_{n_1})\}$, with distinct keys $x_i$, from the sender S, and distinct queries $(q_1, \ldots, q_{n_2})$ from the receiver R. Run $k \leftarrow \mathsf{KeyGen}(\kappa, \mathcal{P})$. Give $k$ to S and $(F(k, q_1), \ldots, F(k, q_{n_2}))$ to R.
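The input/output behavior of a programmable PRF can be mimicked locally with a few lines. The sketch below is only meant to show what "programming" points into a PRF means; it is neither oblivious nor hiding, since the correction table reveals which inputs were programmed (real constructions store the corrections in a structure such as an OKVS and run the PRF evaluation obliviously). HMAC-SHA256 stands in for the PRF, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets

OUT_LEN = 16  # output length in bytes (arbitrary choice for the sketch)


def prf(seed, x):
    """HMAC-SHA256 truncated to OUT_LEN bytes, used as a stand-in PRF."""
    return hmac.new(seed, x, hashlib.sha256).digest()[:OUT_LEN]


def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))


def keygen(points):
    """Return a key (seed, corrections) such that F(key, x) == y for programmed points."""
    seed = secrets.token_bytes(16)
    corrections = {x: xor(y, prf(seed, x)) for x, y in points.items()}
    return seed, corrections


def F(key, q):
    """Programmed output on programmed inputs, pseudorandom output elsewhere."""
    seed, corrections = key
    base = prf(seed, q)
    return xor(base, corrections[q]) if q in corrections else base


points = {b"alice": secrets.token_bytes(OUT_LEN), b"bob": secrets.token_bytes(OUT_LEN)}
key = keygen(points)
outputs = {q: F(key, q) for q in (b"alice", b"bob", b"carol")}
```

Queries on `b"alice"` and `b"bob"` return the programmed values, while the query on `b"carol"` returns a pseudorandom string.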

Private Set Intersection Cardinality
Private set intersection cardinality (PSI-CA) allows $n$ parties, each holding a set of $m$ items, to learn the intersection size of their private sets without revealing anything else. In the server-aided PSI-CA, we assume there is an untrusted server that has no input and does not collude with the parties. The server is involved in the PSI-CA protocol while learning nothing. PSI-CA and server-aided PSI-CA are formally presented in Functionality 4. The highlighted text is required for the server-aided case.

Secure Dot Product of Binary Vectors
The secure dot product functionality allows $n$ parties, each holding an $N$-element binary vector, to learn the dot product of their private vectors without revealing any additional information. In this work, we consider the problem of the secure dot product of $n$ binary vectors in a server-aided setting, in which we make use of a non-colluding untrusted server.

SERVER-AIDED OPRF AND OPPRF
In this section, we introduce new OPRF and OPPRF constructions which make use of a semi-honest non-colluding cloud server.

Server-Aided Shuffled OPRF
The server-aided shuffled OPRF functionality involves a sender S, a receiver R, and a server C. It is defined as follows: S has a key pair $k = (k_1, k_2)$, R has a set of queries $Y$, and C has no input; R obtains the PRF values of its queries in a randomly permuted order.
Parameters: S, R and C, the set size $m$, and a pseudorandom function (PRF) $F'$ built from a PRF $F$.
We first define $F'((k_1, k_2), x) = F(k_2, F(k_1, x))$, where $F$ is a PRF. It is easy to see that $F'$ is a PRF. In protocol $\Pi_{\mathsf{soprf}}$, S has the key $k = (k_1, k_2)$, so it can send $k_1$ to R and $k_2$ to C. We assume that the receiver and the server do not collude, so neither of them is able to compute the sender's key $k$. Having $k_1$, R computes $Y' = F(k_1, Y)$ and sends $Y'$ to C. The server C then computes $Y'' = F(k_2, Y')$, applies a random permutation $\pi$ to $Y''$, and sends the permuted $Y''$ to R. This one-round protocol takes into account the presence of a semi-honest sender or receiver.
Parameters (server-aided OPPRF functionality): S, R and C, $n_1$ the size of $\mathcal{P}$, and $n_2$ the number of queries.
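The message flow of the shuffled OPRF can be simulated in a single process. The sketch below uses HMAC-SHA256 as a stand-in for the PRF $F$ (an assumption for illustration; the paper does not prescribe it here) and hypothetical function names; it models only who computes what, not the network or the security proof.

```python
import hashlib
import hmac
import secrets


def prf(key, x):
    """Stand-in PRF F, instantiated with HMAC-SHA256 for this sketch."""
    return hmac.new(key, x, hashlib.sha256).digest()


def f_prime(k1, k2, x):
    """The composed PRF F'((k1, k2), x) = F(k2, F(k1, x))."""
    return prf(k2, prf(k1, x))


def sender_keygen():
    """S samples k = (k1, k2); k1 goes to R and k2 goes to C."""
    return secrets.token_bytes(16), secrets.token_bytes(16)


def receiver_blind(k1, queries):
    """R applies the inner PRF to each query and sends the results to C."""
    return [prf(k1, y) for y in queries]


def server_finish(k2, blinded):
    """C applies the outer PRF, then shuffles before replying to R."""
    out = [prf(k2, b) for b in blinded]
    secrets.SystemRandom().shuffle(out)
    return out


k1, k2 = sender_keygen()
queries = [b"item-1", b"item-2", b"item-3"]
shuffled = server_finish(k2, receiver_blind(k1, queries))
```

The receiver ends up with the values $F'(k, y)$ for its queries, but in an order chosen by the server, so it cannot tell which value belongs to which query.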

Server-Aided OPPRF
Behavior: In the protocol, S, R and C invoke a non-shuffled version of OPRF, where S inputs the key  = ( 1 ,  2 ), R inputs  , and a sets where  is a PRF.Inputs: • C has no input.

PSI CARDINALITY PROTOCOL
In this section we present three protocols:
• In Section 4.1, we simplify the server-aided PSI protocol of [27] and formally present the resulting server-aided two-party PSI-CA protocol. Unlike previous "server-less" protocols (see Section 1.1) that are based on oblivious transfer [18] or on the Diffie-Hellman problem [25,43], which in turn are based on public-key primitives, the two-party PSI-CA protocol [27] uses only symmetric-key operations. This is possible, among other improvements, due to the replacement of their OPRF constructions with a server-aided version, which is much simpler and more efficient.
• In Section 4.2, we show an extension of the protocol to the multi-party case, where the adversary may passively corrupt (almost) any strict subset of the parties or passively corrupt the server. To the best of our knowledge, this is the first 'special-purpose' protocol for privately computing the intersection cardinality of more than two parties, for which we present interesting applications (see Section 5).
• In Section 4.3, we show that a server is not necessary when some parties are assumed to be semi-honest and non-colluding.
Server-Aided Two-Party PSI-CA [27]
We consider a sender S and a receiver R who want to compute the intersection size of their private sets $X = \{x_1, \ldots, x_{n_1}\}$ and $Y = \{y_1, \ldots, y_{n_2}\}$, respectively. To do so, they use a non-colluding, semi-honest cloud server C. The formal description is given in Protocol 10. The protocol is inspired by the size-hiding server-aided PSI of Kamara et al. [27]. For completeness, a description of their PSI protocol is given in Appendix E.
Length of OPRF outputs. The length of the OPRF output influences the probability of a collision within the protocol. Similar to previous work [29], it is sufficient to use an output length of $\lambda + \log(n_1 n_2)$ bits to bound the probability of any spurious collision by $2^{-\lambda}$.
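The bound follows from a union bound over the $n_1 n_2$ pairs of items, each of which collides with probability $2^{-\ell}$ for $\ell$-bit outputs. A small helper makes the arithmetic concrete; the function name and the choice $\lambda = 40$ in the example are ours, and the logarithm is rounded up to a whole number of bits.

```python
def oprf_output_bits(stat_sec, n1, n2):
    """Smallest whole number of bits ell with n1 * n2 * 2**(-ell) <= 2**(-stat_sec)."""
    pairs = n1 * n2
    # (pairs - 1).bit_length() equals ceil(log2(pairs)) for pairs >= 1.
    return stat_sec + (pairs - 1).bit_length()


bits_million = oprf_output_bits(40, 2**20, 2**20)
bits_unbalanced = oprf_output_bits(40, 2**20, 1000)
```

For two sets of $2^{20}$ items and $\lambda = 40$, the outputs can be truncated to 80 bits.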
The protocol is one-round and extremely efficient because of the efficiency of the shuffled $\mathcal{F}_{\mathsf{soprf}}$. In terms of communication cost, it only requires S to send $n_1$ values to R. The construction for $\mathcal{F}_{\mathsf{soprf}}$, in turn, requires only $n_2$ messages from R to C and $n_2$ messages back from C to R. We prove the following:
PROTOCOL 10. Server-Aided Two-party PSI-CA
Parameters:
• The protocol runs between a sender S, a receiver R, and a server C. S and R have input sizes $n_1$ and $n_2$, resp.
• A PRF $F'$ with keys in $\{0,1\}^{2\kappa}$ and inputs in $\{0,1\}^{\ell}$.
Inputs:
Protocol:
(1) S, R, and C jointly invoke $\mathcal{F}_{\mathsf{soprf}}$ as follows: S inputs a random key $k = (k_1, k_2) \in \{0,1\}^{2\kappa}$, upon which R obtains $k_1$ and C obtains $k_2$ and  (recall that  :
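Putting the pieces together, the following single-process simulation sketches the flow described in this section: S sends the PRF values of its items to R, R obtains the shuffled PRF values of its own items through C, and R counts the matches. HMAC-SHA256 again stands in for the PRF, outputs are left untruncated, and the function names are hypothetical; this is a functional sketch under those assumptions, not the paper's implementation.

```python
import hashlib
import hmac
import secrets


def prf(key, x):
    """Stand-in PRF F (HMAC-SHA256), an assumption made for this sketch."""
    return hmac.new(key, x, hashlib.sha256).digest()


def server_aided_psi_ca(sender_set, receiver_set):
    # S samples k = (k1, k2); k1 is given to R and k2 to C.
    k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)

    # S -> R: F'(k, x) = F(k2, F(k1, x)) for every x in X (n1 values).
    sender_tags = {prf(k2, prf(k1, x)) for x in sender_set}

    # R -> C: inner PRF values of R's items (n2 values).
    blinded = [prf(k1, y) for y in receiver_set]

    # C -> R: outer PRF values, shuffled so R cannot link them to its items.
    receiver_tags = [prf(k2, b) for b in blinded]
    secrets.SystemRandom().shuffle(receiver_tags)

    # R: count how many of its shuffled tags also appear among S's tags.
    return sum(1 for t in receiver_tags if t in sender_tags)


X = {b"a", b"b", b"c", b"d"}
Y = {b"c", b"d", b"e"}
cardinality = server_aided_psi_ca(X, Y)
```

Because of the shuffle, R learns how many of its tags match but not which of its own items produced them.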

Server-Aided Multi-Party PSI-CA
In this section, we assume that all parties have the same set size $m$. Protocol 11 may be seen as if we have one receiver, who is $P_1$, and multiple senders, who are $P_2, \ldots, P_n$. The role of the server is to shuffle the PRF results from the senders before delivering them to the receiver. As a simplification of Protocol 11, suppose that we want the receiver to obtain $n - 1$ shares of zero for each of its items that is in the intersection. This can be done by querying the senders on each of its items and collecting the results. Each sender programs the responses such that if the query is on one of its items, then it responds with its (pseudorandom) share of zero; otherwise, it responds with some other pseudorandom value. Given the senders' responses on a query, if they sum up to zero then the receiver knows that its query is in the intersection. Since the server shuffles the responses to the queries, the receiver does not know, for a given set of responses that are shares of zero, which query it is associated with; thus, the output leaks nothing but the intersection size. Formally,
(1) $P_2, \ldots, P_n$ (the senders) generate keys for a zero-sharing function $F$, so $P_i$ obtains $k_i$ such that for every $x$ it holds that $\sum_{i=2}^{n} F(k_i, x) = 0$.
(3) The server runs an OPPRF instance with every sender, using the queries $X_1$. A sender $P_i$ ($i \in [2, n]$) programs the responses such that on a query $x \in X_i$ the response is $F(k_i, x)$, whereas on any other query the response is another pseudorandom value.
(4) The server obtains the set $R'_j$ of $n - 1$ OPPRF responses, one from each sender $P_i$, $i \in [2, n]$, on every query $x_j \in X_1$. It chooses a random permutation $\pi : [m] \to [m]$ and sends to $P_1$ the set $\{R'_{\pi(1)}, \ldots, R'_{\pi(m)}\}$.
(5) $P_1$ checks for every response set $R'_j$ whether its values are valid shares of zero. If so, it adds 1 to the cardinality.
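The zero-sharing idea behind this simplification can be sketched as follows. The sketch assumes a common pairwise-key instantiation of the zero-sharing function, with XOR as the group operation: every pair of senders shares a key, and a sender's share on $x$ is the XOR of the PRF values under all of its pairwise keys, so each PRF value appears exactly twice across the senders and cancels. This instantiation, the HMAC-based PRF, and all names are our assumptions. The OPPRF is replaced by computing the responses in the clear, so the sketch has no privacy at all; it only shows why responses on intersection items combine to zero.

```python
import hashlib
import hmac
import secrets
from functools import reduce
from itertools import combinations


def prf(key, x):
    """Stand-in PRF with 128-bit integer outputs (HMAC-SHA256, truncated)."""
    return int.from_bytes(hmac.new(key, x, hashlib.sha256).digest()[:16], "big")


def zero_share_keys(sender_ids):
    """Give every pair of senders a common key; each sender keeps its pairwise keys."""
    keys = {i: [] for i in sender_ids}
    for i, j in combinations(sender_ids, 2):
        k = secrets.token_bytes(16)
        keys[i].append(k)
        keys[j].append(k)
    return keys


def share(keys_i, x):
    """Sender's share of zero on x: XOR of the PRF under all of its pairwise keys."""
    return reduce(lambda a, b: a ^ b, (prf(k, x) for k in keys_i), 0)


def simplified_psi_ca(receiver_set, sender_sets):
    ids = list(range(len(sender_sets)))
    keys = zero_share_keys(ids)
    response_sets = []
    for x in receiver_set:
        # A sender answers with its zero share if it holds x, else with a random value.
        responses = [
            share(keys[i], x) if x in sender_sets[i] else secrets.randbits(128)
            for i in ids
        ]
        response_sets.append(responses)
    # The server shuffles the response sets before forwarding them to the receiver.
    secrets.SystemRandom().shuffle(response_sets)
    # The receiver counts the response sets that combine (XOR) to zero.
    return sum(1 for rs in response_sets if reduce(lambda a, b: a ^ b, rs, 0) == 0)


receiver_set = {b"a", b"b", b"c", b"d"}
sender_sets = [{b"a", b"b", b"x"}, {b"a", b"b", b"c"}, {b"a", b"b", b"d"}]
count = simplified_psi_ca(receiver_set, sender_sets)
```

With 128-bit outputs, a response set for an item outside the intersection combines to zero only with negligible probability, so the count equals the intersection size except with negligible error.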
In the above simplification, there are several security issues: first, the server learns $P_1$'s queries in the clear; second, the server mediates all PRF responses and therefore learns whenever a set of responses forms valid shares of zero, so it can learn the intersection size as well; third, if the receiver colludes with one of the senders, together they can reverse the server's permutation on items that are in the intersection and thereby leak the intersection itself (rather than only its size).

The first issue is easily solved by having all parties $P_1, \ldots, P_n$ agree on a PRF key $k$, so instead of computing $|\bigcap_{i=1}^{n} X_i|$ their objective is to compute $|\bigcap_{i=1}^{n} F(k, X_i)|$. This way, the server does not learn $P_1$'s set.

Hiding the intersection size from the server (the second issue above) is trickier. We solve it by having $P_1$ and $P_2$ agree on a set of random values $\Gamma = \{\gamma_1, \ldots, \gamma_m\}$ so that, instead of programming the responses with the 'zero shares', on a value $x \in X_2$, $P_2$ programs the response as its zero share for $x$ XORed with some $\gamma \in \Gamma$. Now, for items that are in the intersection, the server C sees a set of responses that constitutes a valid sharing of some $\gamma \in \Gamma$, but since C does not know $\Gamma$, this set looks random and is indistinguishable from the responses on values that are not in the intersection.

Finally, we propose a protocol under a relaxed setting that solves the third issue above. Concretely, the protocol is secure as long as $P_1$ and $P_2$ do not collude. This is done by adding one step to the above description: before the server forwards a response set to $P_1$, it sums the items of the set and forwards only the sum to $P_1$. This means that $P_i$ ($i \geq 3$) can no longer trace back and learn the intersection itself. This is formally presented in Protocol 11.

The initial five steps of the protocol need only one round of communication, as the involved parties can consolidate their messages prior to transmission. The OPPRF [30] requires two rounds. Step 7 involves a single round. Consequently, this server-aided protocol requires four rounds in total. We note that, in our protocol, parties use zero shares to mask their actual input. This step is similar to the one in [22]. The formal proof of Theorem 4 is presented in Appendix A.4.

Inputs: $P_i$ has $X_i = \{x_{i,1}, \ldots, x_{i,m}\}$.
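The zero-share check at the heart of the simplified description can be illustrated with a toy, non-private simulation. This is only a sketch under stated assumptions: the pairwise-key construction of the zero shares, the HMAC-based stand-in PRF, and all function names are hypothetical choices for illustration, and the OPPRF, the server's permutation, and the $\Gamma$ masking are all omitted.

```python
import hashlib
import hmac
import secrets
from functools import reduce
from operator import xor


def prf(key: bytes, x: bytes) -> int:
    """Stand-in PRF: HMAC-SHA256 truncated to 128 bits (an assumption, not the paper's PRF)."""
    return int.from_bytes(hmac.new(key, x, hashlib.sha256).digest()[:16], "big")


def pairwise_keys(n: int) -> dict:
    """Hypothetical setup: one secret key per unordered pair of parties."""
    return {(i, j): secrets.token_bytes(16) for i in range(n) for j in range(i + 1, n)}


def zero_share(keys: dict, i: int, n: int, x: bytes) -> int:
    """Party i's share for x: XOR of PRF values under every pairwise key that i holds."""
    share = 0
    for j in range(n):
        if j != i:
            share ^= prf(keys[(min(i, j), max(i, j))], x)
    return share


def toy_cardinality(sets: list) -> int:
    """Count the items of sets[0] whose shares from all parties XOR to zero."""
    n = len(sets)
    keys = pairwise_keys(n)
    count = 0
    for x in sets[0]:
        # A party holding x contributes its zero share; otherwise it contributes a
        # random value, mimicking the pseudorandom response on a non-programmed query.
        shares = [
            zero_share(keys, i, n, x) if x in sets[i] else secrets.randbits(128)
            for i in range(n)
        ]
        if reduce(xor, shares) == 0:
            count += 1
    return count
```

Each pairwise PRF value appears in exactly two parties' shares, so the shares cancel when every party holds the item; a single missing party leaves a random-looking XOR, which is zero only with negligible probability.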

Multi-party PSI-CA
We now describe our "server-less" multi-party PSI-CA protocol.
The main idea is to convert the problem of $n$-party server-aided PSI-CA to the problem of $(n-1)$-party PSI-CA with the use of an untrusted party $P_n$ that, unlike the cloud server, holds a private input set. We implement the two-party PSI-CA using our server-aided protocol described in Protocol 10, in which any party $P_i$, $i \in [2, n-1]$ (say $P_2$), can play the role of the cloud server. The two-party PSI-CA Protocol 10 requires that neither the sender nor the receiver colludes with the semi-honest server. Thus, in our multi-party protocol, we assume that $P_2$ is semi-honest and non-colluding with both $P_1$ and $P_n$. In addition, given this assumption, we can improve the performance of our multi-party OPPRF. In particular, unlike Protocol 11 in the above section, we use our server-aided OPPRF construction described in Section 3.2 to execute an OPPRF instance between $P_n$ and each $P_i$, $i \in [2, n-1]$, where $P_1$ plays the role of the OPPRF server (thus, $P_1$ is non-colluding). We formally present our server-less multi-party PSI-CA in Protocol 12, and its security statement below (see the formal proof in Appendix A.5). Similar to our server-aided PSI-CA, this server-less protocol takes four rounds.

APPLICATIONS
We demonstrate that our PSI-CA can be used for several privacy-preserving applications by implementing two running example applications, which are built on the two-party PSI-CA protocol [27] and the multi-party PSI-CA protocols, respectively.

Secure Dot Product Construction
Given a secure protocol for computing the cardinality of the intersection of the parties' sets, the protocol for dot product is simple. Let $v_i$ be an $N$-element binary vector of party $P_i$, and let $X_i = \mathrm{idx}(v_i)$. It is easy to see that the dot product of the $v_i$'s is exactly the cardinality of the intersection of the $X_i$'s, that is, $\sum_{j=1}^{N} \prod_{i=1}^{n} v_i[j] = |\bigcap_{i=1}^{n} X_i|$. Thus, to securely compute the dot product, we can use the PSI-CA functionality described in the previous section. Note that even though the input size is $O(N)$, the communication complexity of the protocol is only $O(t)$, which makes it extremely efficient when $t = o(N)$, where $t$ is the upper bound on the Hamming weight of the vectors.
One subtle issue is that in the PSI-CA protocols the parties know the number of elements in each other's set, which leaks more information than required. Here, we assume that there is a known upper bound, $t$, on the Hamming weight of the vectors $v_i$, and require that each party's input to the PSI-CA contains exactly $t$ items. That is, if the Hamming weight of $v_i$ is $t' < t$, then $P_i$ adds random "dummy" items to its input to the PSI-CA. Formally, for a given upper bound $t$, $P_i$ inputs $X_i$ to the PSI-CA, where $X_i \leftarrow \mathrm{idx}'(v_i, t)$ and $\mathrm{idx}'(v, t)$ is defined as follows: let $t'$ be the Hamming weight of $v$, set $X = \mathrm{idx}(v)$, pick $t - t'$ random values $D = \{d_1, \ldots, d_{t - t'}\}$ from the domain $\mathcal{D} = \{N + 1, \ldots, 2^{\lambda + \log(t)} + N\}$, and output $X \cup D$. The choice of the domain $\mathcal{D}$ makes the collision probability of dummy items negligible, namely $2^{-\lambda}$.
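The mapping from vectors to padded sets can be sketched as follows. This is a minimal illustration, assuming 1-based indices for idx, a dummy domain of roughly $t \cdot 2^{\lambda}$ values above $N$, and hypothetical function names (`idx`, `idx_padded`); the intersection is computed in the clear here, whereas the protocol would hand the padded sets to PSI-CA.

```python
import secrets


def idx(v):
    """1-based indices of the nonzero entries of a binary vector."""
    return {i + 1 for i, bit in enumerate(v) if bit}


def idx_padded(v, t, lam=40):
    """Pad idx(v) with random dummy items above N until the set has exactly t items."""
    n_len = len(v)
    out = idx(v)
    if len(out) > t:
        raise ValueError("Hamming weight exceeds the agreed bound t")
    domain_size = t << lam  # assumed dummy domain size, about 2^(lam + log t)
    while len(out) < t:
        # Dummies lie strictly above N, so they never collide with real indices.
        out.add(n_len + 1 + secrets.randbelow(domain_size))
    return out


u = [1, 0, 1, 1, 0, 0]
v = [1, 1, 0, 1, 0, 0]
dot = sum(a * b for a, b in zip(u, v))
padded_u, padded_v = idx_padded(u, 5), idx_padded(v, 5)
print(dot, len(padded_u & padded_v))
```

Both padded sets have exactly five items, so their sizes reveal only the bound, and the two printed numbers agree unless two independently drawn dummies collide.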
The formal description is given in Protocol 15 in Appendix D. Note that it is possible to compute the dot product DotProd with or without the help of a cloud server C, so both variants are presented. The protocol's correctness, complexity, and security follow directly from the underlying PSI-CA protocol presented in Section 4 with different corruption structures.

Theorem 4. Protocol 15 securely implements Functionality 5 ($\mathcal{F}_{\text{DotProduct}}$) in the $\mathcal{F}_{\text{PSI-CA}}$-hybrid model. In particular, if $\Pi$ is a protocol that securely computes $\mathcal{F}_{\text{PSI-CA}}$ in the presence of an adversary $\mathcal{A}$, then, when instantiated with $\Pi$, Protocol 15 is secure in the presence of adversary $\mathcal{A}$ as well.

Heatmap Computation
As stated in [4], the heatmap can be considered as a two-party computation between HHS and a mobile network operator (MNO). HHS has a list of individuals who have tested positive for the disease. The MNO knows approximate location data of its subscribers, since a subscriber connects to a certain cell tower when traveling (unless the user does not have a phone or disconnects from the network provider). Mathematically, HHS generates a binary vector $h \in \mathbb{Z}_2^N$ which indicates whether the user $i \in [1, N]$ amongst $N$ subscribed individuals has tested positive ($h[i] = 1$) or not ($h[i] = 0$). For each cell tower $j \in [1, M]$, the MNO initializes a vector $v_j$ of $N$ elements, where $v_j[i]$ corresponds to the $i$-th subscriber (say that HHS and MNO agree on the subscribers' identifiers and on their positions in the vectors). If the $i$-th subscriber connects to cell tower $j$ within some period of time, then $v_j[i] = 1$, and $v_j[i] = 0$ otherwise. To learn how many positive individuals visit a certain area (e.g., the area covered by the $j$-th cell tower), HHS and MNO run a secure dot product protocol to obtain $h \cdot v_j$.

The solution proposed in [4] relies on HE to implement the secure dot product for the heatmap problem. Even with the HE optimizations, [4] requires many independent secure multiplications to compute $h \cdot v_j$ for each cell tower, so the cost of the secure vector-matrix multiplication $h \cdot V$, where $V$ consists of the $M$ columns $v_1, \ldots, v_M$, is dominated by HE multiplications. Each element of $h \cdot V$ corresponds to how many diagnosed subscribers visited a cell tower.
In this work, we observe that the proportion of diagnosed individuals among all $N$ subscribed individuals is usually small (e.g., 0.01-0.1% new positive cases per day [2]); thus, the vector $h$ is sparse. In addition, the vector $v_j$ is also sparse due to people's localized travel habits. Therefore, the heatmap computation is a perfect application for our DotProd, where the input vectors are sparse. By applying DotProd, we show that the computational complexity of the dot product in the heatmap example can be reduced to $O(t)$, where $t$ is the maximum of the upper bound on the number of new positive test cases and the upper bound on the number of individuals visiting a geographical area covered by a cell tower.
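The sparsity argument can be made concrete with a small plaintext sketch. The names below (`heatmap_dense`, `heatmap_sparse`, `to_vector`) are hypothetical, and the intersections are computed in the clear purely to show that each heatmap entry is one set-intersection cardinality whose cost depends on the set sizes rather than on the vector length.

```python
def to_vector(ids, n):
    """Binary indicator vector of length n for a set of 0-based subscriber ids."""
    return [1 if i in ids else 0 for i in range(n)]


def heatmap_dense(h, columns):
    """One dot product per cell tower; each touches all N vector entries."""
    return [sum(a * b for a, b in zip(h, col)) for col in columns]


def heatmap_sparse(positive_ids, visitors_by_tower):
    """One intersection cardinality per tower; work scales with the set sizes."""
    return [len(positive_ids & visitors) for visitors in visitors_by_tower]


N = 10
positives = {1, 4, 7}
towers = [{0, 1, 2}, {4, 7, 9}, {3}]

dense = heatmap_dense(to_vector(positives, N), [to_vector(t, N) for t in towers])
sparse = heatmap_sparse(positives, towers)
print(dense, sparse)
```

Working with identifier sets also hints at why no prior alignment of vector positions is needed in the set-based formulation.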
Multiple MNOs. We support a heatmap computation between one HHS, $P_0$, and multiple MNOs, $P_1, \ldots, P_n$. In real-world scenarios, HHS prefers to minimize bandwidth cost and computation workload on its side. Our protocol achieves this by making use of the untrusted server. For the heatmap computation, HHS only needs to compute $t$ and $2t$ symmetric-key operations in the two-party and multi-party settings, respectively. In terms of communication cost, HHS sends and receives $3t$ elements. Finally, our protocol requires only one round of communication.

Association Rule Learning
Association rule learning (ARL) aims to discover regularities/rules between variables in transaction data. In this work, we use our DotProd protocol to mitigate information leakage in ARL when training the model on a vertical partitioning of a private database between multiple parties. We study the ARL definition in [3] and adapt it to the privacy-preserving context (see Definition 1 in Appendix B). We consider only a vertically-partitioned database, since if the data is horizontally partitioned, each party can locally compute ARL. For readers who are not familiar with ARL, we provide a detailed explanation of the algorithm in Appendix C.

PROTOCOL 14. Privacy-Preserving ARL
Parameters:
• An ARL threshold, the number of attributes, and initially empty lists of frequent itemsets.
- $P_1$ obtains the output and adds the candidate itemset to the next list of frequent itemsets if the output exceeds the threshold.

Privacy-preserving ARL (PPARL) consists of two subproblems (see Appendix B). The second subproblem can be solved publicly, since the frequent itemsets are part of the ARL result. According to [44], one can reduce the first subproblem of PPARL to securely computing dot products of binary vectors with minor information leakage. For simplicity, consider a candidate itemset with only two attributes. Let $\vec{X}$ and $\vec{Y}$ represent columns in the database, i.e., $\vec{X}[i] = 1$ iff row $i$ has value 1 for attribute $X$ (similarly for $\vec{Y}$ and $Y$). Parties $P_1$ and $P_2$ hold the vertically-partitioned transaction databases containing $X$ and $Y$, respectively. The dot product of the two $N$-element vectors, $\vec{X} \cdot \vec{Y} = \sum_{i=1}^{N} \vec{X}[i] \cdot \vec{Y}[i]$, is the support count, which indicates how many times the itemset $XY$ appears in the joint transaction set. The dot product computation requires the joint database from both parties; thus, it should be computed in a privacy-preserving manner. Given $s \leftarrow \vec{X} \cdot \vec{Y}$, the parties can check whether the obtained support count is greater than or equal to the threshold. If so, the candidate itemset is a frequent itemset. In the ideal world, if $s$ is below the threshold, the exact value of $s$ is not revealed to the parties. Thus, this value is considered leakage in our PPARL scheme as well as in previous work [16,44]. Note that [16,44] reveal more information than ours: they leak the indexes $i$ for which $\vec{X}[i] = \vec{Y}[i] = 1$ (i.e., the intersection items).
In this work, we consider the $n$-party setting with global rules, where every party's vertically-partitioned transaction database has at least one item in the frequent itemset. Protocol 14 presents our PPARL construction, which closely follows the Apriori algorithm [3,44]. The first two steps aim to find a list of itemsets that (1) appear in the transaction set at least as many times as the threshold; and (2) contain at least one attribute from every party. We denote the obtained list by $L_k$, where $k$ is the itemset size. Given $L_k$, the party locally computes a list of candidates $C_{k+1}$ for itemsets of size $k + 1$ using the apriori-gen algorithm [3]. At a high level, apriori-gen generates a superset of possible candidate itemsets and then prunes this set. We present the apriori-gen algorithm in Figure 1, and refer the reader to [3] for more detail. Note that apriori-gen is computed on the public list $L_k$; thus, it leaks no additional information. The parties jointly execute Step (3) to compute the subsequent lists until the resulting list is empty.
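For orientation, the join-and-prune structure of apriori-gen can be sketched as below. This follows the textbook formulation of the algorithm in [3] rather than the exact procedure of Figure 1; the function name and the frozenset representation are illustrative choices, and the additional requirement that every party contributes an attribute is not modeled.

```python
from itertools import combinations


def apriori_gen(frequent_k):
    """Candidates of size k+1 from the frequent itemsets of size k (join, then prune)."""
    frequent_k = {frozenset(s) for s in frequent_k}
    if not frequent_k:
        return set()
    k = len(next(iter(frequent_k)))
    ordered = sorted(tuple(sorted(s)) for s in frequent_k)

    # Join step: merge two itemsets that agree on their first k-1 items.
    candidates = set()
    for a, b in combinations(ordered, 2):
        if a[:-1] == b[:-1]:
            candidates.add(frozenset(a) | frozenset(b))

    # Prune step: drop a candidate if any of its k-subsets is not frequent.
    return {
        c for c in candidates
        if all(frozenset(sub) in frequent_k for sub in combinations(c, k))
    }


L3 = [{1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {1, 3, 5}, {2, 3, 4}]
print(apriori_gen(L3))
```

On this input the join step yields {1,2,3,4} and {1,3,4,5}, and the prune step removes the latter because {1,4,5} is not frequent. Only the surviving candidates would need a secure support-count computation.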

IMPLEMENTATION AND PERFORMANCE
We evaluate the performance of our PSI-CA (or DotProd) protocols and estimate the performance of heatmap computation and ARL. Protocols are evaluated under different network settings, numbers of parties, and input set sizes to demonstrate their scalability. Our implementation is available at https://github.com/asu-crypto/mpsica.

Choice of Parameters.
We run experiments on a single machine with 2× 36-core Intel Xeon 2.30GHz CPUs and 256GB of RAM, and simulate the network using the Linux tc command. We consider two network settings: the LAN setting has 0.02ms round-trip latency and 10 Gbps network bandwidth; the WAN setting has 96ms round-trip latency and 200 Mbps network bandwidth. In our implementation, each party uses a separate thread to communicate with other parties. The computational security parameter is $\kappa = 128$ and the statistical security parameter is $\lambda = 40$. The number of parties $n$ is in $\{2, 4, 8, 16\}$. The set size $m$ of PSI-CA, or the upper-bound Hamming weight $t$ of DotProd, is in $\{2^{12}, 2^{16}, 2^{20}, 2^{24}\}$.
Choice of PRF, OPPRF, and OKVS. We instantiate the PRF $F$ using AES-NI. We use OKVS and OPPRF as black boxes in the implementation. Our implementation uses the table-based OPPRF code from [30]. While there are different OKVS constructions [22], we choose the most efficient Encode and Decode, those of the 3-cuckoo PaXoS data structure. The number of bins in the cuckoo table is $1.3m$ with 3 hash functions.
PSI-CA and DotProd protocols. Recall that the steps of the PSI-CA and DotProd protocols are similar, except for a small cost overhead in Step (1) of DotProd, where each party locally computes the function $\mathrm{idx}(\cdot)$. In the DotProd protocol, we assume that there is a known upper bound, $t$, on the Hamming weight of the party's input vector $v$. To implement DotProd using PSI-CA, we require that each party's input to the PSI-CA contains exactly $t$ items. Thus, we only report the detailed computational and communication performance results of our PSI-CA protocols for the set size $m$. These results correspond to the DotProd protocols evaluated with the upper bound $t = m$.

Performance of Two-party Protocols
PSI-CA Protocol. We evaluate the two-party PSI-CA protocol [27] (Protocol 10) in the LAN and WAN settings. The aim is to evaluate its performance in comparison to existing work and to determine the most effective candidate among the two-party PSI-CA options for our multi-party protocols, heatmap computation, and ARL. We consider both balanced and unbalanced set sizes, as our heatmap computation is built on the asymmetric two-party PSI-CA. In Protocol 10, the parties do not need to be involved in the entire protocol's computation. The sender S sends $F((k_1, k_2), X)$ and $k_1$ to the receiver, sends $k_2$ to the server C at the same time, and completes its computation. Similarly, C does not need to be online during the whole process. Instead, C starts its computation when it receives S's PRF key $k_2$ and the set of the receiver's queries. Therefore, we report the performance of each participant separately in Table 6 (in the Appendix). We find that Protocol 10 scales well in the experiments, as it contains only AES calls. For instance, the total run time with input set sizes $m_1 = m_2 = 2^{20}$ is only 1.5 seconds.
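The message flow just described can be sketched as a toy, single-process simulation. This is one possible reading of the text, not the specification of Protocol 10: the two-key PRF is assumed to be the composition $F(k_2, F(k_1, x))$, HMAC-SHA256 stands in for the AES-based PRF, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, x: bytes) -> bytes:
    """Stand-in PRF (HMAC-SHA256); an assumption made for this sketch."""
    return hmac.new(key, x, hashlib.sha256).digest()


def sender_messages(x_set):
    """Sender: tags and k1 go to the receiver, k2 goes to the server; then it is done."""
    k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)
    tags = {prf(k2, prf(k1, x)) for x in x_set}
    return tags, k1, k2


def receiver_queries(y_set, k1):
    """Receiver: apply the inner PRF to its own items and send the results to the server."""
    return [prf(k1, y) for y in y_set]


def server_responses(queries, k2):
    """Server: apply the outer PRF, then shuffle so responses cannot be linked to queries."""
    responses = [prf(k2, q) for q in queries]
    secrets.SystemRandom().shuffle(responses)
    return responses


def toy_psi_ca(x_set, y_set):
    tags, k1, k2 = sender_messages(x_set)
    responses = server_responses(receiver_queries(y_set, k1), k2)
    # Receiver: count the shuffled responses that match one of the sender's tags.
    return sum(1 for r in responses if r in tags)


print(toy_psi_ca({b"alice", b"bob", b"carol"}, {b"bob", b"carol", b"dave", b"erin"}))
```

In this reading the sender finishes after a single batch of messages and the server works only once it holds $k_2$ and the queries, matching the observation that neither needs to stay online throughout.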
Two-party PSI-CA Comparison. Both DH-based and delegated PSI-CA [18] protocols are secure against a semi-honest adversary, but the latter requires two non-colluding servers. Note that one can use the protocol proposed in [33] to implement PSI-CA; however, that protocol is much more expensive than DH-based PSI-CA. The PSI-CA implementation of [15,43] is not available (see footnote 2), so we exclude them from the comparison. In addition, we compare Protocol 10 with the ROOM-based protocol [41]. The two-party DotProd of [41] consists of two expensive steps: ROOM and a generic dense matrix multiplication. We only report the performance of ROOM in settings where [41] performs best. We use the DH-based PSI code implemented by [39] with the fastest Curve25519 implementation from libsodium. For a fair comparison, we run the implementation of delegated PSI-CA [18] and DH-based PSI on the same benchmark machine and network settings. Note that [18] only provides the implementation of their protocol building blocks; thus, there are no performance results for the WAN setting. The times (see footnote 3) for ROOM are taken from [41, Figure 17] and [31, Table 2], initially provided for a database of 50,000 items and for 5,000 and 50,000 queries. Table 4 presents the performance of each PSI-CA protocol. When comparing the protocols, we find that the running time of Protocol 10 is 10-100× faster than that of the prior works. In addition, our protocol requires 2-5× less bandwidth than them.
Performance of Heatmap Computation. In the two-party setting, executing the heatmap computation essentially involves multiple DotProd or PSI-CA executions. Similar to [4], we want to evaluate our protocol for smaller nation-states such as New York City or Singapore, which have a population of around $N = 2^{23}$. Concretely, we consider a case in which the MNO has a matrix $V$ of size $N \times M$ and the HHS has a vector $h$ of size $N$, where $N = 2^{23}$ and $M = 2^{15}$. The parties need to perform $M$ DotProd instances $h \cdot v_j$, $j \in [M]$, where $v_j$ is the $j$-th column of $V$. Recall that $h$ and $v_j$ are binary vectors that indicate whether an individual tested positive for COVID-19, and whether this individual visited a place near the cell tower $j$, respectively. Among $N = 2^{23}$, we assume that there are $m_2 = 2^{12}$ new positive cases per day [2], and each patient visits 4 places per day on average. We run $M = 2^{15}$ instances of Protocol 10 with the MNO's set size $m_1 = 2^{14}$ and the HHS's set size $m_2 = 2^{12}$, and find that our protocol takes about 10 minutes using a single thread. On the other hand, [4] reports about 90 minutes while using 96 threads and a stronger benchmark machine (see footnote 4). Therefore, we estimate that our protocol is at least 50× faster than [4]. This is because our protocol is based on symmetric-key operations while [4] relies heavily on public-key operations. In addition, [4] requires that the participants agree on database indices (i.e., data alignment before running the heatmap computation). Using PSI-CA, we can remove this requirement: a party's input can be a set of patient/visitor ids (instead of a vector/matrix).

Footnote 3: Unknown benchmark machine.
Performance of ARL. Based on the DotProd performance, we estimate the performance of our ARL. In the two-party setting, each party $P_i$, $i \in [2]$, locally computes a list $L^i_1$ of frequent itemsets that have only one attribute. The parties sequentially invoke DotProd to compute the lists $L_k$ of frequent itemsets that have exactly $k$ attributes, until the next list is empty (say $L_{d+1}$ is empty). Assume that each attribute/vector in $L_k$, $k \in [2, d]$, has a Hamming weight $t_k$. Also, assume that each $C_k$ has $|C_k|$ candidates. The performance of our ARL is then the sum, over the lists, of $|C_k|$ times the cost of a two-party DotProd with Hamming weight $t_k$. According to Table 6, we estimate that our ARL would take under hours to compute ARL of the database with million records.

Performance of Multi-party Protocols
PSI-CA Protocol. The running times and communication overhead of our server-aided multiparty PSI-CA are shown in Table 5 (Appendix). The protocol is asymmetric with respect to the server, the receiver $P_1$, and the other parties $P_i$, $i \in [2, n]$; thus, we report the performance results of these parties separately. In our protocol, the workload of the receiver is light, as it only needs to call $m$ AES instances. The majority of the receiver's running time is spent waiting for other parties to finish their work. For example, $P_1$ takes 34.74 seconds to compute PSI-CA (or DotProd) with $n = 8$ and $m = 2^{20}$ (or $t = 2^{20}$) in the LAN setting. Also, since the server plays the role of the receiver in most OPPRFs, its communication cost is the highest among all participants. For $n = 8$ and $m = 2^{20}$ (or $t = 2^{20}$), the PSI-CA (or DotProd) requires 3305 MB on the server's side.
Table 1 presents the performance of our "server-less" multiparty PSI-CA protocol in both LAN and WAN settings. Similar to the server-aided protocol, we separately report the performance results of $P_1$, $P_2$, $P_n$, and the other parties $P_i$, $i \in [3, n-1]$. Unlike the server-aided protocol, this protocol relies only on OKVS (i.e., it makes use of symmetric-key operations only). We find that our protocol scales to large input sets (e.g., $m = 2^{20}$) with a large number of participants (e.g., $n = 16$). For $n = 16$ and $m = 2^{20}$ (or $t = 2^{20}$), our protocol requires only 6 seconds with a total communication cost of 1GB.

Comparison with Prior Work. The three-party PSI-CA protocol [34] can be applied to multi-party cases by letting all the $n$ parties secret-share their sets of $m$ items to three parties/leaders $P_1, P_2, P_3$; the three leaders then jointly compute the PSI-CA output. The three leaders conduct the computation in the honest-majority model, which yields a security assumption similar to that of our server-less protocol, in which $P_1$, $P_2$, $P_n$ act as leaders. To implement an $n$-party PSI-CA, with each party having $m$ input items, the protocol of [34] needs to run PSI-CA on a total of $nm$ secret-shared input items. Note that [34] only considers computing the PSI-CA for two sets, each of $m$ items. Thus, the running time and communication cost of their protocol reported in [34, Figure 8] are for computing PSI-CA on a total of $2m$ secret-shared input items. For a fair comparison, we report the performance of our protocol and that of [34] for a total of $nm$ input items. For example, computing PSI-CA for $n = 2^3$ parties, each with $m \in \{2^{14}, 2^{18}, 2^{22}\}$, results in computation on a total of $nm \in \{2^{17}, 2^{21}, 2^{25}\}$ elements. This is equivalent to the experimental results for the two-party PSI-CA using [34] with total input sizes $\{2 \cdot 2^{16}, 2 \cdot 2^{20}, 2 \cdot 2^{24}\}$, which are reported in [34, Figure 8], where each party has $\{2^{16}, 2^{20}, 2^{24}\}$ input items, respectively (i.e., one needs to execute the two-party PSI-CA of [34] with each input set of $nm/2$ items). Since the implementation of [34] is not publicly available, we take numbers from the publication for the comparison with our protocol. We present the detailed performance comparison in Table 3 (see footnote 5). Our protocol is about 1.5× faster than [34] for sufficiently large $m$. We also note that when these leaders collude, our protocol only reveals the intersection items, while [34] leaks all input items to the adversary.
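The size matching used in this comparison can be checked directly. Writing $n$ for the number of parties and $m$ for the per-party set size (the symbols assumed here), the three configurations with $n = 2^3$ give:

```latex
\begin{align*}
n \cdot m &= 2^{3} \cdot 2^{14} = 2^{17} = 2 \cdot 2^{16},\\
n \cdot m &= 2^{3} \cdot 2^{18} = 2^{21} = 2 \cdot 2^{20},\\
n \cdot m &= 2^{3} \cdot 2^{22} = 2^{25} = 2 \cdot 2^{24}.
\end{align*}
```

Each total therefore equals the combined input of a two-party run in which both parties hold $nm/2$ items, which is why the two-party figures with $2^{16}$, $2^{20}$, and $2^{24}$ items per party are the matching data points.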
As far as we know, [10]'s implementation is not publicly available. Thus, we take their reported run times from [10]. For the most direct comparison, we used the same machine configuration (2× 36-core Intel Xeon 2.30GHz, 256GB of RAM) and network settings to evaluate their and our protocols. We compare our "server-less" protocol with [10] for the case of $n = 4$ with one corrupted party (no collusion), each party with $m \in \{2^{12}, 2^{16}, 2^{20}\}$ items. We show an improvement of 1.6-4× in the run time, and 3.5-4× in the bandwidth cost. We report the performance numbers in Table 2. Our server-less protocol with $n = 16$ and $m = 2^{20}$ requires only 7.74s in the LAN setting (see Table 1). From Table 2, [10] with $n = 4$ requires 23.8s in the same setting. Our protocol with $n = 16$ is already 3.07× faster than [10] with $n = 4$; thus, we do not present the comparison of the two protocols for larger $n$.
Performance of Heatmap Computation. The complexity of our heatmap protocol is linear in the number of MNOs. Using the parameters of the two-party heatmap, where each MNO has a matrix of size $2^{23} \times 2^{15}$ and HHS has a vector of size $2^{23}$, we estimate that our protocol takes about one hour if there are 6 MNOs involved in the protocol execution. Note that our protocol does not reveal additional information other than the output, namely how many patients visit a certain area. In contrast, [4] only works in the two-party setting. In real-world scenarios, there are many MNOs. If one uses only their protocol, where the HHS executes a vector-matrix multiplication with each MNO and then computes the "global" heatmap, the solution leaks extra information: the individual result of each vector-matrix multiplication.
Performance of ARL. Similar to the two-party ARL, the performance of our multi-party ARL is determined by the cost of the underlying $n$-party DotProd executions.

Proof. We exhibit simulators $\mathsf{Sim}_S$, $\mathsf{Sim}_R$, and $\mathsf{Sim}_C$ for simulating the views of corrupt S, R, and C, respectively, each of which consists of the randomness, input, output, and received messages during the execution of the protocol. We then argue the indistinguishability of the produced transcript from the real execution.

with its input and appends the output to the view. To simulate Step 3, $\mathsf{Sim}_R$ generates $m_1$ random points, constructs an OKVS over them using Encode, and appends it to the view. We now argue that the output of $\mathsf{Sim}_R$ is indistinguishable from the real execution. For this, we formally show the simulation by proceeding with the sequence of hybrid transcripts $T_0, T_1, T_2$, where $T_0$ is the real view of R, and $T_2$ is the output of $\mathsf{Sim}_R$.
-Hybrid 1. Let $T_1$ be the same as $T_0$, except that the output of the server-aided OPRF execution is replaced by the output of $\mathsf{Sim}^{\text{soprf}}_R$. It is easy to see that $T_0$ and $T_1$ are computationally indistinguishable.
-Hybrid 2. Let $T_2$ be the same as $T_1$, except that the OKVS is constructed on randomly selected points. Since the values encoded in the real execution are also pseudorandom, the two constructed OKVS tables are computationally indistinguishable.
• Passively Corrupted C. Since C only participates in the execution of the server-aided OPRF as the server, $\mathsf{Sim}_C$ is exactly the same as $\mathsf{Sim}_C$ in the server-aided OPRF. According to Theorem 1, its output is computationally indistinguishable from the real execution, and we omit the proof here. □

Table 4: Run time (in seconds), communication cost (in MB), and system requirement of the two-party PSI-CA (or DotProd) protocols: DH-based PSI-CA [25,32], OSN-based PSI-CA [21], Catalic [18], ROOM as a building block in DotProd [41], and Protocol 10 (a simpler variant of the [27] PSI protocol) for the sender set size $m_1$ and receiver set size $m_2$. Cells with − denote trials that are not supported by the protocol.

Table 4 column headers: DH-PSICA [25]; OSN-based PSI-CA [21]; ROOM [41]; Catalic [18]; Protocol 10.

Proof. We exhibit simulators $\mathsf{Sim}_S$, $\mathsf{Sim}_R$, and $\mathsf{Sim}_C$ for simulating the views of corrupt S, R, and C, respectively, and argue the indistinguishability of the produced transcript from the real execution.
• Passively Corrupted S. $\mathsf{Sim}_S$ simulates the view of corrupt S, which consists of S's randomness, input, output, and received messages. $\mathsf{Sim}_S$ proceeds as follows. It chooses a random key $k = (k_0, k_1) \leftarrow \{0, 1\}^{2\kappa}$, calls a simulator

FUNCTIONALITY 4. PSI Cardinality - $\mathcal{F}_{\text{PSI-CA}}$
Parameters: $n$ parties $P_1, \ldots, P_n$; an untrusted server C; the set size $m$.
Behavior:
• Wait for an input set $X_i$ of $m$ distinct items from $P_i$.
• Give the server C nothing.
• Give $P_1$ the intersection set size $|\bigcap_{i=1}^{n} X_i|$.
Theorem 1. Protocol $\Pi_{\text{soprf}}$ securely implements the functionality $\mathcal{F}_{\text{soprf}}$ in the presence of an adversary who may passively corrupt either S, R, or C. The formal proof of Theorem 1 is presented in Appendix A.1.

PROTOCOL 7. Server-Aided Shuffled OPRF - $\Pi_{\text{soprf}}$
Parameters:
• Set size $m$; a PRF $F$.
• A sender S, a receiver R, a non-colluding semi-honest server C.

Theorem 3. Protocol 10 securely implements Functionality 4 ($\mathcal{F}_{\text{PSI-CA}}$) with $n = 2$ in the $\mathcal{F}_{\text{soprf}}$-hybrid model, in the presence of an adversary who may passively corrupt either S, R, or C. The formal proof of Theorem 3 is presented in Appendix A.3.

A CORRECTNESS AND SECURITY PROOF
A.1 Server-Aided Shuffled OPRF
Theorem 1. Protocol $\Pi_{\text{soprf}}$ securely implements its functionality $\mathcal{F}_{\text{soprf}}$ in the presence of an adversary who may passively corrupt either S, R, or C.

$\mathcal{F}_{\text{soprf}}$-hybrid model, in the presence of an adversary who may passively corrupt either S, R, or C.
Proof. We exhibit simulators $\mathsf{Sim}_S$, $\mathsf{Sim}_R$, and $\mathsf{Sim}_C$ for simulating the views of corrupt S, R, and C, respectively, and argue the indistinguishability of the produced transcript from the real execution.
• Passively Corrupted S. $\mathsf{Sim}_S$ simulates the view of corrupt S, which consists of S's randomness, input, output, and received messages. $\mathsf{Sim}_S$ proceeds as follows. It chooses a random key $k = (k_0, k_1) \leftarrow \{0, 1\}^{2\kappa}$, calls a simulator $\mathsf{Sim}^{\text{soprf}}_S$ of the corrupt sender in the server-aided OPRF, and appends its output to the view. Since $\mathsf{Sim}^{\text{soprf}}_S$ is trivial (i.e., S does not receive any messages during the execution of $\Pi_{\text{soprf}}$), it is easy to see that the view output by $\mathsf{Sim}_S$ is computationally indistinguishable from the real execution.
• Passively Corrupted R. $\mathsf{Sim}_R$ simulates the view of corrupt R, which consists of R's randomness, input, output, and received messages. $\mathsf{Sim}_R$ proceeds as follows. It calls a $\mathsf{Sim}^{\text{soprf}}_R$

A.3 Server-Aided Two-party PSI-CA
Theorem 3. Protocol 10 securely realizes Functionality 4 ($\mathcal{F}_{\text{PSI-CA}}$) with $n = 2$ in the $\mathcal{F}_{\text{soprf}}$-hybrid model, in the presence of an adversary who may passively corrupt either S, R, or C.

of the corrupt sender in the server-aided OPRF, and appends its output to the view. Since S does not receive any messages in the protocol, it is easy to see that the view output by $\mathsf{Sim}_S$ and the view in the real execution are identical.
• Passively Corrupted R. $\mathsf{Sim}_R$ simulates the view of corrupt R, which consists of R's randomness, input, output, and received messages. $\mathsf{Sim}_R$ proceeds as follows. It calls a $\mathsf{Sim}^{\text{soprf}}_R$ with its input and appends the output to the view. To simulate Step 2, $\mathsf{Sim}_R$ generates a random set of $m_1$ values $X = \{x_1, \ldots, x_{m_1}\}$, chooses a random key $k' = (k'_1, k'_2) \leftarrow \{0, 1\}^{2\kappa}$, computes the PRF values of $X$ under $k'$, and appends them to the view.

An oblivious PRF (OPRF) is a 2-party protocol in which the sender learns a PRF key $k$ and the receiver learns the PRF values $F(k, x_1), \ldots, F(k, x_m)$. Here, $F$ is a PRF and $(x_1, \ldots, x_m)$ are inputs chosen by the receiver. Functionality 1 presents a variant of OPRF where the receiver obtains outputs of multiple statically chosen queries.
Our protocols are extremely efficient when the upper bound on the Hamming weight of the vectors, denoted $t$, is in $o(N)$. The dot product of $n$ vectors $v_1, \ldots, v_n$, each with $N$ elements, is defined by $\sum_{j=1}^{N} \prod_{i=1}^{n} v_i[j]$ and is called DotProd. DotProd is presented in Functionality 5. The highlighted text is required for the server-aided case.

S has a key $k = (k_1, k_2)$ where $k_i \in \{0, 1\}^{\kappa}$, R has a set of queries $\{x_j\}_{j \in [m]}$, and the server C has no input. S does not receive an output, whereas R obtains $\{y'_{\pi(1)}, \ldots, y'_{\pi(m)}\}$, where $y'_j = F'(k, x_j)$ and $\pi : [m] \to [m]$ is a random permutation. The output of C is the permutation $\pi$. Clearly, R cannot associate the response $y'_j$ with the query $x_j$, as all responses are pseudorandom. Figure 6 formally presents the ideal functionality $\mathcal{F}_{\text{soprf}}$. Since the sender can combine the messages of the OKVS and OPRF executions before sending them to the receiver, this server-aided OPRF protocol takes one round.

Parties are a sender S, a receiver R, and a server C. Set sizes $m_1$, $m_2$.
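acts as a sender with its set of programmed points as input,
• $P_1$ acts as a cloud server with no input.
• $P_n$ acts as a receiver with input $X'_n = F(k, X_n)$. $P_n$ obtains the result $y_{i,j}$ on the query $x'_{n,j}$.
(6) For every $j \in [m]$, $P_n$ computes $v_j$ as the XOR of the results $y_{i,j}$ over all senders $P_i$ and its own zero share for $x_{n,j}$. Then, $P_n$ sets $V$ to be $\{v_1, \ldots, v_m\}$.
(7) $P_1$ and $P_n$ invoke the server-aided $\mathcal{F}_{\text{PSI-CA}}$ functionality with $P_2$ as a server, where
• $P_n$ acts as a sender with input $V$
• $P_2$ acts as a cloud server with no input
• $P_1$ acts as a receiver with an input derived from $F(k, X_1)$, and obtains the cardinality of the intersection of the two inputs.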
untrusted party $P_n$ who, however, has a private input set $X_n$. Recall that in the server-aided PSI-CA protocol, the cloud server C has no input, but obtains from $P_1$ the PRF values $F(k, X_1)$, which are used to invoke an OPPRF with parties $P_i$, $i \in [2, n]$. In the problem of $(n-1)$ parties, however, party $P_n$ (who plays the role of C) does have an input $X_n$. Thus,

$\ldots, P_n$. For a cell tower $j \in [1, M]$, the MNO $P_i$ ($i \in [n]$) has the vector $v^i_j$ of $N$ elements. $v^i_j[e]$ indicates whether a subscriber $e$ connects to cell tower $j$ of the MNO $P_i$ (we assume that the $j$-th cell tower of all MNOs covers the same geographical area; this should be adjusted in practice). The sum of the dot products $\sum_{i=1}^{n} (h \cdot v^i_j)$ indicates how many individuals, across different MNOs, visit a certain area. In our multiparty heatmap, if $P_0$ invokes DotProd with each MNO $P_i$, where $P_0$'s input is $h$ and $P_i$'s input is $v^i_j$, then $P_0$ learns extra information: each term of the sum $\sum_{i=1}^{n} (h \cdot v^i_j)$. To address the issue, we modify the underlying shuffled OPPRF protocol of DotProd. At a high level, C computes the PRF values of all MNOs $P_i$, $i \in [n]$, and permutes them before returning them to $P_0$. The formal description of our multi-party heatmap computation is presented in Protocol 13.

Table 1: Run time (in seconds) and communication cost (in MB) of our "server-less" multiparty PSI-CA protocols for $n$ parties on sets of size $m$.

Table 2: Run time (in seconds), communication cost (in MB), and number of rounds of [10] and our protocols for 4 parties and no collusion. Each party has a set of size $m$. The numbers of [10] are for PSI itself (not PSI-CA).

Table 3: Run time (in seconds) and communication cost (in MB) of [34] and our server-less protocol for $n$ parties. Each party has a set of size $m$.
Here, the DotProduct term is the cost of $n$-party DotProd with Hamming weight $t_k$, and we assume that each attribute/vector in the list $L_k$ has a Hamming weight $t_k$. According to the performance of our multi-party DotProd (or multi-party PSI-CA) shown in Tables 5 and 1, we estimate that our ARL would take under a day to compute ARL of the database with million records.
• Passively Corrupted S. S does not receive anything during the execution of the protocol, so it is trivial to simulate its view.
• Passively Corrupted R. $\mathsf{Sim}_R$ randomly selects a pair of keys $(k_1, k_2)$ and appends $k_1$ to the view. Given the PRF $F$, $\mathsf{Sim}_R$ computes $Y'' = F(k_2, F(k_1, Y))$ for the input set $Y$ and appends a permutation of $Y''$ to the view. We now argue that the view output by $\mathsf{Sim}_R$ is indistinguishable from the real one. The way that $\mathsf{Sim}_R$ selects keys is identical to the real execution under the assumption that the receiver does not collude with the server. Outputs of the PRF under different keys are computationally indistinguishable. So this simulated view is computationally indistinguishable from the real execution.
• Passively Corrupted C. $\mathsf{Sim}_C$ randomly selects a pair of keys $(k_1, k_2)$ and appends $k_2$ to the view. Given the PRF $F$, $\mathsf{Sim}_C$ computes $Y' = F(k_1, Y)$ for a randomly selected input set $Y$ and appends it to the view. We now argue that the view output by $\mathsf{Sim}_C$ is indistinguishable from the real one. The way that $\mathsf{Sim}_C$ selects keys is identical to the real execution under the assumption that the receiver does not collude with the server. Outputs of the PRF under different keys are computationally indistinguishable. So this simulated view is computationally indistinguishable from the real execution. □

Table 5: Run time (in seconds) and communication cost (in MB) of our server-aided multiparty PSI-CA protocols for $n$ parties on sets of size $m$.

Table 6: Running time (in seconds) and communication cost (in MB) of the two-party PSI-CA protocol [27] (Protocol 10) for the sender set size $m_1$ and receiver set size $m_2$.