Dynamic Volume-Hiding Encrypted Multi-Maps with Applications to Searchable Encryption



Introduction
Structured encryption (STE) schemes, introduced by Chase and Kamara [10], enable a client to outsource the storage of an encrypted version of their structured data to an untrusted third-party server (such as a cloud storage provider). The encrypted data is stored in a structured manner so that the client may still perform operations on it without the server ever viewing the plaintext data. For privacy, the ideal goal is to ensure that the adversarial server does not learn any information about the outsourced data or operations performed by the client. Currently, this ideal privacy is only known to be achievable by using expensive cryptographic primitives such as fully homomorphic encryption or oblivious RAMs (ORAMs). Instead, STE schemes aim to strike a delicate balance between efficiency and privacy by enabling some leakage that is upper bounded by a well-defined and "sensible" leakage function to obtain efficiency that is necessary for real-world applications.
In our work, we focus on the encrypted multi-map (EMM) primitive, an important example of an STE scheme that manages collections of pairs of labels and value tuples, where each tuple consists of one or more values. EMMs form the basis of many important applications where clients outsource encrypted data to an untrusted cloud server. By leveraging EMMs, one can build systems that enable searching over encrypted data (also known as searchable encryption [39,11,9]) or performing SQL queries over encrypted databases [19].
From a technical perspective, there are significant difficulties when dealing with operations that update the encrypted data, even when ignoring volume-hiding requirements. At a high level, EMMs (and, generally, STE schemes) attempt to find a delicate balance between functionality, efficiency and privacy. As dynamicity increases functionality, EMMs must ensure that only minimal loss of efficiency and/or privacy is incurred compared to the static setting. Due to this difficulty, prior works have explored and defined standard privacy requirements in dynamic settings to avoid privacy degradation. Formally, these standard notions for dynamic STE schemes are forward and backward privacy [40,6,7]. Forward privacy guarantees that insertion operations do not leak information on previous queries. Backward privacy addresses a similar concern with respect to deletion, ensuring that it is not possible to apply a query to data that has been deleted. Enabling update operations only becomes more difficult when studying volume-hiding EMMs. For static volume-hiding EMMs, schemes must ensure volume is not leaked only on query operations. In the dynamic setting, volume must not be leaked by either query or update operations. Furthermore, designers must ensure that combining leakage between query and update operations does not reveal volumes either.
In our work, we will design dynamic volume-hiding encrypted multi-maps that provide forward and backward privacy while simultaneously being efficient.

Our Contributions
As our main contribution, we present dynamic volume-hiding EMMs that are forward and backward private with better efficiency than prior works. The state-of-the-art dynamic volume-hiding scheme was presented in the original work by Kamara and Moataz [20] and is denoted as the Dense Subgraph Transform (DST). For an MM with n total values and maximum volume ℓ, DST requires O(ℓ log n) overhead for both queries and updates. Furthermore, DST supports only a subset of update operations and is not forward private, a standard security notion of the dynamic setting. With this in mind, there are four main challenges that we address in our work.
1. Dynamicity and Hiding Volume. The volume-hiding scheme in [20] only enables adding, deleting or overwriting the entire tuple associated with a label. In particular, users may not append a single value or remove a value from an existing value tuple. While one can achieve this functionality using a query before an update, it turns out that this degrades privacy significantly (see Appendix A). This motivates the following question: Is it possible to construct fully-dynamic volume-hiding schemes with the ability to add/remove a set of values from an already existing tuple that is both efficient and private?
2. Forward and Backward Privacy. Introduced by [6,7] for the special case of dynamic searchable encryptions, forward and backward privacy are the de-facto standard security notions for dynamic STEs to protect against various injection attacks [43]. At a high level, these notions guarantee that modified data is not leaked until a query for the data is performed. Prior volume-hiding schemes [20,35] are not forward and backward private, which motivates the following problem: Is it possible to construct volume-hiding dynamic EMMs that are both forward and backward private?
3. Efficiency. DST [20] requires O(ℓ · log n) overhead for both queries and updates, which is larger than the O(ℓ) overhead needed by the best static volume-hiding scheme [35] and raises the following question: Is it possible to construct a dynamic volume-hiding scheme with better efficiency while simultaneously providing forward and backward privacy?
4. Leakage. Beyond forward and backward privacy, our schemes will aim to leak as little information as possible. We identify three leakages that are necessary for functionality or efficiency: MM size n, maximum volume ℓ and label equality leakage (whether two operations are for the same label). The MM size n is necessary as the server stores the EMM. We show that hiding the maximum volume ℓ would require Ω(n) communication for any reasonable error probability in Appendix B. Patel et al. [34] showed that avoiding label equality leakage would require overhead equivalent to ORAMs [13,31,3] (see Appendix I for more details). It is not a coincidence that prior works [20,35,42] have leakage of n, ℓ and label equality. This leads to the natural question: Is it possible to construct dynamic volume-hiding schemes supporting the above properties with minimal leakage?
Table 1: A comparison of amortized query and update overhead of dynamic schemes that provide either volume-hiding, forward or backward privacy with our schemes. We use the following abbreviations: roundtrips (RT), volume-hiding (VH), forward privacy (FP) and backward privacy (BP). For notation, n denotes the maximum number of values in the multi-map MM and m denotes the number of unique labels in the MM. For volume-hiding schemes, ℓ represents the maximum volume. We denote ℓ_label to be the volume associated with the queried label and u_label to be the number of updated values. Correct %-age refers to the percentage of returned correct values. For client storage, an asterisk (*) means client storage may increase up to O(n) temporarily. For [42], b refers to the number of batch updates. In [44], O_λ(x) means there are hidden λ factors. Label equality leakage is referred to by leq.
We present two schemes 2ch FB and 2ch s FB that address all four challenges simultaneously and present different trade-offs between client storage and update overhead. We remind the reader that, in the following statements, ℓ is the maximum length of any value tuple and n is the maximum total number of values.
Theorem 1 (Informal). There exists a fully-dynamic, volume-hiding, forward and type-II backward private EMM, 2ch FB with query overhead of O(ℓ log log n), amortized update overhead of O(ℓ), server storage of O(n) and client storage of size O(m) where m is the number of unique labels in the MM.
2ch FB achieves all our goals of dynamicity, volume-hiding, efficiency, forward/backward privacy and minimal leakage of only n, ℓ and label equality. However, 2ch FB requires O(m) client storage, which is common to the majority of forward private schemes, such as schemes in [6,7]. We present 2ch s FB with smaller permanent client storage at the cost of slightly larger overhead.
Theorem 2 (Informal). There exists a fully-dynamic, volume-hiding, forward and type-II backward private EMM 2ch s FB with query overhead of O(ℓ log log n), amortized update overhead of O(ℓ log n), server storage of O(n) and permanent client storage of size at most f (n), for every function f (n) = ω(log n).
To our knowledge, 2ch FB and 2ch s FB are the first dynamic EMM schemes that simultaneously provide volume-hiding, forward and backward privacy while being concretely efficient with a small number of roundtrips and minimal leakage of n, ℓ and label equality. A comparison of the asymptotic performance of our schemes and prior dynamic schemes obtaining at least one of volume-hiding, forward or backward privacy is presented in Table 1. The experimental evaluation in Section 5.1 shows that our schemes also improve on the concrete performance of prior schemes. It also shows that we enable dynamicity without incurring any additional cost when compared with prior static schemes [35]. This is very surprising as static schemes are optimized for query communication whereas experimentally our schemes, despite having to support a very rich set of dynamic operations, have query cost comparable with the static scheme of [35]. In addition, our schemes exhibit a 2-3x improvement in query communication cost over DST [20], the best existing non-lossy volume-hiding dynamic scheme, while supporting a wider range of dynamic operations and providing stronger security guarantees.
Discussion about Concurrent Works. Zhao et al. [44] give two schemes for volume-hiding dynamic EMMs. Even though their schemes offer stronger type-I backward privacy and avoid leaking ℓ (along with similar security guarantees elsewhere), they fail to return the correct value tuple for all instances. In particular, only an ℓ/n-fraction is guaranteed to be returned. However, their schemes obtain smaller query computation and communication of O(ℓ + log n) compared to our schemes. Wang and Chow [42] also construct dynamic volume-hiding EMMs with the same privacy guarantees as ours along with very small server storage overhead by using consistent hashing. To guarantee forward and backward privacy, they cache the update operations and then execute them as part of the first available query operation.
A query then requires time proportional to the number of cached updates as it needs to handle them. To improve query performance, they allow updates to be processed in batches. A set of updates forms a batch if they arrive simultaneously. Each batch update is stored in a separate volume-hiding EMM. A query consists of querying all uploaded EMMs, increasing query overhead for each set of batch updates. Thus, the ability to handle updates in batches does not improve the worst-case running time, unless the client is willing to accumulate the updates in local memory to form a batch. Our schemes also employ caching of the update operations (that is, the updates are not instantly implemented on the main data structure). However, this is done while guaranteeing that the worst case query overhead is independent of the number of updates. We note that when the number of batched updates is small such as b = O(1), the scheme in [42] has smaller query communication O(bℓ) but either larger query computation O(bℓ log n) or client storage O(m) compared to 2ch FB and 2ch s FB respectively.
Discussion about Backward Privacy. Both 2ch FB and 2ch s FB provide type-II backward privacy as defined in [7]. We note that there are several schemes that provide stronger type-I backward privacy. However, current type-I backward private schemes are expensive and resort to usage of ORAMs. As a result, we do not consider type-I backward privacy and leave it as an open problem for future work.
Discussion about Oblivious RAMs. From a theoretical perspective, ORAMs [13,31,3] address the problems of dynamicity, forward and backward privacy (as outlined in [20]). However, ORAMs are expensive as they require a logarithmic number of client-server roundtrips or fully homomorphic encryption schemes. As evidenced by prior works such as [37], the high number of roundtrips of ORAMs significantly hinders efficiency.
In our work, we ensure our schemes use either 1 or 2 roundtrips and only use cheap, symmetric primitives instead of expensive cryptographic tools such as ORAMs and FHE. DST [20] has the same asymptotic overhead as an ORAM, but is faster in practice due to 1 roundtrip and no FHE usage.
Discussion about Parallelism. Prior works [21,40,26] investigated enabling the client to issue multiple operations in parallel. In our work, we will focus on constructing dynamic volume-hiding schemes in the sequential setting that were previously not known to exist with our efficiency and privacy guarantees. To our knowledge, DST [20] and DSSE [44] may enable issuing parallel queries (but we could not verify this). We believe all other volume-hiding works (including [42] and ours) do not have this property. We leave it as an open problem to enable parallel queries in our schemes.

Structured Encryption
In an STE scheme, a client may encrypt and outsource storage of the data structure to a server. The encryption is structured in such a way that the underlying data structure may be operated on by the client in a private manner. The notion of STE was first presented by Chase and Kamara [10]. While we consider generic definitions for encrypting any data structure, our work focuses on MMs as they are a simple data structure with several applications. STE schemes may be differentiated using several criteria. Static STE schemes only enable clients to query the underlying data structure while dynamic STE schemes additionally enable clients to update the underlying data structure. We will focus on dynamic STE schemes that consist of three protocols to be executed between the client and the server: the Setup protocol to compute the initial encryption of the data structure, the Query protocol to query the data structure, and the Update protocol to update the data structure.
The number of communication rounds between the client and server is an important measure. We say that an operation of an STE scheme is r-interactive if it can be completed in at most r rounds of communication between the client and the server. An STE scheme is r-interactive if all operations use at most r rounds of interaction. In our work, we will exclusively focus on STE schemes with a low number of rounds of interaction as they are more practical.
Definition 1. An r-interactive dynamic STE scheme Σ = (Setup, Query, Update) consists of the following protocols between client C and server S: The setup protocol is executed jointly by C and S where C receives (1^λ, params, DS) and S receives 1^λ. At termination, C receives its state st and S receives the encrypted data structure EDS.

Adaptive Security
We consider the notion of security for STE schemes against an honest-but-curious PPT adversary A with respect to a leakage function L = (L_Setup, L_Query, L_Update). The leakage function is an upper bound on the amount of information leaked to the adversary in the sense that (1) the initial setup reveals no information beyond L_Setup; (2) a query reveals no information beyond L_Query; and (3) an update operation reveals no information beyond L_Update. The leakage on an operation may depend on all the previous operations and the setup phase. We consider adaptive security, first formalized by Curtmola et al. [11], in which the adversary views the execution of one operation before choosing the next operation. The definition utilizes the real-ideal paradigm with a stateful, honest-but-curious, PPT adversary A and a stateful, PPT simulator S.
More formally, let Σ = (Setup, Query, Update) be a dynamic STE and consider the following real game Real_{Σ,A} and ideal game Ideal^{L,S}_{Σ,A} between a stateful PPT adversary A and a challenger C. In the ideal game, S is a stateful PPT simulator and L = (L_Setup, L_Query, L_Update) is a leakage function.
Real_{Σ,A}(1^λ, z): Adversary A(1^λ, z), on input the security parameter 1^λ and the auxiliary information z, outputs an input data structure DS. The challenger C executes Setup on DS obtaining client state st and encrypted data structure EDS. C sends EDS to A. For i = 1, ..., poly(λ):
• A adaptively picks operation o_i.
• In both cases A plays the role of the server S and challenger C plays the role of the client C. Therefore, A receives a transcript of the protocol and updates EDS by setting EDS ← EDS_new and C updates the client state by setting st ← st_new.
Ideal^{L,S}_{Σ,A}(1^λ, z): Adversary A(1^λ, z), on input the security parameter 1^λ and the auxiliary information z, outputs an input data structure DS. The challenger C runs the simulator S on input leakage L_Setup(DS, n, ℓ) and the auxiliary information z to obtain the encrypted data structure EDS that is sent to the adversary A.
• In both cases A plays the role of the server S and S the role of the client C. Therefore, A receives a protocol transcript and an updated version of EDS. Note, S may deviate from the protocol.
Definition 2 (Adaptive Security). STE scheme Σ is adaptively L-secure if there exists a stateful, PPT simulator S such that for all stateful, PPT adversaries A and all auxiliary information z ∈ {0, 1} * :

Multi-Maps
An MM stores a collection of label and value tuple pairs (label, ⃗v) where label is from the label universe L and ⃗v is a tuple of values from the value universe V.
If the size n or maximum volume ℓ may change, the new values must be submitted as parameters to Update.
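As a small illustration of the two parameters used throughout, the following Python sketch models a plaintext multi-map as a dictionary of lists and computes its size n and maximum volume ℓ. The names `MultiMap`, `mm_size` and `max_volume` are hypothetical helpers introduced here for illustration and do not appear in the paper.

```python
from typing import Dict, List

# Assumption: a plaintext multi-map is modelled as label -> list of values.
MultiMap = Dict[str, List[str]]


def mm_size(mm: MultiMap) -> int:
    """Total number of values n stored across all labels."""
    return sum(len(values) for values in mm.values())


def max_volume(mm: MultiMap) -> int:
    """Maximum volume: length of the longest value tuple (0 if mm is empty)."""
    return max((len(values) for values in mm.values()), default=0)


mm: MultiMap = {
    "alice": ["doc1", "doc4", "doc7"],
    "bob": ["doc2"],
    "carol": ["doc3", "doc5"],
}
n = mm_size(mm)        # six values in total
ell = max_volume(mm)   # the longest tuple ("alice") has three values
```

In a volume-hiding scheme, a query for "bob" should be indistinguishable (in terms of response size) from a query for "alice", even though their true volumes differ.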

Label Equality Leakage
The label equality pattern leaks whether two operations are performed on the same label or not. For a sequence of operations We note that [34] proved a lower bound showing mitigating label equality leakage in any small way would require Ω(ℓ log n) computational overhead. Our schemes will all leak label equality to obtain better efficiency.
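To make the notion concrete, the sketch below computes a label equality pattern for a sequence of operations. Representing the pattern as a 0/1 matrix is an assumption made here for illustration; the helper name `label_equality` is hypothetical.

```python
from typing import List, Sequence


def label_equality(labels: Sequence[str]) -> List[List[int]]:
    """Entry [i][j] is 1 iff operations i and j target the same label.

    Assumption: the pattern is represented as a symmetric 0/1 matrix.
    """
    return [[int(a == b) for b in labels] for a in labels]


# Labels targeted by three consecutive operations (queries or updates).
op_labels = ["w1", "w2", "w1"]
leq = label_equality(op_labels)
```

Here operations 0 and 2 are linked because they share a label, while the labels themselves are not part of the pattern.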

Volume Hiding Leakage Functions
Volume-hiding leakage functions were introduced in [20] and formally defined in [35] for static schemes. We present a definition of a volume-hiding leakage function for dynamic EMMs that extends the definition of Patel et al.

C computes
such that i. The operation types and label equality leakage are the same: . n t and ℓ t must be valid size and volume upper bounds for MM t 0 and MM t 1 .
We denote by p^{A,L}_η the probability that A outputs η when playing game VHGame^{A,L}_η(n, ℓ).

Definition 3 (Volume-Hiding Leakage Functions).
A leakage function L = (L_Setup, L_Query, L_Update) is volume-hiding if and only if for all adversaries A and for all values n ≥ ℓ ≥ 1,
Definition 4 (Volume-Hiding Encrypted Multi-Maps). An EMM scheme Σ is volume-hiding if there exists a leakage function L = (L_Setup, L_Query, L_Update) such that:
1. Σ is adaptively L-secure according to Definition 2.
2. L is volume-hiding according to Definition 3.
We note this definition reflects that both the MM size n and maximum volume ℓ will grow over time as more operations occur. An upper bound on n and ℓ will be inherently leaked after each operation. In a concurrent work [44], an alternative definition is provided where ℓ is not leaked. In Appendix B, we show such a definition inherently requires large query communication. Even if we only want to guarantee that an ϵ-fraction of matching values is returned, we show a query communication lower bound of Ω(ϵn). In other words, if we want at least half the matching values, then the query communication is already linear. As a result, we choose to use a definition that leaks ℓ to ensure better efficiency and correctness. The construction in [44] adheres to our lower bound as it can only return an (ℓ/n)-fraction of matching values when using O(ℓ) query communication.
Discussion about Label Equality Leakage. In our volume-hiding definition, the adversary must choose two sequences with the same label equality leakage. Instead, we could have chosen a more general definition by parameterizing the game with some leakage function L_{label,op} over the labels and operations and forcing the adversary to submit two sequences with the same leakage with respect to L_{label,op}. We chose label equality as prior works [34] showed mitigating label equality requires large overhead similar to ORAMs (see Appendix I for more details). On the other hand, leaking only label equality is sufficient for faster constructions. Therefore, label equality seems to be the minimal leakage required to obtain efficient constructions faster than ORAMs.

Forward and Backward Privacy
Forward and backward privacy provide guarantees on the amount of information leaked to an adversary as the client performs update operations. We present the standard definitions of forward and backward privacy (readers may also refer to [6,7]).
Forward privacy guarantees that the leakage of update operations is independent of all previous operations. For any forward private leakage, an update o does not give any information on the sequence of operations O except the update operation itself.
Backward privacy controls the leakage viewed by the adversary during queries about previous deletion operations. In our work, we obtain type-II backward privacy where only the total number of updates and their timestamps are revealed for deleted items.
Bost et al. [7] formally defined three types of backward privacy, ranging from type-I with the strongest privacy to type-III with the weakest privacy. To define backward privacy, we need the following three additional leakage functions that take as argument a sequence O of operations, which is omitted for convenience.
that consists of the timestamps of all update operations that modify label. Finally, We note that Type-II backward privacy reveals the total number of updates performed on label and the timestamps of each update operation for label. All our constructions will be type-II backward private. We point readers to Appendix C for definitions of other types of backward privacy.

Cryptographic Tools
We will utilize pseudorandom functions (PRFs) and IND-CPA encryption. A PRF F guarantees that, for a secret seed, its output is computationally indistinguishable from that of a random function. In our proofs, we may model PRFs as random oracles. An IND-CPA encryption scheme SKE = (Gen, Enc, Dec) ensures each ciphertext is computationally indistinguishable from a random string.
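As a rough illustration of how a PRF can map a label to storage locations, the sketch below instantiates F with HMAC-SHA256 from the Python standard library. This instantiation, the helper names `prf` and `bin_index`, and the input encoding are assumptions made for illustration; the paper does not prescribe them.

```python
import hashlib
import hmac


def prf(key: bytes, message: bytes) -> bytes:
    """Assumed PRF instantiation: HMAC-SHA256 under a secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def bin_index(key: bytes, label: str, counter: int, choice: int, num_bins: int) -> int:
    """Derive a pseudorandom bin for the counter-th value of a label.

    `choice` (0 or 1) selects one of two candidate bins. The '|' separated
    encoding is a toy choice and assumes labels do not contain '|'.
    """
    message = f"{label}|{counter}|{choice}".encode()
    return int.from_bytes(prf(key, message), "big") % num_bins


key = b"\x01" * 32  # fixed demo key; a real client would sample it at random
first = bin_index(key, "w1", 0, 0, 1024)
second = bin_index(key, "w1", 0, 1, 1024)
```

Because the mapping is deterministic under the key, the client can recompute the same candidate bins at query time without storing them.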

Our Constructions
In this section, we present our new constructions. We start with a warm-up construction 2ch that achieves full dynamicity, efficiency and backward privacy but not forward privacy. Next, we present 2ch FB and 2ch s FB that build upon 2ch to obtain forward privacy with different efficiency trade-offs. Throughout this section, we focus on the simpler setting where the upper bounds on MM size n and volume ℓ are fixed through all operations. In Section 4, we present generic transformations to handle changing n and ℓ.

Problems with Naive Padding
We start by discussing a naive solution of adding padding to prior dynamic constructions (such as [6,7]) to obtain volume-hiding. At a high level, one could pad the storage and communication with sufficient dummy values to always return ℓ values. Unfortunately, this straightforward approach incurs O(nℓ) blowup in server storage, which can be very large for many values of ℓ (such as ℓ = √ n). Instead, we will utilize hashing (like prior works [20,35,42]) to enable re-using encryptions of real values as padding for multiple queries and avoid significant storage increase. As a result, all our constructions will require the minimal O(n) storage.

2ch: Warm-Up Scheme
We start from the optimal static volume-hiding scheme by Patel et al. [35]. Their construction utilizes cuckoo hashing [30,23] to embed data into server storage. Cuckoo hashing guarantees that each value is stored in one of two hash table locations or in a small client stash. To perform a query, one can simply access 2ℓ hash table locations, two for each of the ℓ possible values associated with the label. Unfortunately, inserting values with cuckoo hashing is much more complex. Cuckoo hashing insertion works in an iterative fashion: a value is placed into one of its two candidate locations and, if both locations are occupied, the algorithm displaces one of the resident values; the displaced value is then re-inserted using the same algorithm. This algorithm is not volume-hiding as the adversary learns whether certain entries are occupied or not by viewing how long the insertion algorithm runs.
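The following plaintext sketch illustrates why cuckoo insertion is content-dependent: the number of evictions returned by `insert` varies with which slots are already occupied. It is a simplified single-table variant with a stash, not the construction of [35]; the class `CuckooTable`, the hash helper `_h` and the eviction bound are assumptions for illustration.

```python
import hashlib
from typing import List, Optional, Tuple


def _h(seed: int, key: str, size: int) -> int:
    """Toy hash function: SHA-256 of seed and key, reduced modulo the table size."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % size


class CuckooTable:
    def __init__(self, size: int, max_evictions: int = 32) -> None:
        self.slots: List[Optional[str]] = [None] * size
        self.stash: List[str] = []
        self.max_evictions = max_evictions

    def _locations(self, key: str) -> Tuple[int, int]:
        size = len(self.slots)
        return _h(0, key, size), _h(1, key, size)

    def insert(self, key: str) -> int:
        """Insert key and return how many evictions were needed."""
        current, avoid = key, None
        for evictions in range(self.max_evictions):
            a, b = self._locations(current)
            for loc in (a, b):
                if self.slots[loc] is None:
                    self.slots[loc] = current
                    return evictions
            # Both candidates are occupied: displace a resident value, preferring
            # the slot that the current value was not just evicted from.
            victim_loc = b if avoid == a else a
            current, self.slots[victim_loc] = self.slots[victim_loc], current
            avoid = victim_loc
        self.stash.append(current)  # give up and keep the value in the stash
        return self.max_evictions

    def contains(self, key: str) -> bool:
        a, b = self._locations(key)
        return key in (self.slots[a], self.slots[b]) or key in self.stash


table = CuckooTable(size=64)
keys = [f"value{i}" for i in range(20)]
eviction_counts = [table.insert(k) for k in keys]
```

A lookup always touches the same two slots, whereas `eviction_counts` depends on the table contents, which is the behaviour an observing server could exploit.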
Looking closer, the query algorithm with cuckoo hashing [35] is volume-hiding because 2ℓ entries are retrieved regardless of the hash table's contents. On the other hand, the insertion algorithm heavily depends on the hash table's contents. It turns out that the simple balls-into-bins hashing scheme obtains the property that both query and update operations are independent of the table contents. The balls-and-bins hashing scheme considers n bins. To insert a value, it is placed into one of the n bins uniformly at random. If we have n values and n bins, the maximum number of values assigned to a bin will be Θ(log n) (see [27]). To obtain volume-hiding, all bins must be padded to the maximum load of Θ(log n). Both query and update operations will access ℓ bins possibly with dummies to attain volume-hiding resulting in O(ℓ log n) overhead as employed by DST [20].
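A quick simulation can illustrate the padding cost of balls-into-bins hashing. The sketch below throws n values into n bins uniformly at random and computes how many slots are needed if every bin is padded to the observed maximum load. The function name `one_choice_loads` and the fixed seed are assumptions for illustration; the observed maximum is an empirical quantity, not the bound used in the analysis.

```python
import random
from typing import List


def one_choice_loads(n: int, seed: int = 0) -> List[int]:
    """Place n values into n bins, each chosen uniformly at random."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads


loads = one_choice_loads(n=1024)
# Padding every bin to the maximum load hides per-bin occupancy
# at the price of extra storage.
padded_slots = len(loads) * max(loads)
```

Since an operation touches ℓ padded bins, its cost scales with ℓ times the padded bin size.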
Our goal is to find a hashing scheme with efficiency better than balls-into-bins hashing while ensuring both queries and updates are volume-hiding. To achieve this goal, we utilize two-choice hashing by Azar et al. [4]. Once again, there are n bins. To insert a value, two bins are chosen uniformly at random and the item is placed into the bin that is least loaded (i.e. currently contains fewer items). Using this technique, the maximum bin size becomes O(log log n). Unfortunately, the server storage grows to O(n log log n) since each of the n bins must be padded to O(log log n) size to hide the true number of values in each bin.
To avoid this extra storage, we can utilize a modified version of two-choice hashing introduced by Patel et al. [33] that we denote by H 2ch . This hashing scheme reduces the amount of unused space by arranging bins to share physical memory. At a high level, the hash table consists of n/ log n binary trees each of height log log n such that there are n leaf nodes. Each node stores at most one value. As a result, the total size becomes at most 2n. Each bin is uniquely assigned to a binary tree leaf and the bin's storage corresponds to the nodes that appear on the unique path from the bin's leaf to the root of its respective binary tree. For insertion, the least loaded bin is the one with the empty node that is at the highest level (i.e. furthest away from its corresponding root). Additionally, there is a stash to store overflows. Whenever a value is inserted into two bins that are completely filled (all nodes appearing on the unique leaf-to-root paths are occupied), the item is instead placed into the stash. We formally present bounds on the stash size and point readers to the proof in [33].
Theorem 3 ([33]). Let f(n) = ω(log n). When mapping at most n items using H 2ch , the stash stores at most f(n) items except with probability negl(n).
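The two-choice rule can be sketched in a few lines. The simulation below is a plaintext illustration only; the function name `two_choice_loads`, the tie-breaking rule and the fixed seed are assumptions, and in the actual schemes the two candidate bins are derived with PRFs rather than fresh randomness.

```python
import random
from typing import List


def two_choice_loads(n: int, seed: int = 0) -> List[int]:
    """Place n values into n bins, each time picking the lighter of two random bins."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        a, b = rng.randrange(n), rng.randrange(n)
        # Ties are broken in favour of the first candidate (an arbitrary choice).
        target = a if loads[a] <= loads[b] else b
        loads[target] += 1
    return loads


loads_2ch = two_choice_loads(n=1024)
# Storage if every bin is padded to the observed maximum load.
padded_slots_2ch = len(loads_2ch) * max(loads_2ch)
```

Running the same experiment with a single random choice per value typically yields a noticeably larger maximum load, which is the gap the construction exploits.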
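The shared-memory layout described above can be sketched as a forest of complete binary trees stored in heap order, where each bin is a leaf-to-root path. This is an illustrative plaintext model under stated assumptions: the class `TwoChoiceForest`, its parameters `num_trees` and `height`, and the use of fresh randomness for the two candidate bins are hypothetical and not taken from [33].

```python
import random
from typing import List, Optional, Tuple


class TwoChoiceForest:
    def __init__(self, num_trees: int, height: int, seed: int = 0) -> None:
        self.height = height
        self.leaves_per_tree = 1 << height
        nodes_per_tree = (1 << (height + 1)) - 1
        # Each tree is a heap-ordered array; every node holds at most one value.
        self.trees: List[List[Optional[str]]] = [
            [None] * nodes_per_tree for _ in range(num_trees)
        ]
        self.num_bins = num_trees * self.leaves_per_tree
        self.stash: List[str] = []
        self.rng = random.Random(seed)

    def path(self, bin_id: int) -> List[Tuple[int, int]]:
        """(tree, node) pairs from the bin's leaf up to the root of its tree."""
        tree, leaf = divmod(bin_id, self.leaves_per_tree)
        node = self.leaves_per_tree - 1 + leaf  # heap index of the leaf
        nodes = [(tree, node)]
        while node != 0:
            node = (node - 1) // 2
            nodes.append((tree, node))
        return nodes

    def _deepest_empty(self, bin_id: int) -> Optional[Tuple[int, int, int]]:
        """(depth, tree, node) of the empty node furthest from the root, if any."""
        for steps_up, (tree, node) in enumerate(self.path(bin_id)):
            if self.trees[tree][node] is None:
                return self.height - steps_up, tree, node
        return None

    def insert(self, value: str, bin_a: int, bin_b: int) -> None:
        found = [self._deepest_empty(bin_a), self._deepest_empty(bin_b)]
        candidates = [c for c in found if c is not None]
        if not candidates:
            self.stash.append(value)  # both paths are full
            return
        _, tree, node = max(candidates, key=lambda c: c[0])
        self.trees[tree][node] = value


forest = TwoChoiceForest(num_trees=4, height=2)
for i in range(forest.num_bins):
    a = forest.rng.randrange(forest.num_bins)
    b = forest.rng.randrange(forest.num_bins)
    forest.insert(f"v{i}", a, b)
```

With 4 trees of height 2 there are 16 bins but only 28 nodes, fewer than twice the number of bins, which mirrors the at-most-2n storage claim.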
Using the H 2ch hashing scheme and padding empty binary nodes with dummy values, we may obtain a dynamic volume-hiding scheme with O(ℓ log log n) overhead that we denote as 2ch (standing for 2-choice hashing) following the same techniques as [20,35] that maps values to bins using pseudorandom functions. We note that 2ch already results in a more efficient, volume-hiding construction than DST [20].
However, 2ch does not achieve forward privacy as updating a label enables association with previous queries to the same label. Our next constructions will solve this problem to obtain forward privacy. In contrast, 2ch is already type-II backward private as updates only reveal the timestamp of previous queries and updates for the same label (encapsulated by TimeDB and TimeUpdate respectively in Definition 6).
We present the pseudocode for 2ch in Appendix D along with a formal proof of security and efficiency.
Comparison with [33]. Both [33] and our work aim to build privacy-preserving maps. However, [33] aims to hide access patterns using dummy queries for maps that store at most one value per label. In contrast, our work aims to mitigate volume leakage for MMs storing multiple values per label.

Construction 2ch FB
We formally present our dynamic volume-hiding STE scheme for MMs, 2ch FB (standing for 2-choice hashing with Forward and Backward privacy). 2ch FB builds upon our hashing techniques from the prior section. The major difference between 2ch FB and prior volume-hiding works lies in the update algorithms. For forward privacy, we need to make sure that an update on a label does not leak anything about previous queries on the same label. In particular, we need to hide label equality leakage during updates. In 2ch and [20], identical bins are retrieved for both queries and updates. As a result, an adversary can infer that both operations were performed on the same label, which is why prior constructions (as well as 2ch) are not forward private.
We take a different approach to update operations for 2ch FB where update operations are not immediately applied to the underlying MM inspired by prior works such as [6,7,26] to obtain forward privacy. The update operation is only applied when a query for the same label is performed. In more detail, 2ch FB outsources two encrypted data stores to the server. The first multi-map Table stores all values of update operations that have already been queried (i.e. the update operations were applied). The other multi-map EMM u will accumulate update operations for labels that have not yet been queried. A table of PRF keys used to generate locations for storing updates in EMM u is stored locally by the client. Once a query for label is performed, all update operations pertaining to label will be retrieved from EMM u and applied to Table before returning the final result. As a result, all the accumulated updates for label in EMM u cannot be linked until a query for label is performed ensuring 2ch FB achieves forward privacy. We note these ideas have been abstracted in [26]. 2ch FB is also type-II backward private inheriting the same properties as 2ch.
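The deferred-update idea can be illustrated with a toy plaintext model. The sketch below keeps a client-side table of per-label PRF keys and pending-update counters, writes each padded update to a pseudorandom address in a cache standing in for EMM u, and applies cached updates to a dictionary standing in for Table at query time. Encryption, the two-choice hash table and all padding of accesses are omitted; the class `ToyDeferredEMM`, the HMAC-based addressing and the key refresh after each query are assumptions for illustration, not the pseudocode of Figure 1.

```python
import hashlib
import hmac
import os
from typing import Dict, List, Optional, Tuple

DUMMY = None  # padding marker; real values must not be None


class ToyDeferredEMM:
    def __init__(self, ell: int) -> None:
        self.ell = ell
        self.state: Dict[str, list] = {}  # client side: label -> [prf_key, pending]
        self.table: Dict[str, List[str]] = {}  # stands in for the server-side Table
        self.emm_u: Dict[bytes, Tuple[str, List[Optional[str]]]] = {}  # update cache

    @staticmethod
    def _address(key: bytes, index: int) -> bytes:
        # Assumed PRF: HMAC-SHA256 over the index of the pending update.
        return hmac.new(key, str(index).encode(), hashlib.sha256).digest()

    def update(self, label: str, op: str, values: List[str]) -> None:
        assert op in ("add", "del") and len(values) <= self.ell
        entry = self.state.setdefault(label, [os.urandom(32), 0])
        # Pad to exactly ell entries so every update has the same size.
        padded = list(values) + [DUMMY] * (self.ell - len(values))
        self.emm_u[self._address(entry[0], entry[1])] = (op, padded)
        entry[1] += 1

    def query(self, label: str) -> List[str]:
        current = self.table.setdefault(label, [])
        key, pending = self.state.get(label, (None, 0))
        for i in range(pending):
            op, padded = self.emm_u.pop(self._address(key, i))
            for value in padded:
                if value is DUMMY:
                    continue
                if op == "add" and value not in current:
                    current.append(value)
                elif op == "del" and value in current:
                    current.remove(value)
        if pending:
            # Fresh key so future updates cannot be linked to this query.
            self.state[label] = [os.urandom(32), 0]
        return list(current)


emm = ToyDeferredEMM(ell=3)
emm.update("w", "add", ["a", "b"])
emm.update("w", "del", ["a"])
pending_before = len(emm.emm_u)
result = emm.query("w")
```

Until `query("w")` runs, the two cached entries sit at unrelated pseudorandom addresses, so in this toy model nothing on the server side ties them to each other or to earlier queries for the same label.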
We present the pseudocode of 2ch FB in Figure 1.
Setup. The setup algorithm is executed by the client C to construct an EMM. It takes as input a security parameter 1^λ, params = (n, ℓ), where n is an upper bound on the total number of values that will be stored and ℓ is an upper bound on the maximum volume, and a multi-map MM. Setup creates a two-choice hash table. Roots are at level 0 and leaf nodes are at height h. Each node has the capacity to hold a single encryption. Each of the n bins is uniquely assigned to one of n different leaf nodes.

5. For all empty nodes in the binary trees, C adds a fresh encryption of Enc(K_Enc, (⊥, ⊥, ⊥)).
6. For each i = 0, ..., cnt − 1:
[1] will denote the number of tuples for label in the update encrypted structure EMM u . To cache the update operation, first ⃗v is padded with dummies until its length is exactly ℓ. Next, the client computes
The client decrypts all contents to find all values that are associated with label in the 2ℓ bins along with values that may be stored in the overflow stash. The client then decrypts the updates returned from EMM u and applies them locally to the downloaded 2ℓ bins or the overflow stash. These updated 2ℓ bins are re-encrypted with fresh randomness and sent back to the server for storage. All values associated with label in these 2ℓ bins and the overflow stash are finally returned as the query's answer. The client increments the label version

Security
We present the leakage of 2ch FB against a persistent adversary. At setup, the adversary learns nothing except for the public parameter n. Therefore, L Setup (MM) = n. Let O be any sequence of operations and o be the current operation. Then, the update leakage is L Update (MM, (O, o)) = (ℓ, uop), where uop reveals that the operation is an update but nothing specific about the type of update (as exactly ℓ encrypted values are inserted into a random entry of EMM u ). We observe that our update operations are forward private, as the update leakage is independent of all previous operations. Type-II backward privacy is inherited, as 2ch FB has essentially the same properties as 2ch for deletions. Finally, the query leakage is L Query (MM, (O, o)) = (ℓ, leq(O, o), qop). Label equality is revealed because a query retrieves all cached, unapplied updates for label(o) from EMM u and because all query operations on the same label access the same 2ℓ bins of the two-choice hash table Table. The proof that 2ch FB is L-secure for the leakage function L described above appears in Appendix E.
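The shape of these leakage functions can be sketched in Python. The sketch models leq as the indices of earlier operations on the same label, which is a simplifying assumption rather than the formal definition of Section 2.4. The function names and the encoding of operations as (kind, label) pairs are hypothetical.

```python
def update_leakage(ell, history, op):
    """L_Update: only the public bound ell and the fact that an update occurred.

    `history` and `op` are deliberately ignored, which is exactly why the
    update leakage is independent of all previous operations.
    """
    return (ell, "uop")


def query_leakage(ell, history, op):
    """L_Query: ell, a label-equality pattern, and the operation type."""
    _, label = op
    # Assumed model of leq: positions of earlier operations on the same label.
    leq = tuple(i for i, (_, lbl) in enumerate(history) if lbl == label)
    return (ell, leq, "qop")
```

Under this model, two updates on different labels after different histories produce the same leakage value, while a query exposes which earlier operations touched the queried label.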

Theorem 4.
If SKE is IND-CPA secure, F and G are PRFs, and H is a keyed hash function modeled as a random oracle, then 2ch FB is a volume-hiding, forward private and type-II backward private L-secure dynamic EMM scheme.

Efficiency
We split our analysis into amortized and worst case overhead starting with amortized. The amortized communication and computational cost of an update operation is O(ℓ). During updates, ℓ encrypted values are inserted into EMM u . During a query operation, the same ℓ encrypted values are downloaded and decrypted locally. Amortized communication and computational complexity of a query is O(ℓ log log n) as exactly 2ℓ bins are retrieved where each bin contains O(log log n) values.
Next, we consider worst case overhead. For update operations, ℓ encrypted values are always uploaded to EMM u . The worst case query overhead heavily depends on the number of unapplied update operations. For label, we denote the number of unapplied update operations since the last query for label by u(label). Then, the worst case overhead of a query operation is O(ℓ log log n + ℓu(label)) from retrieving 2ℓ bins along with applying all prior update operations for label.
The server storage consists of O(n) values along with the unapplied update operations. While this may be unbounded, we present a variant in Appendix G where the update operations in EMM u are applied every O(n/ℓ) update operations to ensure that server storage never exceeds O(n). The client storage consists of MM st , requiring at most O(m) storage where m is the number of unique labels. The other portion of client storage is the two-choice hashing overflow stash, which uses f (n) storage except with probability negligible in n for any f (n) = ω(log n).
Discussion about Forward Privacy. In 2ch FB , the client storage increases as more update operations are performed without intermediate query operations. This corresponds directly to the setting in which forward privacy matters most, as more information in the updates is protected from the adversarial server. In other words, the additional client storage is a direct result of providing stronger protection for updates without intermediate queries. In the next section, we present a construction providing the same forward privacy protection without the increasing client storage.

Construction 2ch s FB
Next, we present our final scheme 2ch s FB (standing for 2-choice hashing with Forward and Backward privacy and small client storage) that is also volume-hiding, forward private and type-II backward private like 2ch FB . 2ch s FB improves upon 2ch FB by using smaller permanent client storage. Recall that 2ch FB uses client storage potentially linear in the number of unique labels, O(m), because the client locally stores MM st . For any label ∈ L, MM st [label] stores two integers: a version number required by the keyed hash H and the number of unapplied update operations that are in EMM u . In contrast, 2ch s FB requires permanent client storage of size only ω(log n). To achieve this, we outsource the storage of MM st to the server, inspired by ideas from recent work in ORAMs [25,31] and encrypted search [12].
To remove MM st from the client, 2ch s FB explicitly stores the locations of cached operations in EMM u in a series of static, encrypted multi-maps EMM loc 0 , ..., EMM loc t−1 of geometrically increasing sizes stored on the server. The number of encrypted multi-maps is t = O(log u), where u is the number of previous update operations. For any i, EMM loc i stores at most 2^i cached update operations. We instantiate these t structures using PiBas * (a modified version of PiBas [9]), a static, response-hiding, volume-revealing, encrypted multi-map scheme as described in [12]. We note, however, that any static encrypted search scheme whose setup leakage is the size of the input multi-map and whose query leakage is at most query equality and the volume of the tuple suffices as a replacement for PiBas * . Specifically, 2ch s FB maintains the invariant that the encrypted multi-map EMM loc i stores the locations of cached operations per label over the latest update operations that are not stored in the smaller encrypted multi-maps EMM loc 0 , ..., EMM loc i−1 . As smaller encrypted multi-maps are filled, their contents are percolated to larger encrypted multi-maps in an efficient, but amortized, manner. As an example, suppose that all encrypted multi-maps EMM loc 0 , ..., EMM loc i−1 are fully occupied. For the next update operation, their contents are combined and placed into the larger encrypted multi-map EMM loc i . By querying all of EMM loc 0 [label], ..., EMM loc t−1 [label], the client learns the entries of EMM u that contain all cached update operations. As a result, the client forgoes local storage of MM st at the cost of an additional roundtrip and t additional encrypted multi-map queries.
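The percolation schedule can be sketched as follows. This is a plaintext toy that tracks only which level holds which locations: the levels are ordinary dictionaries rather than PiBas * instances, and the names `LocationLevels`, `insert`, `lookup` and `entries_rewritten` are hypothetical.

```python
class LocationLevels:
    """Levels of geometrically increasing capacity; level i holds 2**i entries when full."""

    def __init__(self):
        self.levels = []            # levels[i] is None (empty) or {label: [locations]}
        self.entries_rewritten = 0  # total entries placed into freshly built levels

    def insert(self, label, loc):
        merged = {label: [loc]}
        i = 0
        # Collect every full level below the smallest empty one, emptying them.
        while i < len(self.levels) and self.levels[i] is not None:
            for lbl, locs in self.levels[i].items():
                merged.setdefault(lbl, []).extend(locs)
            self.levels[i] = None
            i += 1
        if i == len(self.levels):
            self.levels.append(None)
        # Rebuild the smallest empty level from the merged contents.
        self.levels[i] = merged
        self.entries_rewritten += sum(len(v) for v in merged.values())

    def lookup(self, label):
        # A query touches every non-empty level, as in the t multi-map queries.
        out = []
        for level in self.levels:
            if level is not None:
                out.extend(level.get(label, []))
        return out
```

The occupancy pattern of the levels follows the binary representation of the number of inserts, so the schedule of rebuilds depends only on how many updates have occurred and not on their labels.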
We present the pseudocode for 2ch s FB in Figure 2.
Setup. The client executes the same setup algorithm as 2ch FB except that the client does not store MM st .
Update. For an update operation (op, label, v⃗), the client chooses a random location x and stores an encryption of the current update operation at EMM u [x] after padding v⃗ to be length ℓ. To store x, the client identifies the smallest, empty multi-map. Say, this is EMM loc i . Next, the client downloads all EMM loc 0 , ..., EMM loc i−1 , decrypts them locally and combines all counts into a single multi-map MM i . For each label′ that appears in one of the i downloaded encrypted multi-maps, the client sets . The random location of the current update x is also appended to MM i [label]. MM i is then encrypted using the setup algorithm of PiBas * or a valid replacement and sent to S for storage as EMM loc i while all EMM loc 0 , ..., EMM loc i−1 are emptied.
Query. For a query to label ∈ L, 2ch s FB performs t queries with the server S to retrieve the locations of all cached update operations for label in EMM u . Afterwards, C uses the same algorithm as 2ch FB .Query to retrieve the final result. The only difference being that instead of sending a seed to S to compute the locations in EMM u , C sends the locations directly.

Security
We present the leakage profile for 2ch s FB when each EMM loc i is initialized by PiBas * . While one could present generic leakage, we choose to present the leakage of a specific instantiation for ease of readability. Recall that this construction has setup leakage consisting of simply the total number of values, and query leakage consisting of label equality and the volume of the queried label. Let MM be the input multi-map, O be an operation sequence and o be the current operation. The setup leakage of 2ch s FB is identical to that of 2ch FB as the adversary's view is the same. Therefore, L Setup (MM) = n. In terms of update leakage, the server learns which EMMs are downloaded and uploaded by the client. Note that this is a pre-determined schedule depending only on the number of previous updates. So, the update leakage is L Update (MM, (O, o)) = (ℓ, uop), which is also the same as for 2ch FB . Finally, the query leakage of 2ch s FB is similar to that of 2ch FB , but it additionally includes the leakage of queries on EMM loc i , which we denote by L loc . However, L loc is a strict subset of the label-equality leakage. Therefore, L Query (MM, (O, o)) = (ℓ, leq(O, o), qop, L loc ). As this is essentially the same leakage as 2ch FB , 2ch s FB also inherits forward and type-II backward privacy. The security proof of 2ch s FB is found in Appendix F.
Theorem 5. If SKE is IND-CPA secure and PiBas * is a static, response-hiding EMM scheme, then 2ch s FB is a volume-hiding, forward private and type-II backward private L-secure dynamic EMM scheme.

Efficiency
We start with the main improvement of 2ch s FB over 2ch FB , which is client storage. The client storage of 2ch s FB consists only of the overflow stash of size at most f (n) for any function f (n) = ω(log n), except with negligible probability. In Section 5.1, we show through experimental evaluation that the overflow stash never exceeded a couple of items at a time. We note that client storage may be temporarily higher during operation time if and when rebuilding (discussed in Appendix G) is required. The additional server storage consists of EMM loc 0 , ..., EMM loc t−1 , which store at most |Update(O)| values. So, 2ch s FB has the same worst case server storage cost as 2ch FB .
Note that the only additional query and update overhead comes from downloading, uploading, constructing and querying the encrypted multi-maps used to store counts. Consider the encrypted multi-map EMM loc i that stores at most 2^i counts. We note that EMM loc i is downloaded and re-uploaded when EMM loc 0 , ..., EMM loc i−1 are full. This occurs every 2^i update operations. For u update operations, the total
Figure 2. Pseudocode of 2ch s FB . Let SKE = (Gen, Enc, Dec) be an IND-CPA encryption scheme, 2ch FB be as described in Figure 1 and PiBas * (a modified version of PiBas [9]) be a static, response-hiding, encrypted multi-map scheme as described in [12]. Query(((qop, label), st2); EMM2). In the execution, C uses L as the locations of cached update operations for label in EMMu instead of sending a seed to S to compute these locations.
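One way to account for the rebuild work over u update operations is sketched below. It assumes that rebuilding EMM loc i moves O(2^i) entries and uses the stated frequency of one rebuild of level i every 2^i updates, with t = O(log u) levels. The resulting O(log u) amortized figure is derived here under these assumptions and is not quoted from the text.

```latex
\sum_{i=0}^{t-1} \frac{u}{2^{i}} \cdot O\!\left(2^{i}\right)
  \;=\; O(u \cdot t)
  \;=\; O(u \log u)
\quad\Longrightarrow\quad
O(\log u) \text{ entries moved per update, amortized.}
```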

Modifying n and ℓ
In Section 3, we assume that the upper bounds on the multi-map size (n) and volume (ℓ) never change. We now present generic transformations to handle changing upper bounds. In this section, we will use n and ℓ as the current size and volume upper bounds, respectively. The values n and ℓ will also be inputs for each operation.

Changing Multi-Map Size n
We start with handling either growing or shrinking the multi-map (i.e., changes to n). To do this, we will leverage a technique used in most common data structure implementations. Consider a dynamic array implementation (such as std::vector in C++). The array is initialized in memory with some fixed capacity upper bound. Once data grows beyond the capacity, the array implementation increases the capacity by some multiplicative factor (such as 2x), allocates new memory for the increased capacity and copies the contents to the new allocation. Our transformation will use the same paradigm.
Consider any dynamic volume-hiding EMM Σ with leakage L for fixed n and ℓ. We build Σ ′ with an additional Rebuild functionality.
Rebuild((st, n, ℓ), (EMM, n, ℓ)):
1. C downloads EMM and decrypts using st to get plaintext MM.
First, we evaluate the leakage of executing Rebuild. The first step leaks nothing as C simply downloads EMM. The second step leaks setup leakage L Setup (MM, n, ℓ). So, L Rebuild = L Setup .
When the client reports a change to n, Σ ′ first executes Rebuild before running the original algorithm of Σ for the query or update. This introduces additional leakage indicating when n changes, which we model with M such that M [i] = L Setup (MM, n, ℓ) if n changes on the i-th operation and M [i] = ⊥ otherwise. We choose n to double (halve) to increase (decrease) capacity.
Theorem 6. Let Σ be a dynamic volume-hiding encrypted multi-map EMM with leakage function L for fixed values of n and ℓ. Then, Σ ′ is a dynamic volume-hiding EMM with leakage (L Setup , (L Query , M ), (L Update , M )). If n is only doubled or halved, Σ ′ has no increased amortized overhead.
Proof. For leakage, the simulator can detect when Rebuild is run and simulate setup using M . In terms of efficiency, consider the setting where capacity is doubled. That means, there must have been at least Ω(n) values added. The cost of Rebuild is O(n) meaning that the amortized overhead is at most O(1) per updated value. A similar argument can be applied if capacity is halved.
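The amortization argument can be checked with a small simulation. The sketch below models the cost of one Rebuild as the number of values currently stored, in line with the O(n) cost used in the proof. The function name `rebuild_work` and the starting capacity are illustrative assumptions.

```python
def rebuild_work(num_inserts, initial_capacity=1):
    """Count values moved by Rebuild when capacity doubles on overflow.

    Returns (total values moved by all rebuilds, final capacity).
    """
    capacity, size, moved = initial_capacity, 0, 0
    for _ in range(num_inserts):
        if size == capacity:
            moved += size   # a Rebuild touches every stored value once
            capacity *= 2
        size += 1
    return moved, capacity


if __name__ == "__main__":
    for k in (10, 16, 20):
        moved, cap = rebuild_work(2 ** k)
        # The ratio stays below 1 moved value per inserted value.
        print(2 ** k, moved, cap, moved / 2 ** k)
```

Because each doubling is preceded by at least as many insertions as the values it moves, the total rebuild work stays below twice the number of insertions, which is the O(1) amortized overhead per updated value.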
Instantiation with 2ch FB or 2ch s FB . If Σ is chosen to be 2ch FB or 2ch s FB , then L Setup (MM, n, ℓ) = n. So, the leakage of Σ ′ is L ′ = (L Setup , (L Query , n), (L Update , n)) as M may be upper bounded with L Setup (MM, n, ℓ) = n. Discussion about Approach. At a high level, the transformation is a straightforward approach of downloading, modifying locally and re-uploading the multi-map. To our knowledge, this remains the most efficient technique in the literature. For example, similar techniques were used in [12] for avoiding local storage of large count tables. We also employ similar techniques in 2ch s FB . To our knowledge, techniques with smaller client storage utilize more expensive algorithms including oblivious shuffling or sorting [1,29,32,2]. We leave it as an open problem to improve handling multi-map size changes beyond the straightforward approach.

Changing Maximum Volume ℓ
For changing values of ℓ, we could also apply the same technique for handling changing n. However, the amortized overhead may be larger as only a small number of keys need to be added to force a change in ℓ. Instead, we present an even simpler transformation that may be applied to either 2ch FB or 2ch s FB . We augment the Query and Update algorithms in the following way. The state of the client will also include the current maximum volume ℓ. Whenever ℓ changes, the client communicates the new value to the server. Afterwards, the protocols use the new value of ℓ to continue. Combining with techniques in Section 4.1, we get the following theorem that we prove in Appendix H: Theorem 7. Let Σ ′ be either 2ch FB or 2ch s FB with the above modifications to handle changing n and ℓ with leakage L = (L Setup , L Query , L Update ). Then, Σ ′ is a dynamic volume-hiding EMM with leakage L ′ = (L Setup , (L Query , n, ℓ), (L Update , n, ℓ)).

Experimental Evaluation
In this section, we evaluate the practicality of our volume-hiding schemes. First, we describe the experimental setup and our choice of parameters for our constructions. Using these experiments, we aim to answer whether our constructions are concretely efficient while providing better privacy and more operations.

Experimental Setup
Our experiments are performed using the same machine for both the client and the server: an Ubuntu PC with an Intel(R) Core(TM) i5-9400 CPU with 6 cores and 64 GB of RAM. Our schemes are implemented in Rust in about 500-800 lines of code each. Both schemes are instantiated in-memory. All experiments were repeated at least 10 times, and all results have standard deviations less than 2% of their average.
Primitives. We use and build on top of cryptographic primitives provided by the ring [38] and OpenSSL [41] Rust crates. For symmetric encryption, we use AES in CTR mode with a key of size 32 bytes. In all our experiments, we consider PRFs with 32 byte outputs. In particular, we implement our PRFs using HMAC with SHA256.
Input Multi-Maps. We consider general multi-maps containing n ∈ {2^16, 2^18, 2^20, 2^22} maximum values, which are considered standard in the literature [12,35]. As our schemes are dynamic, we initialize our input multi-maps with 90% of their maximum capacity. The final 5% is set aside to support updates. Since we are trying to gauge the efficiency of volume-hiding schemes, we set the number of unique labels to be n/100 so that the volumes of labels are large and also comparable to experiments in other works such as [12]. The size of label and value strings is 20 bytes.
Setup Protocol. The time taken by the setup algorithm of both 2ch FB and 2ch s FB ranges from 0.35s to 36.7s as the size of the input multi-map increases. For our experiments, we set the parameter c = 1. We refer the readers to Figure 3 for a detailed plot of setup times. We also varied the value of c from 0.01 to 4 to study how the value of c affects the stash size when we put up to n values in our encrypted multi-map. This experiment was repeated 100 times and the results are plotted in Figure 4. We find that for values of c ≥ 0.1 the client stash size averaged 0 regardless of the value of n we picked.
Query Protocol.
For both our schemes, we computed the average total latency incurred by the client and the server collectively to produce a final query result. We first focus on query times without any updates. In our experiments, for each data point, we perform multiple rounds of three queries on the same label, each time increasing the number of updates performed on that label prior to the round of queries. At a certain point, the volume of the label approaches the maximum volume set for that particular instantiation of the scheme and we stop updating further. We then take the average of the query times for this label across these rounds. The experiment is designed this way to factor in the effect of updates on query times. Figures 5a and 5b show query times for different input multi-map sizes against different maximum volumes. Note that the query times in these graphs are per result, where the number of results for a query is the maximum volume. This is done so that a direct comparison to the static volume-hiding schemes in [35] can be made. Here we note that the query times are comparable to the query times in [35] even though our schemes support dynamic operations. The query times of 2ch FB and 2ch s FB are also very comparable even though updates are stored differently in the two schemes. Queries for both schemes ranged from 0.027ms to 0.051ms per result.
Update Protocol. For 2ch FB , the time taken by an update stays under 25ms, and for 2ch s FB under 76ms, even for a maximum volume of 20,000, as shown in Figures 5c and 5d. This is primarily because, to achieve forward and backward privacy, updates are not applied directly to the two-choice hashing structure and some of the work is postponed until queries. As expected, updates for 2ch s FB are costlier than for 2ch FB : in 2ch FB a tuple is inserted directly into an encrypted multi-map, whereas in 2ch s FB multiple encrypted structures are downloaded and rebuilt, which takes extra time.
However, 2ch s FB is still desirable due to its smaller permanent client state.
Effects of Updates on Queries. We refer to Figure 6 for a detailed look at the query times of 2ch FB interleaved with updates to show the effects updates have on a query. In each of the graphs in this figure, there are 9 queries issued and the x-axis represents the i-th query of the 9 queries. The y-axis represents the total query time for each query. Right before the first, fourth and seventh query on a label, there were 10, 50 and 100 updates made on that label, respectively. The size of each update was randomly sampled. Hence, the graphs show small spikes in the first, fourth and seventh query times because of unresolved updates at those times, followed by a sudden speed-up of the following two queries. We observed that 2ch s FB (Figure 7), which saves a lot on client storage, tends to be comparable to but slower than 2ch FB . This is because its query protocol takes two rounds and has to do considerably more rebuilding than 2ch FB . We note, however, that our query times are on the order of microseconds (25µs to 50µs) per (label, value) pair for both 2ch FB and 2ch s FB . Compared to Figures 5a and 5b, we do see a slight increase as queries now need to apply updates.
Comparison with DST [20]. We compare with DST [20]. The other construction in [20], based on the Pseudo-Random Transform, is lossy in nature and leads to inaccurate query results. Hence, we do not believe a fair comparison is possible there. For DST, we note that it lacks several features offered by our constructions, such as forward privacy and full dynamicity (see App. A). For a comprehensive treatment, we still present a comparison with DST. We will show that 2ch FB and 2ch s FB offer the additional functionalities with minimal (or no) increased costs compared to DST.
For updates, DST takes from 7ms to 1000ms for ℓ ranging from 128 to 20,000. During an update of a label x, DST downloads all the bins for x, deletes them on the server and re-uploads the edited bins. In comparison, Figures 5c and 5d show that, for updates, our schemes have smaller communication and computation than DST. This is not surprising, as for an update 2ch FB only uploads a vector of size ℓ to the server and 2ch s FB rebuilds a series of encrypted structures up to the smallest empty one. For smaller values of ℓ (≤ 1024), 2ch s FB takes more time than DST during updates because the additional cost of re-executing setup protocols on the underlying data structures dominates the cost that depends on ℓ.
Starting with the simple case that ignores updates, our schemes 2ch FB and 2ch s FB improve the communication during queries by 2-3x, as our bins contain 4-5 items each using c = 1 whereas bins in DST contain at least 21-32 items (see experiments in [35]). Now taking updates into account during queries, and for different values of ℓ and n, we observed that 12-22 updates on a label before a query on that label would increase the query time to as much as that of DST. This is because downloading these additional updates during our queries makes our communication and computation costs similar to those of DST (countering our small bin size advantage). This is not surprising, as we provide stronger privacy guarantees. The total cost would still be the same as or better than that of DST, as this increase in query time is amortized over updates in DST.
the desired functionality, it degrades privacy significantly. A recent work [34] shows that, unless one is willing to utilize ORAM-like overheads, leakage of label equality patterns must be revealed by queries. Recall that label equality patterns reveal whether two different operations are performed on the same label or not (see Section 2.4). Using the above approach of replacing an update with two semi-dynamic operations will leak label equality leakage for every update (due to the usage of the semi-dynamic query). This ends up being a significant privacy degradation as label equality leakage during updates violates the privacy requirements of being forward private. Therefore, the above transformation requires either the EMM to use ORAM-like overhead or not provide forward privacy. In our work, we avoid this problem by directly building update operations that avoid performing query operations.

B Lower Bounds when Hiding ℓ
In this section, we analyze the dynamic volume-hiding definition in [44] that hides the maximum volume ℓ. Additionally, they introduce the notion of (p, ϵ)-correctness meaning that ϵ-fraction of matching values are returned with probability at least p. We refer readers to [44] for both definitions. We show a strong and simple query communication lower bound in this model. Theorem 8. Let Σ be a dynamic volume-hiding encrypted multi-map EMM according to the definition in [44] that is (p, ϵ)-correct. Then, the sum of the expected query communication and client storage of Σ must be Ω(p · ϵ · n).
Proof. Consider any MM. We show that when MM is input to Σ, the query communication must be Ω(ϵn). To do this, we construct the following adversary A. In the first phase of the definition in [44], A chooses any label k appearing in MM and constructs MM ′ with k associated with a value tuple of size |MM|. For operations, A chooses to query k repeatedly. Note, that the size of the query communication is viewed by the adversary. By the correctness requirement, it must be that Ω(ϵn) values are returned when querying MM ′ with probability at least p. So, the query communication and client storage must be Ω(p·ϵ·n) in expectation. By the volume-hiding requirement and the fact that the adversary sees the size of query communication, this means that queries to MM must satisfy the same requirement.
For reasonable parameters such as p ≥ 0.5 and ϵ ≥ 0.5 and sublinear client storage, Ω(n) expected query communication is required.
As seen from the definition, at query time for label, type-I backward privacy reveals the total number of updates performed on label. Type-II backward privacy also reveals the timestamps of each update operation for label. Finally, type-III backward privacy additionally reveals pairings of update operations that deleted values inserted by a prior update operation. All our constructions will be type-II backward private.

D 2ch: Pseudocode and Analysis
The pseudocode of 2ch is presented in Figure 8.
Security. We present the leakage profile for our scheme against a persistent adversary. During Setup, no information is leaked about the plaintext MM other than an upper bound n on the total number of values stored, and hence L Setup (MM) = n. As far as query and update operations are concerned, we observe that operations on the same label access the same 2ℓ bins. So, the adversary may determine whether or not different operations are performed on the same label. Moreover, update operations write back the bins accessed whereas query operations do not, and so the type of operation, update or query, is also leaked. We now prove the following theorem for 2ch.
Theorem 9. If SKE is an IND-CPA-secure encryption and F, G are pseudorandom functions, then for every n ≥ ℓ ≥ 1, 2ch is a volume-hiding and type-II backward private, L-secure dynamic STE scheme for multi-maps.
To prove Theorem 9, we first prove Theorem 10, Lemma 1 and Lemma 2.
Theorem 10. If SKE is an IND-CPA-secure encryption and F, G are pseudorandom functions, then for every n ≥ ℓ ≥ 1, 2ch is an adaptive L-secure dynamic STE scheme for multi-maps.
Proof of Theorem 10. We consider a stateful simulator S with state st that works as follows:
EMM ← S.SimSetup(1^λ, n):
• Game 0 is identical to Real 2ch,A (1^λ).
• Game 1 replaces the PRF F with a random function. This is indistinguishable from Game 0 because of the pseudo-randomness of F .
• Game 2 replaces the IND-CPA encryption SKE.Enc steps with encryptions of ⊥ that are indistinguishable due to IND-CPA guarantees.
• Game 3 replaces the outputs of random functions with uniformly random chosen values. This is indistinguishable from Game 2 as the output of random function and a random string are indistinguishable.
Game 3 is the same as the ideal experiment completing the proof.
Next we will prove that 2ch is volume-hiding.
Lemma 1. Leakage function L is volume-hiding.
Proof. To prove that L is volume-hiding, we consider any two multi-maps with the number of values ≤ n and with maximum volume of a label ≤ ℓ. Note that the only other leakage is the label-equality pattern which is independent of the input maps as well as the response lengths of the query operations even after updates. As a result, the input to the adversary in both games with different multi-maps is identical, completing the proof.
Next we will prove that 2ch is type-II backward private.

Lemma 2.
Leakage function L is type-II backward private.
Proof. Note that L Update is dependent on the public parameter ℓ and the label on which the update is being performed. The leakage during queries on previous updates is the timestamps of all previous updates via leq. This leakage profile falls under the definition of type-II backward privacy.
Proof of Theorem 9. Follows directly from Theorem 10, Lemma 1 and Lemma 2.
Efficiency. The communication and computational costs of query and update operations are O(ℓ log log n), as the client uploads a single PRF evaluation and uploads and/or downloads 2ℓ bins of size O(log log n). By the analysis of [33], the overflow stash in client storage contains at most f (n) values, for any function f (n) = ω(log n), except with probability negligible in n. Server storage for our scheme is ⌈n/(c log n)⌉ · ⌈log(c log n)⌉ = O(n) encrypted values. If we had used standard two-choice hashing, server storage would be O(n log log n) without a client stash.
Variants. In our pseudocode, query and update algorithms are distinguishable since query algorithms are non-interactive while update algorithms are interactive. If we wish to hide operational types from the adversary, we can modify the query algorithm in the following way. After receiving the 2ℓ bins, the query algorithm re-encrypts all values in 2ℓ bins and re-uploads them back to the server. The resulting variant of 2ch will ensure that adversaries cannot distinguish between query and update algorithms.

E Security Proof of 2ch FB
For convenience, we present the leakage function L of 2ch FB (repeated from Section 3.3.1).
• L Setup (MM) = n.
• L Update (MM, (O, o)) = (ℓ, uop).
• L Query (MM, (O, o)) = (ℓ, leq(O, o), qop).
As a reminder, the above leakage means that only the size of the multi-map is leaked during setup. During an update operation, only the maximum volume and the fact that the operation is an update are leaked. Finally, the maximum volume, the query operation and the label equality leakage pattern are revealed during queries.
Theorem 11. If SKE is an IND-CPA-secure encryption and F, G are pseudorandom functions and H is modeled as a random oracle, then for every n ≥ ℓ ≥ 1, 2ch FB is an adaptive L-secure dynamic STE scheme for multi-maps.
Proof. We consider a stateful simulator S with state st that works as follows:
EMM ← S.SimSetup(1^λ, n):
• Game 3 replaces the IND-CPA encryption SKE.Enc steps with simply producing a random string. RCPA security of SKE guarantees indistinguishability between a ciphertext and a randomly generated string.
• Game 4 replaces the outputs of random functions with uniformly random chosen values. This is indistinguishable from Game 3 as the output of random function and a random string are indistinguishable.
Game 4 is the same as the ideal experiment.
Note that the random oracle assumption may be removed by instantiating H with a pseudorandom function (PRF) and having the client send all PRF evaluations to the server.
Proof. To prove that L is volume-hiding, we consider any two multi-maps with the number of values ≤ n and with maximum volume of a label ≤ ℓ. Note that the only other leakages are the global label-equality pattern and the number of updates performed for queried labels since the last searches on them. In particular, nothing about the value tuples associated with update operations is leaked. As a result, the input to the adversary in both volume-hiding games with different multi-maps is identical.
Proof. Note that L Update depends only on the public parameter ℓ and is independent of all previous operations. Therefore, L is forward private. For type-II backward privacy, we note that the leakage during queries on previous updates is the number of previous updates on the queried label, which may be computed using TimeUpdate(O) where O is all previous operations. Therefore, L is also type-II backward private.
Proof of Theorem 4. Follows directly from above.
The leakage of 2ch s FB is identical to that of 2ch FB for all of setup, updates and queries. We note that the proof will consider the leakage L loc from EMM loc i . However, this turns out to be a subset of label equality leakage.
Theorem 12. If SKE is an IND-CPA-secure encryption and F , G are pseudorandom functions, then for every n ≥ ℓ ≥ 1, 2ch s FB is an adaptive L-secure dynamic STE scheme for multi-maps.
Proof. We will utilize a simulator S pi for our initialization of each EMM loc i using the PiBas * construction. We assume S ′ to be the simulator for 2ch FB . We consider a stateful simulator S with state st that works as follows:
EMM ← S.SimSetup(1^λ, n):
1. Execute S ′ .SimSetup(1^λ, n)
Response ← S.SimQuery(1^λ, uop, ℓ, leq(O, o), L loc ):
1. Using the total number of updates so far, determine encrypted multi-maps EMM loc i that are non-empty.
2. For each EMM loc i that is non-empty, execute and return S pi .SimQuery(1^λ, L loc ).
G Variants of 2ch FB and 2ch s FB
Note that both 2ch FB and 2ch s FB require server storage linear in the number of update operations in the worst case (i.e., when the updated labels are never queried). Moreover, the static structures EMM loc i in 2ch s FB do not take into account the space wasted due to resolved updates. We show that one can keep the server storage at O(n) using a scheduled clean-up algorithm. Every O(n/ℓ) update operations, the client and server agree to perform a scheduled clean-up. The client downloads the entire encrypted storage, decrypts it locally, applies all cached update operations and re-uploads a freshly encrypted version of the two-choice hash table. As a result, the server storage never exceeds O(n). Furthermore, the amortized cost of each update operation increases by only O(ℓ), which does not increase the total asymptotic cost. For 2ch s FB in particular, one can also modify the update algorithm so that the EMM loc i selected for local reconstruction is one in which space was newly freed by the deletion of an entry during update resolution in a past query operation.
Discussion about Forward Privacy. We note that the server storage increases as more update operations are performed without intermediate query operations. Similar to the growing client storage of 2ch FB discussed in Section 3.3.2, the additional server storage enables stronger protection for updates without intermediate queries. We leave it as an open problem to achieve this protection without additional storage costs.
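The storage and amortized-cost claims for the scheduled clean-up can be checked with a toy counter model. This is only a sketch under stated assumptions: the hidden constant in O(n/ℓ) is taken to be 1, each unresolved update is assumed to occupy ℓ padded slots on the server, and a clean-up is assumed to cost n units of communication. The class `CleanupScheduler` and its fields are hypothetical and do not model any encryption.

```python
import math


class CleanupScheduler:
    """Toy model: counts server slots and triggers a rebuild every ceil(n / l) updates."""

    def __init__(self, n: int, l: int):
        self.n, self.l = n, l
        self.period = math.ceil(n / l)  # assumed constant factor of 1 in O(n / l)
        self.pending = 0                # updates since the last clean-up
        self.cleanups = 0               # number of clean-ups performed so far

    def server_storage(self) -> int:
        # Rebuilt table of n slots plus l padded slots per unresolved update (assumption).
        return self.n + self.pending * self.l

    def update(self) -> None:
        self.pending += 1
        if self.pending >= self.period:
            self.cleanup()

    def cleanup(self) -> None:
        # Stands in for: download, decrypt, apply cached updates, re-encrypt, re-upload.
        self.pending = 0
        self.cleanups += 1


n, l = 1024, 8
num_updates = 1000
sched = CleanupScheduler(n, l)
max_storage = 0
for _ in range(num_updates):
    sched.update()
    max_storage = max(max_storage, sched.server_storage())

# Each clean-up is charged n units and spread over all updates performed.
amortized_cleanup_cost = sched.cleanups * n / num_updates
```

In this model the storage observed after any update stays below 2n, and the clean-up cost charged to each update stays below ℓ, which matches the O(n) and O(ℓ) bounds stated above.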

H Proof of Theorem 7
Proof of Theorem 7. We note that the proof is essentially identical to the proofs of Theorems 4 and 5. The only modification is that after each operation, the simulator is provided n and ℓ. The simulator will run the same algorithm with these newly provided values. For correctness, both 2ch FB and 2ch s FB compact their results such that non-dummy values appear before dummy values. As long as ℓ is a valid upper bound on the volume, all correct values are always returned.
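The correctness argument relies on compaction, which can be illustrated with a small functional sketch. It assumes that dummies are marked by a sentinel value and that the result is cut to the first ℓ entries. Both are illustrative assumptions, and the names `DUMMY` and `compact_and_truncate` are hypothetical. The sketch shows only input/output behavior, not how the schemes perform this step.

```python
from typing import List, Optional

DUMMY = None  # hypothetical sentinel for a dummy slot


def compact_and_truncate(slots: List[Optional[str]], l: int) -> List[Optional[str]]:
    """Move non-dummy values ahead of dummies (order preserved), then keep the first l entries."""
    real = [v for v in slots if v is not DUMMY]
    dummies = [v for v in slots if v is DUMMY]
    return (real + dummies)[:l]


slots = ["a", DUMMY, "b", DUMMY, DUMMY, "c"]
valid_bound = compact_and_truncate(slots, 4)    # l >= true volume of 3
invalid_bound = compact_and_truncate(slots, 2)  # l < true volume of 3
```

With ℓ = 4 every real value survives the truncation. With ℓ = 2, which is not a valid upper bound for this label, the value "c" is lost.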

I Circumventing Label Equality Lower Bound [34]
In a work by Patel et al. [34], it was shown that any encrypted multi-map scheme that leaks strictly less than label equality must incur Ω(log n) overhead, similar to that of an oblivious RAM (ORAM). Throughout our work, we justify leaking label equality as a way to obtain better efficiency than ORAMs and circumvent this lower bound.
One may wonder whether it is possible to circumvent the lower bound in [34] in other ways without leaking label equality. One attempt may be to restrict the sequence of valid operations to avoid the one that was used to prove the lower bound in [34]. Recall that the proof in [34] considers a hard sequence of k operations with k/2 updates with value tuples of length ℓ to unique labels followed by k/2 queries to the same labels in any order. For this set of sequences, it was shown that Ω(log(kℓ)) overhead is required for a wide range of choices for k and ℓ. One obtains the above lower bound by setting kℓ = n^α for any constant 0 < α ≤ 1.
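The last step can be made explicit. Substituting kℓ = n^α into the bound, and using that α is a constant:

```latex
\[
  \Omega\bigl(\log(k\ell)\bigr)
  \;=\; \Omega\bigl(\log n^{\alpha}\bigr)
  \;=\; \Omega(\alpha \log n)
  \;=\; \Omega(\log n),
  \qquad 0 < \alpha \le 1 \text{ constant}.
\]
```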
In theory, it is possible to construct an encrypted multi-map that, without leaking label equality, is faster for sequences that are not among the above hard sequences. That is, the construction is faster for non-hard sequences but slower for hard sequences. Unfortunately, the set of hard sequences is large and includes the natural setting of updating k/2 different labels and then querying them. Therefore, the practical benefits of such a construction remain unclear. Nevertheless, we leave it as an interesting open question whether this efficiency dichotomy is achievable.