Result-pattern-hiding Conjunctive Searchable Symmetric Encryption with Forward and Backward Privacy

Dynamic searchable symmetric encryption (DSSE) enables the data owner to outsource its database (document sets) to an untrusted server and to search and update it securely and efficiently. Conjunctive DSSE can process conjunctive queries that return the documents containing multiple keywords. However, a conjunctive search could leak the keyword pair result pattern (KPRP), where attackers learn which documents contain any two keywords involved in the query. File-injection attacks show that KPRP can be exploited to recover searched keywords. To protect data effectively, DSSE should also achieve forward privacy, i.e., hide the link between an update and previous searches, and backward privacy, i.e., prevent deleted entries from being accessed by subsequent searches. Otherwise, the attacker could recover updated/searched keywords and records. However, no conjunctive DSSE scheme in the literature hides KPRP with sub-linear search efficiency while guaranteeing forward and backward privacy. In this work, we propose the first sub-linear KPRP-hiding conjunctive DSSE scheme (named HDXT) with both forward and backward privacy guarantees. To achieve these three security properties, we introduce a new cryptographic primitive: Attribute-updatable Hidden Map Encryption (AUHME). AUHME enables HDXT to efficiently and securely perform conjunctive queries and to update the database in an oblivious way. Compared with previous work that has weaker security guarantees, HDXT shows comparable, and in some cases even better, performance.


INTRODUCTION
Searchable symmetric encryption (SSE) enables the client to outsource an encrypted database to an untrusted server and then search it securely. Dynamic SSE (DSSE) additionally allows the client to securely update the database. In a typical setting of SSE, the database DB is a collection of documents associated with a search index, commonly represented by a set of keyword-document pairs. If a keyword-document pair is in the index, it means that the document contains the keyword. A search query returns the documents that have a specific relationship with the searched keyword(s). The index, documents, and queries are all encrypted before being sent to the server.
An ideal goal of SSE is to efficiently and securely support query types as rich as the plaintext database, such as single-keyword queries [7, 8, 11, 13, 15, 28, 43-45, 53, 56], Boolean queries [12, 25, 31, 37, 39], range queries [48, 55], and update queries. However, there exists a trade-off among performance, security, and functionality for SSE. Existing SSE schemes usually achieve better performance and/or functionality at the cost of information leakage. For instance, Cash et al. [11] designed an SSE scheme, called OXT, that supports conjunctive queries in sub-linear time. A conjunctive query, represented as w_1 ∧ · · · ∧ w_n, searches for the documents containing all n keywords, where n > 1. However, OXT leaks DB(w_1) ∩ DB(w_j), where DB(w) is the set of the documents containing the keyword w, and 2 ≤ j ≤ n. Such leakage is referred to as the keyword pair result pattern (KPRP), and it can be generalised to DB(w_i) ∩ DB(w_j), where 1 ≤ i < j ≤ n. The file-injection attack [54] shows that KPRP leakage is not acceptable: by injecting documents into the database, attackers could leverage KPRP to first recover DB(w_i) and then learn w_i for 1 ≤ i ≤ n.
In the dynamic setting, forward and backward privacy have been identified by the literature [8, 13, 39, 48, 55, 56] as two crucial security notions for DSSE. Forward privacy hides the link between an update query and previous searches. Achieving forward privacy is essential to resist the file-injection attack [54]; otherwise updated keywords can be recovered. Backward privacy ensures that search queries do not reveal results that were deleted. Bost et al. [8] introduce three types of backward privacy, from Type-I, which leaks the least, to Type-III, which reveals the most information.
A naive solution to hide the KPRP is to search each keyword with a response-hiding single-keyword SSE, such as MITRA [13], and intersect the results on the client. Response-hiding SSE does not reveal the search result (i.e., the identifiers of matching documents) in plaintext to the server. Although the adopted single-keyword SSE could be forward and backward private, the naive solution causes computational and communication overhead worse than O(Σ_{i=1}^{n} |DB(w_i)|), which is inefficient especially when one or more of the keywords in the search query have high-frequency occurrence.
[Table 1: comparison of conjunctive SSE schemes (including HXT [31], [25], [50], [39], and ODXT [39]). Notation: |W|, |D|, and N are the number of keywords, documents, and keyword-document pairs in the database, respectively; N ≤ |W||D|. n denotes the number of keywords involved in the query and m the number of documents matching the query. π_1 is the number of updates related to w_1. τ = Σ_{i=1}^{n} τ_i, where τ_i is the number of updates related to w_i since the last search involving w_i. γ and ξ are the parameters for the Bloom filter, which typically are 20 and 29. For an update, W_d denotes the number of keywords contained in the updated document and v denotes the number of keywords involved in an edit operation. λ is the security parameter and h denotes the average number of documents matched by any two keywords in the database.]
To the best of our knowledge, no DSSE scheme in the literature can support conjunctive queries both securely and efficiently. As shown in Table 1, existing SSE schemes that support conjunctive queries are either static or leak KPRP (the static ones with KPRP leakage are not included in the table). Only Zuo et al.'s scheme FBDSSE-CQ [57] hides KPRP while ensuring forward and Type-II backward privacy, but at the expense of linear search overhead.
Our Work. In this paper, we aim to fill this gap, i.e., design a forward and backward private DSSE scheme that supports conjunctive queries in sub-linear time and hides the KPRP. Our solution is inspired by HXT [31], a static SSE that supports KPRP-hiding conjunctions. Compared with HXT [31], however, our solution is more efficient and supports update queries with solid security guarantees; thus, we name our approach HDXT. As done in OXT [12] and HXT [31], the high-level idea of HDXT is to perform the conjunctive search in two steps: first search for DB(w_1), then filter out the results that do not match w_2 ∧ · · · ∧ w_n. Here w_1 is the keyword with the minimum occurrence among the n keywords, called the s-term; the other keywords are called x-terms. In HDXT, DB(w_1) is obtained with a response-hiding single-keyword DSSE scheme. The challenge lies in how to perform the second step securely and efficiently.
To overcome the challenge, we introduce a new cryptographic primitive: attribute-updatable hidden map encryption (AUHME). AUHME allows us to securely query whether a set of pairs m_p is a subset of a larger set m_a. If the answer is no, it does not leak which pair(s) in m_p do not belong to m_a. This property enables us to perform the second step without leaking KPRP. Specifically, assume W is the set of all the keywords in DB and D contains all the document identifiers of DB. HDXT has an index structure DB′ = {(w||id, v) | w ∈ W, id ∈ D}, where v is either 1 or 0, indicating whether document id contains w or not, respectively. DB′ is encrypted with AUHME. To perform the second step, the client constructs I_id = {(w_2||id, 1), · · · , (w_n||id, 1)} for each id ∈ DB(w_1) and queries whether I_id is a subset of DB′ with AUHME. If the subset query succeeds, id matches the conjunctive query. Otherwise, AUHME conceals whether (w_i||id, 1) is in DB′ for every 2 ≤ i ≤ n, which protects DB(w_1) ∩ DB(w_i), indicating that KPRP is hidden successfully. DB′ can be securely updated with the update function of AUHME. Basically, the client caches recent updates locally and evicts the cache to DB′ when it is full. The eviction is processed in an oblivious way such that the server cannot learn which entries of DB′ were updated. Also, for any subset query on I_id (id ∈ DB(w_1)) in a subsequent search, the server only learns the query result with respect to the latest DB′, which reveals no information about the deleted keyword-document pairs. Consequently, the queries and updates over DB′ satisfy forward privacy and the highest level of backward privacy. That is, HDXT achieves forward and backward security as long as the adopted single-keyword DSSE does.
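Ignoring the cryptographic layer entirely, the two-step search over the extended index DB′ can be sketched in plaintext as follows; this is a reading aid only, and the names build_index and conj_search are illustrative, not from the paper.

```python
def build_index(docs):
    """docs: {doc_id: set_of_keywords}. Returns (DB, DBp), where DB maps each
    keyword to its document set and DBp is the extended index {w||id: 1 or 0}
    described in the text."""
    keywords = set().union(*docs.values())
    DB = {w: {i for i, kws in docs.items() if w in kws} for w in keywords}
    DBp = {f"{w}||{i}": int(w in kws)
           for w in keywords for i, kws in docs.items()}
    return DB, DBp

def conj_search(DB, DBp, terms):
    """Step 1: fetch DB(w1) for the least-frequent term (the s-term).
    Step 2: for each candidate id, subset-test the x-terms against DBp."""
    s_term = min(terms, key=lambda w: len(DB[w]))
    x_terms = [w for w in terms if w != s_term]
    result = set()
    for i in DB[s_term]:
        I_id = {f"{w}||{i}": 1 for w in x_terms}   # predicate map for id i
        if all(DBp.get(k) == v for k, v in I_id.items()):
            result.add(i)
    return result

docs = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b", "c"}}
DB, DBp = build_index(docs)
assert conj_search(DB, DBp, ["a", "b"]) == {1, 3}
```

In HDXT, step 1 is performed via the response-hiding single-keyword DSSE, and the subset test of step 2 is performed under encryption via AUHME, so a failed test reveals nothing about which individual x-term failed.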
In Table 1, we summarise the performance overheads and security properties achieved by HDXT and other conjunctive DSSE schemes 1 . Compared with FBDSSE-CQ, HDXT has much less computational and communication overhead for search queries. HDXT also has less storage overhead on the server side. In Section 5.2, we compare HDXT with other schemes in detail.
We also experimentally compare the performance of HDXT with HXT and MITRA_CONJ [39] (the naive solution implemented by Patranabis and Mukhopadhyay). The results show that HDXT is 10.7× and 13× faster than HXT and MITRA_CONJ, respectively, for queries involving 11 keywords.
Our Contributions. Overall, our contribution can be summarized as below.
(1) We are the first to introduce the concept of AUHME and to design a selectively-semantically secure AUHME scheme.
(2) We propose the first conjunctive DSSE scheme, HDXT, that hides KPRP while preserving sub-linear search efficiency. HDXT also achieves forward privacy and backward privacy of at least Type-II.
(3) We implement a prototype of HDXT and evaluate its performance with real-world datasets.
(4) We prove that our AUHME scheme is selectively-semantically secure, and that HDXT is adaptively secure while achieving the three security properties mentioned above.

PRELIMINARIES
In this section, we first introduce the notations used in the following sections. Then we provide the definitions for AUHME and DSSE.

Notations
Throughout this paper, {0, 1}^l denotes the set of all binary strings of length l, and {0, 1}^* denotes the set of strings of arbitrary length. 0^l represents the binary string of length l where every bit is 0. || denotes the concatenation of two strings. ⊥ represents an empty string. a_1 ←$ S means that a_1 is sampled uniformly at random from the set S. |X| represents the cardinality of a set/map/list X.
A map X is a data structure that associates keys to values, where each entry contains exactly one unique key and its corresponding value. We also consider X as a set that contains (key, value) pairs. We use X : S_1 → S_2 to represent that the space for keys is S_1 and the space for values is S_2, and X ⊑ S_1 → S_2 to denote that the key space of X is a subset of (or equal to) S_1 and the value space of X is S_2.

Attribute-updatable Hidden Map Encryption
Predicate encryption can encrypt a message associated with an attribute A to a ciphertext and generate a key SK corresponding to a predicate f such that the ciphertext can be correctly decrypted using SK if and only if f (A) = 1, while ensuring that nothing about the message is leaked if f (A) = 0. This security property is called payload-hiding. The predicate encryption is attribute-hiding if the ciphertext also conceals information about A.
In this paper, we introduce a special attribute-hiding predicate encryption: Hidden Map Encryption (HME), where the attribute A is a map. Let K, K_a, and V be three finite sets, where K_a ⊆ K. HME works for a class of predicates Φ_hme = {ϕ^hme_{m_p} | m_p ⊑ K_a → V} where, for an attribute map m_a : K_a → V, ϕ^hme_{m_p}(m_a) = 1 if m_p ⊆ m_a, and ϕ^hme_{m_p}(m_a) = 0 otherwise. That is, ϕ^hme_{m_p}(m_a) is satisfied when the pairs in m_p are all included in m_a, and we say m_p is a subset of m_a in this case.
We introduce the attribute-updatability property to HME, which means the attribute map can be updated without reproducing the ciphertext from scratch. Specifically, attribute-updatable HME (AUHME) supports two kinds of updates: adding a pair into m_a and editing the value of an existing pair in m_a. Deleting a pair can be achieved by editing the value of the pair to ⊥. Formally, in the symmetric-key setting, AUHME consists of the following six algorithms:
• Setup(1^λ) → (msk, δ): On input the security parameter 1^λ, it outputs a master secret key msk and a state δ.
• Enc(msk, m_a : K_a → V, M) → C: Taking as input the master secret key msk, an attribute map m_a, and a message M, it outputs the ciphertext C.
• GenKey(msk, δ, m_p ⊑ K_a → V) → dk: Taking as input the master secret key msk, the current state δ, and a predicate map m_p, it outputs a decryption key dk.
• Query(dk, C) → M or ⊥: On input a decryption key dk and the ciphertext C, it outputs M or ⊥.
• GenUpd(msk, δ, op, u_1 ∈ K, u_2 ∈ V) → (UTok, δ′): On input the master secret key msk, the current state δ, an operator op ∈ {add, edit}, and a pair (u_1, u_2), it produces an update token UTok and a possibly updated state δ′. Note that if op = add, u_1 is inserted into K_a.
• ApplyUpd(UTok, C) → C′: On input a token UTok and the ciphertext C, it outputs the updated ciphertext C′.
The correctness of an AUHME scheme requires that, for all possible legal inputs, after running Enc and GenKey, and even after performing a polynomial number of updates on m_a with GenUpd and ApplyUpd, if ϕ^hme_{m_p}(m_a) = 1, then Query(dk, C) = M; otherwise Query(dk, C) = ⊥ with all but negligible probability.
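To make the interface concrete, the six algorithms and the correctness condition can be modelled by a deliberately insecure "spec" implementation that stores everything in plaintext; ToyAUHME and all its internals are illustrative assumptions, not the actual construction (which appears in Section 3).

```python
import os

class ToyAUHME:
    """Plaintext model of the AUHME interface: mirrors the algorithm
    signatures and the correctness condition, with no hiding at all."""

    def Setup(self):
        msk, delta = os.urandom(16), {}      # key is unused in this toy model
        return msk, delta

    def Enc(self, msk, m_a, M):
        return {"m_a": dict(m_a), "M": M}    # "ciphertext" C, in the clear

    def GenKey(self, msk, delta, m_p):
        return {"m_p": dict(m_p)}            # decryption key dk

    def Query(self, dk, C):
        # Return M iff every pair of m_p is contained in m_a (None plays ⊥).
        ok = all(C["m_a"].get(k) == v for k, v in dk["m_p"].items())
        return C["M"] if ok else None

    def GenUpd(self, msk, delta, op, u1, u2):
        assert op in ("add", "edit")
        return (op, u1, u2), delta           # UTok and (unchanged) state

    def ApplyUpd(self, utok, C):
        op, u1, u2 = utok
        C["m_a"][u1] = u2                    # add and edit both set the value
        return C
```

A query against the updated ciphertext is answered with respect to the latest attribute map, which is exactly the behaviour the correctness condition demands.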
A variation of predicate encryption is a predicate-only scheme where the inputs of Enc do not include any M, and Query only reveals whether the predicate is satisfied. For a predicate-only HME, Query(dk, C) = ϕ hme m p (m a ) for any dk ← GenKey(msk, δ, m p ).
• AUHMEREAL_A(λ): (1) A selects an attribute map m_a, and the challenger runs Setup(1^λ) to obtain (msk, δ). (2) A may make ρ_1 queries in an adaptive way. For a key generation query on a predicate map m_p, A is given dk generated by GenKey(msk, δ, m_p). For an update query (op, u_1, u_2), A is given the update token UTok output by GenUpd(msk, δ, op, u_1, u_2). (3) A chooses a message M and is given the ciphertext C generated by Enc(msk, m_a, M). (4) A may make ρ_2 queries adaptively, which are processed as in (2). (5) With the view observed by A as the input, A outputs a bit b.
• AUHMEIDEAL_{A,S,L^h}(λ): (1) A selects an attribute map m_a. (2) A may adaptively make ρ_1 queries. For a query on m_p, A is given dk output by S(L^h_q(m_p), ϕ^hme_{m_p}(m_a)). For an update (op, u_1, u_2), A is given UTok generated by S(L^h_u(op, u_1, u_2)). (3) A chooses a message M and is given the ciphertext C generated by S(1^{|M|}, |m_a|). (4) A may make ρ_2 queries adaptively as in (2). (5) With the view observed by A as the input, A outputs a bit b.
Intuitively, the security of AUHME requires that the adversary learns nothing about M and m_a, a query only reveals the query result, and an update discloses nothing. Here we consider a relaxed security where queries and updates might leak a little information. We denote the allowed leakage as L^h = (L^h_q(m_p), L^h_u(op, u_1, u_2)), which captures the query and update leakages, respectively. Briefly, we require that L^h_q(m_p) only exposes which keys exist in the predicate map m_p, which is called the key pattern. L^h_u(op, u_1, u_2) can reveal op but leaks nothing about (u_1, u_2). In Definition 2.1, we provide the security definition for AUHME in the simulation-based setting.
Definition 2.1. We say an AUHME scheme is L^h-selectively-semantically secure if, for any security parameter λ and any probabilistic polynomial-time adversary A, there exists a simulator S and a negligible function negl such that |Pr[AUHMEREAL_A(λ) = 1] − Pr[AUHMEIDEAL_{A,S,L^h}(λ) = 1]| ≤ negl(λ), where AUHMEREAL_A(λ) and AUHMEIDEAL_{A,S,L^h}(λ) are shown in Fig. 1.

Dynamic Searchable Symmetric Encryption
D and W store all the document identifiers and keywords in the database, respectively, and W_i denotes the set of keywords contained in the document with identifier id_i. Given a search formula ψ(w) involving a collection of keywords w ⊆ W, DB(ψ(w)) represents the identifiers of the documents that satisfy ψ(w). ψ(w) is a conjunctive query if it combines every keyword w ∈ w with the operator '∧' (AND). An identifier id_i satisfies a conjunction over w iff w ⊆ W_i. Moreover, we define the extended database DB′: for every w ∈ W and id ∈ D, DB′[w||id] is 1 if the document id contains w, and is 0 otherwise. Finally, note that a dynamic database supports inserting a new document into the database (add), adding keywords in W to an existing document (edit+), removing keywords from an existing document (edit−), and deleting documents from the database (del). Formally, DSSE consists of the following three protocols:
• Setup(λ, DB; ⊥) → (K, s; EDB): On input the security parameter λ and a database DB, the client outputs a secret key K and a state s. The server outputs an encrypted database EDB without any input.
• Search(K, s, ψ(w); EDB) → (s′, DB(ψ(w)); EDB′): The client takes as input the secret key K, the current state s, and a search formula ψ(w). The server has EDB as input. Eventually, the client outputs a possibly updated state s′ and the search result DB(ψ(w)). The server outputs a possibly updated encrypted database EDB′.
• Update(K, s, op, in; EDB) → (s′; EDB′): The client takes as input the secret key K, the current state s, an operator op ∈ {add, edit+, edit−, del}, and the update information in = (id, W_id), where id is a document identifier and W_id is a collection of keywords. The server takes as input the encrypted database EDB. Finally, the client outputs an updated secret state s′, and the server outputs an updated encrypted database EDB′.
The correctness of DSSE requires that, for every database DB, every encrypted database EDB generated by DSSE.Setup or DSSE.Update, and every supported search formula ψ(w), the search query on ψ(w) returns DB(ψ(w)) to the client.
As done in previous literature [7, 11, 28], we use three functions L = (L^Stp(DB), L^Srch(DB, ψ(w)), L^Updt(DB, op, in)) to capture the leakages of the setup, search, and update protocols, respectively. We borrow the formal definition for DSSE from [11, 28], which is shown in Definition 2.2.
Definition 2.2. Let Σ = {Setup, Search, Update} denote a DSSE scheme. We say Σ is L-adaptively-secure if, for any security parameter λ and any probabilistic polynomial-time adversary A, there exist a simulator S and a negligible function negl such that |Pr[SSEREAL^Σ_A(λ) = 1] − Pr[SSEIDEAL^Σ_{A,S,L}(λ) = 1]| ≤ negl(λ), where SSEREAL^Σ_A(λ) and SSEIDEAL^Σ_{A,S,L}(λ) are defined as:
• SSEREAL^Σ_A(λ): At first, A chooses a database DB and obtains EDB by invoking Setup(λ, DB). Then it repeatedly performs search queries Search(ψ(w)) and update queries Update(op, w, id) in an adaptive way. A receives all the transcripts generated during the above operations and outputs a bit b.
• SSEIDEAL^Σ_{A,S,L}(λ): A chooses a database DB and calls S(L^Stp(DB)) to get the encrypted database EDB. After that, it adaptively performs search queries (resp. update queries) by calling S(L^Srch(DB, w)) (resp. S(L^Updt(DB, op, w, id))). A observes the transcripts of all operations and outputs a bit b.
Forward Privacy. Forward privacy requires that an update reveals nothing about the updated keyword. We borrow the definition from [7,8], which is shown in Definition 2.3.
Definition 2.3. An L-adaptively-secure DSSE scheme is forward private iff its update leakage function can be written as L^Updt(DB, op, in) = L′(op, |in|), where L′ is a stateless function.
Backward Privacy. Backward privacy limits what the server could learn about a deleted entry from the queries issued after the deletion. Bost et al. [8] introduce three types of backward privacy for single-keyword DSSE, from Type-I to Type-III. Briefly, Type-I requires that a single-keyword search on w only reveals DB(w), the time when each document in DB(w) was inserted, and the total number of updates related to w. Type-II additionally leaks the timestamps of the updates related to w. The leakages of Type-III also include which deletion cancels which addition. To extend the definition to conjunctive DSSE, similar to [48], we say that a multi-keyword DSSE is backward private iff the update and search leakages about every keyword do not exceed what is revealed by a backward private single-keyword DSSE. Nevertheless, Bost et al.'s definition has two assumptions: the initial database is empty, and a keyword w cannot be inserted into a document from which w was previously removed. We generalize their definition by eradicating the two assumptions. Since our DSSE scheme achieves at least Type-II, we only define Type-I and Type-II backward privacy.
We use Q to represent the list of the issued queries, (t, q) to denote a conjunctive query, and (t, op, in) to stand for an update, where t is the timestamp. For a conjunction q, q[i] is the i-th term involved in q. t ▷ denotes the timestamp of the setup protocol and DB ▷ is the initial database. For a conjunction q, π ▷ i records the number of documents containing the keyword q[i] in DB ▷ .
For a keyword w, TimeDB(w) outputs the identifiers currently matching w together with the timestamps at which these identifiers were first inserted into the database. Formally, TimeDB(w) = {(t, id) | id ∈ DB(w) and ∃W_id : (t, add, (id, W_id)) ∈ Q} ∪ {(t▷, id) | id ∈ DB(w) and id exists in DB▷}. Updates(w) is the list of timestamps of updates related to w. Formally, Updates(w) = {t | ∃W_id that contains w : (t, add, (id, W_id)) ∈ Q or (t, edit+, (id, W_id)) ∈ Q or (t, edit−, (id, W_id)) ∈ Q or (t, del, (id, W_id)) ∈ Q}. For a conjunction q, we write (TimeDB(q[i]))_{i=1}^{n} as TimeDB(q), (Updates(q[i]))_{i=1}^{n} as Updates(q), (π▷_i)_{i=1}^{n} as π▷(q), and (π_i)_{i=1}^{n} as π(q), where π_i is the sum of π▷_i and the number of updates related to q[i]. We give the definition in Definition 2.4. Note that the existing definitions [39, 57] either have strict assumptions or are specialized to their own schemes.
Definition 2.4. An L-adaptively-secure conjunctive DSSE scheme is Type-I backward private iff L^Updt(DB, op, in) = L′(op) and L^Srch(DB, q) = L′′(TimeDB(q), π(q)), and Type-II backward private iff L^Updt(DB, op, in) = L′(op) and L^Srch(DB, q) = L′′(TimeDB(q), Updates(q)), where L′ and L′′ are stateless functions.
Definition 2.4 is applicable to single-keyword DSSE by considering a search on w as a conjunction q = w. Our definition differs slightly from Bost et al.'s definition [8] due to the complex setting we consider, for which we make a detailed analysis in Appendix C.
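The leakage functions above can be computed mechanically from a query log. A minimal Python rendering follows, assuming an empty initial database so that the DB▷ terms vanish; the function names mirror the notation but are otherwise illustrative.

```python
def TimeDB(w, Q, current_DB_w):
    """Pairs (t, id) for the ids currently matching w, with the timestamps
    of the additions that inserted them. Q is a list of (t, op, (id, W_id))."""
    out = set()
    for (t, op, (doc_id, W_id)) in Q:
        if op == "add" and w in W_id and doc_id in current_DB_w:
            out.add((t, doc_id))
    return out

def Updates(w, Q):
    """Timestamps of all updates whose keyword set contains w."""
    return {t for (t, op, (doc_id, W_id)) in Q if w in W_id}

Q = [(1, "add", (7, {"a", "b"})),
     (2, "add", (8, {"a"})),
     (3, "del", (8, {"a"}))]
# After these updates, DB("a") = {7}: document 8 was added and then deleted.
assert TimeDB("a", Q, {7}) == {(1, 7)}   # the deletion of 8 is invisible
assert Updates("a", Q) == {1, 2, 3}      # but its update timestamps leak
```

This makes the gap between the two notions tangible: a Type-I scheme reveals only TimeDB and the update count, whereas a Type-II scheme additionally reveals the timestamps in Updates.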
KPRP-hiding. The KPRP is a leakage related to the keywords involved in the same search query. A conjunctive query q aims to obtain the documents containing all the keywords involved in q, i.e., DB(q). The KPRP-hiding property means that the server may learn DB(q) after the search, but cannot learn which other documents contain any two keywords involved in q. We define KPRP-hiding in Definition 2.5.

AUHME CONSTRUCTION
This section presents our predicate-only AUHME construction. In our construction, for the attribute map, the key can be an arbitrary string and the value belongs to {0, 1}, i.e., K is a finite set of arbitrary strings and V is {0, 1}. For update operations, the mapped value of any pair can only be updated to 0 or 1. Moreover, the new value must be different from the stale one; otherwise the update is invalid, which is forbidden in our construction. For simplicity, our construction does not consider deleting pairs from m a .

Overview of AUHME Construction
Query Process. The main purpose of our AUHME construction is to securely query if a predicate map m p is a subset of the attribute map m a , i.e., query if all the pairs in m p are also included in m a , with two requirements: R1) the pairs of the two maps should be protected in any case; and R2) if m p is not a subset of m a , which pairs in m p are included in m a should not be leaked.
To achieve R1, every pair in the two maps is encrypted with a pseudorandom function (PRF) F : {0, 1}^λ × {0, 1}^* → {0, 1}^λ. The ciphertexts of m_a's pairs are stored in a map C, where every entry is indexed by its associated encrypted key. Our strategy to achieve R2 is based on the XOR-MAC technique [4]: we XOR the ciphertexts of m_p's pairs and obtain a string xors. To conceal xors, we generate d ← H(r||xors) as the query token, where r is a random string and H : {0, 1}^* → {0, 1}^λ is a hash function. During the query, only the ciphertexts of the pairs whose keys are included in m_p are picked out from C and XORed into xors′. Given r, we can check whether H(r||xors′) = H(r||xors), which holds only when xors′ = xors, indicating that m_p is a subset of m_a. The query process exposes the query result, |m_p|, and the access pattern over m_a, from which the adversary cannot break R1 and R2.
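A minimal sketch of this query process, with HMAC-SHA256 standing in for F and SHA-256 for H; the key-derivation details (e.g. how the counter cnt enters the PRF inputs) are omitted, and names such as encrypt_map and gen_key are illustrative.

```python
import hashlib, hmac, os
from functools import reduce

k1, k2 = os.urandom(32), os.urandom(32)
F = lambda k, x: hmac.new(k, x.encode(), hashlib.sha256).digest()  # PRF stand-in
H = lambda x: hashlib.sha256(x).digest()                           # hash H
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))

def encrypt_map(m_a):
    # C[F(k1, key)] = F(k2, key||value): one ciphertext per pair of m_a
    return {F(k1, k): F(k2, f"{k}||{v}") for k, v in m_a.items()}

def gen_key(m_p):
    # Query token: encrypted keys L, a nonce r, and d = H(r||xors)
    L = [F(k1, k) for k in m_p]
    xors = reduce(xor, (F(k2, f"{k}||{v}") for k, v in m_p.items()))
    r = os.urandom(16)
    return L, r, H(r + xors)

def query(dk, C):
    # Server side: XOR the selected ciphertexts and test against d
    L, r, d = dk
    xors2 = reduce(xor, (C[l] for l in L))
    return H(r + xors2) == d        # 1 iff m_p is a subset of m_a

m_a = {"a||1": 1, "b||1": 1, "c||1": 0}
C = encrypt_map(m_a)
assert query(gen_key({"a||1": 1, "b||1": 1}), C)       # subset: accepted
assert not query(gen_key({"a||1": 1, "c||1": 1}), C)   # c||1 maps to 0: rejected
```

Note how the failing query only yields a non-matching hash; the XOR folds all pairs of m_p together, so the server cannot tell which individual pair caused the mismatch (requirement R2).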
Update Process. For an update, we aim to break the link between the update and previous queries, i.e., conceal whether the updated key is included in any previously queried m_p. As mentioned before, the attribute map m_a can be updated in two different ways: adding a new pair, or editing the value of an existing pair. For an addition, the newly added pairs must have new keys, which means they cannot be related to any previously queried m_p. Thus, we can directly encrypt the new pairs with F and add their ciphertexts into C. However, editing pairs is challenging. During the query, the access pattern over m_a is leaked in order to generate xors′. To break the link between editing updates and queries, we have to edit the pairs obliviously; otherwise, they can be linked based on the access pattern.
To protect editing updates efficiently, we leverage an ORAM-like idea: we create a local cache for the recent editing updates and evict them to C when the cache is full. In the eviction procedure, we re-randomise all the pairs in C so as to hide the access pattern. Specifically, the pairs without updates are XORed with a string that does not affect their values, and the pairs with updates are XORed with a string that changes their values to the updated ones, which can be easily achieved as each value is either 1 or 0 in plaintext. The strings generated for the two cases are indistinguishable as they are derived with F. Thus, the adversary cannot tell which pairs are actually updated.
• AUHME.Setup(1^λ): It generates the secret key msk = (k_1, k_2, k_3) and the initial state δ = (cnt, T, ζ, S). Specifically, cnt counts the number of evictions that have been executed. T is the cache for editing updates, which is a map with a capacity of ζ. S is ⊥ except when performing an eviction. Within an eviction, S stores F(k_1, k) for every key k ∈ m_a. S can be pre-computed or pre-requested from C before the eviction.

Details of AUHME Construction
• AUHME.Enc(msk, m_a): The algorithm produces the ciphertext C, which is in the form of a map: for every pair of m_a, its encrypted value is stored as an entry of C, indexed by the associated encrypted key.
• AUHME.GenUpd(msk, δ, op, k_u, v_u): Given the pair (k_u, v_u) and the operator op, the algorithm generates the update token tok and updates the state δ. tok is initialized to be an empty map. Hereafter we denote the attribute map associated with C as cm_a, which is outdated when the local cache is not empty. Specifically, if op = add, the algorithm computes (ℓ, ν) from (k_u, v_u) and the global counter cnt, and sets tok[ℓ] to ν. If op = edit, the update is cached by CInsert. If an update for k_u is already cached in T, the new edit must revert the pair to its stale value, because we assume all the updates are valid, i.e., the new value must be different from the stale one; in this case, CInsert deletes T[ℓ]. Otherwise, it sets T[ℓ] to v_u. Recall that each value in C will be re-randomised with a string during an eviction. Such a string is derived from the encrypted key in C. So before running CEvict, we need to obtain all the encrypted keys, either from C or by re-encrypting all the keys of m_a (if they are accessible), and store them into S. During CEvict, for a pair without a cached update, its value v_a should be re-randomised without modifying the plaintext of v_a; we achieve that by generating a string u that only advances cnt. For a pair with a cached update, we produce u that updates v_a to T[ℓ] and advances cnt. When CEvict is done, CClear is called to clear T. Finally, the algorithm returns (UTok, δ), where δ = (cnt + 1, T, ζ, ⊥).
[Figure: pseudocode for AUHME.Setup, AUHME.Enc, AUHME.GenKey, AUHME.Query, AUHME.GenUpd, and AUHME.ApplyUpd.]
• AUHME.ApplyUpd(UTok, C): This algorithm updates the ciphertext C with UTok = (op, tok). In the case that op = add, C is updated to the union of C and tok. If op = edit, which corresponds to an eviction, the algorithm computes C[ℓ] ← C[ℓ] ⊕ tok[ℓ] for each key ℓ in tok.
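The oblivious eviction can be sketched as a toy model in which every ciphertext is bound to an epoch counter and re-randomised by a PRF-derived pad; HMAC-SHA256 stands in for F, and the helper names (ctext, evict) as well as the exact pad derivation are illustrative assumptions, not the paper's construction.

```python
import hashlib, hmac, os

k = os.urandom(32)
F = lambda x: hmac.new(k, x.encode(), hashlib.sha256).digest()  # PRF stand-in
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))

def ctext(l, v, cnt):
    """Ciphertext of value v for key l at eviction epoch cnt."""
    return F(f"{l}||{v}||{cnt}")

def evict(C, m_a, T, cnt):
    """Re-randomise every entry of C for epoch cnt + 1. Entries with a cached
    update in T move to the new value; all others keep theirs. The server
    only ever sees pads, which look uniform in both cases."""
    for l, v in m_a.items():
        v_new = T.get(l, v)
        pad = xor(ctext(l, v, cnt), ctext(l, v_new, cnt + 1))
        C[l] = xor(C[l], pad)       # ApplyUpd: C[l] <- C[l] XOR tok[l]
        m_a[l] = v_new              # client-side bookkeeping for the sketch
    T.clear()
    return cnt + 1

m_a = {"x": 1, "y": 0}
cnt = 0
C = {l: ctext(l, v, cnt) for l, v in m_a.items()}
cnt = evict(C, m_a, {"y": 1}, cnt)  # one cached edit: y -> 1
assert C["y"] == ctext("y", 1, 1) and C["x"] == ctext("x", 1, 1)
```

Because every entry receives a fresh-looking pad, the server cannot distinguish the "keep value" pads from the "flip value" pads, which is the property the prose above relies on.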
• AUHME.GenKey(msk, δ, m_p): It generates the decryption key dk = (L, r, d), i.e., the token for querying whether m_p is a subset of m_a. Since the most recent editing updates are cached locally, the values stored in C may be out of date. Thus, for a pair (k_p, v_p) ∈ m_p and ℓ = F(k_1, k_p), we have three cases to process: 1) an update for k_p is cached in T but v_p does not match the cached value, i.e., T[ℓ] ≠ ⊥ and T[ℓ] ≠ v_p; 2) an update for k_p is cached in T and v_p matches the cached value, i.e., T[ℓ] = v_p; and 3) no update for k_p is cached in T, i.e., T[ℓ] = ⊥. The first case indicates that m_p is surely not a subset of the latest m_a, and if all the pairs in m_p are in the second case, m_p must be a subset of m_a. However, to avoid leaking information about updates, we generate dk and perform the query in all cases. L and r are generated in the same way for all cases, yet d is generated differently for the three cases. Specifically, L stores ℓ = F(k_1, k_p) for every k_p ∈ m_p, and r is a random string. For the first case, d is also a random string, as we already know the query result. For the last two cases, d is H(r||xors), where xors is generated by XORing all the encrypted pairs of m_p. In particular, in the second case (i.e., when T[ℓ] = v_p), C[ℓ]'s plaintext must be 1 − v_p; otherwise the update in T would be invalid, which is forbidden. To ensure the correctness of the query, we encrypt (k_p, 1 − v_p) for such pairs.
• AUHME.Query(dk, C): It first parses dk as (L, r, d). Then it XORs every value in C whose key belongs to L and obtains the result xors′. If H(r||xors′) = d, which demonstrates that xors′ = xors, the algorithm outputs 1; otherwise it outputs 0.
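The three GenKey cases can be made concrete with a toy version of GenKey/Query: cached-and-mismatched yields a random d (the answer is surely 0), cached-and-matched encrypts the stale value 1 − v_p so the XOR against the not-yet-evicted C still verifies, and uncached encrypts v_p directly. HMAC-SHA256 stands in for F and SHA-256 for H; all names and the key derivation are illustrative simplifications.

```python
import hashlib, hmac, os
from functools import reduce

k1, k2 = os.urandom(32), os.urandom(32)
F = lambda k, x: hmac.new(k, x.encode(), hashlib.sha256).digest()
H = lambda x: hashlib.sha256(x).digest()
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
enc = lambda kp, vp: F(k2, f"{kp}||{vp}")   # ciphertext of pair (kp, vp)

def gen_key(m_p, T):
    L, parts, mismatch = [], [], False
    for kp, vp in m_p.items():
        l = F(k1, kp)
        L.append(l)
        if T.get(l) is not None and T[l] != vp:
            mismatch = True                  # case 1: answer is surely 0
        elif T.get(l) == vp:
            parts.append(enc(kp, 1 - vp))    # case 2: C still holds 1 - vp
        else:
            parts.append(enc(kp, vp))        # case 3: no cached update
    r = os.urandom(16)
    d = os.urandom(32) if mismatch else H(r + reduce(xor, parts))
    return L, r, d

def query(dk, C):
    L, r, d = dk
    return H(r + reduce(xor, (C[l] for l in L))) == d

# C encodes m_a = {a: 0, b: 1}; an edit a -> 1 is cached in T, not yet in C.
C = {F(k1, "a"): enc("a", 0), F(k1, "b"): enc("b", 1)}
T = {F(k1, "a"): 1}
assert query(gen_key({"a": 1, "b": 1}, T), C)      # cases 2 and 3: accepted
assert not query(gen_key({"a": 0, "b": 1}, T), C)  # case 1: rejected
```

The random d in case 1 makes a surely-failing query indistinguishable from any other failing query, so the server learns nothing about cached updates.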

Complexities of AUHME
Encryption. Each pair in the map is encrypted with F; thus AUHME.Enc incurs O(|m_a|) computational complexity. The storage overhead added by C is also O(|m_a|).
Query. AUHME.GenKey generates dk by traversing each pair in m_p, resulting in O(|m_p|) computational overhead and token size. AUHME.Query processes only the entries of C whose keys are in L, also causing O(|m_p|) computational cost.
Update. To add a pair (k_u, v_u), AUHME.GenUpd derives two strings, resulting in O(1) computational overhead and token size. When op = edit, if the cache is not full, O(1) computational cost is paid to cache (k_u, v_u) and the token size is zero; otherwise an eviction happens. An eviction pseudorandomly derives |m_a| strings, which incurs O(|m_a|) computational overhead and token size. On average, the editing overhead amortized over each pair is O(|m_a|/ζ). For AUHME.ApplyUpd, the number of processed pairs equals the token size; therefore, the incurred overhead is O(1) for an addition and O(|m_a|/ζ) amortized for an edit operation.

Security of AUHME
To capture the query leakage, we first define a vector K. Initially, each key in m_a is inserted into K in sequence. When an addition involving (k_u, v_u) comes, k_u is inserted into K. Then we define a function Loc(m_p) that outputs the key pattern of a predicate map m_p. Formally, Loc(m_p) outputs a vector v such that, for all 1 ≤ i ≤ |m_p|, K[v[i]] equals the i-th key of m_p. An addition reveals the operator add. An edit operation leaks nothing except whether it incurs an eviction. We define a function IfEvic(k_u, v_u): if the edit operation (edit, k_u, v_u) makes an eviction occur, IfEvic(k_u, v_u) outputs 1; otherwise, it outputs nothing.
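The key pattern can be illustrated in a few lines (loc is an illustrative name): it exposes only which positions of K a predicate map touches, never the keys or values themselves.

```python
def loc(m_p, K):
    """Key-pattern leakage: the position in the insertion-ordered vector K
    of each key of the predicate map m_p."""
    return [K.index(k) for k in m_p]

K = ["a", "b", "c"]                        # keys of m_a in insertion order
assert loc({"b": 1, "c": 0}, K) == [1, 2]
assert loc({"b": 0, "c": 1}, K) == [1, 2]  # same pattern, different values
```

Two queries with the same keys but different values produce identical leakage, which is exactly what L^h_q is allowed to reveal.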
Theorem 3.1. If F is a secure PRF and H is modeled as a random oracle, our AUHME construction is L-selectively-semantically secure.

Overview of HDXT
In HDXT, the encrypted index consists of TMap and XMap. TMap is a structure produced by a response-hiding single-keyword DSSE scheme (denoted as RHS). Initially, XMap is obtained by using predicate-only AUHME to encrypt the extended database DB ′ defined in Section 2.3. Within a conjunction w 1 ∧ · · · ∧ w n , the client first makes a single-keyword query on w 1 with TMap to obtain DB(w 1 ). Then for each id ∈ DB(w 1 ), it builds a predicate map I that stores a mapping from w i ||id to 1 for 2 ≤ i ≤ n and issues an AUHME query to check if I is a subset of DB ′ . If the AUHME query returns 1, id matches the conjunction. The security of AUHME guarantees that the server cannot learn DB ′ [w i ||id] for all 2 ≤ i ≤ n if the AUHME query returns 0. Thus, KPRP-hiding can be ensured.
To update a keyword-document pair, TMap is trivially updated with RHS, and AUHME enables DB ′ to be updatable. As we have described in Section 3, to achieve secure edit operations, AUHME preserves a local cache of fixed size. For HDXT, the cache is kept by the client, and the cache capacity is set to |W|. Note that the incurred client storage is comparable to many SSE schemes [7,29,30,43].
Following the mainstream SSE, we assume there is an authentication scheme in place that enables the client and the server to verify each other's identities before exchanging any data. This can be implemented with the transport layer security (TLS) protocol, two-factor authentication [47,49], or human-memorizable password-based authentication [14]. In addition, in line with [25,39,50], we prohibit the incorrect updates introduced in [53]. Fig.4, Fig.5, and Fig.6 show the pseudocode for HDXT. RHS is adopted in a black-box way, and AUHME is abbreviated as HME.
• (s; EDB) ← HDXT.Update(K, s, op, in; EDB): Within an update, RHS is executed to update TMap. The update token for XMap is a map tok x .
As shown in Lines 3-17 (Fig.5), the client generates an AUHME addition token UTok for each pair and then merges these addition tokens into tok_x. In the case that op = edit+/edit−, DB′[w||id] should be changed to 1 (op = edit+) or 0 (op = edit−) for each w ∈ W_id. As presented in Lines 18-24 (Fig.5), for each w ∈ W_id, the client calls EditPair(msk, δ, op, id, w) to generate tok_x, which is either empty or an eviction token.
In EditPair(msk, δ, op, id, w), to allow an eviction to be completed in one round, if the cache would overflow, the client computes all the keys in XMap and includes them in the state of AUHME before calling HME.GenUpd. If op = del, XMap is unchanged. This does not affect subsequent searches, because the client will find that id was deleted during the related single-keyword searches on the s-term.
• (DB(w_1 ∧ con(w_2, · · · , w_n)); EDB) ← HDXT.Search(K, s, w_1 ∧ · · · ∧ w_n; EDB): Within a search on w_1 ∧ w_2 ∧ · · · ∧ w_n, HDXT first executes the search protocol of RHS, after which only the client gets DB(w_1). Then for each identifier id ∈ DB(w_1), it tests whether id satisfies w_2 ∧ · · · ∧ w_n. Specifically, the client stores DB(w_1) into a list R_1 and randomly shuffles the elements of R_1. For 1 ≤ j ≤ |R_1|, it takes id from R_1[j] and builds a map I_j = {(w_i||id, 1)}_{i=2}^n. The client calls AUHME to generate the decryption key dk for I_j, which is then inserted into the j-th position of a list DK. DK is sent to the server. With DK[j], the server calls AUHME to query whether I_j is a subset of DB′. If the AUHME query returns true, the server inserts j into a set Pos. Pos is then returned to the client. The final search result is R = {R_1[j]}_{j∈Pos}.
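The two-round flow above can be mocked in plaintext as follows (the real protocol replaces the subset tests with opaque AUHME decryption keys, so the server learns only a bit per shuffled position; all names here are illustrative):

```python
import random

# Plaintext mock of the two-round HDXT.Search flow; the real scheme hands
# the server only AUHME decryption keys, so it learns a yes/no bit per id.

def hdxt_search_mock(db, conjunction):
    w1, *xterms = conjunction
    # Round 1: response-hiding single-keyword search on the s-term.
    r1 = [doc_id for doc_id, kws in db.items() if w1 in kws]
    random.shuffle(r1)                  # hide the identifier order
    # Round 2: one "AUHME query" per id in DB(w1); the server would
    # return only the set Pos of matching positions.
    pos = {j for j, doc_id in enumerate(r1)
           if all(w in db[doc_id] for w in xterms)}
    return sorted(r1[j] for j in pos)

db = {1: {"a", "b"}, 2: {"a"}, 3: {"a", "b", "c"}}
print(hdxt_search_mock(db, ["a", "b", "c"]))   # -> [3]
```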

SECURITY AND PERFORMANCE ANALYSIS
In this section, we comprehensively analyze the security and performance achieved by HDXT.

Security of HDXT
To analyze the security of HDXT, we continue using the notions and functions introduced in Section 2.3. We denote the leakage function for RHS as L_RHS and introduce four additional functions.
IP(q) records the conditional intersection pattern with respect to a conjunction q. It is expressed as (IP(q[1], q[i]))_{i=2}^n; for 2 ≤ i ≤ n, IP(q[1], q[i]) depends on whether there exists a previous search q′ that satisfies two conditions.
AddTims(q[1]) outputs the timestamps at which the documents that belong to DB(q[1]) were added to the database. Formally, AddTims(q[1]) = {t | ∃id ∈ DB(q[1]) and W_id : (t, add, (id, W_id)) ∈ Q}.
Based on Q and |W|, the timestamps of the evictions can be obtained. If an eviction occurs within (op, in), Evic(op, in) outputs 1, otherwise it outputs nothing.
Within an update (op, in), updating TMap could expose the update leakage L^Updt_RHS of RHS. For a conjunction q, the single-keyword query on q[1] reveals L^Srch_RHS(DB, q[1]). From the queries to XMap, the server could directly learn |DB(q[1])| through the number of the issued AUHME queries in q. Since an AUHME query could leak the key pattern, it first could be linked to previous additions related to DB(q[1]), which is captured by AddTims(q[1]). Through the leaked key pattern, an AUHME query can also be associated with the previous conjunctions that have the same keys in their predicate maps. The leakage caused by this association is no more than the information captured by IP(q). After each conjunction, the server could obtain TimeIds(DB(q)). Formally, we can get Theorem 5.1.
Theorem 5.1. If RHS is L_RHS-adaptively secure and AUHME is selectively-semantically secure as defined in Section 3, HDXT is L_HDXT-adaptively secure, where L_HDXT consists of the leakages described above.

Notably, the search leakage does not reveal DB(q[i]) ∩ DB(q[j]) for any 1 ≤ i < j ≤ n, except for DB(q) itself. This demonstrates that HDXT successfully hides KPRP. The update leakage function clearly shows that HDXT inherits forward privacy from RHS.
Mitigating Other Attacks. Existing attacks can be classified into known-data/query attacks [5,10,23], inference attacks [21,33,34,41], and injection attacks [40,54]. For the first two types, the adversary is passive and requires some auxiliary information, such as a subset of the target databases/queries or a statistical distribution similar to the target databases/queries. For injection attacks, the adversary is active and capable of injecting a number of documents, without (or with far less) auxiliary information.
Injection attacks [40,54] are devastating for DSSE. HDXT mitigates the file-injection attack [54] by ensuring KPRP-hiding and forward privacy. Achieving forward privacy also helps to mitigate the injection attack [40] proposed by Poddar et al. Their attack leverages the response length of search queries and requires the adversary to replay search queries independently after a round of updates. Forward private SSE refreshes the token of a search query after each related update, which makes the search unreplayable by anyone but the client.
Among the passive attacks, most of them [10, 21, 23, 41] exploit co-occurrence patterns, i.e., the number of documents containing both w i and w j for any two queried keywords w i and w j . We claim that achieving KPRP-hiding is essential to prevent such attacks, otherwise KPRP directly exposes co-occurrence patterns. The other attacks demand explicit search patterns of single-keyword queries [33,34] or volume patterns [5] that capture the number of keywords contained by the document that matches a query. To mitigate them, we can further reduce the leakages by instantiating the RHS of HDXT with search-pattern-hiding DSSE [18], which prevents RHS from revealing search and volume patterns. Furthermore, before making AUHME queries within a conjunction, the client can insert some randomly selected document identifiers into R 1 (Fig.6). This step adds noise into the leakages caused by queries over XMap. Besides, the client could issue searches on negated terms described in Section 6 to further perturb the above leakages.
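The decoy-padding mitigation mentioned above can be sketched as follows (the padding policy and names are our illustrative assumptions; the client later filters the returned positions against the set of real identifiers):

```python
import random

# Sketch of the noise-adding mitigation: before issuing AUHME queries,
# the client pads R1 with randomly chosen decoy identifiers, so the
# server cannot tell which queries correspond to real members of DB(w1).

def pad_with_decoys(r1, all_ids, num_decoys, rng=random):
    decoys = rng.sample(sorted(set(all_ids) - set(r1)), num_decoys)
    padded = list(r1) + decoys
    rng.shuffle(padded)
    # The client remembers which entries are real and filters the
    # server's answers accordingly.
    return padded, set(r1)

padded, real = pad_with_decoys([1, 5], range(100), num_decoys=8)
print(len(padded))        # 10: the server sees 10 AUHME queries, not 2
```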

Performance of HDXT
For clarity, we instantiate RHS with MITRA [13] for the performance analysis. In the following, unless otherwise specified, the overhead refers to the computational and communication overhead.
The setup phase generates TMap and XMap directly with RHS and AUHME, which results in O(|DB^▷|) and O(|W||D|) overheads, respectively. Within an update on (op, (id, W_id)), HDXT uses RHS to update TMap, which costs O(1) overhead for each keyword-document pair. To update XMap when op = add, HDXT invokes the addition procedure of AUHME |W| times, which causes O(|W|) total overhead and O(|W|/|W_id|) average overhead per pair. When updating XMap during an edit query, an edit procedure of AUHME is invoked for every involved keyword-document pair. This edit process has the same complexity as AUHME, which is O(|W||D|/ζ) as shown in Section 3.3. HDXT sets ζ to |W|, hence the overhead amortized to each pair is O(|D|). Because a deletion only updates TMap, its overhead is O(1). For a conjunction q, RHS searches on q[1], which incurs O(π_1) overhead. Then HDXT issues |DB(q[1])| AUHME queries. Each AUHME query is about a predicate map of size n−1, which incurs O(n−1) overhead. The total overhead for a conjunction is O(π_1 + n|DB(q[1])|).
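Collecting the costs derived above (with ζ = |W| as set by HDXT; writing |W_id| for the number of keywords in the added document, which is our reading of the per-pair divisor for an addition):

```latex
% Summary of the HDXT overheads derived above; \zeta = |W| as set by HDXT.
\begin{align*}
\text{Setup:}\quad & O(|DB^{\triangleright}|) + O(|W|\,|D|) \\
\text{Addition (amortized per pair):}\quad & O(|W| / |W_{id}|) \\
\text{Edit (amortized per pair):}\quad & O(|W|\,|D| / \zeta) = O(|D|) \\
\text{Deletion:}\quad & O(1) \\
\text{Search on } w_1 \wedge \cdots \wedge w_n:\quad & O(\pi_1 + n\,|DB(q[1])|)
\end{align*}
```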
TMap and XMap cost O(N ) and O(|W||D|) server storage overheads, respectively. For the client storage, RHS causes O(|W| log |D|) overhead. The client also requires O(|W|λ) bits to keep the local cache. The total client storage is O(|W|(log |D| + λ)). Note that the eviction procedure in HDXT could be processed in a streaming manner to avoid excessive consumption of client storage.
Performance Comparison with Previous Work. Table 1 shows that HDXT outperforms FBDSSE-CQ [57] in every respect, especially in search and storage efficiency. Compared with other schemes [25,31,35,39,50] that have weaker security, HDXT achieves very competitive search efficiency, though it can be less efficient in editing and storage.
We argue that the reduced editing efficiency is the price HDXT pays for its small leakage. In Section 6, we describe an extension of HDXT (called HDXT_SU) that achieves much better editing efficiency at the cost of increased leakage. Note that HDXT_SU still guarantees KPRP-hiding and forward privacy.
The server storage of HDXT is higher than that of the KPRP-hiding static solution HXT [31]. In HXT, the essential plaintext index structure is a Bloom filter [6] built from the database, which makes its server storage smaller than ours. However, the Bloom filter is not update-friendly. The DB′ adopted by HDXT enables secure updates while preserving efficient KPRP-hiding searches, at the cost of a larger size. In the current literature, it is common to achieve better security or functionality at the expense of increased server storage, as storage is becoming much cheaper. For instance, compared with OXT [12], IEX [25] and CNFFilter [37] support disjunctive searches in sub-linear time, yet they need much higher server storage than OXT.

Cache and Eviction Strategy
HDXT needs a cache T on the client to process updates. Its size ζ only affects the amortized edit complexity and has no impact on security and search performance; thus, it can be configured based on the storage capacity of the client.
Specifically, T is only used within edit and search queries. For an edit query on a keyword-document pair, it is either inserted into the cache or evicted to XMap with all the cached updates. An eviction is oblivious and reveals nothing. For performance, Section 5.2 shows that the amortized edit complexity is inversely proportional to ζ . During a search, the client uses the cache in the second round. To test whether an identifier id ∈ DB(w 1 ) matches the remaining keywords, the client issues an AUHME query that first accesses the cache (n − 1) times and then generates a decryption key dk = (L, r, d). L remains constant for the same predicate map, r is randomly generated, and d is also random from the perspective of the server. Therefore, neither the cached content nor the capacity has any impact on the search performance and security. In practice, the client could evict the cache to XMap in any state, such as when the server is idle, as long as the client storage is affordable.
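A minimal sketch of why repeated queries on the same predicate map reveal nothing about the cache (illustrative stand-ins, not the actual AUHME GenKey; 'xors' denotes the secret combined value from the construction):

```python
import hashlib, os

# Mock of the decryption key dk = (L, r, d) described above: L is a
# deterministic function of the predicate map, r is fresh randomness,
# and d = H(r || xors) looks random without knowing the secret 'xors'.
# All names here are illustrative, not the actual AUHME algorithms.

def gen_key_mock(predicate_map, xors):
    L = hashlib.sha256(repr(sorted(predicate_map.items())).encode()).digest()
    r = os.urandom(32)                     # fresh per query
    d = hashlib.sha256(r + xors).digest()  # random-looking to the server
    return L, r, d

secret = os.urandom(32)
(L1, r1, d1) = gen_key_mock({"w2||id": 1}, secret)
(L2, r2, d2) = gen_key_mock({"w2||id": 1}, secret)
print(L1 == L2, r1 == r2, d1 == d2)        # True False False
```

Two queries on the same predicate map share L but carry fresh r and d, so neither the cached content nor the cache capacity influences what the server observes.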

EXTENSION
In this section, we first briefly describe HDXT SU and then discuss how HDXT supports queries involving negated terms.
Performance Enhanced Update. HDXT updates all the entries of XMap in an eviction, which achieves high-level security guarantees but is costly. HDXT SU improves the update performance by reducing the pairs to be updated in the eviction. Basically, HDXT SU only updates the entries related to the documents that were edited since the last eviction (or the setup if no eviction happened). In this case, the server could learn which documents were edited since the last eviction. It cannot infer which keywords were updated, so forward privacy is still guaranteed.
HDXT_SU creates XMap in a slightly different way. In the setup phase, instead of building a single DB′ for all the documents, the client builds a separate map DB′_id for each document id. The collection of all the encrypted DB′_id is the XMap of HDXT_SU. For search queries, after obtaining DB(w_1), the client builds I′ = {(w_i, 1)}_{i=2}^n and queries whether I′ ⊆ DB′_id for each id ∈ DB(w_1). Within an eviction, for every document id that has at least one related edit operation in the local cache, HDXT_SU evicts the edit operations associated with id to DB′_id. By doing so, the overhead caused by an eviction is only linear in |W| · t, where t is the number of documents edited since the last eviction. The amortized edit complexity is thus reduced to O(t). We present the detailed HDXT_SU in Appendix D.
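A back-of-the-envelope comparison of the two eviction strategies, under the illustrative parameters below:

```python
# Toy accounting for the HDXT_SU eviction: only the per-document maps
# DB'_id of documents edited since the last eviction are refreshed, and
# with cache capacity ζ = |W| the amortized edit cost per pair drops to
# O(t). Numbers are illustrative.

W, D = 1_000, 10_000       # |W| keywords, |D| documents
t = 2                      # documents edited since the last eviction
zeta = W                   # cache capacity, as set by HDXT

hdxt_eviction = W * D      # HDXT refreshes every entry of XMap
hdxtsu_eviction = W * t    # HDXT_SU refreshes only t per-document maps

print(hdxt_eviction, hdxtsu_eviction)   # 10000000 2000
print(hdxtsu_eviction / zeta)           # amortized per pair: t = 2.0
```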
Conjunctions on Negated Terms. HDXT can be trivially extended to support conjunctions on negated terms. A negated term aims to return the documents that do not contain the given keyword. Given a conjunction with negated terms (e.g., w_1 ∧ ¬w_2 ∧ ¬w_3), RHS is first invoked to search for DB(w_1). After that, if w_1 is a non-negated term (resp. negated term), the client inserts DB(w_1) (resp. ID\DB(w_1)) into a list R_1. Then for every id ∈ R_1, it constructs the predicate map I = {(w_i||id, b_i)}_{i=2}^n, where b_i is set to 1 if w_i is a non-negated term and to 0 otherwise. The client launches an AUHME query to check whether I is a subset of DB′ as in Fig.6. If the query returns 1, id matches the conjunction. Note that HXT [31] cannot support conjunctions with negated terms, mainly due to its index structure.
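The bit-selection logic above can be sketched in plaintext (the real scheme checks the predicate map against the encrypted DB′ via an AUHME query; names are illustrative):

```python
# Plaintext sketch of the negated-term extension: bits b_i select whether
# the predicate map asks for presence (1) or absence (0) of each x-term.

def build_predicate_map(terms, doc_id):
    # terms: list of (keyword, negated?) for the x-terms w_2, ..., w_n
    return {f"{w}||{doc_id}": 0 if neg else 1 for w, neg in terms}

def matches(db_prime, pred_map):
    # AUHME-style subset test: every entry must agree with DB'.
    return all(db_prime.get(k, 0) == b for k, b in pred_map.items())

db_prime = {"w2||7": 1, "w3||7": 0}          # doc 7 contains w2, not w3
pm = build_predicate_map([("w2", False), ("w3", True)], 7)
print(matches(db_prime, pm))                 # True: 7 matches w1 ∧ w2 ∧ ¬w3
```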

PERFORMANCE EVALUATION
We implement a prototype of HDXT and compare its performance with the state-of-the-art conjunctive SSE schemes with KPRP-hiding.

Experiment Setting
Baselines. There currently exist four KPRP-hiding conjunctive SSE solutions: the naive solution shown in Section 1, Blind Seer [35], HXT [31], and FBDSSE-CQ [57]. Table 1 shows that HDXT is more efficient than Blind Seer and FBDSSE-CQ for search queries. This is because Blind Seer heavily relies on expensive secure two-party computation and requires non-constant rounds of client-server interactions, and FBDSSE-CQ incurs linear overheads for a search query. In this section, we compare the search performance of HDXT with the naive solution and HXT, and the update performance with the naive solution and FBDSSE-CQ. As done in [39], we use MITRA [13] to instantiate the naive solution, called MITRA CONJ . MITRA is also used to instantiate RHS used in HDXT.
We use MITRA CONJ as one baseline to present the performance of HDXT. But note that, as described in [39], MITRA CONJ has serious leakages: the number of updates related to every keyword involved in a conjunction and the repetition of every searched keyword.
Implementation. We implement a prototype [52] for MITRA CONJ , HXT, FBDSSE-CQ, HDXT, and HDXT SU with C++. The cryptographic primitives are implemented based on Crypto++ library [16]. In particular, we use AES-ECB-128 + SHA-256 for pseudorandom functions, SHA-256 for hash functions, the C++ Bloom filter library [36] for the Bloom filter used in HXT, and the elliptic curve secp256r1 for group operations in HXT. RocksDB [17] is deployed for the storage on the client and the server. gRPC [20] is adopted for communication between the client and the server.
Test-bed. We use two machines to conduct the experiments. Both machines run Ubuntu 18.04 LTS: the first has a 16-core Intel Core Processor (Broadwell, IBRS, 2.15GHz), 64GB RAM, and 4TB hard disk drives; the second has 16 cores (Intel Core i9-9900 CPU, 3.10GHz), 31GB RAM, and 483GB of SSD disk space. The experiments are executed in a network setting, where the first machine runs as the server and the second acts as the client.
Dataset. We extract two datasets from Wikimedia [1]. The first dataset contains 23643 documents, 60879 keywords, and 8373977 keyword-document pairs. The second one comprises 86386 documents, 188096 keywords, and 27850059 keyword-document pairs.

Search Performance in Static Database
We first measure the search performance of our solutions for the static database. For this experiment, we take the first dataset as input, set up the encrypted database, and perform a series of conjunctive queries. We measure the time cost by the client and the server and the end-to-end search latency for each search query. Meanwhile, we measure the incurred communication overheads.

7.2.1 2-Conjunctions. We start by testing the performance of conjunctions involving two keywords. We choose two terms v and a. The term v is variable, with |DB(v)| ranging from 1 to 20604. |DB(a)| is fixed to 1096. When |DB(v)| ≤ |DB(a)|, we perform the conjunction v ∧ a; a ∧ v is searched when |DB(v)| > |DB(a)|. For our solutions, this can be easily done by checking the local counters produced by MITRA [13]. The search time for these conjunctions is described in Fig.7, and the communication overheads are presented in Fig.9(a). For HDXT, HDXT_SU, and HXT, when |DB(v)| < 10^3, the search time for v ∧ a rises as |DB(v)| increases, and the efficiency for a ∧ v remains almost constant when |DB(v)| > 10^3. This result is consistent with the asymptotic complexity given in Table 1.
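The counter-based ordering described above can be sketched as follows (the exact counter semantics in MITRA are simplified here to a per-keyword result-size estimate):

```python
# Sketch of the s-term selection in the 2-conjunction experiment: the
# client compares its local counters and searches on the less frequent
# keyword first, since search cost scales with |DB(s-term)|.
# Counter semantics are a simplifying assumption.

def order_conjunction(counters, w_a, w_b):
    # Put the keyword with the smaller result-set estimate first (s-term).
    return (w_a, w_b) if counters[w_a] <= counters[w_b] else (w_b, w_a)

counters = {"v": 20_604, "a": 1_096}
print(order_conjunction(counters, "v", "a"))   # ('a', 'v')
```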

n-Conjunctions.
We also test the performance for conjunctions of n (2 ≤ n ≤ 11) keywords. Here a conjunction is expressed as a ∧ v 1 ∧ · · · ∧ v n−1 , where a is the s-term. Fig.8 presents the search time, and Fig.9(b) gives the communication overheads. The two figures clearly show that n-conjunctions in our two solutions perform better than MITRA CONJ and HXT in every respect. Moreover, we can see that the performance gap between HDXT and HXT and the gap between HDXT and MITRA CONJ become larger as n increases.
In particular, when n = 11, HDXT is 10.7× and 10.5× faster than HXT and MITRA CONJ , respectively. The communication overhead is 12.7× and 9.2× better than HXT and MITRA CONJ , respectively.

Search Performance in Dynamic Database
This subsection tests the search performance on the dynamic database for the dynamic solutions: HDXT, HDXT_SU, and MITRA_CONJ. Here we generate a sequence of queries that involve ten keywords w_1, · · · , w_10. 99% of them are update queries, and 1% of them are conjunctions of the ten keywords. Among the update queries, 2% of them edit pairs related to w_1, 10% edit the pairs related to w_i for 2 ≤ i ≤ 10, and 1% delete pairs related to w_i for 4 ≤ i ≤ 10. Fig.10 shows the search time spent by every conjunction. We can see that the search performance of our two solutions is significantly better than that of MITRA_CONJ. For instance, the end-to-end search latency in our solutions is 13× better than that in MITRA_CONJ.

Update Performance
We use the sequence of update queries generated in Section 7.3 to evaluate the update performance. Specifically, we test the time cost of editing a keyword-document pair, the results of which are shown in Fig.11(a). Meanwhile, we compute the amortized update time per pair by taking the total time it takes to update an increasing number of pairs and dividing the obtained time by the number of updated pairs. Fig.11(b) shows the result. Fig.11(a) demonstrates that the update efficiency of HDXT and HDXT_SU is close to that of MITRA_CONJ and much better than that of FBDSSE-CQ, except for the query that incurs an eviction. HDXT and HDXT_SU spend 5.8 and 1.6 hours on an eviction, respectively. Regarding the amortized efficiency, as shown in Fig.11(b), HDXT and HDXT_SU are 8.2× and 32× better than FBDSSE-CQ, respectively. The amortized update performance of the three schemes is much weaker than that of MITRA_CONJ. Nevertheless, MITRA_CONJ achieves quick updates at the cost of search efficiency and security.

Storage
We test the storage overheads for HDXT, HDXT_SU, HXT, MITRA_CONJ, and FBDSSE-CQ [57]. In the experiment for the server storage, to demonstrate that the server storage caused by our schemes is acceptable, we also test several other schemes proposed in recent years, including DIEX [25], IBTree [32], and CNFFilter [37]. Note that DIEX, IBTree, and CNFFilter do not achieve KPRP-hiding.

7.5.1 Server Storage. We encrypt the two datasets with the five schemes and show their storage overheads in Table 2. From the table, we can see that although the server storage required by HDXT and HDXT_SU is larger than that needed by HXT and MITRA_CONJ, it is less than or comparable to that of some previous conjunctive SSE schemes. This is because there exists a trade-off between security, performance, and functionality in the design of conjunctive SSE. To improve security or functionality without sacrificing search efficiency, moderately increasing the server storage is a reasonable choice, considering that storage is becoming much cheaper.

Client Storage.
For static SSE, the client only keeps the secret keys, which commonly puts O(1) storage overhead on the client. However, for DSSE, the client needs to store a state s to support updates securely. So here we just test the client storage required by the dynamic KPRP-hiding and forward secure solutions, which include HDXT, HDXT SU , MITRA CONJ [39], and FBDSSE-CQ. For MITRA CONJ [39] and FBDSSE-CQ, we measure the size of the RocksDB database on the client after creating the encrypted database with the above datasets. Considering that the client keeps a cache in our two solutions, we generate an update sequence for HDXT and HDXT SU to fill the cache, before measuring their client storage. Table 3 presents the results. HDXT needs less client storage than FBDSSE-CQ. HDXT SU requires 13% more client storage space than FBDSSE-CQ. Note that the size of the client storage required by FBDSSE-CQ is the same as many previous forward secure SSE schemes, such as [7,29,30,43].

RELATED WORK
SSE was first introduced by Song et al. [42] in 2000. It is a technique that allows slight leakages (such as search and access patterns) to ensure practicality. This motivates the research on leakage-abuse attacks [5,10,21,23,33,34,40,41,54] that exploit leakages to undermine security guarantees. In response, some literature develops leakage-suppression techniques [2,13,18,22,26,27,38,46] to counteract the above attacks. However, these techniques have rather high overheads and focus on single-keyword searches. For example, Hoang et al. [22] hide the response length for single-keyword queries, but their search complexity scales linearly with |D|. Chamani et al. [13] leveraged Path-ORAM to achieve ORION, which only reveals the response length. Nevertheless, ORAM brings impractical overhead and O(log N) rounds of interaction. The search performance could be improved by replacing Path-ORAM with more efficient variants, such as Root ORAM [46], which pays the price of reducing security to the level of differential privacy. However, this solution still suffers from O(log N) round complexity. As described in Section 1, these single-keyword schemes can be extended to securely process conjunctions, but their performance then becomes even less acceptable. As shown in Section 5, HDXT proposed in this paper achieves a desirable trade-off between search efficiency and security. In the following, we review the existing conjunctive DSSE schemes, mainly focusing on three crucial security properties: KPRP-hiding, forward privacy, and backward privacy. Golle et al. [19] proposed the first conjunctive SSE scheme in 2004. Their scheme was extended in [3,9] for better performance. However, they all suffer from linear search complexity.
In 2014, Pappas et al. [35] proposed Blind Seer. They adopt a tree-based index and use secure computation to process searches. Blind Seer only reveals the search pattern, but it requires non-constant rounds of interaction. The schemes given in [24,32,50,51] are also built on trees, with significant performance improvements. VBTree [50] also achieves forward privacy. However, three of them [32,50,51] leak the identifiers matched by every searched keyword, which is more severe than KPRP. Rphx [24] hides KPRP in the static setting, but it relies on the hardware security provided by Intel SGX.
In 2017, Kamara and Moataz [25] proposed IEX by utilizing the inclusion-exclusion principle in the set theory. However, IEX leaks KPRP to the server. In [37], Patel et al. designed CNFFilter, a static scheme that reduces the leakages in IEX while ensuring efficiency.
However, CNFFilter reveals the documents that contain both the first and the second keywords involved in a search.
Zuo et al. [57] utilize a bitmap index and symmetric homomorphic encryption to achieve FBDSSE-CQ. FBDSSE-CQ supports KPRP-hiding conjunctions while achieving forward and Type-II backward privacy. However, their scheme suffers from linear search complexity and huge server storage.
Overall, no existing conjunctive DSSE scheme achieves KPRP-hiding with sub-linear search efficiency while ensuring forward and backward privacy.

CONCLUSION
In this work, we introduce a new cryptographic primitive, attribute-updatable hidden map encryption (AUHME), and design a secure AUHME construction. With AUHME as the primary tool, we propose HDXT, the first KPRP-hiding conjunctive DSSE solution with sub-linear search efficiency. Furthermore, HDXT simultaneously supports two crucial security properties: forward and backward privacy. The analysis and experiments show that the performance of HDXT is competitive compared with previous schemes that do not have such strong security. In future work, we aim to extend HDXT to process more complex searches and to support multi-client settings.

ACKNOWLEDGMENTS
Yuan and Russello acknowledge the MBIE-funded Stratus Research Programme (UOAX1910) for its support and inspiration for this research. All the authors thank the anonymous reviewers for their valuable comments and suggestions in improving the work.

A PROOF OF THEOREM 3.1
Proof. The construction of the ideal experiment is presented in Fig.12. The experiment AUHMEIDEAL_{A,S}(λ) can be obtained by gradually building the following three experiments.
Exp_0: Exp_0 is the real experiment AUHMEREAL_A(λ).
Exp_1: To obtain Exp_1, every call to the PRF F(k, x) in Exp_0 is replaced in the following way: if x is a new input, Exp_1 chooses the output y uniformly at random from {0,1}^λ and inserts the pair (x, y) into a table F; otherwise it outputs F[x]. The ability to distinguish Exp_0 and Exp_1 can be reduced to the ability to break the security of the PRF.
Exp_2: When ϕ^hme_{m_p}(m_a) = 0 and β = 1, Exp_2 selects d uniformly at random from {0,1}^λ, instead of computing d = H(r||xors) as in Exp_1. Since H is modeled as a random oracle, the two experiments might be distinguished only when r||xors could be used as an input to H by the adversary, an event we call break. r is randomly chosen but will be exposed to A in the query. When ϕ^hme_{m_p}(m_a) = 0 and β = 1, following Exp_1, xors is indistinguishable from a random value. Therefore, for an adversary that makes α queries to H, the event break on (m_p, cnt) happens with probability less than α/2^λ. Assuming that there are in total n* distinct queries on (m_p, cnt) that satisfy ϕ^hme_{m_p}(m_a) = 0 and β = 1, the chance of distinguishing Exp_1 and Exp_2 is less than n*·α/2^λ.
AUHMEIDEAL_{A,S}(λ): The ideal experiment is Exp_2.
In conclusion, if F is a secure PRF and H is modeled as a random oracle, AUHMEREAL_A(λ) and AUHMEIDEAL_{A,S}(λ) are computationally indistinguishable.

B PROOF OF THEOREM 5.1
Proof. We prove that HDXT is adaptively secure with the leakage functions shown in Theorem 5.1 by constructing a simulator S for HDXT.
As shown in Section 4, HDXT only adopts two cryptographic primitives: RHS and AUHME. The simulator S for HDXT can be constructed by invoking the simulators for RHS and AUHME, which we denote as S_RHS and S_HME, respectively. The simulator S for the setup, update, and search protocols is presented in Fig.13, Fig.14, and Fig.15, respectively. We can directly obtain the simulator S for the setup protocol by invoking S_RHS and S_HME. Notably, S creates five empty maps Γ, Ψ, Z_1, Z_2, and ϒ, which are global variables in S; we detail these five variables later. S fills Γ[t^▷], which records all the vector indices in E_1.
In the simulator S for the update protocol, S first parses the update leakage. S.Update is obtained simply by replacing RHS.Update and HME.GenUpd with S_RHS.Update and the update process in S_HME, respectively. Therefore, to distinguish the update protocols in the ideal and real games, A has to break the security of RHS or AUHME.
To get the simulator S for the search protocol, S_RHS.Search is first run to simulate the process of searching for DB(q[1]). After that, the AUHME queries need to be simulated. In the real game, for each id ∈ DB(q[1]), the client builds a map I = {(q[k]||id, 1)}_{k=2}^n and calls HME.GenKey(msk, δ, I) to generate the decryption key dk for I. dk is inserted into a list DK. The entries of DK are randomly permuted before being sent to the server. To simulate the process of producing dk, S needs to build Loc(I) for each id ∈ DB(q[1]) and run S_HME(Loc(I), ϕ^HME_I(DB′)). The function Loc is defined in Section 3.4. Every keyword-document concatenation q[k]||id should match a unique vector index in E_1. Loc(I) for id outputs the list {ϵ_{k,id}}_{k=2}^n, where ϵ_{k,id} is the vector index matched by q[k]||id.
To simulate Loc(I) for each id ∈ DB(q[1]), S fills or updates the global maps Γ, Ψ, Z_1, Z_2, and ϒ as follows:
• Γ[t] records the vector indices that were added to E_1 at the timestamp t and have not been assigned to any keyword-document concatenation.
• For a conjunction q that occurs at t, ϒ[t] is the number of documents that satisfy all the following three requirements: 1) belong to DB(q[1]); 2) were added into the database at t^▷; 3) their identifiers are not exposed to the adversary.
• Given a conjunction q that occurs at t, for each id ∈ DB(q[1]), if id has been leaked, Ψ[t, id, k] outputs the vector index matched by q[k]||id for 2 ≤ k ≤ n. If id is not leaked and was added into the database at the timestamp t_1 (t_1 > t^▷), the vector index matched by q[k]||id is stored in Ψ[t, t_1, k] for 2 ≤ k ≤ n. For each identifier id that is not leaked and exists in the initial database, id could be denoted by any one in {*||i}_{i=1} and the vector index matched by q[k]||id could be any one in {Ψ[t, *||i, k]}.
• For any document identifier id that is leaked and was added into the database at t_1 (t_1 > t^▷), Z_1[id] is set to t_1 and Z_2[t_1] is set to id.
As shown in Lines 3-30 in Fig.15, S first analyzes IP(q). It could obtain the vector indices matched by keyword-document concatenations that were already used by previous conjunctions and store them in L; L[id, k] is the vector index of q[k]||id. Meanwhile, S obtains the set U, which stores all the document identifiers existing in IP(q). After analyzing IP(q), for each identifier id ∈ U, S builds the list Loc for id. For 2 ≤ k ≤ n, if L[id, k] is not empty, it is inserted into Loc. When L[id, k] does not exist, it indicates that q[k]||id has not been used by previous conjunctions. In this case, S first determines the timestamp t_1 at which id was added and then selects a vector index ϵ from Γ[t_1] uniformly at random. ϵ is used as the vector index of q[k]||id. After constructing Loc for id, S runs S_HME(Loc, b) to generate the decryption key dk, where b = 1 if id ∈ DB(q) and b = 0 otherwise. dk is inserted into the list DK.
For each entry (t_1, id) ∈ TimeIds(DB(q)) that satisfies id ∉ U (t_1 is the timestamp at which id was added), S builds Loc for id by selecting the vector index ϵ from Γ[t_1] and inserting ϵ into Loc. (Loc, 1) is passed to S_HME, which then produces dk. dk is inserted into DK.
After processing the identifiers in U ∪ DB(q), S starts to process every document that satisfies all the following three conditions: 1) belongs to DB(q[1]); 2) does not exist in IP(q); 3) was added after t^▷. The timestamps at which these documents were added are stored in AddTims(q[1]) \ Pt, where Pt is the set of the timestamps occurring in IP(q) and TimeIds(DB(q)). For each t_1 ∈ AddTims(q[1]) \ Pt, S selects a vector index ϵ from Γ[t_1] and inserts ϵ into Loc for 2 ≤ k ≤ n. S_HME(Loc, 0) is run to generate dk, which is inserted into the list DK.
Finally, S processes the documents that: 1) belong to DB(q[1]); 2) do not appear in IP(q); 3) exist in the initial database. Each vector index ϵ is selected from Γ[t▷] and inserted into Loc. S_HME(Loc, 0) is run to generate dk, which is inserted into DK. S randomly permutes the entries of DK and sends DK to the server.
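The last pass over the not-yet-covered addition timestamps, followed by the final random permutation of DK, can be sketched as below. `finish_dk` and `gen_dk` are hypothetical names; `gen_dk` stands in for S_HME producing a non-matching decryption key (bit b = 0).

```python
import secrets

def finish_dk(DK, add_times, Pt, Gamma, n, gen_dk):
    """Sketch of the simulator's final pass (hypothetical helper).

    For each addition timestamp in add_times \\ Pt (documents matching
    q[1] but absent from IP(q)), draw n-1 fresh vector indices from
    Gamma[t1] and append a non-matching key; then permute DK.
    """
    for t1 in sorted(add_times - Pt):
        loc = []
        for _ in range(2, n + 1):
            pool = sorted(Gamma[t1])
            eps = pool[secrets.randbelow(len(pool))]
            Gamma[t1].remove(eps)
            loc.append(eps)
        DK.append(gen_dk(loc, 0))
    # Random permutation before sending DK to the server.
    secrets.SystemRandom().shuffle(DK)
    return DK
```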
In the simulator S for the search protocol, when q[k]||id has not been queried by previous conjunctions, S selects the vector index of q[k]||id from Γ[t1] uniformly at random, where t1 is the timestamp at which id was added. This is the only difference from the real search protocol. Because the entries of DB′ are randomly permuted before calling HME.Enc in the real setup protocol, and the entries of the addition token are also randomly permuted after calling HME.GenUpd, the adversary cannot distinguish the real game from the ideal game.

C BACKWARD PRIVACY DEFINITION
In Definition 2.4, we give the definition for backward private conjunctive DSSE. The definition also works for single-keyword DSSE as follows.
The above definition differs slightly from Bost et al.'s [8] in two aspects. First, TimeDB(w) in [8] captures DB(w) and the timestamps at which these document identifiers were inserted into DB(w), while our TimeDB(w) records DB(w) and the timestamps at which these identifiers were first added into the database (when they might not yet contain w). We argue that the timestamps exposed in our definition reveal nothing about deletions, so they do not affect backward privacy. Bost et al.'s TimeDB(w) is not suitable for the more general setting we consider, where a keyword-document pair can be inserted into the database again after it has been deleted. In this setting, there might exist multiple timestamps at which an identifier id was added to DB(w), from which the server could infer when the previous deletions happened. For instance, if the server learns that a keyword-document pair (w, id) was inserted into the database at the two timestamps t1 and t3, it can infer that the pair was deleted once at some timestamp t2 with t1 < t2 < t3. Second, for Type-II backward privacy, the search leakage in our definition captures π▷(w); π▷(w) is not included in Bost et al.'s definition simply because they assume the initial database is empty.
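The deletion inference described above is easy to make concrete: whenever the same pair is inserted at several timestamps, each gap between consecutive insertions must contain at least one deletion. The helper name below is illustrative, not from the paper.

```python
def inferred_deletions(insert_times):
    """Toy illustration: given the timestamps at which the server sees a
    pair (w, id) inserted into DB(w), return the open intervals in which
    a deletion of that pair must have occurred."""
    ts = sorted(insert_times)
    # A deletion lies strictly between each pair of consecutive insertions.
    return [(a, b) for a, b in zip(ts, ts[1:])]

# The paper's example: insertions at t1 = 1 and t3 = 3 imply a deletion
# at some t2 with 1 < t2 < 3.
```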

D HDXT_SU: SUB-LINEAR UPDATES
In this section, we propose HDXT_SU, which aims to reduce the update complexity of HDXT to sub-linear.
The encrypted index of HDXT_SU consists of TMap and XMap2. TMap is the same as that in HDXT, and XMap2 is a new version of XMap. The pseudocode of HDXT_SU is shown in Fig.16, Fig.17, and Fig.18. HDXT_SU adopts the pseudorandom function F and our AUHME construction. Every document has an independent AUHME instance, so in principle the client needs to store the master secret key and the state of each AUHME instance. To reduce the client storage, the master secret key is pseudorandomly computed from
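Since the paragraph above is cut off, the following is only a hedged sketch of the general idea it introduces: deriving each per-document AUHME master secret key on the fly from a single client key via a pseudorandom function, rather than storing one key per instance. HMAC-SHA256 standing in for F, the key-derivation inputs, and the function name are all assumptions, not the paper's construction.

```python
import hmac
import hashlib

def derive_msk(client_key: bytes, doc_id: bytes) -> bytes:
    """Hypothetical sketch: recompute a per-document AUHME master secret
    key from one long-term client key, using HMAC-SHA256 as the PRF F.
    The domain-separation label and input layout are assumptions."""
    return hmac.new(client_key, b"auhme-msk|" + doc_id, hashlib.sha256).digest()
```

Because the derivation is deterministic, the client never stores the per-instance keys; it regenerates them whenever a document's AUHME instance is queried or updated.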