[go: up one dir, main page]
More Web Proxy on the site http://driver.im/
Next Article in Journal
A Variational Neural Network Based on Algorithm Unfolding for Image Blind Deblurring
Previous Article in Journal
Natural Language Processing Tools and Workflows for Improving Research Processes
You seem to have javascript disabled. Please note that many of the page functionalities won't work as expected without javascript enabled.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme

by
Xixi Yan
,
Pengyu Cheng
*,
Yongli Tang
and
Jing Zhang
School of Software, Henan Polytechnic University, Jiaozuo 454000, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(24), 11740; https://doi.org/10.3390/app142411740
Submission received: 3 November 2024 / Revised: 7 December 2024 / Accepted: 13 December 2024 / Published: 16 December 2024

Abstract

:
Searchable encryption (SE) allows users to efficiently retrieve data from encrypted cloud data, but most of the existing SE solutions only support precise keyword search. Fuzzy searchable encryption agrees with practical situations well in the cloud environment, as search keywords that are misspelled to some extent can still generate search trapdoors that are as effective as correct keywords. In scenarios where multiple users can search for ciphertext, most fuzzy searchable encryption schemes ignore the security issues associated with malicious cloud services and are inflexible in multi-user scenarios. For example, in medical application scenarios where malicious cloud servers may exist, diverse types of files need to correspond to doctors in the corresponding departments, and there is a lack of fine-grained access control for sharing decryption keys for different types of files. In the application of medical cloud storage, malicious cloud servers may return incorrect ciphertext files. Since diverse types of files need to be guaranteed to be accessible by doctors in the corresponding departments, sharing decryption keys with the corresponding doctors for different types of files is an issue. To solve these problems, a verifiable fuzzy searchable encryption with blockchain-assisted multi-user scenarios is proposed. Locality-sensitive hashing and bloom filters are used to realize multi-keyword fuzzy search, and the bigram segmentation algorithm is optimized for keyword conversion to improve search accuracy. To realize fine-grained access control in multi-user scenarios, ciphertext-policy attribute-based encryption (CP-ABE) is used to distribute the shared keys. In response to the possibility of malicious servers tampering with or falsifying users’ search results, the scheme leverages the blockchain’s technical features of decentralization, non-tamperability, and traceability, and uses smart contracts as a trusted third party to carry out the search work, which not only prevents keyword-guessing attacks within the cloud server, but also solves the verification work of search results. The security analysis leads to the conclusion that the scheme is secure under the adaptively chosen-keyword attack.

1. Introduction

Due to the convenience of cloud services, individuals and organizations are inclined to upload their files to the cloud for storage, it is crucial to encrypt sensitive files before storing on the cloud. However, another problem arises as to how to perform search operations on the ciphertext files in cloud services. Searchable encryption technology [1] perfectly solves the problem of searching ciphertext files directly and has a wide range of applications, such as in the smart healthcare [2], smart city, and IoT fields [3].
Since the pioneering work on searchable encryption, scholars have devoted themselves to studying some practical and feasible searchable encryption schemes. For example, a scheme [4] supports conjunctive, subset and range queries on encrypted data, a scheme [5] supports spatial keyword queries, and another scheme [6] supports Boolean queries. In practice, when users perform a search operation, they often misspell the search keywords or simply enter the approximate stem of the keyword or other forms to start performing the search operation. To address this situation, fuzzy search [7] realizes a certain degree of approximate keyword search matching, which can generate the same or similar keyword trapdoor for users to perform search matching. However, a few existing fuzzy search schemes still have limitations. For example, the schemes in [7] and the scheme in [8] only support single keyword search. Although there exists a multi-user scenario implemented in the scheme in [9] and the scheme in [10], where multiple users can retrieve the ciphertext through a single cloud server. In the case that multiple users have search permissions, it is necessary to ensure that people with different search permissions can search for the corresponding ciphertext files. Therefore, fine-grained access control needs to be ensured. There is the problem that the encryption and decryption keys are difficult to share, the online auditing and authorization by administrators for users applying for access are not friendly when it comes to the actual situation of smart healthcare, and do not satisfy the fine-grained access control of multi-user application scenarios. Although the scheme [11] can search for ciphertext files with corresponding permissions in multi-user scenarios, current fuzzy search schemes seldom consider fine-grained access control on shared keys for different types of files. Therefore, it is essential to provide multi-keyword fuzzy search encryption schemes with fine-grained control under multi-user conditions.
In the scenario of smart healthcare, doctors in different departments need to search for the ciphertext of case files. The case information of different patients will be first classified in a certain way, and then the medical data center will store the patient’s case and other private information through encryption in the medical database or cloud server, which can only be accessed by authorized doctors. In practice, doctors in different departments have access rights to different files, e.g., doctors in the cardiology department can only access cases of cardiac patients. Although some fuzzy search schemes support multiple user scenarios, there is the problem of difficulty in sharing encryption and decryption keys, and it is unfriendly in the actual situation in smart healthcare to review and authorize the users applying for access online through administrators.
In addition, some scheme [12,13,14,15] models serve as honest but curious entities, which are not realistic in practice. Cloud servers have too much power and are not conducive to the data owner’s control over the data. Some malicious cloud servers even seriously harm the data owner’s interests, such as returning wrong or incomplete files. This greatly affects the practical application of fuzzy search encryption schemes. Therefore, a decentralized fuzzy search encryption scheme is necessary and feasible to solve the possible problems of malicious cloud servers.
Aiming at the above problems, the Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme is proposed, which has more comprehensive functions and efficient performance. And it is more adaptable to the ciphertext search needs under new network services such as smart healthcare. The main innovations are as follows.
  • For the existence of malicious cloud servers in fuzzy search schemes, semi-decentralization is used to weaken the power of cloud servers, a smart contract is constructed to guarantee fairness between users and servers, and blockchain is introduced to assist in achieving integrity verification of results.
  • Aiming at the lack of fine-grained access control for users in one-to-many application scenarios, the is used to distribute the shared key with a tree structure, and only authorized users who meet the access results can decrypt the shared key.
  • By improving the index tree structure and bigram participle algorithm, the proposed scheme not only optimizes the search efficiency and search precision but also enables users to check the search results locally for a second time.
  • Through security analysis, it is proved that the scheme is secure under the adaptive selection of keyword attacks. Functional comparison and experimental analysis show that the search efficiency of the scheme has been improved under the premise of supporting multi-functionalities such as multiple keywords, fuzzy search, dynamic updating, and result validation. In particular, the search time of our scheme is more advantageous with the increase in search keywords and searched files.
The rest of this paper is arranged as follows. Related work on fuzzy search and blockchain-based searchable encryption is in Section 2. Some preliminaries used in the scheme are given in Section 3. In Section 4, the definitions of algorithm and security model are provided. Details of the program will be presented in Section 5. Section 6 and Section 7 provide the program analysis and efficiency analysis, respectively. Section 8 summarizes the work and gives conclusions.

2. Related Work

We present research work on fuzzy searchable encryption and blockchain-based searchable encryption in this section.

2.1. Fuzzy Searchable Encryption

Since the fuzzy search encryption scheme has a broader application prospect, it attracts scholars to study it continuously. Li et al. [7] first formalized the fuzzy keyword search problem on encrypted cloud data and designed a wildcard-based secure index construction scheme, which first constructs a wildcard-based fuzzy keyword set, each element in the set is encrypted, and when searching, the user first generates the same encrypted fuzzy keyword set, then the cloud server finds the corresponding keywords, and finally returns the corresponding files. Subsequently, Wang et al. [16] improved on the scheme in [7] and proposed a searchable encryption scheme for fuzzy keyword search based on the index tree structure of keywords, which improves the search efficiency, but it only supports single keyword search. Although [17] designed a dual ranking function that combines keyword weights and keyword morphological similarity to rank search results, it still cannot avoid the limitations of single-keyword search.
Kuzu et al. [18] scaled by minhash to transform keyword vectors and determine the similarity of keywords. Subsequently, Wang et al. [19] implemented a multi-keyword search based on scheme [18]. Fu et al. [14] improved the search accuracy with the Bloom filter and location-sensitive hash function but still did not support the dynamic update function. Zhong et al. [15] proposed a scheme to support efficient dynamic updating, which improves the search accuracy and supports encrypted index tree based on the top-k sorting of encrypted files. Tong et al. [20] designed a dual bloom filter to store and mask the keywords contained in a file and realized the verification of correctness and completeness, but the verification is more complicated. Li et al. [21] designed an improved bi-gram participle algorithm to enhance fuzzy search accuracy, the scheme assumes cloud server is malicious but fails to construct an effective solution for the malicious behavior of malicious servers.

2.2. Blockchain-Based Searchable Encryption

The cloud server has too much right in traditional searchable encryption schemes, and all the search processes are dependent on the cloud server. Scholars have introduced the blockchain fairness mechanism to realize fairness due to the non-tamperable characteristics, which guarantees that if a user and cloud server follow the agreement honestly, the cloud server gets the corresponding service fee, while users get the correct results. Once a cloud server is detected with dishonest behavior, it will not get the service fee and will be punished with the loss of margin.
Cai et al. [22] realized dynamic keyword search in distributed storage combined with blockchain technology for fair search. Wang et al. [23] constructed a decentralized privacy-preserving search scheme. Data owners can be assured of receiving correct results without having to worry about malicious server behavior. Hu et al. [24] combined blockchain technology to achieve fair payment and constructed a file-sharing scheme in one-to-one as well as one-to-many user scenarios. Smart contracts are used to store secure indexes and execute searches, ensuring that users get correct results if they pay for the transaction. Chen et al. [25] constructed a blockchain-enabled public key encryption scheme with multi-keyword search, which uses smart contract to ensure the fairness without introducing a third party. Guo et al. [26] constructed an authentication-enabled search encryption scheme on blockchain and implement forward security. It is evident that the combination of blockchain and searchable encryption cryptography to achieve decentralization, build fair transactions, and authentication features is effective.

3. Preliminaries

We introduce the technical knowledge used in fuzzy search in this section.

3.1. Bloom Filter (BF)

The Bloom Filter is a space-saving probabilistic data structure that determines whether an element is in a collection or not. Mapping two or more different elements to the same position in a bit array via a hash function has a false positive rate.
Suppose the bloom filter initializes a p-bit array, k random hash functions, and the element set = { α 1 , α 2 , α 3 , α n } .
For each element in the set, it is mapped to the k positions of the p - bit bloom filter by k independent hash functions and corresponding positions are set to 1. When determining whether α x is in , it uses the same approach as described above to map α i to the k positions of the bloom filter. If one of the k positions has a value of 0, then α x ; if all k positions have a value of 1, then α x or there is a false positive.

3.2. Locality Sensitive Hashing (LSH)

In the scheme, the LSH determines whether the keyword is in the file or not by mapping keywords into a hash bucket. A positional hash function is defined as follows: A family of hash functions F is a mapping from Euclidean space S 2 to hash coding space U . The hash function F is said to satisfy r 1 , r 2 , p 1 , p 2 - ness when P r F [ h ( q ) = h ( p ) ] p 1 , p B ( q , r 1 ) and P r F [ h ( q ) = h ( p ) ] p 2 , p B ( q , r 2 ) are satisfied, where B is the space centered at q with radii r 1 , r 2 . The p-stable position-sensitive hash function is calculated as h a , b ( v ) = [ a v + b n ] , where the parameters a and b are a d - dimensional vector and a random number between [ 0 , n ] , respectively, and each element in the vector v satisfies p - stable distribution. As shown in Figure 1, a v means that vector a is mapped onto an axis with vector v as the base vector, and this axis is divided into n equal parts of m . The labelled value of the interval corresponding to the value of a v is taken as the hash value of vector a . We can determine whether a keyword can be mapped to a bucket by calculating the probability that the values are equal by using LSH.

4. Definitions of Algorithm and Security Model

Before describing our scheme in detail, the basic model, algorithmic framework, and security model are first given.

4.1. System Model

The model is based on the following roles, as shown in Figure 2.
(1)
Data Owner ( D O ) . D O extends beyond individual entities to data centers that are responsible for the integration of file information owned by multiple data owners. This implementation realizes the application of a multi-owner scenario from another perspective.
(2)
Trusted Authority ( T A ) . As an authenticating entity, it generates attribute private keys for data users and transmits system public parameters and other relevant information to the users securely.
(3)
Data Users ( D U ) . D U need to apply for their attribute private keys based on their attributes as well as reliable information, then generate and send search trapdoor to the smart contract account for searching transactions through query requirements.
(4)
Smart Contract ( S C ) . As a decentralized component of the system, S C stores secure index tree I n d e x t r e e and verification information of data owners, mitigating the excessive and uncontrolled authority of the cloud server. The search contract performs the initial segment of the entire search transaction and ensures the verification process for D U .
(5)
Cloud Server ( C S ) . C S has powerful storage as well as computational capabilities, and stores the ciphertext data files outsourced by D O , and provides D U with the second part of the search service based on the transaction initiated, it may have the malicious behavior of tampering with the user’s search results.
The workflow of the system is as follows.
D O transmits parameters, public keys, etc., to T A initially (Step 1), firstly categorizes all large collections of files, encrypts classified file collection D = { F 1 , F 2 , , F n } and outsources the ciphertext set C = { C 1 , C 2 , , C n } to C S (Step 2). To achieve access control and efficient search, D O constructs a file index tree and sends the index tree to the contract storage address (Step 3). D U send a registration request to T A (Step 4) and receives attribute private keys and search keys (Step 5). Then the D U generates a search token locally based on query keywords and algorithms disclosed by T A and sends it through a transaction to S C (Step 6). S C executes a search to obtain a search list and sends the information to the user (Step 7). The ciphertext of file identifiers ( F I D ), the ciphertext set C = { C 1 , C 2 , , C n } and transaction information is sent to D U (Step 8). Then C S returns to D U the ciphertext data of the search transaction (step 9). D U can verify the search results locally, and only users whose attribute private key satisfies the access control can finally decrypt the file and malicious servers will receive penalties. (Step 10).

4.2. Algorithm Framework

4.2.1. Definition

Before giving the definition of the algorithm, the definitions of the symbols used in the scheme are given in Table 1.

4.2.2. Algorithm Definition

The scheme has nine main algorithms.
(1)
S e t u p ( 1 λ ) ( p k , m k ) is a system initialization algorithm. The Data Owner D O inputs random security parameter λ, generates system public key p k and the master private key m k .
(2)
K e y G e n ( 1 λ , l ) K S is a key generation algorithm. D O inputs random security parameter λ and the length l of file index, outputs the key set K S = ( M 1 , M 2 , S ) .
(3)
E n c ( p k , K 1 , K 2 , D , K W , K S , Γ ) ( F I D , C , C K 1 , I , M A C c i , I n d e x t r e e ) is an encryption algorithm. D O takes public key p k , plaintext set D , keyword set K W , access policy Γ , key K i of the corresponding type of files, authentication key K 2 , and index encryption key K S as input. Then, for outputs, the ciphertext set C , encrypted file identifier F I D , security index I , the key cipher containing access policy C K i , ciphertext file message authentication code M A C c i , and security index tree I n d e x t r e e .
(4)
U s e r S i g n ( p k , m k , S u ) ( S K u , S K s , u s e r _ i d ) is a user registration algorithm run by T A , it takes p k , m k , and S u as input, outputs attribute private key S K u , the user search key S K s , and authenticated identifier u s e r _ i d .
(5)
T o k e n G e n ( p k , S K s , W q ) T t is a trapdoor generation algorithm. Data User takes the public key p k , a query keyword collection W q and attribute private key S K s as input, outputs the trapdoor with time stamp T t .
(6)
S e a r c h ( I n d e x t r e e , T t ) ( C κ i , F I D , M A C c i ) is a search-matching algorithm run by Smart Contract S C . This algorithm takes T t and I n d e x t r e e as input, outputs C K i , M A C c i and F I D .
(7)
V e r i f y ( C i , M A C C i , K 2 ) 1 / 0 is a verification algorithm. The Data User takes the verification key K 2 , the ciphertext file C i and the ciphertext message authentication code M A C C i as input. It is determined whether a preliminarily correct result is obtained by outputting the result.
(8)
D e c ( S K u , C K i , C w q ) ( D k w / ) is a decryption algorithm run by the Data User. It takes the attribute private key S K u , the key cipher containing access policy C K i , and file set cipher C w q , which are based on the query keyword set w q as input. If S K u satisfies the access structure Γ defined by the Data Owner D O , then it outputs the plaintext files containing the query keyword set, otherwise, the output is .
(9)
U p d a t e ( U p t y p e , I T c ) I n d e x t r e e _ n e w is an index tree update algorithm run by the smart contract. It takes the encrypted subtree I T c sent by the Data Owner as input, and outputs the new index tree I n d e x t r e e _ n e w . The detailed process is described in Section 5.2.4.

4.3. Security Model

The C S is considered an entity that may have malicious behavior and is not fully trustworthy, it may return untrue results to save computation and bandwidth resources, which can seriously harm the D U . In addition, D U uploads its own set of attributes honestly to obtain u s e r _ i d , all communication with T A is secure.
Regarding privacy requirements, this article considers two threat models with different adversary capabilities.
Known Ciphertext Model: The cloud server can only observe the encrypted file set, the encrypted index, the submitted tokens, and the search results. It has no prior knowledge of the file set or the search keywords.
Known Background Model: the cloud server is also able to obtain some additional background information like statistical information about the files, the queried keywords and their corresponding tokens, and the frequency of searched keywords.
A simulation-based game between adversary A and a stateful simulator B using an allowed leakage access model and a search model are used to prove security. Two leakage functions L = ( L 1 ,   L 2 ) are defined, where L 1 and L 2 are expressed formally as follows.
L 1 ( D ) = { | D | , n , { | F i | , i n d ( F i ) } i [ 1 , n ] } and takes the plaintext file collection D as input and outputs the plaintext collection size | D | , the number of files n , the size of each file | F i | , and file identifiers i n d ( F i ) i [ 1 , n ] .
L 2 ( D , W q ) = { P ( W q ) , T } , it takes the plaintext file collection D and the search keyword collection as input and outputs the keyword access patterns P ( W q ) and search trapdoor T . Here, suppose A is an adversary, B is a stateful simulator, and G acts as a challenger. The interaction game between A , B , and G is as follows.
  • R e a l A ( λ ) . G initializes system and runs K e y G e n ( 1 λ , m ) K S , then sends a collection of files D to G . G generates encrypted file C and index I with a secure symmetric encryption algorithm and index generation algorithm, then sends them to A . A generates a polynomial number of keywords sets and launches an adaptive search Q 1 = ( w 1 , w 2 , , w q ) . Then the search keywords in set are transformed into a set of vectors to generate search trapdoor by G , which is sent to A . Eventually, A outputs a n s e r { 0 , 1 } , ending the game.
  • I d e a l A ( λ ) . Given the leakage functions L 1 and L 2 , G inputs plaintext file D . Simulator B generates a simulated index I and ciphertext C , sends I and C to A . B chooses a polynomial number of keywords Q 1 = ( w 1 , w 2 , , w q ) for adaptive interrogation, then outputs search trapdoor T t based on L 2 . Finally, A outputs the result a n s e r { 0 , 1 } , ending the game.

5. Proposed Scheme

We describe the main ideas of fuzzy search and detail the algorithms in this section.

5.1. Main Idea

The scheme implements fuzzy search based on LSH and BF. Assuming that D O has n plaintext files D = { F 1 , F 2 , , F n } , a p bit bloom filter B i is generated as an index vector, which contains all the keywords in F i . To map the keywords in F i on B i , the keywords need to be converted into bigram vectors, then the bigram vectors are mapped to B i by LSH. Trapdoor generation similarly requires transforming the query keyword W q into a bigram vector and generating an m - bit bloom filter B w q , which maps bigram vector on B w q using the same LSH to generate index vector. The inner product of B i and B w q represents the relevance of file F i and query w q . As shown in Figure 3.

5.2. Concrete Construction

We describe the various algorithms and talk about them in four broad modules, namely, the preliminary work, the encryption work, the search work, and the subsequent work.

5.2.1. Preliminary Work

The preliminary work includes S e t u p ( 1 λ ) , K e y G e n ( 1 λ , m ) and U s e r S i g n ( p k , m k , S u ) .
S e t u p ( 1 λ ) : D O defines two multiplicative cyclic groups G 0 , G 1 on p , the bilinear map e : G 0 × G 0 G 1 , randomly selects anti-collision hash function H 1 : { 0 , 1 } G 0 , and two pseudo-randomized functions H 2 : { 0 , 1 } l × { 0 , 1 } { 0 , 1 } k and H 3 : { 0 , 1 } l × { 0 , 1 } { 0 , 1 } k . g is the generating element of G 0 , p is a large secure prime, and e ( g , g ) is the value of the bilinear mapping in the swarm. T A selects α , β randomly as input, outputs the master private key m k = ( α , β ) , and p k = ( G 0 , G 1 , p , g , h = g β , e ( g , g ) α , H 1 , H 2 , H 3 ) .
K e y G e n ( 1 λ , l ) : D O inputs a random security parameter λ and the index length l to generate K S = ( M 1 , M 2 , S ) to encrypt the trapdoor and index, where ( M 1 , M 2 ) R l × l , S { 0 , 1 } l is a vector that prevents brute force decryption and M 1 , M 2 are invertible matrices.
U s e r S i g n ( p k , m k , S u ) : When a user sends an enrollment request, T A checks the user’s attribute set S u , then confer u s e r _ i d and search key S K s to the user.

5.2.2. Encryption Work

The encryption work mainly includes key encryption, file index vector generation, index tree generation, and trapdoor generation.
A.
Key Encryption
E n c ( p k , K 1 , K 2 , D , K W , K S , Γ ) : Assume that the file set has t types, D O has n plaintext files D = { F 1 , F 2 , , F n } . The files are categorized according to their types and the corresponding symmetric encryption key K i ( 1 i t ) is generated for different types of files., e.g., in a medical database, ophthalmology case files are classified as one type and case files from different sections are categorized into different types. D O parses keyword sets K W from D and K W F i from F i respectively, then encrypts file identifiers of D = { F 1 , F 2 , , F n } by K 2 to obtain the ciphertext F I D = { F I D 1 , F I D , , F I D n } .
D O sets the access control structure tree Γ for different types of files with key K 1 based on actual application requirements. For example, the core files of a medical department can be decrypted only by users who meet the conditions of important attributes, while for the general files of the department only the basic conditions of access to the files by the department need to be met. D O assign a polynomial q x for each node x from root r , where q x in leaf nodes are constant. Randomly select s from p which q r ( 0 ) = s , deg ( q x ) = d x = k x 1 , then d e g ( q r ) random coefficients are selected from p to determine q x . Y denotes the leaf nodes and k x is the threshold value of x . Eventually gets C K i = { Γ ,   C ¯ = K 1 e ( g , g ) α s ,   C = h s ,   { C y ,   C y } y Y } where C y = g q y ( 0 ) , C y = H 1 ( a t t ( y ) ) q y ( 0 ) . D O encrypts plaintext D = { F 1 , F 2 , , F n } by K i to obtain ciphertext set C = ( C 1 , C 2 , C n ) . Eventually, C and F I D are sent to C S , C K i will be stored in the index tree described subsequently.
B.
File Index Vector Generation
The generation of file index vectors is carried out by mapping the keywords on the bloom filter of the file. The keywords transform into vectors so that they can be mapped on the bloom filter and the improved keyword bigram disambiguation algorithm from scheme [21] is used to transform the keywords into vectors.
For the keyword “happy”, D O first converts it to a bigram set { h a 1 , a p 1 , p p 1 , p y 1 } . Considering the existence of the same bigram elements, the vector is expanded to 2 × 26 2 , where 26 × 26 represents possible binary letters. If the bigram set of the keyword exists in the corresponding position of the vector, the corresponding position is set to 1, otherwise it is set to 0, different keywords are converted into vectors with the same length.
D O constructs and initializes an l - bit bloom filter B i for file F i in the file collection D by k LSH. I and I are l - bit null vectors. The keyword set K W F i is subjected by the porter stemming algorithm to obtain the stemmed set K W S T F i . If the file F i has x keywords, the bigram set of K W S T F i finally converts to a vector set B v S T F i , b v i in the vector set B v S T F i are mapped onto B i by k LSH. Set the corresponding position of the mapping to T F i , j / k according to the importance of the file according to the importance of the keyword to the file, where T F i , j = 1 + | k w j | / | F i | is the frequency of keyword k w j in F i , | k w j | denotes the number of occurrences of k w j in F i , and | F i | denotes the number of keywords in F i .
The index vector construction for F i is demonstrated by an example, as shown in the left side of Figure 3. Suppose F i has three keywords “happily, teachers, likely”, where S = [ 1 , 0 , 1 , 0 , 0 , 1 , 1 , 0 , 0 , 1 ] , l = 10 , M 1 ,   M 2 are invertible matrices and M 1 ,   M 2 R 10 × 10 . D O constructs an l - bit bloom filter B i for F i , then converts the set of keywords to the stem set S T k w = { h a p p y , t e a c h , l i k e } . The stem keyword “happy” is first converted to B v 1 = { h a 1 , a p 1 , p p 1 , p y 1 } and then converted to vector b v 1 . For ease of representation, Figure 3 is represented by a string of 0 and 1 vectors. Then B v 2 ,   B v 3 and b v 2 ,   b v 3 are constructed for the rest stem keywords in the same way, finally, b v 1 , b v 2 , and b v 3 are mapped as inputs to k LSHs on B i , respectively, and the corresponding position of B i is set to T F i , j / k . Assuming three keywords have the same weight T F i , j / k of 0.5, the final index vector B i of the file F i is obtained.
C.
Index Tree Generation
Denote S in K S and bloom filter B i of file F i as S = ( s 1 , s 2 , s m ) and B i = ( b 1 , b 2 , b m ) , respectively, where B i and S have the same structure. For each b i in B i , let B i = ( b 1 , b 2 , , b m ) ,   B i = ( b 1 , b 2 , , b m ) . If s i = 1 , then b i = b i = b i ; If s i = 0 , then b i = b i / 2 + r , b i = b i / 2 r , where r is a random parameter and r R . Finally, the encrypted index I i = ( I i , I i ) of the file F i is obtained, where I i = ( M 1 T B i ) ,   I i = ( M 2 T B i ) .
The index tree construction of the scheme in [15] is used to construct the I n d e x t r e e and the storage structure of the leaf nodes is improved on this basis. Leaf nodes store file identifiers and vectors in addition to filing authentication codes and key ciphertexts with access control structures. The internal nodes store the optimal vectors of their leaf node index vectors and the internal node vectors are encrypted in accordance with the index encryption method of F i . Finally, the I n d e x t r e e is sent to S C , the I n d e x t r e e is shown in Figure 4.
D.
Trapdoor Generation
T o k e n G e n ( p k , S K s , W q ) : D U first converts the set of keywords W q into a bigram set by the bigram segmentation algorithm, then the bigram set is transformed into a binary vector set B v using the same method as for index generation. Each vector in B v is used as an input to the k LSHs, respectively, to map into an l - bit bloom filter, the value of the mapped location is set to 1. Finally, the vector B t of the keywords is obtained.
The bloom filter used to generate the search trapdoor is denoted as B t = ( t 1 , t 2 , t m ) , S in K S is denoted as S = { s 1 , s 2 , , s m } . S and B t have the same structure as B t = ( t 1 , t 2 , t m ) and B t = ( t 1 , t 2 , t m ) . For each t i in B t , if s i = 1 , then t i = t i / 2 + r , t i = t i / 2 r ; if s i = 0 , then t i = t i = t i . Then, encrypt B t , B t to get T = ( M 1 1 B t ) , T = ( M 2 1 B t ) . Eventually, the secure search trapdoor T = ( T , T ) , T t = ( T | | L t ) , and u s e r _ i d are sent to S C through transaction.

5.2.3. Search Match

S e a r c h ( I n d e x t r e e , T t ) : D O defines a reasonable search unit price $ s e a r c h , then sets $ p r o _ D U as the deposit of D U stored temporarily in S C to prevent D U from quitting in the middle. $ A l l represents the total price that D U should pay for each search transaction and $ s e a r c h _ p r i c e represents the overhead of calling the search algorithm. g a s _ l i m and g a s _ p r i c e respectively represent the maximum gas consumption and gas unit price in the S C , and the product of the two represents the maximum price that the S C is allowed to accept.
S C obtains the user’s search trapdoor and u s e r _ i d through the transaction. After authenticating whether the identity is correct, it checks whether the pre-deposit of D U satisfies the search transaction, if it does, a search match will be performed, otherwise, the transaction is canceled.
The index tree will be traversed from top to bottom. Meanwhile, S C calculates T T I i , which represents the correlation score S c o r e of the node and search keywords, where T I i = { M 1 1 B t M 1 T B i + M 2 1 B t M 2 T B i } . If the S c o r e of the parent node vector and the search trapdoor is less than the threshold T H , the traversal of the node’s children is stopped. The entire index tree is traversed in this way.
S C defines a list S c l i s t , which consists of at most k n records. Each entry contains a pair < S c o r e F i , v a l u e > , where v a l u e = { F I D | | C K i | | M A C C i } , S c o r e = { T I i } . When traversing to the leaf node, S C stores the information of leaf node in S c l i s t . After traversing the index tree, S C sorts and eliminates the k n entries of S c l i s t based on S c o r e F i . Finally the k entries that match the user’s search are obtained, then S C sends the v a l u e in R L i s t to D U , sends the F I D and transaction information in v a l u e to the C S . After receiving the message, C S packages the ciphertext set C and sends it to the D U .
S C uses GDFS (Greedy Depth-First Search) for searching, the notation and Algorithm 1 used are described as follows.
S c o r e n d : correlation score between index vector I i of node n d and search vector T .
n d . h c l d : child nodes of node n d with high query relevance.
n d . l c l d : child nodes of node n d with low query relevance.
Algorithm 1 Search Matching Algorithm
Input: the index tree I n d e x t r e e
Output: S c l i s t
1: if n d is not a leaf node then
2: if  S c o r e n d > T H then
3:   GDFS ( n d . h c l d )
4:    GDFS ( n d . l c l d )
5: else
6:   return
7: end if
8: else
9: if   S c o r e F i > T H && S c l i s t is full
10:     delete the lowest relevance node from S c l i s t
11:   insert the new node < S c o r e F i , v a l u e > and sort all the node by S c o r e F i
12: else if S c o r e F i > T H && S c l i s t is not full
13:   insert the new node < S c o r e , v a l u e > in S c l i s t
14: end if
15: return
16: end if
As the child nodes are overwritten by parent nodes, only part of the tree needs to search. As in Figure 4, the file collection D = { F 1 , F 2 , , F 7 } , the threshold T H is set to 1.5, and the trapdoor T = ( 0 , 1 , 0 , 1 , 0 ) is sent to S C by the user. Assume that 1 most relevant file for the search is selected.
S C traverses the nodes from top to bottom and computes S c o r e for each node. The children of L 11 do not need to traverse due to S c o r e R o o t = 2.5 > T H , S c o r e L 11 = 1 < T H ,   S c o r e R 12 = 2 > T H , S c o r e R 23 = 1.8 > T H , S c o r e F 5 = 1 < T H , S c o r e F 6 = 1.6 > T H and S c o r e F 7 = 1.5 > T H . The information of F 6 and F 7 is finally added to S c l i s t , S C ranks F 6 and F 7 in S c l i s t according to S c o r e , then F 7 is eliminated.

5.2.4. The Subsequent Work

In this section, we focus on verification, decryption of files, and updating the index tree.
A.
Verification
V erify ( C i , M A C C i , K 2 ) : Suppose k encrypted identifiers F I D = { F I D 1 , F I D 2 , F I D k } are returned in a search result, D U first determines whether F I D from C S is the same as F I D from S C . Then, the M A C C i of the file is calculated locally by k 2 to determine whether the C S has tampered with search information. S C sends $ s e a r c h to D O , sends $ s e a r c h _ p r i c e $ g a s _ p r i c e to the staff member who executed the transaction, and refunds the transaction deposit from D U .
B.
Decryption
D e c ( S K u , C K i , C w q ) : After getting the key ciphertext C K i , the user checks whether the S K u matches with the ciphertext access policy, if it matches, D U can get A = e ( g , g ) r s and recover the K i , otherwise, D U has no access to the file. The decryption of K i is shown in the Decryption Correctness section.
C.
Index tree update
U p d a t e ( U p t y p e , I T c ) : The updating of the index tree is the process of replacing the old subtree with the new encrypted one. D O retains the unencrypted index tree locally and denotes T c as the set of nodes that may be changed during the update process. As an example, in Figure 4, if the file F 1 is deleted in an update operation, then T c = { L 11 , L 21 , R o o t , F 1 } .
When U p t y p e = d e l e t e , delete the leaf node F i and recompute the other node index vectors in T c , encrypt node index vectors in T c by K S . If the operation destroys the balance of the tree, replace it with a pseudo node, then a new indexed subtree I T c is generated. D O sends I T c and U p t y p e = d e l e t e to S C , which replaces the original indexed subtree with the help of I T c .When U p t y p e = a d d , the leaf node F i needs to store the key ciphertext C K i , the file ciphertext message authentication code M A C F i , encrypted identifier F I D F i and the other operations are the same as the deletion operation. The index update process will not destroy the balance of the tree because the new nodes will be added to the pseudo nodes.

6. Program Analysis

We analyze the correctness and security of the scheme in this section.

6.1. Correctness Analysis

6.1.1. Fuzzy Search Correctness

We denote I i = ( I i , I i ) and trapdoor T = ( T , T ) for the multi-keyword set w q . Assuming n file indexes are matched with T in the search, S c o r e is the inner product of I i and T , denote each element in S c o r e as P x , y , which S c o r e = I i T + I i T = M 1 T B i M 1 1 B t + M 2 T B i M 2 1 B t , P = { P x , y | 1 x n , 1 y n } and satisfy (1).
r x , y = b x , y t y + b x , y t y = b x , y ( t y / 2 + r ) + b x , y ( t y / 2 r ) = b x , y t y ,   s y = 1 r x , y = b x , y t y + b x , y t y = b x , y ( t y / 2 + r ) + b x , y ( t y / 2 r ) = b x , y t y ,   s y = 0
From the above calculations, it can be concluded that the random numbers r and r are not linked to the result, so the correlation scores of the fuzzy search when the trapdoor is matched with the index inner product are not affected by the random numbers.

6.1.2. Decryption Correctness

After D U receives the key ciphertext C k 1 containing the access structure, D U uses attribute private key S u to determine whether the access structure is satisfied or not, and only if the access structure is satisfied can K i be recovered, the process is in (2).
K i = C ¯ e ( C , E ) A = K i e ( g , g ) α s e ( h s , g α + r β ) e ( g , g ) r s = K i e ( g , g ) α s e ( g , g ) r s e ( g , g ) s ( α + r )

6.2. Security Analysis

Theorem 1. 
The scheme is secure under the adaptive selection of keyword attacks when C S and external adversaries can only divulge what they are allowed to.
Proof. 
For adversary A , a stateful simulator B satisfies the condition | P r [ R e a l A ( λ ) ] P r [ I d e a l A ,   B ( λ ) ] | n e g l ( λ ) , where n e g l ( λ ) is a negligible probability. From the proof equivalence in the scheme in [27], the simulation-based game proofs are equivalent to indistinguishable game proofs. A only needs to analyze the differentiation between B to win the game, P r [ I n d A ( λ ) = 1 ] 1 / 2 n e g l ( λ ) is used to prove the scheme is secure. B simulates the process of generating ciphertexts, indexes, and trapdoors as follows.
(1)
Generate ciphertext C by simulation.
According to L 1 ( D ) = { | D | , n , { | F i | , i n d ( F i ) } i [ 1 , n ] } , B generates n ciphertexts of | F i | i [ 1 , n ] - bit uniformly at random to simulate the encrypted file set C = { C 1 , C 2 , , C n } , which is encrypted by AES. According to the semantic security of symmetric encryption, the ciphertext C generated by R e l A ( λ ) and the ciphertext C generated by I d e a l A ,   B ( λ ) are computationally indistinguishable, i.e., the advantage of the adversary’s victory can be ignored, i.e., P r [ E n c r y p t ( F , K i ) C ] P r [ S i m C ] n e g l 1 ( λ ) .
(2)
Generate search trapdoor T by simulation.
According to L 2 ( D , W q ) = { P ( W q ) , T } , B maps each keyword in W q = ( w 1 , w 2 , , w q ) to the bloom filter, its search trapdoor is denoted as T Q = ( T Q , T Q ) , where T Q = ( M 1 1 B Q ) , T Q = ( M 2 1 B Q ) , B Q = ( t 1 , t 2 , , t m ) , t j = t j / 2 + r , and t j = t j / 2 r . The probability of B simulating the search token T Q is negligible when the K S is unknown. Due to the introduction of the random value r in the encrypted query and the pseudo-random nature of the key set K S , the simulator B randomly selects K S and r to construct the feasible search token with probability P r [ T Q = T ] P r [ r = r ] P r [ K S = K S ] 1 / 2 k n e g 2 ( λ ) , thus the advantage of B to simulate the feasible search trapdoor is negligible.
(3)
Simulated construction of index I .
In the process of generating the secure index based on the leakage function 1 , L 2 , simulator B picks a random value to replace the random number used in the real case. According to the secure K N N encryption algorithm, it is known that the secure K N N algorithm using random numbers is indistinguishable. Meanwhile, the matrix picked for the key set K S is a sufficiently large invertible matrix and K S is not artificially compromised. Thus, the advantage of winning for the polynomial adversary satisfies (3), which is negligible.
| P r [ B u i l d I n d e x ( S K , F i , K W F i ) I r e a l ] P r [ S i m I ] | n e g l 3 ( λ )  
Since the adversary wins this indistinguishability game by analyzing ciphertexts, indexes and search credentials, the advantages of the adversary in distinguishing between real ciphertexts and simulated ciphertexts, real indexes and simulated ciphertexts, and real search tokens and simulated search tokens, respectively, are denoted as A d v A ( C , C ) , A d v A ( I r e a l , I ) and A d v A ( T Q , T ) . Then we get (4).
P r [ I n d A ( λ ) = 1 ] = 1 2 + A d v A ( C , C ) + A d v A ( I r e a l , I ) + A d v A ( T Q , T ) = 1 2 + | P r [ E n c r y p t ( F i , K i ) C ] P r [ S i m C ] | + | P r [ B u i l d I n d e x ( K S , F i , K W F i ) I r e a l ] P r [ S i m I ] | + | P r [ T r a p d o o r ( K S , W q ) T Q ] P r [ S i m T ] | 1 2 + n e g l 1 ( λ ) + n e g 2 ( λ ) + n e g l 3 ( λ )  
In summary, the outputs of R e l A ( λ ) and I d e a l A ,   B ( λ ) obtained for any polynomial adversary are indistinguishable. □
Theorem 2. 
If the blockchain is tamper-proof, the scheme ensures that user transactions are fair.
Proof. 
If the C S is dishonest, it will return incorrect file numbers or falsify search results. According to the fair-trade rules set by S C , C S will not receive service fees and will be penalized with loss of deposit. If C S enforces the contract rules honestly, D U gets the correct result and C S gets the corresponding service fee. In addition, the transaction specifies a limit time L t , when the transaction time is greater than L t , the user’s deposit will be refunded, and the transaction will be ended. □
Theorem 3. 
For malicious cloud servers and other external adversaries, no information can be learned about plaintext files except for ciphertext files.
Proof. 
In the scheme, the file is encrypted with the corresponding symmetric key K i before it is uploaded to the C S , then D O encrypts K i as C K i based on CP - ABE and stores it in the index tree I n d e x t r e e . Even though the C S and external adversaries eavesdrop on the ciphertext files in the channel, the difficulty of trying to get the plaintext information is equal to the difficulty of decrypting K i . Since C ¯ = k i e ( g , g ) α s in C K i , e ( g , g ) α s must be computed to decrypt K i . Under the discrete logarithm problem, the external adversary and C S to compute e ( g , g ) α s from C = h s and e ( g , g ) α is difficult. Based on the D i f f e H e l l m a n computational problem, the adversary cannot use the Lagrange interpolation formula to recursively compute e ( g , g ) r s without the attribute key S K u that conforms to the access policy and obtain K i . K i and the plaintext file can be decrypted when S K u satisfies the access structure. Since S K u is privately stored by the D U , it is not possible for C S and external adversaries to learn any plaintext information other than the ciphertext file. □

7. Efficiency Analysis

The scheme will be analyzed in terms of function, complexity, and performance experimentation in this section.

7.1. Functionality Comparison

The functions of the proposed scheme with the related schemes [15,21,25,28,29] are completed in this section. As shown in Table 2, where the symbol “✓” indicates that the corresponding functionality is satisfied and “✗” indicates that it is not satisfied.
Both the scheme in [15] and the scheme in [29] are about multi-keyword fuzzy searchable encryption. The scheme in [15] realizes dynamic updating by constructing a tree structure but does not support verification of the search result. The scheme in [29] only considers the problem of ciphertext result correctness and does not support dynamic updating. The scheme in [25] utilizes the blockchain mechanism to protect the index, but it does not support fuzzy search and relevance sorting of files, and it cannot maximize the return of the set of files that the user is interested in. The scheme in [28] utilizes authentication and homomorphic encryption to detect misbehavior occurring in cloud servers, and the search results are outsourced to edge computing servers, which can verify the correctness and integrity, but it would impose additional overhead. Although the scheme [21] supports all the functions, the scheme does not indicate how the process of dynamic updating takes place. It also does not take measures against the possible malicious behavior of the cloud server and does not effectively limit the privileges of the cloud server.
The proposed scheme makes up for the shortcomings of the above schemes, and it not only realizes fuzzy search, search result relevance sorting, and dynamic updating but also supports fine-grained access control, search result correctness, and integrity verification. At the same time, smart contracts are used to constrain malicious behaviors that may occur in the search process of cloud servers and protect the rights and interests of users.

7.2. Complexity Theoretical Analysis

In this section, the performance will be analyzed from three aspects: index tree generation, trapdoor generation, and search matching.
  • Index tree generation. The cost of file index generation includes the generation of a plaintext index vector for each file and the encryption of the plaintext index vector. Where the generation of the index is mainly linked to the number of keywords, the encryption of the plaintext vector mainly consists of the factorization operation of the vector and multiplication with a matrix of order l × l . The time overhead of single file index generation is O ( l 2 ) . The I n d e x t r e e generates ( n 1 ) additional nodes at the internal nodes when the number of files is n . Therefore, the cost of the I n d e x t r e e generation is a little higher than the other schemes, but it is generated by the D U locally and upload to the smart contract only once.
  • Trapdoor Generation. The cost of trapdoor generation mainly consists of generating an l bit plaintext trapdoor and performing matrix multiplication operations on the plaintext trapdoor, so the cost of trapdoor generation is O ( l 2 ) .
  • Search Match. The Search and Match includes traversing the I n d e x t r e e , the calculation of the inner product of trapdoors and index vectors, and sorting the files to filter out rejections based on relevance. In this scheme, only the n k most relevant ciphertext files are searched. These files and child nodes share the same access path, so the traversal of the I n d e x t r e e only touches a portion of n nodes i.e., on r ( r < < n ) paths from the root to the leaf nodes instead of traversing the child nodes repeatedly, hence the time complexity is O ( r log n ) .
In Table 3, the efficiency is compared in terms of index generation, trapdoor generation, and search time. Here, n is the number of files, Q is the number of exact keywords, Z is the maximum number of fuzzy keywords, and l denotes the index length in the proposed scheme. In the scheme, l can be treated as a constant and n < < Q < < Z .

7.3. Experimental Performance

In this section, simulation experiments are conducted. The performance of the scheme is evaluated using Java language in the Windows 10 Home Chinese version. The hardware system is a 64-bit Windows operating system Intel(R) Core (TM) i7-8750H CPU @2.20 GHz with 16 GB of RAM and the experimental dataset is RFC (Request For Comments). Smart contracts are deployed through the solidity language, running in an Ethereum virtual machine. The blockchain implementation is based on the official Ethereum test network Rinkeby and the Ethereum wallet MetaMask is used to function as an account. The numbers of LSH are chosen as k = 30, the constructed index and trapdoor vector is l = 8000, the file is encrypted by AES-256, and the number of keywords in a file is 80–113.
The scheme excludes keywords with little semantic distinction such as the, where, and a. Simulation experiments and comparisons of our scheme with scheme [15], scheme [28], and scheme [29] are conducted in terms of file index generation, search matching time, and search accuracy.
  • Individual File Index Generation and Encryption. In Figure 5a, the cost of generating a plaintext index vector for a single file index is gradually increasing with the number of file keywords. Meanwhile, the encryption time of the plaintext index vector is constant, which fluctuates slightly at 40 ms.
  • Search Match. In Figure 5b, as the number of files gradually increases, all 4 schemes basically show an increasing trend. When the number of files is no more than 1000, all the schemes increase at a relatively small level. The time cost and growth rate required for searching in the proposed scheme are lower than those of the schemes in [25,29], and as the number of files increases, the scheme follows up with a slight increase compared to the beginning. Although the search time of the scheme in [21] also consistently increases at a lower level, the search time of our scheme is still superior to it. When the number of files is 2000, the search times of the schemes in [28,29] are 7.42 s and 4.42 s, respectively, while the time cost of our scheme is only 2.21 s. In Figure 5c, the search matching costs of the schemes in [21,28,29] tend to stabilize, while the time cost in our scheme shows a small linear increase. When the number of search keywords is 9, the time costs of the scheme in [28,29], and [21] are 7.42 s, 4.51 s, and 2.65 s, respectively, while the time cost of our scheme is only about 1.85 s.
  • Fuzzy Search Accuracy. Since our scheme and the scheme in [15] have similarities in index construction which uses u n i g r a m to transform keywords, the scheme in [15] is chosen to compare fuzzy search accuracy. The fuzzy keywords are constructed by replacing one letter in the keyword with another letter randomly, and the check accuracy is used to test the search results. The accuracy test method Δ P = p / p from the scheme in [32] is used in our scheme, where p indicates the number of search files returned and p indicates the number of real files returned.
As seen in Figure 5d, the search accuracy increases gradually with the increase in the number of search keywords, which makes the scheme more capable of distinguishing the files that satisfy the user’s search conditions. The search accuracy of the scheme in [15] is around 85%, while the search accuracy of our scheme can reach 91% with the increase in valid search keywords.

8. Conclusions

In this paper, an effective semi-decentralized multi-keyword fuzzy searchable encryption scheme is proposed. We improved the index tree storage to support fine-grained access control without losing search efficiency. In addition, CP - ABE is introduced into the scheme to support multi-user scenarios. Experimental results show that our scheme has certain advantages in efficiency compared to similar schemes. In a word, the proposed scheme is rich with additional functions such as verification, updating, etc., while at the same time presenting a high performance.

Author Contributions

Conceptualization, X.Y. and P.C.; methodology, X.Y. and P.C.; software, X.Y.; validation, X.Y. and P.C.; formal analysis, X.Y., P.C. and J.Z.; investigation, P.C.; resources, X.Y. and Y.T.; data curation, P.C.; writing—original draft preparation, X.Y. and P.C.; writing—review and editing, X.Y. and P.C.; visualization, X.Y. and P.C.; supervision, X.Y. and Y.T.; project administration, X.Y. and Y.T.; funding acquisition, X.Y. and Y.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by [Programs for Science and Technology Development of Henan Province] grant number [242102210152], [the Fundamental Research Funds for the Universities of Henan Province] grant number [NSFRF240620], [Key Scientific Research Project of Henan Higher Education Institutions] grant number [24A520015] and [National Natural Science Foundation of China] grant number [62472144]. And The APC was funded by [National Natural Science Foundation of China] grant number [62472144].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Song, D.X.; Wagner, D.; Perrig, A. Practical techniques for searches on encrypted data. In Proceedings of the 2000 IEEE Symposium on Security and Privacy, S&P 2000, Berkeley, CA, USA, 14–17 May 2000; pp. 44–55. [Google Scholar]
  2. Xu, G.; Li, H.; Ren, H.; Lin, X.; Shen, X. DNA Similarity Search With Access Control Over Encrypted Cloud Data. IEEE Trans. Cloud Comput. 2022, 10, 1233–1252. [Google Scholar] [CrossRef]
  3. Tong, Q.; Miao, Y.; Liu, X.; Choo, K.-K.R.; Deng, R.H.; Li, H. VPSL: Verifiable Privacy-Preserving Data Search for Cloud-Assisted Internet of Things. IEEE Trans. Cloud Comput. 2022, 10, 2964–2976. [Google Scholar] [CrossRef]
  4. Boneh, D.; Waters, B. Conjunctive, Subset, and Range Queries on Encrypted Data. In Theory of Cryptography; Vadhan, S.P., Ed.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2007; Volume 4392, pp. 535–554. [Google Scholar] [CrossRef]
  5. Chen, L.; Cong, G.; Jensen, C.S.; Wu, D. Spatial Keyword Query Processing: An Experimental Evaluation. In Proceedings of the VLDB Endowment, Riva del Garda, Trento, Italy, 26–30 August 2013. [Google Scholar]
  6. Tong, Q.; Li, X.; Miao, Y.; Liu, X.; Weng, J.; Deng, R. Privacy-Preserving Boolean Range Query with Temporal Access Control in Mobile Computing. IEEE Trans. Knowl. Data Eng. 2022, 35, 5159–5172. [Google Scholar] [CrossRef]
  7. Li, J.; Wang, Q.; Wang, C.; Cao, N.; Ren, K.; Lou, W. Fuzzy Keyword Search over Encrypted Data in Cloud Computing. In Proceedings of the 2010 Proceedings IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–5. [Google Scholar] [CrossRef]
  8. Chuah, M.; Hu, W. Privacy-Aware BedTree Based Solution for Fuzzy Multi-keyword Search over Encrypted Data. In Proceedings of the 2011 31st International Conference on Distributed Computing Systems Workshops, Minneapolis, MN, USA, 20–24 June 2011; pp. 273–281. [Google Scholar] [CrossRef]
  9. Wang, B.; Yu, S.; Lou, W.; Hou, Y.T. Inverted index based multi-keyword public-key searchable encryption with strong privacy guarantee. In Proceedings of the IEEE Conference on Computer Communications (INFOCOM), Hong Kong, China, 26 April–1 May2015; pp. 2092–2100. [Google Scholar]
  10. Chen, J.; He, K.; Deng, L.; Yuan, Q.; Du, R.; Xiang, Y. EliMFS: Achieving Efficient, Leakage-Resilient, and Multi-Keyword Fuzzy Search on Encrypted Cloud Data. IEEE Trans. Serv. Comput. 2020, 13, 1072–1085. [Google Scholar] [CrossRef]
  11. Shan, B.; Yao, Y.; Li, W.; Zuo, X.; Yu, N. Fuzzy Keyword Search over Encrypted Cloud Data with Dynamic Fine-grained Access Control. In Proceedings of the 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Wuhan, China, 9–11 December 2022; pp. 1340–1347. [Google Scholar] [CrossRef]
  12. Xia, Z.; Wang, X.; Sun, X.; Wang, Q. A Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data. IEEE Trans. Parallel Distrib. Syst. 2016, 27, 340–352. [Google Scholar] [CrossRef]
  13. Fu, Z.; Wu, X.; Wang, Q.; Ren, K. Enabling Central Keyword-Based Semantic Extension Search Over Encrypted Outsourced Data. IEEE Trans. Inform. Forensics Secur. 2017, 12, 2986–2997. [Google Scholar] [CrossRef]
  14. Fu, Z.; Wu, X.; Guan, C.; Sun, X.; Ren, K. Toward Efficient Multi-Keyword Fuzzy Search Over Encrypted Outsourced Data With Accuracy Improvement. IEEE Trans. Inform. Forensic Secur. 2016, 11, 2706–2716. [Google Scholar] [CrossRef]
  15. Zhong, H.; Li, Z.; Cui, J.; Sun, Y.; Liu, L. Efficient dynamic multi-keyword fuzzy search over encrypted cloud data. J. Netw. Comput. Appl. 2020, 149, 102469. [Google Scholar] [CrossRef]
  16. Wang, C.; Ren, K.; Yu, S.; Urs, K.M.R. Achieving usable and privacy-assured similarity search over outsourced cloud data. In Proceedings of the 2012 Proceedings IEEE INFOCOM, Orlando, FL, USA, 25–30 March 2012; pp. 451–459. [Google Scholar] [CrossRef]
  17. Zhang, H.; Zhao, S.; Guo, Z.; Wen, Q.; Li, W.; Gao, F. Scalable Fuzzy Keyword Ranked Search Over Encrypted Data on Hybrid Clouds. IEEE Trans. Cloud Comput. 2023, 11, 308–323. [Google Scholar] [CrossRef]
  18. Kuzu, M.; Islam, M.S.; Kantarcioglu, M. Efficient Similarity Search over Encrypted Data. In Proceedings of the 2012 IEEE 28th International Conference on Data Engineering, Arlington, VA, USA, 1–5 April 2012; pp. 1156–1167. [Google Scholar] [CrossRef]
  19. Wang, B.; Yu, S.; Lou, W.; Hou, Y.T. Privacy-preserving multi-keyword fuzzy search over encrypted data in the cloud. In Proceedings of the IEEE INFOCOM 2014—IEEE Conference on Computer Communications, Toronto, ON, Canada, 27 April–2 May 2014; pp. 2112–2120. [Google Scholar] [CrossRef]
  20. Tong, Q.; Miao, Y.; Weng, J.; Liu, X.; Choo, K.-K.R.; Deng, R. Verifiable Fuzzy Multi-keyword Search over Encrypted Data with Adaptive Security. IEEE Trans. Knowl. Data Eng. 2022, 35, 5386–5399. [Google Scholar] [CrossRef]
  21. Li, X.; Tong, Q.; Zhao, J.; Miao, Y.; Ma, S.; Weng, J. VRFMS: Verifiable Ranked Fuzzy Multi-keyword Search over Encrypted Data. IEEE Trans. Serv. Comput. 2022, 16, 698–710. [Google Scholar] [CrossRef]
  22. Cai, C.; Weng, J.; Yuan, X.; Wang, C. Enabling Reliable Keyword Search in Encrypted Decentralized Storage with Fairness. IEEE Trans. Dependable Secur. Comput. 2021, 18, 131–144. [Google Scholar] [CrossRef]
  23. Wang, S.; Zhang, Y.; Zhang, Y. A Blockchain-Based Framework for Data Sharing With Fine-Grained Access Control in Decentralized Storage Systems. IEEE Access 2018, 6, 38437–38450. [Google Scholar] [CrossRef]
  24. Hu, S.; Cai, C.; Wang, Q.; Wang, C.; Luo, X.; Ren, K. Searching an Encrypted Cloud Meets Blockchain: A Decentralized, Reliable and Fair Realization. In Proceedings of the IEEE INFOCOM 2018—IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018; pp. 792–800. [Google Scholar] [CrossRef]
  25. Chen, Z.; Wu, A.; Li, Y.; Xing, Q.; Geng, S. Blockchain-Enabled Public Key Encryption with Multi-Keyword Search in Cloud Computing. Secur. Commun. Netw. 2021, 2021, 6619689. [Google Scholar] [CrossRef]
  26. Guo, Y.; Zhang, C.; Wang, C.; Jia, X. Towards Public Verifiable and Forward-Privacy Encrypted Search by Using Blockchain. IEEE Trans. Dependable Secur. Comput. 2022, 20, 2111–2126. [Google Scholar] [CrossRef]
  27. Chai, Q.; Gong, G. Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers. In Proceedings of the 2012 IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 917–922. [Google Scholar] [CrossRef]
  28. Li, J.; Ma, J.; Miao, Y.; Chen, L.; Wang, Y.; Liu, X. Verifiable Semantic-Aware Ranked Keyword Search in Cloud-Assisted Edge Computing. IEEE Trans. Serv. Comput. 2022, 15, 3591–3605. [Google Scholar] [CrossRef]
  29. Fu, S.; Zhang, Q.; Jia, N.; Xu, M. A Privacy-preserving Fuzzy Search Scheme Supporting Logic Query over Encrypted Cloud Data. Mob. Netw. Appl. 2021, 26, 1574–1585. [Google Scholar] [CrossRef]
  30. Ge, X.; Yu, J.; Hu, C.; Zhang, H.; Hao, R. Enabling Efficient Verifiable Fuzzy Keyword Search Over Encrypted Data in Cloud Computing. IEEE Access 2018, 6, 45725–45739. [Google Scholar] [CrossRef]
  31. Huang, R.; Li, Z.; Wu, G. A Verifiable Encryption Scheme Supporting Fuzzy Search. In Security, Privacy, and Anonymity in Computation, Communication, and Storage; Wang, G., Feng, J., Bhuiyan, M.Z.A., Lu, R., Eds.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2019; Volume 11611, pp. 397–411. [Google Scholar] [CrossRef]
  32. Xiangyang, Z.; Hua, D.; Xun, Y.; Geng, Y.; Xiao, L. MUSE: An Efficient and Accurate Verifiable Privacy-Preserving Multikeyword Text Search over Encrypted Cloud Data. Secur. Commun. Netw. 2017, 2017, 1923476. [Google Scholar] [CrossRef]
Figure 1. p-stable element mapping.
Figure 1. p-stable element mapping.
Applsci 14 11740 g001
Figure 2. System model.
Figure 2. System model.
Applsci 14 11740 g002
Figure 3. LSH-Bloom Architectural Diagram.
Figure 3. LSH-Bloom Architectural Diagram.
Applsci 14 11740 g003
Figure 4. The index tree.
Figure 4. The index tree.
Applsci 14 11740 g004
Figure 5. (a) Index generation time. (b) The search matching time when search keyword number is 3. (c) The search matching time when the file number is 2000. (d) The fuzzy search accuracy. Zhong, H. et al (2020) refers to the scheme in ref. [15], Li, X. et al (2022) refers to the scheme in ref. [21], Li, J. et al (2022) refers to the scheme in ref. [28] and Fu, S. et al (2021) refers to the scheme in ref. [29].
Figure 5. (a) Index generation time. (b) The search matching time when search keyword number is 3. (c) The search matching time when the file number is 2000. (d) The fuzzy search accuracy. Zhong, H. et al (2020) refers to the scheme in ref. [15], Li, X. et al (2022) refers to the scheme in ref. [21], Li, J. et al (2022) refers to the scheme in ref. [28] and Fu, S. et al (2021) refers to the scheme in ref. [29].
Applsci 14 11740 g005
Table 1. Description of Symbols.
Table 1. Description of Symbols.
NotationMeaning
D = { F 1 , F 2 ,   , F n } the file set of D O
C = { C 1 , C 2 ,   , C n } the ciphertext set
p k the public key
Γ the access policy
K i the symmetric key for the corresponding type of files.
C K i the key cipher containing access policy
K 2 the verification key
S u the user attributes collection
W q a query keyword collection
K S the index encryption key
S K u attribute private key for users
S K s the user search key
T t a trapdoor with time stamp
K W a collection of keywords for a file set
K W F i the keyword set for file F i
S T k w the stemming of the keyword k w
S T K W F i the keyword stemming set for file F i
M A C c i the ciphertext message authentication code
K S = ( M 1 , M 2 , S ) the key set for the encrypted trapdoor and index
Table 2. Functionality Comparison of Different Schemes.
Table 2. Functionality Comparison of Different Schemes.
SchemeScheme [15]Scheme [28]Scheme [25]Scheme [29]Scheme [21]Ours
Multi-Keyword
Dynamic Updates
Fuzzy Search
Integrity Verification
Correctness Verification
Related Sorting
Table 3. Time Complexity of Different Schemes.
Table 3. Time Complexity of Different Schemes.
SchemeScheme [30]Scheme [31]Scheme [21]Ours
Index Generation O ( Z Q ) O ( Z Q ) O ( n l 2 ) O ( ( 2 n 1 ) l 2 )
Trapdoor Generation O ( Z ) O ( Z ) O ( l 2 ) O ( l 2 )
Search Cost O ( Z ) O ( Z log Q ) O ( n l ) O ( r log n )
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Yan, X.; Cheng, P.; Tang, Y.; Zhang, J. Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme. Appl. Sci. 2024, 14, 11740. https://doi.org/10.3390/app142411740

AMA Style

Yan X, Cheng P, Tang Y, Zhang J. Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme. Applied Sciences. 2024; 14(24):11740. https://doi.org/10.3390/app142411740

Chicago/Turabian Style

Yan, Xixi, Pengyu Cheng, Yongli Tang, and Jing Zhang. 2024. "Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme" Applied Sciences 14, no. 24: 11740. https://doi.org/10.3390/app142411740

APA Style

Yan, X., Cheng, P., Tang, Y., & Zhang, J. (2024). Blockchain-Assisted Verifiable and Multi-User Fuzzy Search Encryption Scheme. Applied Sciences, 14(24), 11740. https://doi.org/10.3390/app142411740

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop