CN104036050A - Complex query method for encrypted cloud data - Google Patents
- Publication number
- CN104036050A
- Application number
- CN201410316970.2A
- Authority
- CN
- China
- Prior art keywords
- file
- keyword
- binary vector
- query
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Storage Device Security (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A complex query method for encrypted cloud data, in which the data owner builds a binary vector index for its file set, encrypts the file set with a symmetric cipher, and sends the encrypted file set to the cloud. A user who wants to access files containing certain keywords applies to the data owner for a query token, which contains the keyword set and the binary vector indexes of all files. The user builds a query binary vector from the query keywords and the keyword set, and computes the inner product of the query binary vector with the index binary vector of each file to determine whether that file contains any of the query keywords. If it does, a new index binary vector restricted to the query keywords is constructed. The user then generates an LSSS matrix from the query keywords according to the logical expression and computes the inner product of the new index binary vector with the LSSS matrix to determine whether the file satisfies the query's logical expression. The invention achieves exact complex queries and higher query efficiency than the inverted index that is widely used at present.
Description
Technical field
The invention belongs to the field of cloud storage and information retrieval, and specifically relates to a complex query method for encrypted (ciphertext) cloud data.
Background
In a cloud storage environment, encryption is a common way to protect the confidentiality and privacy of user data, but once the data is encrypted, retrieval over the ciphertext becomes a problem that urgently needs to be solved.
There are currently two typical approaches to retrieval over encrypted cloud data. The first performs a linear search directly on the ciphertext, comparing the words in the ciphertext one by one to confirm whether a keyword is present and how many times it occurs. The second is based on a secure index: a keyword index is built for the documents, the documents and the index are encrypted and uploaded to the cloud, and a search looks up in the index whether a keyword is present in a given document. The drawback of direct linear search is that it is inefficient and cannot cope with searches over massive data. Index-based ciphertext retrieval is the mainstream of current research because it offers better query efficiency and stronger security, and it is suitable for large-scale cloud storage ciphertext retrieval systems.
In existing research, all schemes use an inverted index mechanism, and none uses a binary vector index. Moreover, there are few schemes for complex queries, and the accuracy of their query results urgently needs to be improved.
With a binary vector index, the data owner needs to keep only a small amount of information to achieve efficient and secure retrieval over ciphertext data. With an LSSS matrix, exact complex queries can be achieved.
Querying encrypted cloud data is a key technology for ensuring both the confidentiality and the retrievability of data in cloud storage, and it has important theoretical significance and practical value for the rapid development of cloud storage.
Summary of the invention
In view of the defects of the prior art, the purpose of the present invention is to provide a complex query method for encrypted cloud data that improves query accuracy, query efficiency, and security.
To achieve this purpose, the present invention provides a complex query method for encrypted cloud data, comprising the following steps:
Step 1. The data owner builds an index for its file set using a binary vector index: each bit of the index represents one keyword, and a value of 1 or 0 indicates whether or not that keyword is present in the file;
Step 2. The data owner encrypts the file set with a symmetric cipher, either per file or per data block;
Step 3. The data owner sends the encrypted file set to the cloud;
Step 4. A user who wants to access files containing certain keywords applies to the data owner for a query token; the query token contains the keyword set and the binary vector indexes of all files;
Step 5. The user builds a query binary vector from the query keywords and the keyword set, and computes the inner product of the query binary vector with the index binary vector of each file to determine whether the file contains any of the user's query keywords;
Step 6. If the file contains query keywords, a new index binary vector corresponding to the query keywords is constructed;
Step 7. The user generates an LSSS (Linear Secret Sharing Scheme) matrix from the query keywords according to the logical expression, and computes the inner product of the new index binary vector with the LSSS matrix to further determine whether the file satisfies the query's logical expression.
Step 1 specifically includes the following sub-steps:
1.1 The data owner uses an existing word segmentation algorithm to extract keywords from its file set and builds the keyword set;
1.2 The data owner builds the binary vector index of each file according to whether the file contains the corresponding keyword of the keyword set: 1 indicates that the keyword is present in the file, and 0 indicates that it is not.
In step 2, if encryption is per file, the data owner uses the symmetric cipher to randomly generate as many symmetric keys as there are files in the file set and encrypts each file with its own key to produce the ciphertext, so that every file has a different encryption key. If encryption is per data block, the data owner splits the files of the file set into blocks of a configured size, randomly generates as many symmetric keys as there are blocks, and encrypts each block with its own key to produce the ciphertext, so that every data block has a different encryption key.
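The key handling in step 2 can be sketched as follows. This is a minimal illustration that only covers splitting a file into blocks and drawing one independent random key per block; the function names, the 32-byte key length, and the 8-byte block size are assumptions made for the example, and the symmetric cipher itself is deliberately left out because the text does not name one.

```python
import secrets


def generate_keys(count: int, key_bytes: int = 32) -> list[bytes]:
    """Draw one independent random symmetric key per file or per block."""
    return [secrets.token_bytes(key_bytes) for _ in range(count)]


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file into fixed-size blocks; the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Hypothetical file content and block size, chosen only for illustration.
blocks = split_into_blocks(b"example project file contents", 8)
block_keys = generate_keys(len(blocks))  # one distinct key per block
```

For per-file encryption the same `generate_keys` helper would be called with the number of files instead of the number of blocks.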
Step 4 specifically includes the following sub-steps:
4.1 The user sends a query authorization request to the data owner. According to its security policy, the data owner decides whether to issue an authorization token to the user and for which file sets; the token contains the keyword set of the authorized file set and the binary vector indexes of the authorized files;
4.2 The data owner sends the token to the user over a standard secure transport mechanism.
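One way to picture the token of step 4 is as a small record holding the keyword set and the index vectors of the authorized files. The sketch below is illustrative only: `QueryToken`, `issue_token`, the user name, and the dictionary-based policy are hypothetical, the whole keyword set is placed in the token for simplicity, and the secure transport of sub-step 4.2 is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class QueryToken:
    keywords: Tuple[str, ...]                  # keyword set, in index bit order
    file_indexes: Dict[str, Tuple[int, ...]]   # file id -> binary index vector


def issue_token(user, policy, keywords, all_indexes) -> Optional[QueryToken]:
    """Return a token covering only the files the policy grants to `user`, or None."""
    allowed = policy.get(user, set())
    granted = {fid: all_indexes[fid] for fid in sorted(allowed) if fid in all_indexes}
    if not granted:
        return None  # the owner declines to authorize this user
    return QueryToken(tuple(keywords), granted)


KEYWORDS = ["cloud computing", "cloud storage", "encryption",
            "data retrieval", "binary vector"]
ALL_INDEXES = {"file1": (1, 0, 1, 0, 0), "file2": (0, 1, 1, 1, 1)}
POLICY = {"alice": {"file1"}}  # hypothetical security policy

token = issue_token("alice", POLICY, KEYWORDS, ALL_INDEXES)
```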
Step 5 specifically includes the following sub-steps:
5.1 First build the query binary vector as follows: the vector has one bit for each keyword in the keyword set, and a bit is 1 if the corresponding keyword is one of the query keywords and 0 otherwise.
5.2 Compute the inner product of the query binary vector with the index binary vector of each file. A non-zero result indicates that the file contains at least one query keyword, and a result of 0 indicates that it contains none. The larger the inner product, the more query keywords the file contains.
Suppose r_i is the binary index vector of document F_i, where r_i[j] ∈ {0,1} indicates whether keyword w_j is present in the document, and Q is a query vector, where Q[j] ∈ {0,1} indicates whether keyword w_j is in the query keyword set W. The similarity score between document F_i and the query keyword set W is computed as the inner product r_i·Q.
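Written out, the score is a plain sum over the keyword positions. In the formula below, n denotes the number of keywords in the keyword set; this symbol is introduced here for illustration and does not appear in the text.

```latex
\mathrm{score}(F_i, W) \;=\; r_i \cdot Q \;=\; \sum_{j=1}^{n} r_i[j]\, Q[j]
```

Because every entry is 0 or 1, each term contributes 1 exactly when keyword w_j is both in the document and in the query, so the score equals the number of query keywords that F_i contains.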
In step 6, the new index binary vector corresponding to the query keywords is built as follows: in the file's index binary vector, the bits at the positions of the query keywords are kept and the bits of all other keywords are removed.
Step 7 specifically includes the following sub-steps:
7.1 First build the LSSS matrix from the query's logical expression as follows. Label the root node with the vector (1), whose length is 1, and initialize the variable c to 1. Let v denote the vector labeling a parent node. If the parent node is an OR gate, both children are labeled v. If the parent node is an AND gate, the left child is labeled v||1 and the right child is labeled (0,…,0)||-1, where the number of zeros is c, and then c is set to c+1. Once the whole tree is labeled, the vectors of the leaf nodes form the rows of the LSSS matrix M; rows of unequal length are padded with zeros.
7.2 Compute the inner product of the new index binary vector with the LSSS matrix. The file satisfies the query condition if and only if the result is (1,0,0,…,0); otherwise it does not.
The complex query method for encrypted cloud data involves a data owner, a user, and the cloud. The data owner extracts keywords from its file set with an existing word segmentation algorithm and builds the binary vector indexes of all files. The data owner also encrypts the files with a symmetric cipher (for block-based encryption, the files are first split into blocks of the configured size and each block is then encrypted) and sends the encrypted files to the cloud. The user requests query authorization from the data owner, and the data owner issues an authorization token to the user according to its security policy. The user uses the information in the token to build the query binary vector, computes its inner product with the index binary vectors of all files to determine whether each file contains the query keywords, builds the new index binary vector corresponding to the query keywords, generates the LSSS matrix from the query keywords according to the logical expression, and computes the inner product of the new index binary vector with the LSSS matrix. The user then requests from the cloud the ciphertext of the files that contain the query keywords and decrypts them with the file keys contained in the token. The cloud stores the data and responds to the user's read and write requests.
Compared with the prior art, the technical solution conceived by the present invention has the following advantages:
1. High query accuracy. Complex query conditions can be expressed with a query logical expression, and the LSSS matrix yields query results that match the logical expression exactly.
2. Convenient data updates. The index is built by the data owner, and the keyword set is kept by the data owner. When a file needs to be updated, the data owner only has to update that file's binary vector index, re-encrypt the file, and send the encrypted file to the cloud.
3. Inner products of binary vectors are very cheap to compute, and efficient retrieval requires only a small amount of additional storage on the user side.
Description of the drawings
Figure 1 shows the relationships between the entities involved in the invention.
Figure 2 is a flow chart of the method of the invention.
Figure 3 illustrates the binary vector index of the invention.
Figure 4 illustrates the construction of the LSSS matrix of the invention.
Detailed description
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.
The technical terms used in the invention are first explained:
Data owner: the owner of the files, who needs to store the files in the cloud and defines their access control policy;
User: a party that needs to read the files published by the data owner;
Cloud or cloud storage: stores the data owner's files and faithfully executes the operation requests issued by the data owner and legitimate users, but will peek at file contents when it has the opportunity;
File: data that the data owner needs to upload to the cloud;
File block: a chunk of a file; the data owner uses a different encryption key for each block of the same file;
Symmetric cipher: a conventional cryptographic mechanism in which encryption and decryption use the same key and which is comparatively efficient; the invention uses it to encrypt files or file blocks;
Symmetric key: randomly generated binary data used by the symmetric cipher;
LSSS: linear secret sharing scheme, the abbreviation of Linear Secret Sharing Scheme.
The invention is further described below with reference to the embodiment and the drawings.
As shown in Figure 1, the complex query method for encrypted cloud data of the invention is applied in an encrypted cloud storage system comprising a data owner, users, and the cloud.
In this embodiment, the data owner is the secretary of a research institution, and the data uploaded to the cloud are the institution's research project files. They are shared among the institution's staff, including staff travelling on business, during project applications and development.
As shown in Figure 2, the complex query method for encrypted cloud data of the invention comprises the following steps:
Step 1. The data owner builds an index for its file set using a binary vector index: each bit of the index represents one keyword, and a value of 1 or 0 indicates whether or not that keyword is present in the file, as shown in Figure 3. This step specifically includes the following sub-steps:
1.1 The data owner uses an existing word segmentation algorithm to extract keywords from its file set and builds the keyword set. For example, as shown in Figure 3, the keyword set is {cloud computing, cloud storage, encryption, data retrieval, binary vector}.
1.2 The data owner builds the binary vector index of each file according to whether the file contains the corresponding keyword of the keyword set: 1 indicates that the keyword is present in the file, and 0 indicates that it is not.
For example, as shown in Figure 3, file 1 contains the keywords {cloud computing, encryption}, so its index binary vector is f_1 = (1,0,1,0,0); file 2 contains the keywords {cloud storage, encryption, data retrieval, binary vector}, so its index binary vector is f_2 = (0,1,1,1,1).
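The index construction of sub-step 1.2 can be sketched in a few lines. The function name `build_index` is hypothetical, the keywords are written in English for readability, and keyword extraction by word segmentation (sub-step 1.1) is assumed to have been done already.

```python
# Keyword set in a fixed order; bit j of every index refers to KEYWORDS[j].
KEYWORDS = ["cloud computing", "cloud storage", "encryption",
            "data retrieval", "binary vector"]


def build_index(file_keywords, keyword_set):
    """Return one bit per keyword: 1 if the file contains it, 0 otherwise."""
    present = set(file_keywords)
    return tuple(1 if kw in present else 0 for kw in keyword_set)


f1 = build_index({"cloud computing", "encryption"}, KEYWORDS)
f2 = build_index({"cloud storage", "encryption",
                  "data retrieval", "binary vector"}, KEYWORDS)
print(f1)  # (1, 0, 1, 0, 0)
print(f2)  # (0, 1, 1, 1, 1)
```

Keeping the keyword set as an ordered list matters here, because the same bit order has to be used later when the query vector is built.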
Step 2. The data owner encrypts the file set with a symmetric cipher (either per file or per data block);
Step 3. The data owner sends the encrypted file set to the cloud;
Step 4. A user who wants to access files containing certain keywords applies to the data owner for a query token; the query token contains the keyword set and the binary vector indexes of all files. This step specifically includes the following sub-steps:
4.1 The user sends a query authorization request to the data owner. According to its security policy, the data owner decides whether to issue an authorization token to the user and for which file sets; the token contains the keyword set of the authorized file set and the binary vector indexes of the authorized files;
4.2 The data owner sends the token to the user over a standard secure transport mechanism.
Step 5. The user builds a query binary vector from the query keywords and the keyword set, and computes the inner product of the query binary vector with the index binary vector of each file to determine whether the file contains any of the user's query keywords. This step specifically includes the following sub-steps:
5.1 First build the query binary vector as follows: the vector has one bit for each keyword in the keyword set, and a bit is 1 if the corresponding keyword is one of the query keywords and 0 otherwise.
5.2 Compute the inner product of the query binary vector with the index binary vector of each file. A non-zero result indicates that the file contains at least one query keyword, and a result of 0 indicates that it contains none. The larger the inner product, the more query keywords the file contains.
For example, let the query keywords be w_1 = "cloud computing", w_2 = "cloud storage", w_3 = "encryption", and w_4 = "data retrieval", and let the query expression be (w_1 or w_2) and w_3 and w_4. The query binary vector is then q = (1,1,1,1,0).
For example, as shown in Figure 3, file 1 contains the keywords {cloud computing, encryption} and has the index binary vector f_1 = (1,0,1,0,0), and file 2 contains the keywords {cloud storage, encryption, data retrieval, binary vector} and has the index binary vector f_2 = (0,1,1,1,1). The inner product of the query vector with the index vector of file 1 is q·f_1 = (1,1,1,1,0)·(1,0,1,0,0) = 2, and the inner product of the query vector with the index vector of file 2 is q·f_2 = (1,1,1,1,0)·(0,1,1,1,1) = 3.
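The filtering of step 5 can be sketched as follows, using the vectors from the example. The helper names `build_query_vector` and `inner_product` are hypothetical, and sorting the candidates by score is an optional extra suggested by the remark that a larger inner product means more matching keywords.

```python
KEYWORDS = ["cloud computing", "cloud storage", "encryption",
            "data retrieval", "binary vector"]


def build_query_vector(query_keywords, keyword_set):
    """One bit per keyword in the set: 1 if it is a query keyword, else 0."""
    wanted = set(query_keywords)
    return tuple(1 if kw in wanted else 0 for kw in keyword_set)


def inner_product(a, b):
    """Inner product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


q = build_query_vector(
    ["cloud computing", "cloud storage", "encryption", "data retrieval"],
    KEYWORDS)
indexes = {"file1": (1, 0, 1, 0, 0), "file2": (0, 1, 1, 1, 1)}

scores = {name: inner_product(q, vec) for name, vec in indexes.items()}
# Keep files with a non-zero score, highest score first.
candidates = sorted((n for n, s in scores.items() if s != 0),
                    key=lambda n: scores[n], reverse=True)
print(scores)      # {'file1': 2, 'file2': 3}
print(candidates)  # ['file2', 'file1']
```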
Step 6. If the file contains query keywords, a new index binary vector corresponding to the query keywords is constructed;
In step 6, the new index binary vector corresponding to the query keywords is built as follows: in the file's index binary vector, the bits at the positions of the query keywords are kept and the bits of all other keywords are removed.
For example, the new index binary vector of file 1 is f_1' = (1,0,1,0), and the new index binary vector of file 2 is f_2' = (0,1,1,1).
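Step 6 amounts to projecting each index vector onto the positions where the query vector is 1. A minimal sketch, with the hypothetical helper name `project_index`:

```python
def project_index(index_vec, query_vec):
    """Keep only the index bits at positions where the query vector is 1."""
    if len(index_vec) != len(query_vec):
        raise ValueError("vectors must have the same length")
    return tuple(bit for bit, keep in zip(index_vec, query_vec) if keep)


q = (1, 1, 1, 1, 0)
f1_new = project_index((1, 0, 1, 0, 0), q)
f2_new = project_index((0, 1, 1, 1, 1), q)
print(f1_new)  # (1, 0, 1, 0)
print(f2_new)  # (0, 1, 1, 1)
```

The projected vector has one bit per query keyword, which is what allows it to be multiplied with an LSSS matrix that has one row per query keyword.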
Step 7. The user generates an LSSS matrix from the query keywords according to the logical expression, and computes the inner product of the new index binary vector with the LSSS matrix to further determine whether the file satisfies the query's logical expression. This step specifically includes the following sub-steps:
7.1 First build the LSSS matrix from the query's logical expression as follows. Label the root node with the vector (1), whose length is 1, and initialize the variable c to 1. Let v denote the vector labeling a parent node. If the parent node is an OR gate, both children are labeled v. If the parent node is an AND gate, the left child is labeled v||1 and the right child is labeled (0,…,0)||-1, where the number of zeros is c, and then c is set to c+1. Once the whole tree is labeled, the vectors of the leaf nodes form the rows of the LSSS matrix M; rows of unequal length are padded with zeros.
7.2 Compute the inner product of the new index binary vector with the LSSS matrix. The file satisfies the query condition if and only if the result is (1,0,0,…,0); otherwise it does not.
For example, to find the files that satisfy the query condition, first construct the LSSS matrix M as shown in Figure 4, using the construction method of sub-step 7.1.
Once the matrix M has been constructed, the index vector of each file is checked in turn. The new index binary vector of file 1 is f_1' = (1,0,1,0), and f_1'M = (1,0,1), so file 1 does not satisfy the query condition. The new index binary vector of file 2 is f_2' = (0,1,1,1), and f_2'M = (1,0,0), so file 2 satisfies the query condition.
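The construction and the test of step 7 can be sketched as below. Several things here are assumptions: Figure 4 is not available in this text, so the expression is grouped as (w1 or w2) and (w3 and w4), a grouping chosen because it reproduces the products quoted in the example; the class and function names are hypothetical; and padding v with zeros to length c before appending 1 at an AND gate follows the commonly used formula-to-LSSS conversion rather than anything spelled out in the text. The test applies the rule of sub-step 7.2 literally.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    keyword: str


@dataclass
class Gate:
    op: str            # "AND" or "OR"
    left: "Node"
    right: "Node"


Node = Union[Leaf, Gate]


def build_lsss(root: Node) -> Tuple[List[List[int]], List[str]]:
    """Label the tree from the root down; return (matrix rows, leaf keywords)."""
    rows: List[List[int]] = []
    keywords: List[str] = []
    c = 1  # counter c from the text, initialised to 1

    def label(node: Node, v: List[int]) -> None:
        nonlocal c
        if isinstance(node, Leaf):
            rows.append(list(v))
            keywords.append(node.keyword)
        elif node.op == "OR":
            label(node.left, v)              # both children inherit v
            label(node.right, v)
        elif node.op == "AND":
            padded = v + [0] * (c - len(v))  # assumed: pad v to length c
            left_v = padded + [1]            # v || 1
            right_v = [0] * c + [-1]         # (0,...,0) || -1 with c zeros
            c += 1
            label(node.left, left_v)
            label(node.right, right_v)
        else:
            raise ValueError(f"unknown gate: {node.op}")

    label(root, [1])
    # Pad shorter rows with zeros so that every row has length c.
    return [r + [0] * (c - len(r)) for r in rows], keywords


def vector_times_matrix(f, matrix):
    """Row vector f times matrix: the sum of the rows selected by the 1-bits of f."""
    width = len(matrix[0])
    return [sum(bit * row[j] for bit, row in zip(f, matrix)) for j in range(width)]


def satisfies(f, matrix) -> bool:
    """Test from sub-step 7.2: the product must equal (1, 0, ..., 0)."""
    return vector_times_matrix(f, matrix) == [1] + [0] * (len(matrix[0]) - 1)


# Assumed grouping of (w1 or w2) and w3 and w4; Figure 4 is not reproduced here.
tree = Gate("AND",
            Gate("OR", Leaf("w1"), Leaf("w2")),
            Gate("AND", Leaf("w3"), Leaf("w4")))
M, leaf_order = build_lsss(tree)

print(M)                                      # [[1, 1, 0], [1, 1, 0], [0, -1, 1], [0, 0, -1]]
print(vector_times_matrix((1, 0, 1, 0), M))   # [1, 0, 1] -> file 1 rejected
print(vector_times_matrix((0, 1, 1, 1), M))   # [1, 0, 0] -> file 2 accepted
```

The rows come out in left-to-right leaf order (w1, w2, w3, w4), which matches the bit order of the new index binary vectors from step 6.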
Suppose one Chinese character occupies 2 bytes and a keyword has at most 5 Chinese characters, i.e. 10 bytes. With 1000 keywords, storing the keyword set requires only about 10 KB. The binary vector index of each file is 1000 bits, i.e. 125 bytes, so the indexes of 1000 files require only about 125 KB of storage.
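The storage estimate can be reproduced with a few lines of arithmetic, assuming one index bit per keyword packed eight to a byte; under that assumption a 1000-bit index occupies 125 bytes per file. The constant names are illustrative.

```python
BYTES_PER_CHAR = 2            # one Chinese character, as assumed in the text
MAX_CHARS_PER_KEYWORD = 5
NUM_KEYWORDS = 1000
NUM_FILES = 1000

keyword_set_bytes = NUM_KEYWORDS * MAX_CHARS_PER_KEYWORD * BYTES_PER_CHAR
index_bytes_per_file = (NUM_KEYWORDS + 7) // 8   # bits rounded up to whole bytes
total_index_bytes = NUM_FILES * index_bytes_per_file

print(keyword_set_bytes)      # 10000
print(index_bytes_per_file)   # 125
print(total_index_bytes)      # 125000
```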
Those skilled in the art will readily understand that the above is only a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410316970.2A CN104036050A (en) | 2014-07-04 | 2014-07-04 | Complex query method for encrypted cloud data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410316970.2A CN104036050A (en) | 2014-07-04 | 2014-07-04 | Complex query method for encrypted cloud data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN104036050A (en) | 2014-09-10 |
Family
ID=51466820
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410316970.2A Pending CN104036050A (en) | 2014-07-04 | 2014-07-04 | Complex query method for encrypted cloud data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104036050A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106027509A (en) * | 2016-05-13 | 2016-10-12 | 成都镜杰科技有限责任公司 | Cloud platform data computing method in ERP environment |
CN106980796A (en) * | 2017-03-27 | 2017-07-25 | 河南科技大学 | Search method for multi-domain conjunctive keywords based on an MDB+ tree in a cloud environment |
CN108563732A (en) * | 2018-04-08 | 2018-09-21 | 浙江理工大学 | Towards encryption cloud data multiple-fault diagnosis sorted search method in a kind of cloud network |
CN109327448A (en) * | 2018-10-25 | 2019-02-12 | 深圳技术大学(筹) | A cloud file sharing method, device, device and storage medium |
CN110069604A (en) * | 2019-04-23 | 2019-07-30 | 北京字节跳动网络技术有限公司 | Text search method, apparatus and computer readable storage medium |
CN110612563A (en) * | 2017-05-18 | 2019-12-24 | 三菱电机株式会社 | Retrieval device, label generation device, query generation device, hidden retrieval system, retrieval program, label generation program, and query generation program |
CN110727835A (en) * | 2019-10-17 | 2020-01-24 | 浙江中智达科技有限公司 | Data query method, device and system |
CN112256839A (en) * | 2020-11-11 | 2021-01-22 | 深圳技术大学 | A ciphertext search method, device, system and computer-readable storage medium |
CN112639787A (en) * | 2018-07-16 | 2021-04-09 | 北京航迹科技有限公司 | Multiple file anomaly detection based on violation counting |
WO2022099496A1 (en) * | 2020-11-11 | 2022-05-19 | 深圳技术大学 | Ciphertext search method, apparatus and system, and computer-readable storage medium |
CN115098649A (en) * | 2022-08-25 | 2022-09-23 | 北京融数联智科技有限公司 | Keyword search method and system based on double-key accidental pseudorandom function |
CN119299239A (en) * | 2024-12-13 | 2025-01-10 | 云南省地矿测绘院有限公司 | Data encryption uploading method applied to cloud platform |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130046974A1 (en) * | 2011-08-16 | 2013-02-21 | Microsoft Corporation | Dynamic symmetric searchable encryption |
CN103345526A (en) * | 2013-07-22 | 2013-10-09 | 武汉大学 | Efficient privacy protection encrypted message querying method in cloud environment |
Non-Patent Citations (2)
Title |
---|
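Duan Yawei (段亚伟) et al.: "Extended ciphertext-policy attribute-based encryption mechanism", Journal of Huazhong University of Science and Technology (Natural Science Edition) * |
Dong Liyang (董丽阳): "Design and implementation of a privacy information protection function module in the query service of a public cloud platform", China Master's Theses Full-text Database, Information Science and Technology * |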
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106027509A (en) * | 2016-05-13 | 2016-10-12 | 成都镜杰科技有限责任公司 | Cloud platform data computing method in ERP environment |
CN106980796A (en) * | 2017-03-27 | 2017-07-25 | 河南科技大学 | Multi-domain connection keyword search method based on MDB+ tree in cloud environment |
CN106980796B (en) * | 2017-03-27 | 2020-03-06 | 河南科技大学 | Multi-domain connection keyword search method based on MDB+ tree in cloud environment |
CN110612563B (en) * | 2017-05-18 | 2023-05-12 | 三菱电机株式会社 | Search device, hidden search system, and computer-readable storage medium |
CN110612563A (en) * | 2017-05-18 | 2019-12-24 | 三菱电机株式会社 | Retrieval device, label generation device, query generation device, hidden retrieval system, retrieval program, label generation program, and query generation program |
CN108563732A (en) * | 2018-04-08 | 2018-09-21 | 浙江理工大学 | Multiple-fault diagnosis sorted search method for encrypted cloud data in a cloud network |
CN112639787A (en) * | 2018-07-16 | 2021-04-09 | 北京航迹科技有限公司 | Multiple file anomaly detection based on violation counting |
CN112639787B (en) * | 2018-07-16 | 2021-11-02 | 北京航迹科技有限公司 | System, method and computer readable medium for protecting sensitive data |
CN109327448A (en) * | 2018-10-25 | 2019-02-12 | 深圳技术大学(筹) | A cloud file sharing method, apparatus, device and storage medium |
CN110069604A (en) * | 2019-04-23 | 2019-07-30 | 北京字节跳动网络技术有限公司 | Text search method, apparatus and computer readable storage medium |
CN110727835A (en) * | 2019-10-17 | 2020-01-24 | 浙江中智达科技有限公司 | Data query method, device and system |
CN110727835B (en) * | 2019-10-17 | 2021-03-12 | 浙江中智达科技有限公司 | Data query method, device and system |
CN112256839A (en) * | 2020-11-11 | 2021-01-22 | 深圳技术大学 | A ciphertext search method, device, system and computer-readable storage medium |
WO2022099496A1 (en) * | 2020-11-11 | 2022-05-19 | 深圳技术大学 | Ciphertext search method, apparatus and system, and computer-readable storage medium |
CN112256839B (en) * | 2020-11-11 | 2023-07-07 | 深圳技术大学 | Ciphertext search method, ciphertext search device, ciphertext search system and computer-readable storage medium |
CN115098649A (en) * | 2022-08-25 | 2022-09-23 | 北京融数联智科技有限公司 | Keyword search method and system based on double-key oblivious pseudorandom function |
CN119299239A (en) * | 2024-12-13 | 2025-01-10 | 云南省地矿测绘院有限公司 | Data encryption uploading method applied to cloud platform |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104036050A (en) | Complex query method for encrypted cloud data | |
Ge et al. | Towards achieving keyword search over dynamic encrypted cloud data with symmetric-key based verification | |
CN108334612B (en) | Full-text fuzzy retrieval method for similar-shaped Chinese characters in the ciphertext domain | |
CN106127075B (en) | A searchable encryption method based on privacy protection in cloud storage environment | |
CN106776904B (en) | Fuzzy query encryption method supporting dynamic authentication in an untrusted cloud computing environment | |
CN104038349B (en) | Effective and verifiable public key searching encryption method based on KP-ABE | |
CN112765650A (en) | Attribute-based searchable encryption block chain medical data sharing method | |
CN109493017B (en) | Trusted outsourcing storage method based on block chain | |
CN109361644B (en) | Fuzzy attribute based encryption method supporting rapid search and decryption | |
CN105681280A (en) | Searchable encryption method based on Chinese in cloud environment | |
CN104023051A (en) | Multi-user multi-keyword searchable encryption method in cloud storage | |
CN104022866A (en) | Searchable encryption method for multi-user cipher text keyword in cloud storage | |
CN104780161A (en) | Searchable encryption method supporting multiple users in cloud storage | |
US20170262546A1 (en) | Key search token for encrypted data | |
US10733317B2 (en) | Searchable encryption processing system | |
CN108171066A (en) | Privacy-preserving cross-domain keyword search method and system in a medical cloud | |
Dowsley et al. | A survey on design and implementation of protected searchable data in the cloud | |
CN106874516A (en) | Efficient ciphertext retrieval method based on KCB trees and Bloom filters in cloud storage | |
EP4235473A2 (en) | Encrypted search with a public key | |
CN102143159A (en) | Database key management method in DAS (database-as-a-service) model | |
CN103970889A (en) | Security cloud disc for Chinese and English keyword fuzzy search | |
CN104899517A (en) | Phrase-based searchable symmetric encryption method | |
Peng et al. | LS-RQ: A lightweight and forward-secure range query on geographically encrypted data | |
CN107094075A (en) | Data block dynamic operation method based on convergent encryption | |
CN108650268B (en) | A searchable encryption method and system for realizing multi-level access | |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20140910 |