ServeDB Secure, Verifiable, and Efficient Range Queries on Outsourced Database
4 minute read ∼ Filed in : A paper noteIntroduction
Current Problems & Challenges
Existing work proposes some schema to support queries on encrypted data, and return encrypted results
-
Most of them have limited functionality such as keyword search and single-dimensional range query.
They extended SSE to support one-dimensional queries by transforming data and queries into SSE codes and using the code to enable query processing.
Directly expends this method by querying on each dimension separately is not only time-consuming but also reveals more matched records than desired.
-
Most of them only assume cloud server is semi-honest
-
Not many works focus on addressing multiple dimensional range queries.
Some challenges
- Order preserving encryption can be used to encrypt Index structure in order to support multiple-dimensional range queries, But the order and distribution information of data records will be revealed due to the weak security notion of OPE
- Searchable symmetric encryption supports keyword search,
Solutions
-
Propose a secure and scalable schema that supports multi-dimensional range queries. And provides the following:
-
Privacy: server cannot learn the contents and data record the query touches.
-
Efficiency: achieve sublinear query time.
-
Verifiability: the schema allow the user to verify the correctness and completeness of the query results.
-
-
Use hierarchical cubes to encode the data and build a tree-structure index on top of the data encoded by cubes to support efficient query processing.
-
Add verification information in the tree structure without introducing a new structure.
-
Experiments show efficiency and scalability.
PROBLEM STATEMENT AND RELATED WORK
System Model
Roles
Data owner, Data user, Cloud Server.
Data Owner: Encrypts each record in dataset D into E and builds a secure index T on top of it. And then the owner can upload data to the cloud.
Data users:
- Apply secure keys from the data owner, encrypts the query, and send it to the server.
- Verify correctness and completeness of results.
Cloud Server:
- store encrypted data, index.
- Run encrypted query.
- can be malicious.
Related Work
- CryptDB use OPE, but leak the order of data.
- Mulit-dimensional tree structures like kd-trees and R-trees. Used to achieve sub-linear multidimensional range privacy.
- Merkle Trees and accumulation trees are used in verfication
SCHEME OVERVIEW
Store
Transform data ranges into discrete cube codes C. Range query is also encoded as Q.
Compare data records and query range by comparing the cube codes. C and Q
The comparison of C and Q is based on the index tree. But B-Tree and KD-tree dony consider security. So the paper introduces SVETree.
- Each non-leaf node stores a bloom filter.
- Each leaf-node stores an encrypted data record.
Query Processing.
Given a query Q, we can use the Bloom filter to check whether each node contains codes falling in the range.
-
Encode prepare range query Q into C using same CCS (encoding system in paper)
-
Fo reaches cube g_j in C, calculate r hash values T_j, And all cube’s hash values can form a matrix M = [T1…Tf ]
T_j is used when checking with the bloom filter stored in each node.
-
searching each non-leaf node and checking if there are some matches in this sub-branch tree.
-
After getting the result from the server, the user should verify it.
- Data correctness:
- If in the range: decrypt the result and check if each data is in range.
- If the data is from the data owner: calculate the hash value of the root, and check if it is equal to returned hash root value. ( Similar to Merkle Tree )
- Completeness: data users can use the query results R and proof information π to reproduce a correct query process.
- Data correctness:
Experiments
Setup
- Twitter dataset, 5 million tweets
- Compare with
- PBtree: only supported secure range query on single-dimensional data
- R-tree
- Scan searched each dimension based on ciphertext respectively and intersected the result sets on each dimension.
- Measure delay of querying and verifying.