Abstract
Recently, negative databases (NDBs) are proposed for privacy protection. Similar to the traditional databases, some basic operations could be conducted over the NDBs, such as select, intersection, update, delete and so on. However, both classifying and clustering in negative databases have not yet been studied. Therefore, two algorithms, i.e., a k nearest neighbor (kNN) classification algorithm and a k-means clustering algorithm in NDBs, are proposed in this paper, respectively. The core of these two algorithms is a novelmethod for estimating the Hamming distance between a binary string and an NDB. Experimental results demonstrate that classifying and clustering in NDBs are promising.
Similar content being viewed by others
References
Esponda F. Everything that is not important: negative databases [research frontier]. IEEE Computational Intelligence Magazine, 2008, 3(2): 60–63
Esponda F, Forrest S, Helman P. Enhancing privacy through negative representations of data. Technical Report, DTIC Document. 2004
Esponda F, Forrest S, Helman P. Negative representations of information. International Journal of Information Security, 2009, 8(5): 331–345
Esponda F, Ackley E S, Helman P, Jia H, Forrest S. Protecting data privacy through hard-to-reverse negative databases. International Journal of Information Security, 2007, 6(6): 403–415
Liu R, Luo W, Wang X. A hybrid of the prefix algorithm and the q-hidden algorithm for generating single negative databases. In: Proceedings of 2011 IEEE Symposium on Computational Intelligence in Cyber Security (CICS). 2011, 31–38
Jia H, Moore C, Strain D. Generating hard satisfiable formulas by hiding solutions deceptively. Journal of Artificial Intelligence Research, 2007, 28(1): 107–118
Jia H, Moore C, Strain D. Generating hard satisfiable formulas by hiding solutions deceptively. In: Proceedings of the National Conference on Artificial Intelligence. 2005, 384
Esponda F. Negative surveys. arXiv preprint math.ST/0608176. 2006
Bao Y, Luo W, Zhang X. Estimating positive surveys from negative surveys. Statistics & Probability Letters, 2013, 83(2): 551–558
Xie H, Kulik L, Tanin E. Privacy-aware collection of aggregate spatial data. Data & Knowledge Engineering, 2011, 70(6): 576–595
Horey J, Groat M M, Forrest S, Esponda F. Anonymous data collection in sensor networks. In: Proceedings of the 4th Annual International Conference on Mobile and Ubiquitous Systems: Networking & Services. 2007, 1–8
Dasgupta D, Saha S. Password security through negative filtering. In: Proceedings of 2010 International Conference on Emerging Security Technologies (EST). 2010, 83–89
Dasgupta D, Saha S. A biologically inspired password authentication system. In: Proceedings of the 5th Annual Workshop on Cyber Security and Information Intelligence Research: Cyber Security and Information Intelligence Challenges and Strategies. 2009, 41
Dasgupta D, Azeem R. An investigation of negative authentication systems. In: Proceedings of the 3rd International Conference on Information Warfare and Security. 2008, 117–126
Bringer J, Chabanne H. Negative databases for biometric data. In: Proceedings of the 12th ACM Workshop on Multimedia and Security. 2010, 55–62
Esponda F, Trias E D, Ackley E S, Forrest S. A relational algebra for negative databases. University of New Mexico Technical Report, 2007
Esponda F, Ackley E S, Forrest S, Helman P. Online negative databases. International Journal of Unconventional Computing, 1993, 1(3): 201–220
Esponda F, Ackley E S, Forrest S, Helman P. Online negative databases. In: Proceedings of the 3rd International Conference on Artificial Immune Systems. 2004, 175–188
Selman B, Kautz H, Cohen B. Local search strategies for satisfiability testing. Cliques, Coloring, and Satisfiability: Second DIMACS implementation challenge, 1993, 26: 521–532
Fu Z, Marhajan Y, Malik S. Zchaff sat solver. Technical Report. Department of Electrical Engineering, Princeton University. 2004
Huang Z. Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Mining and Knowledge Discovery, 1998, 2(3): 283–304
Author information
Authors and Affiliations
Corresponding author
Additional information
Ran Liu is currently a postdoctoral fellowship as faculty member in Information Security Department, School of Computer Science, China University of Geosciences (Wuhan). He received his BS degree of computer science from Central China Normal University in 2007. He obtained his PhD of computer science from University of Science and Technology of China in 2013. His research interests mainly focus on negative representation, cloud computing, security in cloud platform and privacy in cloud platform.
Wenjian Luo received the BS and PhD degrees from Department of Computer Science and Technology, University of Science and Technology of China, Hefei, China, in 1998 and 2003. He is presently an associate professor of School of Computer Science and Technology, University of Science and Technology of China. His current research interests include computational intelligence and applications.
Lihua Yue has been a full professor in the School of Computer Science and Technology at University of Science and Technology of China (USTC) since 2001. She received her BS and MS degrees in computer science both from USTC. Her research interests include flash-based databases, spatiotemporal databases, information retrieval, and image processing. She is a committee of Database Society of the China Computer Federation (CCF DBS) and served as a PC member of many conferences.
Rights and permissions
About this article
Cite this article
Liu, R., Luo, W. & Yue, L. Classifying and clustering in negative databases. Front. Comput. Sci. 7, 864–874 (2013). https://doi.org/10.1007/s11704-013-2318-9
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11704-013-2318-9