Abstract
Cloud storage is well suited to outsourcing big data because the cloud can store large amounts of data. However, it also raises problems of data duplication, fine-grained access control, and privacy, all of which are crucial for large-scale cloud data storage. Existing deduplication approaches based on encrypted data schemes do not provide fine-grained access control. This paper proposes a secure framework for managing data using Rapid Asymmetric Maximum (RAM) based Dynamic Size Chunking (DSC) and fuzzy logic for deduplication. The proposed method has four main processes: chunking, fingerprinting, hashing, and writing. First, files are split into chunks using RAM-based DSC. The chunks are then fingerprinted by a hashing process to ensure data authentication. A B-tree indexing approach keeps the fingerprints organized. General type-2 fuzzy logic with Ant Lion Optimization (ALO) is used to detect duplicate files among the documents, and only non-duplicate documents are kept in the cloud storage platform. The Triple Data Encryption Standard (3DES) is applied for security before non-duplicate data is outsourced to a third-party cloud server. The total computation time of the proposed technique is 0.4 s in the inline phase and 0.04 s in the offline phase, and the deduplication ratio is 95% in the inline phase and 90% in the offline phase. The proposed deduplication approach requires less storage, which reduces memory use and processing time.
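The chunk, fingerprint, index, and write steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The cut-point rule is a simplified asymmetric-maximum-style rule: a fixed window at the start of each chunk, then a cut at the first later byte that is at least as large as the window maximum. The window size, the hard size cap, the use of SHA-256, and the plain dictionary standing in for the B-tree index are all assumptions made for illustration, and duplicates are found by exact fingerprint match rather than by the fuzzy-logic detector the paper proposes.

```python
import hashlib


def ram_chunks(data: bytes, window: int = 16, max_size: int = 256) -> list[bytes]:
    """Split data into variable-size chunks (simplified, hypothetical RAM-style rule)."""
    chunks = []
    start = 0
    n = len(data)
    while start < n:
        # Largest byte value inside the fixed window at the start of the chunk.
        window_end = min(start + window, n)
        max_val = max(data[start:window_end])
        # Default cut: a hard cap on chunk size (an assumption, not from the paper).
        cut = min(start + max_size, n)
        # Scan past the window; cut right after the first byte >= the window maximum.
        for i in range(window_end, cut):
            if data[i] >= max_val:
                cut = i + 1
                break
        chunks.append(data[start:cut])
        start = cut
    return chunks


def deduplicate(files: dict[str, bytes]):
    """Store each distinct chunk once, keyed by its SHA-256 fingerprint."""
    index = {}    # fingerprint -> chunk; a dict stands in for the B-tree index
    recipes = {}  # file name -> ordered fingerprints needed to rebuild the file
    total = 0     # number of chunks seen before deduplication
    for name, data in files.items():
        recipe = []
        for chunk in ram_chunks(data):
            fp = hashlib.sha256(chunk).hexdigest()
            index.setdefault(fp, chunk)  # written only if not already stored
            recipe.append(fp)
            total += 1
        recipes[name] = recipe
    return index, recipes, total


def restore(name: str, index: dict, recipes: dict) -> bytes:
    """Rebuild a file from its recipe of fingerprints."""
    return b"".join(index[fp] for fp in recipes[name])


# Hypothetical sample data: two identical files and one different file.
sample = bytes(range(256)) * 4
files = {"a.bin": sample, "b.bin": sample, "c.bin": b"cloud storage " * 50}
index, recipes, total = deduplicate(files)
print(f"{total} chunks seen, {len(index)} distinct chunks stored")
```

Because cut points depend on byte values rather than fixed offsets, identical content tends to produce identical chunks, which is what lets the fingerprint index skip writing duplicates.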
Data availability statement
All data, models, and code generated or used during the study appear in the submitted article; no data needs to be specifically requested.
Code availability
No code is available for this manuscript.
Funding
No funding was provided to prepare the manuscript.
Ethics declarations
Conflict of interest
The writing process and the content of the article do not give grounds for raising the issue of a conflict of interest.
Ethical approval
This article does not contain any studies with human participants or animals performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Consent to participate
I have read and I understand the provided information.
Consent to publish
This article does not contain any image or video that requires permission.
About this article
Cite this article
Rajkumar, K., Hariharan, U., Dhanakoti, V. et al. A secure framework for managing data in cloud storage using rapid asymmetric maximum based dynamic size chunking and fuzzy logic for deduplication. Wireless Netw 30, 321–334 (2024). https://doi.org/10.1007/s11276-023-03448-9
DOI: https://doi.org/10.1007/s11276-023-03448-9