Efficient Association Rules Hiding Using Genetic Algorithms
<p>The relationship between <span class="html-italic">X</span> and <span class="html-italic">Y</span>. <math display="inline"><semantics> <mrow> <msup> <mi mathvariant="script">N</mi> <mi>k</mi> </msup> </mrow> </semantics></math> denotes the positive integer set with dimension <math display="inline"><semantics> <mrow> <mi>k</mi> </mrow> </semantics></math>, and <math display="inline"><semantics> <mrow> <msup> <mi mathvariant="script">N</mi> <mi>δ</mi> </msup> </mrow> </semantics></math> denotes the positive integer set with dimension <span class="html-italic">δ</span>.</p> "> Figure 2
<p>Execution times difference between the decrease the confidence rule (DCR) and efficient association rules hiding using a genetic algorithm (EARH-GA) on (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; and (<b>c</b>) Mobile dataset.</p> "> Figure 3
<p>Utility difference between the DCR and EARH-GA is visualized with respect to each dataset: (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; (<b>c</b>) and Mobile dataset.</p> "> Figure 4
<p>Accuracy difference between the DCR and EARH-GA visualized with respect to each dataset: (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; (<b>c</b>) and Mobile dataset.</p> "> Figure 4 Cont.
<p>Accuracy difference between the DCR and EARH-GA visualized with respect to each dataset: (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; (<b>c</b>) and Mobile dataset.</p> "> Figure 5
<p>Lost rules ratio difference between the DCR and EARH-GA is visualized with respect to each dataset: (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; (c) and Mobile dataset.</p> "> Figure 5 Cont.
<p>Lost rules ratio difference between the DCR and EARH-GA is visualized with respect to each dataset: (<b>a</b>) Mushroom dataset; (<b>b</b>) Chess dataset; (c) and Mobile dataset.</p> ">
Abstract
:1. Introduction
2. Related Work
3. Methodology
3.1. Terms and Symbols
3.2. Problem Formulation
3.3. EARH-GA
Algorithm 1. Optimizing association rule hiding using a genetic algorithm. |
Input: Database (DB), association rules (ARs), sensitive association rules (SARs), SARcounter = 0, minimum support (MS), minimum confidence (MC) Output: DB′ Start
|
Algorithm 2. The adopted genetic algorithm pseudocode. |
Input: Database (DB′), association rules (ARs), sensitive association rule (SAR), nonsensitive association rules (NSARs), candidate transactions (CTs), victimItem (VI), MS, MC Output: Global best individual Start
|
3.4. Tracing Example
4. Dataset Description
5. Experiments and Analysis
5.1. Experimental Setup
5.2. Evaluation Measures
5.3. Performance Analysis
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Rehman, S.; Sharma, A. Privacy-Preserving Data Mining Using Association Rule Based on Apriori Algorithm. Int. Adv. Inform. Comput. Res. 2017, 712, 218–226. [Google Scholar]
- Srivastava, N.; Gupta, K.; Baliyan, N. Improved Market Basket Analysis with Utility Mining. In Proceedings of the 3rd International Conference on Internet of Things and Connected Technologies (ICIoTCT), Jaipur, India, 26–27 March 2018. [Google Scholar]
- Liu, L.; Yu, S.; Wei, X.; Ning, Z. An improved Apriori-based algorithm for friends recommendation in a microblog. Int. J. Comput. Syst. 2018, 31, e3453. [Google Scholar] [CrossRef]
- Hareendran, S.A.; Chandra, S.V. Association Rule Mining in Healthcare Analytics. In Data Mining and Big Data; Springer: Cham, Switzerland, 2017; pp. 31–39. [Google Scholar]
- Ning, C.P.; Ji, X.; Wang, H.Q.; Du, X.Y.; Niu, H.T.; Fang, S.B. Association between the sonographer’s experience and diagnostic performance of IOTA simple rules. World J. Surg. Oncol. 2018, 16, 179. [Google Scholar] [CrossRef] [PubMed]
- Petrova, E.; Pauwels, P.; Svidt, K.; Jensen, R.L. In search of sustainable design patterns: Combining data mining and semantic data modeling on disparate building data. In Advances in Informatics and Computing in Civil and Construction Engineering; Springer: Cham, Switzerland, 2019; pp. 19–26. [Google Scholar]
- Rekik, R.; Kallel, I.; Casillas, J.; Alimi, A.M. Assessing web sites quality: A systematic literature review by text and association rules mining. Int. J. Inf. Manag. 2018, 38, 201–216. [Google Scholar] [CrossRef]
- Jabri, S.; Dahbi, A.; Gadi, T.; Bassir, A. Ranking of text documents using TF-IDF weighting and association rules mining. In Proceedings of the 2018 4th International Conference on Optimization and Applications (ICOA), Mohammedia, Morocco, 26–27 April 2018; pp. 1–6. [Google Scholar]
- Divanis, A.G.; Verykios, V. An integer programming approach for frequent itemset hiding. In Proceedings of the 15th ACM International Conference on Information and Knowledge Management, Arlington, VA, USA, 5–11 November 2006; pp. 5–11. [Google Scholar]
- Agrawal, R.; Imielinski, T.; Swami, A. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD international conference on Management of data, Washington, DC, USA, 25–28 May 1993; pp. 207–216. [Google Scholar]
- Yogendra, J.K.; Vinod, K.Y.; Geetika, S.P. An Efficient Association Rule Hiding Algorithm for Privacy Preserving Data Mining. Int. J. Comput. Sci. Eng. 2011, 3, 2792–2798. [Google Scholar]
- Sui, P.; Li, X. A privacy-preserving approach for multimodal transaction data integrated analysis. Neurocomputing 2017, 253, 56–64. [Google Scholar] [CrossRef]
- Farkas, C.; Jajodia, S. The inference problem: A survey. ACM SIGKDD Explor. Newslett. 2001, 4, 6–11. Available online: http://www.utdallas.edu/~muratk/courses/course_material/p6-farkas.pdf (accessed on 27 June 2018). [CrossRef]
- Aqra, I.; Herawan, T.; Ghani, N.A.; Akhunzada, A.; Ali, A.; Razali, R.B.; Choo, K.K.R. A novel association rule mining approach using TID intermediate itemset. PLoS ONE 2018, 13, e0196957. [Google Scholar]
- Mohan, S.V.; Angamuthu, T. Association Rule Hiding in Privacy-Preserving Data Mining. Int. J. Inform. Secur. Priv. 2018, 12, 141–163. [Google Scholar] [CrossRef]
- Moustakides, G.V.; Vassilios, V.S. A MaxMin Approach for Hiding Frequent Itemsets. In Proceedings of the Sixth IEEE International Conference on Data Mining—Workshops, Hong Kong, China, 18–22 December 2006. [Google Scholar]
- Sun, X.; Philip, S.Y. Hiding Sensitive Frequent Itemsets by a Border-Based Approach. J. Comput. Sci. Eng. 2007, 1, 74–94. [Google Scholar] [CrossRef] [Green Version]
- Telikani, A.; Shahbahrami, A. Optimizing association rule hiding using combination of border and heuristic approaches. Appl. Intell. 2017, 47, 544–557. [Google Scholar] [CrossRef]
- Le, H.Q.; Arch-Int, S. A Conceptual Framework for Privacy Preserving of Association Rule Mining in E-Commerce. In Proceedings of the 2012 7th IEEE Conference on Industrial Electronics and Applications (ICIEA), Singapore, 18–20 July 2012. [Google Scholar]
- Le, H.Q.; Arch-Int, S.; Nguyen, X.Y.; Arch-Int, N. Association Rule Hiding in Risk Management for Retail Supply Chain Collaboration. Comput. Ind. 2013, 64, 776–784. [Google Scholar] [CrossRef]
- Le, H.Q.; Arch-Int, S.; Arch, N. Association Rule Hiding Based on Distance and Intersection Lattice. Math. Probl. Eng. 2013, 2013, 210405. [Google Scholar]
- Lin, C.W.; Zhang, B.; Yang, K.T.; Hong, T.P. Efficiently hiding sensitive itemsets with transaction deletion based on genetic algorithms. Sci. World J. 2014, 2014, 398269. [Google Scholar] [CrossRef] [PubMed]
- Lin, C.W.; Hong, T.P.; Yang, K.T.; Wang, S.L. The GA-based algorithms for optimizing hiding sensitive itemsets through transaction deletion. Appl. Intell. 2015, 42, 210–230. [Google Scholar] [CrossRef]
- Lin, C.W.; Hong, T.P.; Wang, J.P.; Lan, G.C. A GA-Based Approach to Hide Sensitive High Utility Itemsets. Sci. Word J. 2014, 2014, 804629. [Google Scholar] [CrossRef] [PubMed]
- Hong, T.P.; Lin, C.W.; Chang, C.C.; Wang, S.L. Hiding sensitive item sets by inserting dummy transactions. In Proceedings of the 2011 IEEE International Conference on Granular Computing, Kaohsiung, Taiwan, 8–10 November 2011; pp. 246–249. [Google Scholar]
- Oliveira, S.R.M.; Osmar, R.Z. A Unified Framework for Protecting Sensitive Association Rules in Business Collaboration. Int. J. Bus. Intell. Data Min. 2006, 1, 247. [Google Scholar] [CrossRef]
- Verykios, V.S.; Elmagarmid, A.K.; Bertino, E. Association Rule Hiding. IEEE Trans. Knowl. Data Eng. 2014, 16, 434–447. [Google Scholar] [CrossRef]
- Dhyanendra, J. Hiding Sensitive Association Rules without Altering the Support of Sensitive Item(S). Int. Conf. Comput. Sci. Inf. Technol. 2012, 3, 500–509. [Google Scholar]
- Kalyani, G.; Chandra, S.R. Particle Swarm Intelligence and Impact Factor-Based Privacy Preserving Association Rule Mining for Balancing Data Utility and Knowledge Privacy. Arab. J. Sci. Eng. 2017, 43, 4161–4178. [Google Scholar] [CrossRef]
- Prabha, M.; Sathiya, S.V. Association rule hiding using artificial bee colony algorithm. Int. J. Comput. Appl. 2011, 33, 41–47. [Google Scholar]
- Gupta, M.; Joshi, R.C. Privacy-Preserving Fuzzy Association Rules Hiding in Quantitative Data. Int. J. Comput. Theory Eng. 2009, 1, 1793–8201. [Google Scholar] [CrossRef]
- Ji, C.R.; Deng, Z.H. Mining Frequent Ordered Patterns without Candidate Generation. In Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007), Haikou, China, 24–27 August 2007; pp. 402–406. [Google Scholar]
- Zaki, M.K.; Srinivasan, P.; Ogihara, M.; Li, W. New Algorithms for Fast Discovery of Association Rules. In Proceedings of the 3rd International Conference on Knowledge Discovery and Data Mining, Newport Beach, CA, USA, 14–17 August 1997; pp. 283–286. [Google Scholar]
- Bayardo, R. Efficiently Mining Long Patterns from Databases. 1998, pp. 85–89. Available online: https://archive.ics.uci.edu/ml/datasets.html (accessed on 5 July 2018).
Term | Meaning |
---|---|
AR | Set of association rules |
SAR | Set of sensitive rules |
NSAR | Set of nonsensitive rules |
MC or | Minimum confidence threshold for hiding a sensitive rule |
MS or | Minimum support threshold for hiding a sensitive rule |
δ | Number of transactions to be altered with the object to hide the sensitive rule |
VI | Set of victim items |
ID | Items |
---|---|
1 | {l, m, n, r} |
2 | {m, o, p, q} |
3 | {l, m, n, t} |
4 | {l, o, p, q} |
5 | {l, m, n, t, r} |
6 | {m, p, r} |
7 | {l, m, n, o, p, q} |
8 | {n, o, q, t} |
Association Rule | Confidence |
---|---|
{l} ⇒ {m} | 80% |
{l} ⇒ {n} | 80% |
{n} ⇒ {l} | 80% |
{n} ⇒ {m} | 80% |
{p} ⇒ {m} | 75% |
{r} ⇒ {m} | 100% |
{t} ⇒ {n} | 100% |
{o} ⇒ {p} | 75% |
{p} ⇒ {o} | 75% |
{o} ⇒ {q} | 75% |
{q} ⇒ {o} | 75% |
{p} ⇒ {q} | 75% |
{q} ⇒ {p} | 75% |
{l} ⇒ {m, n} | 80% |
{n} ⇒ {l, m} | 80% |
{l, m} ⇒ {n} | 100% |
{l, n} ⇒ {m} | 100% |
{m, n} ⇒ {l} | 100% |
{o} ⇒ {p, q} | 75% |
{p} ⇒ {o, q} | 75% |
{q} ⇒ {o, p} | 75% |
{o, p} ⇒ {q} | 100% |
{o, q} ⇒ {p} | 100% |
{p, q} ⇒ {o} | 100% |
Association Rule | Confidence |
---|---|
{l} ⇒ {m, n} | 40% |
{o, p} ⇒ {q} | 100% |
ID | Items |
---|---|
1 | {l, m, r} |
2 | {m, o, p, q} |
3 | {l, m, t} |
4 | {l, o, p, q} |
5 | {l, m, n, t, r} |
6 | {m, p, r} |
7 | {l, m, n, o, p, q} |
8 | {n, o, q, t} |
Association Rule | Confidence |
---|---|
{l} ⇒ {m} | 80% |
{l} ⇒ {n} | 40% |
{n} ⇒ {l} | 66.66% |
{n} ⇒ {m} | 66.66% |
{p} ⇒ {m} | 75% |
{r} ⇒ {m} | 100% |
{t} ⇒ {n} | 66.66% |
{o} ⇒ {p} | 75% |
{p} ⇒ {o} | 75% |
{o} ⇒ {q} | 75% |
{q} ⇒ {o} | 75% |
{p} ⇒ {q} | 75% |
{q} ⇒ {p} | 75% |
{n} ⇒ {l, m} | 66.66% |
{l, m} ⇒ {n} | 50% |
{l, n} ⇒ {m} | 100% |
{m, n} ⇒ {l} | 100% |
{o} ⇒ {p, q} | 75% |
{p} ⇒ {o, q} | 75% |
{q} ⇒ {o, p} | 75% |
{o, p} ⇒ {q} | 100% |
{p, q} ⇒ {o} | 100% |
Dataset | No. of Items | No. of Transactions | Transaction Length |
---|---|---|---|
Chess | 75 | 3196 | 37 |
Mushroom | 119 | 8124 | 23 |
Mobile | 5250 | 5000 | 16 |
Dataset | Minimum Confidence | No. of Association Rules | No. of Sensitive Rules |
---|---|---|---|
94% | 7177 | 72 | |
Chess | 95% | 6145 | 62 |
96% | 4435 | 45 | |
97% | 4081 | 41 | |
45% | 4949 | 50 | |
Mushroom | 50% | 4402 | 45 |
55% | 3937 | 40 | |
60% | 3535 | 36 | |
40% | 2092 | 20 | |
50% | 1984 | 19 | |
Mobile | 60% | 1801 | 18 |
70% | 1580 | 15 | |
80% | 1333 | 13 | |
90% | 1246 | 12 |
Improvement | Mushroom | Chess | Mobile | Max |
---|---|---|---|---|
Time | 81% | 70% | 48% | 81% |
Lost rules ratio | 36% | 79% | 21% | 79% |
Accuracy | 5% | 5% | 3% | 5% |
Utility | 2% | `2% | 23% | 23% |
© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Khuda Bux, N.; Lu, M.; Wang, J.; Hussain, S.; Aljeroudi, Y. Efficient Association Rules Hiding Using Genetic Algorithms. Symmetry 2018, 10, 576. https://doi.org/10.3390/sym10110576
Khuda Bux N, Lu M, Wang J, Hussain S, Aljeroudi Y. Efficient Association Rules Hiding Using Genetic Algorithms. Symmetry. 2018; 10(11):576. https://doi.org/10.3390/sym10110576
Chicago/Turabian StyleKhuda Bux, Naadiya, Mingming Lu, Jianxin Wang, Saajid Hussain, and Yazan Aljeroudi. 2018. "Efficient Association Rules Hiding Using Genetic Algorithms" Symmetry 10, no. 11: 576. https://doi.org/10.3390/sym10110576
APA StyleKhuda Bux, N., Lu, M., Wang, J., Hussain, S., & Aljeroudi, Y. (2018). Efficient Association Rules Hiding Using Genetic Algorithms. Symmetry, 10(11), 576. https://doi.org/10.3390/sym10110576