MeTA: Characterization of Medical Treatments at Different Abstraction Levels

Published: 15 July 2015

Abstract

Physicians and health care organizations routinely collect large amounts of data during patient care. These large, high-dimensional datasets are usually inherently sparse, so analyzing them to uncover interesting hidden knowledge is a challenging task. This article proposes a new data mining framework based on generalized association rules to discover multiple-level correlations among patient data. Specifically, correlations among prescribed examinations, drugs, and patient profiles are discovered and analyzed at different abstraction levels. The rule extraction process is driven by a taxonomy that generalizes examinations and drugs into their corresponding categories. To ease manual inspection of the results, only a worthwhile subset of rules (i.e., nonredundant generalized rules) is considered. Furthermore, rules are classified according to the data features involved (medical treatments or patient profiles) and then explored in a top-down fashion: starting from the small subset of high-level rules, a drill-down is performed to target more specific rules. Experiments performed on a real diabetic patient dataset demonstrate the effectiveness of the proposed approach in discovering interesting rule groups at different abstraction levels.
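
To make the idea of taxonomy-driven generalization concrete, the following is a minimal Python sketch, not the MeTA implementation. The taxonomy, item names, category labels, and patient records are all hypothetical and chosen only for illustration. Each record is extended with the category of every item it contains, so that support and confidence can be computed for rules at the category level as well as at the item level.

```python
# Hypothetical two-level taxonomy: leaf item -> category (illustrative only).
TAXONOMY = {
    "metformin": "DRUG:antidiabetic",
    "insulin": "DRUG:antidiabetic",
    "hba1c_test": "EXAM:blood",
    "glucose_test": "EXAM:blood",
    "fundus_exam": "EXAM:eye",
}


def extend(transaction):
    """Return the transaction plus the category of every item in it."""
    items = set(transaction)
    return frozenset(items | {TAXONOMY[i] for i in items if i in TAXONOMY})


def support(itemset, transactions):
    """Fraction of extended transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule is undefined
    return support(set(antecedent) | set(consequent), transactions) / base


# Toy patient records (made up for this sketch).
RAW = [
    {"metformin", "hba1c_test"},
    {"insulin", "glucose_test"},
    {"metformin", "glucose_test", "fundus_exam"},
    {"insulin", "fundus_exam"},
]
EXTENDED = [extend(t) for t in RAW]

# High-level rule: antidiabetic drug -> blood examination.
high_support = support({"DRUG:antidiabetic", "EXAM:blood"}, EXTENDED)
high_conf = confidence({"DRUG:antidiabetic"}, {"EXAM:blood"}, EXTENDED)

# One leaf-level specialization of the same rule: metformin -> HbA1c test.
low_support = support({"metformin", "hba1c_test"}, EXTENDED)
low_conf = confidence({"metformin"}, {"hba1c_test"}, EXTENDED)

print(high_support, high_conf)  # 0.75 0.75
print(low_support, low_conf)    # 0.25 0.5
```

On this toy data the category-level rule holds in three of four records while its leaf-level specialization holds in only one, which illustrates why generalization can surface correlations that a support threshold would discard at the item level in sparse data. A drill-down then amounts to inspecting the leaf-level specializations of a high-level rule that looks interesting.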

Information

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 6, Issue 4
Regular Papers and Special Section on Intelligent Healthcare Informatics
August 2015
419 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2801030
Editor: Yu Zheng

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 15 July 2015
Accepted: 01 October 2014
Revised: 01 May 2014
Received: 01 November 2013
Published in TIST Volume 6, Issue 4

Author Tags

1. Health care informatics
2. data mining
3. generalized association rule mining

Qualifiers

• Research-article
• Research
• Refereed

Funding Sources

• the GenData2020 project
• the Italian Ministry of Research (MIUR)

Cited By

• (2022) Associative patterns in health data: exploring new techniques. Health and Technology 12:2 (415-431). DOI: 10.1007/s12553-021-00635-6. Online publication date: 21-Jan-2022
• (2019) Developing an Effective Classification Model for Medical Data Analysis. Advanced Classification Techniques for Healthcare Analysis (1-17). DOI: 10.4018/978-1-5225-7796-6.ch001. Online publication date: 2019
• (2018) Analysis of Medical Opinions about the Nonrealization of Autopsies in a Mexican Hospital Using Association Rules and Bayesian Networks. Scientific Programming 2018. DOI: 10.1155/2018/4304017. Online publication date: 13-Feb-2018
• (2018) Probabilistic modeling personalized treatment pathways using electronic health records. Journal of Biomedical Informatics 86 (33-48). DOI: 10.1016/j.jbi.2018.08.004. Online publication date: Oct-2018
• (2017) Frequent Itemsets Mining for Big Data: A Comparative Analysis. Big Data Research 9 (67-83). DOI: 10.1016/j.bdr.2017.06.006. Online publication date: Sep-2017
• (2017) Association Analysis of Medical Opinions About the Non-realization of Autopsies in a Mexican Hospital. New Perspectives on Applied Industrial Tools and Techniques (233-251). DOI: 10.1007/978-3-319-56871-3_12. Online publication date: 17-Jun-2017
• (2016) Analyzing air pollution on the urban environment. 2016 39th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO) (1464-1469). DOI: 10.1109/MIPRO.2016.7522370. Online publication date: May-2016
• (2016) Data mining for better healthcare: A path towards automated data analysis? 2016 IEEE 32nd International Conference on Data Engineering Workshops (ICDEW) (60-63). DOI: 10.1109/ICDEW.2016.7495617. Online publication date: May-2016
• (2015) Digging deep into weighted patient data through multiple-level patterns. Information Sciences: an International Journal 322:C (51-71). DOI: 10.1016/j.ins.2015.06.006. Online publication date: 20-Nov-2015
