Abstract
Mammography (breast X-ray) is considered the most effective, low-cost, and reliable method for the early detection of breast cancer. Although general rules for differentiating benign from malignant breast lesions exist, only 15–30 % of masses referred for surgical biopsy are actually malignant. In this work, we propose a computer-aided classification system for cancer detection from digital mammograms. The proposed system consists of three major steps. The first step is the extraction of a region of interest (ROI) of 256 × 256 pixels. The second step is feature extraction: we used a set of 19 texture features derived from the gray-level co-occurrence matrix (GLCM) and the gray-level run-length matrix (GLRLM). These 19 features could distinguish malignant masses from benign masses with an accuracy of 96.7 %. Further analysis was carried out using only 12 of the 19 features: 5 extracted from the GLCM (Energy, Inertia, Entropy, Maxprob, and Inverse) and 7 extracted from the GLRLM (SRE, LRE, GLN, RLN, LGRE, HGRE, and SRLGE). ARM with these 12 features as predictors could distinguish malignant mass images from benign mass images with an accuracy of 93.6 %. The area under the receiver operating characteristic (ROC) curve was 0.995, which indicates good to very good classification accuracy. Based on these data, we conclude that texture analysis based on GLCM and GLRLM can distinguish malignant images from benign images with considerably good results. The third step is classification: we used a decision tree operating on image content to classify between normal and cancerous masses. The proposed system was shown to have large potential for cancer detection from digital mammograms.
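The 12 selected features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the formulas, so the code assumes the commonly used textbook definitions, reads "Maxprob" as maximum probability and "Inverse" as inverse difference moment, uses a single horizontal direction, and shifts gray levels to start at 1 in the gray-level-weighted run-length features to avoid division by zero. All function names are hypothetical.

```python
import numpy as np

def glcm(image, levels, dx=1, dy=0):
    """Normalized, symmetric co-occurrence matrix for one pixel offset (dx, dy)."""
    img = np.asarray(image)
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(max(0, -dy), min(rows, rows - dy)):
        for c in range(max(0, -dx), min(cols, cols - dx)):
            i, j = img[r, c], img[r + dy, c + dx]
            counts[i, j] += 1
            counts[j, i] += 1  # count each pair in both directions
    return counts / counts.sum()

def glcm_features(p):
    """The five GLCM features named in the abstract (assumed definitions)."""
    i, j = np.indices(p.shape)
    nz = p[p > 0]  # skip empty cells so log2 is defined
    return {
        "Energy": float(np.sum(p ** 2)),
        "Inertia": float(np.sum((i - j) ** 2 * p)),
        "Entropy": float(-np.sum(nz * np.log2(nz))),
        "Maxprob": float(p.max()),
        "Inverse": float(np.sum(p / (1.0 + (i - j) ** 2))),
    }

def glrlm_horizontal(image, levels):
    """Run-length matrix: runs[g, k] counts horizontal runs of gray level g with length k + 1."""
    img = np.asarray(image)
    cols = img.shape[1]
    runs = np.zeros((levels, cols), dtype=float)
    for row in img:
        start = 0
        for c in range(1, cols + 1):
            if c == cols or row[c] != row[start]:
                runs[row[start], c - start - 1] += 1
                start = c
    return runs

def glrlm_features(runs):
    """The seven GLRLM features named in the abstract (assumed definitions)."""
    n_runs = runs.sum()
    length = np.arange(1, runs.shape[1] + 1)          # run lengths 1..N
    gray = np.arange(1, runs.shape[0] + 1)[:, None]   # gray levels shifted to 1..G (assumption)
    return {
        "SRE": float(np.sum(runs / length ** 2) / n_runs),
        "LRE": float(np.sum(runs * length ** 2) / n_runs),
        "GLN": float(np.sum(runs.sum(axis=1) ** 2) / n_runs),
        "RLN": float(np.sum(runs.sum(axis=0) ** 2) / n_runs),
        "LGRE": float(np.sum(runs / gray ** 2) / n_runs),
        "HGRE": float(np.sum(runs * gray ** 2) / n_runs),
        "SRLGE": float(np.sum(runs / (gray ** 2 * length ** 2)) / n_runs),
    }

# Toy 3 x 3 "ROI" with 3 gray levels; a real ROI would be 256 x 256 pixels.
demo_roi = [[0, 0, 1],
            [0, 0, 1],
            [2, 2, 2]]
demo_vector = {**glcm_features(glcm(demo_roi, levels=3)),
               **glrlm_features(glrlm_horizontal(demo_roi, levels=3))}
print(demo_vector)
```

In a full pipeline, one such 12-element vector per ROI would presumably serve as the input to the classification step; averaging over several directions and quantizing the gray levels beforehand are common choices that this sketch leaves out.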
Cite this article
Mohanty, A.K., Senapati, M.R., Beberta, S. et al. Texture-based features for classification of mammograms using decision tree. Neural Comput & Applic 23, 1011–1017 (2013). https://doi.org/10.1007/s00521-012-1025-z
DOI: https://doi.org/10.1007/s00521-012-1025-z