FIU-Miner (a fast, integrated, and user-friendly system for data mining) and its applications

Published: 01 August 2017

Abstract

The advent of the Big Data era drives data analysts from different domains to use data mining techniques for data analysis. However, performing data analysis in a specific domain is not trivial; it often requires complex task configuration, onerous integration of algorithms, and efficient execution in distributed environments. Little effort has been devoted to developing effective tools that help data analysts conduct complex data analysis tasks. In this paper, we design and implement FIU-Miner, a Fast, Integrated, and User-friendly system to ease data analysis. FIU-Miner allows users to rapidly configure a complex data analysis task without writing a single line of code. It also helps users conveniently import and integrate different analysis programs. Further, it balances resource utilization and task execution in heterogeneous environments. Case studies of real-world applications demonstrate the effectiveness of the proposed system.

Cited By

  • (2022) Towards More Clean Results in Data Visualization: A Weka Usability Experiment. Design, User Experience, and Usability: UX Research, Design, and Assessment, pp 389-400. doi:10.1007/978-3-031-05897-4_27. Online publication date: 26-Jun-2022
  • (2021) LogGAN: a Log-level Generative Adversarial Network for Anomaly Detection using Permutation Event Modeling. Information Systems Frontiers 23(2):285-298. doi:10.1007/s10796-020-10026-3. Online publication date: 1-Apr-2021
  • (2019) S3Mining. Computer Standards & Interfaces 65(C):143-158. doi:10.1016/j.csi.2019.03.004. Online publication date: 1-Jul-2019
Published In

Knowledge and Information Systems, Volume 52, Issue 2 (August 2017), 267 pages

Publisher

Springer-Verlag, Berlin, Heidelberg
