Abstract
In this paper, we apply software engineering principles to model development and propose a K-Means clustering architecture for the multi-stage data mining process. We develop a modified architecture and expand it by showing refinements at every step of the clustering and knowledge discovery stages. We use the hierarchical clustering model to partition the data into smaller groups of attributes, so that the data structure can be determined before the data mining tools are applied. The experiment shows that the clustering-based model yields isolated but important association rules from the clustered data, which in turn can be explained in practical terms for decision-making purposes. Shorter processing times were observed when computing on the smaller clusters, implying faster processing than when dealing with the entire dataset.
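The cluster-then-mine idea in the abstract can be illustrated with a minimal Python sketch: partition the records with k-means, then look for association rules inside each cluster rather than over the whole dataset. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names (`dist2`, `kmeans`, `pair_rules`, `mine_per_cluster`), the toy basket data, the farthest-first initialization, the restriction to single-antecedent rules, and the support and confidence thresholds. The sketch clusters records; the abstract does not specify the exact partitioning procedure.

```python
from itertools import permutations

# Hypothetical toy data: each row is a basket, each column an item (1 = present).
ITEMS = ["bread", "butter", "pen", "paper"]
ROWS = [
    (1, 1, 0, 0),
    (1, 1, 1, 0),
    (1, 1, 0, 0),
    (0, 0, 1, 1),
    (0, 1, 1, 1),
    (0, 0, 1, 1),
]


def dist2(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, max_iter=50):
    """Plain Lloyd's k-means; returns one cluster label per point.

    Assumption: deterministic farthest-first initialization, chosen only
    to keep this sketch reproducible.
    """
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(far))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def pair_rules(rows, items, min_support=0.6, min_confidence=0.8):
    """Single-antecedent rules A -> B as (A, B, support, confidence) tuples."""
    n = len(rows)
    if n == 0:
        return []
    rules = []
    for a, b in permutations(range(len(items)), 2):
        count_a = sum(r[a] for r in rows)
        count_ab = sum(1 for r in rows if r[a] and r[b])
        if count_a == 0:
            continue
        support = count_ab / n
        confidence = count_ab / count_a
        if support >= min_support and confidence >= min_confidence:
            rules.append((items[a], items[b], support, confidence))
    return rules


def mine_per_cluster(rows, items, k, min_support=0.6, min_confidence=0.8):
    """Cluster the rows first, then mine rules inside each cluster."""
    labels = kmeans(rows, k)
    return {
        c: pair_rules(
            [r for r, lab in zip(rows, labels) if lab == c],
            items,
            min_support,
            min_confidence,
        )
        for c in sorted(set(labels))
    }


if __name__ == "__main__":
    print("whole dataset:", pair_rules(ROWS, ITEMS))
    for cluster, rules in mine_per_cluster(ROWS, ITEMS, k=2).items():
        print("cluster", cluster, rules)
```

On this toy data, no rule reaches the 0.6 support threshold over the whole dataset, while each of the two clusters yields its own rules (bread and butter in one, pen and paper in the other). Each per-cluster pass also scans fewer rows than a pass over the full dataset.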
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gerardo, B.D., Lee, JW., Choi, YS., Lee, M. (2005). The K-Means Clustering Architecture in the Multi-stage Data Mining Process. In: Gervasi, O., et al. Computational Science and Its Applications – ICCSA 2005. ICCSA 2005. Lecture Notes in Computer Science, vol 3481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424826_8
DOI: https://doi.org/10.1007/11424826_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25861-2
Online ISBN: 978-3-540-32044-9
eBook Packages: Computer Science, Computer Science (R0)