Abstract
Support vector machines (SVMs) are supervised learning models traditionally employed for classification and regression analysis. In classification analysis, a set of training data is chosen, and each instance in the training data is assigned a categorical class. An SVM then constructs a model based on a separating plane that maximizes the margin between different classes. Although the SVM is one of the most popular classification models because of its strong empirical performance, understanding the knowledge captured in an SVM remains difficult. SVMs are typically applied in a black-box manner in which the details of parameter tuning, training, and even the final constructed model are hidden from users. This is natural, since these details are often complex and difficult to understand without proper visualization tools. However, such an approach often leads to problems, including trial-and-error tuning and suspicious users who are forced to trust these models blindly.
The contribution of this paper is a visual analysis approach for building SVMs in an open-box manner. Our goal is to improve an analyst’s understanding of the SVM modeling process through a suite of visualization techniques that allow users to have full interactive visual control over the entire SVM training process. Our visual exploration tools have been developed to enable intuitive parameter tuning, training data manipulation, and rule extraction as part of the SVM training process. To demonstrate the efficacy of our approach, we conduct a case study using a real-world robot control dataset.
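As a point of reference for the margin-maximizing separating plane described in the abstract, the standard soft-margin SVM formulation can be sketched as follows. The notation is assumed here for illustration and is not taken from the abstract: training instances \mathbf{x}_i \in \mathbb{R}^d with class labels y_i \in \{-1, +1\}, weight vector \mathbf{w}, bias b, slack variables \xi_i, and a penalty parameter C > 0.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
  y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0,
  \qquad i = 1, \dots, n .
```

Under this notation the separating plane is \mathbf{w}^{\top}\mathbf{x} + b = 0 and the margin width is 2 / \lVert \mathbf{w} \rVert, so minimizing \lVert \mathbf{w} \rVert^{2} maximizes the margin, while C trades margin width against training errors. C is one example of the kind of parameter whose tuning is typically hidden in black-box use.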
Acknowledgements
This work was supported in part by the National Basic Research Program of China (973 Program, No. 2015CB352503), the Major Program of National Natural Science Foundation of China (No. 61232012), and the National Natural Science Foundation of China (No. 61422211).
Additional information
This article is published with open access at Springerlink.com
Yuxin Ma is a Ph.D. student in the State Key Lab of CAD&CG, Zhejiang University. His current research focuses on information visualization, visual analytics, and visual data mining.
Wei Chen is a professor in the State Key Lab of CAD&CG, Zhejiang University. He has published more than 60 papers in international journals and conferences. He has served as a steering committee member of IEEE Pacific Visualization, conference chair of IEEE Pacific Visualization 2015, and paper co-chair of IEEE Pacific Visualization 2013. For more information, please refer to http://www.cad.zju.edu.cn/home/chenwei/.
Xiaohong Ma is a master's student in the State Key Lab of CAD&CG, Zhejiang University. Her current research focus is information visualization.
Jiayi Xu is a Ph.D. candidate in the Department of Computer Science and Engineering, Ohio State University. He received his B.E. degree from the School of Computer Science and Technology, Zhejiang University. His research interest is information visualization.
Xinxin Huang is a master's student in the State Key Lab of CAD&CG, Zhejiang University. Her current research focuses on information visualization and visual analytics, especially visual analytics of sports data.
Ross Maciejewski is an assistant professor at Arizona State University (ASU). His recent work has explored the extraction and linking of disparate data sources, combining structured geographic data with unstructured social media data to enhance situational awareness. His primary research interests are in the areas of geographical visualization and visual analytics, focusing on public health, dietary analysis, social media, and criminal incident reports. He is a fellow of the Global Security Initiative at ASU and the recipient of an NSF CAREER Award (2014).
Anthony K. H. Tung received his B.S. (second class honor) and M.S. degrees in computer science from the National University of Singapore (NUS), in 1997 and 1998, respectively, and Ph.D. degree in computer science from Simon Fraser University in 2001. He is currently an associate professor in the Department of Computer Science, NUS. His research interests include various aspects of databases and data mining (KDD) including buffer management, frequent pattern discovery, spatial clustering, outlier detection, and classification analysis.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.
Rights and permissions
Open Access The articles published in this journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Ma, Y., Chen, W., Ma, X. et al. EasySVM: A visual analysis approach for open-box support vector machines. Comp. Visual Media 3, 161–175 (2017). https://doi.org/10.1007/s41095-017-0077-5
DOI: https://doi.org/10.1007/s41095-017-0077-5