Abstract
In traditional supervised learning, one uses "labeled" data to build a model. However, labeling training data for real-world applications is difficult, expensive, or time-consuming, as it requires the effort of human annotators, sometimes with specific domain expertise and training. There are implicit costs associated with obtaining these labels from domain experts, such as limited time and financial resources. This is especially true for applications that involve a large number of class labels, sometimes with similarities among them. Semi-supervised learning (SSL) addresses this inherent bottleneck by allowing the model to integrate part or all of the available unlabeled data into its supervised learning. The goal is to maximize the learning performance of the model through such newly-labeled examples while minimizing the work required of human annotators. Exploiting unlabeled data to improve learning performance has become a hot topic during the last decade, and research has divided into four main directions: SSL with graphs, SSL with generative models, semi-supervised support vector machines, and SSL by disagreement (SSL with committees). This survey article provides an overview of research advances in this branch of machine learning.
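The common thread across these directions is a loop in which a model trained on the small labeled set assigns labels to unlabeled examples it is confident about, then retrains on the enlarged set. As a purely illustrative sketch (not an algorithm from the chapter), a toy self-training loop with a hypothetical nearest-centroid base learner and a distance-gap confidence threshold might look like:

```python
import math

def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def self_train(labeled, unlabeled, rounds=3, margin=0.5):
    """Toy self-training with a nearest-centroid classifier.

    labeled:   dict mapping point (tuple) -> class label
    unlabeled: iterable of points (tuples)
    Each round: fit one centroid per class, then pseudo-label only those
    unlabeled points whose gap between the nearest and second-nearest
    centroid exceeds `margin` (a crude confidence criterion), and refit.
    """
    labeled = dict(labeled)
    unlabeled = set(unlabeled)
    for _ in range(rounds):
        by_class = {}
        for p, y in labeled.items():
            by_class.setdefault(y, []).append(p)
        cents = {y: centroid(ps) for y, ps in by_class.items()}
        newly = {}
        for p in unlabeled:
            ranked = sorted((math.dist(p, c), y) for y, c in cents.items())
            if len(ranked) > 1 and ranked[1][0] - ranked[0][0] > margin:
                newly[p] = ranked[0][1]  # confident: adopt nearest class
        if not newly:          # no confident points left; stop early
            break
        labeled.update(newly)  # enlarge the "labeled" set
        unlabeled -= set(newly)
    return labeled
```

For example, with one labeled seed per cluster, `self_train({(0.0, 0.0): 'a', (10.0, 10.0): 'b'}, [(1.0, 0.0), (9.0, 10.0)])` pseudo-labels each unlabeled point with its nearby seed's class. The margin threshold is exactly the kind of confidence heuristic whose failure modes (reinforcing early mistakes) motivate the committee-based methods surveyed here.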
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this chapter
Hady, M.F.A., Schwenker, F. (2013). Semi-supervised Learning. In: Bianchini, M., Maggini, M., Jain, L. (eds) Handbook on Neural Information Processing. Intelligent Systems Reference Library, vol 49. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36657-4_7
DOI: https://doi.org/10.1007/978-3-642-36657-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36656-7
Online ISBN: 978-3-642-36657-4