Abstract
In this paper, we consider a novel machine learning problem, that is, learning a classifier from noisy label distributions. In this problem, each instance with a feature vector belongs to at least one group. Then, instead of the true label of each instance, we observe the label distribution of the instances associated with a group, where the label distribution is distorted by an unknown noise. Our goals are to (1) estimate the true label of each instance, and (2) learn a classifier that predicts the true label of a new instance. We propose a probabilistic model that considers true label distributions of groups and parameters that represent the noise as hidden variables. The model can be learned based on a variational Bayesian method. In numerical experiments, we show that the proposed model outperforms existing methods in terms of the estimation of the true labels of instances.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
References
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Heidelberg (2006)
Cheng, Z., Caverlee, J., Lee, K.: You are where you tweet : a content-based approach to geo-locating twitter users. In: Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 759–768 (2010)
Culotta, A., Kumar, N.R., Cutler, J.: Predicting twitter user demographics using distant supervision from website traffic data. J. Artif. Intell. Res. 1, 389–408 (2016)
Culotta, A., Ravi, N.K., Cutler, J.: Predicting the demographics of twitter users from website traffic data. In: Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, pp. 72–78 (2015)
Dietterich, T.G., Lathrop, R.H., Lozano-Pérez, T.: Solving the multiple instance problem with axis-parallel rectangles. Artif. Intell. 89, 31–71 (1997)
Li, J., Ritter, A., Hovy, E.: Weakly Supervised User Profile Extraction from Twitter. In: Association of Computational Linguistics, pp. 165–174 (2014)
Maron, O., Lozano-Pérez, T.: A framework for multiple-instance learning. In: Advances in Neural Information Processing, pp. 570–576 (1997)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, É.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Rao, D., Yarowsky, D., Shreevats, A., Gupta, M.: Classifying latent user attributes in twitter. In: Proceedings of the 2nd International Workshop on Search and Mining User-Generated Contents (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yoshikawa, Y. (2017). Learning from Noisy Label Distributions. In: Lintas, A., Rovetta, S., Verschure, P., Villa, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2017. ICANN 2017. Lecture Notes in Computer Science(), vol 10614. Springer, Cham. https://doi.org/10.1007/978-3-319-68612-7_16
Download citation
DOI: https://doi.org/10.1007/978-3-319-68612-7_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68611-0
Online ISBN: 978-3-319-68612-7
eBook Packages: Computer ScienceComputer Science (R0)