
An easy-to-hard learning paradigm for multiple classes and multiple labels

Published: 01 January 2017

Abstract

Many applications, such as human action recognition and object detection, can be formulated as a multiclass classification problem. One-vs-rest (OVR) is one of the most widely used approaches for multiclass classification due to its simplicity and excellent performance. However, confusing classes in such applications degrade its results. For example, hand clap and boxing are two confusing actions: hand clap is easily misclassified as boxing, and vice versa. Precisely classifying confusing classes therefore remains a challenging task. To obtain better performance on multiclass classification problems with confusing classes, we first develop a classifier chain model for multiclass classification (CCMC) to transfer class information between classifiers. Then, based on an analysis of our proposed model, we propose an easy-to-hard learning paradigm for multiclass classification that automatically identifies easy and hard classes and then uses the predictions for easier classes to help solve harder classes. Similar to CCMC, the classifier chain (CC) model was proposed by Read et al. (2009) to capture label dependency for multi-label classification. However, CC does not consider the order of difficulty of the labels and achieves degraded performance when there are many confusing labels; it is therefore non-trivial to learn an appropriate label order for CC. Motivated by our analysis for CCMC, we also propose an easy-to-hard learning paradigm for multi-label classification that automatically identifies easy and hard labels and then uses the predictions for easier labels to help solve harder labels. We further demonstrate that our proposed strategy can be successfully applied to a wide range of applications, such as ordinal classification and relationship prediction. Extensive empirical studies validate our analysis and the effectiveness of our proposed easy-to-hard learning strategies.
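The chaining idea in the abstract can be sketched in a few lines: rank labels from easy to hard (here approximated by the held-out accuracy of independent per-label classifiers), then predict in that order, appending each prediction to the feature vector seen by later, harder classifiers. The sketch below is only illustrative: the toy threshold "classifiers", the feature values, and the function names are hypothetical, not taken from the paper.

```python
def label_difficulty_order(per_label_accuracy):
    """Rank labels easiest-first: higher held-out accuracy = easier."""
    return sorted(per_label_accuracy, key=per_label_accuracy.get, reverse=True)

def predict_chain(x, classifiers, order):
    """Predict labels in easy-to-hard order. Each classifier sees the
    original features plus the predictions already made for easier labels."""
    preds = {}
    for label in order:
        augmented = list(x) + [preds[l] for l in order[:len(preds)]]
        preds[label] = classifiers[label](augmented)
    return preds

# Toy example: three labels; "sunset" is hardest and reads the chained
# predictions appended at positions 2 and 3 of its augmented input.
accuracy = {"beach": 0.95, "urban": 0.90, "sunset": 0.70}
order = label_difficulty_order(accuracy)  # easiest to hardest

classifiers = {
    "beach":  lambda z: int(z[0] > 0.5),               # raw feature 0
    "urban":  lambda z: int(z[1] > 0.5),               # raw feature 1
    "sunset": lambda z: int(z[2] == 1 and z[3] == 0),  # chained predictions
}
print(predict_chain([0.8, 0.2], classifiers, order))
```

In the paper's actual method the chain order is learned rather than fixed by a one-shot accuracy ranking; this sketch only illustrates the data flow of prediction transfer from easier to harder labels.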

References

[1]
Jimmy Ba and Rich Caruana. Do deep nets really need to be deep? In NIPS, pages 2654-2662, 2014.
[2]
Peter Bartlett and John Shawe-Taylor. Generalization performance of support vector machines and other pattern classifiers. In Advances in Kernel Methods - Support Vector Learning, pages 43-54. MIT Press, Cambridge, MA, USA, 1998.
[3]
Zafer Barutcuoglu, Robert E. Schapire, and Olga G. Troyanskaya. Hierarchical multilabel prediction of gene function. Bioinformatics, 22(7):830-836, 2006.
[4]
Samy Bengio, Jason Weston, and David Grangier. Label embedding trees for large multiclass tasks. In Advances in Neural Information Processing Systems 23, pages 163-171, 2010.
[5]
Yoshua Bengio, Jérôme Louradour, Ronan Collobert, and Jason Weston. Curriculum learning. In ICML, pages 41-48, 2009.
[6]
Alina Beygelzimer, John Langford, Yury Lifshits, Gregory B. Sorkin, and Alexander L. Strehl. Conditional probability tree estimation analysis and algorithms. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pages 51-58, 2009a.
[7]
Alina Beygelzimer, John Langford, and Pradeep Ravikumar. Error-correcting tournaments. In Proceedings of the 20th Conference on Algorithmic Learning Theory, pages 247-262, 2009b.
[8]
Matthew R. Boutell, Jiebo Luo, Xipeng Shen, and Christopher M. Brown. Learning multi-label scene classification. Pattern Recognition, 37(9):1757-1771, 2004.
[9]
Ignas Budvytis, Vijay Badrinarayanan, and Roberto Cipolla. Label propagation in complex video sequences using semi-supervised learning. In British Machine Vision Conference, pages 1-12. British Machine Vision Association, 2010.
[10]
Yao-Nan Chen and Hsuan-Tien Lin. Feature-aware label space dimension reduction for multi-label classification. In Advances in Neural Information Processing Systems 25, pages 1538-1546, 2012.
[11]
Kai-Yang Chiang, Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon, and Ambuj Tewari. Prediction and clustering in signed networks: a local to global perspective. Journal of Machine Learning Research, 15(1):1177-1213, 2014.
[12]
Kai-Yang Chiang, Cho-Jui Hsieh, and Inderjit S. Dhillon. Matrix completion with noisy side information. In NIPS, pages 3447-3455, 2015.
[13]
Wei Chu and Zoubin Ghahramani. Gaussian processes for ordinal regression. Journal of Machine Learning Research, 6:1019-1041, 2005.
[14]
Moustapha Cissé, Maruan Al-Shedivat, and Samy Bengio. ADIOS: architectures deep in output space. In ICML, pages 2770-2779, 2016.
[15]
Krzysztof Dembczynski, Weiwei Cheng, and Eyke Hüllermeier. Bayes optimal multilabel classification via probabilistic classifier chains. In Johannes Fürnkranz and Thorsten Joachims, editors, Proceedings of the 27th International Conference on Machine Learning, pages 279-286, Haifa, Israel, 2010. Omnipress.
[16]
Can Demirkesen and Hocine Cherifi. An evaluation of divide-and-combine strategies for image categorization by multi-class support vector machines. In 23rd International Symposium on Computer and Information Sciences, pages 1-6, 2008.
[17]
Sébastien Destercke and Gen Yang. Cautious ordinal classification by binary decomposition. In ECML/PKDD, pages 323-337, 2014.
[18]
Thomas G. Dietterich and Ghulum Bakiri. Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2:263-286, 1995.
[19]
Rong-En Fan, Kai-Wei Chang, Cho-Jui Hsieh, Xiang-Rui Wang, and Chih-Jen Lin. LIBLINEAR: A library for large linear classification. Journal of Machine Learning Research, 9:1871-1874, 2008.
[20]
Xavier Glorot, Antoine Bordes, and Yoshua Bengio. Deep sparse rectifier neural networks. In AISTATS, pages 315-323, 2011.
[21]
Chen Gong, Dacheng Tao, Wei Liu, Liu Liu, and Jie Yang. Label propagation via teaching-to-learn and learning-to-teach. IEEE Transactions on Neural Networks and Learning Systems, 28(6):1452-1465, 2017.
[22]
Matthieu Guillaumin, Jakob J. Verbeek, and Cordelia Schmid. Multimodal semi-supervised learning for image classification. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 902-909. IEEE Computer Society, 2010.
[23]
Yuhong Guo and Suicheng Gu. Multi-label classification using conditional dependency networks. In Toby Walsh, editor, Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pages 1300-1305, Barcelona, Catalonia, Spain, 2011. AAAI Press.
[24]
Yuhong Guo and Dale Schuurmans. Adaptive large margin training for multilabel classification. In Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011.
[25]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR, pages 770-778, 2016.
[26]
Xuming He, Richard S. Zemel, and Miguel Á. Carreira-Perpiñán. Multiscale conditional random fields for image labeling. In CVPR, pages 695-702, 2004.
[27]
Geoffrey E. Hinton, Oriol Vinyals, and Jeffrey Dean. Distilling the knowledge in a neural network. CoRR, abs/1503.02531, 2015.
[28]
Cho-Jui Hsieh, Kai-Yang Chiang, and Inderjit S. Dhillon. Low rank modeling of signed networks. In KDD, pages 507-515, 2012.
[29]
Chih-Wei Hsu and Chih-Jen Lin. A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks and Learning Systems, 13(2):415-425, 2002.
[30]
Daniel Hsu, Sham Kakade, John Langford, and Tong Zhang. Multi-label prediction via compressed sensing. In Advances in Neural Information Processing Systems, pages 772-780, 2009.
[31]
Sheng-Jun Huang and Zhi-Hua Zhou. Multi-label learning by exploiting label correlations locally. In Jörg Hoffmann and Bart Selman, editors, Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, Toronto, Ontario, Canada, 2012. AAAI Press.
[32]
Lina Huo, Licheng Jiao, Shuang Wang, and Shuyuan Yang. Object-level saliency detection with color attributes. Pattern Recognition, 49:162-173, 2016.
[33]
Prateek Jain and Inderjit S. Dhillon. Provable inductive matrix completion. CoRR, abs/1306.0626, 2013.
[34]
Feng Kang, Rong Jin, and Rahul Sukthankar. Correlated label propagation with application to multi-label learning. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 1719-1726, NY, USA, 2006. IEEE Computer Society.
[35]
Michael J. Kearns and Robert E. Schapire. Efficient distribution-free learning of probabilistic concepts. In Proceedings of the 31st Symposium on the Foundations of Computer Science, pages 382-391, Los Alamitos, CA, 1990. IEEE Computer Society Press.
[36]
Maksim Lapin, Matthias Hein, and Bernt Schiele. Top-k multiclass SVM. In Advances in Neural Information Processing Systems 28, pages 325-333, 2015.
[37]
Maksim Lapin, Matthias Hein, and Bernt Schiele. Loss functions for top-k error: Analysis and insights. In The IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[38]
Jure Leskovec, Daniel P. Huttenlocher, and Jon M. Kleinberg. Predicting positive and negative links in online social networks. In WWW, pages 641-650, 2010.
[39]
Weiwei Liu and Ivor W. Tsang. Large margin metric learning for multi-label prediction. In AAAI, pages 2800-2806, 2015a.
[40]
Weiwei Liu and Ivor W. Tsang. On the optimality of classifier chain for multi-label classification. In NIPS, pages 712-720, 2015b.
[41]
Weiwei Liu and Ivor W. Tsang. Sparse perceptron decision tree for millions of dimensions. In AAAI, pages 1881-1887, 2016.
[42]
Weiwei Liu and Ivor W. Tsang. Making decision trees feasible in ultrahigh feature and label dimensions. Journal of Machine Learning Research, 18:1-36, 2017.
[43]
Weiwei Liu, Xiaobo Shen, and Ivor W. Tsang. Sparse embedded k-means clustering. In NIPS, 2017.
[44]
Qi Mao, Ivor Wai-Hung Tsang, and Shenghua Gao. Objective-guided image annotation. IEEE Transactions on Image Processing, 22(4):1585-1597, 2013.
[45]
Paolo Massa and Paolo Avesani. Trust-aware bootstrapping of recommender systems. In Proceedings of ECAI 2006 Workshop on Recommender Systems, pages 29-33, 2006.
[46]
Jonathan Milgram, Mohamed Cheriet, and Robert Sabourin. "one against one" or "one against all": Which one is better for handwriting recognition with SVMs? In Tenth International Workshop on Frontiers in Handwriting Recognition, 2006.
[47]
Gang Niu, Marthinus Christoffel du Plessis, Tomoya Sakai, Yao Ma, and Masashi Sugiyama. Theoretical comparisons of positive-unlabeled learning against positive-negative learning. In NIPS, pages 1199-1207, 2016.
[48]
Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake VanderPlas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, and Edouard Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825-2830, 2011.
[49]
Trung T. Pham, Ian Reid, Yasir Latif, and Stephen Gould. Hierarchical higher-order regression forest fields: An application to 3D indoor scene labelling. In ICCV, 2015.
[50]
Jesse Read, Bernhard Pfahringer, Geoff Holmes, and Eibe Frank. Classifier chains for multilabel classification. In Wray L. Buntine, Marko Grobelnik, Dunja Mladenic, and John Shawe-Taylor, editors, Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part II, pages 254-269, Berlin, Heidelberg, 2009. Springer-Verlag.
[51]
Ryan M. Rifkin and Aldebaro Klautau. In defense of one-vs-all classification. Journal of Machine Learning Research, 5:101-141, 2004.
[52]
Robert E. Schapire and Yoram Singer. BoosTexter: A boosting-based system for text categorization. Machine Learning, 39(2-3):135-168, 2000.
[53]
Christian Schüldt, Ivan Laptev, and Barbara Caputo. Recognizing human actions: A local SVM approach. In 17th International Conference on Pattern Recognition, pages 32-36. IEEE Computer Society, 2004.
[54]
Chun-Wei Seah, Ivor W. Tsang, and Yew-Soon Ong. Transductive ordinal regression. IEEE Transactions on Neural Networks and Learning Systems, 23(7):1074-1086, 2012.
[55]
Amnon Shashua and Anat Levin. Ranking with large margin principle: Two approaches. In NIPS, pages 937-944, 2002.
[56]
John Shawe-Taylor, Peter L. Bartlett, Robert C. Williamson, and Martin Anthony. Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44(5):1926-1940, 1998.
[57]
Nathan Silberman, Derek Hoiem, Pushmeet Kohli, and Rob Fergus. Indoor segmentation and support inference from RGBD images. In 12th European Conference on Computer Vision, pages 746-760. Springer, 2012.
[58]
Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. CoRR, abs/1409.1556, 2014.
[59]
Farbound Tai and Hsuan-Tien Lin. Multilabel classification with principal label space transformation. Neural Computation, 24(9):2508-2542, 2012.
[60]
Ali Fallah Tehrani, Weiwei Cheng, and Eyke Hüllermeier. Preference learning using the Choquet integral: The case of multipartite ranking. IEEE Transactions on Fuzzy Systems, 20(6):1102-1113, 2012.
[61]
Antonio Torralba, Kevin P. Murphy, and William T. Freeman. Contextual models for object detection using boosted random fields. In Advances in Neural Information Processing Systems, pages 1401-1408, 2004.
[62]
Grigorios Tsoumakas, Ioannis Katakis, and Ioannis Vlahavas. Mining multi-label data. In Data Mining and Knowledge Discovery Handbook, pages 667-685. Springer US, 2010.
[63]
Jason Weston and Chris Watkins. Support vector machines for multi-class pattern recognition. In 7th European Symposium on Artificial Neural Networks, pages 219-224, 1999.
[64]
Jian-Bo Yang and Ivor W. Tsang. Hierarchical maximum margin learning for multi-class classification. In UAI, pages 753-760, 2011.
[65]
Min-Ling Zhang and Kun Zhang. Multi-label learning by exploiting label dependency. In Bharat Rao, Balaji Krishnapuram, Andrew Tomkins, and Qiang Yang, editors, Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 999-1008, Washington, DC, USA, 2010. ACM.
[66]
Min-Ling Zhang and Zhi-Hua Zhou. A review on multi-label learning algorithms. IEEE Transactions on Knowledge and Data Engineering, 26(8):1819-1837, 2014.
[67]
Yi Zhang and Jeff Schneider. Maximum margin output coding. In John Langford and Joelle Pineau, editors, Proceedings of the 29th International Conference on Machine Learning, pages 1575-1582, New York, NY, USA, 2012. Omnipress.
[68]
Yi Zhang and Jeff G. Schneider. Multi-label output codes using canonical correlation analysis. In Geoffrey J. Gordon, David B. Dunson, and Miroslav Dudík, editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pages 873-882, Fort Lauderdale, USA, 2011. JMLR.org.

Cited By

  • (2024) Does label smoothing help deep partial label learning? In Proceedings of the 41st International Conference on Machine Learning, pages 15823-15838. doi:10.5555/3692070.3692704
  • (2023) Delving into noisy label detection with clean data. In Proceedings of the 40th International Conference on Machine Learning, pages 40290-40305. doi:10.5555/3618408.3620093
  • (2023) Pitfalls of assessing extracted hierarchies for multi-class classification. Pattern Recognition, 136:C. doi:10.1016/j.patcog.2022.109225
  • (2022) On robust multiclass learnability. In Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 32412-32423. doi:10.5555/3600270.3602618
  • (2022) Linear Ordering Problem based Classifier Chain using Genetic Algorithm for multi-label classification. Applied Soft Computing, 117:C. doi:10.1016/j.asoc.2021.108395
  • (2022) DD-GAN: pedestrian image inpainting with simultaneous tone correction. Multimedia Tools and Applications, 82(2):2503-2516. doi:10.1007/s11042-022-12342-z
  • (2021) Understanding partial multi-label learning via mutual information. In Proceedings of the 35th International Conference on Neural Information Processing Systems, pages 4147-4156. doi:10.5555/3540261.3540578
  • (2021) Handling Difficult Labels for Multi-label Image Classification via Uncertainty Distillation. In Proceedings of the 29th ACM International Conference on Multimedia, pages 2410-2419. doi:10.1145/3474085.3475406
  • (2021) Generic Multi-label Annotation via Adaptive Graph and Marginalized Augmentation. ACM Transactions on Knowledge Discovery from Data, 16(1):1-20. doi:10.1145/3451884
  • (2019) CPM-nets. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, pages 559-569. doi:10.5555/3454287.3454338


Published In

The Journal of Machine Learning Research, Volume 18, Issue 1
January 2017
8830 pages
ISSN: 1532-4435
EISSN: 1533-7928

Publisher

JMLR.org

Author Tags

  1. classifier chain
  2. easy-to-hard learning paradigm
  3. multi-label classification
  4. multiclass classification

