DOI: 10.1145/1099554.1099591

Collective multi-label classification

Published: 31 October 2005

Abstract

Common approaches to multi-label classification learn independent classifiers for each category, and employ ranking or thresholding schemes for classification. Because they do not exploit dependencies between labels, such techniques are only well-suited to problems in which categories are independent. However, in many domains labels are highly interdependent. This paper explores multi-label conditional random field (CRF) classification models that directly parameterize label co-occurrences in multi-label classification. Experiments show that the models outperform their single-label counterparts on standard text corpora. Even when multi-labels are sparse, the models improve subset classification error by as much as 40%.
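
The contrast drawn in the abstract can be illustrated with a toy sketch. This is not the paper's model: the per-label scores, the pairwise weights, and the function names below are all made up for illustration, and the brute-force search over label subsets is only practical for a handful of labels. It shows an independent thresholding baseline, a joint scorer that adds a weight for each co-occurring label pair, and subset error taken here to mean the fraction of examples whose predicted label set is not an exact match.

```python
from itertools import product


def independent_predict(scores, threshold=0.0):
    """Baseline: keep each label whose own score clears the threshold."""
    return tuple(int(s > threshold) for s in scores)


def joint_predict(scores, pair_weights):
    """Pick the label subset maximizing per-label scores plus pairwise terms.

    pair_weights maps (i, j) index pairs to a hypothetical co-occurrence
    weight that is added only when both labels are switched on.
    """
    def total(y):
        unary = sum(s for s, on in zip(scores, y) if on)
        pairwise = sum(w for (i, j), w in pair_weights.items() if y[i] and y[j])
        return unary + pairwise

    # Exhaustive search over all 2^n label subsets (toy sizes only).
    return max(product((0, 1), repeat=len(scores)), key=total)


def subset_error(predicted, gold):
    """Fraction of examples whose predicted label set is not an exact match."""
    wrong = sum(1 for p, g in zip(predicted, gold) if tuple(p) != tuple(g))
    return wrong / len(gold)


# Made-up numbers: label 1 is weak on its own but co-occurs with label 0.
scores = [1.2, -0.3, -0.8]
pair_weights = {(0, 1): 0.9, (0, 2): -0.5}

independent = independent_predict(scores)
joint = joint_predict(scores, pair_weights)
print(independent, joint)
```

With these invented numbers the baseline drops label 1 because its own score is negative, while the joint scorer keeps it because the co-occurrence weight with label 0 outweighs that penalty.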

Information & Contributors

Information

Published In

CIKM '05: Proceedings of the 14th ACM international conference on Information and knowledge management
October 2005
854 pages
ISBN:1595931406
DOI:10.1145/1099554
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. machine learning
  3. multi-label
  4. statistical learning
  5. uncertainty

Qualifiers

  • Article

Conference

CIKM05
CIKM05: Conference on Information and Knowledge Management
October 31 - November 5, 2005
Bremen, Germany

Acceptance Rates

CIKM '05 Paper Acceptance Rate 77 of 425 submissions, 18%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 180
  • Downloads (last 6 weeks): 20
Reflects downloads up to 30 Dec 2024.

Citations

Cited By

  • (2024) Performance evaluation of seven multi-label classification methods on real-world patent and publication datasets. Journal of Data and Information Science, 9:2 (81-103). DOI: 10.2478/jdis-2024-0014. Online publication date: 27-May-2024
  • (2024) A Proposed Weighted Multi-Label Classification Approach for Ancestral Population Identification in Admixed Individuals. Procedia Computer Science, 246 (1011-1018). DOI: 10.1016/j.procs.2024.09.520. Online publication date: 2024
  • (2024) Document-level Relation Extraction with Relation Correlations. Neural Networks, 171:C (14-24). DOI: 10.1016/j.neunet.2023.11.062. Online publication date: 17-Apr-2024
  • (2024) A conditional multi-label model to improve prediction of a rare outcome: An illustration predicting autism diagnosis. Journal of Biomedical Informatics, 157 (104711). DOI: 10.1016/j.jbi.2024.104711. Online publication date: Sep-2024
  • (2024) Consistent and specific multi-view multi-label learning with correlation information. Information Sciences (121395). DOI: 10.1016/j.ins.2024.121395. Online publication date: Aug-2024
  • (2024) Noisy feature decomposition-based multi-label learning with missing labels. Information Sciences (120228). DOI: 10.1016/j.ins.2024.120228. Online publication date: Jan-2024
  • (2024) Malware classification through Abstract Syntax Trees and L-moments. Computers & Security (104082). DOI: 10.1016/j.cose.2024.104082. Online publication date: Sep-2024
  • (2024) Classifier chain-based monitoring method for multivariate surgical outcomes. Computers & Industrial Engineering, 194 (110378). DOI: 10.1016/j.cie.2024.110378. Online publication date: Aug-2024
  • (2024) A multi-label image classification method combining multi-stage image semantic information and label relevance. International Journal of Machine Learning and Cybernetics, 15:9 (3911-3925). DOI: 10.1007/s13042-024-02127-1. Online publication date: 8-Apr-2024
  • (2024) Multi-label Out-of-Distribution Detection with Spectral Normalized Joint Energy. Web and Big Data (31-45). DOI: 10.1007/978-981-97-7244-5_3. Online publication date: 28-Aug-2024
