Article

Transferring naive bayes classifiers for text classification

Authors:

Yong Yu

AAAI'07: Proceedings of the 22nd national conference on Artificial intelligence - Volume 1

Pages 540 - 545

Published: 22 July 2007

Abstract

A basic assumption in traditional machine learning is that the training and test data distributions are identical. This assumption often fails in practice, and we may be forced to learn a prediction model from data drawn from a different distribution. For example, labeling data in a domain of interest may be expensive, while plenty of labeled data is available in a related but different domain. In this paper, we propose a novel transfer-learning algorithm for text classification based on an EM-based Naive Bayes classifier. Our solution first estimates the initial probabilities under the distribution D_l of a labeled data set, and then uses an EM algorithm to revise the model for a different distribution D_u of the unlabeled test data. We show that our algorithm is very effective on several pairs of domains, where the distance between the distributions is measured using the Kullback-Leibler (KL) divergence. Moreover, the KL divergence is used to set the trade-off parameters in our algorithm. In our experiments, the algorithm outperforms traditional supervised and semi-supervised learning algorithms as the distributions of the training and test sets become increasingly different.
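The two-stage idea in the abstract (initialize a Naive Bayes model on labeled data from D_l, then revise it with EM on unlabeled data from D_u) can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the function names, the Laplace smoothing constant `alpha`, and the fixed `labeled_weight` are assumptions made for illustration. In particular, `labeled_weight` is a hypothetical stand-in for the trade-off parameters that the paper sets from the KL divergence; `kl_divergence` only shows how such a distance between two word distributions could be computed.

```python
import numpy as np

def fit_nb(X, resp, alpha=1.0):
    """M-step: multinomial Naive Bayes parameters from soft class assignments.
    X: (n_docs, n_words) word counts; resp: (n_docs, n_classes) weights."""
    n_classes, n_words = resp.shape[1], X.shape[1]
    class_mass = resp.sum(axis=0)
    log_prior = np.log((class_mass + alpha) / (class_mass.sum() + alpha * n_classes))
    word_mass = resp.T @ X  # (n_classes, n_words) expected word counts
    log_word = np.log((word_mass + alpha) /
                      (word_mass.sum(axis=1, keepdims=True) + alpha * n_words))
    return log_prior, log_word

def posterior(X, log_prior, log_word):
    """E-step: P(class | document) under the current model."""
    joint = X @ log_word.T + log_prior
    joint -= joint.max(axis=1, keepdims=True)  # stabilize the exponentials
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in bits between two discrete distributions (eps avoids log 0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log2(p / q)))

def transfer_nb(X_l, y_l, X_u, n_classes, n_iter=20, labeled_weight=0.5):
    """Initialize on labeled data (D_l), then run EM on unlabeled data (D_u).
    labeled_weight is an assumed fixed value, not the paper's KL-based setting."""
    onehot = np.eye(n_classes)[np.asarray(y_l)]
    log_prior, log_word = fit_nb(X_l, onehot)            # initial estimate under D_l
    X_all = np.vstack([X_l, X_u])
    for _ in range(n_iter):
        resp_u = posterior(X_u, log_prior, log_word)      # E-step on D_u
        resp_all = np.vstack([labeled_weight * onehot, resp_u])
        log_prior, log_word = fit_nb(X_all, resp_all)     # M-step on both sets
    return posterior(X_u, log_prior, log_word).argmax(axis=1)

# Toy data: words 0-1 indicate class 0, words 2-3 indicate class 1,
# with different word frequencies in the labeled and unlabeled sets.
X_l = np.array([[3, 1, 0, 0], [2, 2, 0, 0], [0, 0, 3, 1], [0, 0, 2, 2]])
y_l = [0, 0, 1, 1]
X_u = np.array([[1, 3, 0, 0], [0, 4, 1, 0], [0, 0, 1, 3], [0, 1, 0, 4]])
predictions = transfer_nb(X_l, y_l, X_u, n_classes=2)
domain_gap = kl_divergence(X_l.sum(axis=0), X_u.sum(axis=0))
```

Lowering `labeled_weight` lets the unlabeled documents dominate the M-step, which is the direction one would expect to move in as the measured divergence between the two sets grows.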

Index Terms
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches

Published In

AAAI'07: Proceedings of the 22nd national conference on Artificial intelligence - Volume 1

July 2007

942 pages

ISBN: 9781577353232

Editor:
Anthony Cohn
University of Leeds

Sponsors

Association for the Advancement of Artificial Intelligence

Publisher

AAAI Press
