Article

Learning a meta-level prior for feature relevance from multiple related tasks

Authors:

Vassil Chatalbashev,

Daphne Koller

ICML '07: Proceedings of the 24th international conference on Machine learning

Pages 489 - 496

https://doi.org/10.1145/1273496.1273558

Published: 20 June 2007

Abstract

In many prediction tasks, selecting relevant features is essential for achieving good generalization performance. Most feature selection algorithms consider all features to be a priori equally likely to be relevant. In this paper, we use transfer learning (learning on an ensemble of related tasks) to construct an informative prior on feature relevance. We assume that features themselves have meta-features that are predictive of their relevance to the prediction task, and we model their relevance as a function of the meta-features using hyperparameters (called meta-priors). We present a convex optimization algorithm for simultaneously learning the meta-priors and feature weights from an ensemble of related prediction tasks that share a similar relevance structure. Our approach transfers the meta-priors among different tasks, which makes it possible to handle settings where tasks have nonoverlapping features or where the relevance of the features varies across tasks. We show that learning feature relevance improves performance on two real data sets that illustrate such settings: (1) predicting ratings in a collaborative filtering task, and (2) distinguishing arguments of a verb in a sentence.
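The idea of a prior on feature relevance that is shared through meta-features can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not the paper's algorithm: each feature carries a single categorical meta-feature (a group id), each weight gets a zero-mean Gaussian prior whose log-variance `theta[g]` depends only on that group, and the fit alternates between per-task ridge regression and re-estimating `theta`, rather than solving the joint convex problem the abstract describes. The names `fit_task` and `learn_meta_prior` and the toy data are hypothetical.

```python
import math

def fit_task(X, y, variances, sweeps=50):
    """MAP linear regression with prior w_j ~ N(0, variances[j]) and unit noise.

    Uses coordinate descent on 0.5*||y - Xw||^2 + 0.5*sum_j w_j^2 / variances[j].
    """
    d = len(X[0])
    w = [0.0] * d
    for _ in range(sweeps):
        for j in range(d):
            num = 0.0
            den = 1.0 / variances[j]
            for xi, yi in zip(X, y):
                # Residual with feature j's own contribution left out.
                partial = yi - sum(w[k] * xi[k] for k in range(d) if k != j)
                num += xi[j] * partial
                den += xi[j] * xi[j]
            w[j] = num / den
    return w

def learn_meta_prior(tasks, n_groups, rounds=5, floor=1e-6):
    """Alternate between fitting each task and re-estimating the shared prior.

    tasks: list of (X, y, groups), where groups[j] is the meta-feature value
    (group id) of feature j in that task. Tasks need not share features,
    only the meta-feature vocabulary.
    """
    theta = [0.0] * n_groups  # log prior variance per group (assumed form)
    weights = []
    for _ in range(rounds):
        sq_sum = [0.0] * n_groups
        count = [0] * n_groups
        weights = []
        for X, y, groups in tasks:
            variances = [math.exp(theta[g]) for g in groups]
            w = fit_task(X, y, variances)
            weights.append(w)
            for wj, g in zip(w, groups):
                sq_sum[g] += wj * wj
                count[g] += 1
        # Variance estimate for a zero-mean Gaussian: mean squared weight,
        # floored so the logarithm stays finite.
        theta = [
            math.log(max(sq_sum[g] / count[g], floor)) if count[g] else theta[g]
            for g in range(n_groups)
        ]
    return theta, weights

# Toy data (hypothetical): two tasks with different features. In both,
# the feature tagged with group 0 drives the target and group 1 does not.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
task_a = (X, [2.0 * row[0] for row in X], [0, 1])
task_b = (X, [-3.0 * row[1] for row in X], [1, 0])

theta, weights = learn_meta_prior([task_a, task_b], n_groups=2)
print("log prior variance per group:", theta)
print("weights per task:", weights)
```

In this toy setting the learned log-variance for group 0 ends up larger than for group 1, so a new task whose features carry the same meta-feature values would start with weaker shrinkage on its group 0 features even if those features never appeared in the earlier tasks.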

Index Terms

Computing methodologies

Published In

ICML '07: Proceedings of the 24th international conference on Machine learning

June 2007

1233 pages

ISBN: 9781595937933

DOI: 10.1145/1273496

Editor:
Zoubin Ghahramani
University of Cambridge, United Kingdom

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Machine Learning Journal

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

ICML '07 & ILP '07

ICML '07 & ILP '07: The 24th Annual International Conference on Machine Learning held in conjunction with the 2007 International Conference on Inductive Logic Programming

June 20 - 24, 2007

Corvallis, Oregon, USA

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Bibliometrics

Article Metrics

Total Citations: 109
Total Downloads: 703
Downloads (last 12 months): 28
Downloads (last 6 weeks): 5

Reflects downloads up to 05 Mar 2025

Cited By

Liu B, Zheng Z, Xiao Y, Sun P, Li X, Zhao S, Huang Y, Peng T (2024). Self-paced method for transfer partial label learning. Information Sciences, 121043. Online publication date: Jun-2024.
https://doi.org/10.1016/j.ins.2024.121043

N M, G S, R S, NR J (2024). Federated transfer learning for intrusion detection system in industrial iot 4.0. Multimedia Tools and Applications, 83:19, 57913-57941. Online publication date: 16-Feb-2024.
https://doi.org/10.1007/s11042-024-18379-6

Huang S, Bao Z (2023). Shortest Paths Discovery in Uncertain Networks via Transfer Learning. Proceedings of the ACM on Management of Data, 1:2, 1-25. Online publication date: 20-Jun-2023.
https://doi.org/10.1145/3589286

Chen T, Chen X, Du X, Rashwan A, Yang F, Chen H, Wang Z, Li Y (2023). AdaMV-MoE: Adaptive Multi-Task Vision Mixture-of-Experts. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 17300-17311. Online publication date: 1-Oct-2023.
https://doi.org/10.1109/ICCV51070.2023.01591

Xu H, Li W, Cai Z (2023). Analysis on methods to effectively improve transfer learning performance. Theoretical Computer Science, 940, 90-107. Online publication date: Jan-2023.
https://doi.org/10.1016/j.tcs.2022.09.023

Liang H, Fan Z, Sarkar R, Jiang Z, Chen T, Zou K, Cheng Y, Hao C, Wang Z (2022). M3ViT. In Koyejo S, Mohamed S, Agarwal A, Belgrave D, Cho K, Oh A (Eds.), Proceedings of the 36th International Conference on Neural Information Processing Systems, 28441-28457. Online publication date: 28-Nov-2022.
https://dl.acm.org/doi/10.5555/3600270.3602332

Barton T, Yu H, Rogers K, Fulda N, Chiang S, Yorgason J, Warnick K (2022). Towards Low-Power Machine Learning Architectures Inspired by Brain Neuromodulatory Signalling. Journal of Low Power Electronics and Applications, 12:4, 59. Online publication date: 4-Nov-2022.
https://doi.org/10.3390/jlpea12040059

Pannell J, Rigby S, Panoutsos G (2022). Application of transfer learning for the prediction of blast impulse. International Journal of Protective Structures, 14:2, 242-262. Online publication date: 24-May-2022.
https://doi.org/10.1177/20414196221096699

Wang G, Choi K, Teoh J, Lu J (2022). Deep Cross-Output Knowledge Transfer Using Stacked-Structure Least-Squares Support Vector Machines. IEEE Transactions on Cybernetics, 52:5, 3207-3220. Online publication date: May-2022.
https://doi.org/10.1109/TCYB.2020.3008963

Zhang Y, Weninger F, Schuller B, Picard R (2022). Holistic Affect Recognition Using PaNDA: Paralinguistic Non-Metric Dimensional Analysis. IEEE Transactions on Affective Computing, 13:2, 769-780. Online publication date: 1-Apr-2022.
https://doi.org/10.1109/TAFFC.2019.2961881
