
GBC: Gradient boosting consensus model for heterogeneous data

Published: 01 June 2014

Abstract

With the rapid development of database technologies, multiple data sources may be available for a given learning task, e.g., collaborative filtering. However, the data sources may contain different types of features. For example, users' profiles can be used to build recommendation systems; a model can also use users' historical behaviors and social networks to infer their interests in related products. We argue that it is desirable to collectively use any available heterogeneous data sources in order to build effective learning models. We call this framework heterogeneous learning. In our proposed setting, data sources can include (i) nonoverlapping features, (ii) nonoverlapping instances, and (iii) multiple networks (i.e., graphs) that connect instances. In this paper, we propose a general optimization framework for heterogeneous learning and derive a corresponding learning model from gradient boosting. The idea is to minimize the empirical loss subject to two constraints: (1) there should be consensus among the predictions of overlapping instances (if any) from different data sources; (2) connected instances in graph datasets should have similar predictions. The objective function is solved by stochastic gradient boosting trees. Furthermore, a weighting strategy is designed to emphasize informative data sources and deemphasize noisy ones. We formally prove that the proposed strategy leads to a tighter error bound. This approach consistently outperforms a standard concatenation of data sources on movie rating prediction, number recognition, and terrorist attack detection tasks. Furthermore, the approach is evaluated on AT&T's distributed database with over 500,000 instances, 91 different data sources, and over 45,000,000 joined features. We observe that the proposed model substantially improves the out-of-sample error rate.
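The core idea sketched in the abstract can be illustrated with a toy example. The sketch below assumes two data sources with nonoverlapping features over the same instances, and boosts each source's model on the negative gradient of a squared loss plus a consensus penalty that pulls the two sources' predictions toward each other. All names, the choice of squared loss, and the exact form of the penalty are illustrative assumptions, not the paper's formulation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, d1, d2 = 200, 5, 3
X1 = rng.normal(size=(n, d1))   # source 1: its own feature set
X2 = rng.normal(size=(n, d2))   # source 2: a different feature set, same instances
y = X1[:, 0] + X2[:, 1] + 0.1 * rng.normal(size=n)

lam, nu, rounds = 0.5, 0.1, 50  # consensus weight, shrinkage, boosting rounds
F1 = np.zeros(n)                # source-1 ensemble prediction
F2 = np.zeros(n)                # source-2 ensemble prediction

for _ in range(rounds):
    # Negative gradients of 0.5*(y-F1)^2 + 0.5*(y-F2)^2 + 0.5*lam*(F1-F2)^2
    # with respect to F1 and F2; the lam term enforces consensus.
    g1 = (y - F1) - lam * (F1 - F2)
    g2 = (y - F2) - lam * (F2 - F1)
    # Fit a small regression tree to each gradient, as in gradient boosting.
    t1 = DecisionTreeRegressor(max_depth=2).fit(X1, g1)
    t2 = DecisionTreeRegressor(max_depth=2).fit(X2, g2)
    F1 += nu * t1.predict(X1)
    F2 += nu * t2.predict(X2)

consensus = 0.5 * (F1 + F2)     # combine the sources' predictions
print(float(np.mean((y - consensus) ** 2)))
```

Because each source sees only part of the signal, neither model alone can fit `y`; the consensus term lets the two ensembles agree on the shared instances, so the combined prediction recovers both components.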


Cited By

  • (2023) Predicting Bugs by Monitoring Developers during Task Execution. Proceedings of the 45th International Conference on Software Engineering, pp. 1110-1122. DOI: 10.1109/ICSE48619.2023.00100. Online publication date: 14-May-2023.


Published In

Statistical Analysis and Data Mining  Volume 7, Issue 3
June 2014
65 pages
ISSN:1932-1864
EISSN:1932-1872

Publisher

John Wiley & Sons, Inc.

United States


Author Tags

  1. gradient boosting
  2. graph mining
  3. heterogeneous data
  4. social network

Qualifiers

  • Article
