DOI: 10.5555/3041838.3041901
Article

Link-based classification

Published: 21 August 2003

Abstract

A key challenge for machine learning is tackling the problem of mining richly structured data sets, where the objects are linked in some way due to either an explicit or implicit relationship that exists between the objects. Links among the objects demonstrate certain patterns, which can be helpful for many machine learning tasks and are usually hard to capture with traditional statistical models. Recently there has been a surge of interest in this area, fueled largely by interest in web and hypertext mining, but also by interest in mining social networks, bibliographic citation data, epidemiological data and other domains best described using a linked or graph structure. In this paper we propose a framework for modeling link distributions, a link-based model that supports discriminative models describing both the link distributions and the attributes of linked objects. We use a structured logistic regression model, capturing both content and links. We systematically evaluate several variants of our link-based model on a range of data sets including both web and citation collections. In all cases, the use of the link distribution improves classification accuracy.
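
The abstract does not spell out the form of the model, so the following is only a minimal sketch of the general idea of a logistic regression that sees both content and link features. It makes several simplifying assumptions that are not taken from the paper: the task is binary, the link features are per-class counts of already-labeled neighbors, content and link features are simply concatenated, and the model is fitted by plain batch gradient descent. The toy graph, the names `link_count_features` and `train_logistic`, and all parameter values are hypothetical.

```python
import math

def link_count_features(node, neighbors, labels, classes):
    """Count the labeled neighbors of `node` in each class (assumed link feature)."""
    counts = [0.0] * len(classes)
    for nb in neighbors.get(node, []):
        if nb in labels:  # unlabeled neighbors are ignored
            counts[classes.index(labels[nb])] += 1.0
    return counts

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=500, l2=0.01):
    """Fit a binary logistic regression by batch gradient descent with L2 penalty."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Probability of the positive class (here: class "A")."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy graph: nodes 0-2 are class "A", nodes 3-5 are class "B";
# nodes 6 and 7 are unlabeled test nodes.
classes = ["A", "B"]
labels = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
neighbors = {
    0: [1, 2, 6], 1: [0, 2, 6], 2: [0, 1],
    3: [4, 5, 7], 4: [3, 5, 7], 5: [3, 4],
    6: [0, 1], 7: [3, 4],
}
# One weakly informative content feature per node (e.g. presence of a word).
content = {0: [1.0], 1: [0.0], 2: [1.0], 3: [0.0], 4: [1.0], 5: [0.0],
           6: [0.0], 7: [1.0]}

def features(node):
    """Concatenate content features with neighbor-label counts."""
    return content[node] + link_count_features(node, neighbors, labels, classes)

X = [features(n) for n in sorted(labels)]
y = [1.0 if labels[n] == "A" else 0.0 for n in sorted(labels)]
w, b = train_logistic(X, y)

p6 = predict_proba(w, b, features(6))  # linked only to class-"A" nodes
p7 = predict_proba(w, b, features(7))  # linked only to class-"B" nodes
print(f"P(A | node 6) = {p6:.3f}, P(A | node 7) = {p7:.3f}")
```

In this toy setup the content feature alone points the wrong way for both test nodes, while the neighbor-label counts separate the classes, which illustrates why adding link information can help. A fuller treatment would also need to handle neighbors whose labels are themselves unknown at prediction time, which this sketch sidesteps by counting only labeled neighbors.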

Published In

ICML'03: Proceedings of the Twentieth International Conference on Machine Learning
August 2003
935 pages
ISBN:1577351894

Sponsors

  • Kluwer Academic Publishers
  • NSF: National Science Foundation
  • Kaidara Software
  • AAAI: American Association for Artificial Intelligence
  • Microsoft Research
  • HP

Publisher

AAAI Press

Qualifiers

  • Article

Cited By

  • (2025) D2-GCN: a graph convolutional network with dynamic disentanglement for node classification. Frontiers of Computer Science: Selected Publications from Chinese Universities 19:1. DOI: 10.1007/s11704-023-3339-7. Online publication date: 1-Jan-2025
  • (2023) DeepPSL. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, pp. 3606-3614. DOI: 10.24963/ijcai.2023/401. Online publication date: 19-Aug-2023
  • (2022) Hypergraph Convolution on Nodes-Hyperedges Network for Semi-Supervised Node Classification. ACM Transactions on Knowledge Discovery from Data 16:4, pp. 1-19. DOI: 10.1145/3494567. Online publication date: 8-Jan-2022
  • (2021) InfoGCL. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 30414-30425. DOI: 10.5555/3540261.3542588. Online publication date: 6-Dec-2021
  • (2020) Task-Oriented Genetic Activation for Large-Scale Complex Heterogeneous Graph Embedding. Proceedings of The Web Conference 2020, pp. 1581-1591. DOI: 10.1145/3366423.3380230. Online publication date: 20-Apr-2020
  • (2020) Graph representation ensemble learning. Proceedings of the 12th IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, pp. 24-31. DOI: 10.1109/ASONAM49781.2020.9381465. Online publication date: 7-Dec-2020
  • (2019) Generalized matrix means for semi-supervised learning with multilayer graphs. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 14877-14886. DOI: 10.5555/3454287.3455619. Online publication date: 8-Dec-2019
  • (2019) Semi-implicit graph variational auto-encoders. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 10712-10723. DOI: 10.5555/3454287.3455248. Online publication date: 8-Dec-2019
  • (2019) Masked graph convolutional network. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 4070-4077. DOI: 10.5555/3367471.3367607. Online publication date: 10-Aug-2019
  • (2019) Dual self-paced graph convolutional network. Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 4062-4069. DOI: 10.5555/3367471.3367606. Online publication date: 10-Aug-2019
