DOI: 10.1145/2063576.2063746
research-article

Towards feature selection in network

Published: 24 October 2011

Abstract

Traditional feature selection methods assume that the data are independent and identically distributed (i.i.d.). However, in the real world, a tremendous amount of data is distributed over networks. Existing feature selection methods are not suited to networked data because the i.i.d. assumption no longer holds. This motivates us to study feature selection in a network. In this paper, we present a supervised feature selection method based on Laplacian Regularized Least Squares (LapRLS) for networked data. Specifically, we use linear regression to exploit the content information and adopt graph regularization to account for the link information. The proposed method aims to select a subset of features such that the empirical error of LapRLS is minimized. The resulting optimization problem is a mixed integer program, which is difficult to solve. It is relaxed into an $L_{2,1}$-norm constrained LapRLS problem and solved by an accelerated proximal gradient descent algorithm. Experiments on benchmark networked data sets show that the proposed feature selection method outperforms traditional feature selection methods and state-of-the-art approaches to learning in networks.
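
To make the relaxation step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a penalized rather than constrained form of the problem, minimizing 0.5*||XW - Y||_F^2 + 0.5*alpha*tr(W^T X^T L X W) + lam*||W||_{2,1}, where X holds one node per row, Y holds label indicators, and L is a symmetric graph Laplacian. The function names, the parameters alpha and lam, and the FISTA-style momentum schedule are all assumptions made for illustration. The key ingredient is the proximal operator of the $L_{2,1}$ norm, which shrinks each row of W toward zero and removes rows (features) whose norm falls below the threshold.

```python
import numpy as np


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: row-wise soft-thresholding."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Rows with norm <= t are set to zero; the others are shrunk by t.
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale


def laprls_l21(X, Y, L, alpha=0.1, lam=0.1, n_iter=500):
    """Accelerated proximal gradient for an L2,1-penalized LapRLS (sketch).

    X: (n, d) node features, Y: (n, c) label indicators,
    L: (n, n) symmetric graph Laplacian. alpha and lam are
    hypothetical trade-off parameters, not values from the paper.
    """
    n, d = X.shape
    # Smooth part f(W) = 0.5*||XW - Y||_F^2 + 0.5*alpha*tr(W^T X^T L X W)
    # has gradient H @ W - B with the quantities below.
    H = X.T @ (np.eye(n) + alpha * L) @ X
    B = X.T @ Y
    # Step size 1 / (largest eigenvalue of H), the gradient's Lipschitz constant.
    step = 1.0 / max(np.linalg.eigvalsh(H)[-1], 1e-12)

    W = np.zeros((d, Y.shape[1]))
    Z, t = W.copy(), 1.0
    for _ in range(n_iter):
        grad = H @ Z - B
        W_next = prox_l21(Z - step * grad, step * lam)
        # Nesterov-style momentum update on the extrapolation point Z.
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        Z = W_next + ((t - 1.0) / t_next) * (W_next - W)
        W, t = W_next, t_next
    return W


def select_features(W, k):
    """Indices of the k features whose rows of W have the largest norm."""
    return np.argsort(-np.linalg.norm(W, axis=1))[:k]
```

Under these assumptions, calling `select_features(laprls_l21(X, Y, L), k)` ranks features by the row norms of the learned weight matrix; larger values of `lam` zero out more rows and therefore keep fewer features.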

    Published In

    CIKM '11: Proceedings of the 20th ACM international conference on Information and knowledge management
    October 2011
    2712 pages
    ISBN:9781450307178
    DOI:10.1145/2063576
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Laplacian regularized least squares
    2. feature selection
    3. graph regularization
    4. network

    Qualifiers

    • Research-article

    Conference

    CIKM '11
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 11
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 13 Dec 2024

    Citations

    Cited By

    • (2024) FraudGuard Pro: An Ensemble-Based Feature Selection Algorithm for Credit Card Fraud Detection. 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), pages 1-11. DOI: 10.1109/ICCCNT61001.2024.10724073. Online publication date: 24-Jun-2024
    • (2024) Local-structure-preservation and redundancy-removal-based feature selection method and its application to the identification of biomarkers for schizophrenia. NeuroImage, article 120839. DOI: 10.1016/j.neuroimage.2024.120839. Online publication date: Sep-2024
    • (2022) A novel biomarker selection method combining graph neural network and gene relationships applied to microarray data. BMC Bioinformatics, 23:1. DOI: 10.1186/s12859-022-04848-y. Online publication date: 26-Jul-2022
    • (2022) Networked Time Series Shapelet Learning for Power System Transient Stability Assessment. IEEE Transactions on Power Systems, 37:1, pages 416-428. DOI: 10.1109/TPWRS.2021.3093423. Online publication date: Jan-2022
    • (2022) A Survey on Sparse Learning Models for Feature Selection. IEEE Transactions on Cybernetics, 52:3, pages 1642-1660. DOI: 10.1109/TCYB.2020.2982445. Online publication date: Mar-2022
    • (2022) A Modified Naïve Bayes Classifier for Detecting Spam E-mails based on Feature Selection. 2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS), pages 1634-1641. DOI: 10.1109/ICICCS53718.2022.9788340. Online publication date: 25-May-2022
    • (2021) Deep Learning on Graphs. DOI: 10.1017/9781108924184. Online publication date: 2-Sep-2021
    • (2021) SCHC: Incorporating Social Contagion and Hashtag Consistency for Topic-Oriented Social Summarization. Database Systems for Advanced Applications, pages 641-657. DOI: 10.1007/978-3-030-73197-7_44. Online publication date: 6-Apr-2021
    • (2020) A Deep Multi-View Framework for Anomaly Detection on Attributed Networks. IEEE Transactions on Knowledge and Data Engineering, pages 1-1. DOI: 10.1109/TKDE.2020.3015098. Online publication date: 2020
    • (2020) Learning Distilled Graph for Large-Scale Social Network Data Clustering. IEEE Transactions on Knowledge and Data Engineering, 32:7, pages 1393-1404. DOI: 10.1109/TKDE.2019.2904068. Online publication date: 1-Jul-2020
