
A Combination Method for Reducing Dimensionality in Large Datasets

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2016

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9887)

Abstract

The amount of data in the world is growing exponentially, driven by the large number of applications in a wide variety of contexts. This data needs to be analyzed in order to extract the valuable information underlying it. Machine learning is a useful tool for this task, but the high complexity of the data forces the use of additional methods to reduce that complexity. Dimensionality reduction (feature selection) is one of the most widely used methods for achieving this goal. Many algorithms have been proposed to reduce the dimension of data, each with its own advantages and drawbacks. This variety usually leads researchers to test several methods and choose the best solution. Based on that, this paper proposes a combination of feature selection algorithms in order to create a single, more stable solution. We tested this approach using real datasets and machine learning algorithms. The results showed that the combined solution can be used with little or no loss in classification accuracy. Our method can therefore be used as a stable choice when there is little knowledge about the problem.
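
The abstract does not spell out how the individual feature selection algorithms are combined, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes a mean-rank (Borda-style) aggregation over two simple, hypothetical scoring functions (`abs_correlation` and `class_mean_gap`) applied to a toy dataset with binary labels. All names and data below are illustrative assumptions.

```python
from statistics import mean, pstdev

# Hypothetical scorer 1: |Pearson correlation| between a feature and a 0/1 label.
def abs_correlation(column, labels):
    mx, my = mean(column), mean(labels)
    sx, sy = pstdev(column), pstdev(labels)
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature carries no information
    cov = mean([(x - mx) * (y - my) for x, y in zip(column, labels)])
    return abs(cov / (sx * sy))

# Hypothetical scorer 2: gap between the class means, scaled by the feature spread.
def class_mean_gap(column, labels):
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    spread = pstdev(column)
    if not pos or not neg or spread == 0:
        return 0.0
    return abs(mean(pos) - mean(neg)) / spread

def ranks_from_scores(scores):
    """Convert scores to ranks, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks

def combine_rankings(columns, labels, scorers, k):
    """Average each feature's rank over all scorers and keep the k best."""
    all_ranks = [
        ranks_from_scores([scorer(col, labels) for col in columns])
        for scorer in scorers
    ]
    mean_rank = [mean([r[i] for r in all_ranks]) for i in range(len(columns))]
    return sorted(range(len(columns)), key=lambda i: mean_rank[i])[:k]

# Toy data (assumed for illustration): one list per feature, six samples.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [0.1, 0.2, 0.1, 0.9, 1.0, 0.8],  # tracks the label closely
    [5, 5, 5, 5, 5, 5],              # constant, uninformative
    [0.3, 0.9, 0.1, 0.4, 0.8, 0.2],  # mostly noise
]

selected = combine_rankings(columns, labels, [abs_correlation, class_mean_gap], k=2)
print(selected)  # → [0, 2]
```

Averaging ranks rather than raw scores keeps scorers with different scales comparable, which is one reason rank aggregation is often chosen when a single stable feature subset is wanted from several selectors.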

Notes

  1. Available at http://www.mathworks.com/matlabcentral/fileexchange/47129-information-theoretic-feature-selection.

Acknowledgments

This paper was partially supported by CNPq Universal Grant no. 480997/2013-6 and the UFRN scholarship program.

Author information

Corresponding author

Correspondence to Daniel Araújo.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Araújo, D., Jesus, J., Neto, A.D., Martins, A. (2016). A Combination Method for Reducing Dimensionality in Large Datasets. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. ICANN 2016. Lecture Notes in Computer Science, vol 9887. Springer, Cham. https://doi.org/10.1007/978-3-319-44781-0_46

  • DOI: https://doi.org/10.1007/978-3-319-44781-0_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44780-3

  • Online ISBN: 978-3-319-44781-0

  • eBook Packages: Computer Science, Computer Science (R0)
