Analysis of Company Growth Data Using Genetic Algorithms on Binary Trees

Gerrit K. Janssens,
Kenneth Sörensen,
Arthur Limère &
Koen Vanhoof

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3518)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

This paper investigates why some companies grow faster than others, by data mining a survey of a large number of companies in Flanders (the northern part of Belgium). Faster or slower average growth over a time period is explained by building a classification tree containing several categorical variables (both quantitative and qualitative). The technique used – called genAID – splits the population at different levels. It is inspired by the Automatic Interaction Detector (AID) technique to find trees that explain the variability in average growth but uses a genetic algorithm to overcome some of the drawbacks of AID.

Classical AID and other tree-growing techniques usually generate a single tree for interpretation. This approach has been criticized because artifacts in the data may produce spurious interactions. genAID instead offers the user-analyst a set of trees: the best ones found over a number of generations of the genetic algorithm. The user-analyst can then choose a tree by trading off explanatory power against either ease of understanding or conformity with an existing theory.
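
The idea of scoring splits by explained variability and keeping several good candidates, rather than a single winner, can be illustrated with a minimal Python sketch. It is not the authors' genAID implementation and contains no genetic search: it only scores one-level binary splits by the share of the total sum of squares they explain (the between-group criterion commonly associated with AID) and returns the top few. The function names, the column names (sector, size, growth) and the four data rows are hypothetical and chosen only for illustration.

```python
from statistics import mean


def explained_variance(rows, target, variable, left_categories):
    """Share of the total sum of squares of `target` explained by a
    binary split of `rows` on the categorical `variable`."""
    values = [r[target] for r in rows]
    grand = mean(values)
    total_ss = sum((v - grand) ** 2 for v in values)
    left = [r[target] for r in rows if r[variable] in left_categories]
    right = [r[target] for r in rows if r[variable] not in left_categories]
    if not left or not right or total_ss == 0:
        return 0.0  # degenerate split explains nothing
    between_ss = (len(left) * (mean(left) - grand) ** 2
                  + len(right) * (mean(right) - grand) ** 2)
    return between_ss / total_ss


def rank_splits(rows, target, candidates, k=2):
    """Return the k best candidate splits, best first, so that an
    analyst can compare several alternatives instead of only one."""
    scored = [(explained_variance(rows, target, var, cats), var, sorted(cats))
              for var, cats in candidates]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Illustrative (made-up) data: average growth per company.
rows = [
    {"sector": "A", "size": "small", "growth": 1.0},
    {"sector": "A", "size": "large", "growth": 3.0},
    {"sector": "B", "size": "small", "growth": 5.0},
    {"sector": "B", "size": "large", "growth": 7.0},
]
candidates = [("size", {"small"}), ("sector", {"A"})]
best = rank_splits(rows, "growth", candidates)
for score, var, cats in best:
    print(f"{var} in {cats}: explains {score:.0%} of the variability")
```

In this toy data the split on sector explains more of the variability than the split on size, yet both are returned, which mirrors the trade-off described above: the analyst may prefer a slightly weaker split that is easier to interpret or that agrees with existing theory.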

Author information

Authors and Affiliations

Faculty of Applied Economics, Data Analysis and Modelling Research Group (DAM), Limburg University Centre, B–3590, Diepenbeek, Belgium
Gerrit K. Janssens & Koen Vanhoof
Faculty of Applied Economics, University of Antwerp, B–2000, Antwerp, Belgium
Kenneth Sörensen
Faculty of Applied Economics, Financial Management Research Group (FIM), Limburg University Centre, B–3590, Diepenbeek, Belgium
Arthur Limère

Editor information

Editors and Affiliations

Japan Advanced Institute of Science and Technology, Asahidai 1-1, 923-12292, Nomi, Japan
Tu Bao Ho
University of Hong Kong, Pokfulam Road, Hong Kong, China
David Cheung
Department of Computer Science and Engineering, Arizona State University, Tempe, Arizona, USA
Huan Liu

About this paper

Cite this paper

Janssens, G.K., Sörensen, K., Limère, A., Vanhoof, K. (2005). Analysis of Company Growth Data Using Genetic Algorithms on Binary Trees. In: Ho, T.B., Cheung, D., Liu, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2005. Lecture Notes in Computer Science, vol 3518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11430919_29

DOI: https://doi.org/10.1007/11430919_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26076-9
Online ISBN: 978-3-540-31935-1
eBook Packages: Computer Science, Computer Science (R0)
