Abstract
Many data mining papers open by claiming that the exponential growth in the amount of data provides great opportunities for data mining. Reality can be different, though. In real-world applications, the number of sources over which this data is fragmented can grow at an even faster rate, creating barriers to the widespread application of data mining and leading to missed business opportunities. Let us illustrate this paradox with a motivating example from database marketing.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
van der Putten, P., Kok, J.N. (2010). Using Data Fusion to Enrich Customer Databases with Survey Data for Database Marketing. In: Casillas, J., Martínez-López, F.J. (eds) Marketing Intelligent Systems Using Soft Computing. Studies in Fuzziness and Soft Computing, vol 258. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15606-9_11
DOI: https://doi.org/10.1007/978-3-642-15606-9_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15605-2
Online ISBN: 978-3-642-15606-9
eBook Packages: Engineering, Engineering (R0)