Abstract
In business applications such as direct marketing, decision-makers are required to choose the action which best maximizes a utility function. Cost-sensitive learning methods can help them achieve this goal. In this paper, we introduce Pessimistic Active Learning (PAL). PAL employs a novel pessimistic measure, which relies on confidence intervals and is used to balance the exploration/exploitation trade-off. In order to acquire an initial sample of labeled data, PAL applies orthogonal arrays of fractional factorial design. PAL was tested on ten datasets using a decision tree inducer. A comparison of these results to those of other methods indicates PAL’s superiority.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Blake CL, Merz CJ (1998), UCI repository of machine learning databases. University of California, Department of Information and Computer Science, Irvine, CA. http://www.ics.uci.edu/~mlearn/MLRepository.html
Buchner A, Mulvenna M (1998) Discovering internet marketing intelligence through online analytical web usage mining. SIGMOD Record 27(4): 54–61
Cestnik B (1990) Estimating probabilities: a crucial task in machine learning. In: Proceedings of the ninth European conference on artificial intelligence, pp 147–149
Cohn D, Atlas L, Ladner R (1994) Improving generalization with active learning. Mach Learn 15(2): 201–221
Cohn DA, Ghahramani Z, Jordan MI (1996) Active learning with statistical models. J Artif Intell Res 4: 129–145
Clarke P (2006) Christmas gift giving involvement. J Consumer Market 23(5): 283–291
Demsar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7: 1–30
Domingos P (2005) Mining social networks for viral marketing. IEEE Intell Syst 20(1): 80–82
Fong PWL (1995) A quantitative study of hypothesis selection. In: Twelfth international conference on machine learning, Tahoe City, California, pp 226–234
Hedayat AS, Sloane NJA, Stufken J (1999) Orthogonal arrays: theory and applications. Springer-Verlag, NY
Hwang J, Ding A (1997) Prediction intervals for artificial neural networks. J Am Stat Assoc 92: 748–757
Kaelbling LP (1993) Learning in embedded systems. MIT Press
Kaelbling LP, Littman M (1996) Reinforcement learning: a survey. J Artif Intell Res 4: 237–285
Kirkpatrick S, Gelatt CD, Vecchi MP (1983) Optimization by simulated annealing. Science 220(4598): 671–680
Kyriakopoulos K, Moorman C (2004) Tradeoffs in marketing exploitation and exploration strategies: the overlooked role of market orientatio. Int J Res Market 21: 219–240
Leemis LM, Trivedi KS (1996) A comparison of approximate interval estimators for the Bernoulli parameter. Am Stat 50: 63–68
Levin N, Zahavi J (2005) Data mining for target marketing. In: Maimon O, Rokach L (eds) The data mining and knowledge discovery handbook, pp 1261–1301
Lewis D, Gale W (1994) A sequential algorithm for training text classifiers. In: Proceedings of the international ACM-SIGIR conference on research and development in information retrieval, pp 3–12
Ling CX, Li C (1998) Data mining for direct marketing: problems and solutions. In: Proceedings 4th international conference on knowledge discovery in databases (KDD-98), New York, pp 73–79
Margineantu D (2005) Active cost-sensitive learning. In: Proceedings of the nineteenth international joint conference on artificial intelligence, IJCAI-05
Mayer UF, Sarkissian A (2003) Experimental design for solicitation campaigns. In: Proceedings of the ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM Press, New York, NY, USA, pp 717–722
Montgomery DC (1997) Design and analysis of experiments, 4th edn. Wiley, New York
Pednault E, Abe N, Zadrozny B (2002) Sequential cost-sensitive decision making with reinforcement learning. In: Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pp 259–268
Percus OE, Percus JK (1984) Modified Bayes technique in sequential clinical trials. Comput Biol Med 14: 127–134
Petkau AJ (1978) Sequential medical trials for comparing an experimental with a standard treatment. J Am Stat Assoc 73: 328–338
Potharst R, Kaymak U, Pijls W (2002) Neural networks for target selection in direct marketing. In: Smith KA, Gupta JND (eds) Neural networks in business: techniques and applications, Idea Group Publishing, pp 89–110
Putten P, Someren M (eds) (2000) CoIL Challenge 2000: The Insurance Case. Published by Sentient Machine Research, Amsterdam. Also a Leiden Institute of Advanced Computer Science Technical Report 2000–09. June 22
Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann
Robbins H (1952) Some aspects of the sequential design of experiments. Bull Am Math Soc 55: 527–535
Rokach L, Maimon O (2005) Top down induction of decision trees classifiers: a survey. IEEE SMC Trans Part C 35(4): 476–487
Rothaermel FT, Deeds DL (2004) Exploration and exploitation alliances in biotechnology: a system of new product development. Strateg Manage J 25(3): 201–217
Roy N, McCallum A (2001) Toward optimal active learning through sampling estimation of error reduction. In: Proceedings of the international conference on machine learning
Saar-Tsechansky M, Provost F (2004) Active sampling for class probability estimation and ranking. Mach Learn 54(2): 153–178
Saar-Tsechansky M, Provost F (2007) Decision-centric active learning of binary-outcome models. Inform Syst Res 18(1): 4–22
Schein AI, Ungar LH (2007) Active learning for logistic regression: an evaluation. Mach Learn 68(3): 235–265
Sloane NJA (2007) A library of orthogonal arrays. http://www.research.att.com/~njas/oadir/index.html
Sofroniou N, Hutcheson GD (2002) Confidence intervals for the predictions of logistic regression in the presence and absence of a variance-covariance matrix. Understand Stat 1(1): 3–18
Strehl AL, Littman ML (2005) A theoretical analysis of model-based interval estimation. In: Proceedings of the 22nd international conference on Machine learning, August 07–11, 2005, Bonn, Germany, pp 856–863
Sutton RS, Barto AG (1998) Reinforcement learning: an introduction. MIT Press. http://www.cs.ualberta.ca/~sutton/book/ebook/the-book.html
Turney P (2000) Types of cost in inductive concept learning. Workshop on Cost Sensitive Learning at the 17th ICML, Stanford, CA
Utgoff PE (1989) Incremental induction of decision trees. Mach Learn 4(2): 161–186
Viaene S, Baesens B, De Moor B, VanDen Poel D, Dedene G, Suykens JAK, Vanthienen J, Viaene S, Van Gestel T (2001) Knowledge discovery using least squares support vector machine classifiers: a direct marketing case. Int J Intell Syst 16(9): 1023–1036
Weiss G, Tian T (2008) Maximizing classifier utility when there are data acquisition and modeling costs. Data Min Knowl Disc
Wiering M, Schmidhuber J (1998) Efficient model-based exploration. In: Proceedings of the 5th international conference on simulation of adaptive behavior, pp 223–228
Yinghui Y (2004) New data mining and marketing approaches for customer segmentation and promotion planning on the Internet, Phd Dissertation, University of Pennsylvania, ISBN 0-496-73213-1
Zahavi J, Levin N (1995) Issues and problems in applying neural computing to target marketing. (with N. Levin). J Direct Market 9(1): 7–12
Zahavi J, Levin N (1997) Applying neural computing to target marketing. J Direct Market 11(1): 5–22
Zadrozny B (2005) One-benefit learning: cost-sensitive learning with restricted cost information. In: Proceedings of the workshop on utility-based data mining at the eleventh ACM SIGKDD international conference on knowledge discovery and data mining
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible editor: Maytal Saar-Tsechansky.
Rights and permissions
About this article
Cite this article
Rokach, L., Naamani, L. & Shmilovici, A. Pessimistic cost-sensitive active learning of decision trees for profit maximizing targeting campaigns. Data Min Knowl Disc 17, 283–316 (2008). https://doi.org/10.1007/s10618-008-0105-2
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-008-0105-2