Constructing and Understanding Customer Spending Prediction Models

Tran Tri Dang ORCID: orcid.org/0000-0002-4413-1147¹,
Khang Nguyen Hoang¹,
Long Bui Thanh¹,
Tien Nguyen Thi Thuy² &
Cuong Nguyen Quoc²

Abstract

Prediction models are used increasingly widely across many sectors, and FinTech (Financial Technology) is no exception. Many problems in FinTech can be framed as prediction problems. Notable examples include predicting the probability that a transaction is fraudulent and predicting the most suitable company to invest in under given constraints. This research focuses on customer spending prediction: specifically, how much a customer may spend in a period given her past purchases. Such information is crucial for optimal business planning and budgeting. As a first step in tackling this problem, this research explores how feasible it is for different statistical methods and machine learning algorithms to predict customer spending accurately. The methods investigated are Beta Geometric/Negative Binomial Distribution (BG/NBD), Gamma–Gamma, Linear Regression, Random Forest, and Light Gradient Boosting Machine (LightGBM). To make the prediction models and their results more accessible to average users, we use information visualization as the primary means of communicating with them. We hope this bridges the gap between prediction performance and users’ insight into the reasons behind that performance. With better insight, users can make more appropriate decisions when selecting a method or algorithm to build a prediction model under specific circumstances. The results of this research can also serve as a foundation for more in-depth future work on the same problem.
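
To make the problem statement concrete, the following is a minimal sketch of predicting a customer's spending in a later period from her spending in an earlier one. It is not the authors' pipeline: the transaction log, the cutoff date, and the helper names (`split_spend`, `fit_line`, `mae`, `rmse`) are all hypothetical, and a one-feature least-squares fit stands in for the Linear Regression baseline only as an illustration.

```python
from collections import defaultdict
from datetime import date
from math import sqrt

# Hypothetical transaction log: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("a", date(2023, 1, 5), 20.0),
    ("a", date(2023, 2, 10), 30.0),
    ("a", date(2023, 4, 2), 25.0),
    ("b", date(2023, 1, 20), 100.0),
    ("b", date(2023, 4, 15), 60.0),
    ("c", date(2023, 3, 1), 10.0),
    ("d", date(2023, 2, 14), 40.0),
    ("d", date(2023, 3, 20), 40.0),
    ("d", date(2023, 5, 1), 35.0),
]


def split_spend(transactions, cutoff):
    """Return {customer: (spend before cutoff, spend on or after cutoff)}."""
    past = defaultdict(float)
    future = defaultdict(float)
    for customer, day, amount in transactions:
        if day < cutoff:
            past[customer] += amount
        else:
            future[customer] += amount
    # Only customers with purchase history are kept; no later purchase means 0.
    return {c: (past[c], future.get(c, 0.0)) for c in past}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


spend = split_spend(TRANSACTIONS, cutoff=date(2023, 4, 1))
xs = [past for past, _ in spend.values()]
ys = [future for _, future in spend.values()]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]

# In-sample errors only; a real evaluation would hold out customers or periods.
print(f"MAE={mae(ys, predictions):.2f}  RMSE={rmse(ys, predictions):.2f}")
```

The sketch reports both MAE and RMSE because they weight large errors differently. A realistic study would add further features (for example purchase frequency and recency) and evaluate on data the model has not seen.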

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

School of Science, Engineering & Technology (SSET), RMIT University, Ho Chi Minh City, Vietnam
Tran Tri Dang, Khang Nguyen Hoang & Long Bui Thanh
Skedulo Holdings, Inc., San Francisco, USA
Tien Nguyen Thi Thuy & Cuong Nguyen Quoc

Corresponding author

Correspondence to Tran Tri Dang.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Future Data and Security Engineering 2022” guest edited by Tran Khanh Dang.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dang, T.T., Hoang, K.N., Thanh, L.B. et al. Constructing and Understanding Customer Spending Prediction Models. SN COMPUT. SCI. 4, 852 (2023). https://doi.org/10.1007/s42979-023-02284-0

Received: 08 May 2023
Accepted: 27 August 2023
Published: 08 November 2023
DOI: https://doi.org/10.1007/s42979-023-02284-0
