CN113469730A - Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene - Google Patents

Info

Publication number
CN113469730A
Authority
CN
China
Prior art keywords
prediction
repurchase
lightgbm
sample
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110637643.7A
Other languages
Chinese (zh)
Inventor
吴军
杨李平
牛夏夏
石力
李圆圆
孙李傲
宋鑫玉
郝伟怡
宋思聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Chemical Technology
Original Assignee
Beijing University of Chemical Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Chemical Technology
Priority to CN202110637643.7A
Publication of CN113469730A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00: Commerce
    • G06Q30/02: Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201: Market modelling; Market analysis; Collecting market data
    • G06Q30/0202: Market predictions or forecasting for commercial activities
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133: Distances to prototypes
    • G06F18/24137: Distances to cluster centroïds
    • G06F18/2414: Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/243: Classification techniques relating to the number of classes
    • G06F18/24323: Tree-organised classifiers
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/25: Fusion techniques
    • G06F18/254: Fusion techniques of classification results, e.g. of results related to same input data
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Software Systems (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Medical Informatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a customer repurchase prediction method and device based on an RF-LightGBM fusion model in a non-contract scenario. The method comprises the following steps: acquiring historical data of a user, and performing preprocessing and feature engineering on the historical data; taking the preprocessed data as samples, and balancing the sample set using the SMOTE-ENN method; performing hyper-parameter optimization on the random forest algorithm and the LightGBM algorithm with the TPE optimization algorithm to construct weak classifiers; and performing ensemble learning on the training samples with the weak classifiers to obtain a strong classifier, which yields the final repurchase prediction result. The method analyzes the consumption data of an enterprise's existing customers, accurately predicts their repurchase behavior, and uses the prediction to guide customer relationship management decisions and precision marketing strategies, thereby improving the marketing conversion rate and reducing enterprise operating costs.

Description

Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene
Technical Field
The invention relates to the technical field of computers, in particular to a customer repurchase prediction method and device based on an RF-LightGBM fusion model under a non-contract scene.
Background
With the advent of the big data era, predicting consumers' future purchase intentions from massive historical transaction data has become an important issue in enterprise management. Predicting repeat purchase behavior in a non-contract scenario mainly means predicting whether a customer will purchase the enterprise's products again when the enterprise and the customer have not signed a purchase contract. If consumers with repeat purchase intention can be predicted accurately, precision marketing can match customer needs more closely, raise the value of new consumers, and convert them into loyal customers.
In the prior art, a Chinese invention patent (publication No. CN109146533B) discloses an information push method and apparatus. It obtains at least two pieces of order information of a user for items of the same item type, determines the user's average daily consumption of that item type over the interval based on the purchase amounts in the at least two pieces of order information, and determines, based on the average daily consumption and the purchase amount of the latest order, a push date for pushing item information associated with that item type to the user's terminal, thereby improving the effectiveness of information push. Another Chinese invention patent (publication No. CN108171530B) discloses a method and device for increasing the customer unit price and the repurchase rate, comprising: selecting historical marketing data of a target store to obtain the historical marketing campaign effect, and obtaining an initial estimate of the target store's marketing campaign effect from the historical marketing data and the historical marketing campaign effect; and constructing threshold adjustment factors according to the ratio of the historical marketing activities of all stores meeting the threshold order number and meeting the customer order number, calibrating the estimated marketing campaign effect of the target store with the threshold adjustment factors, and obtaining the final estimate of the target store's marketing campaign effect, thereby solving the problem that existing promotion-effect evaluation techniques cannot estimate the marketing campaign effect accurately as the threshold changes. Although this prior art achieves product recommendation and effect prediction from historical data, it cannot accurately predict customer behavior.
Existing machine learning methods are widely applied to customer behavior prediction, but most of them focus on prediction in e-commerce scenarios. In the prior art, a Chinese invention application (publication No. CN110956497A) discloses a method for predicting the repeat purchase behavior of e-commerce platform users, comprising: obtaining historical purchase behavior data of a user; fusing a deep Catboost individual model, a double-layer attention BiGRU individual model and a DeepGBM individual model; and modeling the discrete purchase record values and the behavior sequence features in the user's historical purchase data, which improves the accuracy of the prediction result. Another Chinese invention application (publication No. CN108520469A) discloses a user repurchase behavior analysis method based on an e-commerce platform, which selects the valid purchase records of users in a statistical period; cleans the data; labels each valid purchase record with whether it is a repeat purchase, whether it is a repeat purchase on the same platform, and whether it is a repeat purchase of the same insurance type; counts the total number of purchasing users, the number of repeat purchasing users, and the total and repeat purchasing users for each platform and for each insurance type; and calculates the repeat purchase rate, the per-platform repeat purchase rate and the per-insurance-type repeat purchase rate in the statistical period. However, an e-commerce scenario retains "implicit" customer feedback such as favorites and likes, which is not available in the broader non-contract scenario.
Moreover, current machine learning work mainly focuses on combining algorithms and ignores the influence of the data set on the prediction result. In purchasing situations, customers who purchase repeatedly are generally fewer than customers who purchase only once, so the data suffer from class imbalance, which often causes the model to overfit and lowers prediction accuracy.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a client repurchase prediction method and a client repurchase prediction device based on an RF-LightGBM fusion model under a non-contract scene, and the invention adopts the following technical scheme:
a customer repurchase prediction method based on an RF-LightGBM fusion model under a non-contract scene comprises the following steps:
acquiring historical purchase record data of a user, preprocessing the historical purchase record data and extracting features;
balancing the data subjected to the feature extraction by using a sample balancing method to obtain a balanced sample;
training on the sample data by using an optimization algorithm, and iteratively optimizing each weak classifier within its specified hyper-parameter space;
performing ensemble learning to obtain a strong classifier by giving the same weight to each weak classifier;
predicting by using a strong classifier to obtain final results of product recommendation and repurchase behavior prediction;
and pushing product information to the terminal equipment of the user and/or sending a re-purchasing behavior prediction result to a management system according to the final result.
Further, the extracting features includes:
time of last purchase, frequency of purchases, total amount of purchases, duration of relationship, purchase interval.
Further, the sample equalization method comprises:
generating minority-class samples from the extracted features by using the SMOTE oversampling method, checking the generated samples by using the ENN (Edited KNN) method, and removing a sample if its predicted label differs from its actual class label, to obtain balanced samples.
Further, the optimization algorithm comprises:
optimizing the model hyper-parameters by using the TPE (Tree-structured Parzen Estimator) optimization algorithm, and training the model under the optimal hyper-parameters.
Further, the weak classifiers comprise a random forest (RF) model and a LightGBM model, the outputs of the weak classifiers are classification probability values, and the mathematical expression is as follows:
$$P(x \mid y) = \frac{1}{N_{tree}} \sum_{i=1}^{N_{tree}} h_i(x \mid y)$$
where $N_{tree}$ is the total number of decision trees, $h_i$ is the i-th decision tree, and $P(x \mid y)$ represents the probability that the prediction sample x belongs to the class y.
Further, the ensemble learning specifically includes:
the RF model and the LightGBM model are given the same weight and are integrated by soft voting on the basis of their prediction probabilities, and the mathematical expression is as follows:
$$P_{Soft\,Voting} = (P_{RF} + P_{LightGBM}) / 2$$
$$Result = \begin{cases} 1, & P_{Soft\,Voting} > threshold \\ 0, & \text{otherwise} \end{cases}$$
where $P_{Soft\,Voting}$ is the prediction probability of the soft voting fusion model, $P_{RF}$ and $P_{LightGBM}$ are the prediction probabilities of the random forest and LightGBM models respectively, Result is the prediction result of the fusion model (1 means the user belongs to the repurchase type, 0 means the user belongs to the non-repurchase type), and threshold is the classification threshold.
Further, the repurchase behavior prediction and the repurchase probability prediction serve as a guide for product recommendation.
The invention also provides a client repurchase prediction device based on the RF-LightGBM fusion model under a non-contract scene, which comprises the following components:
the acquisition module is used for acquiring historical purchase record data of a user, preprocessing the historical purchase record data and extracting features;
the balance module is used for balancing the data subjected to the feature extraction by using a sample balance method to obtain a balanced sample;
the optimization training module is used for training sample data by using an optimization algorithm and performing iterative optimization on the weak classifier in the specified hyper-parameter space of the weak classifier;
the ensemble learning module is used for performing ensemble learning to obtain a strong classifier by endowing the weak classifiers with the same weight;
the prediction module is used for predicting by using the strong classifier to obtain final results of product recommendation and repurchase behavior prediction;
and the pushing module is used for pushing the product information to the terminal equipment of the user according to the final result.
The invention also includes an electronic device comprising:
a processor, and a memory;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to perform the customer repurchase prediction method based on an RF-LightGBM fusion model in a non-contract scenario as described above.
The present invention also includes a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the customer repurchase prediction method based on an RF-LightGBM fusion model in a non-contract scenario as described above.
The invention achieves the following beneficial effects: analyzing according to the existing user purchasing behavior records of the enterprise, accurately predicting the existing user re-purchasing condition, and guiding a customer relationship management strategy and a marketing strategy according to the situation, so that the marketing conversion rate is improved, and the related operation cost is reduced; based on the purchasing behavior data of the customers, the re-purchasing behavior of the customers on the commodities is accurately predicted, the actual effective requirements of the customers are met, and meanwhile the enterprise communication cost can be reduced; the enterprise operation strategy is dynamically guided by the data, the data promotes decision making and assists in achieving the product marketing goal, and finally the goal of recommending a proper product to a proper user in an intelligent mode is achieved.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings. The following examples are only for illustrating the technical solutions of the present invention more clearly, and the protection scope of the present invention is not limited thereby. It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. When the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
As shown in fig. 1, the embodiment discloses a customer repurchase prediction method based on an RF-LightGBM fusion model in a non-contract scenario, which includes the following steps:
(1) Acquiring historical purchase record data of the user, preprocessing the historical purchase record data and extracting features. The historical purchase record data is data that already exists. The extracted features include: time of last purchase (R), frequency of purchases (F), total amount of purchases (M), duration of relationship (S), purchase interval (T).
(2) Balancing the data after feature extraction by using the SMOTE-ENN method to obtain the model training set. Sampling with replacement is applied multiple times to each class of samples in the original sample set to form the test samples.
(3) Training on the training sample data by using the TPE optimization algorithm, and iteratively optimizing each weak classifier within its specified hyper-parameter space.
(4) Assigning the same weight to each weak classifier, performing ensemble learning to obtain a strong classifier, and obtaining the final result for product recommendation and repeat purchase behavior prediction.
In this embodiment, the preprocessing step includes: to facilitate computer processing and user labeling, character-type data is converted to numerical data, and numerical data is converted to date-type data. The extracted features are the last purchase time (R), purchase frequency (F), total purchase amount (M), relationship duration (S) and purchase interval (T):
a) R: the time since the customer's last purchase of the product, in the form:
$$R = T_{last\_time} - T_{plast\_time}$$
where $T_{last\_time}$ denotes the end time of the reference period and $T_{plast\_time}$ denotes the time of the customer's last order for the item within the reference period.
b) F: the number of purchases made by the customer over the observation period.
c) M: the customer's total purchase amount for the product, in the form:
$$M = \sum_{i=1}^{n} m_i$$
where n represents the total number of purchases by the customer in the reference period and $m_i$ represents the amount of the i-th single purchase.
d) S: the time interval from the customer's first transaction to the last transaction within the reference period, in the form:
$$S = T_{plast\_time} - T_{pfirst\_time}$$
where $T_{plast\_time}$ denotes the time of the customer's last order for the item within the reference period and $T_{pfirst\_time}$ denotes the time of the customer's first order for the item within the reference period.
e) T: the customer's average transaction time interval over a period of time, in the form:
Figure BDA0003105816590000072
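The five features above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `rfmst_features` and the `(date, amount)` order layout are hypothetical, and because the formula for T is not recoverable from the text, the sketch assumes T = S / (F - 1), the mean gap between consecutive orders.

```python
from datetime import date

def rfmst_features(orders, period_end):
    """Compute R, F, M, S, T for one customer.

    orders: list of (order_date, amount) tuples inside the reference period.
    period_end: end of the reference period (T_last_time in the text).
    """
    if not orders:
        raise ValueError("customer has no orders in the reference period")
    dates = sorted(d for d, _ in orders)
    first, last = dates[0], dates[-1]
    r = (period_end - last).days      # R = T_last_time - T_plast_time
    f = len(orders)                   # F = number of purchases
    m = sum(a for _, a in orders)     # M = sum of single-purchase amounts
    s = (last - first).days           # S = T_plast_time - T_pfirst_time
    # Assumption (not from the source): T is the mean gap between orders,
    # set to 0 for a customer with a single purchase.
    t = s / (f - 1) if f > 1 else 0.0
    return {"R": r, "F": f, "M": m, "S": s, "T": t}

orders = [(date(2021, 1, 5), 120.0),
          (date(2021, 2, 4), 80.0),
          (date(2021, 3, 6), 50.0)]
features = rfmst_features(orders, period_end=date(2021, 3, 31))
```

All durations are expressed in days here; any other consistent unit would work equally well.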
The invention uses the SMOTE-ENN method to handle imbalanced samples. It works well on binary classification problems with only a small number of positive samples, and it performed better than the other methods it was compared against. The SMOTE-ENN method comprises the following steps:
(1) SMOTE method (Synthetic Minority Oversampling Technique):
Let A denote the minority class. For any $X_i \in A$, compute the Euclidean distance from $X_i$ to every other sample in the minority-class set A to obtain the k nearest neighbors of $X_i$, and randomly select samples $X_{ij}$ (j = 1, 2, ..., n) from those neighbors. A new minority-class sample $Y_j$ is constructed by random linear interpolation between $X_i$ and $X_{ij}$ (j = 1, 2, ..., n):
$$Y_j = X_i + rand(0,1) \times (X_{ij} - X_i)$$
where rand(0,1) represents a random number in the interval (0, 1).
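The interpolation step can be illustrated with a small standard-library sketch. The function name `smote`, its parameters and the sample data are hypothetical; a production pipeline would more likely rely on an existing resampling library.

```python
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x_i = rng.choice(minority)
        # k nearest minority neighbours of x_i by Euclidean distance
        others = [x for x in minority if x is not x_i]
        neighbours = sorted(others, key=lambda x: math.dist(x_i, x))[:k]
        x_ij = rng.choice(neighbours)
        gap = rng.random()  # plays the role of rand(0, 1)
        # Y_j = X_i + rand(0, 1) * (X_ij - X_i), applied per feature
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x_i, x_ij)))
    return synthetic

minority = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
new_samples = smote(minority, n_new=6, k=3)
```

Every synthetic point lies on the segment between two existing minority samples, so it stays inside the region the minority class already occupies.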
(2) ENN method (Edited KNN):
Each sample in the data set ND generated by the SMOTE method is predicted with the K-nearest-neighbor method (K = 5), and the sample is rejected if the predicted label differs from its actual class label. The Euclidean distance is used as the distance metric of the KNN algorithm, in the form:
$$d(x, y) = \sqrt{\sum_{i} (x_i - y_i)^2}$$
where x and y represent two different users and i is the feature index.
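A minimal sketch of the ENN cleaning rule, again using only the standard library. The name `enn_filter` and the toy data are illustrative assumptions; the sketch predicts each sample from its K nearest other samples by majority vote and keeps it only when the vote agrees with its label.

```python
import math
from collections import Counter

def enn_filter(samples, labels, k=5):
    """Return indices of samples whose KNN-predicted label matches their label."""
    kept = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Rank all other samples by Euclidean distance to x
        order = sorted((j for j in range(len(samples)) if j != i),
                       key=lambda j: math.dist(x, samples[j]))
        votes = Counter(labels[j] for j in order[:k])
        predicted = votes.most_common(1)[0][0]
        if predicted == y:
            kept.append(i)
    return kept

samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),   # class 0 cluster
           (5.0, 5.0), (5.0, 6.0), (6.0, 5.0), (6.0, 6.0),   # class 1 cluster
           (0.5, 0.5)]                                       # labelled 1, sits in class 0
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
kept = enn_filter(samples, labels, k=5)
```

In this toy data the last point is labelled 1 but is surrounded by class 0 samples, so it is the only one removed.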
A hyper-parameter configuration space is specified for each weak classifier, and the TPE (Tree-structured Parzen Estimator) optimization algorithm iteratively optimizes over that space on the sample set constructed by the SMOTE-ENN method. The optimization objective is:
$$x^* = \arg\min_{x \in \chi} F(x)$$
where F(x) represents the objective function of the weak learner and $x^*$ is the hyper-parameter setting at which the best result is obtained.
The TPE algorithm density is defined as:
$$p(x \mid y) = \begin{cases} l(x), & y < y^* \\ g(x), & y \ge y^* \end{cases}$$
where l(x) is the density formed from the observations $\{x_i\}$ whose objective value F(x) is less than $y^*$, and g(x) is the density formed from the observations $\{x_i\}$ whose objective value F(x) is greater than or equal to $y^*$. $y^*$ is chosen as the quantile γ of the observed values y. The Expected Improvement (EI) is:
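$$EI_{y^*}(x) = \int_{-\infty}^{y^*} (y^* - y)\, p(y \mid x)\, dy \propto \left( \gamma + \frac{g(x)}{l(x)} (1 - \gamma) \right)^{-1}$$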
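To make the l(x)/g(x) split concrete, here is a deliberately simplified one-dimensional TPE loop. It is a toy sketch under several assumptions that are not in the source: a fixed-bandwidth Gaussian kernel density for both l and g, candidates drawn around the "good" observations, and hypothetical names (`kde`, `tpe_minimize`). A real hyper-parameter search would normally use an existing TPE implementation.

```python
import math
import random

def kde(points, bandwidth):
    """Return a Gaussian kernel density estimate built from `points`."""
    def density(x):
        total = sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)
        return total / (len(points) * bandwidth * math.sqrt(2 * math.pi))
    return density

def tpe_minimize(objective, low, high, n_init=10, n_iter=40,
                 gamma=0.25, n_candidates=24, seed=0):
    """Toy 1-D TPE: returns the best (x, F(x)) pair found."""
    if n_init < 2:
        raise ValueError("n_init must be at least 2")
    rng = random.Random(seed)
    bandwidth = (high - low) / 10
    history = [(x, objective(x))
               for x in (rng.uniform(low, high) for _ in range(n_init))]
    for _ in range(n_iter):
        history.sort(key=lambda pair: pair[1])
        n_good = max(1, int(gamma * len(history)))
        good = [x for x, _ in history[:n_good]]   # observations below y*
        bad = [x for x, _ in history[n_good:]]    # observations at or above y*
        l_density, g_density = kde(good, bandwidth), kde(bad, bandwidth)
        # Draw candidates near the good observations, clipped to the bounds
        candidates = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                      for _ in range(n_candidates)]
        # EI grows with l(x) / g(x), so evaluate the candidate maximising it
        best = max(candidates, key=lambda x: l_density(x) / (g_density(x) + 1e-12))
        history.append((best, objective(best)))
    return min(history, key=lambda pair: pair[1])

best_x, best_y = tpe_minimize(lambda x: (x - 2.0) ** 2, low=-5.0, high=5.0)
```

Choosing the candidate with the largest l(x)/g(x) ratio is what maximising the expected improvement reduces to, which is why the loop never needs to evaluate the EI integral itself.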
The output of the random forest model is the average of the probabilities of all decision trees, and the mathematical expression is as follows:
$$P(x \mid y) = \frac{1}{N_{tree}} \sum_{i=1}^{N_{tree}} h_i(x \mid y)$$
where $N_{tree}$ is the total number of decision trees, $h_i$ is the i-th decision tree, and $P(x \mid y)$ represents the probability that the prediction sample x belongs to the class y.
The LightGBM model also outputs classification probabilities using the method described above.
The RF model and the LightGBM model are given the same weight and are integrated by soft voting on the basis of their prediction probabilities. The mathematical expression is as follows:
$$P_{Soft\,Voting} = (P_{RF} + P_{LightGBM}) / 2$$
$$Result = \begin{cases} 1, & P_{Soft\,Voting} > threshold \\ 0, & \text{otherwise} \end{cases}$$
where $P_{Soft\,Voting}$ is the prediction probability of the soft voting fusion model, $P_{RF}$ and $P_{LightGBM}$ are the prediction probabilities of the random forest and LightGBM models respectively, and Result is the prediction result of the fusion model: 1 means the user belongs to the repurchase type and 0 means the user belongs to the non-repurchase type. Based on testing, the threshold of the invention is set to 0.5; the predicted label is 1 when $P_{Soft\,Voting}$ is greater than 0.5 and 0 when it is less than 0.5, so that a prediction matrix is obtained
Figure BDA0003105816590000094
Therefore, the forecasting of the repeated purchasing behavior of the customer can be realized.
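The averaging and thresholding described above amount to very little code. The sketch below uses hypothetical function names and made-up probabilities; it also assumes that a fused probability exactly equal to the threshold maps to 0, a case the text does not specify.

```python
def forest_probability(tree_probs):
    """Average the class probabilities output by the individual trees."""
    return sum(tree_probs) / len(tree_probs)

def soft_vote(p_rf, p_lgbm, threshold=0.5):
    """Fuse two probability lists with equal weights and threshold the result."""
    if len(p_rf) != len(p_lgbm):
        raise ValueError("probability lists must have the same length")
    fused = [(a + b) / 2 for a, b in zip(p_rf, p_lgbm)]
    # Assumption: ties (p == threshold) are assigned to class 0
    labels = [1 if p > threshold else 0 for p in fused]
    return fused, labels

p_tree_avg = forest_probability([1.0, 0.5, 0.75])
p_rf = [0.75, 0.25, 0.5]      # illustrative RF probabilities of repurchase
p_lgbm = [0.75, 0.5, 0.5]     # illustrative LightGBM probabilities
fused, labels = soft_vote(p_rf, p_lgbm)
```

Because both models receive weight 1/2, a confident model can outvote an uncertain one, which is the main difference from hard (majority) voting on labels.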
And pushing product information to the terminal equipment of the user and/or sending a re-purchasing behavior prediction result to a management system according to the final result.
The performance of the invention is measured as follows: accuracy P, recall R and the F1 value are used as evaluation indexes. After the data preprocessing method of the invention is applied, the evaluation indexes are calculated from the resulting label matrix using the following formulas:
Figure BDA0003105816590000101
Figure BDA0003105816590000102
Figure BDA0003105816590000103
The invention performs well in the multi-channel marketing process of enterprises in a non-contract scenario. Taking telemarketing for supermarket retail as an example, applying the system can greatly improve the telemarketing conversion rate and promote more transactions. For enterprises, it can improve marketing guidance, increase the sales success rate, increase the number of completed orders and the transaction amount, and reduce personnel costs. The performance on the data set is as follows: (1) on the training set generated by SMOTE-ENN, the model prediction accuracy is 98.73%, the recall is 99.09%, and the F1 value is 0.9874; (2) on a validation set consisting of real samples, the model prediction accuracy is 87.13%, the recall is 95.15%, and the F1 value is 0.8587; (3) these results are better than the prediction performance of the single RF and LightGBM models.
By improving the classic RFM model, the invention extracts user behavior features from the explicit feedback in customers' historical purchase records to form a sample set, which addresses the prior-art problem that large amounts of implicit feedback are unavailable in non-contract scenarios; the SMOTE-ENN sample balancing method effectively addresses the class imbalance of the data set in the prior art; and the results of the embodiment show that the method has good prediction performance and practical application value.

Claims (10)

1. A customer repurchase prediction method based on an RF-LightGBM fusion model under a non-contract scene is characterized by comprising the following steps:
acquiring historical purchase record data of a user, preprocessing the historical purchase record data and extracting features;
balancing the data subjected to the feature extraction by using a sample balancing method to obtain a balanced sample;
training on the sample data by using an optimization algorithm, and iteratively optimizing each weak classifier within its specified hyper-parameter space;
performing ensemble learning to obtain a strong classifier by giving the same weight to each weak classifier;
predicting by using a strong classifier to obtain final results of product recommendation and repurchase behavior prediction;
and pushing product information to the terminal equipment of the user and/or sending a re-purchasing behavior prediction result to a management system according to the final result.
2. The method of claim 1, wherein the extracting features comprises:
time of last purchase, frequency of purchases, total amount of purchases, duration of relationship, purchase interval.
3. The method of claim 1, wherein the sample equalization method comprises:
generating minority-class samples from the extracted features by using the SMOTE oversampling method, checking the generated samples by using the ENN (Edited KNN) method, and removing a sample if its predicted label differs from its actual class label, to obtain balanced samples.
4. The method of claim 1, wherein the optimization algorithm comprises:
optimizing the model hyper-parameters by using the TPE (Tree-structured Parzen Estimator) optimization algorithm, and training the model under the optimal hyper-parameters.
5. The customer repurchase prediction method based on the RF-LightGBM fusion model under the non-contract scene as claimed in claim 1, wherein the weak classifiers comprise a random forest (RF) model and a LightGBM model, the outputs of the weak classifiers are classification probability values, and the mathematical expression is as follows:
$$P(x \mid y) = \frac{1}{N_{tree}} \sum_{i=1}^{N_{tree}} h_i(x \mid y)$$
where $N_{tree}$ is the total number of decision trees, $h_i$ is the i-th decision tree, and $P(x \mid y)$ represents the probability that the prediction sample x belongs to the class y.
6. The method for predicting the customer repurchase based on the RF-LightGBM fusion model in the non-contract scenario as claimed in claim 1, wherein the ensemble learning specifically comprises:
the RF model and the Light GBM model are given the same weight, and are integrated by using a Soft Voting (Soft Voting) method on the basis of the prediction probability, and the mathematical expression form is as follows:
PSoft Voting=(PRF+PLightGBM)/2
Figure FDA0003105816580000022
wherein, PSoft VotingPrediction probability, P, for a soft voting fusion modelRF,PLightGBMThe prediction probabilities of the random forest and the LightGBM are respectively represented, Result represents the prediction Result of the fusion model, 1 represents that the user belongs to the repurchase type, 0 represents that the user belongs to the non-repurchase type, and threshold represents the classification threshold.
7. The customer repurchase prediction method based on the RF-LightGBM fusion model under the non-contract scene as claimed in claim 1, wherein the repurchase behavior prediction and the repurchase probability prediction are used as guidance for product recommendation.
8. A customer repurchase prediction device based on an RF-LightGBM fusion model under a non-contract scene, characterized by comprising:
the acquisition module is used for acquiring historical purchase record data of a user, preprocessing the historical purchase record data and extracting features;
the balance module is used for balancing the data subjected to the feature extraction by using a sample balance method to obtain a balanced sample;
the optimization training module is used for training sample data by using an optimization algorithm and performing iterative optimization on the weak classifier in the specified hyper-parameter space of the weak classifier;
the ensemble learning module is used for performing ensemble learning to obtain a strong classifier by endowing the weak classifiers with the same weight;
the prediction module is used for predicting by using the strong classifier to obtain final results of product recommendation and repurchase behavior prediction;
and the pushing module is used for pushing the product information to the terminal equipment of the user according to the final result.
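The module chain above can be pictured as a simple pipeline object. This is only a structural sketch: the class name, field names, and stub callables are hypothetical, and each stage is a placeholder for the processing the corresponding module would perform.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class RepurchaseDevice:
    acquire: Callable[[Any], Any]           # acquisition: preprocess and extract features
    balance: Callable[[Any], Any]           # balance: produce a balanced sample
    train: Callable[[Any], List[Callable]]  # optimization training: tuned weak classifiers
    ensemble: Callable[[List[Callable]], Callable]  # ensemble learning: strong classifier
    push: Callable[[Any], None]             # pushing: deliver result to the user terminal

    def run(self, history: Any, customers: List[Any]) -> List[Any]:
        features = self.acquire(history)
        balanced = self.balance(features)
        weak = self.train(balanced)
        strong = self.ensemble(weak)
        # Prediction module: apply the strong classifier to each customer.
        results = [strong(c) for c in customers]
        for r in results:
            self.push(r)
        return results

# Stub wiring for illustration; real modules would replace these lambdas.
sent = []
device = RepurchaseDevice(
    acquire=lambda history: history,
    balance=lambda features: features,
    train=lambda data: [lambda c: 0.8, lambda c: 0.4],
    ensemble=lambda weak: (lambda c: sum(w(c) for w in weak) / len(weak)),
    push=sent.append,
)
results = device.run(history=[], customers=["customer_a", "customer_b"])
```

Keeping each module behind a plain callable makes it straightforward to swap in a different balancing or tuning strategy without touching the rest of the chain.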
9. An electronic device, characterized in that:
comprises a processor and a memory;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions to perform the customer repurchase prediction method based on the RF-LightGBM fusion model under the non-contract scene according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program, when executed by a processor, implements the customer repurchase prediction method based on the RF-LightGBM fusion model under the non-contract scene as claimed in any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110637643.7A CN113469730A (en) 2021-06-08 2021-06-08 Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene

Publications (1)

Publication Number Publication Date
CN113469730A true CN113469730A (en) 2021-10-01

Family

ID=77869309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110637643.7A Pending CN113469730A (en) 2021-06-08 2021-06-08 Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene

Country Status (1)

Country Link
CN (1) CN113469730A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination