CN118504752A - Determination method of transaction risk prediction model, transaction risk prediction method, device, equipment, storage medium and program product

Info

Publication number: CN118504752A
Application number: CN202410605693.0A
Authority: CN
Inventors: 李钰; 焦勇博; 李思林; 李如旭
Original assignee: Industrial and Commercial Bank of China Ltd ICBC
Current assignee: Industrial and Commercial Bank of China Ltd ICBC
Priority date: 2024-05-16
Filing date: 2024-05-16
Publication date: 2024-08-16

Abstract

The invention provides a method for determining a transaction risk prediction model, which can be applied to the technical fields of big data and financial technology. The method comprises the following steps: for each of N transaction risk prediction candidate models, acquiring the training sample set used in training that candidate model; training M first initial models using the training sample set to obtain M test models corresponding to that candidate model; inputting the test samples into the M test models respectively and outputting M prediction results; performing consistency judgment processing on the M prediction results to generate a judgment result characterizing the stability of that candidate model; and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N candidate models. The present disclosure also provides a determination apparatus for the transaction risk prediction model, a device, a storage medium, and a program product.

Description

Determination method of transaction risk prediction model, transaction risk prediction method, device, equipment, storage medium and program product

Technical Field

The present disclosure relates to the technical fields of big data and financial technology, and in particular to a method for determining a transaction risk prediction model, a transaction risk prediction method, an apparatus, a device, a medium, and a program product.

Background

The rapid increase in financial business volume and the explosive growth of financial transaction data have, to some extent, made it harder for financial institutions to manage credit risk. To strengthen credit risk management, financial institutions typically classify customer risk using statistical analysis methods, such as classification algorithms.

In the course of implementing the concept of the present disclosure, the inventors found that the related art has at least the following problem: when a classification model is used to classify the transaction risk of clients, generally only the accuracy of the classification model is considered and its classification robustness is ignored, so the transaction risk classification result has limited reference value.

Disclosure of Invention

In view of the foregoing, the present disclosure provides a method of determining a transaction risk prediction model, a transaction risk prediction method, apparatus, device, medium, and program product.

According to a first aspect of the present disclosure, there is provided a method for determining a transaction risk prediction model, including: for each trained transaction risk prediction candidate model, acquiring the training sample set used in training that candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set is labeled with sample labels characterizing the real classification results of the sample clients, and the real classification results are related to the degree of transaction risk of the sample clients; training M first initial models using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model; inputting the test samples into the M test models respectively and outputting M prediction results, wherein the prediction results comprise classification results of test clients; performing consistency judgment processing on the M prediction results to generate a judgment result characterizing the stability of each transaction risk prediction candidate model; and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

According to an embodiment of the present disclosure, training M first initial models using a training sample set to obtain M test models corresponding to respective transaction risk prediction candidate models includes: dividing the training sample set into M training sample subsets; and respectively training M first initial models by using M training sample subsets to obtain M test models corresponding to each transaction risk prediction candidate model.
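As an illustrative sketch only, the division into M subsets and the training of one test model per subset might look as follows in Python. The disclosure does not specify how the training sample set is divided, so the random shuffle, the round-robin split into disjoint subsets, the function names, and the `fit` callback are all assumptions introduced for this example.

```python
import random


def split_into_subsets(samples, m, seed=0):
    """Shuffle labeled samples and deal them into m disjoint, near-equal subsets.

    Assumption: subsets are disjoint and randomly assigned; the source only
    says the training sample set is divided into M subsets.
    """
    if m < 1 or m > len(samples):
        raise ValueError("m must be between 1 and the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Subset i receives every m-th sample starting at position i.
    return [shuffled[i::m] for i in range(m)]


def train_test_models(samples, m, fit, seed=0):
    """Train one test model per subset.

    `fit` is a hypothetical training function that takes a list of labeled
    samples and returns a trained model.
    """
    return [fit(subset) for subset in split_into_subsets(samples, m, seed)]
```

Any classifier can be plugged in through `fit`; the split itself is independent of the model type.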

According to an embodiment of the present disclosure, performing consistency judgment processing on the M prediction results to generate a judgment result characterizing the stability of each transaction risk prediction candidate model includes: calculating the probability that the M prediction results are consistent; and generating, based on that probability, a judgment result characterizing the stability of each transaction risk prediction candidate model.

According to an embodiment of the present disclosure, determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models includes: sorting the N transaction risk prediction candidate models by stability based on their judgment results to obtain a sorting result; and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the sorting result.

According to an embodiment of the present disclosure, the method for determining a transaction risk prediction model further includes acquiring the N transaction risk prediction candidate models, which comprises: reading a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients from a database; dividing these data into L training sample sets, wherein L is greater than or equal to N; training L second initial models using the L training sample sets to obtain L initial transaction risk prediction models; calculating the transaction risk prediction error of each of the L initial transaction risk prediction models; and determining the N transaction risk prediction candidate models from the L initial transaction risk prediction models based on their transaction risk prediction errors.

According to an embodiment of the present disclosure, calculating the transaction risk prediction error of each of the L initial transaction risk prediction models includes: performing cross-validation on each of the L initial transaction risk prediction models to obtain their respective transaction risk prediction errors.
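The passage above says only that cross-validation is used. As a hedged sketch, one common variant (k-fold cross-validation with the misclassification rate as the error) could be written as follows; the choice of k-fold, the error measure, and the `fit` and `predict` callbacks are assumptions for illustration and are not taken from the disclosure.

```python
def k_fold_error(samples, k, fit, predict):
    """Estimate the prediction error of a model type by k-fold cross-validation.

    samples: list of (features, label) pairs.
    fit:     hypothetical function, list of samples -> trained model.
    predict: hypothetical function, (model, features) -> predicted label.
    Returns the fraction of held-out samples that were misclassified.
    """
    if not 2 <= k <= len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    # Fold i holds every k-th sample starting at position i.
    folds = [samples[i::k] for i in range(k)]
    wrong = 0
    for i, held_out in enumerate(folds):
        # Train on all folds except the held-out one.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = fit(train)
        wrong += sum(
            1 for features, label in held_out
            if predict(model, features) != label
        )
    return wrong / len(samples)
```

Each sample is held out exactly once, so the returned value is an error rate over the whole sample set.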

According to an embodiment of the present disclosure, determining the N transaction risk prediction candidate models from the L initial transaction risk prediction models based on their transaction risk prediction errors includes: determining, from the L initial transaction risk prediction models, N initial transaction risk prediction models whose transaction risk prediction errors meet a preset numerical condition; and taking those N initial transaction risk prediction models as the N transaction risk prediction candidate models.
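The preset numerical condition is not specified in this passage. A minimal sketch, assuming the condition is "the N smallest prediction errors", might look like this; the function name and the list-of-errors representation are hypothetical.

```python
def select_candidates(errors, n):
    """Return the indices of the n models with the smallest prediction errors.

    errors: list where errors[i] is the prediction error of initial model i.
    Assumption: the preset numerical condition is "among the n smallest".
    """
    if not 0 < n <= len(errors):
        raise ValueError("n must be between 1 and the number of models")
    ranked = sorted(range(len(errors)), key=lambda i: errors[i])
    return ranked[:n]
```

A threshold-based condition (for example, error below a fixed value) would be an equally plausible reading and would replace the slicing step with a filter.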

A second aspect of the present disclosure provides a transaction risk prediction method, comprising: acquiring a plurality of transaction behavior data and a plurality of transaction attribute data for a target client; and inputting a plurality of transaction behavior data and a plurality of transaction attribute data aiming at the target client into the target transaction risk prediction model, and outputting a transaction risk prediction result of the target client, wherein the transaction risk prediction result is related to the degree of the transaction risk of the target client.

A third aspect of the present disclosure provides a determining apparatus of a transaction risk prediction model, including: a first acquisition module configured to acquire, for each trained transaction risk prediction candidate model, the training sample set used in training that candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set is labeled with sample labels characterizing the real classification results of the sample clients, and the real classification results are related to the degree of transaction risk of the sample clients; a training module configured to train M first initial models using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model; a first output module configured to input the test samples into the M test models respectively and output M prediction results, wherein the prediction results comprise classification results of test clients; a judging module configured to perform consistency judgment processing on the M prediction results and generate a judgment result characterizing the stability of each transaction risk prediction candidate model; and a determining module configured to determine a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

A fourth aspect of the present disclosure provides a transaction risk prediction apparatus, comprising: a third acquisition module configured to acquire a plurality of transaction behavior data and a plurality of transaction attribute data for a target customer; and a second output module configured to input the plurality of transaction behavior data and the plurality of transaction attribute data for the target customer into the target transaction risk prediction model and output a transaction risk prediction result of the target customer, wherein the transaction risk prediction result is related to the degree of transaction risk of the target customer.

A fifth aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more computer programs, wherein the one or more processors execute the one or more computer programs to implement the steps of the method.

The sixth aspect of the present disclosure also provides a computer readable storage medium having stored thereon a computer program or instructions which when executed by a processor, perform the steps of the above method.

A seventh aspect of the present disclosure also provides a computer program product comprising a computer program or instructions which, when executed by a processor, performs the steps of the method described above.

According to the method, apparatus, device, medium and program product for determining a transaction risk prediction model provided by the present disclosure, a plurality of test models are trained on the training sample set, and the stability of a transaction risk prediction candidate model is evaluated from the consistency of the prediction results of those test models; a candidate model with high stability has high robustness. The robustness of the model can therefore be fully considered when determining the target transaction risk prediction model, which gives the prediction results of the target transaction risk prediction model higher reference value. In addition, because transaction risk prediction is based on historical transaction behavior data and historical transaction attribute data, the objective distribution patterns of the data are mined using big data, and the generated prediction results are more accurate.

Drawings

The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:

FIG. 1 schematically illustrates an application scenario diagram of a method of determining a transaction risk prediction model, a method of predicting transaction risk, an apparatus, a device, a medium and a program product according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a flow chart of a method of determining a transaction risk prediction model, according to an embodiment of the disclosure;

FIG. 3 schematically illustrates a flowchart of a method of obtaining N transaction risk prediction candidate models, according to an embodiment of the disclosure;

FIG. 4 schematically illustrates a schematic diagram of cross-validation of L initial transaction risk prediction models, respectively, in accordance with an embodiment of the present disclosure;

FIG. 5 schematically illustrates a flow chart of a transaction risk prediction method according to an embodiment of the present disclosure;

FIG. 6 schematically illustrates a block diagram of a determination device of a transaction risk prediction model, according to an embodiment of the present disclosure;

FIG. 7 schematically illustrates a block diagram of a transaction risk prediction device according to an embodiment of the present disclosure; and

FIG. 8 schematically illustrates a block diagram of an electronic device adapted to implement the method of determining a transaction risk prediction model and the transaction risk prediction method according to an embodiment of the present disclosure.

Detailed Description

Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.

All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.

Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a convention should be interpreted in the sense in which one of skill in the art would generally understand it (e.g., "a system having at least one of A, B and C" would include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).

In the technical solution of the present disclosure, the related user information (including, but not limited to, user personal information, user image information, and user equipment information such as location information) and data (including, but not limited to, data for analysis, stored data, displayed data, etc.) are information and data authorized by the user or sufficiently authorized by each party; the collection, storage, use, processing, transmission, provision, disclosure, and application of the related data comply with relevant laws, regulations, and standards; necessary security measures are taken; public order and good customs are not violated; and corresponding operation entries are provided for the user to grant or refuse authorization.

In a scenario where personal information is used to make an automated decision, the method, device and system provided by the embodiments of the disclosure provide corresponding operation entries for users, so that a user can choose to accept or reject the automated decision result; if the user chooses to reject it, an expert decision flow is entered. The expression "automated decision" here refers to an activity of automatically analyzing and assessing an individual's behavioral habits, hobbies, or economic, health, or credit status by means of a computer program, and making a decision. The expression "expert decision" here refers to an activity of making a decision by a person who specializes in a certain field of work, has specialized experience, knowledge and skills, and has reached a certain level of expertise.

It should be noted that the method and device for determining a transaction risk prediction model, the electronic device, and the medium according to the embodiments of the present disclosure may be applied to the technical fields of big data and financial technology, and may also be applied to any other field.

An embodiment of the disclosure provides a method for determining a transaction risk prediction model, which comprises: for each trained transaction risk prediction candidate model, acquiring the training sample set used in training that candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set is labeled with sample labels characterizing the real classification results of the sample clients, and the real classification results are related to the degree of transaction risk of the sample clients; training M first initial models using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model; inputting the test samples into the M test models respectively and outputting M prediction results, wherein the prediction results comprise classification results of test clients; performing consistency judgment processing on the M prediction results to generate a judgment result characterizing the stability of each transaction risk prediction candidate model; and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

Fig. 1 schematically illustrates an application scenario diagram of a method for determining a transaction risk prediction model, a method, an apparatus, a device, a medium and a program product for transaction risk prediction according to an embodiment of the present disclosure.

As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.

The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.

In an application scenario of the embodiment of the present disclosure, a user may generate a request for determining a transaction risk prediction model through the first terminal device 101, the second terminal device 102, and the third terminal device 103. In response to the request, the server 105 may perform the method for determining a transaction risk prediction model according to an embodiment of the present disclosure, including: for each trained transaction risk prediction candidate model, acquiring the training sample set used in training that candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set is labeled with sample labels characterizing the real classification results of the sample clients, and the real classification results are related to the degree of transaction risk of the sample clients; training M first initial models using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model; inputting the test samples into the M test models respectively and outputting M prediction results, wherein the prediction results comprise classification results of test clients; performing consistency judgment processing on the M prediction results to generate a judgment result characterizing the stability of each transaction risk prediction candidate model; and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

It should be noted that, the method for determining the transaction risk prediction model and the method for predicting transaction risk provided by the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the determination device of the transaction risk prediction model and the transaction risk prediction device provided by the embodiments of the present disclosure may be generally disposed in the server 105. The determination method of the transaction risk prediction model, the transaction risk prediction method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the determining means of the transaction risk prediction model, the transaction risk prediction means provided by the embodiments of the present disclosure may also be provided in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

The method of determining the transaction risk prediction model according to the embodiment of the present disclosure will be described in detail below with reference to fig. 2 to 4 based on the scenario described in fig. 1.

Fig. 2 schematically illustrates a flowchart of a method of determining a transaction risk prediction model according to an embodiment of the disclosure.

As shown in fig. 2, the method for determining the transaction risk prediction model of this embodiment includes operations S210 to S250.

In operation S210, for each trained transaction risk prediction candidate model, a training sample set used in a training process of each transaction risk prediction candidate model is obtained, the training sample set including a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set being marked with a sample tag for characterizing a real classification result of the sample clients, the real classification result being related to a level of transaction risk of the sample clients.

In operation S220, M first initial models are trained using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model.

In operation S230, the test samples are respectively input into M test models, and M prediction results are output, wherein the prediction results include classification results of the test clients.

In operation S240, consistency judgment processing is performed on the M prediction results, and a judgment result for characterizing stability of each transaction risk prediction candidate model is generated.

In operation S250, a target transaction risk prediction model is determined from the N transaction risk prediction candidate models based on the determination results of the N transaction risk prediction candidate models.

Financial institutions typically use statistical analysis methods, such as classification models, to classify customers. However, when a classification model is used to classify customers, generally only its accuracy is considered, and its classification robustness is ignored. The classification robustness of a classification model means that, when a new sample is input into two models trained on two independent samples drawn from the same distribution, the two models produce similar prediction results.

For each model for predicting transaction risk, a plurality of test models can be trained by using the training sample set, and then the probability of consistent prediction results output by the plurality of test models is obtained. When the probability of consistent prediction results output by the plurality of test models is higher, the stability of the model is better. When the stability of the model for predicting transaction risk is good, the classification robustness is strong, and the referenceability of the risk prediction result obtained based on the model is strong.

According to an embodiment of the present disclosure, in operation S210, a plurality of models for predicting transaction risk may be trained in advance, and a plurality of transaction risk prediction candidate models may be selected from them; for example, the models with smaller prediction errors may be selected as the transaction risk prediction candidate models.

According to embodiments of the present disclosure, a training sample set used in training a transaction risk prediction candidate model may be obtained.

According to embodiments of the present disclosure, the historical transaction behavior data may include, for example, transaction time, transaction object, transaction amount, and the like. The historical transaction attribute data may include, for example, the name, gender, etc. of the sample customer.

According to embodiments of the present disclosure, a training sample set may be labeled with sample tags that characterize the true classification results of the sample clients. The true classification result may be, for example: sample clients are high risk clients, sample clients are low risk clients, etc.

In embodiments of the present disclosure, the user's consent or authorization may be obtained prior to obtaining the user's information. For example, before operation S210, a request to acquire user information may be issued to the user. In case that the user agrees or authorizes that the user information can be acquired, operation S210 is performed.

According to an embodiment of the present disclosure, in operation S220, for each transaction risk prediction candidate model: in order to evaluate the stability of the transaction risk prediction candidate model, the training sample set used in the process of training the transaction risk prediction candidate model may be used to train M first initial models, so as to obtain M test models corresponding to the transaction risk prediction candidate model.

According to an embodiment of the present disclosure, in operation S230, a test sample may be acquired, and the same test sample may be input into each of the M test models to obtain M prediction results. The test sample may include transaction behavior data and transaction attribute data for the test customer. The test model may be, for example, a KNN classifier, a support vector machine, or the like. For example, when the test model is a KNN classifier, each test model may obtain its prediction result for the test sample as follows: calculating the Euclidean distance between the test data and each training data point; sorting the training data points in order of increasing distance; selecting the K points with the smallest distances; determining the frequency of each category among those K points; and returning the most frequent category among those K points as the predicted classification of the test data.
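The five KNN steps listed above can be sketched in Python as follows. This is a minimal illustration rather than the disclosed implementation; the function name, the tuple-based data layout, and the tie-breaking behavior (on a tie, the category seen first among the nearest points wins) are assumptions.

```python
import math
from collections import Counter


def knn_predict(training_data, test_point, k):
    """Classify test_point by majority vote among its k nearest training points.

    training_data: list of (feature_vector, label) pairs with numeric features.
    test_point:    feature vector of the same length.
    """
    if not 1 <= k <= len(training_data):
        raise ValueError("k must be between 1 and the number of training points")
    # Step 1: Euclidean distance between the test point and each training point.
    distances = [
        (math.dist(features, test_point), label)
        for features, label in training_data
    ]
    # Step 2: sort in order of increasing distance.
    distances.sort(key=lambda pair: pair[0])
    # Step 3: keep the k points with the smallest distances.
    nearest = distances[:k]
    # Step 4: count how often each category occurs among those k points.
    counts = Counter(label for _, label in nearest)
    # Step 5: return the most frequent category.
    return counts.most_common(1)[0][0]
```

In practice the features would usually be scaled first, since Euclidean distance is sensitive to the units of amounts, times, and other transaction attributes.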

According to an embodiment of the present disclosure, in operation S240, consistency judgment processing is performed on the M prediction results to generate a judgment result characterizing the stability of each transaction risk prediction candidate model. For example, when the probability that the M prediction results are consistent is high, the judgment result may be that the transaction risk prediction candidate model corresponding to the M prediction results has high stability.

For example, for the transaction risk prediction candidate model A, the training sample set used in training the transaction risk prediction candidate model A may be obtained, and 2 first initial models may be trained using the training sample set to obtain 2 test models corresponding to the transaction risk prediction candidate model A. The test samples may be input into the 2 test models, which output 2 prediction results. The probability that the 2 prediction results are identical may be calculated, and the stability of the transaction risk prediction candidate model A may be determined based on that probability. The higher the probability that the 2 prediction results are the same, the better the stability of the transaction risk prediction candidate model A and, accordingly, the stronger its classification robustness.

According to an embodiment of the present disclosure, in operation S250, determining the target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models may be, for example: taking the transaction risk prediction candidate model with higher stability as the target transaction risk prediction model.

According to the embodiment of the disclosure, a plurality of test models are trained based on a training sample set, and the stability of the transaction risk prediction candidate model is evaluated based on the consistency of the prediction results of the plurality of test models; a transaction risk prediction candidate model with high stability has strong classification robustness. Therefore, in the process of determining the target transaction risk prediction model, the classification robustness of the model can be fully considered, so that the prediction result of the target transaction risk prediction model has higher reference value. In addition, because transaction risk prediction is carried out based on historical transaction behavior data and historical transaction attribute data, the objective distribution patterns of the data can be mined from big data, and the generated prediction result is more accurate.

According to an embodiment of the present disclosure, specifically, training M first initial models using a training sample set, obtaining M test models corresponding to respective transaction risk prediction candidate models includes: dividing the training sample set into M training sample subsets; and respectively training M first initial models by using M training sample subsets to obtain M test models corresponding to each transaction risk prediction candidate model.

According to an embodiment of the present disclosure, for each transaction risk prediction candidate model: the training sample set used in the training process of the transaction risk prediction candidate model can be divided into M independent training sample subsets with the same distribution, and M first initial models are respectively trained by using the M independent training sample subsets with the same distribution to obtain M test models corresponding to each transaction risk prediction candidate model.
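The division into M subsets can be sketched as follows. Random shuffling followed by round-robin dealing is one simple way to approximate subsets with the same distribution; it is an assumption of this sketch, not a procedure stated in the disclosure, and the name `split_into_subsets` and the sample layout are hypothetical.

```python
import random


def split_into_subsets(samples, m, seed=0):
    """Shuffle the samples and deal them into m disjoint, near-equal subsets."""
    shuffled = list(samples)
    # A seeded shuffle keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    # Subset i takes every m-th shuffled sample starting at position i.
    return [shuffled[i::m] for i in range(m)]


# Hypothetical labelled samples: ((feature_1, feature_2), label).
samples = [((i / 10, (i % 3) / 3), "high" if i >= 5 else "low") for i in range(10)]
subsets = split_into_subsets(samples, m=2)
```

Each subset would then be used to train one first initial model, yielding one test model per subset.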

According to an embodiment of the present disclosure, specifically, performing consistency judgment processing on M prediction results, generating a judgment result for characterizing stability of each transaction risk prediction candidate model includes: calculating the probability of consistency of M prediction results; based on the probability that the M prediction results are consistent, a judgment result used for representing the stability of each transaction risk prediction candidate model is generated.

According to an embodiment of the present disclosure, the M prediction results being consistent may mean that, after the test sample is input into each of the M test models, the M test models assign the test sample to the same category. For example, after a test sample corresponding to a certain test object is input into the 2 test models corresponding to the transaction risk prediction candidate model A, both test models classify the test object as a low-risk client.

According to the embodiment of the disclosure, the distance between the plurality of test models can be used to represent the probability that the prediction results corresponding to the plurality of test models are consistent. For example, for 2 test models, the distance between the 2 test models can be calculated by equation (1).

$$d(\phi_1, \phi_2) = P\left(\phi_1(X) \neq \phi_2(X)\right) \tag{1}$$

where $\phi_1$ denotes the first test model, $\phi_2$ denotes the second test model, $P(\phi_1(X) \neq \phi_2(X))$ denotes the probability that the predictions of the two test models differ, $d(\phi_1, \phi_2)$ denotes the distance between the two test models, and $X$ may denote a test sample.

As can be seen from the formula (1), the higher the probability that the prediction results corresponding to the plurality of test models are consistent, the smaller the distance between the plurality of test models.
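As an illustration of the relationship just described, the distance between two test models can be estimated empirically as the fraction of test samples on which their predictions differ. This is a minimal sketch; the function name `model_distance` and the two threshold classifiers are hypothetical stand-ins for trained test models.

```python
def model_distance(model_a, model_b, test_samples):
    """Fraction of test samples on which the two models disagree."""
    if not test_samples:
        raise ValueError("need at least one test sample")
    disagreements = sum(1 for x in test_samples if model_a(x) != model_b(x))
    return disagreements / len(test_samples)


def model_a(amount):
    # Hypothetical test model: flags amounts above 100 as high risk.
    return "high" if amount > 100 else "low"


def model_b(amount):
    # Hypothetical test model trained on another subset: threshold 150.
    return "high" if amount > 150 else "low"


test_samples = [50, 90, 120, 140, 160, 200, 30, 170]
# The two models disagree only on 120 and 140.
distance = model_distance(model_a, model_b, test_samples)
```

The more often the two models agree, the closer the estimated distance is to 0.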

For each transaction risk prediction candidate model: the higher the probability that the prediction results of the plurality of test models corresponding to the transaction risk prediction candidate model are consistent, the better the stability of the transaction risk prediction candidate model is, and accordingly, the stronger the classification robustness of the transaction risk prediction candidate model is. The classification robustness of the transaction risk prediction candidate model can be calculated by equation (2).

$$\mathrm{CIS}(\psi) = E_{D_1, D_2}\left[d(\phi_1, \phi_2)\right] \tag{2}$$

where $\phi_1$ can be calculated by formula (3):

$$\phi_1 = \psi(D_1) \tag{3}$$

and $\phi_2$ can be calculated by formula (4):

$$\phi_2 = \psi(D_2) \tag{4}$$

In formulas (2) to (4), $D_1$ and $D_2$ respectively represent two independent and identically distributed training sample subsets corresponding to the training sample set. $d(\phi_1, \phi_2)$ is the distance between the two test models given by formula (1), and is not described again here. $\phi_1$ represents the first test model obtained by applying the first initial model $\psi$ to the training sample subset $D_1$; $\phi_2$ represents the second test model obtained by applying the first initial model $\psi$ to the training sample subset $D_2$. $E$ represents a weighted average. CIS is the index of the classification robustness of a transaction risk prediction candidate model, and represents the probability that the same sample is classified into different classes after being input into the transaction risk prediction candidate model multiple times. CIS takes values between 0 and 1; the smaller the CIS, the higher the stability of the transaction risk prediction candidate model and the stronger its classification robustness.
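One way to approximate the CIS described above is to repeat a random split of the training sample set into two halves, train one test model on each half, and average the disagreement rate on the test samples. The sketch below rests on several assumptions that are not in the disclosure: a one-dimensional 1-nearest-neighbour learner (`fit_1nn`), a plain average over repeated random splits standing in for E, and hypothetical data with two deliberately noisy labels.

```python
import random


def fit_1nn(subset):
    """Return a 1-nearest-neighbour classifier over (feature, label) pairs."""
    def predict(x):
        return min(subset, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def estimate_cis(train, test_x, fit, repeats=20, seed=0):
    """Average disagreement of models fitted on two random halves of `train`."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        shuffled = train[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        # Two disjoint halves play the roles of D1 and D2.
        model_1, model_2 = fit(shuffled[:half]), fit(shuffled[half:])
        # Empirical distance: share of test samples classified differently.
        total += sum(model_1(x) != model_2(x) for x in test_x) / len(test_x)
    return total / repeats


# Hypothetical one-feature samples; features 8 and 11 carry noisy labels.
train = [(i, "low" if i < 10 else "high") for i in range(20)]
train[8] = (8, "high")
train[11] = (11, "low")
test_x = [2.5, 9.4, 9.6, 10.5, 17.0]
cis = estimate_cis(train, test_x, fit_1nn)
```

A learner whose output does not depend on the split, such as one that always predicts the same class, gives an estimate of exactly 0 under this sketch.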

According to an embodiment of the present disclosure, specifically, determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models includes: sorting the stability of the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models to obtain a sorting result; and determining the target transaction risk prediction model from the N transaction risk prediction candidate models based on the sorting result.

According to the embodiment of the disclosure, the stability of the N transaction risk prediction candidate models may be ranked, and the transaction risk prediction candidate model with the highest stability may be used as the target transaction risk prediction model.
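The ranking step can be sketched as follows, assuming that each candidate's judgment result is a CIS-style score for which a smaller value means higher stability. The candidate names and score values are made up for illustration.

```python
def select_target_model(instability_scores):
    """Rank candidates from most to least stable and return (ranking, best)."""
    # A smaller score means higher stability, so sort in ascending order.
    ranking = sorted(instability_scores, key=instability_scores.get)
    return ranking, ranking[0]


# Hypothetical instability scores for N = 3 candidate models.
scores = {"candidate_A": 0.12, "candidate_B": 0.05, "candidate_C": 0.30}
ranking, target = select_target_model(scores)
```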

According to an embodiment of the present disclosure, the method for determining a transaction risk prediction model further includes obtaining N transaction risk prediction candidate models.

Fig. 3 schematically illustrates a flowchart of a method of obtaining N transaction risk prediction candidate models, according to an embodiment of the disclosure.

As shown in fig. 3, the method for acquiring N transaction risk prediction candidate models according to this embodiment includes operations S310 to S350.

In operation S310, a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample customers are read from a database.

In operation S320, the plurality of historical transaction behavior data and the plurality of historical transaction attribute data of the plurality of sample clients are divided into L training sample sets, where L is equal to or greater than N.

In operation S330, the L second initial models are trained using the L training sample sets, resulting in L initial transaction risk prediction models.

In operation S340, a transaction risk prediction error of each of the L initial transaction risk prediction models is calculated.

In operation S350, N transaction risk prediction candidate models are determined from the L initial transaction risk prediction models based on the transaction risk prediction errors of the L initial transaction risk prediction models.

According to an embodiment of the present disclosure, in operation S310 to operation S330, a plurality of historical transaction behavior data and a plurality of historical transaction attribute data of a plurality of sample customers may be acquired, and the plurality of historical transaction behavior data and the plurality of historical transaction attribute data may be divided into a plurality of training sample sets. The second initial model may be trained separately using each training sample set to obtain a plurality of initial transaction risk prediction models.

According to an embodiment of the present disclosure, a transaction risk prediction error of each initial transaction risk prediction model may be calculated separately in operation S340.

According to an embodiment of the present disclosure, in operation S350, determining N transaction risk prediction candidate models from the L initial transaction risk prediction models based on the transaction risk prediction errors of the L initial transaction risk prediction models may be, for example: taking each initial transaction risk prediction model whose transaction risk prediction error is smaller than a preset threshold as a transaction risk prediction candidate model.

For example, the plurality of historical transaction behavior data and the plurality of historical transaction attribute data for the plurality of sample customers may be partitioned into a plurality of training sample sets, and the second initial model may be trained separately using each training sample set. The second initial model may be, for example, a KNN classifier, a support vector machine, or the like. When the second initial model is a KNN classifier, the KNN classifier can be trained to obtain an optimal K value. An error may be calculated for each initial transaction risk prediction model, and one or more initial transaction risk prediction models having errors less than a predetermined threshold may be selected as transaction risk prediction candidate models. The K value of each transaction risk prediction candidate model can also be obtained, for example, K1, K2 and K3 …

According to the embodiment of the disclosure, based on the transaction risk prediction error, N transaction risk prediction candidate models are determined from L initial transaction risk prediction models, so that accuracy of a prediction result of the transaction risk prediction candidate models can be ensured.

Fig. 4 schematically illustrates a schematic diagram of cross-validation of L initial transaction risk prediction models, respectively, in accordance with an embodiment of the present disclosure.

According to embodiments of the present disclosure, parameter selection may be performed by cross-validation, after which the model is trained with the selected optimal parameters to obtain the final classifier. Specifically, the samples for training the initial transaction risk prediction model may be partitioned into two groups, one serving as training samples and the other as test samples. During training of the initial transaction risk prediction model, a model is obtained from the training samples, and the test samples are then applied to the model to evaluate the transaction risk prediction error of the initial transaction risk prediction model. The samples are reused in this way, with the above process repeated in turn over different partitions, and the final return value is the average of the results of all rounds.
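The cross-validation loop can be sketched with k-fold rotation, which is one common way of cycling through train/test partitions; the disclosure does not fix the partitioning scheme. The names `cross_validation_error` and `fit_1nn`, the fold count, and the one-feature data are assumptions made for illustration.

```python
def fit_1nn(train):
    """Return a 1-nearest-neighbour classifier over (feature, label) pairs."""
    def predict(x):
        return min(train, key=lambda pair: abs(pair[0] - x))[1]
    return predict


def cross_validation_error(samples, fit, folds=5):
    """Mean misclassification rate over `folds` train/test rotations."""
    errors = []
    for i in range(folds):
        # Every folds-th sample starting at i is held out as the test group.
        test = samples[i::folds]
        train = [s for j, s in enumerate(samples) if j % folds != i]
        model = fit(train)
        wrong = sum(model(x) != y for x, y in test)
        errors.append(wrong / len(test))
    # The final return value is the average over all rounds.
    return sum(errors) / len(errors)


# Hypothetical, well-separated one-feature samples.
samples = [(i, "low") for i in range(10)] + [(i, "high") for i in range(20, 30)]
error = cross_validation_error(samples, fit_1nn, folds=5)
```

On this separable toy data the 1-nearest-neighbour learner makes no held-out mistakes, whereas a learner that always predicts "low" would misclassify half of every test group.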

According to embodiments of the present disclosure, a prediction error threshold may be preset. In the event that the transaction risk prediction error of an initial transaction risk prediction model is below the prediction error threshold, that initial transaction risk prediction model may be used as a transaction risk prediction candidate model.
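The threshold rule can be sketched as a simple filter. The model names, error values, and the threshold of 0.10 are hypothetical.

```python
def select_candidates(errors, threshold):
    """Keep the models whose prediction error is below the preset threshold."""
    return [name for name, err in errors.items() if err < threshold]


# Hypothetical transaction risk prediction errors for L = 4 initial models.
errors = {"model_1": 0.08, "model_2": 0.21, "model_3": 0.04, "model_4": 0.15}
candidates = select_candidates(errors, threshold=0.10)
```

The number of models that pass the filter plays the role of N, which is why L must be at least N.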

Fig. 5 schematically illustrates a flow chart of a transaction risk prediction method according to an embodiment of the present disclosure.

As shown in fig. 5, the transaction risk prediction method of this embodiment includes operations S510 to S520.

In operation S510, a plurality of transaction behavior data and a plurality of transaction attribute data for a target customer are acquired.

In operation S520, a plurality of transaction behavior data and a plurality of transaction attribute data for the target customer are input into the target transaction risk prediction model, and a transaction risk prediction result of the target customer is output, the transaction risk prediction result being related to the degree of transaction risk of the target customer.

According to the embodiment of the disclosure, because the target transaction risk prediction model has a lower transaction risk prediction error and stronger classification robustness, the transaction risk prediction result obtained based on the target transaction risk prediction model is more accurate, and the customer portrait generated from it is more accurate and has higher reference value.

Based on the determination method of the transaction risk prediction model, the disclosure also provides a determination device of the transaction risk prediction model. The device will be described in detail below in connection with fig. 6.

Fig. 6 schematically shows a block diagram of a determination apparatus of a transaction risk prediction model according to an embodiment of the present disclosure.

As shown in fig. 6, the determining apparatus 600 of the transaction risk prediction model of this embodiment includes a first obtaining module 610, a training module 620, a first output module 630, a judging module 640, and a determining module 650.

The first obtaining module 610 is configured to obtain, for each trained transaction risk prediction candidate model, a training sample set used in a training process of each transaction risk prediction candidate model, where the training sample set includes a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, and the training sample set is marked with a sample tag for characterizing a real classification result of the sample clients, where the real classification result relates to a transaction risk level of the sample clients. In an embodiment, the first obtaining module 610 may be configured to perform the operation S210 described above, which is not described herein.

The training module 620 is configured to train the M first initial models to obtain M test models corresponding to each transaction risk prediction candidate model by using the training sample set. In an embodiment, the training module 620 may be configured to perform the operation S220 described above, which is not described herein.

The first output module 630 is configured to input the test samples into M test models, and output M prediction results, where the prediction results include classification results of the test clients. In an embodiment, the first output module 630 may be used to perform the operation S230 described above, which is not described herein.

The judging module 640 is configured to perform consistency judgment processing on the M prediction results, and generate a judgment result for characterizing stability of each transaction risk prediction candidate model. In an embodiment, the judging module 640 may be configured to perform the operation S240 described above, which is not described herein.

The determining module 650 is configured to determine a target transaction risk prediction model from the N transaction risk prediction candidate models based on the determination results of the N transaction risk prediction candidate models. In an embodiment, the determining module 650 may be configured to perform the operation S250 described above, which is not described herein.

According to an embodiment of the present disclosure, the training module includes a first dividing sub-module and a first training sub-module.

The first dividing submodule is used for dividing the training sample set into M training sample subsets; the first training submodule is used for training M first initial models respectively by using M training sample subsets to obtain M test models corresponding to each transaction risk prediction candidate model.

According to an embodiment of the disclosure, the judging module includes a first calculating sub-module and a generating sub-module.

The first calculation submodule is used for calculating the probability that M prediction results are consistent; the generation submodule is used for generating judgment results for representing the stability of each transaction risk prediction candidate model based on the probability that M prediction results are consistent.

According to an embodiment of the present disclosure, the determination module includes a ranking sub-module and a first determination sub-module.

The sorting sub-module is used for sorting the stability of the N transaction risk prediction candidate models based on the judging results of the N transaction risk prediction candidate models to obtain sorting results; the first determination submodule is used for determining a target transaction risk prediction model from N transaction risk prediction candidate models based on the sorting result.

According to an embodiment of the present disclosure, the determining device of the transaction risk prediction model further includes a second obtaining module.

According to an embodiment of the disclosure, the second acquisition module includes a reading sub-module, a second dividing sub-module, a second training sub-module, a second calculation sub-module, and a second determining sub-module.

The reading sub-module is used for reading a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients from the database; the second dividing submodule is used for dividing a plurality of historical transaction behavior data and a plurality of historical transaction attribute data of a plurality of sample clients into L training sample sets, wherein L is greater than or equal to N; the second training sub-module is used for training L second initial models by using L training sample sets to obtain L initial transaction risk prediction models; the second calculation submodule is used for calculating the transaction risk prediction errors of the L initial transaction risk prediction models respectively; the second determining submodule is used for determining N transaction risk prediction candidate models from the L initial transaction risk prediction models based on the transaction risk prediction errors of the L initial transaction risk prediction models.

According to an embodiment of the present disclosure, the second computing submodule includes a cross validation unit.

The cross verification unit is used for respectively carrying out cross verification on the L initial transaction risk prediction models to obtain respective transaction risk prediction errors of the L initial transaction risk prediction models.

According to an embodiment of the present disclosure, the second determination submodule comprises a determination unit.

The determining unit is used for determining an initial transaction risk prediction model with N transaction risk prediction errors meeting a preset numerical condition from the L initial transaction risk prediction models; and taking the initial transaction risk prediction models with the N transaction risk prediction errors meeting the preset numerical conditions as N transaction risk prediction candidate models.

According to an embodiment of the present disclosure, any plurality of the first obtaining module 610, the training module 620, the first output module 630, the judging module 640, and the determining module 650 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the first obtaining module 610, the training module 620, the first output module 630, the judging module 640, and the determining module 650 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging circuits, or in any one of, or a suitable combination of, the three implementations of software, hardware, and firmware. Alternatively, at least one of the first obtaining module 610, the training module 620, the first output module 630, the judging module 640, and the determining module 650 may be at least partially implemented as a computer program module which, when executed, may perform the corresponding functions.

Fig. 7 schematically illustrates a block diagram of a transaction risk prediction device according to an embodiment of the present disclosure.

As shown in fig. 7, the transaction risk prediction apparatus 700 of this embodiment includes a third acquisition module 710 and a second output module 720.

The third obtaining module 710 is configured to obtain a plurality of transaction behavior data and a plurality of transaction attribute data for a target customer. In an embodiment, the third obtaining module 710 may be configured to perform the operation S510 described above, which is not described herein.

The second output module 720 is configured to input the plurality of transaction behavior data and the plurality of transaction attribute data for the target client into the target transaction risk prediction model, and output a transaction risk prediction result of the target client, where the transaction risk prediction result is related to the transaction risk level of the target client. In an embodiment, the second output module 720 may be used to perform the operation S520 described above, which is not described herein.

According to an embodiment of the present disclosure, the third acquisition module 710 and the second output module 720 may be combined and implemented in one module, or either of the modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the third acquisition module 710 and the second output module 720 may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging circuits, or in any one of, or a suitable combination of, the three implementations of software, hardware, and firmware. Alternatively, at least one of the third acquisition module 710 and the second output module 720 may be at least partially implemented as a computer program module which, when executed, may perform the corresponding functions.

As shown in fig. 8, an electronic device 800 according to an embodiment of the present disclosure includes a processor 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The processor 801 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 801 may also include on-board memory for caching purposes. The processor 801 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the disclosure.

In the RAM 803, various programs and data required for the operation of the electronic device 800 are stored. The processor 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. The processor 801 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 802 and/or the RAM 803. Note that the program may be stored in one or more memories other than the ROM 802 and the RAM 803. The processor 801 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in one or more memories.

According to an embodiment of the present disclosure, the electronic device 800 may also include an input/output (I/O) interface 805, the input/output (I/O) interface 805 also being connected to the bus 804. The electronic device 800 may also include one or more of the following components connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, etc.; an output portion 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it is installed into the storage section 808 as needed.

The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.

According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 802 and/or RAM 803 and/or one or more memories other than ROM 802 and RAM 803 described above.

Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. The program code, when executed in a computer system, causes the computer system to implement the method for determining a transaction risk prediction model and the transaction risk prediction method provided by embodiments of the present disclosure.

The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 801. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.

In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed, and downloaded and installed in the form of a signal on a network medium, and/or from a removable medium 811 via a communication portion 809. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.

In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable medium 811.

According to embodiments of the present disclosure, program code for performing computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages, and in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations fall within the scope of the present disclosure.

The embodiments of the present disclosure are described above. These examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims

1. A method of determining a transaction risk prediction model, the method comprising:

for each trained transaction risk prediction candidate model, acquiring a training sample set used in the training process of each transaction risk prediction candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, and the training sample set is marked with a sample label for representing a real classification result of the sample clients, and the real classification result is related to the transaction risk degree of the sample clients;

training M first initial models by using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model;

inputting test samples into the M test models respectively, and outputting M prediction results, wherein the prediction results comprise classification results of test clients;

performing consistency judgment processing on the M prediction results to generate a judgment result for characterizing the stability of each transaction risk prediction candidate model;

and determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

2. The method of claim 1, wherein training M first initial models using the training sample set to obtain M test models corresponding to the respective transaction risk prediction candidate models comprises:

dividing the training sample set into M training sample subsets;

and training the M first initial models by using the M training sample subsets, respectively, to obtain the M test models corresponding to each transaction risk prediction candidate model.
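The subset-based training recited in claim 2 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the disclosure: the helper names `split_into_subsets` and `train_test_models`, the shuffle-then-stride split, and the one-feature `ThresholdModel` that stands in for a "first initial model". The claim does not fix how the set is divided or which model family is used.

```python
import random


def split_into_subsets(samples, m, seed=0):
    """Shuffle the samples and deal them into m roughly equal subsets."""
    if m < 1 or m > len(samples):
        raise ValueError("m must be between 1 and the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Striding gives m disjoint subsets that together cover every sample.
    return [shuffled[i::m] for i in range(m)]


class ThresholdModel:
    """Hypothetical stand-in for a 'first initial model' (one feature, two classes)."""

    def fit(self, subset):
        # subset: list of (feature_value, label) pairs with label in {0, 1}
        pos = [x for x, y in subset if y == 1]
        neg = [x for x, y in subset if y == 0]
        if pos and neg:
            # Midpoint between the two class means.
            self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        else:
            # Degenerate subset with a single class: fall back to the overall mean.
            self.threshold = sum(x for x, _ in subset) / len(subset)
        return self

    def predict(self, x):
        return 1 if x >= self.threshold else 0


def train_test_models(samples, m, seed=0):
    """Train one model per subset, yielding the M test models."""
    return [ThresholdModel().fit(s) for s in split_into_subsets(samples, m, seed)]
```

Because each test model sees a different subset of the same training sample set, disagreement among their later predictions reflects sensitivity to the training data, which is what the stability judgment is meant to capture.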

3. The method of claim 1, wherein performing consistency judgment processing on the M prediction results to generate a judgment result for characterizing the stability of each transaction risk prediction candidate model comprises:

calculating the probability that the M prediction results are consistent;

and generating, based on the probability that the M prediction results are consistent, the judgment result for characterizing the stability of each transaction risk prediction candidate model.
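One possible reading of the consistency probability in claim 3 is the fraction of test samples on which all M test models output the same class. The sketch below follows that reading. The function names and the `threshold` parameter are hypothetical, and the claim itself does not prescribe this particular estimator.

```python
def consistency_probability(predictions):
    """predictions[i][j] is the class predicted by test model i for test sample j."""
    per_sample = list(zip(*predictions))  # regroup: one tuple of M votes per sample
    if not per_sample:
        raise ValueError("at least one test sample is required")
    agree = sum(1 for votes in per_sample if len(set(votes)) == 1)
    return agree / len(per_sample)


def judge_stability(predictions, threshold=0.9):
    """Return a judgment result; the threshold value is an assumed parameter."""
    p = consistency_probability(predictions)
    return {"consistency": p, "stable": p >= threshold}


# Three test models, four test samples; the models disagree only on the third sample.
example = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 1, 1],
]
print(consistency_probability(example))  # 0.75
```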

4. The method of claim 1, wherein determining a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models comprises:

sorting the N transaction risk prediction candidate models by stability, based on the judgment results of the N transaction risk prediction candidate models, to obtain a sorting result;

and determining the target transaction risk prediction model from the N transaction risk prediction candidate models based on the sorting result.
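The sorting step of claim 4 can be sketched as follows, assuming each judgment result has been reduced to a single numeric stability score where higher means more stable. The function name, the dictionary layout, and the choice of the top-ranked candidate as the target are illustrative assumptions.

```python
def select_target_model(stability_scores):
    """stability_scores: dict mapping candidate model name -> stability score."""
    if not stability_scores:
        raise ValueError("at least one candidate model is required")
    # Sort candidates from most to least stable.
    ranking = sorted(stability_scores.items(), key=lambda kv: kv[1], reverse=True)
    target_name = ranking[0][0]
    return target_name, ranking


target, ranking = select_target_model({"a": 0.80, "b": 0.95, "c": 0.90})
print(target)  # b
```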

5. The method according to claim 1, wherein the method further comprises:

acquiring N transaction risk prediction candidate models;

wherein acquiring the N transaction risk prediction candidate models comprises:

reading a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients from a database;

dividing the plurality of historical transaction behavior data and the plurality of historical transaction attribute data of the plurality of sample clients into L training sample sets, wherein L is greater than or equal to N;

training L second initial models by using the L training sample sets to obtain L initial transaction risk prediction models;

calculating the transaction risk prediction errors of the L initial transaction risk prediction models;

and determining N transaction risk prediction candidate models from the L initial transaction risk prediction models based on the transaction risk prediction errors of the L initial transaction risk prediction models.

6. The method of claim 5, wherein said calculating a transaction risk prediction error for each of said L initial transaction risk prediction models comprises:

performing cross-validation on each of the L initial transaction risk prediction models to obtain the respective transaction risk prediction errors of the L initial transaction risk prediction models.
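As an illustration of the cross-validation in claim 6, the sketch below estimates a misclassification rate with plain k-fold cross-validation. The fold construction by striding, the `fit` and `predict` callables, and the midpoint-threshold learner are all assumptions made for the example. The claim does not specify the fold count or the error metric.

```python
def k_fold_error(samples, k, fit, predict):
    """Return the fraction of held-out samples misclassified across k folds."""
    if k < 2 or k > len(samples):
        raise ValueError("k must be between 2 and the number of samples")
    folds = [samples[i::k] for i in range(k)]  # disjoint folds, no shuffling
    errors = 0
    for i, held_out in enumerate(folds):
        # Train on every fold except the held-out one.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = fit(train)
        errors += sum(1 for x, y in held_out if predict(model, x) != y)
    return errors / len(samples)


def fit_midpoint(train):
    """Hypothetical learner: threshold halfway between the two class means."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x >= threshold else 0


# Toy data: one feature, two well-separated classes.
data = [(i, 0) for i in range(10)] + [(i + 20, 1) for i in range(10)]
print(k_fold_error(data, 5, fit_midpoint, predict_threshold))  # 0.0
```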

7. The method of claim 5, wherein determining N transaction risk prediction candidate models from the L initial transaction risk prediction models based on the transaction risk prediction errors of the L initial transaction risk prediction models comprises:

determining, from the L initial transaction risk prediction models, N initial transaction risk prediction models whose transaction risk prediction errors meet a preset numerical condition;

and taking the N initial transaction risk prediction models whose transaction risk prediction errors meet the preset numerical condition as the N transaction risk prediction candidate models.
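The selection in claim 7 leaves the "preset numerical condition" open. The sketch below assumes one plausible condition: keep the N models with the smallest errors, optionally requiring each error to stay at or below a ceiling. The function name and the `max_error` parameter are hypothetical.

```python
def select_candidates(errors, n, max_error=None):
    """errors: dict mapping initial model name -> prediction error (lower is better)."""
    # Assumed condition: an optional ceiling on the error of each kept model.
    eligible = {
        name: err
        for name, err in errors.items()
        if max_error is None or err <= max_error
    }
    if len(eligible) < n:
        raise ValueError("fewer than n models meet the preset condition")
    # Keep the n models with the smallest errors.
    return sorted(eligible, key=eligible.get)[:n]


errors = {"m1": 0.12, "m2": 0.05, "m3": 0.30, "m4": 0.08}
print(select_candidates(errors, 2))  # ['m2', 'm4']
```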

8. A transaction risk prediction method, the method comprising:

acquiring a plurality of transaction behavior data and a plurality of transaction attribute data for a target client;

and inputting the plurality of transaction behavior data and the plurality of transaction attribute data for the target client into a target transaction risk prediction model, and outputting a transaction risk prediction result of the target client, wherein the transaction risk prediction result is related to the transaction risk degree of the target client.

9. A device for determining a transaction risk prediction model, the device comprising:

a first acquisition module, configured to acquire, for each trained transaction risk prediction candidate model, a training sample set used in the training process of each transaction risk prediction candidate model, wherein the training sample set comprises a plurality of historical transaction behavior data and a plurality of historical transaction attribute data for a plurality of sample clients, the training sample set is marked with a sample label for representing a real classification result of the sample clients, and the real classification result is related to the transaction risk degree of the sample clients;

a training module, configured to train M first initial models by using the training sample set to obtain M test models corresponding to each transaction risk prediction candidate model;

a first output module, configured to input test samples into the M test models respectively and output M prediction results, wherein the prediction results comprise classification results of test clients;

a judgment module, configured to perform consistency judgment processing on the M prediction results to generate a judgment result for characterizing the stability of each transaction risk prediction candidate model;

and a determining module, configured to determine a target transaction risk prediction model from the N transaction risk prediction candidate models based on the judgment results of the N transaction risk prediction candidate models.

10. A transaction risk prediction device, the device comprising:

a third acquisition module, configured to acquire a plurality of transaction behavior data and a plurality of transaction attribute data for a target client;

and a second output module, configured to input the plurality of transaction behavior data and the plurality of transaction attribute data for the target client into a target transaction risk prediction model and output a transaction risk prediction result of the target client, wherein the transaction risk prediction result is related to the transaction risk degree of the target client.

11. An electronic device, comprising:

one or more processors;

a memory for storing one or more computer programs,

wherein the one or more processors execute the one or more computer programs to implement the steps of the method according to any one of claims 1-8.

12. A computer-readable storage medium on which a computer program or instructions are stored, wherein the computer program or instructions, when executed by a processor, implement the steps of the method according to any one of claims 1-8.

13. A computer program product comprising a computer program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.