Disclosure of Invention
In order to solve the above problems, the present application provides a method and a related device for determining a user's credit line, which address the problems that credit is extended to only a narrow range of users and that the credit lines granted are low.
In a first aspect of the present application, a method for determining a credit line granted to a user is provided, where the method includes:
acquiring related data of a user to be credited from a plurality of different sources, wherein the user to be credited is a user whose historical credit line was determined to be zero;
extracting a plurality of derived features from the related data of the user to be credited, wherein the derived features are features with business meaning obtained by performing feature learning on related data of credited users;
and inputting the plurality of derived features into a data-driven credit model to determine an initial credit line of the user to be credited, wherein the data-driven credit model is built by fitting the credit lines of the credited users to the derived features.
Optionally, if the user to be trusted is an enterprise user, the method further includes:
obtaining multi-dimensional data of the user to be credited, wherein the multi-dimensional data comprises various types of regional industry dimensional data, customer group intimacy dimensional data, customer value dimensional data and operation stability dimensional data;
obtaining comprehensive coefficients corresponding to the users to be trusted according to the dimensionality coefficients corresponding to the multidimensional data of the users to be trusted;
and adjusting the initial credit line according to the comprehensive coefficient to obtain a target credit line.
Optionally, the training step of the data-driven trust model includes:
acquiring related data of the trusted user according to the different sources;
carrying out feature learning on the related data of the trusted user to obtain a plurality of derived features;
and fitting the credit line of the trusted user according to the plurality of derived characteristics to obtain a data-driven credit model.
Optionally, the performing feature learning on the related data of the trusted user to obtain a plurality of derived features includes:
preprocessing the related data of the trusted user to obtain original data features;
and performing feature learning on the original data features to obtain a plurality of derived features.
Optionally, if the user to be trusted is an enterprise user, the different sources include enterprise basic information, enterprise owner basic information, and external scene data.
In a second aspect of the present application, there is provided a device for determining a credit line granted to a user, the device including: an acquisition unit, an extraction unit and a determination unit;
the acquisition unit is used for acquiring related data of a user to be credited from a plurality of different sources, wherein the user to be credited is a user whose historical credit line was determined to be zero;
the extraction unit is used for extracting a plurality of derived features from the related data of the user to be credited, wherein the derived features are features with business meaning obtained by performing feature learning on related data of credited users;
the determination unit is used for inputting the plurality of derived features into a data-driven credit model to determine an initial credit line of the user to be credited, wherein the data-driven credit model is built by fitting the credit lines of the credited users to the derived features.
Optionally, if the user to be trusted is an enterprise user, the apparatus further includes an adjusting unit, configured to:
obtaining multi-dimensional data of the user to be credited, wherein the multi-dimensional data comprises various types of regional industry dimensional data, customer group intimacy dimensional data, customer value dimensional data and operation stability dimensional data;
obtaining comprehensive coefficients corresponding to the users to be trusted according to the dimensionality coefficients corresponding to the multidimensional data of the users to be trusted;
and adjusting the initial credit line according to the comprehensive coefficient to obtain a target credit line.
Optionally, the apparatus further comprises a training unit, configured to:
acquiring related data of the trusted user according to the different sources;
carrying out feature learning on the related data of the trusted user to obtain a plurality of derived features;
and fitting the credit line of the trusted user according to the plurality of derived characteristics to obtain a data-driven credit model.
Optionally, the training unit is configured to:
preprocessing the related data of the trusted user to obtain original data features;
and performing feature learning on the original data features to obtain a plurality of derived features.
Optionally, the user to be trusted is an enterprise user, and the different sources include enterprise basic information, enterprise owner basic information, and external scene data.
In another aspect, the present application provides a computer device comprising a processor and a memory:
the memory is used for storing program codes and transmitting the program codes to the processor;
the processor is configured to perform the method of the above aspect according to instructions in the program code.
In another aspect the present application provides a computer readable storage medium for storing a computer program for performing the method of the above aspect.
Compared with the prior art, the technical scheme of the application has the advantages that:
The embodiments of the present application provide a method and a related device for determining a user's credit line. To find potentially high-quality users among the users to be credited, that is, users whose historical credit line was determined to be zero, the credit line is re-determined based not on related data from a single source but on related data from a plurality of different sources, so that the real situation of the user to be credited can be fully uncovered. A plurality of derived features are extracted from the related data of the user to be credited; the derived features are features with business meaning obtained by feature learning on the related data of credited users, and they can reflect a user's real situation. A data-driven credit model is built in advance by fitting the credit lines of the credited users to the derived features, and the initial credit line of the user to be credited is obtained from this model and the derived features extracted for that user. Because the derived features and the data-driven credit model serve as the basis for evaluating the user to be credited, the evaluation is supported by data and the influence of human experience is reduced. Data from a plurality of different sources allows the user to be credited to be analyzed comprehensively and in depth from multiple dimensions, so that the user's real situation can be evaluated comprehensively and accurately. Potentially high-quality users are thereby identified among the users to be credited and their credit lines are no longer determined to be zero, which improves the accuracy of credit line determination and expands the range of users to whom credit is extended.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
At present, the credit models used by banks consist mainly of rules, expert experience and simple models, hereinafter referred to as expert rule models. They are formulated by business personnel based on the experience of domain experts: a user's information is input into the expert rule model, an evaluation result is obtained, and the credit line is determined.
An expert rule model is formulated from the industry experience of experts. Several main indicators are selected from the user's information as the basis for evaluating the user's income, and the user's credit line is calculated through an expert scorecard. Because the indicators selected by the expert rule model are strongly influenced by human factors and express only simple business meaning, some indicators may cause a high-quality user to receive a low credit line. To control risk, a high credit line cannot be given to such users, and some potentially high-quality users are even determined to have no credit line and are rejected. The range of users to whom credit is extended is therefore small. Expanding it requires lowering the requirements for calculating the credit line on the expert scorecard, which inevitably increases the bank's risk.
In addition, the expert rule model relies on a single data source and therefore cannot comprehensively and objectively reflect the real conditions (such as operating, economic, financial and risk conditions) of users (such as small and medium-sized enterprises, individual industrial and commercial households, and farmers). For example, one credit line is calculated from tax payment data and tax credit rating, another from the sales receipts of the account, and the higher of the two is used as the credit line. This way of determining the credit line places high demands on the user's information: a user obtains a higher credit line only when a certain type of information is rich. In many cases, however, the user's information is incomplete, inaccurate or untrue, so it is difficult to evaluate the user's real situation comprehensively and accurately with an expert rule model, and a low credit rating leads to a low credit line.
Therefore, how to increase the credit granting amount of the user without increasing the risk is an urgent technical problem to be solved.
Based on this, the embodiments of the present application provide a method and a related device for determining a user's credit line. To find potentially high-quality users among the users to be credited, that is, users whose historical credit line was determined to be zero, the credit line is re-determined based not on related data from a single source but on related data from a plurality of different sources, so that the real situation of the user to be credited can be fully uncovered. A plurality of derived features are extracted from the related data of the user to be credited; the derived features are features with business meaning obtained by feature learning on the related data of credited users, and they can reflect a user's real situation. A data-driven credit model is built in advance by fitting the credit lines of the credited users to the derived features, and the initial credit line of the user to be credited is obtained from this model and the derived features extracted for that user. Because the derived features and the data-driven credit model serve as the basis for evaluating the user to be credited, the evaluation is supported by data and the influence of human experience is reduced. Data from a plurality of different sources allows the user to be credited to be analyzed comprehensively and in depth from multiple dimensions, so that the user's real situation can be evaluated comprehensively and accurately. Potentially high-quality users are thereby identified among the users to be credited and their credit lines are no longer determined to be zero, which improves the accuracy of credit line determination and expands the range of users to whom credit is extended.
Referring to fig. 1, fig. 1 is a flowchart of a method for determining a credit line granted to a user according to the present application, where the method may include the following steps S101 to S103.
S101: Acquire related data of the user to be credited from a plurality of different sources.
In the related art, indicators formulated from expert industry experience are input into the expert rule model to determine users' credit lines, and the credit lines of many users are determined to be zero. As noted above, some potentially high-quality users exist among the users whose historical credit line was determined to be zero.
To evaluate the real situation of the user to be credited comprehensively and accurately, related data of the user can be acquired from a plurality of different sources. The users to be credited may be selected as users who applied for a small and micro enterprise credit loan within a fixed period, for example the last year and a half, and whose credit line was determined to be zero by the expert rule model.
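The selection of users to be credited can be illustrated with a minimal sketch. The record layout, the 548-day window (roughly a year and a half) and the rule that every in-window application must have received a zero credit line are assumptions made for illustration; the application itself only states that the users are chosen from a fixed period.

```python
from datetime import date, timedelta

# Hypothetical application records:
# (user_id, application_date, credit_line_from_expert_rule_model)
APPLICATIONS = [
    ("u1", date(2023, 3, 1), 0.0),
    ("u2", date(2023, 5, 20), 50000.0),
    ("u3", date(2021, 1, 10), 0.0),  # too old for a window ending 2024-01-01
    ("u4", date(2023, 8, 5), 0.0),
]


def select_users_to_credit(applications, as_of, window_days=548):
    """Return sorted IDs of users whose in-window applications all got a zero credit line."""
    start = as_of - timedelta(days=window_days)
    in_window = {}
    for user_id, applied_on, credit_line in applications:
        if start <= applied_on <= as_of:
            in_window.setdefault(user_id, []).append(credit_line)
    # Assumption: a user qualifies only if no in-window application was granted credit.
    return sorted(u for u, lines in in_window.items() if all(c == 0 for c in lines))


candidates = select_users_to_credit(APPLICATIONS, as_of=date(2024, 1, 1))
print(candidates)
```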
The plurality of different sources may include data stored inside the bank and data obtained from outside. The data stored inside the bank includes the in-bank loans and user information of the user to be credited, and the data obtained from outside includes personal credit reports and housing provident fund information.
If the user to be credited is an enterprise user, the different sources include enterprise basic information, enterprise owner basic information and external scene data. The enterprise basic information may include basic enterprise details, enterprise credit reports, industrial and commercial registration information, corporate account transaction flows, issued industrial funds, in-bank enterprise assets, in-bank enterprise liabilities, tax payment and the like. The enterprise owner basic information may include basic details of the enterprise owner, the owner's credit report, in-bank enterprise assets, in-bank enterprise liabilities, the owner's credit cards, the owner's debit cards and the like. The external scene data may be data from industrial and commercial administration, social security, the housing provident fund, tax, electric power, customs and the like.
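Combining the different sources into one record per user might look like the sketch below. The source names, field names and values are hypothetical, and prefixing each field with its source name is one possible convention, not something the application prescribes.

```python
def merge_sources(user_id, sources):
    """Combine the per-source records of one user into a single flat record.

    Field names are prefixed with the source name so that equal field names
    from different sources do not collide. A source that holds no record for
    the user simply contributes nothing.
    """
    merged = {"user_id": user_id}
    for source_name, table in sources.items():
        for field, value in table.get(user_id, {}).items():
            merged[f"{source_name}.{field}"] = value
    return merged


# Hypothetical data for three kinds of source named in the text.
SOURCES = {
    "enterprise": {"e1": {"tax_paid": 12000.0, "in_bank_assets": 300000.0}},
    "owner": {"e1": {"credit_card_limit": 50000.0}},
    "external": {
        "e1": {"social_security_headcount": 8},
        "e2": {"social_security_headcount": 3},
    },
}

record = merge_sources("e1", SOURCES)
print(record)
```

Keeping the source prefix makes it easy to see later which source a derived feature came from.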
S102: Extract a plurality of derived features from the related data of the user to be credited.
The derived features are features with business meaning obtained by performing feature learning on the related data of credited users. Derived features generally arise for two reasons: first, changes in the data give rise to features that did not originally exist; second, during feature learning the algorithm generates derived features from relationships among existing features, and such derived features can sometimes better reflect the relationships among the data features. With the derived features, a comprehensive profile of the user to be credited can be established and the user's risk and income can be evaluated more comprehensively and accurately, so that a higher credit line can be granted to the user.
A credited user is a user whose credit line was determined to be non-zero by the expert rule model. Feature learning can mine, from the related data, the features that characterize credited users. Once the derived features are determined, if a user to be credited also exhibits them, that user is similar to the credited users, has a certain potential and may become a high-quality user. A plurality of derived features can therefore be extracted from the related data of the user to be credited and an appropriate credit line calculated for such users, which expands the range of users to whom credit is extended.
The feature learning methods include statistical aggregation of data, time-series derivation, feature crossing, pass-through and the like, and a plurality of derived features can be obtained through feature learning.
As a possible implementation, after the related data of the user to be credited is acquired, it may be preprocessed as follows: the data range and available fields of the related data are determined and the credit line distribution of the credited users is analyzed, so as to gain a rough understanding of the characteristics of the related data; after this analysis, the related data of the user to be credited is processed to obtain original data features, and the plurality of derived features are extracted from the original data features. The preprocessing may include handling of outliers, handling of missing values, logarithmic transformation, encoding of discrete variables, binning of continuous variables, and the like.
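The preprocessing operations listed above can be sketched as follows. The field names, the clipping bounds, the median imputation, the one-hot encoding and the bin edges are all illustrative assumptions; the application names the kinds of operation but not their parameters.

```python
import math
from statistics import median


def preprocess(rows, lower=0.0, upper=1_000_000.0, bin_edges=(5, 10)):
    """Turn raw records into original data features (hypothetical field names)."""
    observed = [r["monthly_revenue"] for r in rows if r["monthly_revenue"] is not None]
    fill_value = median(observed)                      # missing-value handling
    industries = sorted({r["industry"] for r in rows})
    features = []
    for r in rows:
        revenue = r["monthly_revenue"]
        if revenue is None:
            revenue = fill_value
        revenue = min(max(revenue, lower), upper)      # outlier handling by clipping
        feat = {"log_revenue": math.log1p(revenue)}    # logarithmic transformation
        for industry in industries:                    # one-hot encoding of a discrete variable
            feat[f"industry={industry}"] = 1 if r["industry"] == industry else 0
        # Binning of a continuous variable: the bin index counts the edges reached.
        feat["years_bin"] = sum(r["years_operating"] >= edge for edge in bin_edges)
        features.append(feat)
    return features


rows = [
    {"monthly_revenue": 20000.0, "industry": "retail", "years_operating": 3},
    {"monthly_revenue": None, "industry": "catering", "years_operating": 12},
    {"monthly_revenue": 5_000_000.0, "industry": "retail", "years_operating": 7},
    {"monthly_revenue": 40000.0, "industry": "catering", "years_operating": 1},
]
features = preprocess(rows)
print(features[1])
```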
S103: Input the plurality of derived features into a data-driven credit model to determine an initial credit line of the user to be credited.
A data-driven model differs from a traditional expert rule model formulated from expert experience: it uses massive, multi-dimensional data and an artificial intelligence algorithm to obtain a more comprehensive and accurate model. In the present application, the data-driven credit model is built in advance from historical data and represents the relationship between the derived features and the credit line. The derived features of the user to be credited are input into the established data-driven credit model to obtain the initial credit line of that user.
Unlike manual calculation, the data-driven credit model provided by the embodiments of the present application is driven by data: the model and its parameters are calculated quantitatively, which removes subjective judgment, reduces deviation, improves calculation efficiency and lowers the rate at which credit line applications are rejected.
The following describes a process of establishing a data-driven trust model with reference to S201 to S203.
S201: Acquire related data of the credited users from the plurality of different sources.
The source of the related data of the trusted user is the same as that of the related data of the user to be trusted, and the related data of the trusted user is obtained through a plurality of different sources.
S202: Perform feature learning on the related data of the credited users to obtain a plurality of derived features.
As a possible implementation, after the related data of the credited users is acquired, original data features are obtained through the preprocessing described above, and feature learning is then performed on the original data features to analyze and mine a plurality of derived features of the credited users.
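The kinds of feature learning named in S102 (statistical aggregation, time-series derivation, feature crossing and passing a field through unchanged) can be illustrated with one hand-written example of each. The field names and the particular formulas are assumptions chosen for illustration, not the features used in the application.

```python
from statistics import mean


def derive_features(record):
    """Derive four illustrative features from one user's original data features."""
    flows = record["monthly_inflows"]            # oldest month first
    if len(flows) <= 3 or record["liabilities"] <= 0:
        raise ValueError("need more than 3 months of inflows and positive liabilities")
    recent, earlier = flows[-3:], flows[:-3]
    avg_inflow = mean(flows)
    return {
        # Statistical aggregation over the whole history.
        "avg_monthly_inflow": avg_inflow,
        # Time-series derivation: growth of the last 3 months over the earlier months.
        "inflow_growth_3m": mean(recent) / mean(earlier) - 1.0,
        # Feature crossing: ratio of two original features.
        "inflow_to_liability": avg_inflow / record["liabilities"],
        # Pass-through: an original feature kept unchanged.
        "tax_credit_grade": record["tax_credit_grade"],
    }


record = {
    "monthly_inflows": [100.0, 100.0, 100.0, 120.0, 120.0, 120.0],
    "liabilities": 220.0,
    "tax_credit_grade": "A",
}
derived = derive_features(record)
print(derived)
```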
S203: Fit the credit lines of the credited users to the plurality of derived features to obtain the data-driven credit model.
The derived features mined from the related data of the credited users are taken as independent variables and the credit lines of the corresponding credited users as dependent variables, and a data-driven credit model that represents the mapping between the independent and dependent variables is built.
As a possible implementation, the training samples may be divided by time into two parts, a training set and a test set. An initial data-driven credit model is built from the samples in the training set, and the model is evaluated and adjusted using the samples in the test set. The training samples are then divided by time again and the data-driven credit model is retrained; through such batch training the model is iteratively optimized until a stable and optimal data-driven credit model is obtained.
As one possible implementation, the training samples may be divided into two in the time dimension at a 9:1 ratio.
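A time-ordered 9:1 split can be sketched as below. The sample layout and the `credited_on` field are hypothetical; the point is only that the most recent tenth of the samples is held out, so the model is always tested on data later than what it was trained on.

```python
from datetime import date


def split_by_time(samples, train_parts=9, test_parts=1):
    """Sort samples by date and cut them into an earlier training set and a later test set."""
    ordered = sorted(samples, key=lambda s: s["credited_on"])
    cut = len(ordered) * train_parts // (train_parts + test_parts)
    return ordered[:cut], ordered[cut:]


# Ten hypothetical credited users, one per month, in shuffled order.
months = [5, 1, 9, 3, 12, 7, 2, 10, 4, 8]
samples = [{"user_id": f"u{i}", "credited_on": date(2023, m, 1)} for i, m in enumerate(months)]

train_set, test_set = split_by_time(samples)
print(len(train_set), len(test_set))
```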
The embodiments of the present application do not specifically limit the fitting manner. For example, the fitting may be performed with the Light Gradient Boosting Machine (LightGBM, a distributed gradient boosting framework based on decision tree algorithms).
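To show what fitting credit lines to derived features means in practice, the sketch below implements a very small gradient boosting regressor with one-split regression trees (stumps) and squared-error loss in plain Python. It is a stand-in chosen so that the example has no dependencies; it is not LightGBM and not the model of the application, and the feature values, credit lines, number of rounds and learning rate are all made up.

```python
def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) with least squared error, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, thr, lm, rm)
    return None if best is None else best[1:]


def fit_boosted_stumps(X, y, n_rounds=50, learning_rate=0.3):
    """Gradient boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:            # no feature has two distinct values to split on
            break
        j, thr, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= thr else rm) for row, p in zip(X, preds)]
    return base, learning_rate, stumps


def predict(model, row):
    """Initial credit line predicted for one vector of derived features."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= thr else rm) for j, thr, lm, rm in stumps)


def mse(model, X, y):
    return sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)


# Hypothetical derived features (independent variables) and credit lines (dependent variable).
X = [[1.0, 0.2], [2.0, 0.1], [3.0, 0.4], [4.0, 0.3], [5.0, 0.5], [6.0, 0.6]]
y = [10000.0, 20000.0, 30000.0, 40000.0, 50000.0, 60000.0]

model = fit_boosted_stumps(X, y)
print(predict(model, [5.5, 0.5]))
```

In a real setting the same role would be played by a library regressor trained on the training set and checked on the time-ordered test set.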
As a possible implementation, after the derived features are obtained and before the fitting, operations such as correlation analysis and importance extraction may be performed on the derived features to improve the running efficiency of the model: correlation analysis removes features that are highly correlated with one another, and importance extraction removes features found to be unimportant.
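The correlation-analysis part of this step can be sketched as a greedy filter on pairwise Pearson correlation. The 0.95 threshold, the keep-the-first-seen rule and the feature names are assumptions; importance extraction is not shown because it depends on the fitted model.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(columns, threshold=0.95):
    """Keep a feature only if it is not highly correlated with any feature already kept."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical derived features; total_inflow is just avg_inflow times 12.
columns = {
    "avg_inflow": [1.0, 2.0, 3.0, 4.0, 5.0],
    "total_inflow": [12.0, 24.0, 36.0, 48.0, 60.0],
    "years_operating": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = drop_correlated(columns)
print(kept)
```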
According to the above scheme, to find potentially high-quality users among the users to be credited, that is, users whose historical credit line was determined to be zero, the credit line is re-determined based not on related data from a single source but on related data from a plurality of different sources, so that the real situation of the user to be credited can be fully uncovered. A plurality of derived features are extracted from the related data of the user to be credited; the derived features are features with business meaning obtained by feature learning on the related data of credited users, and they can reflect a user's real situation. A data-driven credit model is built in advance by fitting the credit lines of the credited users to the derived features, and the initial credit line of the user to be credited is obtained from this model and the derived features extracted for that user. Because the derived features and the data-driven credit model serve as the basis for evaluating the user to be credited, the evaluation is supported by data and the influence of human experience is reduced. Data from a plurality of different sources allows the user to be credited to be analyzed comprehensively and in depth from multiple dimensions, so that the user's real situation can be evaluated comprehensively and accurately. Potentially high-quality users are thereby identified among the users to be credited and their credit lines are no longer determined to be zero, which improves the accuracy of credit line determination and expands the range of users to whom credit is extended.
As a possible implementation manner, if the user to be trusted is an enterprise user, after S103, the initial credit line of the user to be trusted may also be adjusted, see S104-S106:
S104: Acquire multi-dimensional data of the user to be credited.
The multi-dimensional data may include a plurality of the following: regional industry dimension data, customer group intimacy dimension data, customer value dimension data and business stability dimension data. The indicators used need to meet requirements such as data coverage, data discrimination and data stability.
The data of each dimension is divided into a plurality of tiers, and different tiers correspond to different dimension coefficients and to users to be credited with different potential. The adjustment coefficients for the different dimensions and tiers are obtained through data analysis and modeling, for example by building an expert rule model based on expert experience.
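Mapping a dimension value to its tier and then to a dimension coefficient can be done with a sorted list of tier boundaries, as in the sketch below. The dimension names, boundaries and coefficient values are invented for illustration; in the application they come from data analysis and modeling.

```python
from bisect import bisect_right

# Hypothetical tier boundaries and coefficients per dimension.
# N boundaries define N + 1 tiers, so each coefficient list is one longer.
TIERS = {
    "business_stability": ([2, 5, 10], [0.8, 1.0, 1.2, 1.5]),   # years in operation
    "customer_value": ([100000, 500000], [0.9, 1.0, 1.2]),      # in-bank assets
}


def dimension_coefficient(dimension, value):
    """Look up the coefficient of the tier that `value` falls into."""
    edges, coefficients = TIERS[dimension]
    return coefficients[bisect_right(edges, value)]


print(dimension_coefficient("business_stability", 7))
print(dimension_coefficient("customer_value", 50000))
```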
S105: Obtain a comprehensive coefficient for the user to be credited from the dimension coefficients corresponding to the user's multi-dimensional data.
After the dimension coefficients corresponding to the plurality of dimension data of the user to be credited are obtained, they can be combined into a comprehensive coefficient for that user. The comprehensive coefficient indicates the degree to which the user to be credited is a potentially high-quality user.
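The application does not state how the dimension coefficients are combined. As one possible reading, the sketch below uses a weighted arithmetic mean with equal weights by default; both the combination rule and the example coefficients are assumptions.

```python
def comprehensive_coefficient(dimension_coefficients, weights=None):
    """Combine per-dimension coefficients by a weighted mean (an assumed rule)."""
    if not dimension_coefficients:
        raise ValueError("at least one dimension coefficient is required")
    if weights is None:
        weights = {name: 1.0 for name in dimension_coefficients}
    total_weight = sum(weights[name] for name in dimension_coefficients)
    weighted_sum = sum(dimension_coefficients[name] * weights[name] for name in dimension_coefficients)
    return weighted_sum / total_weight


# Hypothetical dimension coefficients for one enterprise user.
coefficients = {
    "regional_industry": 1.2,
    "customer_group_intimacy": 1.0,
    "customer_value": 1.5,
    "business_stability": 0.9,
}
overall = comprehensive_coefficient(coefficients)
print(round(overall, 4))
```

A product of the coefficients would be an equally plausible rule; it reacts more strongly when several dimensions point the same way.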
S106: Adjust the initial credit line according to the comprehensive coefficient to obtain the target credit line.
The initial credit line is adjusted according to the comprehensive coefficient of the user to be credited to obtain the target credit line. For example, users to be credited with a higher comprehensive coefficient (such as 1.2 or 1.5) are likely to have lower risk and stable operation, so the initial credit line obtained from the data-driven credit model can be appropriately raised by the comprehensive coefficient. Users to be credited with a lower comprehensive coefficient (such as 0.8 or 0.6) have higher risk or unstable operation, so the initial credit line obtained from the data-driven credit model can be appropriately lowered by the comprehensive coefficient.
In this way, enterprise users are effectively segmented according to enterprise characteristics using the related data of each enterprise user: a credit line is calculated by the data-driven credit model, and this initial credit line is adjusted by the comprehensive coefficient, so that an appropriate credit line can be calculated for the potentially high-quality users. While the bank's risk is kept under control, each enterprise user can obtain a higher, more accurate and more usable credit line, the credit line of the user to be credited is determined more comprehensively and accurately, and the range of users to whom credit is extended is enlarged.
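If the adjustment is read as a simple multiplication, which the example coefficients above suggest but the application does not state outright, it can be sketched as follows. The function name and the rounding to two decimals are assumptions.

```python
def adjust_credit_line(initial_credit_line, coefficient):
    """Scale the initial credit line by the comprehensive coefficient (assumed to be multiplicative)."""
    if initial_credit_line < 0 or coefficient <= 0:
        raise ValueError("credit line must be non-negative and coefficient positive")
    return round(initial_credit_line * coefficient, 2)


raised = adjust_credit_line(100000.0, 1.2)    # coefficient above 1 raises the line
lowered = adjust_credit_line(100000.0, 0.8)   # coefficient below 1 lowers it
print(raised, lowered)
```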
In addition to the method for determining a user's credit line, an embodiment of the present application further provides a device for determining a user's credit line. As shown in fig. 2, the device includes: an acquisition unit 201, an extraction unit 202, and a determination unit 203;
the acquisition unit 201 is configured to acquire related data of a user to be credited from a plurality of different sources, where the user to be credited is a user whose historical credit line was determined to be zero;
the extraction unit 202 is configured to extract a plurality of derived features from the related data of the user to be credited, where the derived features are features with business meaning obtained by performing feature learning on related data of credited users;
the determination unit 203 is configured to input the plurality of derived features into a data-driven credit model and determine an initial credit line of the user to be credited, where the data-driven credit model is built by fitting the credit lines of the credited users to the derived features.
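How the three units cooperate can be sketched as a small class that chains three callables. The class name, the stand-in functions and their numbers are hypothetical; a real device would plug in actual data access, feature extraction and a trained data-driven credit model.

```python
class CreditLineDevice:
    """Chains an acquisition unit, an extraction unit and a determination unit."""

    def __init__(self, acquire, extract, model):
        self.acquire = acquire    # acquisition unit: user_id -> related data
        self.extract = extract    # extraction unit: related data -> derived features
        self.model = model        # determination unit: derived features -> credit line

    def initial_credit_line(self, user_id):
        related_data = self.acquire(user_id)
        derived_features = self.extract(related_data)
        return self.model(derived_features)


# Stand-in units with made-up behaviour, for illustration only.
def acquire(user_id):
    return {"avg_monthly_inflow": 40000.0}


def extract(related_data):
    return [related_data["avg_monthly_inflow"] / 10000.0]


def model(derived_features):
    return 25000.0 * derived_features[0]


device = CreditLineDevice(acquire, extract, model)
print(device.initial_credit_line("e1"))
```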
As a possible implementation manner, if the user to be trusted is an enterprise user, the apparatus further includes an adjusting unit, configured to:
obtaining multi-dimensional data of the user to be credited, wherein the multi-dimensional data comprises various types of regional industry dimensional data, customer group intimacy dimensional data, customer value dimensional data and operation stability dimensional data;
obtaining comprehensive coefficients corresponding to the users to be trusted according to the dimensionality coefficients corresponding to the multidimensional data of the users to be trusted;
and adjusting the initial credit line according to the comprehensive coefficient to obtain a target credit line.
As a possible implementation manner, the apparatus further includes a training unit configured to:
acquiring related data of the trusted user according to the different sources;
carrying out feature learning on the related data of the trusted user to obtain a plurality of derived features;
and fitting the credit line of the trusted user according to the plurality of derived characteristics to obtain a data-driven credit model.
As a possible implementation, the training unit is configured to:
preprocessing the related data of the trusted user to obtain original data features;
and performing feature learning on the original data features to obtain a plurality of derived features.
As a possible implementation manner, the user to be trusted is an enterprise user, and the different sources include enterprise basic information, enterprise owner basic information, and external scene data.
An embodiment of the present application further provides a computer device. Referring to fig. 3, which shows a structural diagram of a computer device provided in an embodiment of the present application, the device includes a processor 310 and a memory 320:
the memory 320 is used for storing program codes and transmitting the program codes to the processor;
the processor 310 is configured to execute any method for determining the credit line of the user provided in the above embodiments according to the instructions in the program code.
The embodiment of the application provides a computer-readable storage medium, wherein the computer-readable storage medium is used for storing a computer program, and the computer program is used for executing any method for determining the user credit line provided by the embodiment.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described apparatus embodiments are merely illustrative, and the units and modules described as separate components may or may not be physically separate. In addition, some or all of the units and modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present invention, and it is understood that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the invention.