
CN112766814A - Training method, device and equipment for credit risk pressure test model - Google Patents

Training method, device and equipment for credit risk pressure test model Download PDF

Info

Publication number
CN112766814A
Authority
CN
China
Prior art keywords
pressure
index
training
data
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110160918.2A
Other languages
Chinese (zh)
Inventor
方成
朱佳宁
金焰
丁允文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110160918.2A
Publication of CN112766814A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Educational Administration (AREA)
  • Game Theory and Decision Science (AREA)
  • Biophysics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Technology Law (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The embodiments of this specification provide a training method, a training device and training equipment for a credit risk pressure test model, which can be used in the technical field of big data. The method comprises: determining a pressure index and a pressure-bearing index, where the pressure index represents factors that influence credit risk and the pressure-bearing index represents factors influenced by the pressure index; acquiring an index data set according to the pressure index and the pressure-bearing index, where each index data item comprises pressure index data and pressure-bearing index data; dividing the index data set into a training set and a test set; and training a preset neural network model based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain the credit risk pressure test model. During training, the model being trained is verified against the test set, and training stops when the error on the training set and the error on the test set meet a preset condition. The embodiments of this specification can effectively improve the accuracy of credit risk pressure tests.

Description

Training method, device and equipment for credit risk pressure test model
Technical Field
The application relates to the field of big data, in particular to a training method, a training device and training equipment for a credit risk pressure test model.
Background
With the rapid development of science and technology, banks face various risks in their operations, such as credit risk, market risk, liquidity risk, operational risk and legal risk, and these risks affect banks to different degrees. Because credit assets account for 50% of a bank's total assets, credit risk has a large influence on bank operations and is the main object and core content of risk management for financial institutions and regulators. Stress testing of credit risk is therefore becoming increasingly important for guarding against risks that may arise in the future.
In the prior art, credit risk stress testing mainly relies on experts who set the parameters of a credit risk stress test model according to historical experience and then feed various risk parameters into the model. This approach is easily influenced by subjective human factors, and the reference information available to the experts is limited, so the stress test results are inaccurate.
Therefore, there is a need for a solution to the above technical problems.
Disclosure of Invention
The embodiment of the specification provides a training method, a training device and training equipment for a credit risk pressure test model, and the accuracy of a credit risk pressure test can be effectively improved.
The training method, the training device and the training equipment for the credit risk stress test model are realized in the following modes.
A training method for a credit risk stress test model comprises the following steps: determining a pressure index and a pressure-bearing index, where the pressure index represents factors influencing credit risk and the pressure-bearing index represents factors influenced by the pressure index; acquiring an index data set according to the pressure index and the pressure-bearing index, where each index data item in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index; dividing the index data set into a training set and a test set; and training a preset neural network model based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain a credit risk pressure test model, where, during training, the model being trained is verified using the test set and training stops when the error corresponding to the training set and the error corresponding to the test set meet a preset condition.
A training device for a credit risk stress test model comprises: a determining module, used for determining a pressure index and a pressure-bearing index, where the pressure index represents factors influencing credit risk and the pressure-bearing index represents factors influenced by the pressure index; an acquisition module, used for acquiring an index data set according to the pressure index and the pressure-bearing index, where each index data item in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index; a dividing module, used for dividing the index data set into a training set and a test set; and a training module, used for training a preset neural network model based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain a credit risk pressure test model, where, during training, the model being trained is verified using the test set and training stops when the error corresponding to the training set and the error corresponding to the test set meet a preset condition.
Training equipment for a credit risk stress test model comprises at least one processor and a memory storing computer-executable instructions which, when executed by the processor, implement the steps of the method of any one of the method embodiments of this specification.
A computer readable storage medium having stored thereon computer instructions which, when executed, implement the steps of any one of the method embodiments in the present specification.
The specification provides a training method, a training device and training equipment for a credit risk stress test model. In some embodiments, a pressure index and a pressure-bearing index can be determined, and an index data set can be obtained according to them. The pressure index represents factors influencing credit risk, the pressure-bearing index represents factors influenced by the pressure index, and each index data item in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index. The index data set can then be divided into a training set and a test set, and a preset neural network model can be trained based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain a credit risk pressure test model. During training, the model being trained is verified using the test set, and training stops when the error corresponding to the training set and the error corresponding to the test set meet a preset condition. The implementation provided by this specification can effectively improve the accuracy of credit risk pressure tests.
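As a rough illustration of the training loop summarized above, the following Python sketch alternates stochastic gradient descent updates with a simulated-annealing perturbation and stops once both the training error and the test error fall below a threshold. Everything here is an assumption made for illustration: the single linear neuron stands in for the preset neural network, the way the two algorithms are interleaved is one possible reading, and the names `train`, `mse`, `tol`, `temp` and `cooling`, as well as the synthetic data, are hypothetical and not taken from the specification.

```python
import math
import random


def mse(w, b, data):
    """Mean squared error of the linear stand-in model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(train_set, test_set, lr=0.1, temp=1.0, cooling=0.9,
          tol=1e-3, max_epochs=500, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    train_err = test_err = float("inf")
    for _ in range(max_epochs):
        # Stochastic gradient descent: one update per shuffled sample.
        samples = list(train_set)
        rng.shuffle(samples)
        for x, y in samples:
            err = w * x + b - y
            w -= lr * 2 * err * x
            b -= lr * 2 * err
        # Simulated annealing: propose a random jump and accept it
        # with the Metropolis rule at the current temperature.
        cand_w = w + rng.gauss(0.0, temp)
        cand_b = b + rng.gauss(0.0, temp)
        delta = mse(cand_w, cand_b, train_set) - mse(w, b, train_set)
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            w, b = cand_w, cand_b
        temp *= cooling  # cool down so later jumps become small
        # Verify on the test set; stop when both errors meet the
        # (assumed) preset condition.
        train_err = mse(w, b, train_set)
        test_err = mse(w, b, test_set)
        if train_err < tol and test_err < tol:
            break
    return w, b, train_err, test_err


# Synthetic data, for illustration only: y = 2x + 1.
data = [(i / 19, 2 * (i / 19) + 1) for i in range(20)]
random.Random(1).shuffle(data)
train_set, test_set = data[:16], data[16:]
w, b, train_err, test_err = train(train_set, test_set)
```

The annealing step is meant to help the search escape poor regions early on, while the cooling schedule lets gradient descent dominate later; a real implementation would apply the same idea to the weights of the actual network.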
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, are incorporated in and constitute a part of this specification, and are not intended to limit the specification. In the drawings:
FIG. 1 is a schematic flow chart diagram illustrating one embodiment of a training method for a credit risk stress test model provided herein;
FIG. 2 is a schematic diagram illustrating a preset neural network model;
FIG. 3 is a block diagram of an embodiment of a training apparatus for a credit risk pressure test model provided in the present specification;
FIG. 4 is a block diagram of a hardware structure of an embodiment of a training server of a credit risk pressure test model provided in the present specification.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments in the present specification, and not all of the embodiments. All other embodiments that can be obtained by a person skilled in the art on the basis of one or more embodiments of the present description without inventive step shall fall within the scope of protection of the embodiments of the present description.
The following describes an embodiment of the present specification using a specific application scenario as an example. Specifically, FIG. 1 is a schematic flow chart of an embodiment of a training method of a credit risk pressure test model provided in this specification. Although the present specification provides the method steps or apparatus structures shown in the following embodiments or figures, the method or apparatus may include more or fewer steps or modules based on conventional or non-inventive effort.
One embodiment provided by the present specification can be applied to a client, a server, and the like. The client may include a terminal device, such as a smart phone, a tablet computer, and the like. The server may include a single computer device, or may include a server cluster formed by a plurality of servers, or a server structure of a distributed system, and the like.
It should be noted that the following description of the embodiments does not limit the technical solutions in other extensible application scenarios based on the present specification. In an embodiment of the training method for the credit risk stress test model provided in the present specification, as shown in FIG. 1, the method may include the following steps.
S0: determining a pressure index and a pressure-bearing index; the pressure index represents factors influencing credit risk, and the pressure bearing index represents factors influenced by the pressure index.
In the embodiments of the present specification, the pressure index may also be referred to as a pressure scenario variable or an explanatory variable, and represents factors that affect credit risk. The pressure-bearing index may also be referred to as a pressure-bearing variable or an explained variable, and represents factors affected by the pressure index. The pressure-bearing variable is the variable ultimately affected by the pressure studied in the pressure test, and is also the explained variable of the credit risk pressure test model. In general, the credit risk profile can be represented by the pressure-bearing variables. In some scenarios, the pressure index can be understood as the independent variable of the model and the pressure-bearing index as the dependent variable.
In some embodiments, the pressure indexes may include macroeconomic indicators and enterprise financial indicators. The macroeconomic indicators may include at least one of: GDP output gap, effective exchange rate, real broad money supply (M2), nominal one-year loan interest rate, and consumer price index. The enterprise financial indicators include at least one of: the balance rate, the ratio of total balance to sales revenue, the ratio of cash to total balance, liquidity ratio, quick ratio, cash ratio, multiple of interest, multiple of interest support, rate of tangible balance, rate of tangible equity balance, title ratio, ratio of pre-interest profit to total, ratio of interest to sum of interest and pre-interest profit, ratio of savings to sales revenue, ratio of working capital to total, ratio of sales revenue to total, ratio of net profit to owner's equity, ratio of net profit to total, ratio of liquidity to total, retained profit, total. The pressure-bearing index may include at least one of: default rate, bad loan rate, recovery rate, loan loss provision, capital adequacy ratio, and total profit.
In some embodiments, the pressure index may be determined from pressure data associated with credit risk, and the pressure-bearing index may be determined based on the credit risk conduction mechanism.
In some implementation scenarios, historical pressure events related to bank credit risk may be analyzed to obtain pressure data related to credit risk, which can be understood as the sources of credit risk pressure. In some implementation scenarios, these pressure sources can be classified into six categories: economy, natural disasters, politics and society, changes in management policies, changes in industry, and internal factors of enterprises. Of course, these pressure sources are only examples, and the embodiments of this specification are not limited to them. Those skilled in the art may make other modifications in light of the technical spirit of this specification, and any such modification that achieves the same or similar functions and effects falls within the scope of protection of this specification.
In some implementation scenarios, after the pressure data related to credit risk is obtained, pressure indexes that can represent an influence on credit risk may be screened from it. For example, the pressure data that is difficult to quantify may first be eliminated, leaving the pressure data that is easy to quantify, from which the pressure indexes are then screened. Pressure sources such as natural disasters, politics and society are too broad to be described by a handful of indexes; natural disasters alone include volcanic eruptions, earthquakes, typhoons, floods and El Niño events. In the embodiments of this specification, natural disasters, politics and society, changes in management policies, and changes in industry are therefore treated as pressure sources that are difficult to quantify. Economic factors and internal factors of enterprises, by contrast, can be represented by common macroeconomic and financial indexes, and are therefore treated as easily quantified pressure sources.
Of course, the above description is only an example, and other easily quantified pressure sources may also be included. The embodiments of this specification are not limited to the above examples. Those skilled in the art may make other modifications in light of the technical spirit of this specification, and any such modification that achieves the same or similar functions and effects falls within the scope of protection of this specification.
In some implementation scenarios, after obtaining the easily quantified pressure data, the easily quantified pressure data may be divided into two categories, i.e., a macro-economic factor and an enterprise-internal factor, and then pressure indexes capable of representing credit risk are selected from the two categories.
In some implementation scenarios, in combination with the nature of the bank loan transaction, the following five macro-economic variables can be selected from the pressure data belonging to the macro-economic factors as the pressure index, as shown in table 1.
TABLE 1 Pressure index: macroeconomic factors
Variable name    Meaning
gdpgap           GDP output gap
reer, neer       Effective exchange rate
M2               Real broad money supply
loanrate         Nominal one-year loan interest rate
cpi              Consumer price index
The real broad money supply (M2) is calculated as follows: taking the 1990 price index as 100, the nominal broad money supply of each year is divided by the price index of that year, which yields the real broad money supply with the influence of price fluctuations removed.
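The deflation step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `real_money_supply` and the sample figures are hypothetical, and the index is treated as a percentage with the base year equal to 100, so the nominal value is divided by the index over 100.

```python
def real_money_supply(nominal_m2, price_index, base_year=1990):
    """Deflate nominal M2 by a price index rebased so that base_year = 100."""
    base = price_index[base_year]
    real = {}
    for year, nominal in nominal_m2.items():
        rebased = price_index[year] / base * 100.0  # index with base year = 100
        real[year] = nominal / (rebased / 100.0)
    return real


# Hypothetical figures, for illustration only.
nominal = {1990: 1500.0, 1991: 1900.0}
cpi = {1990: 100.0, 1991: 103.4}
real = real_money_supply(nominal, cpi)
print(real[1990])  # 1500.0
```

In the base year the real and nominal values coincide; in later years the real value is smaller whenever prices have risen.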
In some implementation scenarios, taking into account both the enterprise financial indexes mentioned in domestic and international credit risk research and the financial indexes that have a large influence on credit and for which data is available in the databases of existing loan enterprises, the following 21 enterprise financial indexes can be selected from the pressure data belonging to internal factors of enterprises as pressure indexes, as shown in Table 2. The 21 selected indexes can be divided into repayment capacity ratios and other ratios.
TABLE 2 Pressure index: enterprise internal factors
[Table 2 appears as an image in the original publication (Figure BDA0002936580650000061).]
Of course, the above description is only an example, and the pressure indexes selected from the pressure data corresponding to macroeconomic factors and internal factors of enterprises may also include other indexes. The embodiments of this specification are not limited to the above examples. Those skilled in the art may make other modifications in light of the technical spirit of this specification, and any such modification that achieves the same or similar functions and effects falls within the scope of protection of this specification.
Because the pressure generated by a pressure scenario variable is transmitted onward and ultimately changes the pressure-bearing variable, the pressure scenario variable can be regarded as the cause and the pressure-bearing variable as the effect. In some implementation scenarios, by analogy with the transmission of monetary policy, bank credit risk also has its own conduction mechanism: pressure originates in the external macroeconomy, then affects the operating condition of the borrowing enterprise or the financial condition of the individual borrower, which changes repayment capability, and finally increases the bank's bad loans and in turn affects the bank's capital and profit. Therefore, the following six indexes can be selected as the pressure-bearing variables of bank credit risk:
(1) Default rate (Default Rates)
Default is defined as a failure to repay in full, or a repayment overdue by more than 90 days. The default rate can be derived statistically from historical loan repayment records and is the variable most directly affected by credit risk pressure. In general, the default rate can be calculated by the number of defaulting enterprises, by the number of defaulted loans, or by the defaulted loan amount. The enterprise-count method reflects the proportion of enterprises in default, the loan-count method reflects the proportion of defaulted loans (the default frequency), and the loan-amount method reflects the proportion of the loan amount in default. Because loan amounts can differ greatly (for example, enterprise A has 5 loans amounting to 100,000, while enterprise B has 1 loan amounting to 100 million), defaults on loans of different sizes affect the bank differently. A default rate calculated by the enterprise-count method or the loan-count method may therefore not adequately reflect the credit risk faced by the bank. Thus, in some implementation scenarios, the loan-amount method may be selected to calculate the default rate $P_{DR}$, as follows:
$$P_{DR} = \frac{AMT_{DR}}{AMT_{total}}$$
where $P_{DR}$ is the default rate, $AMT_{DR}$ is the amount of loans in default in the current period, and $AMT_{total}$ is the total loan amount in the current period.
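To make the difference between the three calculation methods concrete, here is a small Python sketch. The function `default_rates`, the tuple layout and the portfolio figures are hypothetical and chosen only to mirror the enterprise A and enterprise B example above (A's five loans are assumed to total 100,000).

```python
def default_rates(loans):
    """loans: list of (enterprise, amount, defaulted) tuples for one period.

    Returns the default rate by enterprise count, by loan count and by amount.
    """
    enterprises = {e for e, _, _ in loans}
    defaulted_enterprises = {e for e, _, d in loans if d}
    by_enterprise = len(defaulted_enterprises) / len(enterprises)
    by_count = sum(1 for _, _, d in loans if d) / len(loans)
    by_amount = (sum(a for _, a, d in loans if d)
                 / sum(a for _, a, _ in loans))
    return by_enterprise, by_count, by_amount


# Hypothetical portfolio: only enterprise B's single large loan defaults.
loans = [("A", 20_000, False)] * 5 + [("B", 100_000_000, True)]
by_enterprise, by_count, by_amount = default_rates(loans)
print(by_enterprise)  # 0.5
```

In this portfolio the enterprise-count method gives one half and the loan-count method gives one sixth, while the loan-amount method shows that almost the entire exposure is in default, which is why the amount-based rate is preferred here.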
(2) Bad loan rate (Non-Performing Loan Ratio)
According to the five-category loan classification standard, bank loans are divided by degree of risk into five classes: normal, special mention, substandard, doubtful and loss, of which substandard, doubtful and loss loans are bad loans. The bad loan rate can be understood as the proportion of a financial institution's bad loans to its total loan balance. An increase in credit risk raises the bad loan rate, so the bad loan rate is a core index of bank asset quality. It is calculated as follows:
$$P_{NPLR} = \frac{AMT_{SL} + AMT_{DL} + AMT_{LL}}{AMT_{total\_ol}}$$
where $P_{NPLR}$ is the bad loan rate, $AMT_{SL}$ is the amount of substandard loans in the current period, $AMT_{DL}$ is the amount of doubtful loans in the current period, $AMT_{LL}$ is the amount of loss loans in the current period, and $AMT_{total\_ol}$ is the total loan balance in the current period.
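A minimal Python sketch of this ratio is shown below. The parameter names simply follow the symbols in the definitions above; the function name `npl_ratio` and the sample balances are hypothetical.

```python
def npl_ratio(amt_sl, amt_dl, amt_ll, amt_total_ol):
    """Bad loan rate: (AMT_SL + AMT_DL + AMT_LL) / AMT_total_ol."""
    if amt_total_ol <= 0:
        raise ValueError("total loan balance must be positive")
    return (amt_sl + amt_dl + amt_ll) / amt_total_ol


# Hypothetical balances, for illustration only.
ratio = npl_ratio(amt_sl=30.0, amt_dl=15.0, amt_ll=5.0, amt_total_ol=2500.0)
print(f"{ratio:.2%}")  # 2.00%
```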
(3) Recovery rate (Recovery Rates)
The loan recovery rate applies to defaulted loans and refers to the proportion of the amount recovered to the amount issued, or to the cumulative loan amount, over a given period. An increase in credit risk lowers the recovery rate. When a loan defaults, the default rate can be calculated, but a default does not mean the loan is lost entirely: the bank can gradually recover it by stepping up collection efforts, disposing of collateral and taking other measures, thereby minimizing the loss. The recovery rate is calculated as follows:
$$P_{RR} = \frac{AMT_{TR}}{AMT_{total}}$$
where $P_{RR}$ is the recovery rate, $AMT_{TR}$ is the cumulative amount of loans recovered in the current period, and $AMT_{total}$ is the total loan amount in the current period.
(4) Loan loss provision (Loan Loss Provision)
The loan loss provision is a dedicated fund that a bank sets aside, in a certain proportion of the expected loan loss, to guard against bad loans. It can be divided into a general provision, a specific provision and a special provision. In some implementation scenarios, the bank makes a general provision of not less than 1% of the year-end total loan balance, and a specific provision of 2% for special-mention loans, 25% for substandard loans, 50% for doubtful loans and 100% for loss loans. The special provision is made by a bank against the loan risk of a particular country, region, industry or loan type, at a ratio determined by the bank itself. The special provision is not a reserve that the bank accrues regularly; the bank accrues it only under special circumstances. Under this accrual method, pressure affects the specific provision by affecting loan risk and thereby changing the bank's asset quality. The loan loss provision is calculated as follows:
$$AMT_{LLP} = AMT_{total\_ol} \times P_{total} + AMT_{SML} \times 2\% + AMT_{SL} \times 25\% + AMT_{DL} \times 50\% + AMT_{LL} \times 100\% + \sum_{i=1}^{n} AMT_{i} \times P_{i}$$
where $AMT_{LLP}$ is the loan loss provision, $AMT_{total\_ol}$ is the total loan balance in the current period, $P_{total}$ is the bank's general provision ratio, $AMT_{SML}$ is the amount of special-mention loans in the current period, $AMT_{SL}$ is the amount of substandard loans in the current period, $AMT_{DL}$ is the amount of doubtful loans in the current period, $AMT_{LL}$ is the amount of loss loans in the current period, $AMT_{i}$ and $P_{i}$ are respectively the loan amount and the provision ratio of the special provision for the i-th loan category, so that $\sum_{i=1}^{n} AMT_{i} \times P_{i}$ is the total special provision, and n is the number of loan categories.
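The three components described above (a general part on the total balance, a class-based part with the 2%, 25%, 50% and 100% ratios, and a bank-determined part summed over n loan categories) can be sketched in Python as follows. The function name, the dictionary keys, the `special` argument and all figures are hypothetical and serve only to illustrate the arithmetic.

```python
# Class-based ratios quoted in the text, keyed by the symbol suffixes
# used in the definitions (SML, SL, DL, LL).
CLASS_RATIOS = {"SML": 0.02, "SL": 0.25, "DL": 0.50, "LL": 1.00}


def loan_loss_provision(amt_total_ol, p_total, classified, special=()):
    """Sum the general, class-based and category-specific components.

    amt_total_ol: total loan balance of the period
    p_total:      general ratio applied to the total balance
    classified:   balances keyed by "SML", "SL", "DL", "LL"
    special:      iterable of (amount, ratio) pairs, one per loan category
    """
    general = amt_total_ol * p_total
    class_based = sum(ratio * classified.get(key, 0.0)
                      for key, ratio in CLASS_RATIOS.items())
    category_based = sum(amount * ratio for amount, ratio in special)
    return general + class_based + category_based


# Hypothetical balances, for illustration only.
provision = loan_loss_provision(
    amt_total_ol=10_000.0,
    p_total=0.01,
    classified={"SML": 500.0, "SL": 200.0, "DL": 100.0, "LL": 50.0},
    special=[(1_000.0, 0.05)],
)
print(provision)  # 310.0
```

Here the general part contributes 100, the class-based part 160 and the category-specific part 50.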
(5) Capital adequacy ratio (Capital Adequacy Ratio)
A bank's capital consists mainly of its own funds, and the capital adequacy ratio is the ratio of the bank's total capital to its risk-weighted assets. Sufficient capital helps a bank absorb risk losses, so the capital adequacy ratio reflects the risk condition of the bank's assets and its ability to withstand various risks. One important use of bank capital is the loan loss provision: when credit risk turns loans bad and the loss provision is used to write them off, the bank's net capital decreases and the capital adequacy ratio falls. Credit risk pressure thus affects the capital adequacy ratio through the loan loss provision. The capital adequacy ratio is calculated as follows:
$$CAR = \frac{AMT_{Capital}}{AMT_{RWA}}$$
where $CAR$ is the capital adequacy ratio, $AMT_{Capital}$ is the bank's total capital, and $AMT_{RWA}$ is the bank's risk-weighted assets.
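The effect described above, where writing off bad loans reduces capital and therefore lowers the ratio, can be illustrated with a short Python sketch. The figures are hypothetical, and holding risk-weighted assets constant across the two cases is a simplifying assumption.

```python
def capital_adequacy_ratio(amt_capital, amt_rwa):
    """CAR = total capital / risk-weighted assets."""
    if amt_rwa <= 0:
        raise ValueError("risk-weighted assets must be positive")
    return amt_capital / amt_rwa


# Hypothetical figures; risk-weighted assets are assumed unchanged.
before = capital_adequacy_ratio(amt_capital=1_200.0, amt_rwa=10_000.0)
write_off = 150.0  # capital consumed by offsetting bad loans
after = capital_adequacy_ratio(amt_capital=1_200.0 - write_off,
                               amt_rwa=10_000.0)
print(f"{before:.1%} -> {after:.1%}")  # 12.0% -> 10.5%
```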
(6) Total Profit (Total Profit)
Liquidity, security and profitability are the basic goals of commercial bank operations management. Profit embodies a bank's operating results and is both the basis of its existence and the driver of its development. The profit of a domestic bank consists mainly of business income (mainly interest income), investment income and the like, minus business costs (mainly interest expenditure), business expenses, the loan loss provision, income tax and the like. Under pressure related to credit risk, the bank's bad loans increase, which raises the loan loss provision and in turn lowers profit. The bank's total profit can be obtained directly from its financial statements and is calculated as follows:
$$AMT_{TP} = AMT_{income} - AMT_{expense} - AMT_{LLP} - AMT_{tax}$$
where $AMT_{TP}$ is the total profit, $AMT_{income}$ is the amount of income-type inflows such as the bank's business income and investment income, $AMT_{expense}$ is the amount of outflows for the bank's business costs and business expenses, $AMT_{LLP}$ is the loan loss provision, and $AMT_{tax}$ is the related tax.
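Following the verbal description above, in which costs, the loan loss amount set aside and tax are deducted from income, a minimal Python sketch of the profit calculation might look like this. Expenses are passed as positive outflow amounts and subtracted; the function name and all figures are hypothetical.

```python
def total_profit(amt_income, amt_expense, amt_llp, amt_tax):
    """Income minus expenses, loan loss amount set aside, and tax."""
    return amt_income - amt_expense - amt_llp - amt_tax


# Hypothetical figures: under stress more is set aside for loan losses,
# so profit falls even though income and expenses are unchanged.
baseline = total_profit(amt_income=1_000.0, amt_expense=600.0,
                        amt_llp=100.0, amt_tax=75.0)
stressed = total_profit(amt_income=1_000.0, amt_expense=600.0,
                        amt_llp=180.0, amt_tax=55.0)
print(baseline, stressed)  # 225.0 165.0
```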
In the embodiment of the specification, the pressure indexes influencing the credit risk and the pressure-bearing indexes influenced by the pressure indexes are obtained by analyzing, selecting and quantifying the historical pressure events related to the credit risk of the bank and based on the conduction mechanism of the credit risk of the bank, so that a more accurate data source can be provided for subsequent model training, and the accuracy of subsequent credit risk pressure testing can be effectively guaranteed.
S2: acquiring an index data set according to the pressure index and the pressure-bearing index; each index data in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index.
In the embodiment of the present description, after the pressure index and the pressure-bearing index are determined, an index data set may be obtained according to the pressure index and the pressure-bearing index. The index data set comprises index data which can be used for subsequent training of the model.
In some embodiments, the obtaining an index data set according to the pressure index and the pressure bearing index may include: acquiring a data table; wherein, the data table can comprise a macro economic data table, an enterprise financial statement and a loan record table; extracting corresponding data from the macroscopic economic data table and the enterprise financial statement according to the pressure index to obtain pressure index data; extracting corresponding data from the loan record table according to the pressure-bearing index to obtain pressure-bearing index data; and correlating the pressure index data with the pressure-bearing index data to obtain an index data set.
In some implementation scenarios, a data table associated with bank loans, such as a loan record table, may be obtained from a database of the national people's bank. In some implementation scenarios, the macro economic data table and the enterprise financial statement may be obtained from financial information publicly disclosed by each bank. The time span of the data in each data table should cover periods of economic downturn, such as the Asian financial crisis, so that the selected data is reasonably representative.
Because the data index range included in the obtained data sheet is usually much larger than the required pressure index and pressure-bearing index range, in some implementation scenes, after the data sheet is obtained, corresponding data can be extracted from a macroscopic economic data sheet and an enterprise financial statement according to the pressure index to obtain pressure index data, and corresponding data is extracted from a loan record sheet according to the pressure-bearing index to obtain pressure-bearing index data. Therefore, the accuracy of model training can be improved by removing the unnecessary index data and keeping the data corresponding to the required index.
In some implementation scenarios, the obtained pressure index data and pressure-bearing index data relate to a plurality of data tables, and data in different data tables may have a corresponding relationship, for example, data in a loan record table may have an association with data in an enterprise financial index table, and data in the loan record table may have an association with data in a macro economic data table, so that in order to make the index data set include relatively perfect index data, data obtained from different data tables may be associated, so that data scattered in different data tables but having an association is stored in one data table, and at this time, data in the data table may be used as a corresponding index data set. Each line in the index data set may represent a record, and each record may include corresponding pressure index data and pressure-bearing index data, and may also include other basic attributes and the like.
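The association step described above can be sketched as a plain key-based join. This is only an illustrative sketch: the field names (`enterprise_id`, `period`, `gdp_growth`, `debt_ratio`, `default_rate`) and the choice of join keys are hypothetical and are not specified in the text.

```python
from typing import Dict, List, Tuple

Record = Dict[str, object]


def build_index_dataset(
    loans: List[Record],
    financials: Dict[Tuple[str, str], Record],
    macro: Dict[str, Record],
) -> List[Record]:
    """Associate loan records with enterprise financial data and macro data.

    Assumed (hypothetical) keys: financial rows are keyed by
    (enterprise_id, period) and macro rows are keyed by period.
    """
    dataset: List[Record] = []
    for loan in loans:
        fin = financials.get((loan["enterprise_id"], loan["period"]))
        mac = macro.get(loan["period"])
        if fin is None or mac is None:
            continue  # skip loan records that cannot be associated
        # One output row = basic attributes + pressure index data
        # + pressure-bearing index data.
        row: Record = {
            "enterprise_id": loan["enterprise_id"],
            "period": loan["period"],
        }
        row.update(mac)   # pressure indexes from the macro table
        row.update(fin)   # pressure indexes from the financial statement
        row["default_rate"] = loan["default_rate"]  # pressure-bearing index
        dataset.append(row)
    return dataset


loans = [
    {"enterprise_id": "E1", "period": "2008", "default_rate": 0.04},
    {"enterprise_id": "E2", "period": "2009", "default_rate": 0.02},
]
financials = {("E1", "2008"): {"debt_ratio": 0.7}}
macro = {"2008": {"gdp_growth": 0.09}, "2009": {"gdp_growth": 0.08}}
index_dataset = build_index_dataset(loans, financials, macro)
```

In this sketch the second loan record is dropped because no matching financial statement row exists; whether such records should instead be kept with missing values is a design choice the text leaves open.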
In some embodiments, each piece of input data in the obtained index data set corresponds to one piece of output data, where the input data may include the above 26 pressure indexes, that is, 26 dimensions, and the output data may include the above 6 pressure-bearing indexes, that is, 6 dimensions. It is to be understood that the above description is only exemplary, and other numbers of indicators may be included in the input data and the output data in the indicator data set, and the embodiments of the present disclosure are not limited to the above examples, and other modifications may be made by those skilled in the art within the spirit of the present disclosure, but all the functions and effects that are achieved by the present disclosure are covered by the scope of the present disclosure.
S4: the index data set is divided into a training set and a test set.
In the embodiment of the present specification, after the index data set is obtained, the index data set may be divided into a training set and a test set, so as to perform model training and verification in the following.
In some embodiments, before dividing the index data set into a training set and a test set, a hot-deck imputation method may be used to fill in missing data in the index data set to obtain a first index data set; abnormal data in the first index data set is deleted by using a regression test model to obtain a second index data set; correspondingly, the second index data set is divided into a training set and a test set.
Due to the fact that abnormal data such as repetition, errors, wrong formats, non-standardization and the like may exist in the obtained index data set, in some implementation scenarios, the data in the index data set can be preprocessed before the index data set is divided into a training set and a testing set. Among them, the preprocessing may include missing value completion, outlier processing, and the like.
In some implementation scenarios, for missing data, a hot-deck imputation method may be used to find a similar object in the preset database, and the missing value is then filled with the value of that similar object. Different problems may adopt different criteria to determine similarity. For example, in some implementation scenarios, a similarity matrix may be used to determine which variable (e.g., variable Y) is most correlated with the variable in which the missing value is located (e.g., variable X); all cases are then sorted by the value of variable Y, so that the missing value of variable X can be replaced by the data of the case arranged immediately before it. It is to be understood that the foregoing is only an exemplary illustration, and other ways of completing missing values may be used; the embodiments of the present disclosure are not limited to the foregoing examples, and other modifications may be made by those skilled in the art within the spirit of the present disclosure, all of which are intended to be covered by the scope of the present disclosure as long as the functions and effects achieved are the same as or similar to those of the present disclosure.
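The sorting-based filling described above can be sketched as follows. This is a minimal sketch under stated assumptions: Pearson correlation stands in for the "similarity matrix", only the target column contains missing values, and the function and column names are hypothetical.

```python
import math
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def hot_deck_fill(rows: List[Row], target: str) -> List[Row]:
    """Fill missing values of `target` (variable X) from the preceding case
    after sorting by the most correlated variable (variable Y)."""
    complete = [r for r in rows if r[target] is not None]
    candidates = [c for c in rows[0] if c != target]
    # Variable Y: the column most strongly correlated with X.
    donor_var = max(
        candidates,
        key=lambda c: abs(pearson([r[c] for r in complete],
                                  [r[target] for r in complete])),
    )
    ordered = sorted(rows, key=lambda r: r[donor_var])
    filled: List[Row] = []
    last_seen: Optional[float] = None
    for r in ordered:
        new_row = dict(r)
        if new_row[target] is None:
            # Use the nearest preceding observed value; stays None if the
            # very first case in the ordering is the one that is missing.
            new_row[target] = last_seen
        else:
            last_seen = new_row[target]
        filled.append(new_row)
    return filled


rows = [
    {"gdp": 1.0, "noise": 5.0, "rate": 10.0},
    {"gdp": 2.0, "noise": 1.0, "rate": 20.0},
    {"gdp": 3.0, "noise": 4.0, "rate": None},
    {"gdp": 4.0, "noise": 2.0, "rate": 40.0},
]
filled_rows = hot_deck_fill(rows, "rate")
```

Here `gdp` is selected as variable Y, and the missing `rate` is taken from the case with the next-lower `gdp` value.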
In some implementation scenarios, for abnormal values, data cleaning may be performed for abnormal data far from the predicted values through a regression test model. It is to be understood that the above description is only exemplary, and the abnormal value processing method may include other methods, and the embodiments of the present disclosure are not limited to the above examples, and other modifications may be made by those skilled in the art within the spirit of the present disclosure, but all the functions and effects achieved by the embodiments are within the scope of the present disclosure.
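One possible reading of the regression-based cleaning above is sketched below. The text does not specify the regression model or how "far from the predicted values" is measured, so the single-variable least-squares line and the threshold of `k` residual standard deviations are assumptions made purely for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def drop_outliers(points: List[Point], k: float = 2.0) -> List[Point]:
    """Remove points whose residual exceeds k residual standard deviations."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    intercept, slope = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in points]
    sigma = math.sqrt(sum(r * r for r in residuals) / len(residuals))
    if sigma == 0.0:
        return list(points)  # perfect fit: nothing to remove
    return [p for p, r in zip(points, residuals) if abs(r) <= k * sigma]


# Points on y = 2x + 1, with one abnormal record at x = 5.
points = [(float(x), 2.0 * x + 1.0) for x in range(10)]
points[5] = (5.0, 100.0)
cleaned = drop_outliers(points)
```

Because a large outlier also distorts the fitted line, a robust fit or an iterative refit may be preferable in practice; the sketch keeps a single pass for clarity.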
In the embodiment of the specification, hot-deck imputation and a regression test algorithm are introduced to clean the data: invalid and abnormal data are eliminated and standard, accurate data are retained, so that more reliable and accurate data can be provided for subsequent model training, thereby improving prediction accuracy.
In some implementation scenarios, after the data in the index data set is preprocessed, the index data set may be divided into a training set and a test set. The training set can be used for training the model, and the test set can be used for verifying the model obtained by training. For example, in some implementation scenarios, most (e.g., 70%) of the data in the index data set may be used as the training set for training the model, including calculating gradients and updating connection weights and thresholds, and the remaining data may be used as the test set for estimating errors. Of course, the division of the index data set may be set according to the actual scenario, and this specification does not limit this.
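A 70/30 division of this kind can be sketched as follows. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the text only gives the 70% proportion as an example.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    records: Sequence[T], train_fraction: float = 0.7, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the records and split them into a training and a test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_dataset(list(range(100)))
```

If the records are time-ordered, a chronological split may be more appropriate than a random one; that choice is outside what the text specifies.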
S6: training a preset neural network model based on the training set, the stochastic gradient descent algorithm and the simulated annealing algorithm to obtain a credit risk pressure test model; in the training process, the model obtained by training is verified by using the test set, and the training is stopped when the error corresponding to the training set and the error corresponding to the test set meet the preset condition.
In the embodiment of the present description, after the index data set is divided into the training set and the test set, the preset neural network model may be trained based on the training set, the stochastic gradient descent algorithm, and the simulated annealing algorithm.
In some embodiments, training a preset neural network model based on the training set, the stochastic gradient descent algorithm, and the simulated annealing algorithm to obtain a credit risk pressure test model includes: constructing a preset neural network model, where the preset neural network model comprises an input layer, a hidden layer and an output layer, the number of neurons in the input layer is determined according to the number of pressure indexes, and the number of neurons in the output layer is determined according to the number of pressure-bearing indexes; training the preset neural network model by using the index data in the training set to obtain a first test model and a first error set, where the parameters of the preset neural network model are updated during training based on the stochastic gradient descent algorithm and the simulated annealing algorithm; verifying the first test model by using the index data in the test set to obtain a second error set; judging whether the first error set and the second error set meet the preset condition; when it is determined that the preset condition is not met, updating the parameters of the first test model based on the stochastic gradient descent algorithm and the simulated annealing algorithm to obtain a second test model; training the second test model by using the index data in the training set to obtain a third test model and a third error set, where the parameters of the second test model are updated during training based on the stochastic gradient descent algorithm and the simulated annealing algorithm; verifying the third test model by using the index data in the test set to obtain a fourth error set; judging whether the third error set and the fourth error set meet the preset condition; and when it is determined that the preset condition is met, using the third test model as the credit risk pressure test model.
In some implementation scenarios, the preset neural network model may be a BP (back propagation) neural network. A BP neural network is a multi-layer feedforward neural network trained with the error back propagation algorithm. Of course, the preset neural network model may be another neural network, and other modifications may be made by those skilled in the art in light of the spirit of the present application, all of which are intended to be covered by the scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application.
In some implementation scenarios, the preset condition may be that the training set error is in a descending trend while the test set error is in an ascending trend. For example, in some implementation scenarios, when the training set error decreases and the test set error increases, the preset condition is considered to be met and the training is stopped. The preset condition may also take other forms, for example stopping the training when the training set error falls to a certain threshold; other modifications may be made by those skilled in the art in light of the technical spirit of the present application, all of which are intended to be covered by the scope of the present application as long as the functions and effects achieved are the same as or similar to those of the present application. It should be noted that the preset condition may be referred to as an "early-stop" strategy.
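The early-stop condition above can be sketched as a check on the two error histories. Comparing only the two most recent values is an assumption; the text speaks of "trends" without defining a window, and the function name is hypothetical.

```python
from typing import Sequence


def should_stop(train_errors: Sequence[float], test_errors: Sequence[float]) -> bool:
    """Return True when the latest training error went down while the
    latest test error went up, compared with the previous iteration."""
    if len(train_errors) < 2 or len(test_errors) < 2:
        return False  # not enough history to see a trend
    training_falling = train_errors[-1] < train_errors[-2]
    test_rising = test_errors[-1] > test_errors[-2]
    return training_falling and test_rising
```

A single noisy iteration can trigger this check, so a longer window (often called "patience") is a common refinement; it is not part of the described method.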
In some implementation scenarios, updating the parameters of the preset neural network model, the first test model, or the second test model based on the stochastic gradient descent algorithm and the simulated annealing algorithm may include: acquiring the initial parameters in the k-th iteration; and updating the initial parameters in the k-th iteration according to the following formulas to obtain the updated parameters after the k-th iteration:
$$v'_k \leftarrow v_k + \theta \cdot \Delta v_k$$
$$\theta = 2 \times \mathrm{random}(0, 1)$$
where $v'_k$ denotes the updated parameter after the k-th iteration, $v_k$ denotes the initial parameter in the k-th iteration, $\theta$ denotes a random number, and $\Delta v_k$ denotes the parameter variation in the k-th iteration;
based on the error corresponding to the k-th iteration and the error corresponding to the (k-1)-th iteration, determining the initial parameter in the (k+1)-th iteration according to the following formulas:
$$v_{k+1} = v'_k \times P$$
[Formula image BDA0002936580650000131: definition of the acceptance probability $P$ in terms of $E_k$, $E_{k-1}$ and $T$]
$$T \leftarrow \lambda \cdot T$$
where $v_{k+1}$ denotes the initial parameter in the (k+1)-th iteration, $P$ denotes the probability of accepting the updated parameter, $E_k$ denotes the error corresponding to the k-th iteration, $E_{k-1}$ denotes the error corresponding to the (k-1)-th iteration, $T$ denotes the temperature in the simulated annealing algorithm, and $\lambda$ denotes a positive number smaller than 1.
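The update described above can be sketched as below. The expression for P is given only as a formula image that does not survive in the text, so the Metropolis-style acceptance probability used here (1 when the error does not increase, otherwise exp(-(E_k - E_{k-1}) / T)) is an assumption borrowed from standard simulated annealing, not a confirmed reading of the source. All function names are hypothetical.

```python
import math
import random


def perturbed_update(v_k: float, delta_v_k: float, rng: random.Random) -> float:
    """v'_k = v_k + theta * delta_v_k, with theta = 2 * random(0, 1)."""
    theta = 2.0 * rng.random()
    return v_k + theta * delta_v_k


def acceptance_probability(e_k: float, e_prev: float, temperature: float) -> float:
    """ASSUMED Metropolis-style form of P; the source formula is not available."""
    if e_k <= e_prev:
        return 1.0
    return math.exp(-(e_k - e_prev) / temperature)


def next_initial_parameter(v_prime_k: float, p: float) -> float:
    """v_{k+1} = v'_k * P, written exactly as the text states it."""
    return v_prime_k * p


def cool(temperature: float, lam: float) -> float:
    """T <- lambda * T, with 0 < lambda < 1."""
    return lam * temperature


rng = random.Random(0)
temperature = 1000.0
v_prime = perturbed_update(v_k=1.0, delta_v_k=0.5, rng=rng)
p = acceptance_probability(e_k=0.12, e_prev=0.10, temperature=temperature)
v_next = next_initial_parameter(v_prime, p)
temperature = cool(temperature, lam=0.9)
```

Under this assumed form, P stays close to 1 while T is large and shrinks for worse solutions as T decays.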
In the embodiment of the specification, introducing the stochastic gradient descent algorithm and the simulated annealing algorithm can effectively overcome the tendency of a traditional BP neural network to fall into a local optimum and to converge slowly, and introducing the early-stop strategy can prevent the BP neural network from over-fitting. The model parameters are therefore more accurate, reliance on subjective human judgment during credit risk testing is reduced, and the missed and false alarms caused by human factors are reduced.
The above training process is described below with reference to a specific embodiment, however, it should be noted that the specific embodiment is only for better describing the present application and is not to be construed as limiting the present application. Fig. 2 is a schematic structural diagram of a neural network model provided in the present specification, as shown in fig. 2. The preset neural network model comprises an input layer, a hidden layer (hidden layer) and an output layer.
In this embodiment, the number of neurons in the input layer is 26, the number of neurons in the output layer is 6, and the number of neurons in the hidden layer can be adjusted by trial and error. In this embodiment, the value of the i-th neuron of the input layer can be denoted as $x_i$, the output value of the h-th ($h > 0$) neuron of the hidden layer can be denoted as $b_h$, the output value of the j-th neuron of the output layer can be denoted as $y_j$, the connection weight between the i-th neuron of the input layer and the h-th neuron of the hidden layer can be denoted as $v_{ih}$, the connection weight between the h-th neuron of the hidden layer and the j-th neuron of the output layer can be denoted as $w_{hj}$, the threshold of the h-th neuron of the hidden layer can be denoted as $\gamma_h$, and the threshold of the j-th neuron of the output layer can be denoted as $\theta_j$. In this embodiment, both the hidden layer neurons and the output layer neurons use the Sigmoid function as the activation function, and the Sigmoid function is:
$$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$$
The input value $\alpha_h$ and the output value $b_h$ of the h-th neuron of the hidden layer are calculated as:
$$\alpha_h = \sum_{i=1}^{26} v_{ih} x_i$$
$$b_h = \mathrm{sigmoid}(\alpha_h - \gamma_h)$$
The input value $\beta_j$ and the output value $y_j$ of the j-th neuron of the output layer are calculated as:
$$\beta_j = \sum_{h} w_{hj} b_h$$
$$y_j = \mathrm{sigmoid}(\beta_j - \theta_j)$$
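The forward computation of this network can be sketched as follows, assuming that each neuron's input is the weighted sum of the previous layer's outputs and that the threshold is subtracted before the Sigmoid is applied. The hidden layer size of 8 is a hypothetical value, since the text leaves it to trial and error.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def forward(
    x: List[float], v: Matrix, gamma: List[float], w: Matrix, theta: List[float]
) -> Tuple[List[float], List[float]]:
    """Return (hidden outputs b, network outputs y).

    v[i][h] is the input-to-hidden weight, w[h][j] the hidden-to-output
    weight, gamma[h] and theta[j] the hidden and output thresholds.
    """
    d, q, l = len(x), len(gamma), len(theta)
    b = [sigmoid(sum(v[i][h] * x[i] for i in range(d)) - gamma[h])
         for h in range(q)]
    y = [sigmoid(sum(w[h][j] * b[h] for h in range(q)) - theta[j])
         for j in range(l)]
    return b, y


# 26 pressure indexes in, 6 pressure-bearing indexes out, 8 hidden neurons.
rng = random.Random(0)
n_in, n_hidden, n_out = 26, 8, 6
v = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
w = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for _ in range(n_hidden)]
gamma = [0.0] * n_hidden
theta = [0.0] * n_out
hidden, outputs = forward([rng.random() for _ in range(n_in)], v, gamma, w, theta)
```

Because the output layer also uses the Sigmoid, every output lies in (0, 1), which suggests that the pressure-bearing index data would need to be scaled into that range before training.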
In this embodiment, after the preset neural network model is constructed, the preset neural network model may be trained by using the index data in the training set.
Specifically, given a training set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, $x_i \in \mathbb{R}^{26}$, $y_i \in \mathbb{R}^{6}$, for a training sample $(x_k, y_k)$, assume that the output of the neural network model is
$$\hat{y}_k = (\hat{y}_1^k, \hat{y}_2^k, \ldots, \hat{y}_6^k)$$
Then the mean square error corresponding to $(x_k, y_k)$ can be written as:
$$E_k = \frac{1}{2} \sum_{j=1}^{6} (\hat{y}_j^k - y_j^k)^2$$
where $(x_k, y_k)$ denotes the k-th sample in the training set, i.e. the k-th index data, and $m$ denotes the total number of samples in the training set.
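The per-sample error can be computed as sketched below. The factor 1/2 is an assumption: it is the form that makes the weight update formulas given later in this embodiment come out without an extra factor of 2, but the original error formula is an image that does not survive in the text.

```python
from typing import List


def sample_error(y_hat: List[float], y: List[float]) -> float:
    """E_k = 1/2 * sum_j (y_hat_j - y_j)^2 for one training sample."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(y_hat, y))


def mean_error(predictions: List[List[float]], targets: List[List[float]]) -> float:
    """Average of E_k over the m samples of a data set."""
    m = len(targets)
    return sum(sample_error(p, t) for p, t in zip(predictions, targets)) / m
```

`mean_error` can be evaluated separately on the training set and the test set to obtain the two error values that the early-stop condition compares.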
Because the traditional BP neural network model is based on the gradient descent algorithm, each iteration computes the gradient of the error function at the current point and determines the search direction from that gradient, so the iteration result easily falls into a local optimum. To avoid this, in this embodiment the parameters of the model are updated during training by introducing a stochastic gradient descent algorithm and a simulated annealing algorithm.
Specifically, for the introduced stochastic gradient descent algorithm, a random factor is added whenever a parameter is updated in an iteration, that is, the update formula of any parameter $v$ is:
$$v \leftarrow v + \theta \cdot \Delta v$$
$$\theta = 2 \times \mathrm{random}(0, 1)$$
where $v$ denotes a parameter in the model, such as a connection weight or a threshold, $\Delta v$ denotes the parameter variation, and $\theta$ denotes a random number.
Further, for the introduced simulated annealing algorithm, a worse result than the current solution is accepted with a certain probability in each iteration, and the probability of accepting the "suboptimal solution" gradually decreases as the number of iterations increases. In each iteration, the probability P of accepting a new parameter is:
[Formula image BDA0002936580650000151: definition of the acceptance probability $P$ in terms of $E_k$, $E_{k-1}$ and $T$]
$$T \leftarrow \lambda \cdot T$$
where $E_k$ denotes the error corresponding to the k-th iteration, $E_{k-1}$ denotes the error corresponding to the (k-1)-th iteration, $T$ denotes the temperature in the simulated annealing algorithm, and $\lambda$ denotes a positive number smaller than 1. In some implementation scenarios, $\lambda$ takes a value between 0.8 and 0.99. $T$ can be used to adjust the annealing rate; its initial value is generally large, for example 1000, which helps ensure that the model iterates sufficiently.
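As a worked illustration of the cooling rule $T \leftarrow \lambda \cdot T$, the temperature after $n$ iterations follows a geometric decay. The stopping level $T < 1$ used below is an arbitrary reference point chosen for the arithmetic, not a value given in the text.

```latex
% Temperature after n iterations, starting from T_0:
T_n = \lambda^{n} \, T_0

% Example with T_0 = 1000 and \lambda = 0.9:
T_{50} = 1000 \times 0.9^{50} \approx 1000 \times 0.00515 \approx 5.15

% Number of iterations until T_n < 1:
n > \frac{\ln T_0}{-\ln \lambda}
\quad\Rightarrow\quad
\begin{cases}
\lambda = 0.8:  & n > 30.96, \text{ i.e. } n = 31 \\
\lambda = 0.9:  & n > 65.56, \text{ i.e. } n = 66 \\
\lambda = 0.99: & n > 687.3, \text{ i.e. } n = 688
\end{cases}
```

The comparison shows how strongly the choice of $\lambda$ within the stated range of 0.8 to 0.99 affects the number of iterations during which worse solutions are still likely to be accepted.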
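In this embodiment, with the goal of minimizing the model error $E_k$, a learning rate $\eta \in (0, 1)$ may be set, for example $\eta = 0.1$. When the parameters are updated in each iteration, the update formula $v \leftarrow v + \theta \cdot \Delta v$ for the connection weights and thresholds in the model can then be expressed as:
$$w_{hj} \leftarrow w_{hj} + \theta \times \Delta w_{hj}$$
$$\theta_j \leftarrow \theta_j + \theta \times \Delta \theta_j$$
$$v_{ih} \leftarrow v_{ih} + \theta \times \Delta v_{ih}$$
$$\gamma_h \leftarrow \gamma_h + \theta \times \Delta \gamma_h$$
$$\Delta w_{hj} = \eta \, g_j \, b_h$$
$$\Delta \theta_j = -\eta \, g_j$$
$$\Delta v_{ih} = \eta \, e_h \, x_i$$
$$\Delta \gamma_h = -\eta \, e_h$$
where $g_j$ is the gradient term of the output layer neurons, calculated as
$$g_j = \hat{y}_j^k \, (1 - \hat{y}_j^k) \, (y_j^k - \hat{y}_j^k)$$
with $\hat{y}_j^k$ and $y_j^k$ the j-th components of the model output and of the target for the k-th sample, and $e_h$ is the gradient term of the hidden layer neurons, calculated as
$$e_h = b_h \, (1 - b_h) \sum_{j=1}^{6} w_{hj} \, g_j$$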
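One training iteration on a single sample can be sketched as follows. The gradient terms g_j and e_h are given in the source only as formula images; the forms used here are the ones implied by a Sigmoid activation, a squared error with factor 1/2, and the delta formulas stated above, and should be read as a reconstruction. Drawing a single random factor per iteration (rather than one per parameter) is also an assumption, as are the tiny network sizes in the usage lines.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train_step(x: List[float], y: List[float], v: Matrix, gamma: List[float],
               w: Matrix, theta: List[float], eta: float,
               rng: random.Random) -> float:
    """Update all parameters in place for one sample; return the error
    measured before the update."""
    d, q, l = len(x), len(gamma), len(theta)
    # Forward pass.
    b = [sigmoid(sum(v[i][h] * x[i] for i in range(d)) - gamma[h])
         for h in range(q)]
    y_hat = [sigmoid(sum(w[h][j] * b[h] for h in range(q)) - theta[j])
             for j in range(l)]
    error = 0.5 * sum((y_hat[j] - y[j]) ** 2 for j in range(l))
    # Gradient terms of the output layer (g) and hidden layer (e),
    # both computed before any weight is changed.
    g = [y_hat[j] * (1.0 - y_hat[j]) * (y[j] - y_hat[j]) for j in range(l)]
    e = [b[h] * (1.0 - b[h]) * sum(w[h][j] * g[j] for j in range(l))
         for h in range(q)]
    # Random factor 2 * random(0, 1); named `factor` to avoid clashing
    # with the output thresholds `theta`.
    factor = 2.0 * rng.random()
    for h in range(q):
        for j in range(l):
            w[h][j] += factor * (eta * g[j] * b[h])
    for j in range(l):
        theta[j] += factor * (-eta * g[j])
    for i in range(d):
        for h in range(q):
            v[i][h] += factor * (eta * e[h] * x[i])
    for h in range(q):
        gamma[h] += factor * (-eta * e[h])
    return error


rng = random.Random(1)
v = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w = [[rng.uniform(-0.5, 0.5) for _ in range(1)] for _ in range(3)]
gamma, theta = [0.0] * 3, [0.0]
errors = [train_step([1.0, 0.5], [0.9], v, gamma, w, theta, 0.1, rng)
          for _ in range(300)]
```

The sketch omits the simulated annealing acceptance step, so it shows only the randomly scaled back propagation update.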
In the embodiment of the specification, only one hidden layer containing enough neurons is needed, and the continuous function with any complexity can be approximated with any precision. Due to its strong expression power, the BP neural network is susceptible to overfitting, and its training errors continue to decrease, but the testing errors may increase. In this embodiment of the present specification, most data (for example, 70% of the source database) in the index data set may be used as a training set to train the model, the remaining data may be used as a test set to estimate an error, if the training set error decreases but the test set error increases, the training may be stopped, and a corresponding parameter when the training is stopped may be used as a parameter of the final credit risk pressure test model.
In some embodiments, after obtaining the credit risk pressure test model, pressure index data in different pressure situations may also be obtained; wherein the different pressure scenarios are designed in advance based on a historical situation analysis method and an assumed situation analysis method; inputting the pressure index data into the credit risk pressure test model to obtain pressure-bearing index data corresponding to different pressure situations; and carrying out risk control based on the pressure-bearing index data.
In some implementation scenarios, a hybrid approach may be used to determine the stress situation by using a combination of historical situation analysis and hypothesis situation analysis, i.e. on the basis of analyzing and including the historical situation, the stress situation may be designed by combining the hypotheses of a more severe situation in the future.
In some implementation scenarios, the designed pressure situations may be divided into mild, moderate, and severe. It should be noted that the pressure situations are set on the basis of end-of-period data: the mild pressure index may be determined by applying mild pressure to the end-of-period and average indexes; the moderate pressure index may be determined from the downward changes of external macroeconomic variables and enterprise financial data over ten years, taking the maximum change amplitude within one year as the basis, which is equivalent to a once-in-ten-years pressure situation; the severe pressure index may be determined as 1.5 times the moderate pressure index, reflecting a severe economic decline that has never occurred before but could occur. Here, end-of-period data refers to data at the end of an accounting period, such as the end of a month or the end of a year. Mild pressure means that the index is slightly worse, for example 5% worse, than the end-of-period or average index. A downward change refers to a worsening of the index. A severe economic decline that has never occurred before but could occur may include, for example, an economic recession caused by a war compounded by an epidemic.
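The three pressure levels described above can be sketched as below. Several points are assumptions: shocks are expressed as signed relative changes whose sign already points in the worsening direction, the mild shock reuses that direction with a 5% amplitude, and "1.5 times the moderate pressure index" is read as scaling the moderate shock by 1.5 rather than scaling the index value itself. The index name in the usage lines is hypothetical.

```python
import math
from typing import Dict, List


def design_scenarios(
    baseline: Dict[str, float],
    adverse_yearly_changes: Dict[str, List[float]],
    mild_shift: float = 0.05,
    severe_factor: float = 1.5,
) -> Dict[str, Dict[str, float]]:
    """Build mild / moderate / severe pressure index values.

    baseline: end-of-period value of each pressure index.
    adverse_yearly_changes: for each index, the signed relative yearly
        changes in the worsening direction observed over ten years.
    """
    scenarios: Dict[str, Dict[str, float]] = {
        "mild": {}, "moderate": {}, "severe": {}
    }
    for name, value in baseline.items():
        # Largest one-year worsening in the ten-year window.
        worst = max(adverse_yearly_changes[name], key=abs)
        mild = math.copysign(mild_shift, worst)
        scenarios["mild"][name] = value * (1.0 + mild)
        scenarios["moderate"][name] = value * (1.0 + worst)
        scenarios["severe"][name] = value * (1.0 + severe_factor * worst)
    return scenarios


scenarios = design_scenarios(
    baseline={"unemployment_rate": 5.0},
    adverse_yearly_changes={"unemployment_rate": [0.1, 0.2]},
)
```

Each resulting dictionary can then serve as the pressure index input to the trained model for the corresponding situation.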
In some implementation scenarios, after the pressure situations are designed, the established credit risk pressure test model can be used to test the bank's default rate, bad loan rate, recovery rate, capital adequacy ratio, loan loss provision and total profit under mild, moderate and severe pressure. Further, bank staff can prepare a test report or an emergency plan according to the pressure test result, so as to give early warning of risks, replenish funds in advance, and the like. In this way, by applying the pressure test result, the bank can effectively control risk and improve its level of risk management.
In the embodiment of the specification, the pressure test is carried out by reasonably designing the pressure situation and utilizing the trained credit risk pressure test model, so that the accuracy of the credit risk pressure test can be effectively improved.
It is to be understood that the foregoing is only exemplary, and the embodiments of the present disclosure are not limited to the above examples, and other modifications may be made by those skilled in the art within the spirit of the present disclosure, and the scope of the present disclosure is intended to be covered by the claims as long as the functions and effects achieved by the embodiments are the same as or similar to the present disclosure.
From the above description, it can be seen that the pressure index and the pressure-bearing index can be determined, and the index data set is obtained according to the pressure index and the pressure-bearing index; further, the index data set can be divided into a training set and a testing set, and the preset neural network model is trained on the basis of the training set, the random gradient descent algorithm and the simulated annealing algorithm to obtain a credit risk pressure testing model; in the training process, the model obtained by training is verified by using the test set, the training is stopped when the error corresponding to the training set and the error corresponding to the test set meet the preset condition, the pressure index represents the factors influencing the credit risk, the pressure-bearing index represents the factors influenced by the pressure index, and each index data in the index data set comprises the data corresponding to the pressure index and the data corresponding to the pressure-bearing index. Compared with the prior art that a specialist sets parameters of a credit risk pressure test model according to historical experience and then inputs various risk parameters into the credit risk pressure test model for testing, the embodiment of the specification can effectively improve the accuracy of the credit risk pressure test.
In the present specification, the method embodiments are described in a progressive manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. For relevant details, refer to the description of the method embodiments.
Based on the training method for the credit risk pressure test model described above, one or more embodiments of the present specification further provide a training apparatus for the credit risk pressure test model. The apparatus may include systems (including distributed systems), software (applications), modules, components, servers, clients, and the like that use the methods described in the embodiments of the present specification, combined with any necessary hardware for implementation. Based on the same inventive concept, the embodiments of the present specification provide an apparatus as described in the following embodiments. Since the way the apparatus solves the problem is similar to that of the method, the specific implementation of the apparatus in the embodiments of the present specification may refer to the implementation of the foregoing method, and repeated details are omitted. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Specifically, fig. 3 is a schematic block diagram of an embodiment of a training apparatus for a credit risk pressure test model provided in this specification, and as shown in fig. 3, the training apparatus for a credit risk pressure test model provided in this specification may include: a determination module 120, an acquisition module 122, a division module 124, and a training module 126.
A determining module 120, configured to determine a pressure index and a pressure-bearing index; the pressure index represents factors influencing credit risk, and the pressure bearing index represents factors influenced by the pressure index;
an obtaining module 122, configured to obtain an index data set according to the pressure index and the pressure-bearing index; each index data in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index;
a partitioning module 124, which may be configured to partition the metric data set into a training set and a test set;
the training module 126 may be configured to train a preset neural network model based on the training set, the stochastic gradient descent algorithm, and the simulated annealing algorithm to obtain a credit risk pressure test model; and in the training process, the model obtained by training is verified by using the test set, and the training is stopped when the error corresponding to the training set and the error corresponding to the test set meet the preset condition.
It should be noted that, in line with the description of the method embodiments, the apparatus described above may also include other embodiments; for specific implementations, refer to the description of the related method embodiments, which is not repeated here.
The present specification also provides an embodiment of a training apparatus for a credit risk stress test model, comprising a processor and a memory for storing processor-executable instructions, which when executed by the processor, implement steps comprising: determining a pressure index and a pressure-bearing index; the pressure index represents factors influencing credit risk, and the pressure bearing index represents factors influenced by the pressure index; acquiring an index data set according to the pressure index and the pressure-bearing index; each index data in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index; dividing the index data set into a training set and a test set; training a preset neural network model based on the training set, the random gradient descent algorithm and the simulated annealing algorithm to obtain a credit risk pressure test model; and in the training process, the model obtained by training is verified by using the test set, and the training is stopped when the error corresponding to the training set and the error corresponding to the test set meet the preset condition.
It should be noted that the above-mentioned apparatuses may also include other embodiments according to the description of the method or apparatus embodiments. The specific implementation manner may refer to the description of the related method embodiment, and is not described in detail herein.
The method embodiments provided in the present specification may be executed in a mobile terminal, a computer terminal, a server or a similar computing device. Taking execution on a server as an example, fig. 4 is a block diagram of the hardware structure of an embodiment of a training server for a credit risk pressure test model provided in this specification, where the server may be the training apparatus or the training equipment of the credit risk pressure test model in the above embodiments. As shown in fig. 4, the server 10 may include one or more (only one shown) processors 100 (the processors 100 may include, but are not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 200 for storing data, and a transmission module 300 for communication functions. It will be understood by those skilled in the art that the structure shown in fig. 4 is only an illustration and does not limit the structure of the electronic device. For example, the server 10 may include more or fewer components than shown in fig. 4, may include other processing hardware such as a database, a multi-level cache, or a GPU, or may have a configuration different from that shown in fig. 4.
The memory 200 may be used to store software programs and modules of application software, such as program instructions/modules corresponding to the training method of the credit risk stress test model in the embodiment of the present specification, and the processor 100 executes various functional applications and data processing by executing the software programs and modules stored in the memory 200. Memory 200 may include high speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, memory 200 may further include memory located remotely from processor 100, which may be connected to a computer terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 300 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal. In one example, the transmission module 300 includes a Network adapter (NIC) that can be connected to other Network devices through a base station so as to communicate with the internet. In one example, the transmission module 300 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The method or apparatus provided by the present specification and described in the foregoing embodiments may implement service logic through a computer program and record the service logic on a storage medium, where the storage medium may be read and executed by a computer, so as to achieve the effect of the solution described in the embodiments of the present specification. The storage medium may include a physical device for storing information; typically, the information is digitized and then stored using an electrical, magnetic, or optical medium. The storage medium may include: devices that store information using electrical energy, such as various types of memory, e.g., RAM, ROM, and USB flash drives; devices that store information using magnetic energy, such as hard disks, floppy disks, tapes, core memories, and bubble memories; devices that store information optically, such as CDs or DVDs. Of course, there are other types of readable storage media, such as quantum memory, graphene memory, and so forth.
The embodiments of the training method or apparatus for the credit risk pressure test model provided in this specification may be implemented in a computer by a processor executing corresponding program instructions, for example, implemented on a PC in C++ under a Windows operating system, implemented on a Linux system, implemented on an intelligent terminal in a programming language for Android or iOS, or implemented in processing logic based on a quantum computer.
It should be noted that descriptions of the apparatus, the device, and the system described above according to the related method embodiments may also include other embodiments, and specific implementations may refer to descriptions of corresponding method embodiments, which are not described in detail herein.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the hardware + program class embodiment, since it is substantially similar to the method embodiment, the description is simple, and the relevant points can be referred to the partial description of the method embodiment.
For convenience of description, the above devices are described as being divided into various modules by function, and the modules are described separately. Of course, when implementing one or more embodiments of the present description, the functions of the modules may be implemented in one or more pieces of software and/or hardware, or a module implementing a given function may be implemented by a combination of a plurality of sub-modules or sub-units.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus, devices, systems according to embodiments of the invention. It will be understood that the implementation can be by computer program instructions which can be provided to a processor of a general purpose computer, special purpose computer, embedded processor or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
As will be appreciated by one skilled in the art, one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects.
The above description is merely exemplary of one or more embodiments of the present disclosure and is not intended to limit the scope of one or more embodiments of the present disclosure. Various modifications and alterations to one or more embodiments described herein will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims.

Claims (11)

1. A training method for a credit risk pressure test model, characterized by comprising the following steps:
determining a pressure index and a pressure-bearing index; the pressure index represents factors influencing credit risk, and the pressure-bearing index represents factors influenced by the pressure index;
acquiring an index data set according to the pressure index and the pressure-bearing index; each index data in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index;
dividing the index data set into a training set and a test set;
training a preset neural network model based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain a credit risk pressure test model; wherein, during training, the model obtained by training is verified by using the test set, and training is stopped when the error corresponding to the training set and the error corresponding to the test set meet a preset condition.
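The dividing step and the stopping check of claim 1 can be illustrated with a minimal Python sketch. The claim does not state the split ratio or the preset condition, so the 20% test fraction, the thresholds `max_error` and `max_gap`, and both function names are assumptions made only for illustration.

```python
import random


def split_index_data(records, test_fraction=0.2, seed=0):
    """Shuffle the index data set and return (training set, test set)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one record in the test set.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def stop_condition_met(train_error, test_error, max_error=0.05, max_gap=0.02):
    """Assumed preset condition: both errors are small and close to each other."""
    return (
        train_error <= max_error
        and test_error <= max_error
        and abs(train_error - test_error) <= max_gap
    )
```

Requiring the two errors to be close as well as small is one common way to avoid stopping on an over-fitted model; other preset conditions are equally compatible with the claim wording.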
2. The method of claim 1, wherein determining the pressure index and the pressure-bearing index comprises:
determining the pressure index based on pressure data associated with the credit risk;
and determining the pressure-bearing index based on a credit risk conduction mechanism.
3. The method of claim 1, wherein the pressure index comprises macroeconomic indicators and corporate financial indicators; the macroeconomic indicators include at least one of: GDP output gap, effective exchange rate, real broad money supply, nominal interest rate of one-year-period goods and commodity price index; the corporate financial indicators include at least one of: equity rate, total equity to sales revenue ratio, cash to total equity ratio, liquidity ratio, quick ratio, cash ratio, multiple interest, multiple interest support, tangible equity rate, title ratio, pre-interest profit to total assets ratio, interest to pre-interest tax profit and sum of interest ratio, credit to sales revenue ratio, operating capital to total assets ratio, sales revenue to total assets ratio, net profit to owner's equity ratio, net profit to total assets ratio, liquidity asset to total assets ratio, retained income, total assets; the pressure-bearing index comprises at least one of the following: default rate, non-performing loan ratio, recovery rate, loan loss provision, capital adequacy ratio, total profit.
4. The method of claim 1, wherein said obtaining an index data set according to the pressure index and the pressure-bearing index comprises:
acquiring data tables; the data tables comprise a macroeconomic data table, an enterprise financial statement and a loan record table;
extracting corresponding data from the macroeconomic data table and the enterprise financial statement according to the pressure index to obtain pressure index data;
extracting corresponding data from the loan record table according to the pressure-bearing index to obtain pressure-bearing index data;
and associating the pressure index data with the pressure-bearing index data to obtain the index data set.
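As a rough illustration of the extraction and association steps of claim 4, the sketch below joins three in-memory tables. The table layout (lists of dictionaries), the join keys `period` and `enterprise_id`, and the function name are hypothetical; the claim does not say how records are keyed.

```python
def build_index_data_set(macro_rows, financial_rows, loan_rows,
                         macro_indexes, financial_indexes, bearing_indexes):
    """Associate pressure index data with pressure-bearing index data.

    Assumed keys: macro rows by "period"; financial and loan rows by
    ("enterprise_id", "period").
    """
    macro_by_period = {row["period"]: row for row in macro_rows}
    loans_by_key = {(row["enterprise_id"], row["period"]): row for row in loan_rows}

    data_set = []
    for fin in financial_rows:
        key = (fin["enterprise_id"], fin["period"])
        macro = macro_by_period.get(fin["period"])
        loan = loans_by_key.get(key)
        if macro is None or loan is None:
            continue  # nothing to associate this financial record with
        # Pressure index data: macroeconomic values followed by financial values.
        pressure = [macro[name] for name in macro_indexes]
        pressure += [fin[name] for name in financial_indexes]
        # Pressure-bearing index data comes from the loan record table.
        bearing = [loan[name] for name in bearing_indexes]
        data_set.append({"key": key, "pressure": pressure, "bearing": bearing})
    return data_set
```

Each resulting record holds both the pressure index data and the pressure-bearing index data, matching the structure described in claim 1.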
5. The method of claim 1, wherein before dividing the index data set into a training set and a test set, the method further comprises:
filling missing data in the index data set by using a hot-deck imputation method to obtain a first index data set;
deleting abnormal data in the first index data set by using a regression test model to obtain a second index data set;
correspondingly, dividing the second index data set into a training set and a test set.
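The two cleaning steps of claim 5 might look as follows in Python. This is only a sketch under stated assumptions: the hot-deck donor is chosen as the nearest complete row by squared distance on the observed fields, and the "regression test model" is replaced here by a simple one-variable least-squares fit whose large residuals mark abnormal data. The claim specifies neither choice, and the function names and the `threshold` parameter are hypothetical.

```python
import math


def hot_deck_fill(rows):
    """Fill None entries with values from the most similar complete row (the donor)."""
    donors = [r for r in rows if None not in r]
    if not donors:
        raise ValueError("hot-deck imputation needs at least one complete row")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        observed = [i for i, v in enumerate(row) if v is not None]
        # Nearest donor on the fields that are present in this row.
        donor = min(donors, key=lambda d: sum((d[i] - row[i]) ** 2 for i in observed))
        filled.append([donor[i] if v is None else v for i, v in enumerate(row)])
    return filled


def drop_regression_outliers(rows, x_col, y_col, threshold=2.0):
    """Delete rows whose residual from a least-squares line exceeds threshold * sigma."""
    xs = [r[x_col] for r in rows]
    ys = [r[y_col] for r in rows]
    n = len(rows)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return list(rows)  # no spread in x, so no line can be fitted
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(e * e for e in residuals) / n)
    if sigma == 0:
        return list(rows)  # perfect fit, nothing is abnormal
    return [r for r, e in zip(rows, residuals) if abs(e) <= threshold * sigma]
```

Running `hot_deck_fill` first and `drop_regression_outliers` second mirrors the order of the claim: the first index data set has no gaps, and the second has the abnormal records removed.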
6. The method of claim 1, wherein training a preset neural network model based on the training set, the stochastic gradient descent algorithm, and the simulated annealing algorithm to obtain a credit risk pressure test model comprises:
constructing a preset neural network model; the preset neural network model comprises an input layer, a hidden layer and an output layer, the number of the neurons in the input layer is determined according to the number of the pressure indexes, and the number of the neurons in the output layer is determined according to the number of the pressure-bearing indexes;
training the preset neural network model by using index data in the training set to obtain a first test model and a first error set; updating parameters of the preset neural network model based on the stochastic gradient descent algorithm and the simulated annealing algorithm in the training process;
verifying the first test model by using the index data in the test set to obtain a second error set;
judging whether the first error set and the second error set meet preset conditions or not;
when it is determined that the preset conditions are not met, updating parameters of the first test model based on the stochastic gradient descent algorithm and the simulated annealing algorithm to obtain a second test model;
training the second test model by using the index data in the training set to obtain a third test model and a third error set; updating parameters of the second test model based on the stochastic gradient descent algorithm and the simulated annealing algorithm in the training process;
verifying the third test model by using the index data in the test set to obtain a fourth error set;
judging whether the third error set and the fourth error set meet preset conditions or not;
and when it is determined that the preset conditions are met, using the third test model as the credit risk pressure test model.
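The structure described in claim 6 can be sketched as a small network whose input width equals the number of pressure indexes and whose output width equals the number of pressure-bearing indexes, wrapped in a train-then-verify loop. The single hidden layer with `tanh` activation, the mean squared error, the `max_rounds` cap and all function names are assumptions; the parameter update itself is passed in as a callable (`train_round`) rather than implemented here.

```python
import math
import random


def build_network(n_pressure, n_hidden, n_bearing, seed=0):
    """Create weights; each neuron stores its input weights followed by a bias."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

    return {"hidden": layer(n_pressure, n_hidden), "output": layer(n_hidden, n_bearing)}


def forward(net, pressure_values):
    """Map pressure index values to predicted pressure-bearing index values."""
    hidden = [
        math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, pressure_values)))
        for w in net["hidden"]
    ]
    return [w[-1] + sum(wi * hi for wi, hi in zip(w, hidden)) for w in net["output"]]


def mean_squared_error(net, samples):
    """Average squared error over (pressure, bearing) pairs."""
    total, count = 0.0, 0
    for pressure, bearing in samples:
        for pred, target in zip(forward(net, pressure), bearing):
            total += (pred - target) ** 2
            count += 1
    return total / count


def train_until_condition(net, train_set, test_set, train_round, condition, max_rounds=100):
    """Alternate training and verification until the preset condition holds."""
    train_error = mean_squared_error(net, train_set)
    test_error = mean_squared_error(net, test_set)
    for _ in range(max_rounds):
        net = train_round(net, train_set)                 # update parameters
        train_error = mean_squared_error(net, train_set)  # error on the training set
        test_error = mean_squared_error(net, test_set)    # error on the test set
        if condition(train_error, test_error):
            break
    return net, train_error, test_error
```

The claim walks through two rounds explicitly (first to third test model); the loop generalizes that to as many rounds as needed, which is one reading of the claim rather than its literal wording.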
7. The method of claim 6, wherein the updating the parameters of the pre-set neural network model or the first test model or the second test model based on the stochastic gradient descent algorithm and the simulated annealing algorithm comprises:
acquiring initial parameters in the K-th iteration;
updating the initial parameters in the K-th iteration according to the following formulas to obtain updated parameters after the K-th iteration:
$v'_k \leftarrow v_k + \theta \cdot \Delta v_k$
$\theta = 2 \times \mathrm{random}(0, 1)$
wherein $v'_k$ represents the updated parameter after the K-th iteration, $v_k$ represents the initial parameter in the K-th iteration, $\theta$ represents a random number, and $\Delta v_k$ represents the parameter variation in the K-th iteration;
determining the initial parameter in the (K+1)-th iteration, based on the error corresponding to the K-th iteration and the error corresponding to the (K-1)-th iteration, according to the following formulas:
$v_{k+1} = v'_k \times P$
Figure FDA0002936580640000031
$T \leftarrow \lambda \cdot T$
wherein $v_{k+1}$ represents the initial parameter in the (K+1)-th iteration, $P$ represents the probability of accepting the updated parameters, $E_k$ represents the error corresponding to the K-th iteration, $E_{k-1}$ represents the error corresponding to the (K-1)-th iteration, $T$ represents the temperature in the simulated annealing algorithm, and $\lambda$ represents a positive number smaller than 1.
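A hedged Python sketch of the update in claim 7 follows. The random scaling of the step and the cooling rule are taken from the formulas in the claim. The formula for the acceptance probability P is contained in a figure that is not reproduced in the text, so the sketch substitutes the standard Metropolis criterion, which is an assumption and may differ from the patent's formula. Reading "accepting the updated parameters with probability P" as keeping $v'_k$ with probability P and otherwise reverting to $v_k$ is also an interpretation. The function names and the default `cooling` value are hypothetical.

```python
import math
import random


def perturbed_update(v_k, delta_v_k, rng):
    """v'_k = v_k + theta * delta_v_k, with theta = 2 * random(0, 1)."""
    theta = 2.0 * rng.random()
    return [v + theta * dv for v, dv in zip(v_k, delta_v_k)]


def acceptance_probability(e_k, e_prev, temperature):
    """Assumed Metropolis criterion (the patent's own formula for P is not in the text)."""
    if e_k <= e_prev:
        return 1.0  # the error did not increase: always accept
    return math.exp(-(e_k - e_prev) / temperature)


def annealing_step(v_k, delta_v_k, e_k, e_prev, temperature, cooling=0.95, rng=None):
    """Return (initial parameters for iteration K+1, cooled temperature)."""
    rng = rng or random.Random()
    candidate = perturbed_update(v_k, delta_v_k, rng)
    p = acceptance_probability(e_k, e_prev, temperature)
    # Accept the updated parameters with probability p, otherwise keep v_k.
    v_next = candidate if rng.random() < p else list(v_k)
    return v_next, cooling * temperature  # T <- lambda * T
```

Here `delta_v_k` would typically be the stochastic gradient descent step (for example, the negative learning rate times the gradient), so that the random factor theta rescales the step while the annealing acceptance occasionally tolerates a worse error at high temperature.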
8. The method of claim 1, further comprising:
acquiring pressure index data in different pressure scenarios; wherein the different pressure scenarios are designed in advance based on a historical scenario analysis method and a hypothetical scenario analysis method;
inputting the pressure index data into the credit risk pressure test model to obtain pressure-bearing index data corresponding to the different pressure scenarios;
and carrying out risk control based on the pressure-bearing index data.
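Applying the trained model to pressure scenarios, as in claim 8, could be sketched as below. The model is treated as any callable that maps pressure index values to pressure-bearing index values; the scenario names, the limit-based risk check and both function names are hypothetical, since the claim does not describe how risk control is carried out.

```python
def run_pressure_scenarios(model, scenarios):
    """Return the predicted pressure-bearing index data for each named scenario."""
    return {name: model(pressure) for name, pressure in scenarios.items()}


def scenarios_breaching_limit(results, index_position, limit):
    """List scenarios whose chosen pressure-bearing index exceeds an assumed limit."""
    return sorted(
        name for name, bearing in results.items() if bearing[index_position] > limit
    )
```

For example, with a model whose first output is a default rate, `scenarios_breaching_limit(results, 0, 0.05)` would name the scenarios in which the predicted default rate exceeds 5%, which could then feed into a risk control decision.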
9. A training device for a credit risk pressure test model, comprising:
the determining module is used for determining a pressure index and a pressure-bearing index; the pressure index represents factors influencing credit risk, and the pressure-bearing index represents factors influenced by the pressure index;
the acquisition module is used for acquiring an index data set according to the pressure index and the pressure-bearing index; each index data in the index data set comprises data corresponding to the pressure index and data corresponding to the pressure-bearing index;
the dividing module is used for dividing the index data set into a training set and a test set;
the training module is used for training a preset neural network model based on the training set, a stochastic gradient descent algorithm and a simulated annealing algorithm to obtain a credit risk pressure test model; wherein, during training, the model obtained by training is verified by using the test set, and training is stopped when the error corresponding to the training set and the error corresponding to the test set meet a preset condition.
10. A training device for a credit risk pressure test model, comprising at least one processor and a memory storing computer-executable instructions, wherein the processor implements the steps of the method according to any one of claims 1 to 8 when executing the instructions.
11. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one of claims 1 to 8.
CN202110160918.2A 2021-02-05 2021-02-05 Training method, device and equipment for credit risk pressure test model Pending CN112766814A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110160918.2A CN112766814A (en) 2021-02-05 2021-02-05 Training method, device and equipment for credit risk pressure test model

Publications (1)

Publication Number Publication Date
CN112766814A true CN112766814A (en) 2021-05-07

Family

ID=75705141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110160918.2A Pending CN112766814A (en) 2021-02-05 2021-02-05 Training method, device and equipment for credit risk pressure test model

Country Status (1)

Country Link
CN (1) CN112766814A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114371678A (en) * 2022-01-11 2022-04-19 升发智联(北京)科技有限责任公司 Equipment safety production early warning method, system, equipment and storage medium
CN116805157A (en) * 2023-08-25 2023-09-26 中国人民解放军国防科技大学 Unmanned cluster autonomous dynamic evaluation method and device
CN116805157B (en) * 2023-08-25 2023-11-17 中国人民解放军国防科技大学 Unmanned cluster autonomous dynamic evaluation method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210507