CN110264342A - A kind of business audit method and device based on machine learning - Google Patents
- Publication number: CN110264342A
- Application number: CN201910533825.2A
- Authority: CN (China)
- Prior art keywords: business, random forest, forest model, decision tree, review
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
Abstract
An embodiment of the present invention provides a machine-learning-based business review method and device, relating to the technical field of financial technology. The method includes: training a random forest model on historical business review data so that the model learns the review behavior of reviewers; inputting the feature vector of a business applicant into each decision tree in the random forest model to obtain the classification result output by each decision tree; and determining the review result of the business request according to the classification result output by each decision tree and the weight corresponding to each decision tree. Compared with an expert model, the random forest model is trained on historical business review data rather than relying solely on the experience of professional reviewers, so it depends less on human experience, reduces the influence of subjective factors, and improves the generalization ability and versatility of the review model.
Description
Technical Field
Embodiments of the present invention relate to the technical field of financial technology (Fintech), and in particular to a machine-learning-based business review method and device.
Background
With the development of computer technology, more and more technologies (for example, artificial intelligence, cloud computing, and blockchain) are being applied in the financial field, and the traditional financial industry is gradually transforming into financial technology (Fintech). The security and real-time requirements of the financial industry, however, also place higher demands on these technologies. In the financial industry, supply chain loans address the difficulty that upstream and downstream enterprises face in obtaining financing and guarantees; by removing upstream and downstream financing bottlenecks, they can also reduce financing costs along the supply chain and improve the competitiveness of core enterprises and their supporting enterprises. At present, loans are mainly reviewed using an expert model: the experience of business personnel who have reviewed loans for many years is collected, organized, and summarized into a fixed set of business logic rules, which are deployed in the review system through a rule engine to achieve automatic review. This approach relies too heavily on the experience of business personnel and is strongly affected by subjective factors.
Summary of the Invention
To address the problem that reviewing business with an expert model relies too heavily on the experience of business personnel and is strongly affected by subjective factors, embodiments of the present invention provide a machine-learning-based business review method and device.
In one aspect, an embodiment of the present invention provides a machine-learning-based business review method, including:
obtaining a business request from a business applicant;
extracting a feature vector of the business applicant according to the business request;
inputting the feature vector of the business applicant into each decision tree in a random forest model, and obtaining the classification result output by each decision tree in the random forest model, where the random forest model is obtained by training with historical business review data as training samples; and
determining the review result of the business request according to the classification result output by each decision tree in the random forest model.
Optionally, determining the review result of the business request according to the classification result output by each decision tree in the random forest model includes:
determining the review result of the business request according to the classification result output by each decision tree in the random forest model and the weight corresponding to each decision tree in the random forest model.
Optionally, determining the review result of the business request according to the classification result output by each decision tree in the random forest model and the weight corresponding to each decision tree in the random forest model includes:
adding the weights of the decision trees in the random forest model that output the same classification result, to determine the classification weight of each classification result; and
taking the classification result with the largest classification weight as the review result.
Optionally, the business request is a supply chain business request, and the historical business review data includes characteristic data of chain-affiliated enterprises, characteristic data of core enterprises, and historical review records of reviewers.
Optionally, obtaining the random forest model by training with historical business review data as training samples includes:
obtaining historical business review data;
determining a feature vector set according to the historical business review data;
extracting N sub-feature-vector sets from the feature vector set, where N is a preset positive integer;
training N decision trees using the N sub-feature-vector sets; and
combining the N decision trees into a random forest model.
In another aspect, an embodiment of the present invention provides a machine-learning-based business review device, including:
an acquisition module, configured to obtain a business request from a business applicant;
an extraction module, configured to extract a feature vector of the business applicant according to the business request;
a classification module, configured to input the feature vector of the business applicant into each decision tree in a random forest model and obtain the classification result output by each decision tree in the random forest model, where the random forest model is obtained by training with historical business review data as training samples; and
a processing module, configured to determine the review result of the business request according to the classification result output by each decision tree in the random forest model.
Optionally, the processing module is specifically configured to:
determine the review result of the business request according to the classification result output by each decision tree in the random forest model and the weight corresponding to each decision tree in the random forest model.
Optionally, the processing module is specifically configured to:
add the weights of the decision trees in the random forest model that output the same classification result, to determine the classification weight of each classification result; and
take the classification result with the largest classification weight as the review result.
Optionally, the business request is a supply chain business request, and the historical business review data includes characteristic data of chain-affiliated enterprises, characteristic data of core enterprises, and historical review records of reviewers.
Optionally, the classification module is specifically configured to:
obtain historical business review data;
determine a feature vector set according to the historical business review data;
extract N sub-feature-vector sets from the feature vector set, where N is a preset positive integer;
train N decision trees using the N sub-feature-vector sets; and
combine the N decision trees into a random forest model.
In another aspect, an embodiment of the present invention provides a computer device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the machine-learning-based business review method when executing the program.
In another aspect, an embodiment of the present invention provides a computer-readable storage medium storing a computer program executable by a computer device; when the program runs on the computer device, it causes the computer device to perform the steps of the machine-learning-based business review method.
In the embodiments of the present invention, a random forest model is trained on historical business review data so that the model learns the review behavior of reviewers; the feature vector of the business applicant is then input into each decision tree in the random forest model to obtain the classification result output by each decision tree, and the review result of the business request is determined according to those classification results. Compared with an expert model, the random forest model is trained on historical business review data rather than relying solely on the experience of professional reviewers, so it depends less on human experience, reduces the influence of subjective factors, and improves the generalization ability and versatility of the review model.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario provided by an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a machine-learning-based business review method provided by an embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method for training a random forest model provided by an embodiment of the present invention;
FIG. 4 is a schematic flowchart of a machine-learning-based business review method provided by an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a machine-learning-based business review device provided by an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
To make the purpose, technical solutions, and beneficial effects of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.
For ease of understanding, the terms used in the embodiments of the present invention are explained below.
Supply chain loan (Supply Chain Finance): a financing model that treats the core enterprise in a supply chain and its related upstream and downstream chain-affiliated enterprises as a whole, and formulates an overall financial solution based on control of goods rights and cash flow according to the transaction relationships and industry characteristics of the enterprises in the supply chain.
Core enterprise: an enterprise in the supply chain that controls core technologies, core capabilities, and core links.
Chain-affiliated enterprise: an upstream or downstream enterprise of a core enterprise in the supply chain.
The machine-learning-based business review method in the embodiments of the present invention can be applied to the application scenario shown in FIG. 1, which includes a terminal device 101 and a review server 102. The terminal device 101 may be a smartphone, a tablet computer, a portable personal computer, or the like. The review server 102 may be the business review server of a financial institution such as a bank. A user submits a business request on the terminal device 101, and the terminal device 101 sends the business request to the review server 102. The review server 102 holds a trained random forest model for business review. The review server 102 extracts the feature vector of the business applicant according to the business request, inputs the feature vector into each decision tree in the random forest model, obtains the classification result output by each decision tree, and determines the review result of the business request according to those classification results. The review server 102 sends the review result to the terminal device 101, where the user can view it.
Based on the application scenario shown in FIG. 1, an embodiment of the present invention provides the flow of a machine-learning-based business review method. The flow may be executed by a machine-learning-based business review device, which may be the review server 102 in FIG. 1. As shown in FIG. 2, the flow includes the following steps:
Step S201: obtain the business request of the business applicant.
Step S202: extract the feature vector of the business applicant according to the business request.
The business applicant may be an individual or an enterprise, and the business request may be a loan request, such as a personal loan request, an enterprise loan request, or a supply chain loan request.
When the business applicant is an individual, the feature vector of the business applicant may consist of personal features, such as basic personal information, personal credit records, personal business records, and personal assets. When the business applicant is an enterprise, the feature vector may consist of enterprise features, such as enterprise credit records and enterprise qualifications.
Step S203: input the feature vector of the business applicant into each decision tree in the random forest model, and obtain the classification result output by each decision tree in the random forest model.
The random forest model is obtained by training with historical business review data as training samples. When the random forest model is used to review personal business requests, the historical business review data includes personal characteristic data and the historical review records of reviewers. When the random forest model is used to review supply chain business requests, the historical business review data includes characteristic data of chain-affiliated enterprises, characteristic data of core enterprises, and the historical review records of reviewers.
Step S204: determine the review result of the business request according to the classification result output by each decision tree in the random forest model.
In one possible implementation, the classification result output by a decision tree is either approved or rejected. When the number of decision trees in the random forest model that output approved is greater than the number that output rejected, the review result of the business request is determined to be approved. When the number of decision trees that output approved is less than the number that output rejected, the review result of the business request is determined to be rejected.
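The unweighted rule above can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the label strings and the function name `majority_vote` are assumptions, and because the text does not say what happens when the two counts are equal, the sketch returns `None` in that case.

```python
from collections import Counter

APPROVED = "approved"   # assumed label for "review passed"
REJECTED = "rejected"   # assumed label for "review failed"

def majority_vote(tree_outputs):
    """Return the label output by more trees, or None when the counts tie."""
    counts = Counter(tree_outputs)
    approved, rejected = counts[APPROVED], counts[REJECTED]
    if approved > rejected:
        return APPROVED
    if approved < rejected:
        return REJECTED
    return None  # tie: behavior is not specified in the source text

# Five hypothetical trees, three of which approve the request.
votes = [APPROVED, REJECTED, APPROVED, APPROVED, REJECTED]
print(majority_vote(votes))
```

Choosing an odd number of trees is one simple way to rule out ties under this rule.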
In one possible implementation, the review result of the business request is determined according to the classification result output by each decision tree in the random forest model and the weight corresponding to each decision tree. Specifically, the weight corresponding to each decision tree is determined according to the importance of the feature vectors in that decision tree, and the weights of all decision trees sum to 1.
In a specific implementation, the weights of the decision trees in the random forest model that output the same classification result can be added to determine the classification weight of each classification result, and the classification result with the largest classification weight is taken as the review result.
For example, suppose the classification result of a decision tree is either approved or rejected. The weights of the decision trees whose classification result is approved are added to obtain the classification weight of approved, and the weights of the decision trees whose classification result is rejected are added to obtain the classification weight of rejected. When the classification weight of approved is greater than that of rejected, the review result is determined to be approved; when the classification weight of approved is less than that of rejected, the review result is determined to be rejected.
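The weighted rule can be sketched as follows. The function name `weighted_vote`, the label strings, and the example weights are illustrative assumptions; the only properties taken from the text are that weights of trees with the same output are summed and that the label with the largest sum wins.

```python
from collections import defaultdict

def weighted_vote(tree_outputs, tree_weights):
    """Sum the weights of trees that agree and return the heaviest label.

    Returns (label, per-label weight totals). If two labels tie, the one
    seen first wins; the source text does not define tie handling.
    """
    if len(tree_outputs) != len(tree_weights):
        raise ValueError("need exactly one weight per tree output")
    class_weight = defaultdict(float)
    for label, weight in zip(tree_outputs, tree_weights):
        class_weight[label] += weight
    best = max(class_weight, key=class_weight.get)
    return best, dict(class_weight)

# Four hypothetical trees whose weights sum to 1.
outputs = ["approved", "rejected", "approved", "rejected"]
weights = [0.1, 0.4, 0.2, 0.3]
decision, totals = weighted_vote(outputs, weights)
print(decision)  # "rejected": the two rejecting trees carry 0.7 of the weight
```

In this example the plain vote count is tied at two trees each, so the weights alone decide the outcome.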
Because the random forest model is trained on historical business review data, it learns the review behavior of reviewers. The feature vector of the business applicant is then input into each decision tree in the random forest model to obtain the classification result output by each decision tree, and the review result of the business request is determined according to those classification results. Compared with an expert model, the random forest model is trained on historical business review data rather than relying solely on the experience of professional reviewers, so it depends less on human experience, reduces the influence of subjective factors, and improves the generalization ability and versatility of the review model.
The process of training a random forest model on historical business review data is described below. As shown in FIG. 3, it includes the following steps:
Step S301: obtain historical business review data.
The historical business review data includes characteristic data of chain-affiliated enterprises, characteristic data of core enterprises, and the historical review records of reviewers. The characteristic data of a chain-affiliated enterprise includes its basic information (for example, date of registration, enterprise size, number of employees, and main business sector), its financial reports (for example, debt ratio, debt amount, external guarantee amount, revenue, and operating profit), the amount of pledged accounts receivable, the number of pledged accounts receivable, and accounts receivable characteristic data (such as the accounts receivable period). The characteristic data of a core enterprise includes its basic information (for example, date of registration, enterprise size, number of employees, main business sector, and whether it is a listed company), its financial reports, the number of recent major negative news items, and accounts receivable characteristic data (such as the accounts receivable period).
Step S302: determine a feature vector set according to the historical business review data.
Specifically, after the historical business review data is obtained, it is preprocessed. Preprocessing may include the following methods. Method 1: because the collected historical business review data may contain errors, outliers, and missing values, such data can be handled by assigning default values, removing samples, and similar operations, so that it does not affect the training result. Method 2: standardize specified feature columns. For example, a company's operating revenue is one-dimensional, continuously distributed numeric data, which can be grouped into intervals, such as four categories (below 3 million, 3 million to 10 million, 10 million to 50 million, and above 50 million), and then labeled. Method 3: organize the historical business review data into a multi-dimensional data matrix for use in training.
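The three preprocessing methods can be sketched together as below. All field names (`revenue`, `debt_ratio`, `label`), the default values, and the choice to drop samples that lack a review decision are assumptions made for illustration. The interval boundaries follow the four revenue categories given above; which side of each boundary is inclusive is not stated in the text, so the sketch assigns a boundary value to the higher interval.

```python
import bisect

# Upper bounds of the first three revenue intervals (from the text).
REVENUE_BOUNDS = [3_000_000, 10_000_000, 50_000_000]

def revenue_bin(revenue):
    """Map a revenue figure to an interval index 0..3 (assumed inclusivity)."""
    return bisect.bisect_right(REVENUE_BOUNDS, revenue)

def preprocess(records, feature_names, defaults):
    """Clean raw records and return (feature matrix, labels).

    - samples without a review decision are removed (Method 1)
    - missing feature values receive a default value (Method 1)
    - revenue is replaced by its interval index (Method 2)
    - the result is a row-per-sample matrix (Method 3)
    """
    matrix, labels = [], []
    for record in records:
        if record.get("label") is None:
            continue  # unusable sample: no historical review outcome
        row = []
        for name in feature_names:
            value = record.get(name)
            if value is None:
                value = defaults[name]
            if name == "revenue":
                value = revenue_bin(value)
            row.append(value)
        matrix.append(row)
        labels.append(record["label"])
    return matrix, labels

# Hypothetical raw records; 1 = approved, 0 = rejected.
raw = [
    {"revenue": 2_500_000, "debt_ratio": 0.4, "label": 1},
    {"revenue": None, "debt_ratio": 0.6, "label": 0},
    {"revenue": 80_000_000, "debt_ratio": None, "label": None},
]
X, y = preprocess(raw, ["revenue", "debt_ratio"], {"revenue": 0, "debt_ratio": 0.5})
print(X, y)
```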
Step S303: extract N sub-feature-vector sets from the feature vector set, where N is a preset positive integer.
Specifically, the bootstrap method is used to extract the N sub-feature-vector sets from the feature vector set.
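Bootstrap sampling draws items uniformly at random with replacement. The sketch below assumes that each sub-set has the same size as the original set, which is the usual convention but is not stated in the text; the function name and the `seed` parameter are also illustrative.

```python
import random

def bootstrap_subsets(samples, n_subsets, seed=0):
    """Draw n_subsets bootstrap samples, each the size of the original set."""
    if not samples or n_subsets < 1:
        raise ValueError("need a non-empty sample set and n_subsets >= 1")
    rng = random.Random(seed)  # seeded so that the draw is reproducible
    size = len(samples)
    # random.choices samples with replacement, so duplicates are expected.
    return [rng.choices(samples, k=size) for _ in range(n_subsets)]

feature_vectors = list(range(10))  # stand-ins for real feature vectors
subsets = bootstrap_subsets(feature_vectors, n_subsets=5, seed=42)
print(len(subsets), len(subsets[0]))
```

Because sampling is done with replacement, each sub-set typically omits some of the original vectors, which is what makes the resulting trees differ from one another.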
Step S304: train N decision trees using the N sub-feature-vector sets, and combine the N decision trees into a random forest model.
For each sub-feature-vector set, a decision tree is trained using the feature vectors in that set; the decision tree may be a CART (classification and regression) tree. The weight of each decision tree is set according to the importance of the feature vectors in its sub-feature-vector set, and the weights of the N decision trees sum to 1.
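A CART classification tree is commonly grown by repeatedly choosing the split that minimizes the weighted Gini impurity of the two child nodes. The sketch below shows only that single split search for binary labels, assuming numeric features and splits of the form `x[feature] <= threshold`; a full tree would apply it recursively, and the patent does not state which impurity measure or stopping rule it uses.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best split.

    Every observed value of every feature is tried as a threshold.
    Returns None when no threshold separates the rows into two groups.
    """
    best = None
    n = len(rows)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (feature, threshold, score)
    return best

# Hypothetical sub-set: [revenue interval, debt ratio], 1 = approved.
rows = [[0, 0.2], [1, 0.3], [2, 0.7], [3, 0.8]]
labels = [1, 1, 0, 0]
print(best_split(rows, labels))
```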
The random forest model is trained on historical business review data, and the weight of each decision tree in the model is set according to the importance of its feature vectors. When the random forest model is used to review a business request, combining the classification results of the decision trees with their weights can therefore effectively improve the accuracy of the review result.
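The text requires the tree weights to sum to 1 but does not say how feature importance is turned into a per-tree score. The sketch below therefore takes one non-negative raw score per tree as a given input (a hypothetical quantity) and only shows the normalization step.

```python
def normalize_tree_weights(raw_scores):
    """Scale non-negative per-tree scores so that they sum to 1."""
    if any(score < 0 for score in raw_scores):
        raise ValueError("scores must be non-negative")
    total = sum(raw_scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return [score / total for score in raw_scores]

# Hypothetical importance scores for three trees.
tree_weights = normalize_tree_weights([2.0, 1.0, 1.0])
print(tree_weights)  # [0.5, 0.25, 0.25]
```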
To better explain the embodiments of the present invention, a machine-learning-based business review method provided by an embodiment of the present invention is described below using supply chain loans as the implementation scenario. The method is executed by a machine-learning-based business review device. As shown in FIG. 4, the method includes the following steps:
Suppose bank A has a substantial loan business in the field of supply chain finance, and the data warehouse of its business system has accumulated loan application records from enterprises and historical loan review data for the supply chain business. A feature vector set Dt is extracted from the business system data, the bank's internal risk data, and the People's Bank of China credit reference data. The bootstrap method is used to extract N sub-feature-vector sets {D1, D2, …, DN} from the feature vector set Dt. The N sub-feature-vector sets are used to train N decision trees {T1, T2, …, TN}. According to the importance of the feature vectors in each sub-feature-vector set, the weights of the N decision trees are set to {a1, a2, …, aN}, which sum to 1. The N decision trees form the random forest model.
A property decoration transaction takes place between the core enterprise in the supply chain, C Real Estate Company, and a chain enterprise, B Decoration Company. The transaction amount is five million yuan, to be paid by C Real Estate Company to B Decoration Company after the work is completed. B Decoration Company needs working capital in the meantime and, on the basis of the accounts receivable from this trade, applies to Bank A for a loan of 3.5 million yuan. After receiving the loan request from B Decoration Company, Bank A queries the PBOC credit platform for B Decoration Company's credit records, including historical loan registration records, asset mortgage records, asset pledge records, and records of external guarantees provided by the enterprise. It then obtains enterprise characteristic data for C Real Estate Company and B Decoration Company from the industrial and commercial registration data source, such as enterprise scale, date of establishment, registered capital, operating financial reports, and accounts receivable characteristic data. The feature vector of C Real Estate Company is extracted from the above data relating to this loan and input into each of the N decision trees. Each decision tree outputs one classification result, which is either approved or rejected. The weights of the decision trees whose classification result is approved are added together to obtain the approved classification weight. The weights of the decision trees whose classification result is rejected are added together to obtain the rejected classification weight. When the approved classification weight is greater than the rejected classification weight, the review result is determined to be approved; when the approved classification weight is less than the rejected classification weight, the review result is determined to be rejected.
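The weighted decision rule described for the loan example can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function name `weighted_vote`, the label strings, and the five tree weights are invented for the example, and returning `None` on an exact tie is an assumption, since the text does not say what happens when the two classification weights are equal.

```python
import math
from collections import defaultdict

def weighted_vote(results, weights):
    """Sum the tree weights per classification result and return
    (decision, totals). The decision is the label with the largest
    summed weight, or None on a tie (tie handling is an assumption)."""
    if len(results) != len(weights):
        raise ValueError("need exactly one weight per decision tree")
    totals = defaultdict(float)
    for label, weight in zip(results, weights):
        totals[label] += weight
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and math.isclose(ranked[0][1], ranked[1][1]):
        return None, dict(totals)
    return ranked[0][0], dict(totals)

# Hypothetical outputs of N = 5 decision trees for one loan request,
# with hypothetical weights a1..a5 that sum to 1.
results = ["approved", "rejected", "approved", "rejected", "approved"]
weights = [0.375, 0.25, 0.125, 0.125, 0.125]

decision, totals = weighted_vote(results, weights)
print(decision)  # approved
print(totals)    # {'approved': 0.625, 'rejected': 0.375}
```

Because the totals are kept per label, the same function also covers the general case of picking the classification result with the largest classification weight among more than two labels.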
Because the random forest model is trained on historical business review data, it learns the review behavior of human reviewers. The feature vector of the business applicant is then input into each decision tree in the random forest model to obtain the classification result output by each decision tree, and the review result of the business request is determined from those classification results. Compared with an expert model, the random forest model is trained on historical business review data rather than relying solely on the experience of professional reviewers. It therefore depends less on human experience, reduces the influence of subjective factors, and improves the generalization ability and versatility of the review model.
Based on the same technical concept, an embodiment of the present invention provides a machine-learning-based business review device. As shown in Figure 5, the device 500 includes:
an acquisition module 501, configured to acquire a business request from a business applicant;
an extraction module 502, configured to extract the feature vector of the business applicant according to the business request;
a classification module 503, configured to input the feature vector of the business applicant into each decision tree in a random forest model and obtain the classification result output by each decision tree in the random forest model, the random forest model being trained with historical business review data as training samples;
a processing module 504, configured to determine the review result of the business request according to the classification result output by each decision tree in the random forest model.
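The division of work among modules 501 to 504 can be pictured with a short Python sketch. Everything concrete in it is hypothetical: the class name `ReviewDevice`, the request fields, the feature choice, and the three threshold rules standing in for trained decision trees. The processing step uses a plain, unweighted majority vote, which is only one possible reading of "according to the classification result output by each decision tree".

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

FeatureVector = Sequence[float]
Tree = Callable[[FeatureVector], str]  # a trained tree, reduced to a callable

@dataclass
class ReviewDevice:
    extract: Callable[[Dict], FeatureVector]  # role of extraction module 502
    trees: List[Tree]                         # forest used by classification module 503

    def review(self, request: Dict) -> str:
        # Role of acquisition module 501: the request arrives as an argument.
        features = self.extract(request)                    # module 502
        results = [tree(features) for tree in self.trees]   # module 503
        # Role of processing module 504: unweighted majority vote (assumption).
        return Counter(results).most_common(1)[0][0]

def extract_features(request: Dict) -> FeatureVector:
    """Hypothetical features: loan-to-receivable ratio and years in operation."""
    ratio = request["loan_amount"] / request["receivable"]
    return (ratio, request["years_in_operation"])

# Hypothetical stand-ins for trained decision trees.
trees: List[Tree] = [
    lambda x: "approved" if x[0] <= 0.8 else "rejected",
    lambda x: "approved" if x[1] >= 5 else "rejected",
    lambda x: "approved" if x[0] <= 0.5 else "rejected",
]

device = ReviewDevice(extract=extract_features, trees=trees)
request = {"loan_amount": 3.5, "receivable": 5.0, "years_in_operation": 6}
print(device.review(request))  # approved
```

With an odd number of trees and two labels, the majority vote cannot tie. A weighted variant would replace only the last line of `review`.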
Optionally, the processing module 504 is specifically configured to:
determine the review result of the business request according to the classification result output by each decision tree in the random forest model and the weight corresponding to each decision tree in the random forest model.
Optionally, the processing module 504 is specifically configured to:
add together the weights of the decision trees in the random forest model that output the same classification result, to determine the classification weight of each classification result;
take the classification result with the largest classification weight as the review result.
Optionally, the business request is a supply chain business request, and the historical business review data includes chain enterprise characteristic data, core enterprise characteristic data, and the historical review records of reviewers.
Optionally, the classification module 503 is specifically configured to:
obtain historical business review data;
determine a feature vector set according to the historical business review data;
draw N sub-feature vector sets from the feature vector set, where N is a preset positive integer;
train N decision trees using the N sub-feature vector sets;
combine the N decision trees into the random forest model.
Based on the same technical concept, an embodiment of the present invention provides a computer device. As shown in Figure 6, it includes at least one processor 601 and a memory 602 connected to the at least one processor. The embodiments of the present invention do not limit the specific connection medium between the processor 601 and the memory 602; in Figure 6, the processor 601 and the memory 602 are connected by a bus as an example. The bus may be divided into an address bus, a data bus, a control bus, and so on.
In the embodiments of the present invention, the memory 602 stores instructions executable by the at least one processor 601. By executing the instructions stored in the memory 602, the at least one processor 601 can perform the steps of the machine-learning-based business review method described above.
The processor 601 is the control center of the computer device. It can connect the various parts of the computer device through various interfaces and lines, and performs business review by running or executing the instructions stored in the memory 602 and calling the data stored in the memory 602. Optionally, the processor 601 may include one or more processing units, and may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may also not be integrated into the processor 601. In some embodiments, the processor 601 and the memory 602 may be implemented on the same chip; in other embodiments, they may be implemented on separate chips.
The processor 601 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit (ASIC), a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present invention may be carried out directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The memory 602, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 602 may include at least one type of storage medium, for example flash memory, a hard disk, a multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, a magnetic disk, an optical disc, and so on. The memory 602 may be any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 602 in the embodiments of the present invention may also be a circuit or any other device capable of implementing a storage function, used to store program instructions and/or data.
Based on the same technical concept, an embodiment of the present invention provides a computer-readable storage medium that stores a computer program executable by a computer device. When the program runs on the computer device, it causes the computer device to perform the steps of the machine-learning-based business review method.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on the computer or other programmable equipment to produce computer-implemented processing. The instructions executed on the computer or other programmable equipment thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533825.2A CN110264342B (en) | 2019-06-19 | 2019-06-19 | Business auditing method and device based on machine learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910533825.2A CN110264342B (en) | 2019-06-19 | 2019-06-19 | Business auditing method and device based on machine learning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110264342A true CN110264342A (en) | 2019-09-20 |
CN110264342B CN110264342B (en) | 2024-06-28 |
Family
ID=67919576
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910533825.2A Active CN110264342B (en) | 2019-06-19 | 2019-06-19 | Business auditing method and device based on machine learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110264342B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111612166A (en) * | 2020-04-23 | 2020-09-01 | 中国科学院计算机网络信息中心 | A Machine Learning-Based Method for Predicting Reimbursement Time |
CN112258135A (en) * | 2020-05-15 | 2021-01-22 | 北京沃东天骏信息技术有限公司 | Method and device for auditing prescription data and computer-readable storage medium |
CN112308466A (en) * | 2020-11-26 | 2021-02-02 | 东莞市盟大塑化科技有限公司 | Enterprise qualification auditing method and device, computer equipment and storage medium |
CN112579579A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Material mobile data auditing method and device, storage medium and electronic equipment |
CN112734352A (en) * | 2019-10-28 | 2021-04-30 | 北京京东尚科信息技术有限公司 | Document auditing method and device based on data dimensionality |
CN113159175A (en) * | 2021-04-21 | 2021-07-23 | 平安科技(深圳)有限公司 | Data prediction method, device, equipment and storage medium |
CN113435842A (en) * | 2021-06-28 | 2021-09-24 | 京东科技控股股份有限公司 | Business process processing method and computer equipment |
CN113705668A (en) * | 2021-08-27 | 2021-11-26 | 创新奇智(广州)科技有限公司 | Method, device, equipment and storage medium for detecting working state of component |
CN114202399A (en) * | 2021-12-13 | 2022-03-18 | 金蝶软件(中国)有限公司 | Intelligent approval method and related device |
CN114496196A (en) * | 2022-02-18 | 2022-05-13 | 潍坊医学院附属医院 | Automatic auditing system for clinical biochemical inspection in medical laboratory |
CN114971867A (en) * | 2022-06-07 | 2022-08-30 | 中国银行股份有限公司 | Method and apparatus for determining bank auditors |
WO2024021555A1 (en) * | 2022-07-29 | 2024-02-01 | 京东科技信息技术有限公司 | Resource examination and approval method and device, and random forest model training method and device |
CN118195517A (en) * | 2024-03-18 | 2024-06-14 | 湖南一二三零一文化旅游服务有限公司 | Report automatic auditing method and system for travel statistics data report |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8818910B1 (en) * | 2013-11-26 | 2014-08-26 | Comrise, Inc. | Systems and methods for prioritizing job candidates using a decision-tree forest algorithm |
CN105279691A (en) * | 2014-07-25 | 2016-01-27 | 中国银联股份有限公司 | Financial transaction detection method and equipment based on random forest model |
CN107092827A (en) * | 2017-03-30 | 2017-08-25 | 中国民航大学 | A kind of Android malware detection method based on improvement forest algorithm |
CN107766883A (en) * | 2017-10-13 | 2018-03-06 | 华中师范大学 | A kind of optimization random forest classification method and system based on weighted decision tree |
CN108510507A (en) * | 2018-03-27 | 2018-09-07 | 哈尔滨理工大学 | A kind of 3D vertebra CT image active profile dividing methods of diffusion-weighted random forest |
CN108665159A (en) * | 2018-05-09 | 2018-10-16 | 深圳壹账通智能科技有限公司 | A kind of methods of risk assessment, device, terminal device and storage medium |
CN109145965A (en) * | 2018-08-02 | 2019-01-04 | 深圳辉煌耀强科技有限公司 | Cell recognition method and device based on random forest disaggregated model |
CN109214914A (en) * | 2018-08-24 | 2019-01-15 | 厦门集微科技有限公司 | A kind of loan information checking method and device based on communication open platform |
CN109359669A (en) * | 2018-09-10 | 2019-02-19 | 平安科技(深圳)有限公司 | Method for detecting abnormality, device, computer equipment and storage medium are submitted an expense account in medical insurance |
CN109389490A (en) * | 2018-09-26 | 2019-02-26 | 深圳壹账通智能科技有限公司 | Loan product matching process, device, computer equipment and storage medium |
CN109460872A (en) * | 2018-11-14 | 2019-03-12 | 重庆邮电大学 | One kind being lost unbalanced data prediction technique towards mobile communication subscriber |
CN109564677A (en) * | 2018-11-09 | 2019-04-02 | 香港应用科技研究院有限公司 | Super-resolution synthesis system and method based on random forest classifier weighting result |
CN109726826A (en) * | 2018-12-19 | 2019-05-07 | 东软集团股份有限公司 | Training method, device, storage medium and the electronic equipment of random forest |
CN109829471A (en) * | 2018-12-19 | 2019-05-31 | 东软集团股份有限公司 | Training method, device, storage medium and the electronic equipment of random forest |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112579579A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Material mobile data auditing method and device, storage medium and electronic equipment |
CN112734352A (en) * | 2019-10-28 | 2021-04-30 | 北京京东尚科信息技术有限公司 | Document auditing method and device based on data dimensionality |
CN111612166B (en) * | 2020-04-23 | 2022-10-25 | 中国科学院计算机网络信息中心 | Reimbursement time prediction method based on machine learning |
CN111612166A (en) * | 2020-04-23 | 2020-09-01 | 中国科学院计算机网络信息中心 | A Machine Learning-Based Method for Predicting Reimbursement Time |
CN112258135A (en) * | 2020-05-15 | 2021-01-22 | 北京沃东天骏信息技术有限公司 | Method and device for auditing prescription data and computer-readable storage medium |
CN112308466A (en) * | 2020-11-26 | 2021-02-02 | 东莞市盟大塑化科技有限公司 | Enterprise qualification auditing method and device, computer equipment and storage medium |
CN113159175B (en) * | 2021-04-21 | 2023-06-06 | 平安科技(深圳)有限公司 | Data prediction method, device, equipment and storage medium |
CN113159175A (en) * | 2021-04-21 | 2021-07-23 | 平安科技(深圳)有限公司 | Data prediction method, device, equipment and storage medium |
CN113435842A (en) * | 2021-06-28 | 2021-09-24 | 京东科技控股股份有限公司 | Business process processing method and computer equipment |
CN113435842B (en) * | 2021-06-28 | 2024-12-27 | 京东科技控股股份有限公司 | Business process processing method and computer equipment |
CN113705668A (en) * | 2021-08-27 | 2021-11-26 | 创新奇智(广州)科技有限公司 | Method, device, equipment and storage medium for detecting working state of component |
CN114202399A (en) * | 2021-12-13 | 2022-03-18 | 金蝶软件(中国)有限公司 | Intelligent approval method and related device |
CN114496196A (en) * | 2022-02-18 | 2022-05-13 | 潍坊医学院附属医院 | Automatic auditing system for clinical biochemical inspection in medical laboratory |
CN114971867A (en) * | 2022-06-07 | 2022-08-30 | 中国银行股份有限公司 | Method and apparatus for determining bank auditors |
WO2024021555A1 (en) * | 2022-07-29 | 2024-02-01 | 京东科技信息技术有限公司 | Resource examination and approval method and device, and random forest model training method and device |
CN118195517A (en) * | 2024-03-18 | 2024-06-14 | 湖南一二三零一文化旅游服务有限公司 | Report automatic auditing method and system for travel statistics data report |
Also Published As
Publication number | Publication date |
---|---|
CN110264342B (en) | 2024-06-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110264342A (en) | A kind of business audit method and device based on machine learning | |
CN107945024B (en) | Method for identifying internet financial loan enterprise operation abnormity, terminal equipment and storage medium | |
CN108846520B (en) | Loan overdue prediction method, loan overdue prediction device and computer-readable storage medium | |
US20190205993A1 (en) | Transaction data categorizer system and method | |
WO2020238229A1 (en) | Transaction feature generation model training method and devices, and transaction feature generation method and devices | |
US11250433B2 (en) | Using semi-supervised label procreation to train a risk determination model | |
CN111861174A (en) | Credit assessment method for user portrait | |
WO2020177478A1 (en) | Credit-based qualification information auditing method, apparatus and device | |
CN110276692B (en) | A method and device for processing transaction data | |
Lohokare et al. | Automated data collection for credit score calculation based on financial transactions and social media | |
CN113052703A (en) | Transaction risk early warning method and device | |
CN111582932A (en) | Inter-scene information pushing method and device, computer equipment and storage medium | |
CN112613978B (en) | Bank capital sufficiency prediction method and device, electronic equipment and medium | |
CN113052679A (en) | Model training method, prediction method and device based on multi-view learning and electronic equipment | |
CN113420098A (en) | Core entity object identification method and device and electronic equipment | |
CN113159924A (en) | Method and device for determining trusted client object | |
KR20220102961A (en) | Method and apparatus of predicting default rate of individual business based on artificial intelligence model using credit information | |
CN116934512A (en) | Financial month knot auditing method and device, computer equipment and storage medium | |
Mikhaylov et al. | Features of Digital Transformation of Modern Banking Transactions | |
CN107305662A (en) | Recognize the method and device of violation account | |
CN117033431A (en) | Work order processing method, device, electronic equipment and medium | |
Zand | Towards intelligent risk-based customer segmentation in banking | |
CN112950225A (en) | Customer category determination method, device and storage medium | |
CN113516558A (en) | A business processing method, device and equipment | |
CN115860889A (en) | Financial loan big data management method and system based on artificial intelligence |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||