CN117827882B

CN117827882B - Deep learning-based financial database SQL quality scoring method, system, device and storage medium

Info

Publication number: CN117827882B
Application number: CN202410014519.9A
Authority: CN
Inventors: 陈传凯; 刘宁; 李超德
Original assignee: Beijing Xinshu Technology Co ltd
Current assignee: Beijing Xinshu Technology Co ltd
Priority date: 2024-01-04
Filing date: 2024-01-04
Publication date: 2024-08-20
Anticipated expiration: 2044-01-04
Also published as: CN117827882A

Abstract

The invention provides a deep learning-based SQL quality scoring method, system, device and storage medium for financial databases. The invention completes SQL quality scoring automatically without rules defined in advance and gives users an intuitive SQL quality evaluation result, which gives it marked advantages in adaptability, automation and scalability. The method adapts better to dynamic changes in SQL statements, reduces the need for manual intervention and processes large-scale SQL query data effectively. These advantages make the invention more efficient and accurate in scoring the quality of SQL statements.

Description

Deep learning-based financial database SQL quality scoring method, system, device and storage medium

Technical Field

The invention relates to a deep learning-based financial database SQL quality scoring method, system, device and storage medium, and belongs to the field of intelligent operation and maintenance.

Background

In the financial field, data is a core element of driving decisions and business operations. With the acceleration of digital transformation, financial institutions accumulate massive amounts of data, including transaction records, customer information, market dynamics, risk assessment, and the like. Currently, the primary storage mode of data is still a relational database. SQL (Structured Query Language) is widely used in data acquisition, processing, and analysis as a standardized database query language in relational databases. However, as the volume of data grows and the complexity increases, the quality problems of SQL statements become increasingly prominent, including:

(1) Query performance problem: due to the large data volume, complex table structure, unreasonable index design and other reasons, some SQL queries may have low execution efficiency, which results in slow system response and influences user experience and business flow.

(2) Data accuracy problem: low quality SQL queries can lead to data extraction errors, omissions, or duplicates, affecting the accuracy of data analysis results, and thus affecting decision making and risk management.

(3) Problem of wasting resources: ineffective or redundant SQL queries may consume excessive computing resources and memory space, increasing operating and maintenance costs and energy consumption.

(4) Potential safety hazard problem: improper SQL query statements may expose sensitive information, causing data leakage and security risks.

In the financial industry, data management and security are subject to stringent regulatory requirements, and so is the quality of SQL statements. Meanwhile, amid fierce market competition, financial institutions also need to improve business performance and customer experience through data analysis and intelligent decision making. In addition, with the explosive growth of financial data volume, traditional SQL query methods can no longer meet the requirements for efficient, accurate and secure data processing, so financial institutions have to pay attention to the quality of SQL statements.

At present, SQL quality scoring is receiving attention in the financial field, where different scoring algorithms are designed to optimize query performance and improve data accuracy and security. Existing SQL quality scoring algorithms mainly include the following methods:

(1) Syntax and semantic checking: this mainly checks whether the syntax of an SQL statement is correct and its semantics are reasonable, for example whether there are syntax errors, whether the referenced tables and columns exist, and whether the conditions of JOIN operations are satisfied.

(2) Performance evaluation: this focuses mainly on the execution performance of SQL statements, such as query response time and resource consumption. Such methods may predict the performance of SQL statements based on historical data and database statistics.

(3) Security and compliance checking: such algorithms mainly detect whether SQL statements present potential security risks, such as SQL injection, privilege abuse and sensitive data leakage. They may also check whether an SQL statement meets certain compliance requirements or best practices.

Traditional methods usually analyze on the basis of fixed rules or static features and have difficulty adapting to dynamic changes in the structure and semantics of SQL statements. They also require complex rules and metrics to be defined and maintained manually, which makes them hard to automate and to extend to large-scale SQL query sets.

Drawings

FIG. 1 is a flow chart of the method of the present invention.

FIG. 2 is a dynamic SQL graph of SQL1 of the embodiments.

FIG. 3 is a dynamic SQL graph of SQL2 of the embodiments.

Disclosure of Invention

Based on the above analysis, the invention provides a deep learning-based SQL quality scoring method for financial databases. The method introduces a dynamic graph convolution layer whose weights are dynamically updated according to structural and semantic changes in SQL statements, which improves the adaptability and generalization capability of the model. The scoring method comprises the following specific steps:

(1) Collecting historical SQL query statements, including normal operations and known attacks or abnormal behavior, and converting the SQL statements into abstract syntax trees (ASTs);

(2) Representing the AST of each SQL statement as a graph to construct a dynamic SQL graph, and dynamically updating the weights of the edges according to the characteristics of the current SQL statement;

(3) Extracting structural features and attribute features of the SQL graph;

(4) For historical data, assigning a risk label to each item according to whether it relates to a security event or abnormal behavior;

(5) Constructing a graph neural network model comprising dynamic graph convolution layers, a self-attention mechanism and multi-task learning, and using the model to learn an embedded representation of the SQL graph;

(6) Training the graph neural network model using a supervised learning method, with the features of the SQL graph as inputs and the predicted values of the risk score and other related tasks as outputs;

(7) For a new SQL query statement, first converting it into an AST and constructing an SQL graph, then extracting its features and inputting them into the trained graph neural network model to obtain an embedded representation of the graph.

Further, in step (6), in each dynamic graph convolution layer, the weight matrix is dynamically updated according to the current hidden states of the nodes and edges as W^(l) = f(h_v^(l), h_u^(l), e_uv), where W^(l) represents the weight parameters of the l-th graph convolution layer, f is a learnable function that dynamically computes W^(l) from the current hidden states of the nodes and edges, h_v^(l) represents the hidden state of node v at the l-th layer, h_u^(l) represents the hidden state of node u at the l-th layer, node u is a node adjacent to node v, and e_uv represents the feature vector of the edge between nodes u and v.

Further, in step (6), a self-attention mechanism is introduced into the graph neural network model and the self-attention coefficient α_v of node v is calculated, where W, W₁ and W₂ are learnable weight vectors and weight matrices applied to the hidden states of node v at different layers, the softmax function normalizes the attention coefficients, and the tanh function provides the nonlinear mapping. A plurality of output nodes are added at the last layer of the multi-layer perceptron, and a shared hidden layer is used to realize multi-task learning.

Further, in step (7), risk_score = sigmoid(Σ_v g(v) × α_v × W_r × h_v), where g(v) is a learnable weight adjustment function associated with node v, g(v) = W_g × h_v + b_g, the parameters W_g and b_g are learnable weights and biases, α_v is the self-attention coefficient of node v, h_v is the hidden state of node v, W_r is a learnable weight matrix, and the sigmoid function compresses the risk score into the range [0, 1].

The invention also provides a deep learning-based financial database SQL quality scoring system, which comprises the following modules:

(1) Data collection and preprocessing module: collects historical SQL query statements, including normal operations and known attacks or abnormal behavior, and converts the SQL statements into abstract syntax trees (ASTs);

(2) Dynamic SQL graph construction module: represents the AST of each SQL statement as a graph and, when constructing the graph, dynamically updates the weights of the edges according to the characteristics of the current SQL statement;

(3) Feature extraction module: extracts structural features and attribute features of the SQL graph;

(4) Risk label assignment module: for historical data, assigns a risk label to each item according to whether it relates to a security event or abnormal behavior;

(5) Graph neural network model construction module: constructs a graph neural network model comprising dynamic graph convolution layers, a self-attention mechanism and multi-task learning, and uses the model to learn an embedded representation of the SQL graph;

(6) Model training module: trains the graph neural network model using a supervised learning method, with the features of the SQL graph as inputs and the predicted values of the risk score and other related tasks as outputs;

(7) Risk score calculation module: for a new SQL query statement, first converts it into an AST and constructs an SQL graph, then extracts its features and inputs them into the trained graph neural network model to obtain an embedded representation of the graph.

Further, in the model training module, in each dynamic graph convolution layer, the weight matrix is dynamically updated according to the current hidden states of the nodes and edges as W^(l) = f(h_v^(l), h_u^(l), e_uv), where W^(l) represents the weight parameters of the l-th graph convolution layer, f is a learnable function that dynamically computes W^(l) from the current hidden states of the nodes and edges, h_v^(l) represents the hidden state of node v at the l-th layer, h_u^(l) represents the hidden state of node u at the l-th layer, node u is a node adjacent to node v, and e_uv represents the feature vector of the edge between nodes u and v.

Further, in the model training module, a self-attention mechanism is introduced into the graph neural network model and the self-attention coefficient α_v of node v is calculated, where W, W₁ and W₂ are learnable weight vectors and weight matrices applied to the hidden states of node v at different layers, the softmax function normalizes the attention coefficients, and the tanh function provides the nonlinear mapping. A plurality of output nodes are added at the last layer of the multi-layer perceptron, and a shared hidden layer is used to realize multi-task learning.

Further, in the risk score calculation module, risk_score = sigmoid(Σ_v g(v) × α_v × W_r × h_v), where g(v) is a learnable weight adjustment function associated with node v, g(v) = W_g × h_v + b_g, the parameters W_g and b_g are learnable weights and biases, α_v is the self-attention coefficient of node v, h_v is the hidden state of node v, W_r is a learnable weight matrix, and the sigmoid function compresses the risk score into the range [0, 1].

The present invention also provides a device comprising a data acquisition device, a processor and a memory; the data acquisition device is used for acquiring data; the memory is used for storing one or more program instructions; the processor is configured to execute the one or more program instructions to perform any of the methods described above.

The present invention further provides a computer readable storage medium having one or more program instructions embodied therein for performing any of the methods described above.

The invention has significant advantages in the following three aspects:

Adaptability: traditional methods typically analyze on the basis of fixed rules or static features and have difficulty adapting to dynamic changes in the structure and semantics of SQL statements. The present invention instead uses dynamic graph convolution layers to accommodate structural and semantic changes in SQL statements. Through the self-attention mechanism, the invention can capture long-distance dependencies among nodes and better understand the overall structure and intent of a query.

Automation: traditional methods require complex rules and metrics to be defined and maintained manually, which is time consuming and error prone. In contrast, the invention is trained and optimized by machine learning, which greatly reduces the need for manual intervention. The invention can therefore learn and identify security risks in SQL statements automatically, without manually defined and maintained rules.

Scalability: traditional approaches may encounter performance bottlenecks when processing large-scale SQL query sets. The invention instead learns with a graph neural network and can process large-scale SQL query data effectively. Through multi-task learning with a shared hidden layer, the method can be extended to further related tasks, which improves the generalization capability of the model.

Detailed Description

The invention provides a deep learning-based SQL quality scoring method which, by introducing a dynamic graph convolution layer, dynamically updates weights according to structural and semantic changes in SQL statements, thereby improving the adaptability and generalization capability of the model.

The deep learning-based SQL quality scoring method mainly comprises the steps shown in FIG. 1, specifically:

(1) Data collection and preprocessing

Historical SQL query statements are collected, including normal operation and known attack or abnormal behavior, and the SQL statements are converted into abstract syntax tree (Abstract Syntax Tree, AST) representations. An abstract syntax tree is a data structure representing the structure and syntax elements of source code or programming language statements.
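The conversion from an SQL statement to a tree can be sketched as follows. This is a minimal illustration only, not the parser used by the invention: the regular expression, the `CLAUSE_KEYWORDS` set and the function names `tokenize` and `to_ast` are assumptions, and the resulting structure is a simplified clause-level tree rather than a full SQL grammar.

```python
import re

# Assumed set of clause keywords used to split a flat token stream into clauses.
CLAUSE_KEYWORDS = {"SELECT", "FROM", "WHERE"}

def tokenize(sql):
    """Split an SQL string into string literals, identifiers, numbers and symbols."""
    pattern = r"'[^']*'|[A-Za-z_][A-Za-z_0-9]*|\d+|[*=<>,()]"
    return re.findall(pattern, sql)

def to_ast(sql):
    """Group tokens under the clause keyword that introduces them."""
    ast = []
    for tok in tokenize(sql):
        if tok.upper() in CLAUSE_KEYWORDS:
            ast.append([tok.upper()])   # start a new clause node
        elif tok != "," and ast:
            ast[-1].append(tok)         # attach the token to the current clause
    return ast

ast1 = to_ast("SELECT column1, column2 FROM table1 WHERE condition1 AND condition2")
```

A production system would more likely rely on a complete SQL parser for the target database dialect; the sketch only shows the shape of the data handed to the graph construction step.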

(2) Construction of dynamic SQL graphs

The AST of each SQL statement is graphically represented, wherein nodes are defined as SQL keywords, table names, column names, functions and other elements in the graph, and edges are defined as relationships among the elements (such as parent nodes-child nodes, tables-columns and the like). Given that the structure and semantics of SQL statements may change over time, a dynamic graph convolution layer may be introduced to accommodate these changes. When constructing the graph, the weights of the edges are dynamically updated according to the characteristics of the current SQL statement.
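As an illustration of this step, the sketch below turns a clause-level AST into a node list and a weighted edge map. The `ROOT` node, the function name `build_graph` and in particular the weighting rule (unit weight shared among the children of a clause) are assumptions made for the example; the text does not specify how the edge weights are derived from the statement.

```python
def build_graph(ast):
    """Turn a clause-level AST (list of [keyword, child, ...]) into nodes and weighted edges."""
    nodes = ["ROOT"]   # node id = index in this list
    edges = {}         # (parent_id, child_id) -> weight
    for clause in ast:
        keyword, children = clause[0], clause[1:]
        nodes.append(keyword)
        kw_id = len(nodes) - 1
        edges[(0, kw_id)] = 1.0
        for child in children:
            nodes.append(child)
            # Assumed weighting rule: spread unit weight over the clause's children,
            # so the weights depend on the shape of the current statement.
            edges[(kw_id, len(nodes) - 1)] = 1.0 / len(children)
    return nodes, edges

ast1 = [["SELECT", "column1", "column2"], ["FROM", "table1"],
        ["WHERE", "condition1", "AND", "condition2"]]
nodes, edges = build_graph(ast1)
```

Because the weights are recomputed for every statement, two queries with the same keywords but different clause sizes produce different graphs, which is the property the dynamic graph is meant to capture.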

(3) Feature extraction

The structural features and attribute features extracted from the SQL graph mainly include:

1) Node types and counts of the graph

2) Edge types and counts of the graph

3) Degree distribution of the nodes

4) Hierarchical structure information

5) Importance of tables and columns (based on access frequency, sensitivity, and context information)

6) Combination and order of SQL keywords

7) Functions and operators used
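A few of the structural features listed above can be computed as in the following sketch. The `KEYWORDS` set, the two node categories and the function name `extract_features` are assumptions for illustration; features such as table importance or sensitivity would need external metadata that is not modelled here.

```python
from collections import Counter

KEYWORDS = {"SELECT", "FROM", "WHERE", "AND", "OR"}   # assumed keyword set

def extract_features(nodes, edges):
    """Compute simple structural features from a node list and (parent, child) edges."""
    types = Counter("keyword" if n.upper() in KEYWORDS else "element" for n in nodes)
    degree = Counter()
    for parent, child in edges:
        degree[parent] += 1
        degree[child] += 1
    # Histogram: how many nodes have each degree value.
    degree_hist = Counter(degree[i] for i in range(len(nodes)))
    return {
        "num_nodes": len(nodes),
        "num_edges": len(edges),
        "type_counts": dict(types),
        "degree_histogram": dict(degree_hist),
    }

nodes = ["SELECT", "column1", "column2", "FROM", "table1"]
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
features = extract_features(nodes, edges)
```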

(4) Risk label assignment

For historical data, risk tags are assigned to it according to whether it relates to a security event or abnormal behavior. For example, SQL statements that involve data leakage, injection attacks, or abnormal data modification are marked as high risk.
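A minimal sketch of this labelling rule is shown below. The record layout, the event names in `HIGH_RISK_EVENTS` and the function name `assign_label` are hypothetical; in practice the link between a statement and a security event would come from audit or incident records.

```python
# Assumed event names that mark a historical statement as high risk.
HIGH_RISK_EVENTS = {"data_leakage", "injection_attack", "abnormal_modification"}

def assign_label(record):
    """Return 1 (high risk) if the record is linked to a known security event, else 0."""
    return 1 if HIGH_RISK_EVENTS & set(record.get("events", [])) else 0

history = [
    {"sql": "SELECT column1, column2 FROM table1 WHERE condition1 AND condition2",
     "events": []},
    {"sql": "SELECT * FROM users WHERE username = 'admin' OR 1=1",
     "events": ["injection_attack"]},
]
labels = [assign_label(r) for r in history]
```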

(5) Construction of a graph neural network model

A graph neural network model is constructed that includes dynamic graph convolution layers, a self-attention mechanism, and multi-task learning, for learning an embedded representation of the SQL graph. The model may contain multiple dynamic graph convolution layers, self-attention layers, and pooling layers for capturing local and global graph structure information.

(6) Model training

The graph neural network model is trained using a supervised learning method. The inputs are the features of the SQL graph, and the outputs are the predicted values of the risk score and of other related tasks, such as the execution time of the SQL statement and the data access volume.

In each dynamic graph convolution layer, the weight matrix is dynamically updated according to the current hidden states of the nodes and edges, calculated as follows:

W^(l) = f(h_v^(l), h_u^(l), e_uv)

where W^(l) represents the weight parameters of the l-th graph convolution layer and f is a learnable function that dynamically computes W^(l) from the current hidden states of the nodes and edges. h_v^(l) represents the hidden state of node v at the l-th layer, that is, the feature representation of node v after processing by the preceding graph convolution layer. h_u^(l) represents the hidden state of node u at the l-th layer, that is, the feature representation of node u after processing by the preceding graph convolution layer. Node u is one of the nodes adjacent to node v, and e_uv represents the feature vector of the edge between nodes u and v, which contains information describing the edge uv, such as its direction, type and weight.
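The idea of computing the layer weights from the current node and edge states can be sketched as below. The text does not give the form of the learnable function f, so the choice made here (a single linear map over the concatenated inputs followed by tanh, reshaped into a square matrix) is an assumption, as are the names `make_f`, `hidden_dim` and `edge_dim`; the random parameters stand in for trained values.

```python
import math
import random

def make_f(hidden_dim, edge_dim, seed=0):
    """Build a hypothetical f: (h_v, h_u, e_uv) -> hidden_dim x hidden_dim weight matrix."""
    rng = random.Random(seed)
    in_dim = 2 * hidden_dim + edge_dim
    # One parameter row per entry of the generated matrix (stand-in for trained parameters).
    params = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
              for _ in range(hidden_dim * hidden_dim)]

    def f(h_v, h_u, e_uv):
        x = h_v + h_u + e_uv   # concatenate the three input vectors
        flat = [math.tanh(sum(p * xi for p, xi in zip(row, x))) for row in params]
        return [flat[i * hidden_dim:(i + 1) * hidden_dim] for i in range(hidden_dim)]

    return f

f = make_f(hidden_dim=2, edge_dim=1)
W_a = f([0.1, 0.2], [0.3, 0.4], [1.0])
W_b = f([0.1, 0.2], [0.3, 0.4], [0.0])   # same nodes, different edge feature
```

The two calls use identical node states but different edge features and yield different weight matrices, which is what distinguishes this layer from a graph convolution with one fixed weight matrix per layer.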

To better capture long-range dependencies between nodes, a self-attention mechanism is introduced into the graph neural network model, and a self-attention coefficient α_v is calculated for each node v.

Here W, W₁ and W₂ are learnable weight vectors and weight matrices applied to the hidden states of node v at different layers, the softmax function normalizes the attention coefficients, and the tanh function provides the nonlinear mapping.
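One plausible way to combine these ingredients is sketched below: the two hidden states of each node are mapped by W₁ and W₂, summed, passed through tanh, projected to a scalar by the weight vector W, and the scalars are normalized over the nodes with softmax. This particular composition is an assumption chosen to match the symbols named in the text, not the patent's stated formula; the function names and the example values are also illustrative.

```python
import math

def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def attention_coefficients(h_a, h_b, w, W1, W2):
    """h_a[v], h_b[v]: hidden states of node v at two layers; returns one coefficient per node."""
    scores = []
    for ha, hb in zip(h_a, h_b):
        mixed = [math.tanh(p + q) for p, q in zip(matvec(W1, ha), matvec(W2, hb))]
        scores.append(sum(wi * mi for wi, mi in zip(w, mixed)))
    top = max(scores)   # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

W1 = [[1.0, 0.0], [0.0, 1.0]]
W2 = [[0.5, 0.0], [0.0, 0.5]]
w = [1.0, 1.0]
h_a = [[0.1, 0.2], [0.9, 0.8], [0.0, 0.0]]
h_b = [[0.2, 0.1], [0.7, 0.9], [0.0, 0.0]]
alpha = attention_coefficients(h_a, h_b, w, W1, W2)
```

The coefficients sum to 1 over the nodes, so they can be read as the relative influence of each node on the graph-level prediction.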

A plurality of output nodes are added at the last layer of the multi-layer perceptron, and a shared hidden layer is used to realize multi-task learning.
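The shared hidden layer with several output nodes can be sketched as follows. The ReLU activation, the task names `execution_time` and `data_access` (taken from the related tasks mentioned for training), the parameter values and the function name `mlp_multitask` are assumptions for illustration.

```python
import math

def mlp_multitask(x, W_shared, b_shared, heads):
    """Shared hidden layer followed by one linear output node per task."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)   # ReLU (assumed)
              for row, b in zip(W_shared, b_shared)]
    outputs = {}
    for task, (w_out, b_out) in heads.items():
        outputs[task] = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    # Only the risk head is squashed into [0, 1]; the others stay as regression outputs.
    outputs["risk_score"] = 1.0 / (1.0 + math.exp(-outputs["risk_score"]))
    return outputs

heads = {
    "risk_score": ([1.0, 1.0], 0.0),
    "execution_time": ([2.0, 3.0], 0.5),
    "data_access": ([0.0, 10.0], 0.0),
}
out = mlp_multitask([1.0, 2.0], [[0.5, -0.25], [0.1, 0.2]], [0.0, 0.1], heads)
```

All heads read the same hidden vector, so gradients from every task update the shared layer during training; that sharing is what the multi-task design relies on.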

(7) Risk score calculation:

For a new SQL query statement, the statement is first converted into an AST and an SQL graph is constructed; its features are then extracted and input into the trained graph neural network model to obtain an embedded representation of the graph.

The risk score is calculated from the embedded representation of the graph using a graph attention mechanism, which emphasizes the nodes or edges that have a greater impact on risk.

risk_score = sigmoid(Σ_v g(v) × α_v × W_r × h_v)

where g(v) is a learnable weight adjustment function associated with node v, g(v) = W_g × h_v + b_g, the parameters W_g and b_g are learnable weights and biases, α_v is the self-attention coefficient of node v, h_v is the hidden state of node v, W_r is a learnable weight matrix, and the sigmoid function compresses the risk score into the range [0, 1].
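The risk score formula can be evaluated directly, as in the sketch below. To obtain a scalar score, the example assumes that W_g is a weight vector (so that g(v) is a scalar) and that W_r is a single-row matrix represented as a vector; these shapes, the function names and the numeric values are assumptions for illustration.

```python
import math

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def risk_score(h, alpha, W_g, b_g, W_r):
    """Evaluate sigmoid(sum over nodes of g(v) * alpha_v * (W_r . h_v))."""
    total = 0.0
    for h_v, a_v in zip(h, alpha):
        g_v = dot(W_g, h_v) + b_g              # g(v) = W_g × h_v + b_g
        total += g_v * a_v * dot(W_r, h_v)     # g(v) × α_v × W_r × h_v
    return 1.0 / (1.0 + math.exp(-total))      # sigmoid keeps the score in [0, 1]

h = [[1.0, 0.0], [0.0, 2.0]]   # hidden states of two nodes
alpha = [0.25, 0.75]           # self-attention coefficients of the two nodes
score = risk_score(h, alpha, W_g=[1.0, 1.0], b_g=0.0, W_r=[0.5, 0.5])
```

In this example the second node has both the larger attention coefficient and the larger g(v), so it contributes most of the pre-sigmoid sum (1.5 of 1.625).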

Since the actual values and graphs depend on the specific data set, model parameters and training process, the method of the invention is illustrated below with a simple concrete example:

(1) Data collection and preprocessing:

The following two SQL query statements are collected as input data:

SQL1: SELECT column1, column2 FROM table1 WHERE condition1 AND condition2

SQL2: SELECT * FROM users WHERE username = 'admin' OR 1=1

Lexical analysis and syntax analysis are performed on each SQL statement, and each is converted into an abstract syntax tree (AST):

AST1 (corresponding to SQL1): [SELECT, [column1, column2], FROM, table1, WHERE, [condition1, AND, condition2]]

AST2 (corresponding to SQL2): [SELECT, *, FROM, users, WHERE, [username, =, 'admin', OR, 1=1]]

(2) Constructing a dynamic SQL graph:

Each AST is converted into a graph representation in which nodes represent syntax elements of the SQL statement and edges represent relationships between the elements. The weights of the edges are dynamically updated according to the characteristics of the current SQL statement.

Simplified graphs (only some of the nodes and edges are shown) are given in FIG. 2 and FIG. 3, where FIG. 2 corresponds to SQL1 and FIG. 3 corresponds to SQL2.

(3) Feature extraction:

The structural features and attribute features of each SQL graph, such as node types, edge types, node degree distribution and hierarchical structure information, are extracted.

Processing results (simplified representation):

Feature vector 1 (corresponding to SQL1): [0.1, 0.2, 0.3, ...] (assumed extracted feature values)

Feature vector 2 (corresponding to SQL2): [0.4, 0.5, 0.6, ...] (assumed extracted feature values)

(4) Risk tag assignment:

Normal queries are labelled low risk, and queries involving attacks or abnormal behavior are labelled high risk.

Processing results:

Label 1 (corresponding to SQL1): low risk

Label 2 (corresponding to SQL2): high risk

(5) Constructing a graph neural network model:

A graph neural network model is constructed that includes dynamic graph convolution layers, a self-attention mechanism, and multi-task learning.

(6) Model training:

The graph neural network model is trained using a supervised learning method, with the features of the SQL graph as inputs and the predicted values of the risk score and other related tasks as outputs.

In each dynamic graph convolution layer, the weight matrix W^(l) is dynamically updated according to the current hidden states of the nodes and edges.

A self-attention mechanism is introduced into the graph neural network model, and the self-attention coefficient of each node is calculated.

A linear function g(v) is introduced to adjust the contribution of the different nodes to the final risk score.

(7) Risk score calculation:

The new SQL query statement is converted into an AST and an SQL graph is constructed; its features are extracted and input into the trained graph neural network model to obtain the embedded representation of the graph.

The risk score is calculated using the formula with adjusted attention weights. Assume that the risk scores computed by the model are as follows:

Risk score 1 (corresponding to SQL1): 0.1 (lower risk)

Risk score 2 (corresponding to SQL2): 0.9 (higher risk)

For a specific SQL statement, traditional methods analyze by means of fixed rules or static features, whereas the invention needs no predefined rules: it completes SQL quality scoring automatically and gives users an intuitive SQL quality evaluation result, with marked advantages in adaptability, automation and scalability. The method adapts better to dynamic changes in SQL statements, reduces the need for manual intervention and processes large-scale SQL query data effectively. These advantages make the invention more efficient and accurate in scoring the quality of SQL statements.

The units, devices or modules etc. set forth in the above embodiments may be implemented in particular by a computer chip or entity or by a product having a certain function. For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, when implementing the present application, the functions of each module may be implemented in the same or multiple pieces of software and/or hardware, or a module implementing the same function may be implemented by multiple sub-modules or a combination of sub-units. The above-described apparatus embodiments are merely illustrative, for example, the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

Those skilled in the art will also appreciate that, in addition to implementing the controller in a pure computer readable program code, it is well possible to implement the same functionality by logically programming the method steps such that the controller is in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc. Such a controller can be regarded as a hardware component, and means for implementing various functions included therein can also be regarded as a structure within the hardware component. Or even means for achieving the various functions may be regarded as either software modules implementing the methods or structures within hardware components.

The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.

The embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. The application is operational with numerous general purpose or special purpose computer system environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

The foregoing description of the embodiments is provided to illustrate the general principles of the application and is not intended to limit the scope of the application or to restrict the application to the particular embodiments. Any modifications, equivalents, improvements and the like that fall within the spirit and principles of the application are intended to be included within the scope of the application.

Claims

1. A deep learning-based financial database SQL quality scoring method, which introduces a dynamic graph convolution layer and dynamically updates weights according to structural and semantic changes in SQL statements, characterized in that the method comprises the following specific steps:

(3) Extracting structural features and attribute features of the SQL graph;

(6) Training a graph neural network model using a supervised learning method, with the features of the SQL graph as inputs and the predicted values of the risk score and other related tasks as outputs;

(7) For a new SQL query statement, first converting it into an AST and constructing an SQL graph, then extracting its features and inputting them into the trained graph neural network model to obtain an embedded representation of the graph;

In step (6), in each dynamic graph convolution layer, the weight matrix is dynamically updated according to the current hidden states of the nodes and edges as W^(l) = f(h_v^(l), h_u^(l), e_uv), where W^(l) represents the weight parameters of the l-th graph convolution layer, f is a learnable function that dynamically computes W^(l) from the current hidden states of the nodes and edges, h_v^(l) represents the hidden state of node v at the l-th layer, h_u^(l) represents the hidden state of node u at the l-th layer, node u is a node adjacent to node v, and e_uv represents the feature vector of the edge between nodes u and v;

a self-attention mechanism is introduced into the graph neural network model and the self-attention coefficient α_v of node v is calculated, where W, W₁ and W₂ are learnable weight vectors and weight matrices applied to the hidden states of node v at different layers, the softmax function normalizes the attention coefficients, and the tanh function provides the nonlinear mapping; and a plurality of output nodes are added at the last layer of the multi-layer perceptron, and a shared hidden layer is used to realize multi-task learning.

2. The deep learning-based financial database SQL quality scoring method of claim 1, wherein: in step (7), risk_score = sigmoid(Σ_v g(v) × α_v × W_r × h_v), where g(v) is a learnable weight adjustment function associated with node v, g(v) = W_g × h_v + b_g, the parameters W_g and b_g are learnable weights and biases, α_v is the self-attention coefficient of node v, h_v is the hidden state of node v, W_r is a learnable weight matrix, and the sigmoid function compresses the risk score into the range [0, 1].

3. A deep learning-based financial database SQL quality scoring system, characterized in that the system comprises the following modules:

(6) Model training module: trains a graph neural network model using a supervised learning method, with the features of the SQL graph as inputs and the predicted values of the risk score and other related tasks as outputs;

(7) Risk score calculation module: for a new SQL query statement, first converts it into an AST and constructs an SQL graph, then extracts its features and inputs them into the trained graph neural network model to obtain an embedded representation of the graph;

In the model training module, in each dynamic graph convolution layer, the weight matrix is dynamically updated according to the current hidden states of the nodes and edges as W^(l) = f(h_v^(l), h_u^(l), e_uv), where W^(l) represents the weight parameters of the l-th graph convolution layer, f is a learnable function that dynamically computes W^(l) from the current hidden states of the nodes and edges, h_v^(l) represents the hidden state of node v at the l-th layer, h_u^(l) represents the hidden state of node u at the l-th layer, node u is a node adjacent to node v, and e_uv represents the feature vector of the edge between nodes u and v;

in the model training module, a self-attention mechanism is introduced into the graph neural network model and the self-attention coefficient α_v of node v is calculated, where W, W₁ and W₂ are learnable weight vectors and weight matrices applied to the hidden states of node v at different layers, the softmax function normalizes the attention coefficients, and the tanh function provides the nonlinear mapping; and a plurality of output nodes are added at the last layer of the multi-layer perceptron, and a shared hidden layer is used to realize multi-task learning.

4. The deep learning-based financial database SQL quality scoring system of claim 3, wherein: in the risk score calculation module, risk_score = sigmoid(Σ_v g(v) × α_v × W_r × h_v), where g(v) is a learnable weight adjustment function associated with node v, g(v) = W_g × h_v + b_g, the parameters W_g and b_g are learnable weights and biases, α_v is the self-attention coefficient of node v, h_v is the hidden state of node v, W_r is a learnable weight matrix, and the sigmoid function compresses the risk score into the range [0, 1].

5. A deep learning-based financial database SQL quality scoring device, comprising a data acquisition device, a processor and a memory; the data acquisition device is used for acquiring data; the memory is used for storing one or more program instructions; the processor is configured to execute the one or more program instructions to perform the method of any of claims 1-2.

6. A computer readable storage medium for deep learning-based financial database SQL quality scoring, having one or more program instructions embodied therein for performing the method of any of claims 1-2.