CN109977014B - Block chain-based code error identification method, device, equipment and storage medium - Google Patents
- Publication number
- CN109977014B (granted publication of application CN201910221141.9A)
- Authority
- CN
- China
- Prior art keywords
- code
- target code
- error
- target
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
- G06F11/3608—Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Character Discrimination (AREA)
Abstract
The disclosure provides a blockchain-based code error identification method, a code error identification apparatus, an electronic device, and a computer-readable storage medium, belonging to the technical field of blockchains. The method comprises the following steps: acquiring a target code; converting the target code into a feature tensor; processing the feature tensor based on a code error recognition model to obtain an error type of the target code, wherein the code error recognition model is a machine learning model trained on code case data stored in a blockchain network; and writing the target code data into the blockchain network. The disclosed method and system can realize intelligent code error recognition, improve efficiency, reduce labor cost, and improve the security of code management.
Description
Technical Field
The present disclosure relates to the field of blockchain technology, and in particular, to a blockchain-based code error identification method, a blockchain-based code error identification apparatus, an electronic device, and a computer-readable storage medium.
Background
In enterprises in the computer, software, internet, and related industries, code writing is fundamental and important work. Errors inevitably occur during code writing and prevent the code from running normally, so the code needs to be corrected. Currently, code error correction in enterprises is mainly completed manually: programmers check for errors in the code line by line according to coding rules and their experience. This approach is very inefficient and consumes a great deal of labor cost.
Therefore, it is necessary to provide a code error recognition method.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure provides a blockchain-based code error recognition method, a blockchain-based code error recognition apparatus, an electronic device, and a computer-readable storage medium, which overcome, at least to some extent, the inefficiency and high labor cost caused by manual error correction in the prior art.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.
According to an aspect of the present disclosure, there is provided a blockchain-based code error recognition method, including: acquiring a target code; converting the target code into a feature tensor; processing the feature tensor based on a code error recognition model to obtain an error type of the target code, wherein the code error recognition model is a machine learning model trained on code case data stored in a blockchain network; and writing the target code data into the blockchain network.
In an exemplary embodiment of the present disclosure, the feature tensor comprises a word vector matrix; the converting the target code into a feature tensor comprises: preprocessing the target code; and performing word segmentation on the preprocessed target code and performing vector conversion on the obtained words to generate the word vector matrix.
In an exemplary embodiment of the disclosure, the preprocessing comprises any one or more of: replacing all characters within each pair of quotation marks in the target code with placeholders; removing symbols in the target code; and padding the target code with preset characters so that the target code reaches a standard length.
In an exemplary embodiment of the present disclosure, the method further comprises: acquiring a plurality of groups of training data from code case data stored in a blockchain network, wherein the training data comprises sample codes and sample error types corresponding to the sample codes; and training the machine learning model by using the training data to obtain the code error recognition model.
In an exemplary embodiment of the present disclosure, the method further comprises: acquiring a solution from the code case data, and constructing a solution association table based on the association relationship between the sample error type and the solution; after the feature tensor is processed based on the code error recognition model and the error type of the target code is obtained, the method further includes: and searching a solution corresponding to the target code in a solution association table according to the error type of the target code.
In an exemplary embodiment of the present disclosure, the machine learning model includes a text convolutional neural network model.
According to an aspect of the present disclosure, there is provided a blockchain-based code error recognition apparatus, including: a code acquisition module, configured to acquire a target code; a tensor conversion module, configured to convert the target code into a feature tensor; an error recognition module, configured to process the feature tensor based on a code error recognition model to obtain the error type of the target code, wherein the code error recognition model is a machine learning model trained on code case data stored in a blockchain network; and a data writing module, configured to write the target code into the blockchain network.
In an exemplary embodiment of the present disclosure, the feature tensor comprises a matrix of word vectors; the tensor conversion module comprises: the preprocessing unit is used for preprocessing the target code; and the vector conversion unit is used for segmenting the preprocessed target code and performing vector conversion on the obtained words to generate the word vector matrix.
In an exemplary embodiment of the disclosure, the preprocessing comprises any one or more of: replacing all characters within each pair of quotation marks in the target code with placeholders; removing symbols in the target code; and padding the target code with preset characters so that the target code reaches a standard length.
In an exemplary embodiment of the present disclosure, the apparatus further includes a model training module, configured to obtain multiple sets of training data from code case data stored in a blockchain network, where the training data includes sample codes and sample error types corresponding to the sample codes, and train the machine learning model using the training data to obtain the code error recognition model.
In an exemplary embodiment of the present disclosure, the apparatus further includes: the association table building module is used for obtaining a solution from the code case data and building a solution association table based on the association relationship between the sample error type and the solution; and the scheme acquisition module is used for searching a solution corresponding to the target code in a solution association table according to the error type of the target code.
In an exemplary embodiment of the present disclosure, the error type of the object code is a machine recognition result; the data writing module is used for acquiring a manual processing result of the target code, and if the manual processing result is not matched with the machine identification result, the target code and the manual processing result are written into the blockchain network as new case data.
In an exemplary embodiment of the present disclosure, the machine learning model includes a text convolutional neural network model.
According to an aspect of the present disclosure, there is provided an electronic device including: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform the method of any one of the above via execution of the executable instructions.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method of any one of the above.
Exemplary embodiments of the present disclosure have the following advantageous effects:
converting the target code into a feature tensor, identifying the error type of the target code through a code error recognition model, and writing the target code into a blockchain network for storage. On the one hand, the code error recognition model is a machine learning model trained on code case data and can learn the association between codes and error types, so that the error type of the code to be identified is recognized intelligently, the speed and efficiency of error recognition are improved, and labor cost is reduced. On the other hand, storing the code through the blockchain network in the exemplary embodiment prevents loss or tampering of code data, improves the efficiency of code management, and facilitates subsequent reference and retrieval.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 illustrates a system architecture diagram of one operating environment of the present exemplary embodiment;
FIG. 2 is a flow chart illustrating a method of code error identification in the exemplary embodiment;
FIG. 3 is a flow chart illustrating a method of code error identification in the exemplary embodiment;
fig. 4 is a block diagram showing a structure of a code error recognition apparatus in the present exemplary embodiment;
fig. 5 shows an electronic device for implementing the above method in the present exemplary embodiment;
fig. 6 illustrates a computer-readable storage medium for implementing the above-described method in the present exemplary embodiment.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow charts shown in the drawings are merely illustrative and do not necessarily include all of the steps. For example, some steps may be decomposed, and some steps may be combined or partially combined, so that the actual execution sequence may be changed according to the actual situation.
FIG. 1 illustrates a system architecture diagram of an exemplary embodiment operating environment of the present disclosure. As shown in fig. 1, the system 100 may include a plurality of member nodes 101 forming a blockchain network 110. The blockchain network 110 may be a consortium chain or a private chain and is configured to store and manage the target codes of the member nodes 101. Each member node 101 may be a computer or a server in an enterprise (e.g., the computer of an employee who needs to perform code error identification); it may upload, store, and update data in the blockchain network 110 based on a certain consensus mechanism, or vote on and verify a newly added member node 101. The data in the blockchain network 110 may be data related to code, such as source code, code test results, code error information, and error correction information.
In an exemplary embodiment, the system 100 may further include a management node 102, which may be a higher-level computer or server in an enterprise (e.g., a server running a program main program) to undertake the task of checking and managing the newly added member nodes 101 in the blockchain network 110, or to check the necessity of data uploaded to the blockchain network 110 by each member node 101.
It should be understood that the number of nodes shown in fig. 1 is only exemplary, and any number of member nodes 101 may be provided, and the management node 102 may also be a cluster composed of a plurality of nodes according to actual needs. The present disclosure is not particularly limited thereto.
According to the system shown in fig. 1, the exemplary embodiment of the present disclosure provides a method for identifying a code error based on a blockchain, which may be implemented in the form of a code testing system associated with the blockchain network. The system may be deployed on any one or more member nodes 101 in the blockchain network 110, or on the management node 102, so that a member node 101 or the management node 102 may be the execution subject of the exemplary embodiment. Fig. 2 shows the flow of the method, which may include the following steps S210 to S240:
step S210, an object code is acquired.
The target code may be source code that needs to undergo error recognition. The language in which the code is written is not limited in the present exemplary embodiment: the system may support error recognition for all types of programming languages, or may support one or more specific programming languages according to the actual application scenario.
In this exemplary embodiment, the target code may be uploaded to the system by a member node in the blockchain network; alternatively, an association between the system and a code database may be configured so that, after a new code is written into the database, it is synchronized to the system for error identification. The system may also obtain the target code in other ways, which is not particularly limited in this disclosure.
In an exemplary embodiment, after the target code is obtained, the system may generate a new processing task and assign a task number, and manage the processing procedure of each code error identification based on that number.
Step S220, the target code is converted into a feature tensor.
The feature tensor is a tensor obtained by extracting feature information from the target code. It may take the form of a one-dimensional tensor (such as a vector), a two-dimensional tensor (such as a matrix), and so on, expressing the target code comprehensively in the form of multi-dimensional numerical values. Several exemplary conversion methods are described below; of course, the present disclosure is not limited to these specific examples, and the conversion method is not limited by the present disclosure.
In an exemplary embodiment, a plurality of dimensions may be predefined; information for each dimension is extracted from the code to be identified and converted to a numerical value, and a certain normalization may be performed to obtain a normalized value in each dimension, thereby generating the feature vector of the code to be identified, that is, a one-dimensional feature tensor.
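The dimension-wise conversion above can be sketched as follows; this is an illustrative assumption, as the patent does not specify the normalization scheme:

```python
def normalize_features(values: list[float]) -> list[float]:
    """Min-max normalize raw per-dimension feature values into [0, 1],
    yielding a one-dimensional feature tensor (feature vector)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # all dimensions equal: return a zero vector to avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# e.g. raw features per dimension: line count, token count, nesting depth
print(normalize_features([2.0, 4.0, 6.0]))  # → [0.0, 0.5, 1.0]
```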
In an exemplary embodiment, the structure of the code to be identified may be analyzed to generate an abstract syntax tree composed of syntax units. The syntax type of each syntax unit is then determined according to its node position in the abstract syntax tree, and each syntax unit is encoded using a preset syntax type code and converted into a numerical character; these are combined into the feature tensor according to the order in which the syntax units appear in the code to be identified.
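A minimal sketch of this syntax-tree approach using Python's standard `ast` module; the syntax-type code table is purely illustrative, as the disclosure does not specify one, and the breadth-first traversal of `ast.walk` is a simplification of strict source order:

```python
import ast

# Illustrative mapping from syntax-unit (AST node) types to numeric codes;
# the actual code table is not specified in the disclosure.
SYNTAX_TYPE_CODES = {"Module": 0, "FunctionDef": 1, "Assign": 2,
                     "Name": 3, "Constant": 4, "Store": 5, "Load": 6}

def code_to_syntax_tensor(source: str) -> list[int]:
    """Parse the code into an abstract syntax tree and encode each
    syntax unit by its node type; unknown types map to -1."""
    tree = ast.parse(source)
    return [SYNTAX_TYPE_CODES.get(type(node).__name__, -1)
            for node in ast.walk(tree)]

print(code_to_syntax_tensor("x = 1"))  # → [0, 2, 3, 4, 5]
```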
In an exemplary embodiment, the feature tensor may also take the form of a feature matrix: for example, the characters in the code to be identified are Unicode-encoded, and the encoding results of the characters are stacked vertically to obtain the feature matrix of the code to be identified, that is, a two-dimensional feature tensor.
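This character-level encoding can be sketched as follows; the fixed binary width per character is an illustrative choice:

```python
def code_to_char_matrix(source: str, width: int = 8) -> list[list[int]]:
    """Encode each character by its Unicode code point, rendered as
    fixed-width binary digits, and stack the rows vertically into a
    two-dimensional feature matrix."""
    return [[int(bit) for bit in format(ord(ch), f"0{width}b")]
            for ch in source]

matrix = code_to_char_matrix("a=1")
print(matrix[0])  # 'a' = U+0061 = 97 → [0, 1, 1, 0, 0, 0, 0, 1]
```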
Step S230, based on the code error identification model, processing the feature tensor to obtain the error type of the target code.
The code error recognition model is a machine learning model trained on code case data stored in the blockchain network. Code case data refers to data related to code error correction cases; each case may include the code, the error type, and the solution, uploaded by member nodes and stored in the blockchain network. The code error recognition model may take the feature tensor of the target code as input and the identified error type as output. The error type is a classification result of code errors and may be expressed, for example, in the form of error codes, with each error code corresponding to one class of code errors. Determining the error type of the target code helps a programmer to resolve the error in the code specifically.
Step S240, writing the target code into the blockchain network.
In this exemplary embodiment, the system may record and manage the current processing status or progress of each target code. For example, when a member node uploads the target code to the system in step S210, it may be marked as "pending"; after steps S220 and S230 are performed, it may be marked as "recognized"; after a programmer manually modifies the code so that the problem is solved, it may be marked as "resolved". For a resolved target code, the code and the related information produced during processing may be written into the blockchain network. Of course, target code in the pending or recognized status may also be written into the blockchain network, and if new information is generated subsequently, it continues to be written into the blockchain network as new data. It can be seen that steps S210 to S230 constitute a process of code error recognition, and step S240 may treat this process as a code error correction case, writing the relevant information data into the blockchain network for later reference and retrieval. Therefore, the order of the steps in the present exemplary embodiment is not limited; for example, step S240 may be executed after step S210, followed by steps S220 and S230.
In the blockchain network, a new block may be generated for each code error correction case. In step S240, if the target code is a new case, a new block may be generated at the end of the chain, the target code written into that block, and information data related to the target code subsequently written into the same block. The system may record the task number in the block header of each block so that each block corresponds to one code error correction case for indexing; if the same case contains multiple versions of the code during processing, they may be written into the same block for management.
It should be added that when data is written into the blockchain network, it may be encrypted to some degree, for example by hash encryption before writing. An illustration: when a member node uploads target code for error identification to the system, a new block may be generated in the blockchain network. As shown in table 1, the new block may include the following information: the task number; the employee's name or employee number; privacy permissions; the date; the employee's public key and signature; the code to be identified; the processing status; related supporting materials (such as pictures and videos) and their storage links; and so on. Subsequently, if someone accesses the data in the block, their public key may be recorded. Part of the information (usually the more important information) may be stored in the form of hash pointer links.
TABLE 1
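The hash-based write described above can be sketched as follows; the field set and the choice of SHA-256 are illustrative assumptions, and `prev_hash` stands in for the hash pointer linking to the previous block:

```python
import hashlib
import json

def make_block(task_number: str, payload: dict, prev_hash: str) -> dict:
    """Serialize a code-error-correction case record, hash it with
    SHA-256, and link it to the previous block by a hash pointer."""
    body = {"task_number": task_number,
            "payload": payload,
            "prev_hash": prev_hash}
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()
    body["hash"] = digest
    return body

block = make_block("T0001", {"code": "x = 1", "status": "pending"}, "0" * 64)
```

Any later change to the stored code or status changes the digest, which is what makes tampering detectable.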
Based on the above description, in the present exemplary embodiment, the target code is converted into a feature tensor, the error type of the target code is identified by the code error recognition model, and the target code is written into the blockchain network for storage. On the one hand, the code error recognition model is a machine learning model trained on code case data and can learn the association between codes and error types, so that the error type of the code to be identified is recognized intelligently, the speed and efficiency of error recognition are improved, and labor cost is reduced. On the other hand, storing the code through the blockchain network prevents loss or tampering of code data, improves the efficiency of code management, and facilitates subsequent reference and retrieval.
In an exemplary embodiment, the feature tensor may be in the form of a word vector matrix, and accordingly, the step S220 may be implemented by the following steps:
preprocessing the target code;
and performing word segmentation on the preprocessed target code, and performing vector conversion on the obtained words to generate a word vector matrix.
The preprocessing can comprise any one or more of the following methods:
(1) All characters within each pair of quotation marks in the target code are replaced with a placeholder. The placeholder indicates that these characters are quoted content, which generally has no practical influence on the syntax of the code and can be ignored during error recognition. The placeholder may be a specific token such as "strp%" or "quote%".
(2) The symbols in the target code are removed. In a programming language, symbols are generally used to represent separation, line breaks, or character-type delimiting; they have little influence on the syntax of the code, and removing them does not change the original semantics. For example, on the basis of preserving the original code syntax, a regular expression such as [\w']+|[!"#$%&'()*+,\-./:;<=>?@\[\\\]^_`{|}~] may be used to extract the symbols in the target code so that they can be removed. Of course, if symbols that have a significant impact on the syntax are present in the target code, these symbols may be retained.
(3) The target code is padded with preset characters so that it reaches a standard length. The standard length is a uniform standard set for all target codes so that they have a uniform length for subsequent processing; it may be expressed as a standard number of characters, words, or bytes. When the target code is not long enough, it may be padded with a preset character, which may be a specific token dedicated to indicating padding, such as "fill%", "f%", or "0", customized in the system in advance.
It should be understood that the above three methods are only examples of preprocessing. In practical applications they can be used in combination, and other preprocessing methods can be adopted, for example automatically correcting spelling errors in the target code or segmenting the target code, which is not particularly limited by this disclosure.
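The three preprocessing steps above can be sketched together as follows; the placeholder token, the stripped symbol set, the fill character, and the standard length are all illustrative assumptions:

```python
import re

def preprocess(code: str, std_len: int = 64) -> str:
    """Apply the three example preprocessing steps to a target code."""
    # (1) replace everything inside quotation marks with a placeholder
    code = re.sub(r'"[^"]*"', ' strp ', code)
    code = re.sub(r"'[^']*'", ' strp ', code)
    # (2) remove symbols that carry little syntactic meaning here
    code = re.sub(r'[!#$%&()*+,\-./:;<=>?@\[\\\]^{|}~]', ' ', code)
    # (3) pad with a preset character up to the standard length
    return code.ljust(std_len, '0')

out = preprocess('print("hello")')
```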
After preprocessing, the target code may be segmented into words. Segmentation may be based on the morphemes of the code, where a morpheme is the smallest language unit with complete meaning in the code, such as a function name, argument, or operator. Each word is then converted into a word vector, an encoding that represents the word in terms of syntactic or semantic feature dimensions. In the present exemplary embodiment, word vector conversion may be implemented by a pre-trained embedding layer, by one-hot encoding, or by using word2vec (a word vectorization tool), among others. After the conversion is completed, the word vectors of the words are combined to obtain the word vector matrix of the target code, which contains the feature information of each word in the target code and can represent the target code more comprehensively and fully.
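The segmentation and vector-conversion step can be sketched with one-hot encoding against a toy code lexicon; the lexicon and the whitespace segmentation are simplifying assumptions, since a real system would segment by morphemes and might use word2vec or a trained embedding layer instead:

```python
VOCAB = ["print", "strp", "if", "return"]  # toy code lexicon

def build_word_vector_matrix(code: str, vocab: list[str]) -> list[list[int]]:
    """Segment the preprocessed code into tokens and one-hot encode
    each token against the lexicon; each row is one word vector."""
    return [[1 if token == word else 0 for word in vocab]
            for token in code.split()]

m = build_word_vector_matrix("print strp", VOCAB)
print(m)  # → [[1, 0, 0, 0], [0, 1, 0, 0]]
```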
In an exemplary embodiment, the process of obtaining the word vector matrix may also be implemented by: after the target code is preprocessed, the preprocessed target code is converted into a text characteristic vector, and the text characteristic vector is converted into a word vector matrix through a mapping layer between a text and a word vector which are established in advance.
In an exemplary embodiment, as shown in fig. 3, the code error recognition method may further include the following steps for training the error recognition model:
step S310, obtaining a plurality of groups of training data from code case data stored in a block chain network, wherein the training data comprises sample codes and sample error types corresponding to the sample codes;
in step S320, a machine learning model is trained by using the training data to obtain a code error recognition model.
The sample code is the source code in a historical case and serves as the input data of the training data in this embodiment; the sample error type may be an error type obtained by manual detection in the historical case and serves as the label data of the training data. The sample code and sample error type are thus grouped in correspondence. When training the model, the method of step S220 may be used to convert the sample code into a sample feature tensor, which is input into the machine learning model; the parameters of the model are adjusted iteratively so that the model output approaches the label data. Training is complete when the model reaches a certain accuracy on a validation set within the training data. In the present exemplary embodiment, any machine learning model for classification is suitable as the code error recognition model, such as a neural network model or a support vector machine model.
In an exemplary embodiment, a text convolutional neural network model may be adopted to process text information by convolution. Such a model generally includes an embedding layer that can process one-hot character encodings and convert them into intermediate vectors carrying semantic features, after which feature processing is performed and the error type is finally obtained. Therefore, when converting codes into feature tensors, one-hot encoding can be performed on the morphemes in the codes according to a code lexicon; this conversion is simple and efficient. The text convolutional neural network model thus has high processing efficiency, and extracting local features in the code by convolution achieves high accuracy.
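The core operation of such a text convolutional network, sliding a kernel over the sequence of word vectors, can be sketched in pure Python; a real TextCNN would stack an embedding layer, multiple kernel widths, max-pooling, and a softmax classifier over error types on top of this:

```python
def conv1d(seq: list[list[float]], kernel: list[list[float]]) -> list[float]:
    """Slide a width-k kernel over a sequence of word vectors and
    produce one activation per window (no padding, stride 1)."""
    k = len(kernel)
    out = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        out.append(sum(w * x
                       for kernel_row, word_vec in zip(kernel, window)
                       for w, x in zip(kernel_row, word_vec)))
    return out

# three word vectors of dimension 2, one kernel of width 2
print(conv1d([[1, 0], [0, 1], [1, 1]], [[1, 1], [1, 1]]))  # → [2, 3]
```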
Further, in an exemplary embodiment, as shown in fig. 3 above, the code error identification method may further include the following steps:
step S330, a solution is obtained from the code case data, and a solution association table is constructed based on the association relationship between the sample error type and the solution;
accordingly, after step S230, the code error recognition method may further include the steps of:
step S231, according to the error type of the target code, searching a solution corresponding to the target code in the solution association table.
The solution association table is based on the associations between the various error types and solutions summarized from a large amount of code case data; it records which solutions are suitable for each type of error. By finding the error type and corresponding solution of the code in each case from the code case data collected in the blockchain network, the association table can be established, and when new error types and solutions appear, they can be added to update the table. In the association table, one type of error may correspond to multiple solutions, and one solution may be used to solve multiple types of errors; that is, there may be "one-to-many", "many-to-one", or "many-to-many" relationships, which is not particularly limited by the present disclosure. Based on the associations in the table, the solution corresponding to the error type of the target code can be looked up, providing a solution suggestion to help the programmer resolve the code error.
In an exemplary embodiment, the association table may also record a matching degree between each type of error and each solution, for example calculated from the number of times the solution resolved that error in the cases; in step S231, the solution with the highest matching degree may then be searched for.
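A solution association table with matching degrees could be sketched as below. This is an assumed data structure for illustration only; the error-type names and solution strings are hypothetical, and the matching degree is approximated by how often a solution resolved the error in past cases.

```python
# Illustrative sketch: an error-type -> solution association table
# whose matching degree is the count of past cases each solution
# resolved, supporting the "highest matching degree" lookup of S231.

from collections import defaultdict

class SolutionTable:
    def __init__(self):
        # error_type -> {solution: resolution_count}
        self._table = defaultdict(lambda: defaultdict(int))

    def record_case(self, error_type, solution):
        """Update the table from one resolved code case."""
        self._table[error_type][solution] += 1

    def best_solution(self, error_type):
        """Return the solution with the highest matching degree, or None."""
        candidates = self._table.get(error_type)
        if not candidates:
            return None
        return max(candidates, key=candidates.get)

table = SolutionTable()
table.record_case("null_pointer", "add None check before dereference")
table.record_case("null_pointer", "add None check before dereference")
table.record_case("null_pointer", "initialize variable at declaration")
```

The same structure naturally accommodates the one-to-many and many-to-many relationships mentioned above, since each error type keys an arbitrary set of solutions.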
As can be seen from the above, the error type of the target code is the result predicted by the code error recognition model, and can be regarded as the machine recognition result; in an exemplary embodiment, as shown in fig. 3 above, step S240 may include the following steps:
Step S241: acquiring a manual processing result of the target code, and, if the manual processing result does not match the machine recognition result, writing the target code and the manual processing result into the blockchain network as new case data.
After the error type of the target code is provided through the code error recognition model, or the solution of the target code is provided through the solution association table, a programmer can intervene manually to perform subsequent processing and upload the manual processing result to the system. The system then matches the manual processing result against the machine recognition result to confirm whether the code was processed according to the machine recognition result; in other words, the system checks whether the machine recognition result was correct. If a different manual processing approach was adopted instead, the machine recognition result may be incorrect, and the target code together with the manual processing result can be marked as new case data and written into the blockchain network, to be used subsequently for optimizing and training the code error recognition model or updating the solution association table, with the manual processing result serving as labeled data corresponding to the target code.
In an exemplary embodiment, if the result of the manual processing matches the result of the machine recognition, the result may also be written into the blockchain network, which may be used as general data or may be labeled as case data, and this disclosure does not specifically limit this.
Through the above steps, the processed code data is written into the blockchain network, so that the code case data stored there grows continuously; the code error recognition model can be periodically trained and optimized with this data, updating the model in a closed loop and improving error recognition accuracy.
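The closed feedback loop described above can be sketched as follows. All interfaces here are assumptions for illustration: the "chain" is a plain list standing in for the blockchain network, and the record fields are invented names.

```python
# Illustrative sketch: a mismatch between the machine recognition
# result and the manual processing result flags the record as a new
# labeled case for the next round of model optimization.

def feed_back(chain, target_code, machine_result, manual_result):
    """Append a record to the chain; flag mismatches as new case data."""
    record = {
        "code": target_code,
        "machine_result": machine_result,
        "manual_result": manual_result,
        # A mismatch suggests the model was wrong, so the manual result
        # becomes labeled training data for retraining.
        "is_new_case": manual_result != machine_result,
    }
    chain.append(record)
    return record

chain = []
r1 = feed_back(chain, "int x = ;", "syntax_error", "syntax_error")
r2 = feed_back(chain, "free(p); free(p);", "memory_leak", "double_free")
```

Records where the two results agree may still be kept as general data, matching the exemplary embodiment above in which matching results are also written to the network.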
An exemplary embodiment of the present disclosure also provides a block chain-based code error recognition apparatus. As shown in fig. 4, the apparatus 400 may include: a code obtaining module 410, configured to obtain a target code; a tensor conversion module 420, configured to convert the target code into a feature tensor; an error identification module 430, configured to process the feature tensor based on a code error recognition model to obtain an error type of the target code, where the code error recognition model is a machine learning model trained on code case data stored in a blockchain network; and a data writing module 440, configured to write the target code into the blockchain network.
In an exemplary embodiment, the feature tensor can be a word vector matrix; the tensor conversion module may include: the preprocessing unit is used for preprocessing the target code; and the vector conversion unit is used for segmenting the preprocessed target code and performing vector conversion on the obtained words to generate a word vector matrix.
In an exemplary embodiment, the preprocessing may include any one or more of: replacing all characters in each quotation mark in the target code with placeholders; removing symbols in the target code; and filling preset characters into the target code so that the target code reaches a standard length.
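The three preprocessing operations could look like the sketch below. This is not the patent's exact procedure: the placeholder token, the symbol set to remove, the pad character, and the standard length are all assumptions made for this example.

```python
# Illustrative sketch of the three preprocessing steps: placeholder
# substitution inside quotation marks, symbol removal, and padding to
# a standard length. All constants are hypothetical.

import re

PLACEHOLDER = "STR"
STANDARD_LENGTH = 40
PAD_CHAR = " "

def preprocess(code):
    # 1. Replace everything inside each pair of quotation marks with a
    #    placeholder, so string-literal contents do not add noise.
    code = re.sub(r'"[^"]*"', f'"{PLACEHOLDER}"', code)
    code = re.sub(r"'[^']*'", f"'{PLACEHOLDER}'", code)
    # 2. Remove symbols assumed to carry little information for error
    #    recognition (an invented subset, for illustration).
    code = re.sub(r"[@#$\\]", "", code)
    # 3. Pad with a preset character up to the standard length.
    if len(code) < STANDARD_LENGTH:
        code = code + PAD_CHAR * (STANDARD_LENGTH - len(code))
    return code

out = preprocess('printf("hello, world");')
```

Padding every sample to one standard length gives the downstream model a fixed-size input, which is why this step pairs naturally with the word-vector-matrix conversion.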
In an exemplary embodiment, the code error recognition apparatus may further include a model training module, configured to obtain multiple sets of training data from the code case data stored in the blockchain network, where the training data includes sample codes and sample error types corresponding to the sample codes, and train a machine learning model using the training data to obtain the code error recognition model.
In an exemplary embodiment, the code error recognition apparatus may further include: the association table building module is used for obtaining a solution from the code case data and building a solution association table based on the association relationship between the sample error type and the solution; and the scheme acquisition module is used for searching a solution corresponding to the target code in the solution association table according to the error type of the target code.
In an exemplary embodiment, the error type of the target code is a machine identification result; the data writing module may be configured to obtain a manual processing result of the target code, and, if the manual processing result does not match the machine identification result, write the target code and the manual processing result into the blockchain network as new case data.
In an exemplary embodiment, the machine learning model may be a text convolutional neural network model.
The specific details of each module are described in detail in the embodiments of the method section and are therefore not repeated here.
Exemplary embodiments of the present disclosure also provide an electronic device capable of implementing the above method.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method, or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, all of which may generally be referred to herein as a "circuit," "module," or "system."
An electronic device 500 according to such an exemplary embodiment of the present disclosure is described below with reference to fig. 5. The electronic device 500 shown in fig. 5 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 5, the electronic device 500 is embodied in the form of a general-purpose computing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one storage unit 520, a bus 530 connecting the various system components (including the storage unit 520 and the processing unit 510), and a display unit 540.
Where the storage unit stores program code, the program code may be executed by the processing unit 510 such that the processing unit 510 performs the steps according to various exemplary embodiments of the present disclosure as described in the above-mentioned "exemplary methods" section of this specification. For example, processing unit 510 may perform the method steps shown in fig. 2 or fig. 3, and so on.
The storage unit 520 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM) 521 and/or a cache memory unit 522, and may further include a read-only memory unit (ROM) 523.
The storage unit 520 may also include a program/utility 524 having a set (at least one) of program modules 525, such program modules 525 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The electronic device 500 may also communicate with one or more external devices 700 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 500, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device 500 to communicate with one or more other computing devices. Such communication may occur via an input/output (I/O) interface 560. Also, the electronic device 500 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 560. As shown, the network adapter 560 communicates with the other modules of the electronic device 500 over the bus 530. It should be appreciated that although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the exemplary embodiments of the present disclosure.
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device.
Referring to fig. 6, a program product 600 for implementing the above method according to an exemplary embodiment of the present disclosure is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, or the like, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes included in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
It should be noted that although several modules or units of the device for action execution are mentioned in the above detailed description, such a division is not mandatory. Indeed, according to exemplary embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided so as to be embodied by a plurality of modules or units.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.
Claims (10)
1. A code error identification method based on a block chain is characterized by comprising the following steps:
acquiring a target code;
dividing words of the target code by taking morphemes as a reference, converting each obtained word into a word vector, and combining the word vectors of each word to convert the target code into a feature tensor;
processing the feature tensor based on a code error recognition model to obtain an error type of the target code, wherein the code error recognition model is a machine learning model obtained by training according to code case data stored in a block chain network;
according to the error type of the target code, searching a solution corresponding to the target code in a solution association table constructed in advance, and processing the target code; and
writing processing results of the target code into the blockchain network for periodically training and optimizing the code error recognition model, wherein the processing results include at least one of processing results of the code error recognition model on a feature tensor corresponding to the target code and processing results of the target code according to the solution.
2. The method of claim 1, wherein the feature tensor comprises a matrix of word vectors;
the dividing the target code into words by taking morphemes as a reference, converting each obtained word into a word vector, and combining the word vectors of each word to convert the target code into a feature tensor comprises the following steps:
preprocessing the target code;
and performing word segmentation on the preprocessed target code, and performing vector conversion on the obtained words to generate the word vector matrix.
3. The method of claim 2, wherein the pre-processing comprises any one or more of:
replacing all characters in each quotation mark in the target code with placeholders;
removing symbols in the target code; and
and filling preset characters in the target code so as to enable the target code to reach a standard length.
4. The method of claim 1, further comprising:
acquiring a plurality of groups of training data from code case data stored in a blockchain network, wherein the training data comprises sample codes and sample error types corresponding to the sample codes;
and training the machine learning model by using the training data to obtain the code error recognition model.
5. The method of claim 4, further comprising:
and acquiring a solution from the code case data, and constructing the solution association table based on the association relationship between the sample error type and the solution.
6. The method of claim 1, wherein the error type of the target code is a machine identification result;
the writing the processing result of the target code into the blockchain network comprises:
and acquiring a manual processing result of the target code, and if the manual processing result is not matched with the machine identification result, writing the target code and the manual processing result into the blockchain network as new case data.
7. The method of any of claims 1-6, wherein the machine learning model comprises a text convolutional neural network model.
8. An apparatus for identifying code errors based on block chains, comprising:
the code acquisition module is used for acquiring a target code;
the tensor conversion module is used for dividing words of the target code by taking morphemes as a reference, converting each obtained word into a word vector, and combining the word vectors of each word to convert the target code into a characteristic tensor;
the error identification module is used for processing the characteristic tensor to obtain the error type of the target code based on a code error identification model, wherein the code error identification model is a machine learning model obtained by training according to code case data stored in a block chain network;
a scheme obtaining module, configured to search, according to the error type of the target code, a solution corresponding to the target code in a solution association table that is constructed in advance, and process the target code; and
and the data writing module is used for writing the processing result of the target code into the block chain network so as to periodically train and optimize the code error recognition model, wherein the processing result comprises at least one of a processing result of the code error recognition model on a feature tensor corresponding to the target code and a processing result of the target code according to the solution.
9. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1-7 via execution of the executable instructions.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910221141.9A CN109977014B (en) | 2019-03-22 | 2019-03-22 | Block chain-based code error identification method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910221141.9A CN109977014B (en) | 2019-03-22 | 2019-03-22 | Block chain-based code error identification method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109977014A CN109977014A (en) | 2019-07-05 |
CN109977014B true CN109977014B (en) | 2022-04-05 |
Family
ID=67080112
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910221141.9A Active CN109977014B (en) | 2019-03-22 | 2019-03-22 | Block chain-based code error identification method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109977014B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110347609B (en) * | 2019-07-18 | 2023-05-23 | 腾讯科技(深圳)有限公司 | Method and device for testing blockchain software |
CN110489127B (en) * | 2019-08-12 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Error code determination method, apparatus, computer-readable storage medium and device |
CN110781072A (en) * | 2019-09-10 | 2020-02-11 | 中国平安财产保险股份有限公司 | Code auditing method, device and equipment based on machine learning and storage medium |
CN110750984B (en) * | 2019-10-24 | 2023-11-21 | 深圳前海微众银行股份有限公司 | Command line character string processing method, terminal, device and readable storage medium |
CN110852077B (en) * | 2019-11-13 | 2023-03-31 | 泰康保险集团股份有限公司 | Method, device, medium and electronic equipment for dynamically adjusting Word2Vec model dictionary |
CN112947928A (en) * | 2019-12-10 | 2021-06-11 | 北京沃东天骏信息技术有限公司 | Code evaluation method and device, electronic equipment and storage medium |
CN116662206B (en) * | 2023-07-24 | 2024-02-13 | 泰山学院 | Computer software online real-time visual debugging method and device |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170124497A1 (en) * | 2015-10-28 | 2017-05-04 | Fractal Industries, Inc. | System for automated capture and analysis of business information for reliable business venture outcome prediction |
CN106919501A (en) * | 2015-12-25 | 2017-07-04 | 北京计算机技术及应用研究所 | Static Analysis Method and instrument based on defect mode |
US10046228B2 (en) * | 2016-05-02 | 2018-08-14 | Bao Tran | Smart device |
CN105930277B (en) * | 2016-07-11 | 2018-12-11 | 南京大学 | A kind of defect source code localization method based on defect report analysis |
CN106502909B (en) * | 2016-11-07 | 2019-04-23 | 南京大学 | A kind of aacode defect prediction technique in smart mobile phone application exploitation |
CN108876213B (en) * | 2018-08-22 | 2022-05-17 | 泰康保险集团股份有限公司 | Block chain-based product management method, device, medium and electronic equipment |
CN109493216B (en) * | 2018-09-30 | 2021-02-09 | 北京小米移动软件有限公司 | Model training method, device, system and storage medium |
2019
- 2019-03-22 CN CN201910221141.9A patent/CN109977014B/en active Active
Non-Patent Citations (2)
Title |
---|
Dynamic code duplication with vulnerability awareness for soft error detection on VLIW architectures;Jongwon Lee;《ACM Transactions on Architecture and Code Optimization》;20130120;全文 * |
Wang Linzhang; Chen Kai; Wang Ji. Preface to the special issue on software security vulnerability detection. 《Journal of Software》. 2018, *
Also Published As
Publication number | Publication date |
---|---|
CN109977014A (en) | 2019-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109977014B (en) | Block chain-based code error identification method, device, equipment and storage medium | |
US11816442B2 (en) | Multi-turn dialogue response generation with autoregressive transformer models | |
US11468239B2 (en) | Joint intent and entity recognition using transformer models | |
CN111198948B (en) | Text classification correction method, apparatus, device and computer readable storage medium | |
CN110580308B (en) | Information auditing method and device, electronic equipment and storage medium | |
US10769043B2 (en) | System and method for assisting user to resolve a hardware issue and a software issue | |
CN112445775B (en) | Fault analysis method, device, equipment and storage medium of photoetching machine | |
CN112860919B (en) | Data labeling method, device, equipment and storage medium based on generation model | |
US20210073257A1 (en) | Logical document structure identification | |
CN112906361A (en) | Text data labeling method and device, electronic equipment and storage medium | |
CN112667878B (en) | Webpage text content extraction method and device, electronic equipment and storage medium | |
CN113868419A (en) | Text classification method, device, equipment and medium based on artificial intelligence | |
CN113743101A (en) | Text error correction method and device, electronic equipment and computer storage medium | |
CN111831624A (en) | Data table creating method and device, computer equipment and storage medium | |
CN110674633A (en) | Document review proofreading method and device, storage medium and electronic equipment | |
CN117390156A (en) | Cross-modal-based question-answer dialogue method, system, equipment and storage medium | |
US20190188270A1 (en) | Generating an executable code based on a document | |
CN112528674B (en) | Text processing method, training device, training equipment and training equipment for model and storage medium | |
CN114117445A (en) | Vulnerability classification method, device, equipment and medium | |
CN111199170B (en) | Formula file identification method and device, electronic equipment and storage medium | |
CN113255292B (en) | End-to-end text generation method based on pre-training model and related equipment | |
US11651852B2 (en) | Methods for surgical guideline indicator mapping to facilitate automated medical bill adjudication and devices thereof | |
CN115358189B (en) | Text encoding method, device, medium and equipment | |
CN118838949A (en) | Automatic data conversion method and system based on artificial intelligence | |
CN118939347A (en) | Operation and maintenance workflow generation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||