CN113868391A - Knowledge graph-based legal document generation method, device, equipment and medium

Info

Publication number: CN113868391A
Application number: CN202111137344.3A
Authority: CN
Inventors: 刘璐
Original assignee: Ping An International Smart City Technology Co Ltd
Current assignee: Ping An International Smart City Technology Co Ltd
Priority date: 2021-09-27
Filing date: 2021-09-27
Publication date: 2021-12-31
Anticipated expiration: 2041-09-27
Also published as: CN113868391B

Abstract

The application relates to artificial intelligence and provides a knowledge graph-based legal document generation method, device, equipment and medium. The method comprises the following steps: acquiring a plurality of trial elements of a historical case according to first file data of the historical case, and determining the similarity between the trial elements; taking the plurality of trial elements as nodes and the similarity between the trial elements as the relationships between the nodes, and generating a case knowledge graph that includes the judgment result of the historical case; acquiring file data of a case to be processed, extracting a plurality of target trial elements from the file data, and determining the similarity between the target trial elements; determining, from the case knowledge graph, a target judgment result corresponding to the case to be processed according to the plurality of target trial elements and the similarity between the target trial elements; and generating the legal document according to the target judgment result and the file data. The application also relates to blockchain technology. The method enables legal documents to be generated quickly and accurately.

Description

Knowledge graph-based legal document generation method, device, equipment and medium

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a method, an apparatus, a device, and a medium for generating a legal document based on a knowledge graph.

Background

With social and economic development and the advance of the rule of law, the public's demand for judicial services grows daily, and more and more cases are filed with the courts. Judges' caseloads keep increasing to the point of overload, and the laws and regulations on which judgments are based are continually updated, which makes cases harder to decide; an excessive workload or greater difficulty can easily raise the error rate of judgments. As the number of cases a judge hears grows, drafting legal documents also becomes increasingly burdensome. Therefore, how to generate the legal documents of a case quickly and accurately, and reduce the time spent writing procedural elements so as to improve trial efficiency, has become an urgent problem.

Disclosure of Invention

The main purpose of the present application is to provide a legal document generation method, device, equipment and medium based on knowledge graph, aiming at quickly and accurately generating a legal document of a case, thereby improving the trial and judgment efficiency of the case.

In a first aspect, the present application provides a method for generating a legal document based on a knowledge-graph, comprising:

acquiring first file data of a historical case, and performing text conversion on the first file data according to a preset format to obtain second file data;

acquiring a plurality of trial elements of the historical case from the second file data, and determining the similarity between the trial elements;

taking the plurality of trial elements as nodes, and taking the similarity between the trial elements as the relationships between the nodes, to generate a case knowledge graph, wherein the case knowledge graph comprises the judgment result of the historical case;

acquiring file data of a case to be processed, extracting a plurality of target trial elements from the file data, and determining the similarity between the target trial elements;

determining, from the case knowledge graph, a target judgment result corresponding to the case to be processed according to the plurality of target trial elements and the similarity between the target trial elements;

and generating the legal document of the case to be processed according to the target judgment result and the file data.

In a second aspect, the present application also provides a legal document generation apparatus comprising:

the acquisition module is used for acquiring first file data of a historical case and performing text conversion on the first file data according to a preset format to obtain second file data;

the processing module is used for acquiring a plurality of trial elements of the historical case from the second file data and determining the similarity between the trial elements;

the training module is used for taking the plurality of trial elements as nodes and taking the similarity between the trial elements as the relationships between the nodes to generate a case knowledge graph, wherein the case knowledge graph comprises the judgment result of the historical case;

the extraction module is used for acquiring file data of a case to be processed, extracting a plurality of target trial elements from the file data and determining the similarity between the target trial elements;

the determining module is used for determining, from the case knowledge graph, a target judgment result corresponding to the case to be processed according to the plurality of target trial elements and the similarity between the target trial elements;

and the generating module is used for generating the legal document of the case to be processed according to the target judgment result and the file data.

In a third aspect, the present application also provides a computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the legal document generation method as described above.

In a fourth aspect, the present application also provides a computer readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the legal document generation method as described above.

The application provides a knowledge graph-based legal document generation method, device, equipment and medium. First file data of a historical case is acquired and converted according to a preset format to obtain second file data; a plurality of trial elements of the historical case are acquired from the second file data, and the similarity between the trial elements is determined; the plurality of trial elements are taken as nodes and the similarity between the trial elements as the relationships between the nodes to generate a case knowledge graph, which includes the judgment result of the historical case; file data of a case to be processed is acquired, a plurality of target trial elements are extracted from it, and the similarity between the target trial elements is determined; a target judgment result corresponding to the case to be processed is determined from the case knowledge graph according to the plurality of target trial elements and the similarity between them; and the legal document of the case to be processed is generated according to the target judgment result and the file data. Legal documents for all types of cases can thus be generated quickly and accurately through the case knowledge graph, and the time spent writing procedural elements is shortened, which improves trial efficiency and accelerates the construction of smart courts.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart illustrating the steps of a method for generating a legal document based on a knowledge-graph according to an embodiment of the present application;

FIG. 2 is a flow diagram illustrating sub-steps of the legal document generation method of FIG. 1;

FIG. 3 is a schematic diagram of a case knowledge graph provided in the embodiment;

FIG. 4 is a schematic block diagram of a legal document generation apparatus provided in an embodiment of the present application;

FIG. 5 is a schematic block diagram of a sub-module of the legal document generation apparatus of FIG. 4;

FIG. 6 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.

The implementation, functional features and advantages of the objectives of the present application will be further explained with reference to the accompanying drawings.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The flow diagrams depicted in the figures are merely illustrative and do not necessarily include all of the elements and operations/steps, nor do they necessarily have to be performed in the order depicted. For example, some operations/steps may be decomposed, combined or partially combined, so that the actual execution sequence may be changed according to the actual situation. In addition, although the division of the functional blocks is made in the device diagram, in some cases, it may be divided in blocks different from those in the device diagram.

The embodiment of the application provides a legal document generation method, a device, equipment and a medium based on a knowledge graph. The legal document generation method can be applied to terminal equipment or a server, and the terminal equipment can be electronic equipment such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant and wearable equipment. The server may be an independent server, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), and a big data and artificial intelligence platform. The following explanation will be given taking an example in which the legal document generating method is applied to a server.

Some embodiments of the present application will be described in detail below with reference to the accompanying drawings. The embodiments described below and the features of the embodiments can be combined with each other without conflict.

Referring to fig. 1, fig. 1 is a schematic flowchart illustrating steps of a method for generating a legal document based on a knowledge graph according to an embodiment of the present application.

As shown in fig. 1, the legal document generation method includes steps S101 to S106.

S101, acquiring first file data of a historical case, and performing text conversion on the first file data according to a preset format to obtain second file data.

It should be noted that the present application is mainly used for document generation that is not fixed-format and in which template-based generation depends on information extraction, that is, for producing the judgment-result part of a legal document. It can improve the expressive power of semantic models in the judicial field, producing text very close to natural language while using legal terminology, so as to meet the standardization and rigor required of official documents.

The embodiment of the application can acquire and process related data based on an artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a natural language processing technology, machine learning/deep learning and other directions. The legal documents of various types of cases are generated by using the related technical means of artificial intelligence, so that the method is fast and convenient for judicial auxiliary service, the construction of a smart court is accelerated, and the judgment quality and effect are improved.

In one embodiment, the historical cases are, for example, archived cases, including but not limited to civil, criminal and administrative litigation cases. There may be one or more historical cases. The first file data may be the file data of an archived case, and the file data may include a plurality of file materials.

The file data of a civil litigation case includes file materials such as the complaint, the counterclaim, the statement of defense, the power of attorney, the complaint in a civil action incidental to criminal proceedings, the appeal petition, the application for compulsory enforcement, the application for property preservation, the application for advance enforcement, the application for recusal, the application for declaration of a missing person, the application for declaration of death, the application for a payment order, and the retrial petition; the file data of a criminal litigation case includes file materials such as the indictment, the private prosecution complaint, the application for bail pending trial, the protest, and the review petition; the file data of an administrative litigation case includes file materials such as the administrative reconsideration application, the administrative complaint, the administrative litigation statement of defense, the application for administrative compulsory enforcement, the administrative appeal petition, and the administrative retrial petition.

For example, in a civil litigation case, the plaintiff submits a complaint to the court to bring its claims; after the court determines that the plaintiff's claims meet the requirements for case filing, the case is filed; the court obtains litigation materials such as the evidence submitted by the plaintiff; the court then serves a summons on the defendant and obtains litigation materials such as the statement of defense and the evidence submitted by the defendant. The first file data of the civil litigation case can be generated by combining the evidence submitted by the plaintiff, the statement of defense and evidence submitted by the defendant, and other materials and evidence obtained in other ways.

In one embodiment, the historical cases are, for example, concluded dispute cases, which may include, but are not limited to, labor dispute cases, marriage and family dispute cases, company dispute cases and economic dispute cases. There may be one or more historical cases. The first file data may be the file data of a concluded dispute case, and the file data may include a plurality of file materials.

The file data of a labor dispute case includes file materials such as labor contracts, service contracts and labor arbitration applications; the file data of a marriage and family dispute case includes file materials such as prenuptial property agreements, divorce agreements, adoption agreements, bequest and support agreements, and accommodations; the file data of a company dispute case includes file materials such as various company contracts, equity transfer agreements, legal opinions, lawyers' work reports, articles of association, rules and regulations, personnel management measures, labor management measures and asset management measures; the file data of an economic dispute case includes file materials such as sales contracts, transfer contracts, loan contracts, contracting agreements, transportation contracts, lease contracts, gift contracts, operating contracts and partnership agreements.

In one embodiment, the historical file of the historical case is image data. Optical Character Recognition (OCR) is performed on the historical file to obtain text data of the historical case; data cleaning is performed on the text data to obtain the first file data of the historical case; and text conversion is performed on the first file data according to a preset format to obtain the second file data. It should be noted that converting the first file data into the preset format standardizes the resulting second file data, which helps improve the accuracy of the generated legal document.

The first file data is, for example, data in TXT or PDF format and includes a plurality of file materials; for example, the file data of a civil litigation case includes file materials such as the complaint, the counterclaim, the statement of defense and the power of attorney. The second file data of a historical case may include the plurality of file materials after format conversion, which is not specifically limited in this embodiment. It should be noted that data cleaning includes, but is not limited to, removing dirty data unrelated to the original text, such as special symbols and garbled characters. The preset format takes each historical case as the primary key, takes other text information of the historical case, such as a file name, as a secondary key, and takes the content corresponding to the file name as the value of that secondary key.
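
The cleaning and format-conversion step can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names `clean_text` and `to_preset_format`, the choice of which characters count as dirty data, and the sample case identifier are all assumptions made for the example.

```python
import re
from typing import Dict


def clean_text(raw: str) -> str:
    """Remove characters assumed to be dirty data, then tidy whitespace."""
    # Assumption: the Unicode replacement character and ASCII control
    # characters stand in for "garbled characters" in the OCR output.
    text = re.sub(r"[\ufffd\x00-\x08\x0b\x0c\x0e-\x1f]", "", raw)
    # Collapse runs of spaces and tabs left behind by the removal.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


def to_preset_format(case_id: str, materials: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Case as primary key, file name as secondary key, content as value."""
    return {case_id: {name: clean_text(body) for name, body in materials.items()}}


# Hypothetical first file data: file name -> OCR text.
first_file_data = {
    "complaint": "The plaintiff \ufffd requests   repayment of the loan.\x00",
    "statement_of_defense": "  The defendant disputes the interest rate. ",
}
second_file_data = to_preset_format("case-0001", first_file_data)
```

Keeping the case as the outer key means every later step can look up a file material by case and file name without re-parsing the original files.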

S102, acquiring a plurality of trial elements of the historical case from the second file data, and determining the similarity between the trial elements.

The trial elements are obtained by analyzing the second file data, and the similarity between the trial elements is obtained from the semantic similarity between every two trial elements. For example, keyword recognition is performed on the second file data of each case to obtain a plurality of trial elements for that second file data, and the semantic similarity between every two of the trial elements is determined to obtain the similarity between the plurality of trial elements.

In one embodiment, as shown in FIG. 2, step S102 includes sub-steps S1021 to S1023.

Sub-step S1021, classifying the file materials in the second file data based on a preset text classification model to obtain the type information of the file materials.

In one embodiment, the text classification model may be a fastText model, a textCNN model, a charCNN model, a Bi-LSTM model, or a Bi-LSTM + Attention model. Illustratively, the text classification model is a textCNN model, which includes an Embedding layer, a convolution layer, a pooling layer and an output layer. The basic idea is to normalize the data into a uniform format and then process the normalized text as a sequence, so as to determine the type information of the file materials in the second file data.

For example, the Embedding layer generates a word vector for each word in the file material, and the word vectors form a matrix representing a sentence or a text, in which each row is the word vector of one word. It should be noted that, because sentences and texts contain different numbers of words, the matrices representing them have different numbers of rows; each matrix may therefore be truncated or padded to obtain normalized matrix data, and the normalized data is concatenated to obtain a target matrix $X = [x_1, x_2, \ldots, x_n]$, where the text includes $n$ words, $x_i$ is the word vector of the $i$-th word, and $X$ is the target matrix. For text data, the target matrix is thus formed by concatenating the word vectors of a plurality of words, and the convolution layer performs feature extraction over the words one by one, so that the target matrix is expressed as a feature matrix of the text. The pooling layer can use a one-dimensional pooling kernel to perform max pooling on each row of the convolution result to obtain a plurality of pooled features, which are then concatenated into a single pooling result; the purpose of pooling is to extract generalized features and improve the generalization ability of the model. In the output layer, the feature vector of the text or sentence is obtained from the pooling result, and a softmax operation classifies the text. Because the convolution operates over different feature regions, the resulting type information of the file material takes semantic information into account to a certain extent.
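
The forward pass described above (padding or truncation, convolution over word windows, max pooling, softmax) can be sketched in plain Python. This is a toy illustration under stated assumptions: the two-dimensional word vectors, the kernels, the output weights and the ReLU activation are invented for the example, and a real textCNN would learn its parameters by training.

```python
import math
from typing import List

Vector = List[float]


def pad_or_truncate(rows: List[Vector], length: int, dim: int) -> List[Vector]:
    """Normalize a text matrix to exactly `length` rows of `dim` columns."""
    rows = rows[:length]
    return rows + [[0.0] * dim for _ in range(length - len(rows))]


def conv1d(matrix: List[Vector], kernel: List[Vector]) -> Vector:
    """Slide a kernel covering len(kernel) consecutive word vectors."""
    width = len(kernel)
    out = []
    for start in range(len(matrix) - width + 1):
        window = matrix[start:start + width]
        total = sum(w * k
                    for w_row, k_row in zip(window, kernel)
                    for w, k in zip(w_row, k_row))
        out.append(max(0.0, total))  # ReLU activation (an assumption)
    return out


def softmax(scores: Vector) -> Vector:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(matrix, kernels, weights, length, dim) -> Vector:
    x = pad_or_truncate(matrix, length, dim)
    pooled = [max(conv1d(x, k)) for k in kernels]  # max pooling per feature map
    logits = [sum(w * p for w, p in zip(row, pooled)) for row in weights]
    return softmax(logits)


# Hypothetical three-word text with 2-d word vectors, padded to four rows.
text_matrix = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
kernels = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]]
weights = [[1.0, 0.0], [0.0, 0.5]]  # one row of weights per document type
probs = classify(text_matrix, kernels, weights, length=4, dim=2)
```

The index of the largest entry in `probs` would be read as the predicted type of the file material.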

Sub-step S1022, extracting a plurality of keywords from each file material according to a preset keyword extraction model and the type information, and taking each keyword as a trial element.

The keyword extraction models correspond to the type information of the file materials. For the second file data of each historical case, several keyword extraction models can be called, each extracting keywords from the file materials of its corresponding type, to obtain a plurality of trial elements. Trial elements are the elements relevant to trial that business experts have compiled for each cause of action. For example, the trial elements of a financial loan dispute include, but are not limited to, the name of the loan agreement, the loan principal, the outstanding principal, the annual penalty interest rate, guarantee liability, a loan agreement not signed by the party in person, and a loan agreement signed under coercion; the keyword extraction model parses the corresponding elements from the file materials to obtain the plurality of trial elements. It should be noted that assigning each file material to the keyword extraction model matching its type information improves the accuracy with which trial elements are identified, which facilitates accurate construction of the subsequent case knowledge graph and effectively improves model performance.

In one embodiment, the keyword extraction model includes a plurality of set-of-words (SoW) models. A set-of-words model is a set formed by a plurality of words in which each word appears only once, and it can be configured flexibly according to the type information of the file materials and the actual situation. Extracting a plurality of keywords from each file material according to the preset keyword extraction model and the type information comprises: selecting a target set-of-words model from the plurality of set-of-words models according to the type information, and establishing a correspondence between the target set-of-words model and each file material; and inputting each file material into its corresponding target set-of-words model for recognition to obtain a plurality of keywords. The set-of-words models correspond to the type information of the file materials: the target set-of-words model matching the type information is selected, and the keywords it recognizes in the file material serve as trial elements. A set-of-words model can be obtained by training on a preset element dictionary, which business experts compile in advance for each type of case from the trial elements and all other elements relevant to legal documents, such as plaintiff and defendant information and the information of the plaintiff's and defendant's agents. The set-of-words model can accurately match a plurality of words in each file material, and the matched words are the keywords.
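
A set-of-words lookup of this kind can be sketched as below. It is a simplified illustration: the `WORD_SETS` dictionary, its type names and its entries are hypothetical stand-ins for the expert-built element dictionary, and plain substring matching replaces whatever trained recognition the application uses.

```python
from typing import Dict, List, Set

# Hypothetical set-of-words models, one per file-material type.
WORD_SETS: Dict[str, Set[str]] = {
    "loan_contract": {"loan principal", "annual interest rate", "guarantee liability"},
    "complaint": {"plaintiff", "defendant", "claim"},
}


def extract_keywords(material_type: str, text: str) -> List[str]:
    """Select the word set for the material type and return the words found."""
    word_set = WORD_SETS[material_type]  # target set-of-words model
    lowered = text.lower()
    # Each word is in the set once, so each keyword is reported at most once.
    hits = [(lowered.find(word), word) for word in word_set if word in lowered]
    return [word for _, word in sorted(hits)]  # order of first appearance


keywords = extract_keywords(
    "loan_contract",
    "The loan principal is 100,000 yuan at an annual interest rate of 6%.",
)
```

Each returned keyword would then be treated as one trial element of the file material.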

Sub-step S1023, inputting the plurality of trial elements into a preset semantic similarity model for processing, so as to determine the similarity between every two trial elements.

In an embodiment, the semantic similarity models may also correspond to the type information of the file materials. For the second file data of each case, several semantic similarity models can be called, each performing semantic similarity recognition on the trial elements of the file materials of its corresponding type, to obtain the similarity between every two trial elements. Assigning trial elements to the semantic similarity model matching the type information of their file materials improves the accuracy with which the similarity between trial elements is identified, which facilitates accurate construction of the subsequent case knowledge graph and effectively improves model performance.

In one embodiment, the semantic similarity model includes an input layer, a representation layer and a matching layer. The plurality of trial elements are input into the input layer for mapping to obtain a word feature vector for each trial element; the word feature vector of each trial element is input into the representation layer for processing to obtain a low-dimensional semantic vector for each trial element; the low-dimensional semantic vectors are input into the matching layer to calculate the cosine distance between every two of them; and the similarity between every two trial elements is determined according to the cosine distance between their low-dimensional semantic vectors. The semantic similarity model can accurately identify the similarity between trial elements and improve the model performance of the case knowledge graph.

The semantic similarity model comprises a Deep Neural Network (DNN). The deep neural network expresses each trial element as a low-dimensional semantic vector, and the cosine distance between two semantic vectors is calculated to obtain the similarity between the two trial elements. In the training stage of the semantic similarity model, the loss function is minimized by maximum likelihood estimation and the model is made to converge by stochastic gradient descent (SGD), which yields the trained semantic similarity model. The semantic similarity model can be used both to predict the semantic similarity between two trial elements and to obtain the low-dimensional semantic vector of a given trial element.

Illustratively, the deep neural network comprises an input layer, a representation layer and a matching layer. The input layer maps the trial elements into a vector space to obtain their word feature vectors and passes these to the representation layer. The representation layer uses a bag-of-words (BOW) approach, which amounts to discarding the position information of the word vectors of a trial element: all the word vectors of the trial element are placed in one bag without order, and a 128-dimensional low-dimensional semantic vector is output. The matching layer calculates the cosine distance between two 128-dimensional semantic vectors and outputs the matching score between the two trial elements; the loss function of the model is determined from the matching score, and the model parameters are adjusted according to the loss function until the model converges, yielding the semantic similarity model. Because the semantic similarity model does not need an intermediate mapping through an unsupervised model, its precision is high.
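
The three layers can be illustrated with a toy sketch. Everything numeric here is an assumption: the `EMBEDDINGS` table and its three-dimensional vectors are invented, the bag-of-words representation is a plain sum of word vectors rather than a trained 128-dimensional projection, and no training loop is shown.

```python
import math
from typing import Dict, List

# Hypothetical word vectors standing in for the input-layer mapping.
EMBEDDINGS: Dict[str, List[float]] = {
    "loan": [1.0, 0.0, 0.0],
    "principal": [0.8, 0.2, 0.0],
    "interest": [0.6, 0.4, 0.0],
    "guarantee": [0.0, 0.0, 1.0],
}
DIM = 3


def represent(element: str) -> List[float]:
    """Bag-of-words representation: sum the word vectors, ignoring order."""
    vec = [0.0] * DIM
    for word in element.lower().split():
        for i, value in enumerate(EMBEDDINGS.get(word, [0.0] * DIM)):
            vec[i] += value
    return vec


def cosine(a: List[float], b: List[float]) -> float:
    """Matching layer: cosine of the angle between two semantic vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


score_close = cosine(represent("loan principal"), represent("loan interest"))
score_far = cosine(represent("loan principal"), represent("guarantee"))
score_reordered = cosine(represent("loan principal"), represent("principal loan"))
```

Because the representation discards word order, "loan principal" and "principal loan" map to the same vector, which is the practical consequence of the bag-of-words choice described above.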

In one embodiment, a plurality of trial elements of the historical case are acquired from the second file data through a case trial strategy, and the similarity between the trial elements is determined. The case trial strategy comprises a keyword extraction model and a semantic similarity model. The keyword extraction model and the semantic similarity model are called from a preset database; keyword information is extracted from each file material based on the keyword extraction model to obtain a plurality of trial elements; and the similarity between every two trial elements is determined based on the plurality of trial elements and the semantic similarity model. The trial elements and the similarity between them are identified with high accuracy, which effectively improves the model performance of the subsequently established case knowledge graph.

In an embodiment, after the file materials in the second file data are classified based on the preset text classification model to obtain their type information, each file material is parsed based on a preset dictionary to extract a plurality of trial elements, the plaintiff's claims and the defendant's defense. The plaintiff's claims are obtained by parsing the complaint submitted by the plaintiff, and the defendant's defense is obtained by parsing the statement of defense submitted by the defendant or the court hearing transcript. The plaintiff's claims and the defendant's defense are used in the subsequent generation of the legal document, that is, the legal document includes both, so they need to be identified in order to generate it.

S103, taking the plurality of trial elements as nodes and the similarity between the trial elements as the relationships between the nodes, and generating a case knowledge graph, wherein the case knowledge graph comprises the judgment result of the historical case.

The case knowledge graph is constructed according to a preset data model structure (schema), which can be designed according to the actual situation. A case knowledge graph corresponds to a historical case, and there may be one or more case knowledge graphs. The judgment result of the historical case is associated with and included in the case knowledge graph when the graph is constructed, which makes it easy to retrieve the judgment result from the graph later.

As shown in FIG. 3, the nodes in the case knowledge graph include the case number, the plaintiff, the defendant, the plaintiff's agent, the defendant's agent, evidence, the plaintiff's claims, the defendant's defense, evidence elements, and the like, and the relationships among the nodes include subject-indication relationships, inclusion relationships, and the like. It should be noted that the case knowledge graph model takes the case number as its basis and the plaintiff, the defendant, the plaintiff's agent and the defendant's agent as its main framework. The primary ontology is divided into the plaintiff, the defendant, the plaintiff's agent, the defendant's agent and the like, forming the main-party information of the case; the secondary ontology mainly refers to the evidence materials provided by the plaintiff and the defendant, from which the evidence elements of the historical case are obtained and information such as the plaintiff's claims and the defendant's defense is distinguished.

In one embodiment, the plurality of trial elements are used as entities and the similarity between the trial elements is used as the relationship between the entities, and a preset knowledge graph model is iteratively trained until it converges, yielding the case knowledge graph. The convergence condition of the knowledge graph model may be set according to the actual situation, for example: the number of iterations reaches a preset number, the iteration time is greater than or equal to a preset duration, or a parameter of the model after an iteration meets a preset criterion.
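By way of a non-limiting illustration, the three example convergence conditions named above can be checked in one small routine. This is only a sketch under stated assumptions: the names `StopCriteria` and `should_stop` are hypothetical, and a training loss is used as a stand-in for "a parameter of the model", which the embodiment does not specify.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopCriteria:
    # All three thresholds are illustrative defaults, not values from the application.
    max_iterations: int = 100
    max_seconds: float = 60.0
    loss_threshold: float = 1e-3


def should_stop(iteration: int, started_at: float, loss: float,
                criteria: StopCriteria, now: Optional[float] = None) -> Optional[str]:
    """Return the name of the first satisfied stopping condition, or None."""
    now = time.monotonic() if now is None else now
    if iteration >= criteria.max_iterations:
        return "max_iterations"      # iteration count reached the preset number
    if now - started_at >= criteria.max_seconds:
        return "max_seconds"         # iteration time reached the preset duration
    if loss <= criteria.loss_threshold:
        return "loss_threshold"      # assumed "model parameter meets a preset criterion"
    return None
```

A training loop would call `should_stop` once per iteration and exit as soon as it returns a non-`None` value.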

In one embodiment, there may be multiple case knowledge graphs, with each historical case corresponding to one case knowledge graph. After the case knowledge graphs are generated, a first trial element and a second trial element are determined for each case knowledge graph, and index information for that case knowledge graph is generated from the first trial element and the second trial element. The key information of a case is in many cases not a single word but a sentence or a passage of text. The first and second trial elements are therefore extracted to stand in for this key information as index information, which makes the case knowledge graphs easier to search. For example, the first trial element is the defendant and the second trial element is the plaintiff.
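The index information described above can be pictured as a lookup table keyed by the pair of first and second trial elements. The following is a minimal sketch, assuming each case knowledge graph is reduced to a dictionary of element names and values; the function name `build_index`, the field names, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

IndexKey = Tuple[Optional[str], Optional[str]]


def build_index(case_graphs: Dict[str, Dict[str, str]],
                first_key: str = "defendant",
                second_key: str = "plaintiff") -> Dict[IndexKey, List[str]]:
    """Map (first trial element, second trial element) to the matching case numbers."""
    index: Dict[IndexKey, List[str]] = defaultdict(list)
    for case_number, elements in case_graphs.items():
        key = (elements.get(first_key), elements.get(second_key))
        index[key].append(case_number)
    return dict(index)


# Hypothetical sample data, one simplified graph per historical case.
case_graphs = {
    "case-001": {"defendant": "Company A", "plaintiff": "Zhang San", "evidence": "loan contract"},
    "case-002": {"defendant": "Company B", "plaintiff": "Li Si", "evidence": "IOU"},
    "case-003": {"defendant": "Company A", "plaintiff": "Zhang San", "evidence": "bank transfer record"},
}
index = build_index(case_graphs)
```

Looking up `index[("Company A", "Zhang San")]` then returns the case numbers whose graphs share that defendant and plaintiff.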

In one embodiment, each historical case corresponds to one case knowledge graph, and the case knowledge graphs of the historical cases are spliced together as sub-case knowledge graphs to obtain an updated case knowledge graph. The splicing uses a preset trial element as the target node; for example, when the preset trial element is the case number, the case knowledge graphs are spliced at their case-number nodes. The updated case knowledge graph includes the judgment results of the historical cases.
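One way to picture the splicing is as a union of triples in which each sub-case knowledge graph stays reachable through its case-number node. This is a sketch only: representing a graph as `(head, relation, tail)` triples, the function name `splice`, and the relation names are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


def splice(sub_graphs: Dict[str, List[Triple]]) -> Dict[str, object]:
    """Merge per-case triples into one graph, keeping each case number as an entry node."""
    nodes: Set[str] = set()
    edges: Set[Triple] = set()
    roots: Dict[str, Set[Triple]] = {}
    for case_number, triples in sub_graphs.items():
        roots[case_number] = set(triples)   # sub-graph retrievable by its case number
        for head, relation, tail in triples:
            nodes.update((head, tail))      # shared entities collapse into one node
            edges.add((head, relation, tail))
    return {"nodes": nodes, "edges": edges, "roots": roots}


# Hypothetical sub-case knowledge graphs for two historical cases.
sub_graphs = {
    "case-001": [("case-001", "has_plaintiff", "Zhang San"),
                 ("case-001", "has_defendant", "Company A")],
    "case-002": [("case-002", "has_plaintiff", "Li Si"),
                 ("case-002", "has_defendant", "Company A")],
}
merged = splice(sub_graphs)
```

In this example the two cases share the node `Company A`, so the merged graph has five nodes and four edges, and each sub-case knowledge graph can still be recovered from `merged["roots"]`.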

Step S104, obtaining the file data of the case to be processed, extracting a plurality of target trial elements from the file data, and determining the similarity between the target trial elements.

The file data of the case to be processed includes the evidence materials submitted by the plaintiff and materials such as the court trial transcripts generated during the trial. A plurality of target trial elements are extracted from the file data, and the similarity between the target trial elements is determined.

In one embodiment, after the file data of the case to be processed is obtained, the file data is input into a preset keyword extraction model, so as to extract a plurality of target trial elements from the file data. It should be noted that, the keyword extraction model may be obtained according to the corresponding embodiment of the foregoing sub-step S1022, and the keyword extraction model analyzes the corresponding elements in the file material to obtain a plurality of trial elements. A plurality of target trial elements can be accurately extracted from the file data through the keyword extraction model.

Illustratively, a target word set model is selected from the plurality of word set models according to the type information of the file data, the file data is input into the target word set model for recognition, and the plurality of keywords obtained are used as the target trial elements.
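The selection of a target word set model by type information can be sketched as a dictionary lookup followed by term matching. This is a simplified illustration, assuming each word set model is nothing more than a set of known terms; the application does not describe the internals of the models, and the names `WORD_SET_MODELS` and `extract_target_elements` as well as the sample terms are hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical word set models, one per file material type.
WORD_SET_MODELS: Dict[str, Set[str]] = {
    "complaint": {"loan", "interest", "repayment"},
    "trial_transcript": {"cross-examination", "objection", "evidence"},
}


def extract_target_elements(text: str, material_type: str,
                            models: Dict[str, Set[str]] = WORD_SET_MODELS) -> List[str]:
    """Pick the word set for the material type and return the terms found in the text."""
    word_set = models[material_type]   # raises KeyError for an unknown type
    lowered = text.lower()
    return sorted(word for word in word_set if word in lowered)


elements = extract_target_elements(
    "The plaintiff demands repayment of the loan with interest.", "complaint")
```

Here `elements` holds the keywords recognised in the complaint, which would then serve as target trial elements.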

In one embodiment, after the plurality of target trial elements are extracted from the file data, they are input into a preset semantic similarity model for processing so as to determine the similarity between the target trial elements. It should be noted that the semantic similarity model may be obtained as described in the embodiment corresponding to the foregoing sub-step S1023 and may include an input layer, a representation layer, and a matching layer. The semantic similarity model allows the similarity between the target trial elements to be determined accurately.

Step S105, determining a target judgment result corresponding to the case to be processed from the case knowledge graph according to the plurality of target trial elements and the similarity between the target trial elements.

In one embodiment, there are multiple case knowledge graphs. Using the plurality of target trial elements and the similarity between the target trial elements as a search index, a matching case knowledge graph is determined from the multiple case knowledge graphs; the judgment result of the historical case corresponding to the matching case knowledge graph is obtained; and that judgment result is taken as the target judgment result corresponding to the case to be processed. The judgment result can include the dispute focus, the legal basis, and the judgment opinion of the historical case, where the dispute focus is the issue on which the two parties disagree regarding a particular fact, procedure, or piece of evidence.

In one embodiment, the case knowledge graph is obtained by splicing together sub-case knowledge graphs corresponding to a plurality of historical cases. Using the plurality of target trial elements and the similarity between the target trial elements as a search index, a matching sub-case knowledge graph is determined from the case knowledge graph; the judgment result of the historical case corresponding to the matching sub-case knowledge graph is obtained; and that judgment result is taken as the target judgment result corresponding to the case to be processed. The matching sub-case knowledge graph may be the sub-case knowledge graph with the highest degree of matching with the plurality of target trial elements and the similarities between them, which is not specifically limited in this embodiment.
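The embodiment leaves the matching measure open. One possible measure, given purely as an assumption, combines the overlap of trial elements (Jaccard index) with how closely the pairwise similarities agree on the element pairs that both sides share. The names `match_score` and `find_target_judgment`, the graph layout, and the sample data below are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = FrozenSet[str]


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_score(target_elements: Set[str], target_sims: Dict[Pair, float],
                graph: dict) -> float:
    """Element overlap, discounted by the mean similarity gap on shared pairs."""
    element_score = jaccard(set(target_elements), set(graph["elements"]))
    shared = target_sims.keys() & graph["similarities"].keys()
    if not shared:
        return element_score
    gap = sum(abs(target_sims[p] - graph["similarities"][p]) for p in shared) / len(shared)
    return element_score * (1.0 - gap)   # assumes similarities lie in [0, 1]


def find_target_judgment(target_elements: Set[str], target_sims: Dict[Pair, float],
                         graphs: List[dict]) -> Tuple[str, str]:
    """Return the case number and judgment result of the best-matching sub-graph."""
    best = max(graphs, key=lambda g: match_score(target_elements, target_sims, g))
    return best["case_number"], best["judgment"]


graphs = [
    {"case_number": "case-001", "elements": {"loan", "interest", "guarantee"},
     "similarities": {frozenset(("loan", "interest")): 0.8},
     "judgment": "repay principal and interest"},
    {"case_number": "case-002", "elements": {"lease", "deposit"},
     "similarities": {frozenset(("lease", "deposit")): 0.7},
     "judgment": "return deposit"},
]
target = find_target_judgment({"loan", "interest"},
                              {frozenset(("loan", "interest")): 0.75}, graphs)
```

With this sample data the loan case is selected, since it shares two of three elements with the case to be processed and its stored similarity for the shared pair is close to the target value.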

Step S106, generating a legal document of the case to be processed according to the target judgment result and the file data.

Legal documents are documents used by judicial authorities, parties, lawyers, and the like to handle litigation and non-litigation matters, and also include the non-normative documents of judicial authorities. Legal documents include, but are not limited to, judgment documents, which mainly contain subject information, trial process information, judgment results, and the like. The legal document of the case to be processed is generated according to the target judgment result and the file data. After the legal document is generated, it can be pushed to the judge as a reference for the case, and the judge can adjust the judgment opinion in the legal document and modify its content. Because the judgment result of the case is generated intelligently through the case knowledge graph, the legal document of the case can be generated quickly and accurately, and the time spent writing procedural elements is reduced, which improves adjudication efficiency.

In one embodiment, pleading and defense record data of the parties to the case is obtained from the file data; a case theory template for the case to be processed is determined according to the pleading and defense record data; and the legal document of the case to be processed is generated according to the case theory template, the plurality of target trial elements, and the target judgment result. The pleading and defense record data includes the plaintiff's claims and the defendant's defense. The defendant's defense arises during the handling of the case and can be determined from whether the defendant or the defendant's agent has submitted a written answer or has answered orally at the court hearing, so the defendant's defense includes information such as the written answer and the record of the oral answer. Therefore, by using the type labels of the file materials in the file data, or by identifying the type information of the file materials with a preset text classification model, the plaintiff's claims and the defendant's defense can be accurately obtained from the file data, and the pleading and defense record data of the parties can thus be obtained.

In one embodiment, pleading and defense record data of the parties to the case is obtained from the file data; dispute focus information of the case to be processed is determined according to the pleading and defense record data; a case theory template corresponding to the dispute focus information is determined; and the legal document of the case to be processed is generated according to the case theory template, the plurality of target trial elements, and the target judgment result. Each dispute focus corresponds to one case theory template; the case theory template includes judgment bases such as legal provisions and can be retrieved from a database. The plurality of target trial elements are filled into the case theory template to obtain the case theory result of the case to be processed, and the judgment reasoning result in the target judgment result is replaced with the case theory result to obtain the judgment result of the case to be processed.

It should be noted that the dispute focus information may be obtained by inputting the pleading and defense record data into a dispute focus mapping model for processing. The dispute focus mapping model may be a neural network model whose training data is a large number of court trial transcripts together with the corresponding dispute focuses labeled by domain experts. The dispute focus mapping model may be iteratively trained on this data in advance and then stored.

In one embodiment, generating the legal document of the case to be processed according to the case theory template, the plurality of target trial elements, and the target judgment result comprises: generating a case theory result of the case to be processed according to the case theory template and the plurality of target trial elements; determining the judgment reasoning result in the target judgment result; and replacing the judgment reasoning result in the target judgment result with the case theory result so as to obtain the legal document of the case to be processed. The case theory template can be set according to the type of legal document to be generated. The plurality of target trial elements are filled into the corresponding positions of the case theory template to obtain the case theory result, and the judgment reasoning result in the target judgment result is replaced with the case theory result, so that the judgment result of the case to be processed is obtained. The corresponding typesetting information is then determined according to the document type of the legal document, and the judgment result of the case to be processed is typeset according to the typesetting information, so that legal documents for various types of cases can be generated quickly and accurately.
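The fill-and-replace step can be sketched with the placeholder substitution in Python's standard `string.Template`. This is an illustration under assumptions: the template wording, the dictionary layout of the judgment result (including the key `reasoning`), and the function names `generate_case_theory` and `replace_reasoning` are hypothetical, and the bracketed values are placeholders rather than real case content.

```python
from string import Template
from typing import Dict


def generate_case_theory(template_text: str, elements: Dict[str, str]) -> str:
    """Fill the target trial elements into the placeholders of the case theory template."""
    return Template(template_text).substitute(elements)


def replace_reasoning(judgment: Dict[str, str], case_theory: str) -> Dict[str, str]:
    """Return a copy of the target judgment result with its reasoning part replaced."""
    updated = dict(judgment)
    updated["reasoning"] = case_theory
    return updated


template_text = ("The court finds that $plaintiff lent $amount to $defendant, "
                 "who failed to repay it on time.")
elements = {"plaintiff": "Zhang San", "defendant": "Company A", "amount": "CNY 100,000"}
historical_judgment = {
    "dispute_focus": "whether the loan was repaid",
    "legal_basis": "(cited provision)",
    "reasoning": "(reasoning from the historical case)",
}
result = replace_reasoning(historical_judgment,
                           generate_case_theory(template_text, elements))
```

`Template.substitute` raises `KeyError` when a placeholder has no matching trial element, which makes an incompletely filled template fail visibly instead of producing a document with gaps.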

In one embodiment, it is determined whether the file data includes pleading and defense record data of the parties to the case. If the file data does not include such data, subject information and trial process information are obtained from the file data, and the legal document of the case to be processed is generated according to the subject information, the trial process information, and the target judgment result. The subject information includes plaintiff and defendant information, agent lawyer information, and the like; the trial process information includes the case facts, the cause of action, information on the course of the case, and the like; and the target judgment result includes the dispute focus, the legal basis, the judgment opinion, and the like.

It should be noted that a case theory result of the case to be processed is generated according to the case theory template and the plurality of target trial elements; the judgment reasoning result in the target judgment result is determined; the judgment reasoning result in the target judgment result is replaced with the case theory result to obtain the judgment result of the case to be processed; and the legal document of the case to be processed is generated according to the subject information, the trial process information, and the judgment result of the case to be processed. This improves the efficiency of adjudication and supports the construction of smart courts.

In one embodiment, the document type of the legal document to be generated is determined; the corresponding typesetting information is determined according to the document type; and the judgment result of the case to be processed, together with the subject information and the trial process information in the file data, is typeset according to the typesetting information to obtain the legal document. Because the judgment result of the case to be processed is generated intelligently through the case knowledge graph, the legal document of the case can be generated quickly and accurately, and the time spent writing procedural elements is reduced, which improves adjudication efficiency.
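Typesetting by document type can be pictured as choosing a section order per type and assembling the sections accordingly. This is a minimal sketch: the document type names, the section keys, and the function name `typeset` are assumptions, and real typesetting information would also cover fonts, headings, and similar layout details that the application does not enumerate.

```python
from typing import Dict, List

# Hypothetical typesetting information: the section order for each document type.
LAYOUTS: Dict[str, List[str]] = {
    "civil_judgment": ["subject_info", "trial_process", "judgment_result"],
    "mediation_statement": ["subject_info", "judgment_result"],
}


def typeset(document_type: str, sections: Dict[str, str],
            layouts: Dict[str, List[str]] = LAYOUTS) -> str:
    """Assemble the sections in the order prescribed for the document type."""
    order = layouts[document_type]   # raises KeyError for an unknown document type
    return "\n\n".join(sections[name] for name in order)


sections = {
    "subject_info": "Plaintiff: Zhang San. Defendant: Company A.",
    "trial_process": "The case was heard in open court.",
    "judgment_result": "The defendant shall repay the loan.",
}
document = typeset("civil_judgment", sections)
```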

According to the knowledge graph-based legal document generation method provided by the above embodiment, first file data of a historical case is obtained, and text conversion is performed on the first file data according to a preset format to obtain second file data; a plurality of trial elements of the historical case are obtained from the second file data, and the similarity between the trial elements is determined; the plurality of trial elements are taken as nodes and the similarity between the trial elements is taken as the relationship between the nodes to generate a case knowledge graph, which includes the judgment results of the historical cases; file data of a case to be processed is obtained, a plurality of target trial elements are extracted from the file data, and the similarity between the target trial elements is determined; a target judgment result corresponding to the case to be processed is determined from the case knowledge graph according to the plurality of target trial elements and the similarity between the target trial elements; and the legal document of the case to be processed is generated according to the target judgment result and the file data. Legal documents for all types of cases are generated quickly and accurately through the case knowledge graph, and the time spent writing procedural elements is reduced, so that adjudication efficiency is improved and the construction of smart courts is accelerated.

It should be noted that, in order to further ensure the privacy and security of related information such as the above-mentioned file data and legal documents, this information may also be stored in a node of a blockchain, and the technical solution of the present application is likewise applicable to other data files stored on the blockchain. The blockchain referred to in the present application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, in which each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
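The linking of data blocks by a cryptographic method can be illustrated with a toy hash chain built on SHA-256 from Python's standard `hashlib`. This sketch shows only the tamper-evidence property described above; it omits distribution, point-to-point transmission, and consensus, and the block layout and the function names `make_block` and `is_valid` are assumptions.

```python
import hashlib
import json
from typing import Dict, List


def make_block(records: List[str], previous_hash: str) -> Dict[str, object]:
    """Create a block whose hash covers its records and the previous block's hash."""
    payload = json.dumps({"records": records, "previous_hash": previous_hash},
                         sort_keys=True)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"records": records, "previous_hash": previous_hash, "hash": digest}


def is_valid(chain: List[Dict[str, object]]) -> bool:
    """Check every block's own hash and its link to the preceding block."""
    for i, block in enumerate(chain):
        expected = make_block(block["records"], block["previous_hash"])["hash"]
        if block["hash"] != expected:
            return False   # block contents were altered after hashing
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False   # link to the previous block is broken
    return True


genesis = make_block(["digest of file data"], "0" * 64)
second = make_block(["digest of legal document"], genesis["hash"])
chain = [genesis, second]
```

Changing the records of any stored block without recomputing every later hash causes `is_valid` to return `False`, which is the sense in which the chained blocks verify the validity of the stored information.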

Referring to fig. 4, fig. 4 is a schematic block diagram of a legal document generating apparatus according to an embodiment of the present application.

As shown in fig. 4, the legal document generating apparatus 200 includes: an acquisition module 201, a processing module 202, a training module 203, an extraction module 204, a determination module 205, and a generation module 206.

The acquisition module 201 is configured to acquire first file data of a historical case, and perform text conversion on the first file data according to a preset format to obtain second file data;

the processing module 202 is configured to obtain a plurality of trial elements of the historical case from the second file data, and determine the similarity between the trial elements;

the training module 203 is configured to use the plurality of trial elements as nodes, use the similarity between the trial elements as the relationship between the nodes, and generate a case knowledge graph, where the case knowledge graph includes the judgment results of the historical cases;

the extraction module 204 is configured to obtain file data of a case to be processed, extract a plurality of target trial elements from the file data, and determine the similarity between the target trial elements;

the determination module 205 is configured to determine, according to the plurality of target trial elements and the similarity between the target trial elements, a target judgment result corresponding to the case to be processed from the case knowledge graph;

the generation module 206 is configured to generate the legal document of the case to be processed according to the target judgment result and the file data.

In one embodiment, as shown in FIG. 5, the processing module 202 includes:

the classification submodule 2021 is configured to classify the file materials in the second file data based on a preset text classification model, so as to obtain type information of the file materials;

the extraction submodule 2022 is configured to extract a plurality of keywords from each of the file materials according to a preset keyword extraction model and the type information, and use each of the keywords as a trial element;

the processing submodule 2023 is configured to input the trial elements into a preset semantic similarity model for processing, so as to determine a similarity between every two trial elements.

In one embodiment, the keyword extraction model comprises a plurality of word set models; the extraction submodule 2022 is also used to:

selecting a target word set model from the plurality of word set models according to the type information, and establishing a corresponding relation between the target word set model and each file material;

and inputting each file material into the corresponding target word set model for identification to obtain a plurality of keywords.

In one embodiment, the semantic similarity model includes an input layer, a representation layer, and a matching layer; the processing sub-module 2023 is further configured to:

inputting the plurality of trial elements into the input layer for mapping so as to obtain a word feature vector of each trial element;

inputting the word feature vector of each trial element into the representation layer for processing to obtain a low-dimensional semantic vector of each trial element;

inputting the low-dimensional semantic vector of each trial element into the matching layer for calculation to obtain the cosine distance between every two low-dimensional semantic vectors;

and determining the similarity between every two trial elements according to the cosine distance between every two low-dimensional semantic vectors.
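The data flow through the three layers can be sketched as follows. The stand-ins used here are untrained and are assumptions made only to show the pipeline: the input layer hashes character trigrams into a fixed-size count vector, the representation layer folds that vector into fewer dimensions where a trained network would be used in practice, and the matching layer computes cosine similarity (the cosine distance is one minus this value). All names and dimensions are hypothetical.

```python
import math
import zlib
from itertools import combinations
from typing import Dict, List, Tuple

INPUT_DIM = 64   # size of the word feature vector (illustrative)
LOW_DIM = 8      # size of the low-dimensional semantic vector (illustrative)


def input_layer(text: str) -> List[float]:
    """Map a trial element to a word feature vector of hashed character trigrams."""
    vec = [0.0] * INPUT_DIM
    padded = f"#{text.lower()}#"
    for i in range(len(padded) - 2):
        bucket = zlib.crc32(padded[i:i + 3].encode("utf-8")) % INPUT_DIM
        vec[bucket] += 1.0
    return vec


def representation_layer(vec: List[float]) -> List[float]:
    """Untrained stand-in for the learned projection to a low-dimensional vector."""
    low = [0.0] * LOW_DIM
    for i, value in enumerate(vec):
        low[i % LOW_DIM] += value
    return low


def matching_layer(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pairwise_similarity(elements: List[str]) -> Dict[Tuple[str, str], float]:
    """Similarity between every two trial elements."""
    vectors = {e: representation_layer(input_layer(e)) for e in elements}
    return {(a, b): matching_layer(vectors[a], vectors[b])
            for a, b in combinations(elements, 2)}


sims = pairwise_similarity(["loan contract", "loan agreement", "lease"])
```

Because the representation layer here is not trained, the resulting scores reflect surface overlap rather than meaning; the sketch is meant to show the shape of the computation, not its accuracy.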

In one embodiment, the generation module 206 is further configured to:

acquiring pleading and defense record data of the parties to the case from the file data;

determining dispute focus information of the case to be processed according to the pleading and defense record data;

determining a case theory template corresponding to the dispute focus information;

and generating the legal document of the case to be processed according to the case theory template, the plurality of target trial elements and the target judgment result.

In one embodiment, the generation module 206 is further configured to:

generating a case theory result of the case to be processed according to the case theory template and the plurality of target trial elements;

determining the judgment reasoning result in the target judgment result;

and replacing the judgment reasoning result in the target judgment result with the case theory result so as to obtain the legal document of the case to be processed.

In one embodiment, the generation module 206 is further configured to:

if the file data does not include pleading and defense record data of the parties to the case, acquiring subject information and trial process information from the file data;

and generating the legal document of the case to be processed according to the subject information, the trial process information and the target judgment result.

It should be noted that, as will be clear to those skilled in the art, for convenience and brevity of description, the specific working processes of the above-described apparatus and each module and unit may refer to the corresponding processes in the foregoing legal document generation method embodiment, and are not described herein again.

The apparatus provided by the above embodiments may be implemented in the form of a computer program, which can be run on a computer device as shown in fig. 6.

Referring to fig. 6, fig. 6 is a schematic block diagram illustrating a structure of a computer device according to an embodiment of the present disclosure. The computer device may be a server or a terminal device.

As shown in fig. 6, the computer device includes a processor, a memory and a network interface connected by a system bus, wherein the memory may include a storage medium and an internal memory, and the storage medium may be nonvolatile or volatile.

The storage medium may store an operating system and a computer program. The computer program includes program instructions that, when executed, cause a processor to perform any one of the methods of knowledge-graph based legal document generation.

The processor is used for providing calculation and control capability and supporting the operation of the whole computer equipment.

The internal memory provides an environment for the execution of a computer program on a storage medium, which when executed by a processor, causes the processor to perform any one of the methods for generating a legal document based on a knowledge-graph.

The network interface is used for network communication, such as sending assigned tasks and the like. Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or fewer components than those shown, or may combine certain components, or have a different arrangement of components.

It should be understood that the Processor may be a Central Processing Unit (CPU), and the Processor may be other general purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, etc. Wherein a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.

Wherein, in one embodiment, the processor is configured to execute a computer program stored in the memory to implement the steps of:

In one embodiment, the processor, when implementing the obtaining of the plurality of trial elements of the historical case from the second file data and determining the similarity between the trial elements, is configured to implement:

classifying the file materials in the second file data based on a preset text classification model to obtain the type information of the file materials;

extracting a plurality of keywords from each file material according to a preset keyword extraction model and the type information, and taking each keyword as a trial element;

and inputting the plurality of trial elements into a preset semantic similarity model for processing so as to determine the similarity between every two trial elements.

In one embodiment, the keyword extraction model comprises a plurality of word set models; the processor is configured to, when extracting a plurality of keywords from each of the file materials according to a preset keyword extraction model and the type information, perform:

In one embodiment, the semantic similarity model includes an input layer, a representation layer, and a matching layer; the processor is configured to, when inputting the plurality of trial elements into the semantic similarity model for processing so as to determine the similarity between every two trial elements, implement:

In one embodiment, the processor is configured to, when generating the legal document of the case to be processed according to the target judgment result and the file data, implement:

acquiring pleading and defense record data of the parties to the case from the file data;

In one embodiment, the processor is configured to, when generating the legal document of the case to be processed according to the case theory template, the plurality of target trial elements and the target judgment result, implement:

determining the judgment reasoning result in the target judgment result;

In one embodiment, the processor is further configured to implement:

It should be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the computer device described above may refer to the corresponding process in the foregoing embodiment of the method for generating a legal document based on a knowledge graph, and details are not described here again.

The application is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

Embodiments of the present application further provide a computer-readable storage medium on which a computer program is stored, where the computer program includes program instructions, and for the method implemented when the program instructions are executed, reference may be made to the various embodiments of the knowledge graph-based legal document generation method of the present application.

The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device.

It is to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. While the invention has been described with reference to specific embodiments, the scope of the invention is not limited thereto, and those skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A legal document generation method based on a knowledge graph is characterized by comprising the following steps:

2. The legal document generation method of claim 1, wherein said obtaining a plurality of trial elements of said historical case from said second file data and determining similarity between each of said trial elements comprises:

3. The legal document generation method of claim 2, wherein the keyword extraction model comprises a plurality of word set models; the extracting a plurality of keywords from each of the file materials according to a preset keyword extraction model and the type information includes:

4. The legal document generation method of claim 2, wherein the semantic similarity model comprises an input layer, a representation layer, and a matching layer; the inputting the plurality of trial elements into the semantic similarity model for processing to determine the similarity between every two trial elements includes:

5. The legal document generation method of any one of claims 1-4, wherein the generating the legal document of the case to be processed according to the target judgment result and the file data comprises:

acquiring pleading and defense record data of the parties to the case from the file data;

6. The legal document generation method of claim 5, wherein said generating the legal document of the case to be processed according to the case theory template, the plurality of target trial elements and the target judgment result comprises:

determining the judgment reasoning result in the target judgment result;

7. The legal document generation method of claim 5, further comprising:

8. A legal document generation apparatus, comprising:

9. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, performs the steps of the legal document generation method of any one of claims 1 to 7.

10. A computer-readable storage medium, having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the legal document generation method of any one of claims 1 to 7.