CN116484010B - Knowledge graph construction method and device, storage medium and electronic device
- Publication number
- CN116484010B (application CN202310247469.4A)
- Authority
- CN
- China
- Prior art keywords
- referee
- training
- judge
- knowledge graph
- gist
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/18—Legal services
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application discloses a knowledge graph construction method and device for generating a referee gist, a storage medium and an electronic device. The method comprises: receiving a training referee document and a target referee document; training a knowledge graph construction model using the training referee document; and inputting the target referee document into the knowledge graph construction model to extract the entities of the target referee document and the relationships between entities, and constructing a referee reasoning knowledge graph based on the entities and the relationships between entities. Because the court's reasoning process is mapped onto the structure of the judicial syllogism, the fact-finding reasons can be constrained so that useless referee gists are filtered out effectively, which improves the reference value of the referee gist. The method solves the technical problem that the finally generated referee gist has low reference value because the referee reasoning process is not considered and the fact-finding reasons cannot be constrained to filter out useless referee gists.
Description
Technical Field
The present invention relates to the field of legal document processing, and in particular to a knowledge graph construction method and device for generating a referee gist, a storage medium, and an electronic device.
Background
At present, China Judgements Online hosts more than one hundred million referee documents, and the number grows by more than ten thousand per day. Because referee documents are logically complex, fully understanding a single one often takes tens of minutes or even several hours. How to help legal practitioners acquire knowledge quickly from massive referee document data has therefore attracted wide attention from experts and scholars in China and abroad. A referee gist is a short description of the key information in a referee document; it helps legal practitioners grasp that information quickly and has become an important and convenient aid to understanding such documents. The invention therefore provides a method for generating the gist of a civil referee document based on a referee reasoning knowledge graph, so as to reduce the time legal practitioners spend reading referee documents.
Generating the gist of a civil judgment can be regarded as a domain-specific text summarization task, whose aim is to produce the referee gist accurately and concisely while ensuring that no key information of the referee document is lost. Text summarization techniques fall mainly into extractive and abstractive methods. An extractive method identifies and concatenates relevant words from the original text, whereas an abstractive method tries to express the main content concisely, possibly using words that do not appear in the original text. Early studies explored a variety of approaches, including manually designed rules, syntax tree pruning, and statistical machine translation techniques.
In the referee gist generation scenario, the summary must be written from the original text in a specific format, which simplifies the summarization requirements and can improve summary quality. Template-based automatic summarization currently falls into two types. The first is hard-template-based summarization, which generates the summary from a fixed framework by filling in or editing it, so the template is the main body of the generated summary. The second is soft-template-based summarization, which retrieves texts similar to the input text from a related knowledge base and uses their summaries as soft templates to assist generation; here the soft template is not the main body of the generated text. Compared with the hard-template method, summaries generated with soft templates are more flexible, are not constrained by the template, and offer better readability and key-point extraction.
Cao et al. were the first to use soft templates to assist summary generation. Inspired by traditional hard-template-based summarization models, they used existing summaries as "soft templates" to guide generation. Their soft-template-based model works in three steps: retrieve, rerank and rewrite. Because a soft template can by itself already attain a certain score, soft-template-based summaries are more stable and readable. Building on Cao et al., Wang et al. optimized the use of templates: they first accelerated the selection of the best template with a fast reranking method, improving overall training and inference speed, and then extracted key information from the template through a bidirectional selection layer to better assist summary generation. Gao et al. observed that existing template-based summarization models target short-text datasets and are not applicable to long-text datasets, and that existing methods tend to copy non-template words from the template, such as facts and entities specific to that template. They therefore guide the formation of the final summary by separating the prototype summary from the prototype facts and performing multiple rounds of polishing.
However, the above techniques do not consider the referee reasoning process, so the fact-finding reasons cannot be constrained to filter out useless referee gists effectively, and the finally generated referee gist has low reference value.
For this problem in the related art, namely that the finally generated referee gist has low reference value because the referee reasoning process is not considered and the fact-finding reasons cannot be constrained to filter out useless referee gists, no effective solution has yet been proposed.
Disclosure of Invention
The main purpose of the present application is to provide a knowledge graph construction method and device for generating a referee gist, a storage medium and an electronic device, so as to solve the problem that the finally generated referee gist has low reference value because the referee reasoning process is not considered and the fact-finding reasons cannot be constrained to filter out useless referee gists effectively.
In order to achieve the above object, according to one aspect of the present application, there is provided a knowledge graph construction method for generating a referee gist.
The knowledge graph construction method for generating a referee gist according to the present application comprises: receiving a training referee document and a target referee document; training a knowledge graph construction model using the training referee document; and inputting the target referee document into the knowledge graph construction model to extract the entities of the target referee document and the relationships between entities, and constructing a referee reasoning knowledge graph based on the entities and the relationships between entities.
Further, before receiving the training referee document and the target referee document, the method further comprises:
collecting referee documents from an open database and preprocessing them;
dividing each referee document into party information, fact description, court view and judgment result by using a regular-expression-based rule parsing engine;
retaining the referee documents whose fact description exceeds a preset token threshold;
dividing the retained referee documents into training data and target referee documents;
and manually labeling the training data to obtain the training referee documents.
Further, training the knowledge graph construction model using the training referee document comprises:
parsing the training referee document into a character sequence as input, and converting the character sequence into a sequence of low-dimensional real-valued vectors through a character vector matrix;
encoding the sequence with a large-scale language model to obtain semantic representation vectors;
using a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector of each character into the corresponding entity label;
and using a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a probability matrix over relations.
Further, training the knowledge graph construction model using the training referee document further comprises:
adopting a contrastive loss function and training on the training referee document to obtain the knowledge graph construction model;
the contrastive loss function is:
where $f(x)$ denotes the semantic representation vector of the target character, $f(x^{+})$ denotes the semantic representation vector of the positive sample, and $f(x^{-})$ denotes the semantic representation vector of the negative sample.
Further, after inputting the target referee document into the knowledge graph construction model to extract the entities of the target referee document and the relationships between entities, and constructing the referee reasoning knowledge graph based on the entities and the relationships between entities, the method further comprises:
training a referee gist generation model using the training referee document;
and inputting the referee reasoning knowledge graph into the referee gist generation model to generate a referee gist.
Further, inputting the referee reasoning knowledge graph into the referee gist generation model to generate the referee gist comprises:
calculating the text similarity, semantic similarity and structural similarity between the target referee document and the format templates, and taking the three format templates with the highest text similarity, semantic similarity and structural similarity as candidate templates;
calculating and ranking the true similarity and the predicted similarity of the three candidate templates, and selecting one of the three candidate templates as the soft template according to the ranking result;
and generating the referee gist according to the referee reasoning knowledge graph and the soft template.
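The selection steps above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the three similarity measures are combined, so the sketch assumes one candidate is taken per measure, and `jaccard`, `select_candidates` and `choose_soft_template` are hypothetical names. At inference time only the predicted similarity would be available, so the final choice here ranks by predicted score alone.

```python
def jaccard(a, b):
    """Character-level Jaccard overlap, a stand-in for a text similarity measure."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def select_candidates(document, templates, measures):
    """Assumption: for each similarity measure, keep the template that scores highest."""
    candidates = []
    for measure in measures:
        best = max(templates, key=lambda t: measure(document, t))
        candidates.append(best)
    return candidates


def choose_soft_template(candidates, predicted_scores):
    """Rank candidates by predicted similarity and return the top one."""
    ranked = sorted(zip(predicted_scores, candidates), key=lambda p: p[0], reverse=True)
    return ranked[0][1]


# Toy stand-ins for the semantic and structural measures (hypothetical).
length_similarity = lambda d, t: -abs(len(d) - len(t))
same_start = lambda d, t: float(d[:1] == t[:1])

templates = ["abc", "xyz", "abxy"]
candidates = select_candidates("abcd", templates, [jaccard, length_similarity, same_start])
soft_template = choose_soft_template(candidates, [0.2, 0.9, 0.1])
```

In a real system the three measures and the predicted scores would come from trained models; the plain functions here only make the control flow concrete.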
Further, calculating and ranking the true similarity and the predicted similarity of the three candidate templates, and selecting one of the three candidate templates as the soft template according to the ranking result, comprises calculating the cross-entropy loss between the true similarity and the predicted similarity:
generating the referee gist according to the referee reasoning knowledge graph and the soft template comprises minimizing the negative log-likelihood of the summary prediction probability:
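The two formulas are not reproduced in this text. As a hedged illustration only, the sketch below uses commonly used forms of the two objectives: a binary cross-entropy between a true similarity and a predicted similarity, both assumed to lie in [0, 1], and a negative log-likelihood summed over the predicted probabilities of the reference summary tokens. The function names and the `eps` clipping are assumptions of this sketch, not the patent's definitions.

```python
import math


def cross_entropy(true_sim, pred_sim, eps=1e-12):
    """Binary cross-entropy between a true and a predicted similarity in [0, 1]."""
    p = min(max(pred_sim, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(true_sim * math.log(p) + (1.0 - true_sim) * math.log(1.0 - p))


def negative_log_likelihood(token_probs, eps=1e-12):
    """Sum of -log p over the probabilities assigned to the reference tokens."""
    return -sum(math.log(max(p, eps)) for p in token_probs)


rerank_loss = cross_entropy(true_sim=1.0, pred_sim=0.9)
generation_loss = negative_log_likelihood([0.5, 0.5])
```

Both quantities are minimized during training: a lower cross-entropy means the predicted ranking agrees better with the true one, and a lower negative log-likelihood means the model assigns higher probability to the reference gist.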
In order to achieve the above object, according to another aspect of the present application, there is provided a knowledge graph construction device for generating a referee gist.
The knowledge graph construction device for generating a referee gist according to the present application comprises: a receiving module for receiving a training referee document and a target referee document; a training module for training a knowledge graph construction model using the training referee document; and a construction module for inputting the target referee document into the knowledge graph construction model to extract the entities of the target referee document and the relationships between entities, and for constructing the referee reasoning knowledge graph based on the entities and the relationships between entities.
To achieve the above object, according to another aspect of the present application, there is provided a computer-readable storage medium.
According to the computer-readable storage medium of the present application, a computer program is stored therein, wherein the computer program is configured to perform, when run, the knowledge graph construction method for generating a referee gist.
To achieve the above object, according to another aspect of the present application, there is provided an electronic device.
An electronic device according to the present application comprises a memory and a processor, the memory storing a computer program, wherein the processor is configured to run the computer program to perform the knowledge graph construction method for generating a referee gist.
In the embodiments of the present application, a knowledge graph is constructed as follows: a training referee document and a target referee document are received; a knowledge graph construction model is trained using the training referee document; and the target referee document is input into the knowledge graph construction model to extract the entities of the target referee document and the relationships between entities, and a referee reasoning knowledge graph is constructed based on the entities and the relationships between entities. Mapping the court's reasoning process onto the structure of the judicial syllogism makes it possible to constrain the fact-finding reasons so that useless referee gists are filtered out effectively. This achieves the technical effect of improving the reference value of the referee gist, and thereby solves the technical problem that the finally generated referee gist has low reference value because the referee reasoning process is not considered and the fact-finding reasons cannot be constrained to filter out useless referee gists.
Drawings
The accompanying drawings, which form a part of this application, are included to provide a further understanding of the application and to make its other features, objects and advantages more apparent. The drawings of the illustrative embodiments of the present application and their descriptions are for the purpose of illustrating the present application and are not to be construed as unduly limiting the present application. In the drawings:
FIG. 1 is a flowchart of a knowledge graph construction method for generating a referee gist according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a knowledge graph construction device for generating a referee gist according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a referee reasoning knowledge graph in accordance with an embodiment of the present application;
FIG. 4 is a schematic diagram of training using a training referee document according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described in detail below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by one of ordinary skill in the art based on the embodiments herein without inventive effort shall fall within the scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe the embodiments of the present application described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the present application, the terms "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "lateral", "longitudinal" and the like indicate an orientation or positional relationship based on that shown in the drawings. These terms are used only to better describe the present application and its embodiments, and are not intended to require that the indicated devices, elements or components have a particular orientation or be constructed and operated in a particular orientation.
Also, some of the terms described above may be used to indicate other meanings in addition to orientation or positional relationships; for example, the term "upper" may in some cases also indicate some sort of attachment or connection. The specific meaning of these terms in the present application will be understood by those of ordinary skill in the art according to the specific circumstances.
Furthermore, the terms "mounted," "configured," "provided," "connected," "coupled," and "sleeved" are to be construed broadly. For example, a connection may be a fixed connection, a removable connection, or a unitary construction; it may be a mechanical connection or an electrical connection; it may be a direct connection, an indirect connection through intervening media, or internal communication between two devices, elements, or components. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
According to an embodiment of the present invention, there is provided a knowledge graph construction method for generating a referee gist. As shown in FIG. 1, the method includes the following steps S101 to S103:
Step S101, receiving a training referee document and a target referee document;
The training referee document is labeled data and is used to train the knowledge graph construction model; the target referee document is unlabeled data, for which the corresponding knowledge graph can be constructed by the trained model.
According to an embodiment of the present invention, preferably, before receiving the training referee document and the target referee document, the method further includes:
collecting referee documents from an open database and preprocessing them;
dividing each referee document into party information, fact description, court view and judgment result by using a regular-expression-based rule parsing engine;
retaining the referee documents whose fact description exceeds a preset token threshold;
dividing the retained referee documents into training data and target referee documents;
and manually labeling the training data to obtain the training referee documents.
A large number of civil referee documents are collected from public court data, ontology modeling is performed for the referee reasoning knowledge graph, the documents are divided into training data and target data (target referee documents), and the training data is manually labeled to obtain the training referee documents. Since the downstream task mainly concerns civil judgments, the researchers wrote an automated collection program in Python to collect tens of millions of civil judgment documents, as required, from the open data source China Judgements Online (https://wenshu.court.gov.cn/). The collected data is cleaned, for example by removing useless characters, and then stored in a database. Each document is then divided into four parts by a regular-expression-based rule parsing engine: party information, fact description, court view and judgment result. Based on research needs, only documents whose fact description exceeds 50 tokens are kept.
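The splitting and filtering steps can be sketched as below. This is a minimal illustration, not the rule parsing engine used by the researchers: the three section markers are hypothetical examples of phrases that commonly open the corresponding parts of a Chinese judgment, real documents would need a much fuller rule set, and counting one character as one token is an assumption of the sketch.

```python
import re

# Hypothetical section markers; the actual rules are not given in the text.
SECTION_MARKERS = [
    ("fact_description", r"经审理查明"),
    ("court_view", r"本院认为"),
    ("judgment_result", r"判决如下"),
]


def split_document(text):
    """Split a document into four parts at the first occurrence of each marker."""
    positions = []
    for name, pattern in SECTION_MARKERS:
        match = re.search(pattern, text)
        if match is None:
            return None  # rules did not match; skip this document
        positions.append((match.start(), name))
    if positions != sorted(positions):
        return None  # markers appear out of the expected order
    parts = {"party_information": text[:positions[0][0]].strip()}
    bounds = [start for start, _ in positions] + [len(text)]
    for (start, name), end in zip(positions, bounds[1:]):
        parts[name] = text[start:end].strip()
    return parts


def keep_document(parts, min_tokens=50, tokenize=list):
    """Keep documents whose fact description has more than min_tokens tokens.

    tokenize=list treats each character as a token (an assumption).
    """
    return parts is not None and len(tokenize(parts["fact_description"])) > min_tokens


doc = "原告张三，被告李四。经审理查明" + "甲" * 60 + "。本院认为合同有效。判决如下：被告支付货款。"
parts = split_document(doc)
```

Documents for which the rules fail return `None` and are dropped by `keep_document`, which keeps the filter conservative.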
After collection, the collected civil referee documents are analyzed to obtain the ontology library of the referee reasoning knowledge graph. The referee reasoning knowledge graph contains eight ontology types: plaintiff, defendant, court, basic fact, fact-finding, evidence, evidence admission and legal provision. Nine types of relations hold among the ontologies: plaintiff's claim, defendant's statement, court-found fact, fact-finding conclusion, fact-finding reason, matter to be proved, evidence-admission conclusion, evidence-admission reason and legal basis. Specifically:
the plaintiff's claim is the relationship between the plaintiff and a basic fact;
the defendant's statement is the relationship between the defendant and a basic fact;
the court-found fact is the relationship between the court and a basic fact;
the fact-finding conclusion is the relationship between a fact-finding and a basic fact;
the fact-finding reason is the relationship that links court-found facts, matters to be proved, legal bases, legal provisions and staged fact-findings to a fact-finding;
the matter to be proved is the relationship between evidence and a basic fact;
the evidence-admission conclusion is the relationship between evidence and evidence admission, and is Boolean-valued (admitted or not admitted);
the evidence-admission reason is the relationship between evidence admission and a basic fact;
and the legal basis is the relationship between a legal provision and a basic fact, and between a legal provision and a fact-finding.
Through the ontologies and the relationships between them, the judge's reasoning process can be represented effectively, which helps the model better understand the referee document.
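One way to make this ontology concrete is a small schema that records which entity types each relation may connect. The sketch below is an illustration only: the English identifiers are labels chosen for this sketch, and the endpoints given for `fact_finding_reason` are a simplifying assumption, since the text describes that relation less precisely than the others.

```python
ENTITY_TYPES = {
    "plaintiff", "defendant", "court", "basic_fact",
    "fact_finding", "evidence", "evidence_admission", "legal_provision",
}

# relation name -> set of allowed (head type, tail type) pairs
RELATION_SCHEMA = {
    "plaintiff_claim": {("plaintiff", "basic_fact")},
    "defendant_statement": {("defendant", "basic_fact")},
    "court_found_fact": {("court", "basic_fact")},
    "fact_finding_conclusion": {("fact_finding", "basic_fact")},
    # Simplified assumption: the reasons for a fact-finding may be basic facts,
    # legal provisions, or earlier (staged) fact-findings.
    "fact_finding_reason": {
        ("basic_fact", "fact_finding"),
        ("legal_provision", "fact_finding"),
        ("fact_finding", "fact_finding"),
    },
    "matter_to_be_proved": {("evidence", "basic_fact")},
    "evidence_admission_conclusion": {("evidence", "evidence_admission")},
    "evidence_admission_reason": {("evidence_admission", "basic_fact")},
    "legal_basis": {
        ("legal_provision", "basic_fact"),
        ("legal_provision", "fact_finding"),
    },
}


def is_valid_triple(head_type, relation, tail_type):
    """Check that a (head, relation, tail) triple respects the schema."""
    return (head_type, tail_type) in RELATION_SCHEMA.get(relation, set())
```

A check like `is_valid_triple` could be used to discard extracted triples that contradict the ontology before they are added to the graph.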
The collected referee documents are divided into training data and target data: the training data is used for model training, and the target data is used to generate the final referee gists. The training data is also manually labeled; the labels cover all entities and relations contained in each referee document, together with its final referee gist.
Step S102, training a knowledge graph construction model using the training referee document;
Model training is performed using training referee documents labeled with entities, relations and the final referee gist, so that a knowledge graph can ultimately be obtained for unlabeled referee documents. This helps the model understand the highly complex semantic logic of referee documents.
According to an embodiment of the present invention, as shown in FIG. 4, preferably, training the knowledge graph construction model using the training referee document includes:
parsing the training referee document into a character sequence as input, and converting the character sequence into a sequence of low-dimensional real-valued vectors through a character vector matrix;
encoding the sequence with a large-scale language model to obtain semantic representation vectors;
using a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector of each character into the corresponding entity label;
and using a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a probability matrix over relations.
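The first of these steps, turning a character sequence into low-dimensional real-valued vectors through a character vector matrix, can be sketched as a plain table lookup. The helper names, the `<unk>` entry for unseen characters and the random initialization are assumptions of this sketch; in practice the matrix would be learned or taken from a pretrained model.

```python
import random


def build_vocab(texts):
    """Map each distinct character to a row index; index 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def init_char_matrix(vocab_size, dim, seed=0):
    """Character vector matrix with one dim-dimensional row per vocabulary entry."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(text, vocab, matrix):
    """Convert a character sequence into a sequence of real-valued vectors."""
    return [matrix[vocab.get(ch, 0)] for ch in text]


vocab = build_vocab(["原告主张"])
matrix = init_char_matrix(len(vocab), dim=4)
vectors = embed("原告X", vocab, matrix)
```

The resulting sequence of vectors is what the encoder would consume in the next step.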
The invention utilizes the mode of the bottom layer representation of the shared neural network to carry out entity relationship joint extraction so as to obtain key elements required by the judge reasoning knowledge graph. Specifically, the model mainly comprises the following modules:
Input layer: the referee document is parsed into a character sequence as input and converted, through a character vector matrix, into a sequence of low-dimensional real-valued vectors.
Coding layer: the sequence is encoded by a large-scale language model (such as BERT) to obtain a semantic representation vector H. The encoded vector carries the contextual semantics of each character and is the main basis for entity-relation extraction.
Entity extraction layer: the entity extraction layer uses a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector corresponding to each character into a corresponding entity label. The entity labels take the common form {O, B-entity, I-entity, E-entity}, where O marks an ordinary character, B-entity the beginning character of an entity, I-entity a middle character of an entity, and E-entity the ending character of an entity.
Relation extraction layer: the relation extraction layer uses a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a relation probability matrix; the relation with the highest probability is taken as the relation between the head and tail entities.
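As an illustration of how the outputs of the two extraction layers might be consumed, the following minimal Python sketch decodes a per-character {O, B-entity, I-entity, E-entity} label sequence into entity spans and picks the most probable relation from one row of a probability matrix. The function names, the example label type PER and the relation names are assumptions made for this sketch and do not appear in the original text; the LSTM, CRF and recurrent decoders themselves are not reproduced here.

```python
from typing import List, Tuple


def decode_entities(chars: List[str], tags: List[str]) -> List[Tuple[str, str, int, int]]:
    """Turn per-character {O, B-x, I-x, E-x} tags into (text, type, start, end) spans."""
    entities = []
    start, etype = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A new entity begins; remember where and which type.
            start, etype = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == etype:
            # Still inside the current entity.
            continue
        elif tag.startswith("E-") and start is not None and tag[2:] == etype:
            # The entity closes on this character.
            entities.append(("".join(chars[start:i + 1]), etype, start, i))
            start, etype = None, None
        else:
            # "O" or an inconsistent tag: drop any unfinished entity.
            start, etype = None, None
    return entities


def pick_relation(probs: List[float], relation_names: List[str]) -> str:
    """Return the relation whose probability is highest (argmax over one row)."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return relation_names[best]


chars = list("张三起诉李四")
tags = ["B-PER", "E-PER", "O", "O", "B-PER", "E-PER"]
spans = decode_entities(chars, tags)
relation = pick_relation([0.1, 0.7, 0.2], ["rel_a", "rel_b", "rel_c"])
```

Because the label set has no single-character tag, this sketch only emits an entity when a B- label is eventually closed by a matching E- label.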
According to an embodiment of the present invention, preferably, training to obtain the knowledge-graph construction model by using the training referee document further includes:
adopting a contrastive loss function, and obtaining the knowledge graph construction model by training on the training referee document;
the contrastive loss function is:
wherein f(x) represents the semantic representation vector corresponding to the target character, f(x⁺) represents the semantic representation vector corresponding to the positive sample, and f(x⁻) represents the semantic representation vector corresponding to the negative sample.
Contrastive loss layer: to address the entity boundary prediction errors of traditional entity-relation extraction methods, the invention introduces contrastive learning and adds a contrastive loss during training, so that the model fully learns the differences between characters of different types, thereby improving entity extraction performance.
For the construction of the referee reasoning knowledge graph, a contrastive-learning-based method for constructing and extracting the referee reasoning knowledge graph is provided, so that the model can fully understand the reasoning logic of the case. The innovation lies in introducing contrastive learning to improve the encoding capability of the model, which in turn improves the construction quality of the referee reasoning knowledge graph.
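The formula of the contrastive loss is not legible in this text, so the following Python sketch only illustrates the general idea using a commonly used softmax-style (InfoNCE-like) form. That form, the dot-product similarity, the temperature parameter and the function names are all assumptions of this sketch, not the loss defined by the invention.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product used here as the similarity between two representation vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_loss(anchor: Sequence[float],
                     positive: Sequence[float],
                     negatives: List[Sequence[float]],
                     temperature: float = 1.0) -> float:
    """Assumed form: -log( e^{f(x).f(x+)} / (e^{f(x).f(x+)} + sum e^{f(x).f(x-)}) )."""
    pos = math.exp(dot(anchor, positive) / temperature)
    neg = sum(math.exp(dot(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# The loss is small when f(x) is close to f(x+) and far from f(x-), large otherwise.
good = contrastive_loss([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
bad = contrastive_loss([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this sketch a positive sample would be a character of the same label type as the target character and a negative sample a character of a different type, which matches the stated goal of separating characters of different types.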
Step S103: the target referee document is input into the knowledge graph construction model to extract the entities and inter-entity relations of the target referee document, and a referee reasoning knowledge graph is constructed based on the entities and inter-entity relations.
The trained knowledge graph construction model can extract the entities and inter-entity relations of the target referee document and construct from them the referee reasoning knowledge graph shown in fig. 3. The generated referee reasoning knowledge graph contains the entities and the relations between them and can therefore reflect the referee's reasoning process: it expresses how the court determines and reasons about the basic facts. Because the graph constrains the reasons for fact determination, useless referee gists can be filtered out effectively, so that the finally generated referee gist has high reference value.
From the above description, it can be seen that the following technical effects are achieved:
In the embodiment of the application, a knowledge graph is constructed as follows: a training referee document and a target referee document are received; a knowledge graph construction model is obtained by training on the training referee document; the target referee document is input into the knowledge graph construction model to extract its entities and inter-entity relations, and a referee reasoning knowledge graph is constructed from them. The court's reasoning process is thereby mapped onto the structure of the judicial syllogism, and the reasons for fact determination are constrained so that useless referee gists are filtered out effectively. This achieves the technical effect of improving the reference value of the referee gist, and solves the technical problem that the finally generated referee gist has low reference value because the referee's reasoning process is not considered and the reasons for fact determination therefore cannot be constrained to filter out useless referee gists.
According to the embodiment of the present invention, preferably, the step of inputting the target referee document into the knowledge graph construction model to extract the entities and inter-entity relations of the target referee document, and constructing the referee reasoning knowledge graph based on the entities and inter-entity relations, further includes:
training a referee gist generation model using the training referee document;
and inputting the referee reasoning knowledge graph into the referee gist generation model to generate a referee gist.
The training referee document is labeled not only with entities and inter-entity relations but also with the referee gist, so it can be used to train the referee gist generation model; the trained model then generates a referee gist from an input referee reasoning knowledge graph.
According to the embodiment of the present application, preferably, inputting the referee reasoning knowledge graph into the referee gist generation model to generate a referee gist includes:
calculating the text similarity, semantic similarity and structural similarity between the target referee document and the format templates, and taking the format template with the highest text similarity, the one with the highest semantic similarity and the one with the highest structural similarity as the three candidate templates;
calculating the true similarity and the predicted similarity of the three candidate templates, ranking them, and selecting one as the soft template to be used according to the ranking result;
and generating the referee gist according to the referee reasoning knowledge graph and the soft template.
In the referee gist generation scenario, the summary of the original text must be written in a specific format, so a soft-template-based summary generation method is proposed. Specifically, the method mainly comprises the following modules:
Retrieval module: candidate templates are selected from the training corpus. Specifically, the invention summarizes several templates commonly used in referee gists and aligns them with the referee documents in the training set; during training and prediction, the referee document most similar to the target referee document is retrieved, and the target aligned with it is used as a candidate template. To improve the accuracy of the candidate templates, text similarity, semantic similarity and structural similarity are used as the main retrieval criteria:
Text similarity: the VSM (vector space model) similarity is calculated against all referee documents in the training set, and for each sample the document with the highest similarity is taken as the retrieval result.
Semantic similarity: all referee documents in the training set are encoded with BERT, the distance between the semantic vector of the target summary and the semantic vector of each candidate referee document is calculated, and the referee document with the smallest distance is selected as the retrieval result.
Structural similarity: all referee documents in the training set are converted into their corresponding referee reasoning knowledge graphs, the similarity between graphs is calculated with a graph similarity algorithm, and the document with the highest similarity is taken as the retrieval result.
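The text-similarity retrieval can be illustrated with a small vector space model in Python. This is a minimal sketch under stated assumptions: documents are represented as raw character counts and compared by cosine similarity, whereas the original text does not specify the term weighting or tokenization. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def vsm_vector(text: str) -> Counter:
    """Bag-of-characters vector (assumption: character counts, no TF-IDF weighting)."""
    return Counter(text)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, corpus: List[str]) -> Tuple[int, float]:
    """Return the index and score of the corpus document closest to the query."""
    qv = vsm_vector(query)
    scores = [cosine(qv, vsm_vector(doc)) for doc in corpus]
    best = max(range(len(corpus)), key=scores.__getitem__)
    return best, scores[best]


index, score = most_similar("abc", ["xyz", "abc", "abd"])
```

The semantic-similarity criterion would follow the same retrieval pattern, with the count vectors replaced by encoder outputs and the cosine score replaced by a distance to be minimized.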
Reranking module: the three candidate templates are scored and ranked, and one is selected as the soft template to be used. The final soft template chosen from the retrieved candidates should be as close to the real summary as possible. The similarity between a template and the real summary, measured with the ROUGE metric, is the true similarity; a prediction model is trained to estimate this score for each candidate template.
Rewriting module: the referee reasoning knowledge graph is taken as input, and a summary is generated from the input and the template. An attention mechanism over the template integrates the template's semantic information into the summary generation process, and the result of the structural retrieval is used as supplementary modal information, further improving the performance of the whole model.
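The scoring and selection step for the candidate templates can be sketched in Python as follows. The sketch assumes ROUGE-1 F1 over token lists as the true similarity, since the original text names ROUGE without specifying the variant, and it takes the predicted scores as given numbers rather than training the prediction model. All function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def rouge1_f(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    c, r = Counter(candidate), Counter(reference)
    overlap = sum((c & r).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def pick_soft_template(candidates: List[str], predicted_scores: List[float]) -> str:
    """Rank candidates by predicted score and return the top one."""
    best = max(range(len(candidates)), key=predicted_scores.__getitem__)
    return candidates[best]


true_sim = rouge1_f(["a", "b"], ["a", "c"])
chosen = pick_soft_template(["t1", "t2", "t3"], [0.2, 0.9, 0.4])
```

During training the true similarity is available because the real summary is known; at prediction time only the predicted score can be used, which is why a separate prediction model is needed.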
According to the embodiment of the application, preferably, calculating the true similarity and the predicted similarity of the three candidate templates, ranking them, and selecting one as the soft template to be used according to the ranking result includes calculating the cross-entropy loss between the true similarity and the predicted similarity:
generating the referee gist according to the referee reasoning knowledge graph and the soft template includes minimizing the negative log-likelihood of the summary prediction probability:
For generating the referee gist of a judgment, the invention provides a soft-template-based method for generating civil judgment gists, which keeps the generated gist fluent while retaining as much key information related to the referee's reasoning as possible. A soft template retrieval method that fuses semantics and structure is also provided; it improves the accuracy of the candidate templates and thereby the quality of the generated referee gist.
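The formulas of the two training objectives named above are not legible in this text. The following Python sketch therefore uses standard textbook forms as an assumption: a binary cross-entropy between the true and predicted similarity of a template (both treated as values in [0, 1]), and a negative log-likelihood summed over the predicted probabilities of the reference summary tokens. The function names are hypothetical.

```python
import math
from typing import Sequence


def similarity_cross_entropy(true_sim: float, pred_sim: float, eps: float = 1e-12) -> float:
    """Assumed form: -[s* log(s) + (1 - s*) log(1 - s)], with s clipped away from 0 and 1."""
    p = min(max(pred_sim, eps), 1.0 - eps)
    return -(true_sim * math.log(p) + (1.0 - true_sim) * math.log(1.0 - p))


def negative_log_likelihood(token_probs: Sequence[float]) -> float:
    """Assumed form: -sum_t log p(y_t | y_<t, graph, template)."""
    return -sum(math.log(p) for p in token_probs)


close = similarity_cross_entropy(0.8, 0.8)   # prediction agrees with the true similarity
far = similarity_cross_entropy(0.8, 0.2)     # prediction disagrees
nll = negative_log_likelihood([0.5, 0.5])
```

Lower values of either quantity indicate a better fit, so in this sketch both would be minimized during training.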
It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.
According to an embodiment of the present invention, there is also provided an apparatus for implementing the above knowledge graph construction method for generating a referee gist, as shown in fig. 2, the apparatus including:
a receiving module 10 for receiving a training referee document and a target referee document;
The training referee document is labeled data used to train the knowledge graph construction model; the target referee document is unlabeled data, for which the trained model constructs the corresponding knowledge graph.
According to an embodiment of the present invention, preferably, before receiving the training referee document and the target referee document, the method further includes:
collecting referee documents from an open database and preprocessing them;
dividing each referee document into party information, fact description, court view and judgment result by using a regular-expression-based rule parsing engine;
retaining the referee documents whose fact description exceeds a preset token threshold;
dividing the retained referee documents into training data and target referee documents;
and manually labeling the training data to obtain the training referee document.
Massive numbers of civil referee documents are collected from public court data, an ontology is modeled for the referee reasoning knowledge graph, the documents are divided into training data and target data (target referee documents), and the training data is manually labeled to obtain the training referee documents. Since the downstream task mainly concerns civil judgments, the researchers wrote an automated collection program in Python and collected tens of millions of civil judgment documents from the open data source China Judgments Online (https://wenshu.court.gov.cn/). The collected data is cleaned, for example by removing useless characters, and then stored in a database. Each document is then divided into four parts by a regular-expression-based rule parsing engine: party information, fact description, court view and judgment result. Based on research needs, only documents whose fact description exceeds 50 tokens are kept.
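The rule-based partition and the length filter described above might look like the following Python sketch. The marker phrases used to locate the section boundaries, the choice to count one character as one token, and all function names are assumptions of this sketch; the actual rules of the parsing engine are not given in the text.

```python
import re
from typing import Dict, Optional

# Hypothetical section markers; a real engine would need many more patterns.
MARKERS = [
    ("fact_description", re.compile(r"原告[^。]*诉称")),
    ("court_view", re.compile(r"本院认为")),
    ("judgment_result", re.compile(r"判决如下")),
]


def split_document(text: str) -> Optional[Dict[str, str]]:
    """Split a document into four parts, or return None if a marker is missing."""
    positions = []
    cursor = 0
    for name, pattern in MARKERS:
        match = pattern.search(text, cursor)
        if match is None:
            return None
        positions.append((name, match.start()))
        cursor = match.end()
    # Everything before the first marker is treated as party information.
    parts = {"party_info": text[:positions[0][1]].strip()}
    ends = [start for _, start in positions[1:]] + [len(text)]
    for (name, start), end in zip(positions, ends):
        parts[name] = text[start:end].strip()
    return parts


def keep_document(parts: Dict[str, str], min_tokens: int = 50) -> bool:
    """Keep a document only if its fact description exceeds the token threshold."""
    return len(parts["fact_description"]) > min_tokens


doc = "原告:张三。被告:李四。原告张三诉称,被告借款未还。本院认为,借款事实成立。判决如下:被告归还借款。"
parts = split_document(doc)
```

The short example document would be dropped by the default threshold of 50 and is only kept when a smaller threshold is passed.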
After collection, the collected civil referee documents are analyzed to obtain the ontology library of the referee reasoning knowledge graph. The referee reasoning knowledge graph comprises eight ontology types: plaintiff, defendant, court, basic fact, fact determination, evidence, evidence admission and legal provision. Nine relation types hold among them: plaintiff's claim, defendant's defense, court-found fact, fact-determination conclusion, fact-determination reason, matter to be proved, evidence-admission conclusion, evidence-admission reason and legal basis. The plaintiff's claim is the relation between the plaintiff and a basic fact; the defendant's defense is the relation between the defendant and a basic fact; the court-found fact is the relation between the court and a basic fact; the fact-determination conclusion is the relation between a fact determination and a basic fact; the fact-determination reason is the relation that links court-found facts, matters to be proved, legal bases, legal provisions and intermediate fact determinations to a fact determination; the matter to be proved is the relation between evidence and a basic fact; the evidence-admission conclusion is the relation between evidence and an evidence admission, and is Boolean-valued (admitted or not admitted); the evidence-admission reason is the relation between an evidence admission and a basic fact; the legal basis is the relation between a legal provision and a basic fact, and between a legal provision and a fact determination. These ontologies and relations effectively represent the referee's reasoning process and help the model better understand the referee document.
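The ontology described above can be written down as a small schema against which extracted triples are checked. In the following Python sketch the English identifiers (plaintiff, defendant, basic_fact and so on) are this sketch's own readings of the ontology and relation names, and the head and tail direction of each relation is an assumption. The fact-determination reason relation is left out because its endpoints cannot be pinned down from the description, so the sketch covers eight of the nine relations.

```python
from typing import Dict, Set, Tuple

# relation name -> allowed (head ontology type, tail ontology type) pairs
RELATION_SCHEMA: Dict[str, Set[Tuple[str, str]]] = {
    "plaintiff_claim": {("plaintiff", "basic_fact")},
    "defendant_defense": {("defendant", "basic_fact")},
    "court_found_fact": {("court", "basic_fact")},
    "fact_determination_conclusion": {("fact_determination", "basic_fact")},
    "matter_to_be_proved": {("evidence", "basic_fact")},
    "evidence_admission_conclusion": {("evidence", "evidence_admission")},
    "evidence_admission_reason": {("evidence_admission", "basic_fact")},
    "legal_basis": {
        ("legal_provision", "basic_fact"),
        ("legal_provision", "fact_determination"),
    },
}


def is_valid_triple(head_type: str, relation: str, tail_type: str) -> bool:
    """Check that a (head type, relation, tail type) triple is allowed by the schema."""
    return (head_type, tail_type) in RELATION_SCHEMA.get(relation, set())


ok = is_valid_triple("plaintiff", "plaintiff_claim", "basic_fact")
rejected = is_valid_triple("court", "plaintiff_claim", "basic_fact")
```

A check of this kind could be applied to the output of the relation extraction layer to discard triples whose entity types do not fit the predicted relation.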
The collected referee document is divided into training data and target data, wherein the training data is used for model training, and the target data is used for generating a final referee gist. Meanwhile, manual labeling is carried out on training data, and labeling results comprise all entities, relations and final referee gist contained in referee documents.
A training module 20, configured to train a knowledge graph construction model using the training referee document;
Model training is performed on training referee documents labeled with entities, relations and the final referee gist, so that the knowledge graph corresponding to an unlabeled referee document can finally be obtained. This helps the model understand the highly complex semantic logic of referee documents.
According to an embodiment of the present invention, as shown in fig. 4, preferably, training to obtain a knowledge-graph construction model using the training referee document includes:
analyzing the training referee document into a character sequence as input, and converting the character sequence into a sequence in a low-dimensional real-value vector form through a character vector matrix;
coding the sequence by using a large-scale language model to obtain a semantic representation vector;
using a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector corresponding to each character into a corresponding entity label;
and using a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a relation probability matrix.
The invention performs joint entity-relation extraction by sharing the underlying neural network representation, so as to obtain the key elements required by the referee reasoning knowledge graph. Specifically, the model mainly comprises the following modules:
Input layer: the referee document is parsed into a character sequence as input and converted, through a character vector matrix, into a sequence of low-dimensional real-valued vectors.
Coding layer: the sequence is encoded by a large-scale language model (such as BERT) to obtain a semantic representation vector H. The encoded vector carries the contextual semantics of each character and is the main basis for entity-relation extraction.
Entity extraction layer: the entity extraction layer uses a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector corresponding to each character into a corresponding entity label. The entity labels take the common form {O, B-entity, I-entity, E-entity}, where O marks an ordinary character, B-entity the beginning character of an entity, I-entity a middle character of an entity, and E-entity the ending character of an entity.
Relation extraction layer: the relation extraction layer uses a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a relation probability matrix; the relation with the highest probability is taken as the relation between the head and tail entities.
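The input layer of the module list above amounts to a table lookup: each character is mapped to an index, and the index selects a row of the character vector matrix. A minimal Python sketch follows; the vocabulary construction, the random initialization, the unknown-character entry and the vector dimension are assumptions made for illustration.

```python
import random
from typing import Dict, List


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Assign an index to every character seen; index 0 is reserved for unknown characters."""
    vocab = {"<unk>": 0}
    for text in texts:
        for ch in text:
            vocab.setdefault(ch, len(vocab))
    return vocab


def init_matrix(vocab_size: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Character vector matrix with small random values (it would be learned in training)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(text: str, vocab: Dict[str, int], matrix: List[List[float]]) -> List[List[float]]:
    """Convert a character sequence into a sequence of low-dimensional real-valued vectors."""
    return [matrix[vocab.get(ch, 0)] for ch in text]


vocab = build_vocab(["原告张三"])
matrix = init_matrix(len(vocab), dim=4)
vectors = embed("张三某", vocab, matrix)
```

The resulting vector sequence is what the coding layer would receive as input.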
According to an embodiment of the present invention, preferably, training to obtain the knowledge-graph construction model by using the training referee document further includes:
adopting a contrastive loss function, and obtaining the knowledge graph construction model by training on the training referee document;
the contrastive loss function is:
wherein f(x) represents the semantic representation vector corresponding to the target character, f(x⁺) represents the semantic representation vector corresponding to the positive sample, and f(x⁻) represents the semantic representation vector corresponding to the negative sample.
Contrastive loss layer: to address the entity boundary prediction errors of traditional entity-relation extraction methods, the invention introduces contrastive learning and adds a contrastive loss during training, so that the model fully learns the differences between characters of different types, thereby improving entity extraction performance.
For the construction of the referee reasoning knowledge graph, a contrastive-learning-based method for constructing and extracting the referee reasoning knowledge graph is provided, so that the model can fully understand the reasoning logic of the case. The innovation lies in introducing contrastive learning to improve the encoding capability of the model, which in turn improves the construction quality of the referee reasoning knowledge graph.
The construction module 30 is configured to input the target referee document into the knowledge graph construction model so as to extract the entities and inter-entity relations of the target referee document, and to construct a referee reasoning knowledge graph based on the entities and inter-entity relations.
The trained knowledge graph construction model can extract the entities and inter-entity relations of the target referee document and construct from them the referee reasoning knowledge graph shown in fig. 3. The generated referee reasoning knowledge graph contains the entities and the relations between them and can therefore reflect the referee's reasoning process: it expresses how the court determines and reasons about the basic facts. Because the graph constrains the reasons for fact determination, useless referee gists can be filtered out effectively, so that the finally generated referee gist has high reference value.
From the above description, it can be seen that the following technical effects are achieved:
In the embodiment of the application, a knowledge graph is constructed as follows: a training referee document and a target referee document are received; a knowledge graph construction model is obtained by training on the training referee document; the target referee document is input into the knowledge graph construction model to extract its entities and inter-entity relations, and a referee reasoning knowledge graph is constructed from them. The court's reasoning process is thereby mapped onto the structure of the judicial syllogism, and the reasons for fact determination are constrained so that useless referee gists are filtered out effectively. This achieves the technical effect of improving the reference value of the referee gist, and solves the technical problem that the finally generated referee gist has low reference value because the referee's reasoning process is not considered and the reasons for fact determination therefore cannot be constrained to filter out useless referee gists.
According to the embodiment of the present invention, preferably, the step of inputting the target referee document into the knowledge graph construction model to extract the entities and inter-entity relations of the target referee document, and constructing the referee reasoning knowledge graph based on the entities and inter-entity relations, further includes:
training a referee gist generation model using the training referee document;
and inputting the referee reasoning knowledge graph into the referee gist generation model to generate a referee gist.
The training referee document is labeled not only with entities and inter-entity relations but also with the referee gist, so it can be used to train the referee gist generation model; the trained model then generates a referee gist from an input referee reasoning knowledge graph.
According to the embodiment of the present application, preferably, inputting the referee reasoning knowledge graph into the referee gist generation model to generate a referee gist includes:
calculating the text similarity, semantic similarity and structural similarity between the target referee document and the format templates, and taking the format template with the highest text similarity, the one with the highest semantic similarity and the one with the highest structural similarity as the three candidate templates;
calculating the true similarity and the predicted similarity of the three candidate templates, ranking them, and selecting one as the soft template to be used according to the ranking result;
and generating the referee gist according to the referee reasoning knowledge graph and the soft template.
In the referee gist generation scenario, the summary of the original text must be written in a specific format, so a soft-template-based summary generation method is proposed. Specifically, the method mainly comprises the following modules:
Retrieval module: candidate templates are selected from the training corpus. Specifically, the invention summarizes several templates commonly used in referee gists and aligns them with the referee documents in the training set; during training and prediction, the referee document most similar to the target referee document is retrieved, and the target aligned with it is used as a candidate template. To improve the accuracy of the candidate templates, text similarity, semantic similarity and structural similarity are used as the main retrieval criteria:
Text similarity: the VSM (vector space model) similarity is calculated against all referee documents in the training set, and for each sample the document with the highest similarity is taken as the retrieval result.
Semantic similarity: all referee documents in the training set are encoded with BERT, the distance between the semantic vector of the target summary and the semantic vector of each candidate referee document is calculated, and the referee document with the smallest distance is selected as the retrieval result.
Structural similarity: all referee documents in the training set are converted into their corresponding referee reasoning knowledge graphs, the similarity between graphs is calculated with a graph similarity algorithm, and the document with the highest similarity is taken as the retrieval result.
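The structural-similarity criterion compares documents through their knowledge graphs. The text does not name the graph similarity algorithm, so the following Python sketch substitutes a simple stand-in: each graph is reduced to its set of (head type, relation, tail type) triples and two graphs are compared by Jaccard overlap. Both the stand-in measure and the example triples are assumptions of this sketch.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]


def graph_similarity(g1: Iterable[Triple], g2: Iterable[Triple]) -> float:
    """Jaccard overlap between the triple sets of two graphs (stand-in measure)."""
    a, b = set(g1), set(g2)
    if not a and not b:
        return 1.0  # two empty graphs are treated as identical
    return len(a & b) / len(a | b)


graph_a = [
    ("plaintiff", "plaintiff_claim", "basic_fact"),
    ("evidence", "matter_to_be_proved", "basic_fact"),
]
graph_b = [
    ("evidence", "matter_to_be_proved", "basic_fact"),
    ("legal_provision", "legal_basis", "basic_fact"),
]
similarity = graph_similarity(graph_a, graph_b)
```

A measure of this kind ignores the concrete wording of entities and rewards documents whose reasoning has the same shape, which is the property a structural criterion adds on top of the text and semantic criteria.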
Reranking module: the three candidate templates are scored and ranked, and one is selected as the soft template to be used. The final soft template chosen from the retrieved candidates should be as close to the real summary as possible. The similarity between a template and the real summary, measured with the ROUGE metric, is the true similarity; a prediction model is trained to estimate this score for each candidate template.
Rewriting module: the referee reasoning knowledge graph is taken as input, and a summary is generated from the input and the template. An attention mechanism over the template integrates the template's semantic information into the summary generation process, and the result of the structural retrieval is used as supplementary modal information, further improving the performance of the whole model.
According to the embodiment of the application, preferably, calculating the true similarity and the predicted similarity of the three candidate templates, ranking them, and selecting one as the soft template to be used according to the ranking result includes calculating the cross-entropy loss between the true similarity and the predicted similarity:
generating the referee gist according to the referee reasoning knowledge graph and the soft template includes minimizing the negative log-likelihood of the summary prediction probability:
For generating the referee gist of a judgment, the invention provides a soft-template-based method for generating civil judgment gists, which keeps the generated gist fluent while retaining as much key information related to the referee's reasoning as possible. A soft template retrieval method that fuses semantics and structure is also provided; it improves the accuracy of the candidate templates and thereby the quality of the generated referee gist.
It will be apparent to those skilled in the art that the modules or steps of the invention described above may be implemented in a general purpose computing device, they may be concentrated on a single computing device, or distributed across a network of computing devices, or they may alternatively be implemented in program code executable by computing devices, such that they may be stored in a memory device for execution by the computing devices, or they may be separately fabricated into individual integrated circuit modules, or multiple modules or steps within them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The foregoing description is only of the preferred embodiments of the present application and is not intended to limit the same, but rather, various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application.
Claims (6)
1. A knowledge graph construction method for generating a gist of a referee, comprising:
receiving a training referee document and a target referee document;
training a knowledge graph construction model using the training referee document;
training and acquiring a knowledge graph construction model by using the training referee document comprises the following steps:
analyzing the training referee document into a character sequence as input, and converting the character sequence into a sequence in a low-dimensional real-value vector form through a character vector matrix;
coding the sequence by using a large-scale language model to obtain a semantic representation vector;
using a long short-term memory (LSTM) network and a conditional random field (CRF) as a decoder to convert the semantic representation vector corresponding to each character into a corresponding entity label;
using a recurrent neural network as a decoder to convert the characters corresponding to the head and tail entities into a relation probability matrix;
Training and obtaining a knowledge graph construction model by using the training referee document further comprises:
adopting a contrast loss function, and obtaining a knowledge graph construction model based on training of the training referee document;
the contrast loss function is:
wherein f(x) represents the semantic representation vector corresponding to the target character, f(x⁺) represents the semantic representation vector corresponding to the positive sample, and f(x⁻) represents the semantic representation vector corresponding to the negative sample;
inputting the target referee document into the knowledge graph construction model to extract the relation between the entity and the entity of the target referee document, and constructing a referee reasoning knowledge graph based on the relation between the entity and the entity;
training a referee gist generation model using the training referee document;
inputting the judge reasoning knowledge graph into a judge gist generation model to generate a judge gist;
inputting the judge reasoning knowledge graph into a judge gist generation model, wherein the generating of the judge gist comprises the following steps:
calculating the text similarity, the semantic similarity and the structural similarity of the target referee document and the format templates, and taking three format templates with the highest text similarity, semantic similarity and structural similarity as candidate templates;
Calculating and sequencing the real similarity and the predicted similarity of the three candidate templates, and selecting one of the three candidate templates as a used soft template according to the sequencing result;
and generating a judge gist according to the judge reasoning knowledge graph and the soft template.
2. The knowledge-graph construction method according to claim 1, further comprising, before receiving the training referee document and the target referee document:
collecting and preprocessing in an open database to obtain a judge document;
dividing each judge document into principal information, fact description, court views and judging results by using a rule analysis engine of a regular expression;
screening out judge documents with fact descriptions exceeding a preset token threshold;
dividing the judge document obtained after screening into training data and a target judge document;
and manually marking the training data to obtain the training referee document.
3. The knowledge graph construction method according to claim 1, wherein the steps of calculating and sorting the true similarity and the predicted similarity for the three candidate templates, and selecting one of the soft templates as the used soft template according to the sorting result includes calculating a cross entropy loss between the true similarity and the predicted similarity:
generating the referee gist according to the referee reasoning knowledge graph and the soft template includes minimizing the negative log-likelihood of the summary prediction probability:
4. a knowledge graph construction apparatus for generating a gist of a referee, comprising:
the receiving module is used for receiving the training referee document and the target referee document;
the training module is used for training a knowledge graph construction model using the training referee document;
the building module is used for inputting the target referee document into the knowledge graph building model so as to extract the relation between the entity and the entity of the target referee document and building a referee reasoning knowledge graph based on the relation between the entity and the entity;
training and acquiring a knowledge graph construction model by using the training referee document comprises the following steps:
analyzing the training referee document into a character sequence as input, and converting the character sequence into a sequence in a low-dimensional real-value vector form through a character vector matrix;
coding the sequence by using a large-scale language model to obtain a semantic representation vector;
using a long-short-time memory network and a conditional random field as a decoder to convert the semantic representation vector corresponding to each character into a corresponding entity label;
Using a cyclic neural network as a decoder to convert characters corresponding to the head and tail entities into a probability matrix of the relation;
training and obtaining a knowledge graph construction model by using the training referee document further comprises:
adopting a contrastive loss function, and obtaining the knowledge graph construction model by training on the training referee document;
the contrastive loss function is:
wherein $f(x)$ represents the semantic representation vector corresponding to the target character, $f(x^{+})$ represents the semantic representation vector corresponding to the positive sample, and $f(x^{-})$ represents the semantic representation vector corresponding to the negative sample;
wherein inputting the target referee document into the knowledge graph construction model to extract the relations between the entities of the target referee document, and constructing the referee reasoning knowledge graph based on the relations between the entities, further comprises:
training the judge document to obtain a judge gist generation model;
inputting the judge reasoning knowledge graph into a judge gist generation model to generate a judge gist;
wherein inputting the judge reasoning knowledge graph into the judge gist generation model to generate the judge gist comprises:
calculating the text similarity, the semantic similarity and the structural similarity between the target referee document and the format templates, and taking the three format templates with the highest text similarity, semantic similarity and structural similarity as candidate templates;
calculating and sorting the true similarity and the predicted similarity of the three candidate templates, and selecting one of the three candidate templates as the soft template to be used according to the sorting result;
and generating a judge gist according to the judge reasoning knowledge graph and the soft template.
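The contrastive loss recited in claim 4 is defined over the semantic representation vectors of a target character, a positive sample and negative samples, but its formula is not reproduced in this text. As a hedged illustration only, the sketch below uses a common InfoNCE-style form with cosine similarity and a temperature parameter. The choice of this form, the `temperature` value and the function names are assumptions and may differ from the patented formula.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss (assumed form).

    anchor    -- vector for the target character
    positive  -- vector for the positive sample
    negatives -- list of vectors for negative samples
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

With this form the loss is small when the target vector is close to the positive sample and far from the negative samples, and large in the opposite case.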
5. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program, wherein the computer program is configured to execute the knowledge-graph construction method for generating a referee's gist according to any one of claims 1 to 3 at run-time.
6. An electronic device, comprising: a memory and a processor, characterized in that the memory stores a computer program, wherein the processor is configured to run the computer program to execute the knowledge-graph construction method for generating the gist of a referee according to any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310247469.4A CN116484010B (en) | 2023-03-15 | 2023-03-15 | Knowledge graph construction method and device, storage medium and electronic device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116484010A (en) | 2023-07-25 |
CN116484010B (en) | 2024-01-16 |
Family
ID=87212826
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310247469.4A Active CN116484010B (en) | 2023-03-15 | 2023-03-15 | Knowledge graph construction method and device, storage medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116484010B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110633458A (en) * | 2018-06-25 | 2019-12-31 | 阿里巴巴集团控股有限公司 | Method and device for generating referee document |
WO2020052184A1 (en) * | 2018-09-10 | 2020-03-19 | 平安科技(深圳)有限公司 | Judgment document processing method and apparatus, computer device and storage medium |
CN111680504A (en) * | 2020-08-11 | 2020-09-18 | 四川大学 | Legal information extraction model, method, system, device and auxiliary system |
CN111813923A (en) * | 2019-11-29 | 2020-10-23 | 北京嘀嘀无限科技发展有限公司 | Text summarization method, electronic device and storage medium |
WO2021072321A1 (en) * | 2019-10-11 | 2021-04-15 | The Board Of Trustees Of The Leland Stanford Junior University | Systems and methods for generating knowledge graphs and text summaries from document databases |
CN113010684A (en) * | 2020-12-31 | 2021-06-22 | 北京法意科技有限公司 | Construction method and system of civil complaint and judgment map |
WO2021164226A1 (en) * | 2020-02-20 | 2021-08-26 | 平安科技(深圳)有限公司 | Method and apparatus for querying knowledge map of legal cases, device and storage medium |
CN113312501A (en) * | 2021-06-29 | 2021-08-27 | 中新国际联合研究院 | Construction method and device of safety knowledge self-service query system based on knowledge graph |
CN113723108A (en) * | 2021-08-11 | 2021-11-30 | 北京工业大学 | Event extraction method and device, electronic equipment and storage medium |
TW202201336A (en) * | 2020-06-16 | 2022-01-01 | 國立政治大學 | Method for generating abstract of written judgment automatically |
CN115238697A (en) * | 2022-07-26 | 2022-10-25 | 贵州数联铭品科技有限公司 | Judicial named entity recognition method based on natural language processing |
CN115269857A (en) * | 2022-04-28 | 2022-11-01 | 东北林业大学 | Knowledge graph construction method and device based on document relation extraction |
CN115374270A (en) * | 2021-12-21 | 2022-11-22 | 一拓通信集团股份有限公司 | Legal text abstract generation method based on graph neural network |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190198137A1 (en) * | 2017-12-26 | 2019-06-27 | International Business Machines Corporation | Automatic Summarization of Patient Data Using Medically Relevant Summarization Templates |
CN112148871B (en) * | 2020-09-21 | 2024-04-12 | 北京百度网讯科技有限公司 | Digest generation method, digest generation device, electronic equipment and storage medium |
US20220164683A1 (en) * | 2020-11-25 | 2022-05-26 | Fmr Llc | Generating a domain-specific knowledge graph from unstructured computer text |
CN112487826A (en) * | 2020-11-30 | 2021-03-12 | 北京百度网讯科技有限公司 | Information extraction method, extraction model training method and device and electronic equipment |
CN113239208A (en) * | 2021-05-06 | 2021-08-10 | 广东博维创远科技有限公司 | Mark training model based on knowledge graph |
Non-Patent Citations (6)
Title |
---|
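Research on knowledge graph construction for judgment documents based on deep learning; 黄煜俊; Social Sciences I (No. 04); pp. 24-52 * |
Research on key technologies of news aggregation based on automatic summarization; 周华健; Information Science and Technology; full text * |
Judgment document segmentation system based on natural semantic processing; 郑少婉; 陆培民; Information Technology and Network Security (02); full text * |
Research on summary generation methods for judicial judgment documents incorporating legal provisions; 魏鑫炀; Information Science and Technology; full text * |
Research on text summarization algorithms for legal documents; 王刚; Information Science and Technology; full text * |
Research on named entity recognition for legal texts; 徐江南; Information Science and Technology (No. 09); pp. 27-34 * |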
Also Published As
Publication number | Publication date |
---|---|
CN116484010A (en) | 2023-07-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||