
CN111930959A - Method and device for generating text by using map knowledge - Google Patents

Info

Publication number
CN111930959A
Authority
CN
China
Prior art keywords
entity
word
attribute value
list
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010674570.4A
Other languages
Chinese (zh)
Other versions
CN111930959B (en)
Inventor
薛小娜
牟小峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Minglue Artificial Intelligence Group Co Ltd
Original Assignee
Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority to CN202010674570.4A
Publication of CN111930959A
Application granted
Publication of CN111930959B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a method and a device for generating text from graph knowledge. The method comprises the following steps. Step S1: construct a graph database from existing knowledge, the graph database comprising a triple library and a binary-tuple library; each triple comprises a first entity, a relation word and a second entity; each binary tuple comprises a relation word or a second entity, together with an attribute value. Step S2: for any triple in the triple library, construct the preceding-text information of the triple. Step S3: filter the attribute values of the relation word according to the preceding-text information of the triple to obtain the filtered relation-word attribute value, or filter the attribute values of the second entity according to the preceding-text information of the triple to obtain the filtered second-entity attribute value. Step S4: select a text generation template according to the relation word. Step S5: fill the first entity, the relation word, the second entity, and the filtered relation-word attribute value and/or the filtered second-entity attribute value into the text generation template to generate a text.

Description

Method and device for generating text by using map knowledge
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a method and a device for generating text from graph knowledge.
Background
Information extraction technology extracts relation information from text and stores it in a structured form, such as binary tuples or triples, which saves storage cost and facilitates graph display and information retrieval.
In the process of implementing the embodiments of the present disclosure, at least the following problem was found in the related art: tuple information stored in a structured form can enrich a knowledge graph, but it is fragmented and unordered, so a user cannot readily understand the graph.
Disclosure of Invention
The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview and is intended neither to identify key/critical elements nor to delineate the scope of such embodiments; it serves as a prelude to the more detailed description that is presented later.
The embodiments of the present disclosure provide a method and a device for generating text from graph knowledge, so as to solve the above technical problem to a certain extent.
In some embodiments, a method for generating text from graph knowledge comprises: step S1: constructing a graph database from existing knowledge, the graph database comprising a triple library and a binary-tuple library; each triple comprises a first entity, a relation word and a second entity; each binary tuple comprises a relation word or a second entity, together with an attribute value; step S2: for any triple in the triple library, constructing the preceding-text information of the triple; step S3: filtering the attribute values of the relation word according to the preceding-text information of the triple to obtain the filtered relation-word attribute value, or filtering the attribute values of the second entity according to the preceding-text information of the triple to obtain the filtered second-entity attribute value; step S4: selecting a text generation template according to the relation word; step S5: filling the first entity, the second entity, the filtered second-entity attribute value, and the filtered relation-word attribute value and/or the relation word into the text generation template to generate a text.
Optionally, step S3 further includes: step S31: for any relation-word attribute value in the relation-word attribute value list, or any second-entity attribute value in the second-entity attribute value list, obtaining the feature value list of that attribute value and counting the words in the intersection of the feature value list and the preceding-text information of the triple; when the number of intersecting words is greater than a first preset threshold, adding the current relation-word attribute value to the relation-word attribute value confirmation list, or adding the current second-entity attribute value to the second-entity attribute value confirmation list; step S32: determining the filtered relation-word attribute value or the filtered second-entity attribute value according to the number of words in the relation-word attribute value confirmation list or in the second-entity attribute value confirmation list.
Optionally, determining the filtered relation-word attribute value or second-entity attribute value in step S32 further includes: when the relation-word attribute value confirmation list contains 0 words, the filtered relation-word attribute value is empty, or when the second-entity attribute value confirmation list contains 0 words, the filtered second-entity attribute value is empty; when the relation-word attribute value confirmation list contains 1 word, the filtered relation-word attribute value is that word, or when the second-entity attribute value confirmation list contains 1 word, the filtered second-entity attribute value is that word; and when the relation-word attribute value confirmation list contains 2 or more words, the filtered relation-word attribute value is the words of that list joined by enumeration commas and terminated with the word "etc.", or when the second-entity attribute value confirmation list contains 2 or more words, the filtered second-entity attribute value is the words of that list joined by enumeration commas and terminated with the word "etc.".
Optionally, step S31 further includes obtaining the feature value list of the relation-word attribute value or the second-entity attribute value, which includes: establishing a first word list comprising the deduplicated words of the binary-tuple library and the triple library; establishing a second word list by performing word segmentation on the text information; taking the intersection of the first word list and the second word list and ordering it by the word order of the second word list to obtain a third word list; in the third word list, the N words preceding the relation-word attribute value form the feature value list of the current relation-word attribute value, or the N words preceding the second-entity attribute value form the feature value list of the current second-entity attribute value.
Optionally, the preceding-text information of the triple includes: the preceding-triple information, and the first entity and the relation word of the current triple.
Optionally, the preceding-triple information includes: the first entity of the preceding triple, the relation word of the preceding triple, the attribute value of the second entity of the preceding triple, and the second entity of the preceding triple; when the current triple is the first triple, the preceding-triple information is empty.
Optionally, step S4 further includes: performing word segmentation and part-of-speech tagging on the relation word, and selecting a text generation template according to the number of segments and the parts of speech of the relation word.
Optionally, selecting a text generation template according to the number of segments and the parts of speech of the relation word includes: when the relation word cannot be segmented further and is a verb or a preposition, the text generation template comprises: the first entity, the filtered relation-word attribute value, the relation word, the filtered second-entity attribute value and the second entity; when the relation word cannot be segmented further and is a noun, the text generation template comprises: the first entity, the relation word, the filtered second-entity attribute value and the second entity.
Optionally, selecting a text generation template according to the number of segments and the parts of speech of the relation word includes: when the relation word is segmented into a preposition and a verb, the text generation template comprises: the first entity, the relation preposition, the filtered second-entity attribute value, the second entity, the filtered relation-word attribute value and the relation verb.
In some embodiments, a device for generating text from graph knowledge comprises a processor and a memory storing program instructions, the processor being configured to perform the aforementioned method for generating text from graph knowledge when executing the program instructions.
The method and the device for generating text from graph knowledge provided by the embodiments of the present disclosure can achieve the following technical effect:
coherent, fluent and meaningful text is generated from the knowledge graph, which helps a user understand the knowledge graph better.
The foregoing general description and the following description are exemplary and explanatory only and are not restrictive of the application.
Drawings
One or more embodiments are illustrated by way of example in the accompanying drawings. These illustrations do not limit the embodiments; elements with the same reference numerals denote like elements, and wherein:
FIG. 1 is a schematic flowchart of a method for generating text from graph knowledge provided by an embodiment of the present disclosure;
FIG. 2 is a knowledge graph constructed from the graph database of the first text information according to an embodiment of the present disclosure.
Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
In the embodiments of the present disclosure, "a plurality" means two or more unless specifically defined otherwise.
FIG. 1 is a schematic flowchart of a method for generating text from graph knowledge according to an embodiment of the present disclosure. As shown in FIG. 1, the method includes: step S1: constructing a graph database from existing knowledge, the graph database comprising a triple library and a binary-tuple library, each triple comprising a first entity, a relation word and a second entity, and each binary tuple comprising a relation word or a second entity, together with an attribute value; step S2: for any triple in the triple library, constructing the preceding-text information of the triple; step S3: filtering the attribute values of the relation word according to the preceding-text information of the triple to obtain the filtered relation-word attribute value, or filtering the attribute values of the second entity according to the preceding-text information of the triple to obtain the filtered second-entity attribute value; step S4: selecting a text generation template according to the relation word; step S5: filling the first entity, the second entity, the filtered second-entity attribute value, and the filtered relation-word attribute value and/or the relation word into the text generation template to generate a text. Here, a binary tuple comprises either a relation word and its attribute value, or a second entity and its attribute value. Step S5 includes: filling the first entity, the second entity, the filtered second-entity attribute value and the filtered relation-word attribute value into the text generation template; or filling the first entity, the second entity, the filtered second-entity attribute value, the filtered relation-word attribute value and the relation word into the text generation template.
The following text information is used as an example to illustrate the method for generating text from graph knowledge provided by the embodiments of the present disclosure; the example does not limit the embodiments. The first text information is "Group A is dedicated to exploring the landing of new-generation artificial intelligence technology in the catering industry". The second text information is "Project B, by carrying out work such as information extraction, text information mining and semantic analysis, promotes the landing of applications such as the public-opinion knowledge graph, intelligent risk control, intelligent question answering and intelligent auditing".
S1: construct a graph database from the existing text information. The graph database comprises the triple library triplesData: the first triple triple1 = <Group A, explore, landing>, the second triple triple2 = <landing, in, industry>, the third triple triple3 = <Project B, through, information extraction>, the fourth triple triple4 = <Project B, through, text information mining>, the fifth triple triple5 = <Project B, through, semantic analysis>, and the sixth triple triple6 = <Project B, promote, landing>. It also comprises the binary-tuple library tuplesData: the first binary tuple <landing: artificial intelligence technology>, the second binary tuple <industry: catering>, the third binary tuple <landing: public-opinion knowledge graph>, the fourth binary tuple <landing: intelligent risk control>, the fifth binary tuple <landing: intelligent question answering>, and the sixth binary tuple <landing: intelligent auditing>. FIG. 2 is a knowledge graph constructed from the graph database of the first text information. As shown in FIG. 2, the knowledge graph is built from triple1, triple2, the first binary tuple and the second binary tuple; entities are connected by relation words, and a second entity or relation word is connected to its attribute value by a mod connection.
S2: for any triple in the triple library, construct the preceding-text information preTextList of the triple. For example, the preceding-text information of triple1 is preTextList1 = ['Group A', 'explore'], and the preceding-text information of triple2 is preTextList2 = ['Group A', 'explore', 'artificial intelligence technology', 'landing', 'in'].
S3: filter the attribute values of the relation word according to the preceding-text information of the triple to obtain the filtered relation-word attribute value, or filter the attribute values of the second entity according to the preceding-text information of the triple to obtain the filtered second-entity attribute value. For example, the attribute values of the second entity "landing" in triple1 are filtered according to preTextList2, and the filtered second-entity attribute value obtained is "artificial intelligence technology".
S4: select a text generation template according to the relation word. For example, for triple1 = <Group A, explore, landing>, the first text generation template template1 = "first entity + filtered relation-word attribute value + relation word + filtered second-entity attribute value + 'of' + second entity" is selected according to the relation word "explore".
S5: fill the first entity, the relation word, the second entity, and the filtered relation-word attribute value and/or the filtered second-entity attribute value into the text generation template to generate a text. For example, filling these items for triple1 into template1 generates the text "Group A explores the landing of artificial intelligence technology".
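The graph database of step S1 can be pictured as two plain collections. The following is a minimal sketch, not the patent's implementation: the class names `Triple` and `BinaryTuple`, the helper `graph_edges`, and the English renderings of the example words are all assumptions made for illustration. Only the "mod" connection label and the library names come from the description.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    first_entity: str
    relation_word: str
    second_entity: str


class BinaryTuple(NamedTuple):
    word: str       # a relation word or a second entity
    attribute: str  # the attribute value attached to that word


# Triple library for the two example sentences (English renderings assumed).
triples_data = [
    Triple("Group A", "explore", "landing"),
    Triple("landing", "in", "industry"),
    Triple("Project B", "through", "information extraction"),
    Triple("Project B", "through", "text information mining"),
    Triple("Project B", "through", "semantic analysis"),
    Triple("Project B", "promote", "landing"),
]

# Binary-tuple library: each word with one of its attribute values.
tuples_data = [
    BinaryTuple("landing", "artificial intelligence technology"),
    BinaryTuple("industry", "catering"),
    BinaryTuple("landing", "public-opinion knowledge graph"),
    BinaryTuple("landing", "intelligent risk control"),
    BinaryTuple("landing", "intelligent question answering"),
    BinaryTuple("landing", "intelligent auditing"),
]


def graph_edges(triples, tuples):
    """Return (source, label, target) edges as they would be drawn in a graph."""
    edges = [(t.first_entity, t.relation_word, t.second_entity) for t in triples]
    # Attribute values hang off their word through a "mod" connection.
    edges += [(b.word, "mod", b.attribute) for b in tuples]
    return edges
```

Calling `graph_edges(triples_data[:2], tuples_data[:2])` lists the four edges that correspond to the graph described for the first text information.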
Coherent, fluent and meaningful text is generated from the knowledge graph, which helps a user understand the knowledge graph better.
In some embodiments, the different attribute values corresponding to the same second entity or relation word may be grouped based on tuplesData, so that "word: attribute1" and "word: attribute2" are organized as "word: ['attribute1', 'attribute2']". For example, grouping the attribute values of the entity "landing" gives the attribute value list ['artificial intelligence technology', 'public-opinion knowledge graph', 'intelligent risk control', 'intelligent question answering', 'intelligent auditing']. This makes it convenient to look up the attribute values of each entity or relation word.
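The grouping described here is an ordinary group-by on the first element of each binary tuple. A minimal sketch follows; the function name `group_attributes` and the English example words are assumptions, and skipping repeated attribute values is a choice the description does not spell out.

```python
from collections import defaultdict


def group_attributes(tuples_data):
    """Group attribute values by the word (second entity or relation word) they modify."""
    grouped = defaultdict(list)
    for word, attribute in tuples_data:
        # Keep first-seen order and skip repeats (an assumption, see lead-in).
        if attribute not in grouped[word]:
            grouped[word].append(attribute)
    return dict(grouped)


tuples_data = [
    ("landing", "artificial intelligence technology"),
    ("industry", "catering"),
    ("landing", "public-opinion knowledge graph"),
    ("landing", "intelligent risk control"),
    ("landing", "intelligent question answering"),
    ("landing", "intelligent auditing"),
]
attribute_lists = group_attributes(tuples_data)
```

With the example library, `attribute_lists["landing"]` holds the five attribute values of "landing" in the order they were stored.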
In some embodiments, step S3 further includes: S31, for any relation-word attribute value in the relation-word attribute value list, or any second-entity attribute value in the second-entity attribute value list, obtaining the feature value list of that attribute value and counting the words in the intersection of the feature value list and the preceding-text information of the triple; when the number of intersecting words is greater than a first preset threshold, adding the current relation-word attribute value to the relation-word attribute value confirmation list, or adding the current second-entity attribute value to the second-entity attribute value confirmation list; S32, determining the filtered relation-word attribute value or the filtered second-entity attribute value according to the number of words in the relation-word attribute value confirmation list or in the second-entity attribute value confirmation list. The first preset threshold may be set to 3, and those skilled in the art may set it to other values as required.
For example, for triple2, the feature value list of the attribute value "catering" of the second entity "industry" is att_i_features = ['Group A', 'explore', 'artificial intelligence technology', 'in'], and the preceding-text information of triple2 is preTextList2 = ['Group A', 'explore', 'artificial intelligence technology', 'landing', 'in']. The number of words in the intersection of att_i_features and preTextList2 is 4, which is greater than the first preset threshold, so the current second-entity attribute value is added to the second-entity attribute value confirmation list entry2Att_List; that is, "catering" is added to the attribute value confirmation list of "industry". Since entry2Att_List of the second entity "industry" then contains 1 word, the filtered second-entity attribute value is determined to be "catering".
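Step S31 amounts to a set-intersection count followed by a threshold test. The sketch below assumes the feature value lists are already available as a dictionary keyed by attribute value; the names `confirm_attributes` and `features_by_attribute` are hypothetical, and the default threshold of 3 is the value the description suggests.

```python
def confirm_attributes(candidates, features_by_attribute, pre_text_list, threshold=3):
    """Return the attribute values whose feature list overlaps the context enough.

    candidates: attribute values of one relation word or second entity.
    features_by_attribute: attribute value -> list of feature words.
    pre_text_list: preceding-text information of the current triple.
    """
    context = set(pre_text_list)
    confirmed = []
    for attribute in candidates:
        features = set(features_by_attribute.get(attribute, []))
        # Keep the value only when the overlap is strictly greater than the threshold.
        if len(features & context) > threshold:
            confirmed.append(attribute)
    return confirmed


features_by_attribute = {
    "catering": ["Group A", "explore", "artificial intelligence technology", "in"],
}
pre_text_list_2 = ["Group A", "explore", "artificial intelligence technology", "landing", "in"]
confirmed = confirm_attributes(["catering"], features_by_attribute, pre_text_list_2)
```

Here the overlap is 4 words, so "catering" passes the default threshold; raising `threshold` to 4 would reject it because the comparison is strict.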
In some embodiments, determining the filtered relation-word attribute value or second-entity attribute value in step S32 further includes: when the relation-word attribute value confirmation list contains 0 words, the filtered relation-word attribute value is empty, or when the second-entity attribute value confirmation list contains 0 words, the filtered second-entity attribute value is empty; when the relation-word attribute value confirmation list contains 1 word, the filtered relation-word attribute value is that word, or when the second-entity attribute value confirmation list contains 1 word, the filtered second-entity attribute value is that word; and when the relation-word attribute value confirmation list contains 2 or more words, the filtered relation-word attribute value is the words of that list joined by enumeration commas and terminated with the word "etc.", or when the second-entity attribute value confirmation list contains 2 or more words, the filtered second-entity attribute value is the words of that list joined by enumeration commas and terminated with the word "etc.".
For example, when the second-entity attribute value confirmation list entry2Att_List contains 0 words, the filtered second-entity attribute value entry2AttStr is empty, that is, entry2AttStr = "". For triple2, entry2Att_List of the second entity "industry" contains 1 word, so the filtered second-entity attribute value is the word "catering" in the confirmation list. Following the above steps for triple6 = <Project B, promote, landing>, entry2Att_List = ['public-opinion knowledge graph', 'intelligent risk control', 'intelligent question answering', 'intelligent auditing'] contains 4 words, which is at least 2, so the filtered second-entity attribute value entry2AttStr is "public-opinion knowledge graph, intelligent risk control, intelligent question answering, intelligent auditing, etc.".
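The three cases of step S32 can be written as one small formatting function. This is a sketch under assumptions: the name `format_confirmed` is hypothetical, and the English separator ", " and suffix ", etc." stand in for the enumeration comma and closing word used in the original language.

```python
def format_confirmed(confirmed, separator=", ", suffix=", etc."):
    """Turn a confirmation list into the filtered attribute value string."""
    if len(confirmed) == 0:
        return ""               # nothing confirmed: the filtered value is empty
    if len(confirmed) == 1:
        return confirmed[0]     # exactly one word: use it as is
    # Two or more words: join them and close with the "etc." word.
    return separator.join(confirmed) + suffix


entry2_att_str = format_confirmed([
    "public-opinion knowledge graph",
    "intelligent risk control",
    "intelligent question answering",
    "intelligent auditing",
])
```

Making the separator and suffix parameters keeps the function usable for other output languages without changing the case analysis.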
In some embodiments, step S31 further includes obtaining the feature value list of the relation-word attribute value or the second-entity attribute value, which includes: establishing a first word list comprising the deduplicated words of the binary-tuple library and the triple library; establishing a second word list by performing word segmentation on the text information; taking the intersection of the first word list and the second word list and ordering it by the word order of the second word list to obtain a third word list; in the third word list, the N words preceding the relation-word attribute value form the feature value list of the current relation-word attribute value, or the N words preceding the second-entity attribute value form the feature value list of the current second-entity attribute value. N may be set to 5, and those skilled in the art may also set the value of N according to actual requirements.
For example, the first word list built for the first text information is ['Group A', 'explore', 'landing', 'in', 'industry', 'artificial intelligence technology', 'catering']. Word segmentation is performed on the first text information using a dictionary to obtain the second word list ['Group A', 'dedicated', 'to', 'explore', 'new', 'one', 'generation', 'artificial intelligence technology', 'in', 'catering', 'industry', 'of', 'landing'], where "the Chinese grammar information dictionary" can be used as the basic reference for word segmentation and part-of-speech tagging. The intersection of the first word list and the second word list is ['Group A', 'explore', 'landing', 'in', 'industry', 'artificial intelligence technology', 'catering']; ordering it by the word order of the second word list gives the third word list ['Group A', 'explore', 'artificial intelligence technology', 'in', 'catering', 'industry', 'landing']. For the first binary tuple <landing: artificial intelligence technology>, the words "Group A" and "explore" preceding "artificial intelligence technology" form the feature value list corresponding to "artificial intelligence technology"; for the second binary tuple <industry: catering>, the words "Group A", "explore", "artificial intelligence technology" and "in" preceding "catering" form the feature value list corresponding to "catering". In the graph database, one attribute value may correspond to a plurality of feature values, which indicates that the attribute value appears in different contexts.
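The construction of the third word list and of a feature value list can be sketched as follows. The function names are hypothetical, the word lists are English renderings of the example, and the segmentation itself is taken as given (no real segmenter is called).

```python
def build_third_word_list(first_words, second_words):
    """Intersect the two word lists, keeping the order of the segmented text."""
    vocabulary = set(first_words)
    seen = set()
    third = []
    for word in second_words:
        if word in vocabulary and word not in seen:
            seen.add(word)
            third.append(word)
    return third


def feature_list(third_words, attribute, n=5):
    """Return up to n words that precede the attribute value in the third word list."""
    index = third_words.index(attribute)  # raises ValueError if the value is absent
    return third_words[max(0, index - n):index]


first_words = ["Group A", "explore", "landing", "in", "industry",
               "artificial intelligence technology", "catering"]
second_words = ["Group A", "dedicated", "to", "explore", "new", "one", "generation",
                "artificial intelligence technology", "in", "catering", "industry",
                "of", "landing"]
third_words = build_third_word_list(first_words, second_words)
```

With N = 5, `feature_list(third_words, "catering")` returns the four words that precede "catering", and `feature_list(third_words, "artificial intelligence technology")` returns the two words before it.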
In some embodiments, the feature value list corresponding to an attribute value in tuplesData has the format "attribute: [feature_1, feature_2, …, feature_m], m is equal to or greater than 1", where feature_m is the m-th feature value of attribute. For example, the feature value list corresponding to the attribute value "artificial intelligence technology" of "landing" is ['Group A', 'explore'].
In some embodiments, the preceding-text information of the triple includes: the preceding-triple information, and the first entity and the relation word of the current triple.
In some embodiments, the preceding-triple information includes: the first entity of the preceding triple, the relation word of the preceding triple, the attribute value of the second entity of the preceding triple, and the second entity of the preceding triple; when the current triple is the first triple, the preceding-triple information is empty.
For example, consider triple1 = <Group A, explore, landing> and triple2 = <landing, in, industry>. triple1 is the first triple, so its preceding-triple information is preTriple = [] and its preceding-text information is preTextList = ['Group A', 'explore']. For triple2, the preceding-triple information is preTriple = ['Group A', 'explore', 'artificial intelligence technology', 'landing'] and the preceding-text information is preTextList = ['Group A', 'explore', 'artificial intelligence technology', 'landing', 'in'].
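The two context lists can be built with a pair of small helpers. This is a sketch only: the function names are hypothetical, and dropping a word that is already present (here the repeated "landing") is an assumption made so that the result matches the five-word preTextList given for triple2.

```python
def pre_triple_info(previous_triple, previous_second_attribute=""):
    """Preceding-triple information; empty when there is no preceding triple."""
    if previous_triple is None:
        return []
    first, relation, second = previous_triple
    info = [first, relation]
    if previous_second_attribute:
        info.append(previous_second_attribute)
    info.append(second)
    return info


def pre_text_list(pre_triple, current_triple):
    """Preceding-triple information plus the current first entity and relation word."""
    first, relation, _second = current_triple
    merged = []
    for word in pre_triple + [first, relation]:
        if word not in merged:  # assumed de-duplication, see lead-in
            merged.append(word)
    return merged


triple1 = ("Group A", "explore", "landing")
triple2 = ("landing", "in", "industry")
pre_text_1 = pre_text_list(pre_triple_info(None), triple1)
pre_triple_2 = pre_triple_info(triple1, "artificial intelligence technology")
pre_text_2 = pre_text_list(pre_triple_2, triple2)
```

For the first triple the context reduces to its own first entity and relation word; for triple2 it carries over everything generated for triple1.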
In some embodiments, step S4 further includes: performing word segmentation and part-of-speech tagging on the relation word, and selecting a text generation template according to the number of segments and the parts of speech of the relation word.
In some embodiments, selecting a text generation template according to the number of segments and the parts of speech of the relation word includes the following. When the relation word cannot be segmented further and is a verb or a preposition, the text generation template comprises the first entity, the filtered relation-word attribute value, the relation word, the filtered second-entity attribute value and the second entity; the first text generation template template1 is "first entity + filtered relation-word attribute value + relation word + filtered second-entity attribute value + 'of' + second entity". When the relation word cannot be segmented further and is a noun, the text generation template comprises the first entity, the relation word, the filtered second-entity attribute value and the second entity; the second text generation template template2 is "first entity + 'of' + relation word + 'is' + filtered second-entity attribute value + second entity". When the relation word is segmented into a preposition and a verb, the text generation template comprises the first entity, the relation preposition, the filtered second-entity attribute value, the second entity, the filtered relation-word attribute value and the relation verb; the third text generation template template3 is "first entity + relation preposition + filtered second-entity attribute value + second entity + filtered relation-word attribute value + relation verb". Here 'of' and 'is' are literal connecting words.
For example, for triple1 ═ group a, explore, land >, "explore" cannot be participled, and is a verb, the selection template is "+ second entity" of "first entity + screened relation word attribute value + relation word + screened second entity attribute value +"; and filling the first entity, the second entity, the attribute value of the screened relation word and the relation word into a text generation template, wherein the generated text is 'A group exploration artificial intelligence technology landing'. For the triples < ' X solution ', ' foundation ', ' modern information technology ', ' the screened relation word attribute value is null, the screened second entity attribute value is ' cloud computing, Internet of things, big data, a mobile terminal, an artificial intelligence technology and the like ', ' the foundation ' cannot be participled and is a noun, and the selected template is ' the first entity + ' and ' the relation word + ' is ' the + the screened second entity attribute value + the second entity '; filling attribute values and relation words of the first entity, the second entity and the screened second entity into a text generation template, wherein the generated text is the basis of the X solution and is based on the modern information technologies such as cloud computing, Internet of things, big data, mobile terminals and artificial intelligence technologies. 
For the triple <'A group', 'let operate', 'organization internal'>, the screened relation word attribute value is 'efficient' and the screened second entity attribute value is 'group'; 'let operate' can be segmented into r1 = 'let' and r2 = 'operate', where r1 is a preposition and r2 is a verb, so the selected template is "first entity + relation preposition + screened second entity attribute value + second entity + screened relation word attribute value + relation verb"; the first entity, the second entity, the screened second entity attribute value, the screened relation word attribute value and the segmented relation word are filled into the text generation template, and the generated text is "A group lets the interior of the group organization operate efficiently". The ways of selecting and organizing templates according to the number of segmented words and the part of speech of the relation word are not limited to those provided by the embodiments of the present disclosure; those skilled in the art can set and select templates according to actual needs.
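The template selection and filling described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the function names, the tag labels "v", "p" and "n", and the literal connectors "of" and "is" are hypothetical, and word segmentation and part-of-speech tagging are assumed to be done beforehand by an external tool. English tokens stand in for the original words, so the output follows the template's slot order rather than natural English word order.

```python
from typing import List, Tuple

# (word, tag) pairs produced by an external segmenter / POS tagger.
# The tags "v", "p", "n" are hypothetical labels for verb, preposition, noun.
TaggedRelation = List[Tuple[str, str]]


def select_template(tagged_relation: TaggedRelation) -> str:
    """Choose a template from the number of segments and their parts of speech."""
    tags = [tag for _, tag in tagged_relation]
    if len(tags) == 1 and tags[0] in ("v", "p"):
        return "template1"  # unsegmentable verb or preposition
    if len(tags) == 1 and tags[0] == "n":
        return "template2"  # unsegmentable noun
    if tags == ["p", "v"]:
        return "template3"  # preposition followed by verb
    raise ValueError(f"no template defined for tags {tags}")


def generate_text(first_entity: str, tagged_relation: TaggedRelation,
                  second_entity: str, rel_attr: str = "",
                  ent_attr: str = "") -> str:
    """Fill the selected template; empty (null) attribute values are skipped."""
    name = select_template(tagged_relation)
    words = [word for word, _ in tagged_relation]
    if name == "template1":
        parts = [first_entity, rel_attr, words[0], ent_attr, "of", second_entity]
    elif name == "template2":
        parts = [first_entity, "of", words[0], "is", ent_attr, second_entity]
    else:
        preposition, verb = words
        parts = [first_entity, preposition, ent_attr, second_entity, rel_attr, verb]
    # Joined with spaces for readability; Chinese text would be concatenated.
    return " ".join(part for part in parts if part)


# Third example from the description: relation segmented into "let" + "operate".
print(generate_text("A group", [("let", "p"), ("operate", "v")],
                    "organization internal",
                    rel_attr="efficient", ent_attr="group"))
```

Keeping the selection step separate from the filling step means further templates can be added, as the description allows, without touching the existing branches.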
Embodiments of the present disclosure provide an apparatus for generating text from map knowledge, comprising a processor and a memory storing program instructions, wherein the processor is configured to perform the above method for generating text from map knowledge when executing the program instructions.
By using the knowledge graph, coherent, fluent and meaningful text is generated, which helps a user understand the knowledge graph better.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention in any other form; those skilled in the art may make modifications and variations to the above without departing from the spirit of the present invention.

Claims (10)

1. A method for generating text from map knowledge, comprising:
step S1: constructing a graph database from prior knowledge, the graph database comprising a triple library and a two-tuple library, wherein each triple comprises a first entity, a relation word and a second entity, and each two-tuple comprises an attribute value together with either a relation word or a second entity;
step S2: constructing triple context information for any triple in the triple library;
step S3: screening the relation word attribute values according to the triple context information to obtain a screened relation word attribute value, or screening the second entity attribute values according to the triple context information to obtain a screened second entity attribute value;
step S4: selecting a text generation template according to the relation word;
step S5: filling the first entity, the second entity, the screened second entity attribute value, and the screened relation word attribute value and/or the relation word into the text generation template to generate a text.
2. The method according to claim 1, wherein the step S3 further comprises:
step S31: for any relation word attribute value in a relation word attribute value list, or any second entity attribute value in a second entity attribute value list, obtaining the feature value list of that attribute value and calculating the number of words in the intersection of the feature value list and the triple context information; when the number of intersection words is greater than a first preset threshold, adding the current relation word attribute value to a relation word attribute value confirmation list, or adding the current second entity attribute value to a second entity attribute value confirmation list;
step S32: determining the screened relation word attribute value or the screened second entity attribute value according to the number of words in the relation word attribute value confirmation list or in the second entity attribute value confirmation list.
3. The method according to claim 2, wherein step S32 further comprises:
when the number of words in the relation word attribute value confirmation list is 0, the screened relation word attribute value is null, or when the number of words in the second entity attribute value confirmation list is 0, the screened second entity attribute value is null;
when the number of words in the relation word attribute value confirmation list is 1, the screened relation word attribute value is the word in the relation word attribute value confirmation list, or when the number of words in the second entity attribute value confirmation list is 1, the screened second entity attribute value is the word in the second entity attribute value confirmation list;
when the number of words in the relation word attribute value confirmation list is greater than or equal to 2, the screened relation word attribute value is the words in the relation word attribute value confirmation list joined by enumeration commas and ending with the word "etc.", or when the number of words in the second entity attribute value confirmation list is greater than or equal to 2, the screened second entity attribute value is the words in the second entity attribute value confirmation list joined by enumeration commas and ending with the word "etc.".
4. The method according to claim 2, wherein the step S31 further comprises:
establishing a first word list, wherein the first word list comprises the de-duplicated words of the two-tuple library and the triple library;
establishing a second word list by performing word segmentation on the text information;
taking the intersection of the first word list and the second word list, and ordering the intersection according to the word order of the second word list to obtain a third word list;
in the third word list, the N words preceding a relation word attribute value form the feature value list corresponding to the current relation word attribute value, or the N words preceding a second entity attribute value form the feature value list corresponding to the current second entity attribute value.
5. The method according to claim 1 or 2, wherein the triple context information comprises: preceding-triple information, and the first entity and the relation word of the current triple.
6. The method of claim 5, wherein the preceding-triple information comprises:
the first entity of the preceding triple, the relation word of the preceding triple, the second entity attribute value of the preceding triple and the second entity of the preceding triple;
and when the current triple is the first triple, the preceding-triple information is null.
7. The method according to claim 1, wherein the step S4 further comprises:
performing word segmentation and part-of-speech tagging on the relation word, and selecting a text generation template according to the number of segmented words and the part of speech of the relation word.
8. The method of claim 7, wherein selecting different text generation templates according to the number of segmented words and the part of speech of the relation word comprises:
when the relation word cannot be segmented further and is a verb or a preposition, the text generation template comprises: the first entity, the screened relation word attribute value, the relation word, the screened second entity attribute value and the second entity;
when the relation word cannot be segmented further and is a noun, the text generation template comprises: the first entity, the relation word, the screened second entity attribute value and the second entity.
9. The method of claim 7, wherein selecting different text generation templates according to the number of segmented words and the part of speech of the relation word comprises:
when the relation word is segmented into a preposition and a verb, the text generation template comprises: the first entity, the relation preposition, the screened second entity attribute value, the second entity, the screened relation word attribute value and the relation verb.
10. An apparatus for generating text from map knowledge, comprising a processor and a memory storing program instructions, wherein the processor is configured to perform the method for generating text from map knowledge according to any of claims 1 to 9 when executing the program instructions.
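The attribute-value screening recited in claims 2 to 4 can be illustrated with a short Python sketch. It is a minimal reading of the claim language under assumptions, not the patented implementation: all function names, the values of N and the threshold, the English separator ", " and closing word " etc.", and the toy word lists are hypothetical, and the third word list is assumed to keep only the first occurrence of each shared word.

```python
from typing import Iterable, List, Sequence


def build_third_word_list(library_words: Iterable[str],
                          text_words: Sequence[str]) -> List[str]:
    """Intersect the library words with the segmented text, keeping text order."""
    first = set(library_words)      # first word list, de-duplicated
    seen, third = set(), []
    for word in text_words:         # second word list, in text order
        if word in first and word not in seen:
            seen.add(word)
            third.append(word)
    return third


def feature_values(third: Sequence[str], attr_value: str, n: int) -> List[str]:
    """Return up to n words that precede attr_value in the third word list."""
    if attr_value not in third:
        return []
    idx = third.index(attr_value)
    return list(third[max(0, idx - n):idx])


def screen(attr_values: Sequence[str], third: Sequence[str],
           context_words: Iterable[str], n: int = 2, threshold: int = 1,
           sep: str = ", ", suffix: str = " etc.") -> str:
    """Steps S31 and S32: confirm attribute values, then format the result."""
    context = set(context_words)
    confirmed = [value for value in attr_values
                 if len(set(feature_values(third, value, n)) & context) > threshold]
    if not confirmed:
        return ""                   # empty confirmation list: null
    if len(confirmed) == 1:
        return confirmed[0]         # single confirmed value
    return sep.join(confirmed) + suffix


# Toy data, invented for illustration only.
text_words = ["A group", "explore", "artificial intelligence technology",
              "landing", "X solution", "foundation", "big data",
              "modern information technology"]
third = build_third_word_list(text_words, text_words)
context = ["A group", "explore"]    # first entity and relation word of the triple
print(screen(["artificial intelligence technology", "big data"], third, context))
```

In this toy run only the candidate whose preceding words overlap the triple context by more than the threshold is confirmed, so a single value is returned without a separator or closing word.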
CN202010674570.4A 2020-07-14 2020-07-14 Method and device for generating text by map knowledge Active CN111930959B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010674570.4A CN111930959B (en) 2020-07-14 2020-07-14 Method and device for generating text by map knowledge

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010674570.4A CN111930959B (en) 2020-07-14 2020-07-14 Method and device for generating text by map knowledge

Publications (2)

Publication Number Publication Date
CN111930959A (en) 2020-11-13
CN111930959B (en) 2024-02-09

Family

ID=73314069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010674570.4A Active CN111930959B (en) 2020-07-14 2020-07-14 Method and device for generating text by map knowledge

Country Status (1)

Country Link
CN (1) CN111930959B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156083A (en) * 2015-03-31 2016-11-23 联想(北京)有限公司 A kind of domain knowledge processing method and processing device
CN107908671A (en) * 2017-10-25 2018-04-13 南京擎盾信息科技有限公司 Knowledge mapping construction method and system based on law data
CN108804521A (en) * 2018-04-27 2018-11-13 南京柯基数据科技有限公司 A kind of answering method and agricultural encyclopaedia question answering system of knowledge based collection of illustrative plates
CN109241286A (en) * 2018-09-21 2019-01-18 百度在线网络技术(北京)有限公司 Method and apparatus for generating text
CN109684394A (en) * 2018-12-13 2019-04-26 北京百度网讯科技有限公司 Document creation method, device, equipment and storage medium
CN110347898A (en) * 2019-06-28 2019-10-18 北京牡丹电子集团有限责任公司宁安智慧工程中心 A kind of the response generation method and system of network public-opinion monitoring
CN110489755A (en) * 2019-08-21 2019-11-22 广州视源电子科技股份有限公司 Document creation method and device
CN110750975A (en) * 2019-10-21 2020-02-04 北京明略软件系统有限公司 Introduction text generation method and device
CN110765753A (en) * 2019-12-27 2020-02-07 广东博智林机器人有限公司 Method, system, computer device and storage medium for generating file

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559761A (en) * 2020-12-07 2021-03-26 上海明略人工智能(集团)有限公司 Method and system for generating text based on map, electronic equipment and storage medium
CN112559761B (en) * 2020-12-07 2024-04-09 上海明略人工智能(集团)有限公司 Atlas-based text generation method, atlas-based text generation system, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN111930959B (en) 2024-02-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant