CN108959248A - Entity annotation method and device, and computer-readable storage medium - Google Patents
- Publication number
- CN108959248A (application number CN201810643729.9A)
- Authority
- CN
- China
- Prior art keywords
- segmentation
- rule
- text
- word
- entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/103—Formatting, i.e. changing of presentation of documents
- G06F40/106—Display of layout of documents; Previewing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
This application discloses an entity annotation method and device, and a computer-readable storage medium. The method comprises: segmenting the text to be annotated according to preset rules; listening for and receiving the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, detecting whether the offset position in the text at which the mouse is currently located falls within one of the segments; and, if it does, selecting that segment and displaying it in the selected state. By segmenting the text to be annotated according to preset rules and selecting the resulting segments with a predefined word-selection operation, the application removes the need for the click-and-drag mouse movement that entity annotation originally required to select label content, and substantially improves the efficiency of selecting label text.
Description
Technical field
The present invention relates to the technical field of natural language processing (NLP), and in particular to an entity annotation method and device, and a computer-readable storage medium.
Background
With the spread of big data and artificial intelligence (AI), technologies related to natural language processing are used more and more in enterprise-level applications. Many large companies now offer Hypertext Transfer Protocol (HTTP) services for models such as part-of-speech tagging, entity recognition, and relation recognition, but the overwhelming majority of the natural language processing models behind these services are trained on Internet data. Text on the Internet comes from a wide range of sources: some is produced by professional media, and some is generated by individual users. Compared with text from inside an enterprise or an industry, Internet text differs considerably in vocabulary and writing style. Therefore, for natural language processing technology to achieve good results in enterprise-level applications, it is generally necessary to annotate text from within the enterprise and then retrain a natural language processing model suited to the enterprise's own needs.
Entity extraction, the most important task in NLP, likewise requires annotating text data from within the enterprise and then training a model. Entity annotation generally refers to the process of manually labeling text data. For example, in the sentence "Beijing is the capital of China", the word "Beijing" is labeled as a place name (Loc for short) and the word "China" is labeled as a country (Country for short). Typically, the annotator selects "Beijing" or "China" in the text with the mouse and then labels the selection as Loc or Country.
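The labeling example above can be pictured as a list of character spans attached to the sentence. The following is a minimal Python sketch under stated assumptions: the `EntityAnnotation` class, its field names, and the half-open offset convention are hypothetical illustrations and are not defined by this application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityAnnotation:
    start: int  # offset of the first character of the entity (inclusive)
    end: int    # offset just past the last character (exclusive)
    label: str  # label name, e.g. "Loc" or "Country"


def annotate(text, word, label):
    """Label the first occurrence of `word` in `text` (illustrative helper)."""
    start = text.index(word)
    return EntityAnnotation(start, start + len(word), label)


text = "Beijing is the capital of China"
annotations = [
    annotate(text, "Beijing", "Loc"),
    annotate(text, "China", "Country"),
]

# Recover the labeled surface strings from the stored offsets.
labelled = [(text[a.start:a.end], a.label) for a in annotations]
```

Storing offsets rather than copies of the words keeps each label tied to one position in the text, which matters when the same word occurs more than once.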
At present, selecting an entity in the text by dragging the mouse requires first clicking at one position, then holding down the left mouse button and moving the pointer until the corresponding label content has been selected. This way of selecting label text gives a poor user experience, and annotation speed drops noticeably when a single annotator has a large amount of data to label.
Summary of the invention
To solve the above technical problem, the present invention provides an entity annotation method and device, and a computer-readable storage medium, which can increase annotation speed.
To solve the above technical problem, the technical solution of the embodiments of the present invention is implemented as follows.
An embodiment of the present invention provides an entity annotation method, comprising:
segmenting the text to be annotated according to preset rules;
listening for and receiving the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, detecting whether the offset position in the text at which the mouse is currently located falls within one of the segments;
if the offset position in the text at which the mouse is currently located falls within one of the segments, selecting that segment and displaying it in the selected state.
Further, the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
Further, the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
Further, the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
An embodiment of the present invention further provides a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the entity annotation method described in any of the above.
An embodiment of the present invention further provides an entity annotation device comprising a processor and a memory, wherein:
the processor is configured to execute an entity annotation program stored in the memory, so as to implement the steps of the entity annotation method described in any of the above.
An embodiment of the present invention further provides an entity annotation device comprising a segmentation module, a listening module, and a selection module, wherein:
the segmentation module is configured to segment the text to be annotated according to preset rules;
the listening module is configured to listen for and receive the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, to notify the selection module;
the selection module is configured to receive the notification from the listening module and detect whether the offset position in the text at which the mouse is currently located falls within one of the segments and, if it does, to select that segment and display it in the selected state.
Further, the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
Further, the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
Further, the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
The technical solution of the present invention has the following beneficial effects:
The entity annotation method and device and the computer-readable storage medium provided by the present invention segment the text to be annotated according to preset rules and select the resulting segments with a predefined word-selection operation. This removes the need for the click-and-drag mouse movement that entity annotation originally required to select label content, and substantially improves the efficiency of selecting label text.
Brief description of the drawings
The drawings described here provide a further understanding of the present invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a flow diagram of an entity annotation method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a text annotated with entities according to an embodiment of the present invention;
Fig. 3 is a structural diagram of an entity annotation device according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
Natural language processing is the general name for a broad class of problems that involve processing, converting, and extracting information from data such as speech and text.
Entity: here this refers mainly to the named entities of named entity recognition (NER) in the field of natural language processing, but is not limited to named entities.
Relation: here this refers mainly to the relationship between one entity and another in the field of natural language processing.
Entity recognition: extracting entities that carry specific semantic information from input text, such as person names, dates, places, and organizations.
Relation recognition: extracting the relationships between entities that carry specific semantic information from input text, such as parent-child, employment, office-holding, and geographic relationships.
Training: in the field of machine learning, the process by which a machine updates model parameters according to training data and a loss function.
Chinese word segmentation (CWS): cutting a sequence of Chinese characters into individual words. Word segmentation is the process of recombining a continuous character sequence into a word sequence according to a given specification.
With reference to Fig. 1, an entity annotation method according to an embodiment of the present invention includes the following steps:
Step 101: segment the text to be annotated according to preset rules.
It should be noted that, before the text to be annotated is presented on the front end, it is cut in the background into multiple segments according to the predefined rules; the front end then displays the segmented content continuously, as normal text.
In this embodiment, the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
In this embodiment, the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
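Rules 1 to 3 are pattern-like and could plausibly be expressed as regular expressions. The following Python sketch is an illustration only, under several assumptions that the application does not state: the date characters are taken to be the Chinese 年, 月, and 日; both ASCII and full-width brackets and quotation marks are accepted; the preset string length is set to 10; and the names `MAX_LEN`, `DATE_RE`, `BRACKET_RE`, `QUOTE_RE`, and `find_spans` are hypothetical.

```python
import re

MAX_LEN = 10  # assumed value for the preset string length

# Rule 1 (assumed form): runs of digits, each followed by 年 (year), 月 (month) or 日 (day).
DATE_RE = re.compile(r"(?:\d+[年月日])+")
# Rule 2 (assumed form): 1..MAX_LEN characters enclosed in ASCII or full-width brackets.
BRACKET_RE = re.compile(r"[(（]([^()（）]{1,%d})[)）]" % MAX_LEN)
# Rule 3 (assumed form): 1..MAX_LEN characters enclosed in straight or curly double quotes.
QUOTE_RE = re.compile(r'["“]([^"“”]{1,%d})["”]' % MAX_LEN)


def find_spans(pattern, text):
    """Return (start, end) offsets per match; use group 1 when the pattern captures."""
    group = 1 if pattern.groups else 0
    return [m.span(group) for m in pattern.finditer(text)]


text = "2018年6月21日，公司（以下简称甲方）发布“年度报告”。"
dates = [text[s:e] for s, e in find_spans(DATE_RE, text)]
brackets = [text[s:e] for s, e in find_spans(BRACKET_RE, text)]
quotes = [text[s:e] for s, e in find_spans(QUOTE_RE, text)]
```

In this sketch the bracket and quotation rules return only the enclosed content, so the delimiters themselves stay outside the segment; whether the delimiters belong to the segment is a design choice the text leaves open.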
It should be noted that browsers such as Google Chrome have their own built-in dictionary and rules, so that by double-clicking the user automatically selects a word in the dictionary (such as the word "China") or content that matches a preset rule (such as a mobile phone number made up of 11 digits). However, the dictionaries and rules built into browsers such as Chrome do not help practical annotation tasks improve annotation efficiency. We therefore need to redesign the browser's built-in dictionary and rules (the application scenarios of the invention are not limited to browsers; the same applies to other applications that display and operate on text) so that a double-click selects the content we want to annotate, for example:
(1) content consisting only of digits and "year", "month", "day" should be selected as a whole by a double-click, as shown at ① in Fig. 2;
(2) content inside brackets (with a length within the preset string length, for example within 10 Chinese characters) is selected in full by a double-click, as shown at ③ in Fig. 2;
(3) content inside quotation marks (with a length within the preset string length, for example within 10 Chinese characters) is selected in full by a double-click, as shown at ② in Fig. 2;
(4) a word in certain dictionaries is selected directly by a double-click, as shown at ④ in Fig. 2;
(5) words obtained using a word segmenter.
The rules of this application are not limited to the five text segmentation rules above. The priority of the above rules decreases in order: a lower-priority rule may only cut text that lies outside the segments already cut by higher-priority rules, and may not re-cut a text field produced by a higher-priority rule.
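One way to read the priority constraint is that each rule runs only on the gaps left unclaimed by the rules before it. The Python sketch below illustrates that reading under stated assumptions: `segment` and `regex_rule` are hypothetical names, the dictionary contents are invented for the example, and a one-character-per-segment fallback stands in for a real word segmenter, which the application does not specify.

```python
import re


def regex_rule(pattern):
    """Build a rule: a function mapping a string to sorted, non-overlapping spans."""
    compiled = re.compile(pattern)
    return lambda s: [m.span() for m in compiled.finditer(s)]


def segment(text, rules):
    """Apply rules from highest to lowest priority; later rules only see the gaps."""
    gaps = [(0, len(text))]  # regions not yet claimed by any rule
    segments = []
    for rule in rules:
        next_gaps = []
        for gap_start, gap_end in gaps:
            cursor = gap_start
            for s, e in rule(text[gap_start:gap_end]):
                s, e = s + gap_start, e + gap_start  # back to whole-text offsets
                if s > cursor:
                    next_gaps.append((cursor, s))
                segments.append((s, e))
                cursor = e
            if cursor < gap_end:
                next_gaps.append((cursor, gap_end))
        gaps = next_gaps
    return sorted(segments)


DICTIONARY = ["6月", "北京"]  # hypothetical dictionary entries
RULES = [
    regex_rule(r"(?:\d+[年月日])+"),                     # stand-in for Rule 1
    regex_rule("|".join(map(re.escape, DICTIONARY))),  # stand-in for Rule 4
    regex_rule(r"\w"),                                 # crude stand-in for Rule 5
]

text = "2018年6月21日北京发布"
pieces = [text[s:e] for s, e in segment(text, RULES)]
```

Although "6月" is in the example dictionary, the date rule has already claimed it as part of "2018年6月21日", so the dictionary rule never gets to split that segment.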
Step 102: listen for and receive the user's mouse operations; if a received mouse operation is the predefined word-selection operation, detect whether the offset position in the text at which the mouse is currently located falls within one of the segments.
In this embodiment, if the received mouse operation is not the predefined word-selection operation, return to step 102 and continue to listen for and receive the user's mouse operations.
In this embodiment, the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
Step 103: if the offset position in the text at which the mouse is currently located falls within one of the segments, select that segment and display it in the selected state.
Specifically, when the user double-clicks with the left mouse button, the double-click event is listened for and handled by a preset script (for example, a JavaScript script). The handling is as follows: obtain the offset position p in the text at which the mouse is currently located, and check whether p lies within some pre-divided segment s; if p lies within segment s, display segment s in the selected state.
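The lookup of offset p against the pre-divided segments can be done with a binary search when the segments are kept sorted. This is a minimal Python sketch, assuming segments are stored as sorted, non-overlapping, half-open `(start, end)` pairs; the function name `find_segment` and that storage format are assumptions, not something the application prescribes.

```python
from bisect import bisect_right


def find_segment(segments, p):
    """Return the (start, end) segment containing offset p, or None if there is none.

    `segments` must be sorted by start and non-overlapping; `end` is exclusive.
    """
    starts = [start for start, _ in segments]
    i = bisect_right(starts, p) - 1  # last segment whose start is <= p
    if i >= 0 and segments[i][0] <= p < segments[i][1]:
        return segments[i]
    return None


segments = [(0, 10), (10, 12), (14, 16)]  # offsets 12 and 13 belong to no segment
hit = find_segment(segments, 11)
miss = find_segment(segments, 12)
```

A linear scan would work equally well for short texts; the binary search only matters when a document has many segments and lookups happen on every click.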
It should be noted that selecting a piece of text with a left-button double-click is highly efficient and significantly helps the efficiency of data annotation. By customizing the browser's double-click behavior in this way, label-content selection that originally required clicking and dragging the mouse is turned into a simple double-click, which greatly improves the efficiency of selecting label text.
In this embodiment, if the offset position in the text at which the mouse is currently located does not fall within any of the segments, return to step 102 and continue to listen for and receive the user's mouse operations.
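The control flow of steps 102 and 103, including both "return to step 102" branches, can be sketched as a small event handler. The Python below is illustrative only: the `MouseEvent` class, the operation names such as `"left_double_click"`, and the `SELECT_OPERATION` setting are hypothetical, and a real front end would receive events from the browser rather than from a list.

```python
from dataclasses import dataclass

SELECT_OPERATION = "left_double_click"  # assumed choice of word-selection operation


@dataclass
class MouseEvent:
    kind: str    # e.g. "left_double_click", "left_click", "right_click"
    offset: int  # offset position in the text under the mouse pointer


def handle_event(event, segments):
    """Return the segment to show as selected, or None to keep listening."""
    if event.kind != SELECT_OPERATION:
        return None  # not the word-selection operation: back to step 102
    for start, end in segments:
        if start <= event.offset < end:
            return (start, end)  # step 103: select this segment
    return None  # offset is outside every segment: back to step 102


def run(events, segments):
    """Process a stream of events and collect every selection that was made."""
    return [hit for ev in events if (hit := handle_event(ev, segments)) is not None]


segments = [(0, 10), (10, 12)]
events = [
    MouseEvent("left_click", 3),          # ignored: wrong operation
    MouseEvent("left_double_click", 11),  # selects (10, 12)
    MouseEvent("left_double_click", 20),  # ignored: outside all segments
]
selections = run(events, segments)
```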
An embodiment of the present invention further discloses a computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the steps of the entity annotation method described in any of the above.
An embodiment of the present invention further discloses an entity annotation device comprising a processor and a memory, wherein the processor is configured to execute an entity annotation program stored in the memory, so as to implement the steps of the entity annotation method described in any of the above.
With reference to Fig. 3, an entity annotation device according to an embodiment of the present invention comprises a segmentation module 301, a listening module 302, and a selection module 303, wherein:
the segmentation module 301 is configured to segment the text to be annotated according to preset rules;
the listening module 302 is configured to listen for and receive the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, to notify the selection module 303;
the selection module 303 is configured to receive the notification from the listening module 302 and detect whether the offset position in the text at which the mouse is currently located falls within one of the segments and, if it does, to select that segment and display it in the selected state.
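The division of work among the three modules can be sketched as three small classes, with the listening module notifying the selection module through a method call. This Python sketch is only an assumption about how such modules might be wired together: the class and method names are hypothetical, and the segmentation pattern (runs of digits or Latin letters) is a placeholder for the preset rules.

```python
import re


class SegmentationModule:
    """Cuts text into segments; the default pattern is a placeholder rule."""

    def __init__(self, pattern=r"\d+|[A-Za-z]+"):
        self._pattern = re.compile(pattern)

    def segment(self, text):
        return [m.span() for m in self._pattern.finditer(text)]


class SelectionModule:
    """Holds the segments and records which one is currently selected."""

    def __init__(self, segments):
        self.segments = segments
        self.selected = None

    def on_notify(self, offset):
        self.selected = next(
            ((s, e) for s, e in self.segments if s <= offset < e), None
        )
        return self.selected


class ListeningModule:
    """Filters mouse operations and notifies the selection module."""

    def __init__(self, selection_module, select_operation="left_double_click"):
        self.selection_module = selection_module
        self.select_operation = select_operation

    def receive(self, operation, offset):
        if operation != self.select_operation:
            return None  # keep listening; the selection module is not notified
        return self.selection_module.on_notify(offset)


text = "released 2018 by Acme"
selection = SelectionModule(SegmentationModule().segment(text))
listener = ListeningModule(selection)
picked = listener.receive("left_double_click", 10)
```

Keeping the operation filter in the listening module means the word-selection operation can be changed, for example to a right-button click, without touching the selection logic.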
It should be noted that, before the text to be annotated is presented on the front end, the segmentation module 301 cuts it in the background into multiple segments according to the preset rules; the front end then displays the segmented content continuously, as normal text.
In this embodiment, the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
In this embodiment, the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
It should be noted that browsers such as Google Chrome have their own built-in dictionary and rules, so that by double-clicking the user automatically selects a word in the dictionary (such as the word "China") or content that matches a preset rule (such as a mobile phone number made up of 11 digits). However, the dictionaries and rules built into browsers such as Chrome do not help practical annotation tasks improve annotation efficiency. We therefore need to redesign the browser's built-in dictionary and rules (the application scenarios of the invention are not limited to browsers; the same applies to other applications that display and operate on text) so that a double-click selects the content we want to annotate, for example:
(1) content consisting only of digits and "year", "month", "day" should be selected as a whole by a double-click, as shown at ① in Fig. 2;
(2) content inside brackets (with a length within the preset string length, for example within 10 Chinese characters) is selected in full by a double-click, as shown at ③ in Fig. 2;
(3) content inside quotation marks (with a length within the preset string length, for example within 10 Chinese characters) is selected in full by a double-click, as shown at ② in Fig. 2;
(4) a word in certain dictionaries is selected directly by a double-click, as shown at ④ in Fig. 2;
(5) words obtained using a word segmenter.
The rules of this application are not limited to the five text segmentation rules above. The priority of the above rules decreases in order: a lower-priority rule may only cut text that lies outside the segments already cut by higher-priority rules, and may not re-cut a text field produced by a higher-priority rule.
In this embodiment, if the received mouse operation is not the predefined word-selection operation, the listening module 302 continues to listen for and receive the user's mouse operations.
In this embodiment, the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
For example, when the user double-clicks with the left mouse button, the double-click event is listened for and handled by a preset script (for example, a JavaScript script). The handling is as follows: obtain the offset position p in the text at which the mouse is currently located, and check whether p lies within some pre-divided segment s; if p lies within segment s, display segment s in the selected state.
It should be noted that selecting a piece of text with a left-button double-click is highly efficient and significantly helps the efficiency of data annotation. By customizing the browser's double-click behavior in this way, label-content selection that originally required clicking and dragging the mouse is turned into a simple double-click, which greatly improves the efficiency of selecting label text.
In this embodiment, if the offset position in the text at which the mouse is currently located does not fall within any of the segments, the selection module 303 selects no segment, and the listening module 302 continues to listen for and receive the user's mouse operations.
A person of ordinary skill in the art will understand that all or some of the steps of the above method may be completed by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, each module/unit in the above embodiments may be implemented in hardware or as a software functional module. The present invention is not limited to any particular combination of hardware and software.
The above is only a preferred embodiment of the present invention. The invention may of course have other embodiments, and those skilled in the art may make various corresponding changes and variations in accordance with the invention without departing from its spirit and essence; all such changes and variations shall fall within the scope of protection of the appended claims.
Claims (10)
1. An entity annotation method, characterized by comprising:
segmenting the text to be annotated according to preset rules;
listening for and receiving the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, detecting whether the offset position in the text at which the mouse is currently located falls within one of the segments;
if the offset position in the text at which the mouse is currently located falls within one of the segments, selecting that segment and displaying it in the selected state.
2. The method according to claim 1, characterized in that the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
3. The method according to claim 2, characterized in that the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
4. The method according to claim 1, characterized in that the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
5. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more programs, the one or more programs being executable by one or more processors to implement the steps of the entity annotation method according to any one of claims 1 to 4.
6. An entity annotation device, characterized by comprising a processor and a memory, wherein:
the processor is configured to execute an entity annotation program stored in the memory, so as to implement the steps of the entity annotation method according to any one of claims 1 to 4.
7. An entity annotation device, characterized by comprising a segmentation module, a listening module, and a selection module, wherein:
the segmentation module is configured to segment the text to be annotated according to preset rules;
the listening module is configured to listen for and receive the user's mouse operations and, if a received mouse operation is the predefined word-selection operation, to notify the selection module;
the selection module is configured to receive the notification from the listening module and detect whether the offset position in the text at which the mouse is currently located falls within one of the segments and, if it does, to select that segment and display it in the selected state.
8. The device according to claim 7, characterized in that the preset rules include at least one of the following:
Rule 1: content consisting of digits together with at least one of the three characters "year", "month", and "day" is one segment;
Rule 2: content inside brackets whose length is less than or equal to a preset string length is one segment;
Rule 3: content inside quotation marks whose length is less than or equal to a preset string length is one segment;
Rule 4: a word in a preset dictionary is one segment;
Rule 5: a word produced by a preset word segmenter is one segment.
9. The device according to claim 8, characterized in that the priority of Rule i is greater than or equal to the priority of Rule i+1, and no rule may split a segment produced by a rule of higher priority than itself, where i is a natural number from 1 to 4.
10. The device according to claim 7, characterized in that the predefined word-selection operation is a left-button double-click, a left-button click, or a right-button click.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810643729.9A CN108959248A (en) | 2018-06-21 | 2018-06-21 | Entity annotation method and device, and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108959248A | 2018-12-07 |
Family
ID=64492042
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810643729.9A (Pending; CN108959248A) | Entity annotation method and device, and computer-readable storage medium | 2018-06-21 | 2018-06-21 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108959248A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109829167A (en) * | 2019-02-22 | 2019-05-31 | 维沃移动通信有限公司 | A kind of participle processing method and mobile terminal |
US11954439B2 (en) | 2019-08-30 | 2024-04-09 | Boe Technology Group Co., Ltd. | Data labeling method and device, and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106126052A (en) * | 2016-06-23 | 2016-11-16 | 北京小米移动软件有限公司 | Text selection method and device |
CN106202004A (en) * | 2016-07-13 | 2016-12-07 | 上海轻维软件有限公司 | Combined data cutting method based on regular expressions and separator |
CN106484266A (en) * | 2016-10-18 | 2017-03-08 | 北京锤子数码科技有限公司 | A kind of text handling method and device |
CN106951168A (en) * | 2017-03-03 | 2017-07-14 | 宇龙计算机通信科技(深圳)有限公司 | A kind of literal processing method and mobile terminal |
Non-Patent Citations (1)
Title |
---|
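China Standards Press (ed.): "Compilation of National Standards for Character Sets and Information Encoding, Volume 2" (《字符集和信息编码国家标准汇编 下》), 31 October 1998 * |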
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109829167A (en) * | 2019-02-22 | 2019-05-31 | 维沃移动通信有限公司 | A kind of participle processing method and mobile terminal |
CN109829167B (en) * | 2019-02-22 | 2023-11-21 | 维沃移动通信有限公司 | Word segmentation processing method and mobile terminal |
US11954439B2 (en) | 2019-08-30 | 2024-04-09 | Boe Technology Group Co., Ltd. | Data labeling method and device, and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP2570974B1 (en) | Automatic crowd sourcing for machine learning in information extraction | |
WO2021174864A1 (en) | Information extraction method and apparatus based on small number of training samples | |
Anthony | Visualisation in corpus-based discourse studies | |
JP2009026195A (en) | Article classification apparatus, article classification method and program | |
WO2020233386A1 (en) | Intelligent question-answering method and device employing aiml, computer apparatus, and storage medium | |
CN113220836A (en) | Training method and device of sequence labeling model, electronic equipment and storage medium | |
CN112269862B (en) | Text role labeling method, device, electronic equipment and storage medium | |
Sasidhar et al. | A survey on named entity recognition in Indian languages with particular reference to Telugu | |
CN114861677B (en) | Information extraction method and device, electronic equipment and storage medium | |
CN111160041A (en) | Semantic understanding method and device, electronic equipment and storage medium | |
CN111737623A (en) | Webpage information extraction method and related equipment | |
Kaur et al. | A survey of named entity recognition in English and other Indian languages | |
CN112115252A (en) | Intelligent auxiliary writing processing method and device, electronic equipment and storage medium | |
CN110442730A (en) | A kind of knowledge mapping construction method based on deepdive | |
CN106372232B (en) | Information mining method and device based on artificial intelligence | |
CN109062871B (en) | Text labeling method and device and computer readable storage medium | |
CN108959248A (en) | A kind of entity mask method and device, computer readable storage medium | |
CN105786971A (en) | International Chinese-teaching oriented grammar point identification method | |
CN114970502B (en) | Text error correction method applied to digital government | |
CN113836316B (en) | Processing method, training method, device, equipment and medium for ternary group data | |
CN114064913A (en) | Knowledge graph-based document retrieval method and system | |
Mohnot et al. | Hybrid approach for Part of Speech Tagger for Hindi language | |
CN111597302B (en) | Text event acquisition method and device, electronic equipment and storage medium | |
CN109062890B (en) | Label switching method and device and computer readable storage medium | |
Deng et al. | [Retracted] Intelligent Recognition Model of Business English Translation Based on Improved GLR Algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181207 |