CN107943786A - Chinese named entity recognition method and system - Google Patents
Chinese named entity recognition method and system
- Publication number
- CN107943786A
- Authority
- CN
- China
- Prior art keywords
- name entity
- target text
- name
- sets
- chinese
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Character Discrimination (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a Chinese named entity recognition method and system. The method comprises the following steps: S1, performing rule-matching-based entity recognition on a target text to obtain a first named entity set; S2, performing entity recognition on the target text using a statistical algorithm to obtain a second named entity set; S3, cleaning the first named entity set and the second named entity set to obtain a recognition result. The invention performs entity recognition on the target text separately by rule matching and by a statistical algorithm, cleans both recognition results, and combines them to obtain the final Chinese entity recognition result. This can greatly improve the recall of Chinese entity recognition while maintaining its accuracy. Because Chinese entities are recognized automatically, recognition is fast, and the method can be widely applied to the processing of text information.
Description
Technical field
The present invention relates to the fields of computer applications and information processing, and in particular to a Chinese named entity recognition method and system.
Background
Named entities are the basic information elements of a target text and the basis for understanding it correctly. Chinese named entity recognition is an important foundational tool for application fields such as information extraction, syntactic analysis, and machine learning, and it plays a key role in making natural language processing technology practical. Its task is to determine whether a character string represents a named entity. In information extraction research, Chinese named entity recognition is currently one of the technologies of greatest practical value. Common approaches rely purely on hidden Markov models or maximum entropy models.
At present, Chinese company names follow only weak word-formation rules and are used loosely, often appearing in abbreviated form. For example, "Bank of China Co., Ltd." often appears as "Bank of China" or as the still shorter abbreviation "中行", which makes company names difficult to recognize and use. In general, recognizing Chinese named entities such as company abbreviations faces the following difficulties: 1. Abbreviation variants multiply across different fields and scenarios. 2. Certain types of entity names change frequently and follow no strict rules. 3. Forms of expression are diverse. 4. The number of names is enormous, so they cannot be enumerated and are hard to include fully in a dictionary. Overall, in the processing of Chinese target text, the quality of Chinese word segmentation strongly affects the recognition of Chinese named entities, which in turn affects the analysis and processing of the target text, resulting in low recall and slow recognition.
Summary of the invention
To solve the above technical problems, the object of the present invention is to provide a Chinese named entity recognition method and system.
The technical solution adopted by the present invention to solve its technical problem is:
A Chinese named entity recognition method, comprising the following steps:
S1, performing rule-matching-based entity recognition on a target text to obtain a first named entity set;
S2, performing entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
S3, cleaning the first named entity set and the second named entity set to obtain a recognition result.
Further, step S1 specifically comprises:
S11, splitting the content of the target text into sentences;
S12, performing content extraction based on punctuation rules on the split target text;
S13, performing content extraction based on syntactic template rules on the split target text;
S14, performing content extraction based on table features on the split target text;
S15, generating the first named entity set from all extracted named entities.
Further, step S2 specifically comprises:
S21, performing word segmentation on the target text;
S22, performing part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
S23, based on a hidden Markov model statistical learning method, performing statistical analysis on the part-of-speech tagging result and generating the second named entity set from the named entities obtained by the analysis.
Further, step S3 specifically comprises:
S31, performing data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and removing noise words;
S32, taking the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
Another technical solution adopted by the present invention to solve its technical problem is:
A Chinese named entity recognition system, comprising the following modules:
A first identification module, configured to perform rule-matching-based entity recognition on a target text to obtain a first named entity set;
A second identification module, configured to perform entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
A cleaning module, configured to clean the first named entity set and the second named entity set to obtain a recognition result.
Further, the first identification module specifically comprises:
A separating unit, configured to split the content of the target text into sentences;
A first extraction unit, configured to perform content extraction based on punctuation rules on the split target text;
A second extraction unit, configured to perform content extraction based on syntactic template rules on the split target text;
A third extraction unit, configured to perform content extraction based on table features on the split target text;
A generation unit, configured to generate the first named entity set from all extracted named entities.
Further, the second identification module specifically comprises:
A word segmentation unit, configured to perform word segmentation on the target text;
A part-of-speech tagging unit, configured to perform part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
A statistical analysis unit, configured to perform statistical analysis on the part-of-speech tagging result based on a hidden Markov model statistical learning method, and to generate the second named entity set from the named entities obtained by the analysis.
Further, the cleaning module specifically comprises:
A data cleaning unit, configured to perform data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and to remove noise words;
A computing unit, configured to take the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
The beneficial effects of the method and system of the present invention are as follows: the invention performs entity recognition on the target text separately by rule matching and by a statistical algorithm, cleans both recognition results, and combines them to obtain the final Chinese entity recognition result. This can greatly improve the recall of Chinese entity recognition while maintaining its accuracy, and because Chinese entities are recognized automatically by this method, recognition is fast.
Brief description of the drawings
Fig. 1 is a flowchart of the Chinese named entity recognition method of the present invention;
Fig. 2 is a structural block diagram of the Chinese named entity recognition system of the present invention.
Detailed description of the embodiments
With reference to Fig. 1, the present invention provides a Chinese named entity recognition method, comprising the following steps:
S1, performing rule-matching-based entity recognition on a target text to obtain a first named entity set;
S2, performing entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
S3, cleaning the first named entity set and the second named entity set to obtain a recognition result.
Here, the target text is the text on which Chinese named entity recognition is to be performed.
The method performs entity recognition on the target text separately by rule matching and by a statistical algorithm, cleans both recognition results, and combines them to obtain the final Chinese entity recognition result. This can greatly improve the recall of Chinese entity recognition while maintaining its accuracy, and because Chinese entities are recognized automatically by this method, recognition is comparatively fast.
As a further preferred embodiment, step S1 specifically comprises:
S11, splitting the content of the target text into sentences;
S12, performing content extraction based on punctuation rules on the split target text. For example, in some documents entity names are customarily enclosed in double quotation marks or book-title marks (《》); in that case, the name inside the double quotation marks or book-title marks is extracted. Punctuation rules can therefore be created according to people's usage habits. These rules record the punctuation marks associated with Chinese entity names and the corresponding extraction rules, and the content extracted according to them serves as candidate Chinese entity names.
S13, performing content extraction based on syntactic template rules on the split target text. For example, the subject before verbs such as "announce", "state", and "say" is usually an entity name. Syntactic template rules are therefore created according to language habits. These rules record the words associated with Chinese entity names and the corresponding extraction rules, so that content can be extracted from the target text according to them.
S14, performing content extraction based on table features on the split target text;
S15, generating the first named entity set from all extracted named entities.
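Steps S11 to S13 and S15 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the invention: the sentence delimiters, the regular expressions, the Chinese verb forms 宣布, 称 and 说 (assumed counterparts of "announce", "state" and "say"), and the function names are all hypothetical. Table-feature extraction (S14) is omitted because the description gives no detail on it.

```python
import re

# Assumed sentence delimiters (S11); the description does not list them.
_SENTENCE = re.compile(r"[^。！？!?]+[。！？!?]?")

# Punctuation rules (S12): capture text inside double quotation marks
# or book-title marks. These patterns are illustrative assumptions.
PUNCTUATION_RULES = [
    re.compile(r"“([^“”]+)”"),
    re.compile(r'"([^"]+)"'),
    re.compile(r"《([^《》]+)》"),
]

# Syntactic-template rule (S13): capture the subject before a reporting verb.
# The verb list is an assumed rendering of "announce", "state", "say".
_VERBS = ("宣布", "称", "说")
TEMPLATE_RULE = re.compile(r"^([^，,。]{2,20}?)(?:%s)" % "|".join(_VERBS))

# Characters stripped from the edges of a template match.
_WRAPPERS = "“”\"《》 "


def split_sentences(text):
    """S11: split the target text into sentences."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


def extract_rule_based(text):
    """S12, S13, S15: collect candidate entity names into the first set."""
    candidates = set()
    for sentence in split_sentences(text):
        for rule in PUNCTUATION_RULES:
            candidates.update(m.strip() for m in rule.findall(sentence))
        match = TEMPLATE_RULE.match(sentence)
        if match:
            candidates.add(match.group(1).strip(_WRAPPERS))
    candidates.discard("")
    return candidates


if __name__ == "__main__":
    sample = "“中国银行”宣布下调利率。小米公司称将发布《年度报告》。"
    print(sorted(extract_rule_based(sample)))
```

The sketch deliberately over-generates: a quoted or book-titled string such as a report title is returned as a candidate even though it is not an entity name, which is why the candidates are only alternatives that the later cleaning step filters.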
As a further preferred embodiment, step S2 specifically comprises:
S21, performing word segmentation on the target text;
S22, performing part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
S23, based on a hidden Markov model statistical learning method, performing statistical analysis on the part-of-speech tagging result and generating the second named entity set from the named entities obtained by the analysis. This step first uses known, correct entity names to count the probability with which keywords occur before them, and then infers entity names from the high-probability keywords. In this way, the recall of recognition is greatly improved without affecting the accuracy of the recognized Chinese entity names, so the Chinese entity names in the text are identified more comprehensively; and because the names are obtained by automatic recognition, recognition is fast.
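The counting idea described in S23 can be sketched as follows. This is a simplified frequency estimate in the spirit of the description, not a full hidden Markov model with state decoding, and it assumes that word segmentation and part-of-speech tagging (S21, S22) have already been done by some external tool. The input format (lists of `(word, tag)` pairs), the tag names `n`, `nt`, `nz`, the threshold, and the function names are hypothetical.

```python
from collections import Counter


def learn_trigger_probs(tagged_sentences, known_entities):
    """Estimate, for each word, how often the next word is a known entity.

    tagged_sentences: list of sentences, each a list of (word, pos_tag) pairs.
    known_entities: set of known, correct entity names.
    """
    before_entity = Counter()
    total = Counter()
    for sentence in tagged_sentences:
        for (prev_word, _), (word, _) in zip(sentence, sentence[1:]):
            total[prev_word] += 1
            if word in known_entities:
                before_entity[prev_word] += 1
    return {w: before_entity[w] / total[w] for w in before_entity}


def infer_entities(tagged_sentences, trigger_probs, threshold=0.5,
                   noun_tags=("n", "nt", "nz")):
    """Propose noun-like words that follow a high-probability keyword."""
    found = set()
    for sentence in tagged_sentences:
        for (prev_word, _), (word, tag) in zip(sentence, sentence[1:]):
            if trigger_probs.get(prev_word, 0.0) >= threshold and tag in noun_tags:
                found.add(word)
    return found


if __name__ == "__main__":
    training = [
        [("收购", "v"), ("中国银行", "nt")],
        [("收购", "v"), ("工商银行", "nt")],
        [("发布", "v"), ("公告", "n")],
    ]
    probs = learn_trigger_probs(training, {"中国银行", "工商银行"})
    unseen = [[("拟", "d"), ("收购", "v"), ("招商银行", "nt"), ("股份", "n")]]
    print(infer_entities(unseen, probs))
```

Because candidates are proposed from context rather than looked up in a dictionary, a name absent from the known list can still be found, which is the source of the recall gain the step claims.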
As a further preferred embodiment, step S3 specifically comprises:
S31, performing data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and removing noise words;
S32, taking the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
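Steps S31 and S32 map directly onto set operations. The sketch below assumes that removing noise words means discarding any candidate that exactly matches an entry of the noise lexicon; the description does not specify the matching rule, and the function names are hypothetical.

```python
def clean(entities, noise_lexicon):
    """S31: drop empty candidates and candidates listed in the noise lexicon."""
    stripped = (e.strip() for e in entities)
    return {e for e in stripped if e and e not in noise_lexicon}


def merge_results(first_set, second_set, noise_lexicon):
    """S32: union of the two cleaned named entity sets."""
    return clean(first_set, noise_lexicon) | clean(second_set, noise_lexicon)


if __name__ == "__main__":
    first = {"中国银行", "年度报告"}
    second = {"中国银行", "招商银行", "公司"}
    noise = {"年度报告", "公司"}
    print(sorted(merge_results(first, second, noise)))
```

Taking the union rather than the intersection keeps every entity found by either recognizer, so the noise lexicon is what keeps accuracy from dropping as recall rises.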
With reference to Fig. 2, the present invention provides a Chinese named entity recognition system, comprising the following modules:
A first identification module 100, configured to perform rule-matching-based entity recognition on a target text to obtain a first named entity set;
A second identification module 200, configured to perform entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
A cleaning module 300, configured to clean the first named entity set and the second named entity set to obtain a recognition result.
As a further preferred embodiment, the first identification module 100 specifically comprises:
A separating unit, configured to split the content of the target text into sentences;
A first extraction unit, configured to perform content extraction based on punctuation rules on the split target text;
A second extraction unit, configured to perform content extraction based on syntactic template rules on the split target text;
A third extraction unit, configured to perform content extraction based on table features on the split target text;
A generation unit, configured to generate the first named entity set from all extracted named entities.
As a further preferred embodiment, the second identification module 200 specifically comprises:
A word segmentation unit, configured to perform word segmentation on the target text;
A part-of-speech tagging unit, configured to perform part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
A statistical analysis unit, configured to perform statistical analysis on the part-of-speech tagging result based on a hidden Markov model statistical learning method, and to generate the second named entity set from the named entities obtained by the analysis.
As a further preferred embodiment, the cleaning module 300 specifically comprises:
A data cleaning unit, configured to perform data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and to remove noise words;
A computing unit, configured to take the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
The Chinese named entity recognition system of the present invention can perform the Chinese named entity recognition method provided by the present invention, can carry out any combination of the steps of the method embodiments, and has the corresponding functions and beneficial effects of the method.
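The relationship between the three modules of Fig. 2 can be sketched as a small Python class in which each module is a pluggable callable. The class name, field names, and type aliases are hypothetical; only the module numbering (100, 200, 300) and the data flow come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Set

# A recognizer maps target text to a set of candidate entity names.
Recognizer = Callable[[str], Set[str]]
# A cleaner combines the two candidate sets into the recognition result.
Cleaner = Callable[[Set[str], Set[str]], Set[str]]


@dataclass
class ChineseNerSystem:
    first_identification: Recognizer   # module 100: rule matching
    second_identification: Recognizer  # module 200: statistical algorithm
    cleaning: Cleaner                  # module 300: cleaning and union

    def recognize(self, target_text: str) -> Set[str]:
        first_set = self.first_identification(target_text)
        second_set = self.second_identification(target_text)
        return self.cleaning(first_set, second_set)


if __name__ == "__main__":
    # Stub modules stand in for real recognizers in this illustration.
    system = ChineseNerSystem(
        first_identification=lambda t: {"中国银行"} if "中国银行" in t else set(),
        second_identification=lambda t: {"招商银行", "公司"},
        cleaning=lambda a, b: (a | b) - {"公司"},
    )
    print(sorted(system.recognize("中国银行收购招商银行")))
```

Keeping the two recognizers independent of each other means either one can be replaced or run in parallel without changing the cleaning module.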
The above describes preferred embodiments of the present invention, but the invention is not limited to these embodiments. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the present invention, and such equivalent variations or substitutions all fall within the scope defined by the claims of this application.
Claims (8)
1. A Chinese named entity recognition method, characterized in that it comprises the following steps:
S1, performing rule-matching-based entity recognition on a target text to obtain a first named entity set;
S2, performing entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
S3, cleaning the first named entity set and the second named entity set to obtain a recognition result.
2. The Chinese named entity recognition method according to claim 1, characterized in that step S1 specifically comprises:
S11, splitting the content of the target text into sentences;
S12, performing content extraction based on punctuation rules on the split target text;
S13, performing content extraction based on syntactic template rules on the split target text;
S14, performing content extraction based on table features on the split target text;
S15, generating the first named entity set from all extracted named entities.
3. The Chinese named entity recognition method according to claim 1, characterized in that step S2 specifically comprises:
S21, performing word segmentation on the target text;
S22, performing part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
S23, based on a hidden Markov model statistical learning method, performing statistical analysis on the part-of-speech tagging result and generating the second named entity set from the named entities obtained by the analysis.
4. The Chinese named entity recognition method according to claim 1, characterized in that step S3 specifically comprises:
S31, performing data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and removing noise words;
S32, taking the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
5. A Chinese named entity recognition system, characterized in that it comprises the following modules:
A first identification module, configured to perform rule-matching-based entity recognition on a target text to obtain a first named entity set;
A second identification module, configured to perform entity recognition on the target text using a statistical algorithm to obtain a second named entity set;
A cleaning module, configured to clean the first named entity set and the second named entity set to obtain a recognition result.
6. The Chinese named entity recognition system according to claim 5, characterized in that the first identification module specifically comprises:
A separating unit, configured to split the content of the target text into sentences;
A first extraction unit, configured to perform content extraction based on punctuation rules on the split target text;
A second extraction unit, configured to perform content extraction based on syntactic template rules on the split target text;
A third extraction unit, configured to perform content extraction based on table features on the split target text;
A generation unit, configured to generate the first named entity set from all extracted named entities.
7. The Chinese named entity recognition system according to claim 5, characterized in that the second identification module specifically comprises:
A word segmentation unit, configured to perform word segmentation on the target text;
A part-of-speech tagging unit, configured to perform part-of-speech tagging on the word segmentation result based on a preset part-of-speech database;
A statistical analysis unit, configured to perform statistical analysis on the part-of-speech tagging result based on a hidden Markov model statistical learning method, and to generate the second named entity set from the named entities obtained by the analysis.
8. The Chinese named entity recognition system according to claim 5, characterized in that the cleaning module specifically comprises:
A data cleaning unit, configured to perform data cleaning on the first named entity set and the second named entity set separately according to a preset noise lexicon, and to remove noise words;
A computing unit, configured to take the union of the cleaned first named entity set and the cleaned second named entity set as the named entity recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711137581.3A CN107943786B (en) | 2017-11-16 | 2017-11-16 | Chinese named entity recognition method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711137581.3A CN107943786B (en) | 2017-11-16 | 2017-11-16 | Chinese named entity recognition method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107943786A (en) | 2018-04-20 |
CN107943786B (en) | 2021-12-07 |
Family
ID=61931531
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711137581.3A Active CN107943786B (en) | 2017-11-16 | 2017-11-16 | Chinese named entity recognition method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107943786B (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1910573A (en) * | 2003-12-31 | 2007-02-07 | 新加坡科技研究局 | System for identifying and classifying denomination entity |
US20060047500A1 (en) * | 2004-08-31 | 2006-03-02 | Microsoft Corporation | Named entity recognition using compiler methods |
EP1783744A1 (en) * | 2005-11-03 | 2007-05-09 | Robert Bosch Corporation | Unified treatment of data-sparseness and data-overfitting in maximum entropy modeling |
CN102103594A (en) * | 2009-12-22 | 2011-06-22 | 北京大学 | Character data recognition and processing method and device |
CN102314417A (en) * | 2011-09-22 | 2012-01-11 | 西安电子科技大学 | Method for identifying Web named entity based on statistical model |
CN103268348A (en) * | 2013-05-28 | 2013-08-28 | 中国科学院计算技术研究所 | Method for identifying user query intention |
CN103942347A (en) * | 2014-05-19 | 2014-07-23 | 焦点科技股份有限公司 | Word separating method based on multi-dimensional comprehensive lexicon |
CN103995885A (en) * | 2014-05-29 | 2014-08-20 | 百度在线网络技术(北京)有限公司 | Method and device for recognizing entity names |
CN106055545A (en) * | 2015-04-10 | 2016-10-26 | 穆西格马交易方案私人有限公司 | Text mining system and tool |
CN105302794A (en) * | 2015-10-30 | 2016-02-03 | 苏州大学 | Chinese homodigital event recognition method and system |
CN105808523A (en) * | 2016-03-08 | 2016-07-27 | 浪潮软件股份有限公司 | Method and apparatus for identifying document |
CN105843875A (en) * | 2016-03-18 | 2016-08-10 | 北京光年无限科技有限公司 | Smart robot-oriented question and answer data processing method and apparatus |
Non-Patent Citations (5)
Title |
---|
TIAN-FANG YAO等: "Repairing errors for Chinese word segmentation and part-of-speech tagging", 《 PROCEEDINGS. INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS》 * |
YI-CHENG PAN等: "Named entity recognition from spoken documents using global evidences and external knowledge sources with applications on Mandarin Chinese", 《IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, 2005.》 * |
何炎祥等: "基于CRF和规则相结合的地理命名实体识别方法", 《计算机应用与软件》 * |
刘豹等: "基于统计和规则相结合的科技术语自动抽取研究", 《计算机工程与应用》 * |
张宏生: "使用HMM模型改进规则自动生成的命名实体识别系统性能", 《中小企业管理与科技(下旬刊)》 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108647194A (en) * | 2018-04-28 | 2018-10-12 | 北京神州泰岳软件股份有限公司 | information extraction method and device |
CN108647194B (en) * | 2018-04-28 | 2022-04-19 | 北京神州泰岳软件股份有限公司 | Information extraction method and device |
WO2020133291A1 (en) * | 2018-12-28 | 2020-07-02 | 深圳市优必选科技有限公司 | Text entity recognition method and apparatus, computer device, and storage medium |
CN111382570A (en) * | 2018-12-28 | 2020-07-07 | 深圳市优必选科技有限公司 | Text entity recognition method and device, computer equipment and storage medium |
CN111382570B (en) * | 2018-12-28 | 2024-05-03 | 深圳市优必选科技有限公司 | Text entity recognition method, device, computer equipment and storage medium |
CN110008307A (en) * | 2019-01-18 | 2019-07-12 | 中国科学院信息工程研究所 | A kind of rule-based and statistical learning deformation entity recognition method and device |
CN110750991A (en) * | 2019-09-18 | 2020-02-04 | 平安科技(深圳)有限公司 | Entity identification method, device, equipment and computer readable storage medium |
WO2021051872A1 (en) * | 2019-09-18 | 2021-03-25 | 平安科技(深圳)有限公司 | Entity identification method, device, apparatus, and computer readable storage medium |
CN110750991B (en) * | 2019-09-18 | 2022-04-15 | 平安科技(深圳)有限公司 | Entity identification method, device, equipment and computer readable storage medium |
CN111488467A (en) * | 2020-04-30 | 2020-08-04 | 北京建筑大学 | Construction method and device of geographical knowledge graph, storage medium and computer equipment |
CN111488467B (en) * | 2020-04-30 | 2022-04-05 | 北京建筑大学 | Construction method and device of geographical knowledge graph, storage medium and computer equipment |
CN112926333A (en) * | 2021-04-09 | 2021-06-08 | 平安科技(深圳)有限公司 | Entity identification method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN107943786B (en) | 2021-12-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107943786A (en) | A kind of Chinese name entity recognition method and system | |
Huang et al. | PHMOSpell: Phonological and morphological knowledge guided Chinese spelling check | |
CN107451126B (en) | Method and system for screening similar meaning words | |
CN107463607B (en) | Method for acquiring and organizing upper and lower relations of domain entities by combining word vectors and bootstrap learning | |
CN100536532C (en) | Method and system for automatic subtilting | |
CN102693279B (en) | Method, device and system for fast calculating comment similarity | |
CN110717018A (en) | Industrial equipment fault maintenance question-answering system based on knowledge graph | |
CN104408078A (en) | Construction method for key word-based Chinese-English bilingual parallel corpora | |
CN109637537B (en) | Method for automatically acquiring annotated data to optimize user-defined awakening model | |
CN103020230A (en) | Semantic fuzzy matching method | |
CN103309926A (en) | Chinese and English-named entity identification method and system based on conditional random field (CRF) | |
CN108733647B (en) | Word vector generation method based on Gaussian distribution | |
CN110362678A (en) | A kind of method and apparatus automatically extracting Chinese text keyword | |
CN112069826A (en) | Vertical domain entity disambiguation method fusing topic model and convolutional neural network | |
CN104750820A (en) | Filtering method and device for corpuses | |
CN110188359B (en) | Text entity extraction method | |
CN109062904A (en) | Logical predicate extracting method and device | |
CN109190099B (en) | Sentence pattern extraction method and device | |
CN109522396B (en) | Knowledge processing method and system for national defense science and technology field | |
Kessler et al. | Extraction of terminology in the field of construction | |
Sagcan et al. | Toponym recognition in social media for estimating the location of events | |
Sheikh et al. | How diachronic text corpora affect context based retrieval of oov proper names for audio news | |
CN108229565A (en) | A kind of image understanding method based on cognition | |
Al-Sultany et al. | Enriching tweets for topic modeling via linking to the wikipedia | |
CN108763487A (en) | A kind of word representation method of fusion part of speech and sentence information based on Mean Shift |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||