
CN107038178A - Public opinion analysis method and apparatus - Google Patents

Public opinion analysis method and apparatus

Info

Publication number
CN107038178A
Authority
CN
China
Prior art keywords
information
original text
text
analysis
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610628754.0A
Other languages
Chinese (zh)
Other versions
CN107038178B (en)
Inventor
金戈
张�杰
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201610628754.0A priority Critical patent/CN107038178B/en
Priority to PCT/CN2017/077965 priority patent/WO2018023981A1/en
Publication of CN107038178A publication Critical patent/CN107038178A/en
Application granted granted Critical
Publication of CN107038178B publication Critical patent/CN107038178B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a public opinion analysis method. The method includes: collecting information published by a user, and recognizing the information to obtain a corresponding initial text; determining, based on a preconfigured text classification model, whether the initial text is a text of a preset category, the preset category being the category to which information involving a specified event belongs; if the initial text is a text of the preset category, determining, based on a preconfigured named entity model, whether the initial text involves the specified event; and if the initial text involves the specified event, determining that the information corresponding to the initial text is target information involving the specified event. The invention also discloses a public opinion analysis apparatus. The invention identifies and finds information involving a specified event automatically, so that the content of each item of information need not be checked manually one by one.

Description

Public opinion analysis method and apparatus
Technical field
The present invention relates to the field of Internet technology, and in particular to a public opinion analysis method and apparatus.
Background
With the development of Internet technology, social websites and social software such as microblogs provide users with platforms for sharing information. Users can publish all kinds of information on a social platform, for example sharing private-life information with friends at any time or recommending business information to customers. Because social platforms are easy to operate and spread information quickly, more and more salespeople promote their business through them.
However, when promoting business, some salespeople may use the promotion resources provided by their own company to promote the business of other companies, publishing non-compliant information and harming the company's interests. At present, this situation is handled mainly by having the responsible staff of the company manually check, item by item, the information published by salespeople, to find whether the business promotion information they publish includes promotion of business that does not belong to the company.
As the number of salespeople grows, the amount of information that reviewers must check each day rises steadily and exceeds what manual review can handle, which lowers checking efficiency. Moreover, besides business promotion information, the information published by salespeople may include private-life information. Information of many categories has to be screened manually, which increases the manual workload and further reduces checking efficiency.
Summary of the invention
The main purpose of the present invention is to provide a public opinion analysis method and apparatus, aiming to solve the technical problem that manually searching for information involving a specified event is inefficient.
To achieve the above purpose, the present invention provides a public opinion analysis method, which includes the following steps:
collecting information published by a user, and recognizing the information to obtain a corresponding initial text;
determining, based on a preconfigured text classification model, whether the initial text is a text of a preset category, the preset category being the category to which information involving a specified event belongs;
if the initial text is a text of the preset category, determining, based on a preconfigured named entity model, whether the initial text involves the specified event;
if the initial text involves the specified event, determining that the information corresponding to the initial text is target information involving the specified event.
In one embodiment, when the information is picture information, the step of recognizing the information to obtain a corresponding initial text includes:
recognizing the text in the picture information using multiple threads, to obtain the initial text corresponding to the picture information.
In one embodiment, after the step of determining that the information corresponding to the initial text is target information involving the specified event, the method further includes:
marking the user as a risk user.
In one embodiment, before the step of collecting information published by a user and recognizing the information to obtain a corresponding initial text, the method further includes:
performing word segmentation on a training corpus whose categories are labeled in advance, and extracting feature variables, the pre-labeled categories including the preset category;
training, based on a naive Bayes model, the relationship between the feature variables and the pre-labeled categories, to obtain the text classification model.
In one embodiment, before the step of collecting information published by a user and recognizing the information to obtain a corresponding initial text, the method further includes:
performing word segmentation on a training corpus whose named entities are labeled in advance, constructing a part-of-speech sequence using a Chinese language model, and extracting feature variables, the pre-labeled named entities being the named entities involved in the specified event;
training, based on a conditional random field model, the relationship between the feature variables and whether the training corpus contains the pre-labeled named entities, to obtain the named entity model.
In addition, to achieve the above purpose, the present invention also provides a public opinion analysis apparatus, which includes:
an acquisition module, configured to collect information published by a user and recognize the information to obtain a corresponding initial text;
a classification module, configured to determine, based on a preconfigured text classification model, whether the initial text is a text of a preset category, the preset category being the category to which information involving a specified event belongs;
an identification module, configured to determine, if the initial text is a text of the preset category, whether the initial text involves the specified event based on a preconfigured named entity model;
a determining module, configured to determine, if the initial text involves the specified event, that the information corresponding to the initial text is target information involving the specified event.
In one embodiment, when the information is picture information, the acquisition module is further configured to:
recognize the text in the picture information using multiple threads, to obtain the initial text corresponding to the picture information.
In one embodiment, the public opinion analysis apparatus further includes:
a labeling module, configured to mark the user as a risk user.
In one embodiment, the public opinion analysis apparatus further includes:
a classification model training module, configured to perform word segmentation on a training corpus whose categories are labeled in advance and extract feature variables, the pre-labeled categories including the preset category; and to train, based on a naive Bayes model, the relationship between the feature variables and the pre-labeled categories, to obtain the text classification model.
In one embodiment, the public opinion analysis apparatus further includes:
a named entity model training module, configured to perform word segmentation on a training corpus whose named entities are labeled in advance, construct a part-of-speech sequence using a Chinese language model, and extract feature variables, the pre-labeled named entities being the named entities involved in the specified event; and to train, based on a conditional random field model, the relationship between the feature variables and whether the training corpus contains the pre-labeled named entities, to obtain the named entity model.
The public opinion analysis method and apparatus proposed by the embodiments of the present invention collect the information published by a user and recognize the information to obtain a corresponding initial text. Then, based on a preconfigured text classification model, they determine whether the initial text is a text of a preset category. Because the preset category is the category to which information involving the specified event belongs, texts unrelated to that category are removed, which reduces the workload and improves the accuracy of the hits. If the initial text is a text of the preset category, a preconfigured named entity model is used to determine whether the initial text involves the specified event; if it does, the information corresponding to the initial text is determined to be target information involving the specified event. In the present invention, the initial texts of the preset category selected by the text classification model belong to the information type that involves the specified event, and initial texts unrelated to that information type are excluded. Then, based on the named entities contained in the initial text and the named entities involved in the specified event, the named entity model performs named entity recognition on the initial text, and the recognition result is used to decide whether the initial text involves the specified event. This yields an analysis result stating whether the information published by the user involves the specified event, and achieves automatic identification of information involving the specified event without manually checking the content of each item one by one. In addition, the automatic classification and named entity recognition performed by the text classification model and the named entity model help ensure that the target information obtained actually involves the specified event, which meets practical application needs.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the first embodiment of the public opinion analysis method of the present invention;
Fig. 2 is a schematic flowchart of the second embodiment of the public opinion analysis method of the present invention;
Fig. 3 is a schematic flowchart of the third embodiment of the public opinion analysis method of the present invention;
Fig. 4 is a schematic flowchart of the fourth embodiment of the public opinion analysis method of the present invention;
Fig. 5 is a schematic flowchart of the fifth embodiment of the public opinion analysis method of the present invention;
Fig. 6 is a functional block diagram of the first and second embodiments of the public opinion analysis apparatus of the present invention;
Fig. 7 is a functional block diagram of the third embodiment of the public opinion analysis apparatus of the present invention;
Fig. 8 is a functional block diagram of the fourth embodiment of the public opinion analysis apparatus of the present invention;
Fig. 9 is a functional block diagram of the fifth embodiment of the public opinion analysis apparatus of the present invention;
Fig. 10 is a schematic diagram of category labeling of a training corpus in an embodiment of the present invention;
Fig. 11 is a schematic diagram of word segmentation and feature variable extraction for a training corpus in an embodiment of the present invention;
Fig. 12 is a schematic diagram of classification model training in an embodiment of the present invention;
Fig. 13 is a schematic diagram of word segmentation and feature variable extraction for a training corpus in an embodiment of the present invention.
The realization, functional characteristics, and advantages of the present invention are further described below with reference to the drawings and the embodiments.
Detailed description
It should be understood that the specific embodiments described here only illustrate the present invention and are not intended to limit it.
Referring to Fig. 1, the first embodiment of the public opinion analysis method of the present invention provides a public opinion analysis method, which includes:
Step S10: collect information published by a user, and recognize the information to obtain a corresponding initial text.
The present invention classifies the collected information to obtain the information of the preset category, and then uses the named entity model to pick out, from the information of the preset category, the target information involving the specified event. The text classification model and the named entity model together automate the capture of information about the specified event, for example promotion information for other companies' business published by the company's employees, out-of-date promotion information for the company's own business, or business promotion information published by competitor companies on a promotion platform.
This embodiment is illustrated by capturing information, published by the company's employees, that promotes business not belonging to the company; the specified event in this embodiment is the promotion of such business. The business promoted in this embodiment can also be understood as a promoted product or commodity.
Specifically, as one implementation, the server first collects the items of information that users publish on a social platform or promotion platform, for example information that the company's employees publish on WeChat Moments or a microblog platform. The users are the target group among whom information involving the specified event is being searched for.
Then the server recognizes the text in the information to obtain the initial text corresponding to the information.
Step S20: based on a preconfigured text classification model, determine whether the initial text is a text of a preset category, the preset category being the category to which information involving the specified event belongs.
After obtaining the initial text, the server uses the preconfigured text classification model to determine whether the initial text is a text of the preset category. Note that the preset category is the category to which the information currently being searched for, namely information involving the specified event, belongs; it is used to remove texts of other, unrelated categories.
Specifically, as one implementation, the preconfigured text classification model includes multiple pre-specified categories and maps an input text to the category it belongs to. The pre-specified categories can be configured as needed. Classifying the texts removes irrelevant information of other categories that does not involve the specified event, so that target information is searched for only among information classified into the category to which information involving the specified event belongs. This reduces the workload of the search and improves search efficiency.
Because this embodiment needs to find information, published by the company's employees, that promotes business not belonging to the company, and such information is marketing information, the pre-specified categories of the text classification model in this embodiment are marketing text and non-marketing text. The preset category to be selected in this embodiment is marketing text, which removes irrelevant information published by users, such as private-life information.
When the text classification model is used to classify the initial text, the initial text is first segmented into words and a feature variable matrix is constructed, converting the initial text into standardized data, which is then fed into the text classification model. Based on the relationship between text feature variables and the pre-specified categories learned in training, the model assigns the initial text to the category with the highest probability.
The category of the initial text is thus obtained.
Then the server checks whether the category of the initial text is the same as the preset category. If it is the same, the initial text is judged to be a text of the preset category; if it is different, the initial text is judged not to be a text of the preset category. In this embodiment, the text of the preset category is marketing text.
Further, to screen texts more precisely and reduce the number of texts that the named entity model has to process, marketing text can be divided into subcategories according to the company's business scope, such as financial marketing text, education marketing text, and real estate marketing text. The subcategory of marketing text that falls within the company's business scope is used as the preset category, and only initial texts belonging to this subcategory are kept, which removes other marketing texts unrelated to the company's business scope.
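The category check in step S20 can be sketched as follows. This is only an illustrative sketch: the function name is_preset_category, the category names, and the probabilities are assumptions made for the example, not part of the patent, and the probabilities would in practice come from the trained text classification model.

```python
def is_preset_category(class_probs, preset_categories):
    """Return (is_preset, predicted) for one initial text.

    class_probs: mapping of category name -> probability from a classifier
    preset_categories: set of category names relevant to the specified event
    """
    # The text is assigned to the category with the highest probability.
    predicted = max(class_probs, key=class_probs.get)
    return predicted in preset_categories, predicted


# Hypothetical classifier output for one post; the numbers are made up.
probs = {"financial marketing": 0.7, "education marketing": 0.2, "non-marketing": 0.1}
keep, label = is_preset_category(probs, {"financial marketing"})
```

Passing a set of preset categories rather than a single name lets the same check cover both the coarse marketing category and the narrower subcategories described above.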
Step S30: if the initial text is a text of the preset category, determine, based on a preconfigured named entity model, whether the initial text involves the specified event.
After the category of the initial text has been determined, if the initial text is a text of the preset category, that is, it belongs to the category of information involving the specified event, the server needs to determine whether the initial text involves the specified event.
Specifically, the preconfigured named entity model is first used to recognize whether the initial text contains a preset named entity, giving a recognition result. Then, according to the recognition result and a preset decision condition, it is determined whether the initial text involves the specified event.
The preset decision condition may be that an initial text containing a preset named entity is judged to involve the specified event, or that an initial text not containing a preset named entity is judged to involve the specified event.
Note that the named entities involved in the specified event are configured as the preset named entities, and they can be configured flexibly according to the specific content of the specified event.
As one implementation, because this embodiment needs to find information, published by the company's employees, that promotes business not belonging to the company, the preset named entities are the names of the company's businesses. In this embodiment, an initial text without any preset named entity is determined to be a text involving the specified event; that is, an initial text that contains none of the company's business names is a text that promotes business not belonging to the company.
In a specific implementation, the initial text is first segmented into words, part-of-speech tagging is performed using a Chinese language model, useless words are removed, and the initial text is converted into a part-of-speech sequence with a standard structure. The sequence is then fed into the preconfigured named entity model, which recognizes whether the initial text contains a preset named entity and gives the recognition result.
If the initial text does not contain a preset named entity, the initial text is judged to involve the specified event; if the initial text contains a preset named entity, the initial text is judged not to involve the specified event.
The judgment result is thus obtained.
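A minimal sketch of the decision condition in step S30 is given below. It is an illustration under stated assumptions: plain substring matching stands in for the conditional-random-field named entity model, and the function names and the business names AcmeLife and AcmeAuto are hypothetical.

```python
def contains_preset_entity(text, preset_entities):
    # Stand-in for the named entity model: plain substring matching.
    return any(entity in text for entity in preset_entities)


def involves_specified_event(text, preset_entities, flag_when_present):
    """Apply the preset decision condition to one initial text.

    flag_when_present=False: texts WITHOUT a preset entity are flagged
    (for example, marketing text naming none of the company's own businesses).
    flag_when_present=True: texts WITH a preset entity are flagged
    (for example, marketing text naming a competitor).
    """
    found = contains_preset_entity(text, preset_entities)
    return found if flag_when_present else not found


own_businesses = {"AcmeLife", "AcmeAuto"}  # hypothetical business names
flagged = involves_specified_event("Buy OtherCo insurance today", own_businesses, False)
```

The boolean parameter captures the two alternative decision conditions described in the text, so the same recognition result can serve either kind of specified event.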
Step S40: if the initial text involves the specified event, determine that the information corresponding to the initial text is target information involving the specified event.
If the initial text involves the specified event, that is, the initial text is marketing text and contains none of the company's business names, the information corresponding to this initial text is determined to be target information involving the specified event.
The target information obtained is marketing information that contains no business name of the company, and it therefore involves promoting business that does not belong to the company.
Of course, depending on the specified event, the present invention can also use the preconfigured named entity model to confirm that an initial text containing a preset named entity is a text involving the specified event.
For example, when searching for business promotion information published by a competitor company on a promotion platform, the specified event is the competitor's business promotion and the preset named entity is the competitor company. Messages containing the preset named entity are then selected from the marketing information published on the promotion platform as the target messages involving the specified event.
In this embodiment, the information published by a user is collected and recognized to obtain a corresponding initial text. Then, based on a preconfigured text classification model, it is determined whether the initial text is a text of a preset category. Because the preset category is the category to which information involving the specified event belongs, texts unrelated to that category are removed, which reduces the workload and improves the accuracy of the hits. If the initial text is a text of the preset category, a preconfigured named entity model is used to determine whether the initial text involves the specified event; if it does, the information corresponding to the initial text is determined to be target information involving the specified event. The initial texts of the preset category selected by the text classification model in this embodiment belong to the information type that involves the specified event, and initial texts unrelated to that information type are excluded. Then, based on the named entities contained in the initial text and the named entities involved in the specified event, the named entity model performs named entity recognition on the initial text, and the recognition result is used to decide whether the initial text involves the specified event. This determines whether the information published by the user involves the specified event, and achieves automatic identification of such information without manually checking the content of each item one by one. In addition, the automatic classification and named entity recognition performed by the text classification model and the named entity model help ensure that the target information obtained actually involves the specified event, which meets practical application needs.
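The flow of steps S10 to S40 can be summarized in a short sketch. The function find_target_information and the toy stand-ins for the two models are assumptions made for illustration; in the method described here the stand-ins would be replaced by the trained text classification model and named entity model.

```python
def find_target_information(posts, classify, involves_event, preset_categories):
    """Return the posts whose initial text passes both stages.

    posts: iterable of (user, initial_text) pairs
    classify: callable mapping text -> category name (stage 1, step S20)
    involves_event: callable mapping text -> bool (stage 2, step S30)
    """
    targets = []
    for user, text in posts:
        if classify(text) not in preset_categories:
            continue  # unrelated category, e.g. private-life posts
        if involves_event(text):
            targets.append((user, text))  # step S40: target information
    return targets


# Toy stand-ins for the two models (hypothetical rules, not trained models).
def toy_classify(text):
    return "marketing" if "buy" in text.lower() else "non-marketing"


def toy_involves_event(text):
    return "AcmeLife" not in text  # no own business name -> flagged


posts = [
    ("u1", "Buy OtherCo funds now"),
    ("u2", "Buy AcmeLife today"),
    ("u3", "Nice weather this weekend"),
]
targets = find_target_information(posts, toy_classify, toy_involves_event, {"marketing"})
```

Running the classifier first means the second, entity-level check is applied only to the texts that survive the category filter, which is the workload reduction the embodiment describes.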
Further, referring to Fig. 2, the second embodiment of the public opinion analysis method of the present invention provides a public opinion analysis method. Based on the embodiment shown in Fig. 1, when the information is picture information, the step of recognizing the information to obtain a corresponding initial text includes:
Step S11: recognize the text in the picture information using multiple threads, to obtain the initial text corresponding to the picture information.
Because the information a user publishes may be a picture, the information collected by the server may include picture information.
When the server collects picture information, it needs to perform text recognition on the pictures in it; for example, image captioning technology can be used to read the content of a picture automatically.
To increase the speed of information collection, the server can use multiple threads to read the text in several pieces of picture information at the same time. The text read is the initial text corresponding to the picture information.
In this embodiment, when the collected information is picture information, multiple threads recognize the text in the picture information, and the text obtained is the initial text corresponding to the picture information. This initial text can then be classified and checked for named entities, so that picture information is also verified to determine whether it is target information involving the specified event. Because picture information is recognized with multiple threads in this embodiment, the speed of information collection is improved.
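The multithreaded recognition of step S11 might look like the sketch below. The recognizer is a placeholder assumption: recognize_text simply decodes bytes so the example runs on its own, whereas a real system would call an OCR engine or captioning model at that point. The names recognize_text and recognize_all are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def recognize_text(image_bytes):
    # Placeholder for a real recognizer (OCR engine or captioning model).
    # Here the "image" is just UTF-8 bytes so the sketch runs on its own.
    return image_bytes.decode("utf-8")


def recognize_all(images, max_workers=4):
    # executor.map preserves input order, so texts[i] belongs to images[i].
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(recognize_text, images))


texts = recognize_all([b"Buy OtherCo funds", b"Holiday photos"])
```

A thread pool suits this step when recognition waits on I/O or on a library that releases the interpreter lock; for purely CPU-bound recognition in Python, a process pool may be the better choice.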
Further, referring to Fig. 3, the third embodiment of the public opinion analysis method of the present invention provides a public opinion analysis method. Based on the embodiment shown in Fig. 1 or Fig. 2 (this embodiment takes Fig. 1 as an example), after step S40 the method further includes:
Step S50: mark the user as a risk user.
After the target information is obtained, the server automatically obtains the information of the user who published it, such as the account name, and marks this user as a risk user.
Further, if there are multiple items of target information, the user information corresponding to each item can be obtained and compiled into a risk user list. Each user in the list is marked as a risk user, and the list is available for further manual analysis or verification.
This embodiment automatically marks the users who publish target information. The items of information published by each user are classified and checked for named entities to find the risk users who publish target information involving the specified event, so that the search result is traced back to the user, which makes further handling easier.
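Compiling the risk user list of step S50 can be sketched as follows, assuming that target information is available as (user, text) pairs. That representation and the function name build_risk_user_list are assumptions made for the example.

```python
def build_risk_user_list(target_information):
    """Collect the distinct users behind the target information, in first-seen order."""
    risk_users = []
    seen = set()
    for user, _text in target_information:
        if user not in seen:
            seen.add(user)
            risk_users.append(user)
    return risk_users


risk_users = build_risk_user_list([("u1", "post a"), ("u2", "post b"), ("u1", "post c")])
```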
Further, referring to Fig. 4, a fourth embodiment of the public opinion analysis method of the present invention provides a public opinion analysis method. Based on the embodiment shown in Fig. 1, Fig. 2 or Fig. 3 above (this embodiment takes Fig. 1 as an example), before step S10 the method further includes:
Step S60: perform word segmentation on a training corpus whose categories have been labeled in advance, and extract feature variables, where the pre-labeled categories include the preset category;
Step S70: train a naive Bayes model to obtain the relationship between the feature variables and the pre-labeled categories, thereby obtaining the text classification model.
This embodiment trains the text classification model using a naive Bayes model, and the resulting text classification model can classify the input original text.
As one implementation, the training corpus consists of texts of different categories whose categories are labeled in advance. For example, as shown in Fig. 10, the training corpus is divided into different categories and labeled according to its semantics.
Then, word segmentation is performed on the training corpus and feature variables are extracted. The extracted feature variables can be word frequency, part of speech, and so on. For example, as shown in Fig. 11, after the training corpus is segmented, its word frequencies are extracted as its feature variables.
Then, based on the naive Bayes model, training extracts the relationship between the feature variables and each pre-labeled category.
Specifically, a standard feature variable matrix can be constructed from the feature variables of the training texts and used as X, with the pre-labeled categories used as Y. The naive Bayes model is trained to obtain the relationship between the feature variables X and the categories Y, which yields the text classification model.
As shown in Fig. 12, after each training text is segmented and its feature variables are extracted, a standardized feature variable matrix is constructed for each training text from the words it contains and their word frequencies. Then, from the feature variable matrix of each training text and its corresponding category, the relationship between feature variables and categories can be found based on the naive Bayes model, so that after repeated training each training text is assigned to the correct category with maximum probability.
The training of the text classification model is thereby completed.
It should be noted that the pre-labeled categories include the category to which information related to the specified event belongs, so that the model can identify whether an original text belongs to that category.
In this embodiment, word segmentation is performed on the training corpus whose categories are labeled in advance, and feature variables are extracted; the relationship between the feature variables and the pre-labeled categories is then obtained by training a naive Bayes model, which yields the text classification model. Because the categories labeled on the training corpus include the preset category, the trained text classification model can classify an original text fed into it and output its category, from which it is determined whether the original text is a text of the preset category.
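Steps S60 and S70 can be illustrated with a small multinomial naive Bayes classifier over word frequencies. This is a sketch under stated assumptions: the texts are already segmented into tokens, add-one (Laplace) smoothing is an illustrative choice that the description does not specify, and the toy corpus and category names are invented for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(segmented_texts, categories):
    """Learn word frequencies per pre-labeled category."""
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, category in zip(segmented_texts, categories):
        word_counts[category].update(tokens)
        vocab.update(tokens)
    return {
        "class_docs": Counter(categories),  # number of texts per category
        "word_counts": word_counts,         # word frequency per category
        "vocab": vocab,
        "n_docs": len(segmented_texts),
    }


def classify(model, tokens):
    """Return the category with the highest posterior probability."""
    best_category, best_score = None, float("-inf")
    vocab_size = len(model["vocab"])
    for category, n in model["class_docs"].items():
        counts = model["word_counts"][category]
        total = sum(counts.values())
        score = math.log(n / model["n_docs"])  # log prior
        for tok in tokens:
            if tok in model["vocab"]:          # ignore unseen words
                score += math.log((counts[tok] + 1) / (total + vocab_size))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Toy pre-labeled corpus (hypothetical data).
corpus = [
    ["buy", "insurance", "discount"],
    ["insurance", "product", "promotion"],
    ["dinner", "with", "family"],
    ["weekend", "trip", "family"],
]
labels = ["marketing", "marketing", "non-marketing", "non-marketing"]
model = train_naive_bayes(corpus, labels)
```

With this toy model, `classify(model, ["insurance", "discount"])` selects the marketing category, which corresponds to checking whether an original text belongs to the preset category.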
Further, referring to Fig. 5, a fifth embodiment of the public opinion analysis method of the present invention provides a public opinion analysis method. Based on the embodiment shown in Fig. 1, Fig. 2, Fig. 3 or Fig. 4 above (this embodiment takes Fig. 1 as an example), before step S10 the method further includes:
Step S80: perform word segmentation on a training corpus in which named entities have been labeled in advance, construct part-of-speech sequences using a Chinese model, and extract feature variables, where the pre-labeled named entities are named entities related to the specified event;
Step S90: train a conditional random field model to obtain the relationship between the feature variables and whether the training corpus contains the pre-labeled named entities, thereby obtaining the named entity model.
This embodiment obtains the named entity model by training a conditional random field model, so that named entity recognition can be performed on the input original text.
Specifically, as one implementation, the training corpus consists of individual texts, each labeled with named entities in advance, for example: {{company_name: Company 1}} launched the {{product_name: The grand life of honor}} product, where company_name and product_name are the labels used to mark named entities. It should be noted that the labeled named entities are named entities related to the specified event.
Then, word segmentation is performed on the training corpus, the Chinese model is used to convert each training text into a standard part-of-speech sequence, and feature variables are extracted for each word, turning unstructured data into a structured feature matrix. The extracted feature variables include but are not limited to part of speech, contextual information and word structure. For example, after the training text "Company 1 launched the product The grand life of honor." is segmented, the extracted feature variables are as shown in Fig. 13.
Then, based on the conditional random field model, the extracted feature variables are used to train recognition of whether each training text contains the pre-labeled named entities. The relationship between the feature variables and whether the training corpus contains the pre-labeled named entities is thereby extracted, which yields the named entity model.
The named entity model obtained in this way can be used to recognize whether an input text contains the pre-labeled named entities. Because the pre-labeled named entities relate to the specified event, the server can determine whether the original text relates to the specified event from the result of recognizing whether it contains a named entity related to the specified event.
In this embodiment, word segmentation is performed on the training corpus in which named entities are labeled in advance, part-of-speech sequences are constructed using the Chinese model, and feature variables are extracted, where the pre-labeled named entities are named entities related to the specified event. The relationship between the feature variables and whether the training corpus contains the pre-labeled named entities is then obtained by training a conditional random field model, which yields the named entity model. The named entity model can thus recognize whether an input text contains a named entity related to the specified event, which is used to determine whether the original text relates to the specified event.
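The data preparation in step S80 can be sketched as below: annotated training text is converted into tokens with per-token labels, and a feature dictionary is built for each token. Several things here are assumptions for illustration only: the BIO labeling scheme, whitespace tokenization in place of Chinese word segmentation, the particular features chosen, and the sample sentence. Part-of-speech features are omitted because they require a tagger, and the conditional random field training itself is not shown.

```python
import re

# Matches annotations of the form {{label: entity text}}.
ANNOTATION = re.compile(r"\{\{(\w+):(.*?)\}\}")


def to_bio(annotated):
    """Convert annotated text into (tokens, tags) using BIO tags."""
    tokens, tags = [], []
    pos = 0
    for match in ANNOTATION.finditer(annotated):
        for tok in annotated[pos:match.start()].split():
            tokens.append(tok)
            tags.append("O")
        label = match.group(1)
        for i, tok in enumerate(match.group(2).split()):
            tokens.append(tok)
            tags.append(("B-" if i == 0 else "I-") + label)
        pos = match.end()
    for tok in annotated[pos:].split():
        tokens.append(tok)
        tags.append("O")
    return tokens, tags


def token_features(tokens, i):
    """Feature variables for token i: the word, its shape, and its context."""
    tok = tokens[i]
    return {
        "word": tok.lower(),
        "is_digit": tok.isdigit(),
        "is_title": tok.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


sample = "{{company_name: Company 1}} launched the {{product_name: Product A}} product"
tokens, tags = to_bio(sample)
features = [token_features(tokens, i) for i in range(len(tokens))]
```

The resulting feature dictionaries and tag sequences have the shape that sequence labeling toolkits typically expect as training input.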
Referring to Fig. 6, a first embodiment of the public opinion analysis device of the present invention provides a public opinion analysis device, which includes:
Acquisition module 10, configured to collect information published by a user and recognize the information to obtain the corresponding original text.
The present invention classifies the collected information to obtain information of the preset category, and then uses the named entity model to capture, from the information of the preset category, the target information related to the specified event. The text classification model and the named entity model together automate the capture of information about the specified event, for example promotion information for other companies' businesses published by employees of the company, outdated promotion information for the company's own business, or business promotion information published by competitor companies on a promotion platform.
This embodiment is illustrated by capturing information, published by employees of the company, that promotes business not belonging to the company; the specified event in this embodiment is the promotion of non-company business. The business being promoted may also be understood as a promoted product or commodity.
Specifically, as one implementation, acquisition module 10 first collects each item of information published by users on a social platform or promotion platform, for example information published by employees of the company in WeChat Moments or on a microblog platform. The users are the target group to be searched for published information related to the specified event.
Then, acquisition module 10 recognizes the text in the information to obtain the original text corresponding to the information.
Classification module 20, configured to determine, based on a preconfigured text classification model, whether the original text is a text of the preset category, where the preset category is the category to which information related to the specified event belongs.
After the original text is obtained, classification module 20 uses the preconfigured text classification model to determine whether the original text is a text of the preset category. It should be noted that the preset category is the category to which the information currently being searched for, that is, information related to the specified event, belongs; texts of other, unrelated categories are thereby removed.
Specifically, as one implementation, the preconfigured text classification model includes multiple pre-specified categories and maps an input text to the category it belongs to. The pre-specified categories can be configured as needed. Classifying the texts removes irrelevant information of other categories that does not relate to the specified event, so that target information is searched for only among information classified into the category to which information related to the specified event belongs. This reduces the workload of the search and improves its efficiency.
Because this embodiment needs to find information, published by employees of the company, that promotes non-company business, and such information is marketing information, the pre-specified categories of the text classification model in this embodiment include marketing text and non-marketing text. The preset category to be screened for is marketing text, which removes irrelevant information such as users' posts about their private lives.
When classifying the original text with the text classification model, classification module 20 first segments the original text into words and constructs a feature variable matrix, converting the original text into standardized data, which is then fed into the text classification model. Based on the relationship between text feature variables and the pre-specified categories learned in training, the text classification model assigns the original text to the category with the highest probability.
Classification module 20 thereby obtains the category of the original text.
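The conversion of segmented text into standardized data can be pictured as building a word-frequency matrix over a shared vocabulary. The sketch below is one possible reading of the "feature variable matrix"; the function name, the sorted vocabulary order and the sample tokens are assumptions for illustration.

```python
from collections import Counter


def build_feature_matrix(segmented_texts):
    """Return (vocab, matrix) where matrix[i][j] is the frequency of
    vocab[j] in text i. The vocabulary is sorted for a stable column order."""
    vocab = sorted({tok for tokens in segmented_texts for tok in tokens})
    matrix = []
    for tokens in segmented_texts:
        counts = Counter(tokens)
        matrix.append([counts[word] for word in vocab])
    return vocab, matrix


# Hypothetical segmented original texts.
vocab, matrix = build_feature_matrix([["buy", "buy", "now"], ["now"]])
```

Each row is then the standardized representation of one original text that a trained classifier would take as input.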
Then, classification module 20 determines whether the category of the original text is the same as the preset category. If it is, the original text is determined to be a text of the preset category; if not, the original text is determined not to be a text of the preset category. In this embodiment, a text of the preset category is a marketing text.
Further, to screen texts more precisely and reduce the number of texts the named entity model has to process, the marketing text category can be further divided into subcategories according to the company's business scope, such as financial marketing text, education marketing text and real estate marketing text. The subcategory of marketing text that falls within the company's business scope is used as the preset category, and only original texts belonging to that subcategory are retained, which removes other marketing texts unrelated to the company's business scope.
Identification module 30, configured to determine, if the original text is a text of the preset category, whether the original text relates to the specified event based on a preconfigured named entity model.
After the category determination result is obtained, if the original text is a text of the preset category, that is, it belongs to the category to which information related to the specified event belongs, identification module 30 needs to determine whether the original text relates to the specified event.
Specifically, identification module 30 first uses the preconfigured named entity model to recognize whether the original text contains a preset named entity, obtaining a recognition result. Identification module 30 then determines, according to the recognition result and a preset decision condition, whether the original text relates to the specified event.
The preset decision condition can be either that an original text containing the preset named entity relates to the specified event, or that an original text not containing the preset named entity relates to the specified event.
It should be noted that identification module 30 configures the named entities involved in the specified event as the preset named entities; these can be configured flexibly according to the specific content of the specified event.
As one implementation, because this embodiment needs to find information, published by employees of the company, that promotes non-company business, the preset named entities are the names of the company's businesses, and an original text that does not contain a preset named entity is determined to relate to the specified event. That is, an original text containing none of the company's business names is a text that promotes non-company business.
In a specific implementation, identification module 30 first segments the original text, performs part-of-speech tagging using a Chinese model, removes useless words, and converts the original text into a part-of-speech sequence of standard structure. The sequence is then fed into the preconfigured named entity model, which recognizes whether the original text contains a preset named entity and produces the recognition result.
If the original text does not contain a preset named entity, identification module 30 determines that it relates to the specified event; if the original text contains a preset named entity, identification module 30 determines that it does not relate to the specified event.
Identification module 30 thereby obtains the determination result.
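The two possible decision conditions can be expressed as a single flag. In this sketch the recognized entities are assumed to be already available as a list of strings; the function name, the flag `event_if_present` and the sample business names are hypothetical.

```python
def involves_specified_event(recognized_entities, preset_entities, event_if_present):
    """Apply the preset decision condition to a recognition result.

    event_if_present=True:  a text containing a preset entity relates to the event.
    event_if_present=False: a text containing no preset entity relates to the event.
    """
    contains_preset = any(e in preset_entities for e in recognized_entities)
    return contains_preset if event_if_present else not contains_preset


# Hypothetical names of the company's own businesses.
own_businesses = {"Own Plan A", "Own Plan B"}

# Marketing text mentioning none of the company's businesses: non-company promotion.
flag_other = involves_specified_event(["Other Plan"], own_businesses, event_if_present=False)
# Marketing text mentioning one of the company's businesses: not the specified event.
flag_own = involves_specified_event(["Own Plan A"], own_businesses, event_if_present=False)
```

Setting `event_if_present=True` covers the alternative case in which texts that do contain the preset named entity are the ones sought.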
Determination module 40, configured to determine, if the original text relates to the specified event, that the information corresponding to the original text is target information related to the specified event.
If the original text relates to the specified event, that is, the original text is a marketing text that contains none of the company's business names, determination module 40 determines that the information corresponding to this original text is target information related to the specified event. The target information obtained is marketing information that contains none of the company's business names and therefore involves the promotion of non-company business.
Of course, depending on the specified event, the present invention can also, based on the preconfigured named entity model, determine that an original text containing the preset named entity is the text related to the specified event.
For example, when searching for business promotion information published by a competitor company on a promotion platform, the specified event is the competitor's business promotion and the preset named entity is the competitor company. Identification module 30 then screens the marketing messages published on the promotion platform for those containing the preset named entity, which are the target messages related to the specified event.
In this embodiment, acquisition module 10 collects information published by users and recognizes it to obtain the corresponding original text. Classification module 20 then determines, based on the preconfigured text classification model, whether the original text is a text of the preset category. Because the preset category is the category to which information related to the specified event belongs, texts of other categories unrelated to such information are removed, which reduces the workload and improves the hit rate for target information. If the original text is a text of the preset category, identification module 30 determines, based on the preconfigured named entity model, whether the original text relates to the specified event; if it does, determination module 40 determines that the information corresponding to the original text is target information related to the specified event. The original texts of the preset category screened out by the text classification model belong to the type of information related to the specified event, and original texts unrelated to that type have been excluded. Then, based on the named entities contained in the original text and the named entities involved in the specified event, named entity recognition is performed on the original text with the named entity model, and whether the original text relates to the specified event is determined from the recognition result. This yields an analysis result on whether the information published by the user relates to the specified event, so that information related to the specified event is identified automatically without manually checking the content of each item. Moreover, automatic classification and named entity recognition of text information based on the text classification model and the named entity model ensure the accuracy with which the obtained target information relates to the specified event, meeting practical application needs.
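The cooperation of the four modules can be summarized in a short pipeline sketch. The trained models are replaced by toy stand-ins (`toy_classify`, `toy_recognize`), and all names, the post layout and the sample data are assumptions for illustration, not the device's actual interfaces.

```python
def find_target_information(posts, classify, recognize_entities,
                            preset_category, preset_entities, event_if_present):
    """Filter posts by category, then by the named entity decision condition."""
    targets = []
    for post in posts:
        text = post["text"]                      # original text of the post
        if classify(text) != preset_category:    # classification step
            continue
        found = recognize_entities(text)         # named entity recognition step
        contains = any(e in preset_entities for e in found)
        if contains == event_if_present:         # decision condition
            targets.append(post)                 # target information
    return targets


def toy_classify(text):
    """Stand-in for the trained text classification model."""
    return "marketing" if "discount" in text else "non-marketing"


def toy_recognize(text):
    """Stand-in for the trained named entity model."""
    return [name for name in ("Own Plan",) if name in text]


posts = [
    {"user": "alice", "text": "Buy Product X today, big discount"},
    {"user": "bob", "text": "Buy Own Plan today, big discount"},
    {"user": "carol", "text": "Dinner with family"},
]
non_company = find_target_information(
    posts, toy_classify, toy_recognize, "marketing", {"Own Plan"}, event_if_present=False
)
```

Here only the first post is returned: it is marketing text and mentions none of the company's own business names.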
Further, referring to Fig. 6, a second embodiment of the public opinion analysis device of the present invention provides a public opinion analysis device. Based on the first embodiment of the public opinion analysis device described above, when the information is picture information, acquisition module 10 is further configured to:
recognize the text in the picture information using multiple threads, obtaining the original text corresponding to the picture information.
Because the information users publish may be pictures, the items of information collected by acquisition module 10 may include picture information.
When picture information is collected, acquisition module 10 needs to perform text recognition on the pictures it contains. For example, image captioning technology can be used to read the content of a picture automatically.
To increase the speed of information collection, acquisition module 10 can use multiple threads to read the text in several pictures at the same time; the text that is read is the original text corresponding to the picture information.
In this embodiment, when the collected information is picture information, acquisition module 10 controls multiple threads to recognize the text in each picture, and the recognized text serves as the original text corresponding to the picture information. The original text can then be classified and subjected to named entity recognition, so that picture information is also checked and it can be determined whether it is target information related to the specified event. Because picture recognition is multithreaded in this embodiment, information collection is also faster.
Further, referring to Fig. 7, a third embodiment of the public opinion analysis device of the present invention provides a public opinion analysis device. Based on any embodiment shown in Fig. 6 above, the public opinion analysis device further includes:
Labeling module 50, configured to mark the user as a risk user.
After the target information is obtained, labeling module 50 automatically obtains the user information of the user who published it, such as the account name, and marks that user as a risk user.
Further, if there are multiple items of target information, labeling module 50 can obtain the user information corresponding to each item and compile a risk user list. Each user on the list is marked as a risk user, and the list is available for further manual analysis or verification.
In this embodiment, labeling module 50 automatically marks users who publish target information. Each item of information published by each user is classified and subjected to named entity recognition, which yields the risk users who published target information related to the specified event. The search is thereby narrowed down to individual users, which facilitates further processing.
Further, referring to Fig. 8, a fourth embodiment of the public opinion analysis device of the present invention provides a public opinion analysis device. Based on the embodiment shown in Fig. 6 or Fig. 7 above (this embodiment takes Fig. 6 as an example), the public opinion analysis device further includes:
Classification model training module 60, configured to perform word segmentation on a training corpus whose categories have been labeled in advance and extract feature variables, where the pre-labeled categories include the preset category; and to train a naive Bayes model to obtain the relationship between the feature variables and the pre-labeled categories, thereby obtaining the text classification model.
In this embodiment, classification model training module 60 trains the text classification model using a naive Bayes model, and the resulting text classification model can classify the input original text.
As one implementation, the training corpus consists of texts of different categories whose categories are labeled in advance. For example, as shown in Fig. 10, the training corpus is divided into different categories and labeled according to its semantics.
Then, classification model training module 60 performs word segmentation on the training corpus and extracts feature variables. The extracted feature variables can be word frequency, part of speech, and so on. For example, as shown in Fig. 11, after the training corpus is segmented, its word frequencies are extracted as its feature variables.
Then, based on the naive Bayes model, classification model training module 60 trains to extract the relationship between the feature variables and each pre-labeled category.
Specifically, a standard feature variable matrix can be constructed from the feature variables of the training texts and used as X, with the pre-labeled categories used as Y. The naive Bayes model is trained to obtain the relationship between the feature variables X and the categories Y, which yields the text classification model.
As shown in Fig. 12, after each training text is segmented and its feature variables are extracted, a standardized feature variable matrix is constructed for each training text from the words it contains and their word frequencies. Then, from the feature variable matrix of each training text and its corresponding category, the relationship between feature variables and categories can be found based on the naive Bayes model, so that after repeated training each training text is assigned to the correct category with maximum probability.
Classification model training module 60 thereby completes the training of the text classification model.
It should be noted that the pre-labeled categories include the category to which information related to the specified event belongs, so that the model can identify whether an original text belongs to that category.
In this embodiment, classification model training module 60 performs word segmentation on the training corpus whose categories are labeled in advance and extracts feature variables; the relationship between the feature variables and the pre-labeled categories is then obtained by training a naive Bayes model, which yields the text classification model. Because the categories labeled on the training corpus include the preset category, the trained text classification model can classify an original text fed into it and output its category, from which it is determined whether the original text is a text of the preset category.
Further, referring to Fig. 9, a fifth embodiment of the public opinion analysis device of the present invention provides a public opinion analysis device. Based on the embodiment shown in Fig. 6, Fig. 7 or Fig. 8 above (this embodiment takes Fig. 6 as an example), the public opinion analysis device further includes:
Named entity model training module 70, configured to perform word segmentation on a training corpus in which named entities have been labeled in advance, construct part-of-speech sequences using a Chinese model, and extract feature variables, where the pre-labeled named entities are named entities related to the specified event; and to train a conditional random field model to obtain the relationship between the feature variables and whether the training corpus contains the pre-labeled named entities, thereby obtaining the named entity model.
In this embodiment, named entity model training module 70 obtains the named entity model by training a conditional random field model, so that named entity recognition can be performed on the input original text.
Specifically, as one implementation, the training corpus consists of individual texts, each labeled with named entities in advance, for example: {{company_name: Company 1}} launched the {{product_name: The grand life of honor}} product, where company_name and product_name are the labels used to mark named entities. It should be noted that the labeled named entities are named entities related to the specified event.
Then, named entity model training module 70 performs word segmentation on the training corpus, uses the Chinese model to convert each training text into a standard part-of-speech sequence, and extracts feature variables for each word, turning unstructured data into a structured feature matrix. The extracted feature variables include but are not limited to part of speech, contextual information and word structure. For example, after the training text "Company 1 launched the product The grand life of honor." is segmented, the extracted feature variables are as shown in Fig. 13.
Then, based on the conditional random field model, named entity model training module 70 uses the extracted feature variables to train recognition of whether each training text contains the pre-labeled named entities. The relationship between the feature variables and whether the training corpus contains the pre-labeled named entities is thereby extracted, which yields the named entity model.
The named entity model obtained in this way can be used to recognize whether an input text contains the pre-labeled named entities. Because the pre-labeled named entities relate to the specified event, identification module 30 can determine whether the original text relates to the specified event from the result of recognizing whether it contains a named entity related to the specified event.
In this embodiment, named entity model training module 70 performs word segmentation on the training corpus in which named entities are labeled in advance, constructs part-of-speech sequences using the Chinese model, and extracts feature variables, where the pre-labeled named entities are named entities related to the specified event. The relationship between the feature variables and whether the training corpus contains the pre-labeled named entities is then obtained by training a conditional random field model, which yields the named entity model. The named entity model can thus recognize whether an input text contains a named entity related to the specified event, which is used to determine whether the original text relates to the specified event.
The above are only optional embodiments of the present invention and do not limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A public opinion analysis method, characterized in that the public opinion analysis method comprises the following steps:
collecting information published by a user, and recognizing the information to obtain the corresponding original text;
determining, based on a preconfigured text classification model, whether the original text is a text of a preset category, the preset category being the category to which information related to a specified event belongs;
if the original text is a text of the preset category, determining, based on a preconfigured named entity model, whether the original text relates to the specified event;
if the original text relates to the specified event, determining that the information corresponding to the original text is target information related to the specified event.
2. The public opinion analysis method according to claim 1, characterized in that, when the information is picture information, the step of recognizing the information to obtain the corresponding original text comprises:
recognizing the text in the picture information using multiple threads to obtain the original text corresponding to the picture information.
3. The public opinion analysis method according to claim 1, characterized in that, after the step of determining that the information corresponding to the original text is target information related to the specified event, the method further comprises:
marking the user as a risk user.
4. The public opinion analysis method according to claim 1, 2 or 3, characterized in that, before the step of collecting information published by a user and recognizing the information to obtain the corresponding original text, the method further comprises:
performing word segmentation on a training corpus whose categories have been labeled in advance, and extracting feature variables, the pre-labeled categories including the preset category;
training a naive Bayes model to obtain the relationship between the feature variables and the pre-labeled categories, thereby obtaining the text classification model.
5. the analysis of public opinion method as described in claim 1 or 2 or 3, it is characterised in that the information of the collection user issue, Before the step of identification described information obtains corresponding original text, in addition to:
Participle is carried out to the training corpus of the entity of mark name in advance, using Chinese Construction of A Model part of speech sequence, feature is extracted Variable, the name entity marked in advance is the name entity for being related to the specified event;
The characteristic variable is obtained based on conditional random field models training described advance with whether including in the training corpus The relation of the name entity of mark, obtains the name physical model.
6. A public opinion analysis device, characterized in that the public opinion analysis device comprises:
an acquisition module, configured to collect information published by a user and recognize the information to obtain a corresponding initial text;
a classification module, configured to judge, based on a pre-configured text classification model, whether the initial text is a text of a preset category, the preset category being the category to which information relating to a specified event belongs;
an identification module, configured to judge, based on a pre-configured named entity model, whether the initial text relates to the specified event if the initial text is a text of the preset category;
a determining module, configured to determine, if the initial text relates to the specified event, that the information corresponding to the initial text is target information relating to the specified event.
7. The public opinion analysis device according to claim 6, characterized in that, when the information is picture information, the acquisition module is further configured to:
recognize the text information in the picture information using multiple threads, to obtain the initial text corresponding to the picture information.
8. The public opinion analysis device according to claim 6, characterized in that the public opinion analysis device further comprises:
a marking module, configured to mark the user as a risk user.
9. The public opinion analysis device according to claim 6, 7 or 8, characterized in that the public opinion analysis device further comprises:
a classification model training module, configured to perform word segmentation on a training corpus with pre-labeled categories and extract feature variables, the pre-labeled categories including the preset category; and to train, based on a naive Bayes model, the relationship between the feature variables and the pre-labeled categories, to obtain the text classification model.
10. The public opinion analysis device according to claim 6, 7 or 8, characterized in that the public opinion analysis device further comprises:
a named entity model training module, configured to perform word segmentation on a training corpus with pre-labeled named entities, construct a part-of-speech sequence using a Chinese model, and extract feature variables, the pre-labeled named entities being named entities relating to the specified event; and to train, based on a conditional random field model, the relationship between the feature variables and whether the training corpus contains the pre-labeled named entities, to obtain the named entity model.
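
For illustration only, the two-stage screening described in claims 1, 3 and 4 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the whitespace tokenizer, the tiny English corpus, the category labels, the entity set `EVENT_ENTITIES` and every function name are hypothetical. The text classifier is a hand-written multinomial naive Bayes with add-one smoothing, and the entity check is a plain token lookup that merely stands in for the named entity model of claim 5 (it is not a conditional random field).

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Stand-in for word segmentation (assumption): lowercase whitespace split.
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes over word counts, with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)          # documents per category
        self.word_counts = defaultdict(Counter)      # word counts per category
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)    # log prior
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def mentions_event(text, event_entities):
    # Stand-in for the named entity model (assumption): token lookup, not a CRF.
    return any(token in event_entities for token in tokenize(text))


def find_target_information(items, classifier, preset_category, event_entities):
    """Keep items whose text is in the preset category AND mentions the event."""
    targets = []
    for item in items:
        if classifier.predict(item["text"]) != preset_category:
            continue  # stage 1: not a text of the preset category
        if mentions_event(item["text"], event_entities):
            targets.append(item)  # stage 2: relates to the specified event
    return targets


# Hypothetical pre-labeled training corpus and collected items.
TRAIN_TEXTS = [
    "fund run away investors lost money",
    "platform default investors angry",
    "lovely weather for a picnic today",
    "great football match last night",
]
TRAIN_LABELS = ["financial_risk", "financial_risk", "other", "other"]
EVENT_ENTITIES = {"acme"}
ITEMS = [
    {"user": "u1", "text": "investors say acme platform default"},
    {"user": "u2", "text": "investors lost money in some fund"},
    {"user": "u3", "text": "acme football match today"},
]

classifier = NaiveBayesTextClassifier().fit(TRAIN_TEXTS, TRAIN_LABELS)
targets = find_target_information(ITEMS, classifier, "financial_risk", EVENT_ENTITIES)
risk_users = {item["user"] for item in targets}  # marking step of claim 3
print(sorted(risk_users))
```

In this sketch the cheaper category filter runs first, so the entity check is applied only to texts that already fall in the preset category, mirroring the order of the steps in claim 1.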
CN201610628754.0A 2016-08-03 2016-08-03 Public opinion analysis method and device Active CN107038178B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610628754.0A CN107038178B (en) 2016-08-03 2016-08-03 Public opinion analysis method and device
PCT/CN2017/077965 WO2018023981A1 (en) 2016-08-03 2017-03-24 Public opinion analysis method, device, apparatus and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610628754.0A CN107038178B (en) 2016-08-03 2016-08-03 Public opinion analysis method and device

Publications (2)

Publication Number Publication Date
CN107038178A (en) 2017-08-11
CN107038178B (en) 2020-07-21

Family

ID=59532642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610628754.0A Active CN107038178B (en) 2016-08-03 2016-08-03 Public opinion analysis method and device

Country Status (2)

Country Link
CN (1) CN107038178B (en)
WO (1) WO2018023981A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784083A (en) * 2017-09-30 2018-03-09 北京合力智联科技有限公司 A kind of automatic identification processing method of network public sentiment information validity
CN108170742A (en) * 2017-12-19 2018-06-15 百度在线网络技术(北京)有限公司 Picture public sentiment acquisition methods, device, computer equipment and storage medium
CN108829678A (en) * 2018-06-20 2018-11-16 广东外语外贸大学 Name entity recognition method in a kind of Chinese international education field
CN109376237A (en) * 2018-09-04 2019-02-22 中国平安人寿保险股份有限公司 Prediction technique, device, computer equipment and the storage medium of client's stability
CN110287313A (en) * 2019-05-20 2019-09-27 阿里巴巴集团控股有限公司 A kind of the determination method and server of risk subject
WO2019184118A1 (en) * 2018-03-26 2019-10-03 平安科技(深圳)有限公司 Risk model training method and apparatus, a risk identification method and apparatus, and device and medium
CN110737820A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Method and apparatus for generating event information
CN110929026A (en) * 2018-09-19 2020-03-27 阿里巴巴集团控股有限公司 Abnormal text recognition method and device, computing equipment and medium
CN111488737A (en) * 2019-01-09 2020-08-04 阿里巴巴集团控股有限公司 Text recognition method, device and equipment
CN112749269A (en) * 2019-10-31 2021-05-04 北京国双科技有限公司 Entity public opinion calculation method and system
CN117575829A (en) * 2023-11-24 2024-02-20 之江实验室 Public opinion propagation modeling simulation and risk early warning method based on large language model

Families Citing this family (49)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241528B (en) * 2018-08-24 2023-09-01 讯飞智元信息科技有限公司 Criminal investigation result prediction method, device, equipment and storage medium
CN109145216B (en) * 2018-08-29 2023-08-25 中国平安保险(集团)股份有限公司 Network public opinion monitoring method, device and storage medium
CN109325165B (en) * 2018-08-29 2023-08-22 中国平安保险(集团)股份有限公司 Network public opinion analysis method, device and storage medium
CN109582949B (en) * 2018-09-14 2022-11-22 创新先进技术有限公司 Event element extraction method and device, computing equipment and storage medium
CN109344401B (en) * 2018-09-18 2023-04-28 深圳市元征科技股份有限公司 Named entity recognition model training method, named entity recognition method and named entity recognition device
CN109472018A (en) * 2018-09-26 2019-03-15 深圳壹账通智能科技有限公司 Enterprise's public sentiment monitoring method, device, computer equipment and storage medium
CN109344232B (en) * 2018-11-13 2024-03-15 平安科技(深圳)有限公司 Public opinion information retrieval method and terminal equipment
CN109635074B (en) * 2018-11-13 2024-05-07 平安科技(深圳)有限公司 Entity relationship analysis method and terminal equipment based on public opinion information
CN109740146B (en) * 2018-12-10 2023-02-03 厦门市美亚柏科信息股份有限公司 Public opinion monitoring method, terminal and storage medium
CN109710933A (en) * 2018-12-25 2019-05-03 广州天鹏计算机科技有限公司 Acquisition methods, device, computer equipment and the storage medium of training corpus
CN109726397B (en) * 2018-12-27 2024-02-02 网易(杭州)网络有限公司 Labeling method and device for Chinese named entities, storage medium and electronic equipment
CN109918645B (en) * 2019-01-28 2022-12-02 平安科技(深圳)有限公司 Method and device for deeply analyzing text, computer equipment and storage medium
CN109933709B (en) * 2019-01-31 2023-09-26 平安科技(深圳)有限公司 Public opinion tracking method and device for video text combined data and computer equipment
CN109902099B (en) * 2019-01-31 2023-09-26 平安科技(深圳)有限公司 Public opinion tracking method and device based on graphic and text big data and computer equipment
CN109858039B (en) * 2019-03-01 2023-09-05 北京奇艺世纪科技有限公司 Text information identification method and identification device
CN110008445B (en) * 2019-03-08 2023-04-18 创新先进技术有限公司 Event extraction method and device and electronic equipment
CN110097250B (en) * 2019-03-20 2024-07-02 平安直通咨询有限公司上海分公司 Product risk prediction method, device, computer equipment and storage medium
CN110175733B (en) * 2019-04-01 2023-07-11 创新先进技术有限公司 Public opinion information processing method and server
CN110134845A (en) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Project public sentiment monitoring method, device, computer equipment and storage medium
CN110263158B (en) * 2019-05-24 2023-08-01 创新先进技术有限公司 Data processing method, device and equipment
CN112132368A (en) * 2019-06-06 2020-12-25 阿里巴巴集团控股有限公司 Information processing method and device, computing equipment and storage medium
CN110347830B (en) * 2019-06-28 2023-09-05 创新先进技术有限公司 Public opinion early warning implementation method and device
CN110609950B (en) * 2019-08-02 2022-09-16 济南大学 Public opinion system search word recommendation method and system
CN110674297B (en) * 2019-09-24 2022-04-29 支付宝(杭州)信息技术有限公司 Public opinion text classification model construction method, public opinion text classification device and public opinion text classification equipment
CN110826330B (en) * 2019-10-12 2023-11-07 上海数禾信息科技有限公司 Name recognition method and device, computer equipment and readable storage medium
CN112733869B (en) * 2019-10-28 2024-05-28 中移信息技术有限公司 Method, device, equipment and storage medium for training text recognition model
CN110866387A (en) * 2019-11-04 2020-03-06 云目未来科技(北京)有限公司 Method and device for processing text information for public opinion analysis and storage medium
CN111026885B (en) * 2019-12-23 2023-09-01 公安部第三研究所 Terrorism event entity attribute extraction system and method based on text corpus
CN111144118B (en) * 2019-12-26 2023-05-12 携程计算机技术(上海)有限公司 Method, system, equipment and medium for identifying named entities in spoken text
CN111159525A (en) * 2019-12-31 2020-05-15 中国银行股份有限公司 Text information acquisition method and device
CN111324706B (en) * 2020-01-21 2023-05-26 全球能源互联网研究院有限公司 Labeling method and device and electronic equipment
CN111310014A (en) * 2020-02-21 2020-06-19 深圳中兴网信科技有限公司 Scenic spot public opinion monitoring system, method, device and storage medium based on deep learning
CN111695439B (en) * 2020-05-20 2024-05-10 平安科技(深圳)有限公司 Image structured data extraction method, electronic device and storage medium
CN111538888A (en) * 2020-06-05 2020-08-14 国网山东省电力公司检修公司 Network public opinion intensity evolution analysis system based on active monitoring engine and big data
CN111680226A (en) * 2020-06-16 2020-09-18 杭州安恒信息技术股份有限公司 Network public opinion analysis method, device, system, equipment and readable storage medium
CN111914141B (en) * 2020-07-30 2023-01-10 广州城市信息研究所有限公司 Public opinion knowledge base construction method and public opinion knowledge base
CN111881382B (en) * 2020-07-30 2024-05-14 北京百度网讯科技有限公司 Information display method and device, system and medium implemented by computer system
CN112035668B (en) * 2020-09-02 2024-09-20 深圳前海微众银行股份有限公司 Event main body recognition model optimization method, device, equipment and readable storage medium
CN112800343B (en) * 2021-02-01 2022-09-30 霍尔果斯大颜色信息科技有限公司 Method and system for monitoring network public sentiment based on big data
CN112818234B (en) * 2021-02-02 2022-09-02 霍尔果斯大颜色信息科技有限公司 Network public opinion information analysis processing method and system
CN113094620B (en) * 2021-04-23 2023-10-10 中南大学 Network public opinion cloud platform data analysis model exchange method, system and platform
CN113449508B (en) * 2021-07-15 2023-01-17 上海理工大学 Internet public opinion correlation deduction prediction analysis method based on event chain
CN113435861A (en) * 2021-07-15 2021-09-24 支付宝(杭州)信息技术有限公司 Public opinion data-based business operation and maintenance method and device and electronic equipment
CN113536133B (en) * 2021-07-30 2023-04-11 西安康奈网络科技有限公司 Internet data processing method based on single public opinion event
CN113609391B (en) * 2021-08-06 2024-04-19 北京金堤征信服务有限公司 Event recognition method and device, electronic equipment, medium and program
CN113610427B (en) * 2021-08-19 2023-08-18 深圳市德信软件有限公司 Event early warning index obtaining method, device, terminal equipment and storage medium
CN113626718A (en) * 2021-09-18 2021-11-09 广东电网有限责任公司广州供电局 Man-machine interaction event processing method and system for enterprise management system
CN115600601B (en) * 2022-11-08 2023-03-31 税友软件集团股份有限公司 Method, device, equipment and medium for constructing tax law knowledge base
CN117649117B (en) * 2024-01-30 2024-05-07 浙江数洋科技有限公司 Treatment scheme determining method and device and computer equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101751458A (en) * 2009-12-31 2010-06-23 暨南大学 Network public sentiment monitoring system and method
CN103186600A (en) * 2011-12-28 2013-07-03 北大方正集团有限公司 Specific analysis method and device of Internet public sentiment
CN103841216A (en) * 2014-04-01 2014-06-04 深圳市科盾科技有限公司 Network public opinion monitoring system based on cloud platform
CN104346408A (en) * 2013-08-08 2015-02-11 中国移动通信集团公司 Method and equipment for labeling network user
KR20150046793A (en) * 2013-10-21 2015-05-04 대한민국(국민안전처 국립재난안전연구원장) Disaster detecting system using social media
CN104881417A (en) * 2014-02-28 2015-09-02 深圳市网安计算机安全检测技术有限公司 Public opinion analyzing method and system
CN105183299A (en) * 2015-09-30 2015-12-23 珠海许继芝电网自动化有限公司 Human-computer interface service processing system and method

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107784083A (en) * 2017-09-30 2018-03-09 北京合力智联科技有限公司 A kind of automatic identification processing method of network public sentiment information validity
CN108170742A (en) * 2017-12-19 2018-06-15 百度在线网络技术(北京)有限公司 Picture public sentiment acquisition methods, device, computer equipment and storage medium
WO2019184118A1 (en) * 2018-03-26 2019-10-03 平安科技(深圳)有限公司 Risk model training method and apparatus, a risk identification method and apparatus, and device and medium
CN108829678A (en) * 2018-06-20 2018-11-16 广东外语外贸大学 Name entity recognition method in a kind of Chinese international education field
CN110737820A (en) * 2018-07-03 2020-01-31 百度在线网络技术(北京)有限公司 Method and apparatus for generating event information
CN109376237A (en) * 2018-09-04 2019-02-22 中国平安人寿保险股份有限公司 Prediction technique, device, computer equipment and the storage medium of client's stability
CN109376237B (en) * 2018-09-04 2024-05-28 中国平安人寿保险股份有限公司 Client stability prediction method, device, computer equipment and storage medium
CN110929026A (en) * 2018-09-19 2020-03-27 阿里巴巴集团控股有限公司 Abnormal text recognition method and device, computing equipment and medium
CN110929026B (en) * 2018-09-19 2023-04-25 阿里巴巴集团控股有限公司 Abnormal text recognition method, device, computing equipment and medium
CN111488737A (en) * 2019-01-09 2020-08-04 阿里巴巴集团控股有限公司 Text recognition method, device and equipment
CN111488737B (en) * 2019-01-09 2023-04-14 阿里巴巴集团控股有限公司 Text recognition method, device and equipment
CN110287313A (en) * 2019-05-20 2019-09-27 阿里巴巴集团控股有限公司 A kind of the determination method and server of risk subject
CN112749269A (en) * 2019-10-31 2021-05-04 北京国双科技有限公司 Entity public opinion calculation method and system
CN117575829A (en) * 2023-11-24 2024-02-20 之江实验室 Public opinion propagation modeling simulation and risk early warning method based on large language model
CN117575829B (en) * 2023-11-24 2024-09-13 之江实验室 Public opinion propagation modeling simulation and risk early warning method based on large language model

Also Published As

Publication number Publication date
CN107038178B (en) 2020-07-21
WO2018023981A1 (en) 2018-02-08

Similar Documents

Publication Publication Date Title
CN107038178A (en) The analysis of public opinion method and apparatus
CN102054015B (en) System and method of organizing community intelligent information by using organic matter data model
CN111738011A (en) Illegal text recognition method and device, storage medium and electronic device
TWI438637B (en) Systems and methods for capturing and managing collective social intelligence information
CN112347244B (en) Yellow-based and gambling-based website detection method based on mixed feature analysis
CN108885623B (en) Semantic analysis system and method based on knowledge graph
CN104067567B (en) System and method for carrying out spam detection using character histogram
CN111078978B (en) Network credit website entity identification method and system based on website text content
CN110110577B (en) Method and device for identifying dish name, storage medium and electronic device
CN103336766A (en) Short text garbage identification and modeling method and device
CN108550054B (en) Content quality evaluation method, device, equipment and medium
CN110737821B (en) Similar event query method, device, storage medium and terminal equipment
CN102227724A (en) Machine learning for transliteration
CN107491435A (en) Method and device based on Computer Automatic Recognition user feeling
US9245035B2 (en) Information processing system, information processing method, program, and non-transitory information storage medium
CN103729474A (en) Method and system for identifying vest account numbers of forum users
CN110472057B (en) Topic label generation method and device
JP5098631B2 (en) Mail classification system, mail search system
CN110858353A (en) Method and system for obtaining case referee result
CN112016317A (en) Sensitive word recognition method and device based on artificial intelligence and computer equipment
CN115238688B (en) Method, device, equipment and storage medium for analyzing association relation of electronic information data
CN111782793A (en) Intelligent customer service processing method, system and equipment
CN115577172A (en) Article recommendation method, device, equipment and medium
CN104462279B (en) Analyze the acquisition methods and device of characteristics of objects information
US20180063056A1 (en) Message sorting system, message sorting method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant