
CN114492432A - Cooperative enterprise identification method and device - Google Patents

Publication number
CN114492432A
Authority
CN
China
Prior art keywords
data
enterprise
hot search
hot
public opinion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210099908.7A
Other languages
Chinese (zh)
Inventor
胡屹
江一帆
马无缰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210099908.7A
Publication of CN114492432A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for identifying a cooperative enterprise, belongs to the technical field of artificial intelligence, and can be applied to the technical field of finance or other technical fields. The cooperative enterprise identification method comprises the following steps: determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank; integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain public opinion data corresponding to the integrated hot search abstracts; retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model established based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result; and identifying the cooperative enterprises according to the enterprise cooperation prediction result. The invention can grasp the cooperation opportunity in time and effectively improve the cooperation benefit.

Description

Cooperative enterprise identification method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for identifying a cooperative enterprise.
Background
In fierce market competition, banks often choose partner organizations with which to release co-branded products, such as themed credit cards or themed certificates of deposit, as a product marketing strategy, so that the influence of both parties raises the value of the products. At present, the screening of cooperative enterprises often relies on managers' intuitive judgment of an enterprise and lacks both data support and a comprehensive, automated system evaluation scheme, so cooperation opportunities with external organizations that arise from public opinion hotspots are not pursued in a timely manner.
Disclosure of Invention
The main purpose of the embodiments of the invention is to provide a method and a device for identifying a cooperative enterprise that grasp cooperation opportunities in a timely manner and effectively improve the cooperation benefit.
In order to achieve the above object, an embodiment of the present invention provides a method for identifying a collaborative enterprise, including:
determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain public opinion data corresponding to the integrated hot search abstracts;
retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model established based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result;
and identifying the cooperative enterprises according to the enterprise cooperation prediction result.
The embodiment of the invention also provides a device for identifying the cooperative enterprise, which comprises:
the set determining module is used for determining a hot search keyword set and a search keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word library;
the public opinion data module is used for integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to acquire the public opinion data corresponding to the integrated hot search abstracts;
the prediction module is used for retrieving the search keyword set to obtain enterprise data, inputting the public opinion data and the enterprise data into a prediction model established based on the public opinion training data, the enterprise training data and the prediction result data, and obtaining an enterprise cooperation prediction result;
and the identification module is used for identifying the cooperative enterprises according to the enterprise cooperation prediction result.
The embodiment of the invention also provides computer equipment which comprises a memory, a processor and a computer program stored on the memory and running on the processor, wherein the processor realizes the steps of the cooperative enterprise identification method when executing the computer program.
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the cooperative enterprise identification method.
The method and the device for identifying a cooperative enterprise in the embodiments of the invention first determine the hot search keyword set and the retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search lexicon, and then integrate the hot search abstracts according to the similarity between the keyword sets to obtain the corresponding public opinion data. Enterprise data is then obtained according to the retrieval keyword set, and an enterprise cooperation prediction result is obtained from the public opinion data and the enterprise data in order to identify cooperative enterprises. In this way, cooperation opportunities can be grasped in time and the cooperation benefit can be effectively improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
FIG. 1 is a flow chart of a collaborative enterprise identification method in an embodiment of the present invention;
FIG. 2 is a flow chart of a collaborative enterprise identification method in accordance with another embodiment of the present invention;
FIG. 3 is a diagram of the correspondence among hot search titles, news titles, and news comments;
FIG. 4 is a flowchart of S101 in the embodiment of the present invention;
FIG. 5 is a flowchart of obtaining public opinion data according to an embodiment of the present invention;
FIG. 6 is a block diagram showing the construction of a cooperative enterprise identification apparatus in the embodiment of the present invention;
FIG. 7 is a block diagram of a computer device in the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
The prior art often relies on managers' intuitive judgment of an enterprise when screening cooperative enterprises, lacks data support and a comprehensive, automated system evaluation scheme, and is slow to pursue cooperation opportunities with external organizations that arise from public opinion hotspots. In view of this, embodiments of the present invention provide a method and an apparatus for identifying a cooperative enterprise. Based on hotspot news data and by means of artificial intelligence technology, they build an overall model over multiple dimensions, such as the public opinion dynamics of news topics, the market and risk of the organizations involved in the news, and the relationship of those organizations with the bank, so as to evaluate the benefit that a co-branded product would bring to the bank. The present invention will be described in detail below with reference to the accompanying drawings.
FIG. 1 is a flow chart of a method for identifying a collaborative enterprise according to an embodiment of the present invention. Fig. 2 is a flowchart of a method for identifying a collaborative enterprise according to another embodiment of the present invention. As shown in fig. 1-2, the method for identifying a collaborative enterprise includes:
S101: Determine a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank.
Before S101 is executed, the system uses crawler technology to periodically collect the news hot search lists published by major domestic and international news and information platforms, according to a preset collection channel, collection strategy, and collection elements. The collection channel is the target news platform to be collected from. The collection strategy is the set of parameters preset for the crawler, including the collection period, the collection mode (web page, official account, app, and so on), the collection path (the access path used to reach the news list after entering a news platform), and the number of hot search items to collect from each list. The collection elements are the contents to be collected for each hot search list, including the collection time, the collection platform, the hot search title, the news text, the news comments, the news like count, the news dislike count, the hot search words, and so on. Periodic collection means collecting at a fixed interval; the collection period can be daily, weekly, monthly, and so on (weekly is suggested).
FIG. 3 is a diagram of the correspondence among hot search titles, news titles, and news comments. As shown in FIG. 3, one or more news titles are often associated with the same hot search title, and the content of the news is highly relevant to the hot search title; the same piece of hot search news often has multiple news comments. The hot search title, the news titles, and the news comments therefore form a tree-like relationship, and h and j in FIG. 3 are positive integers.
For example, suppose a hot search entry is titled "# official opening of a park". Clicking the title yields multiple related news items, with titles such as "# official opening of a park, tickets snapped up", "A family of three easily spends 10,000 on one visit ……", and "Huge weekend crowds …… no tickets left". Through each news title, the news text and the news comments of that item can be viewed.
Next, the news titles collected in a time period are aggregated according to their association with the hot search titles. For example, all news titles associated with the same hot search title are concatenated end to end into a single text, and the "news title" mentioned in the subsequent steps refers to this aggregated text.
Similarly, the like counts and dislike counts of all news associated with the same hot search title are aggregated (the like counts of the different news items are summed, and the dislike counts are summed); the news like count and news dislike count mentioned in the subsequent steps refer to these aggregated values.
It should be noted that the news comments are not aggregated, because each comment may express a different emotion and must be analyzed and identified separately. In addition, the subsequent steps use the hot search title, the news like count, the news dislike count, and the news comments; the remaining information collected by the crawler (such as the collection time, the collection platform, and the news text) can be stored in the system as news details for manual inquiry.
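The aggregation described above can be sketched as follows. This is only a minimal illustration: the function name `aggregate_by_hot_title` and the dictionary keys (`hot_title`, `news_title`, `likes`, `dislikes`, `comments`) are hypothetical and do not come from the source.

```python
from collections import defaultdict


def aggregate_by_hot_title(news_items):
    """Group crawled news items by hot search title.

    Each item is a dict with the hypothetical keys hot_title, news_title,
    likes, dislikes and comments (a list of strings).
    """
    grouped = defaultdict(
        lambda: {"news_titles": "", "likes": 0, "dislikes": 0, "comments": []}
    )
    for item in news_items:
        entry = grouped[item["hot_title"]]
        entry["news_titles"] += item["news_title"]   # concatenate titles end to end
        entry["likes"] += item["likes"]              # sum like counts
        entry["dislikes"] += item["dislikes"]        # sum dislike counts
        entry["comments"].extend(item["comments"])   # comments stay individual
    return dict(grouped)


items = [
    {"hot_title": "A", "news_title": "t1", "likes": 3, "dislikes": 1, "comments": ["c1"]},
    {"hot_title": "A", "news_title": "t2", "likes": 2, "dislikes": 0, "comments": ["c2", "c3"]},
    {"hot_title": "B", "news_title": "t3", "likes": 1, "dislikes": 4, "comments": []},
]
result = aggregate_by_hot_title(items)
```

The comments are kept as a list rather than merged, so that each one can later be classified separately.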
In one embodiment, before executing S101, the method further includes:
generating original hot searching data according to the hot searching titles and the corresponding news titles; and inputting the original hot search data into a summary identification model created based on the historical summary data to obtain the hot search summary.
In a specific implementation, each hot search title and its associated news titles are concatenated end to end into a single text, which is defined as the original hot search data. The original hot search data is first segmented into words with an open source word segmentation library, and a summary identification model built from historical summary data then judges whether it is business news. Deciding whether an item is business news is a binary classification problem, so a classification model can be trained with supervised machine learning algorithms commonly used in the industry, such as naive Bayes, SVM, or LSTM. Model training yields a summary identification model that judges whether a news item is business news. When new original hot search data is obtained, the classification model identifies whether each piece is business news. Items that are not business news are not processed in the subsequent steps; the system keeps only the original hot search data corresponding to business news, as the hot search summaries, for subsequent processing.
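As one possible illustration of the binary classification step, the sketch below implements a small multinomial naive Bayes classifier with add-one smoothing over already segmented tokens. The class name `TinyNaiveBayes`, the label convention (1 for business news, 0 otherwise), and the toy training data are all assumptions made for the example; the source names naive Bayes only as one candidate algorithm and does not specify an implementation.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        self.total_docs = len(labels)
        return self

    def predict(self, tokens):
        best_label, best_score = None, -math.inf
        for c in self.classes:
            # log prior plus smoothed log likelihood of each known word
            score = math.log(self.doc_counts[c] / self.total_docs)
            denom = sum(self.word_counts[c].values()) + len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Toy, hypothetical training data: 1 = business news, 0 = other news.
docs = [
    ["park", "opening", "tickets"],
    ["bank", "credit", "card", "launch"],
    ["actor", "wedding", "photos"],
    ["weather", "storm", "warning"],
]
labels = [1, 1, 0, 0]
model = TinyNaiveBayes().fit(docs, labels)
```

In practice the tokens would come from the word segmentation step, and the training labels from the historical summary data.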
Fig. 4 is a flowchart of S101 in the embodiment of the present invention. As shown in fig. 4, S101 includes:
S201: Determine the adjusting parameter of each word according to the matching result of the vocabulary of each hot search abstract and the hot search word bank.
The hot search words here are the hot search words collected by the crawler; the system deduplicates them and stores them in a local hot search word bank for use in the subsequent steps. In a specific implementation, each word in a hot search abstract is matched against the hot search word bank. If the word matches an entry, it is treated as a hit and the adjusting parameter α is applied; otherwise it is treated as a miss and α is not applied.
S202: Determine the frequency data of each word according to its adjusting parameter, term frequency, and inverse text frequency index.
The invention calculates the frequency data of each word in a hot search abstract with a TF-IDF word frequency analysis method; the more a word contributes to the specific topic of the text, the higher its frequency data. In a specific implementation, the frequency data may be determined by the following formula:
[Formula image: BDA0003492060480000051]
where TFIDF_new is the frequency data of the word, TF is the term frequency of the word, and IDF is the inverse text frequency index of the word. α and α^-1 are both adjusting parameters: when the word matches an entry in the hot search lexicon the adjusting parameter is α, and when it does not the adjusting parameter is α^-1. Generally, α ≥ 1.
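The formula itself is an image in the source and does not survive in this text. One form consistent with the symbol definitions, assuming that the adjusting parameter simply multiplies the ordinary TF-IDF product (an assumption that the source does not confirm), would be:

```latex
\mathrm{TFIDF}_{new} =
\begin{cases}
\alpha \cdot \mathrm{TF} \cdot \mathrm{IDF}, & \text{word matches the hot search lexicon} \\
\alpha^{-1} \cdot \mathrm{TF} \cdot \mathrm{IDF}, & \text{otherwise}
\end{cases}
\qquad \alpha \ge 1
```

Under this reading, a matched word is boosted by a factor of α and an unmatched word is damped by the same factor, so α = 1 reduces to plain TF-IDF.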
S203: Determine a hot search keyword set and a retrieval keyword set according to the frequency data of each word.
In a specific implementation, the top N words by TFIDF_new in each hot search summary are taken as the keyword set of that summary (N may be 20 to 50 in practice), and the top Np words by TFIDF_new in each hot search summary are taken as the retrieval keyword set (Np < N, and Np is small, e.g., 3).
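Steps S201 to S203 can be sketched together as below. The sketch assumes that the adjusting parameter multiplies TF × IDF (α for words in the hot search lexicon, 1/α otherwise) and uses a smoothed IDF variant; both choices, and the names `adjusted_tfidf` and `top_words`, are assumptions for illustration rather than the source's definition.

```python
import math
from collections import Counter


def adjusted_tfidf(summaries, hot_words, alpha=1.5):
    """Score every word of every tokenized summary with an alpha-adjusted TF-IDF."""
    n_docs = len(summaries)
    doc_freq = Counter(w for tokens in summaries for w in set(tokens))
    all_scores = []
    for tokens in summaries:
        counts = Counter(tokens)
        doc_scores = {}
        for word, count in counts.items():
            tf = count / len(tokens)
            # Smoothed IDF (assumed variant); always positive.
            idf = math.log(n_docs / (1 + doc_freq[word])) + 1
            factor = alpha if word in hot_words else 1 / alpha
            doc_scores[word] = factor * tf * idf
        all_scores.append(doc_scores)
    return all_scores


def top_words(doc_scores, k):
    """Return the k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(doc_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:k]]


summaries = [["park", "opening", "park", "ticket"], ["bank", "card", "ticket"]]
scores = adjusted_tfidf(summaries, hot_words={"park"}, alpha=2.0)
keyword_set = top_words(scores[0], k=2)     # plays the role of N
retrieval_set = top_words(scores[0], k=1)   # plays the role of Np < N
```

Because the retrieval keyword set is a prefix of the same ranking, it is always a subset of the keyword set.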
S102: Integrate the hot search abstracts according to the similarity among their keyword sets to obtain public opinion data corresponding to the integrated hot search abstracts.
The integration of the hot search summaries comprises the integration of the hot search summaries in the same period and the integration of the hot search summaries in the current period and the historical period.
In one embodiment, integrating the hot search summaries according to the similarity between the keyword sets of the hot search summaries comprises:
converting the keyword set of each hot search abstract into a word vector; and when the similarity between the word vectors meets a preset similarity condition, integrating the corresponding hot search abstracts.
In a specific implementation, the keywords in a keyword set can be converted into K-dimensional word vectors by an open source pre-trained word vector model (K may be a power of 2, such as 128 or 256, or another integer in practice), and the word vectors are summed to obtain one summed vector per hot search abstract. The cosine similarity of the summed vectors of two hot search abstracts is then calculated; the larger the cosine similarity, the more similar the two abstracts. A threshold M can be set (M may be between 0.6 and 0.9 in practice): two hot search abstracts whose similarity is greater than M are judged to concern the same topic and are integrated, whereas those whose similarity is less than or equal to M are judged to concern different topics and are not integrated. N, K, M, and Np are all adjustable parameters; in practice, initial values can be set from experience and then tuned according to the results of model training.
For example, "Park officially opens today!", "+: tourists run ##", and "% cancels 220 million shares to stand together with investors." are three hot search titles taken from the hot search lists of different platforms. The first two both relate to "+", are similar hot search titles, and can be integrated. The third is clearly different from the first two, so it is kept as a separate hot search title and is not integrated with them.
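The similarity test can be sketched as follows, assuming the keyword word vectors are already available as plain lists of floats. The function names and the default threshold of 0.8 are hypothetical; the threshold merely falls inside the 0.6 to 0.9 range mentioned above.

```python
import math


def sum_vectors(vectors):
    """Sum a list of equal-length word vectors component by component."""
    return [sum(components) for components in zip(*vectors)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def same_topic(keyword_vectors_a, keyword_vectors_b, threshold=0.8):
    """True when the summed vectors of two abstracts exceed the threshold M."""
    similarity = cosine_similarity(
        sum_vectors(keyword_vectors_a), sum_vectors(keyword_vectors_b)
    )
    return similarity > threshold


# Toy 2-dimensional vectors; a real model would produce K-dimensional ones.
abstract_a = [[1.0, 0.0], [0.0, 1.0]]   # sums to [1, 1]
abstract_b = [[2.0, 2.0]]               # same direction as abstract_a
abstract_c = [[1.0, -1.0]]              # orthogonal to abstract_a
```

Summing the word vectors before comparing keeps the comparison cheap, at the cost of ignoring word order within a keyword set.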
Fig. 5 is a flowchart of obtaining public opinion data according to an embodiment of the present invention. As shown in fig. 5, acquiring public opinion data corresponding to the integrated hot search summary includes:
S301: Determine public opinion index data according to the emotion index data of the comments corresponding to the integrated hot search abstract.
The public opinion index data comprises public opinion parameter data and public opinion fluctuation data.
In one embodiment, S301 includes:
determining public opinion parameter data according to emotion index data of comments corresponding to the integrated hot search abstract;
and determining public opinion fluctuation data according to the emotion index data and the public opinion parameter data.
In a specific implementation, the emotion index data of each news comment before integration is analyzed first. The emotion index data has five classes: 0 (strong support), 1 (support), 2 (neutral), 3 (criticism), and 4 (strong criticism). Determining it is therefore a multi-class classification problem, and a classification model can be trained with supervised sentiment analysis algorithms commonly used in the industry, such as a text convolutional neural network (TextCNN), BERT, or naive Bayes. Each public opinion comment can be regarded as an entity; classifying each comment with the sentiment analysis model yields its emotion index, that is, the result of each public opinion comment.
After the result of each comment is obtained, the results are aggregated by the hot search title to which the news comments belong, giving the number of comments in each class (0 strong support, 1 support, 2 neutral, 3 criticism, 4 strong criticism) for each hot search title, denoted Ni (i = 0 to 4). The public opinion parameter data is the mean of the emotion index data, and the public opinion fluctuation data is the variance of the emotion index data.
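Given the class counts Ni, the mean and variance can be computed directly from the counts, as in the sketch below. The function name `opinion_stats`, the use of the population variance, and the handling of an empty comment set are assumptions; the source states only that the two values are the average and the variance of the emotion index data.

```python
def opinion_stats(counts):
    """Return (mean, variance) of emotion indexes given counts[i] = Ni.

    counts[i] is the number of comments whose emotion index is i (0 to 4).
    Population variance is used here, which is an assumption.
    """
    total = sum(counts)
    if total == 0:
        return 0.0, 0.0  # no comments: assumed convention
    mean = sum(i * n for i, n in enumerate(counts)) / total
    variance = sum(n * (i - mean) ** 2 for i, n in enumerate(counts)) / total
    return mean, variance


all_neutral = opinion_stats([0, 0, 10, 0, 0])
polarized = opinion_stats([5, 0, 0, 0, 5])
```

A polarized comment set and a uniformly neutral one can share the same mean, which is why the variance is tracked as a separate fluctuation index.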
When hot search abstracts of the current period are integrated with those of historical periods, similarity analysis is performed between each current hot search abstract and the historical hot search abstracts. If no historical hot search abstract is sufficiently similar to the current one, the current hot search abstract is treated as a new instance. Conversely, if the current hot search abstract is highly similar to a historical hot search abstract (if it is highly similar to several, the most similar one is taken), the emotion index data of the current hot search abstract is recorded as the data of one statistical period of that historical hot search abstract.
TABLE 1
[Table 1 image: BDA0003492060480000061]
Table 1 is the public opinion index time series data table in the first embodiment. As shown in Table 1, the historical hot search news is "* opens today", and its public opinion parameter data and public opinion fluctuation data for week T are 4.2 and 0.02, respectively. The hot search news of week T+1 is "* festival guide", and its public opinion parameter data and public opinion fluctuation data are 4.0 and 0.01, respectively. If similarity analysis finds the two to be highly similar, a single time series is ultimately generated.
TABLE 2
[Table 2 image: BDA0003492060480000071]
Table 2 is the public opinion index time series data table in the second embodiment. As shown in Table 2, the historical hot search news is "* opens today", and its public opinion parameter data and public opinion fluctuation data for week T are 4.2 and 0.02, respectively. The hot search news of week T+1 is "xx campaign reservation", and its public opinion parameter data and public opinion fluctuation data are 4.4 and 0.01, respectively. If similarity analysis finds that the two are dissimilar, and that no other hot search news is similar to "* opens today" either, the two items generate two separate time series: "* opens today" is filled with 0 for week T+1, and "xx campaign reservation" is filled with 0 for week T.
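The zero-filling in the second embodiment can be sketched as below. The function name `align_series`, the topic labels, and the representation of each weekly observation as a (parameter, fluctuation) pair are hypothetical choices made for the example; the numbers are the ones quoted for Table 2.

```python
def align_series(series_by_topic, weeks):
    """Give every topic one (parameter, fluctuation) pair per week.

    Weeks with no observation for a topic are filled with (0.0, 0.0).
    """
    return {
        topic: [points.get(week, (0.0, 0.0)) for week in weeks]
        for topic, points in series_by_topic.items()
    }


observed = {
    "topic_a": {"T": (4.2, 0.02)},     # seen only in week T
    "topic_b": {"T+1": (4.4, 0.01)},   # seen only in week T+1
}
aligned = align_series(observed, weeks=["T", "T+1"])
```

After alignment every series has the same length, which makes it straightforward to compute window statistics over them later.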
If the current hot search abstract is highly similar to a historical hot search abstract, then in addition to the aggregation described above, the word vector of the historical hot search abstract is also updated, using the following formula:
updated word vector of the historical hot search abstract = (word vector of the historical hot search abstract + word vector of the current hot search abstract) / 2.
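The update rule is an element-wise average of the two vectors. A minimal sketch, with the hypothetical function name `update_history_vector` and plain Python lists standing in for the word vectors:

```python
def update_history_vector(history_vector, current_vector):
    """Average the historical and current word vectors element by element."""
    if len(history_vector) != len(current_vector):
        raise ValueError("word vectors must have the same dimension")
    return [(h + c) / 2 for h, c in zip(history_vector, current_vector)]


updated = update_history_vector([1.0, 3.0], [3.0, 1.0])
```

Because each update halves the weight of everything seen before, applying the rule repeatedly makes the stored vector track the most recent periods most closely.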
S302: Determine the news approval rate according to the like data of the news corresponding to the integrated hot search abstract.
The like data comprises the news like count and the news dislike count. The news like count is the sum of the like counts of the news associated with the hot search titles being integrated across the different platforms; the news dislike count is summed across platforms in the same way. The news approval rate is: news like count / (news like count + news dislike count).
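The approval rate calculation can be sketched as follows. The function name `approval_rate` and the choice to return 0.0 when there are no likes or dislikes at all are assumptions; the source does not say how that case is handled.

```python
def approval_rate(likes_by_platform, dislikes_by_platform):
    """likes / (likes + dislikes), with counts summed across platforms."""
    likes = sum(likes_by_platform)
    dislikes = sum(dislikes_by_platform)
    total = likes + dislikes
    return likes / total if total else 0.0  # 0.0 for no votes is an assumed convention


rate = approval_rate([30, 50], [10, 10])
```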
S303: Acquire the public opinion platform data corresponding to the integrated hot search abstract.
The public opinion platform data is the number of different platforms on which the hot search abstract appears.
S103: Retrieve with the retrieval keyword set to obtain enterprise data, and input the public opinion data and the enterprise data into a prediction model built from public opinion training data, enterprise training data, and prediction result data to obtain an enterprise cooperation prediction result.
The public opinion training data is time series data, whereas the enterprise training data and the prediction result data are non-time-series data (that is, data at a historical point in time). The time series data therefore needs to be converted into non-time-series data, which can be done by building new statistical indexes over different time windows, such as the highest value of the public opinion training data over the last 1 month, the highest value over the last 3 months, the highest value of the public opinion index data over the last 3 months, and so on.
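The conversion from a weekly time series to fixed, non-time-series features can be sketched as below. The window lengths (4 and 12 weeks as rough stand-ins for 1 and 3 months), the feature names, and the addition of a mean alongside the maximum are all assumptions for illustration; the source mentions only the highest value over recent periods.

```python
def window_features(weekly_values, windows=(4, 12)):
    """Collapse a weekly series into per-window statistics.

    For each window length w, the last w values are summarized by their
    maximum and (as an assumed extra feature) their mean.
    """
    features = {}
    for w in windows:
        recent = weekly_values[-w:]
        features[f"max_last_{w}w"] = max(recent) if recent else 0.0
        features[f"mean_last_{w}w"] = sum(recent) / len(recent) if recent else 0.0
    return features


features = window_features([1.0, 2.0, 3.0, 4.0, 5.0], windows=(2, 4))
```

Each hot search topic then contributes one flat feature row, which can be joined with the enterprise data recorded at the same point in time.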
TABLE 3
[Table 3 image: BDA0003492060480000081]
TABLE 4
[Table 4 image: BDA0003492060480000082]
TABLE 5
[Table 5 image: BDA0003492060480000083]
Table 3 is a public opinion training data time series table, Table 4 is a public opinion training specific data time series table, and Table 5 is a public opinion training specific data non-time-series table. As shown in Tables 3 to 5, after the time series data is converted into non-time-series data, the model learns to predict the enterprise cooperation prediction result from the public opinion training data, the enterprise training data, and the prediction result data. This is a machine learning regression problem, and the regression model can be trained with algorithms commonly used in the industry, such as GBRFR (gradient boosted random forest regression) and ETR (extra trees regression).
For example, for a given piece of hot search news, the model predicts the credit card issuance volume over the next 3 months from the public opinion statistics of that news over the last 6 months and the enterprise data at the current point in time. Thus, when the system collects hot search news related to "*", the credit card issuance volume for the next 3 months can be estimated.
The method can also use a Wide & Deep neural network model to model the time series data and the non-time-series data together. In a specific implementation, the time series data serve as the Deep features, the non-time-series data are connected as the Wide features through a shallow fully connected network, and the enterprise cooperation prediction result is still a single index value.
In one embodiment, retrieving the set of search keywords to obtain the enterprise data comprises:
searching for the retrieval keyword set in the cooperation system to obtain an enterprise name; and acquiring enterprise data according to the enterprise name and the definition of the enterprise cooperation prediction result.
In a specific implementation, the names of the businesses can be respectively retrieved in a business information retrieval system (such as a sky-eye check and a business registration website) based on the Np vocabularies. For example, TFIDF in hot search ". today's developmentnewThe top 3 words are ". times.", "price" and "queue", respectively, and the specific business name can be found by searching ". times.". In order to avoid retrieving multiple companies for different keywords, the terms may be manually aligned.
The enterprise data comprises basic data on the enterprise's overall influence and data on the enterprise's business relationship with the bank. The scale of the enterprise, its credit record and its current operating condition all bear on whether the bank chooses to cooperate with it. To avoid the market risk and reputation risk that a joint-name cooperative product could bring to the bank, enterprises with good credit and sound operations are usually selected for cooperation. Basic data on an enterprise's overall influence, such as its scale, credit status and number of judicial lawsuits, can be acquired by a crawler from an enterprise information retrieval system according to the enterprise name.
TABLE 6
[Table 6 is available only as an image in the source: Figure BDA0003492060480000091]
Table 6 is the business relationship data table. As shown in Table 6, the business relationship elements of an enterprise include its assets, its intermediary business income contribution, the assets deposited by the enterprise's legal person, and the like. The data can be preprocessed to suit the subsequent machine learning model; for example, continuous variables can be binned to convert them into discrete variables, and the variables can then be normalized.
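The preprocessing mentioned above (binning a continuous variable, then normalizing) can be illustrated with a short sketch. The cut points, the sample asset values and the function names `bin_values` and `min_max_normalize` are hypothetical; min-max scaling is used here only as one common choice of normalization.

```python
from bisect import bisect_right
from typing import List, Sequence


def bin_values(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Map each continuous value to a bin index.

    `edges` must be sorted ascending; a value v falls in bin i when
    edges[i-1] <= v < edges[i], with open-ended first and last bins.
    """
    return [bisect_right(edges, v) for v in values]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no signal
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical enterprise asset figures and cut points (arbitrary units).
assets = [50.0, 500.0, 5000.0, 50000.0]
asset_bins = bin_values(assets, edges=[100.0, 1000.0, 10000.0])
asset_features = min_max_normalize(asset_bins)
```

Binning first makes the feature robust to the very skewed distribution typical of asset figures; the normalization then puts it on the same scale as the other inputs.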
The definition of the enterprise cooperation prediction result can be determined from the core operating indexes of the particular joint-name cooperative product, and suitable enterprise data is selected for training according to that definition. For example, if the joint-name cooperative product is a credit card product, the core operating index for a bank is usually the credit card issuance volume or the total spending of credit card customers, so the enterprise cooperation prediction result can be defined as either of those quantities. If the joint-name cooperative product is a large-denomination certificate of deposit product, the enterprise cooperation prediction result can be defined as the total purchase amount of the certificate of deposit customers. In a specific implementation, both the definition of the enterprise cooperation prediction result and the statistical period of the prediction must be determined in advance. For example, the enterprise cooperation prediction result may be defined as the credit card issuance volume in the 3 months after the joint-name cooperation.
S104: identifying the cooperative enterprises according to the enterprise cooperation prediction result.
In a specific implementation, a threshold Nk can be set (Nk is a preset minimum marketing target value for the bank). Hot search news whose predicted credit card issuance volume over the next 3 months is greater than Nk, together with the corresponding enterprise names, is pushed to bank managers as candidate cooperative enterprises, giving the managers a decision basis for launching joint-name product cooperation with those enterprises.
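The threshold step can be sketched in a few lines. The `Prediction` record, the function `select_candidates`, the descending sort and the sample figures are assumptions made for illustration; only the "greater than Nk" rule comes from the description.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    hot_search: str         # integrated hot search topic
    enterprise: str         # enterprise name found by keyword retrieval
    predicted_cards: float  # predicted issuance volume for the next 3 months


def select_candidates(predictions: List[Prediction], nk: float) -> List[Prediction]:
    """Keep predictions strictly above the threshold Nk, largest first."""
    selected = [p for p in predictions if p.predicted_cards > nk]
    return sorted(selected, key=lambda p: p.predicted_cards, reverse=True)


# Hypothetical model outputs and a hypothetical minimum marketing target.
predictions = [
    Prediction("topic A", "Enterprise A", 12000.0),
    Prediction("topic B", "Enterprise B", 800.0),
    Prediction("topic C", "Enterprise C", 5000.0),
]
NK = 1000.0
candidates = select_candidates(predictions, NK)
```

Sorting the survivors is a presentation choice for the manager-facing list and is not required by the method itself.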
The cooperative enterprise identification method shown in Fig. 1 may be executed by a computer. As the flow in Fig. 1 shows, the method of the embodiment of the present invention first determines a hot search keyword set and a retrieval keyword set according to the matching result between the vocabulary of each hot search abstract and the hot search word bank, then integrates the hot search abstracts according to the similarity between their keyword sets to obtain the corresponding public opinion data, then obtains enterprise data according to the retrieval keyword set, and finally obtains an enterprise cooperation prediction result from the public opinion data and the enterprise data in order to identify cooperative enterprises. Cooperation opportunities can therefore be seized in time and the benefit of cooperation effectively improved.
The specific process of the embodiment of the invention is as follows:
1. Generate original hot search data from the hot search titles and the corresponding news titles.
2. Input the original hot search data into an abstract identification model created based on historical abstract data to obtain the hot search abstracts.
3. Determine the adjusting parameter of each word according to the matching result between the vocabulary of each hot search abstract and the hot search word bank.
4. Determine the frequency data of each word according to its adjusting parameter, word frequency and frequency index.
5. Determine a hot search keyword set and a retrieval keyword set according to the frequency data of each word.
6. Convert the keyword set of each hot search abstract into word vectors, and integrate the corresponding hot search abstracts when the similarity between the word vectors meets a preset similarity condition.
7. Determine public opinion parameter data according to the emotion index data of the comments corresponding to the integrated hot search abstract, and determine public opinion fluctuation data according to the emotion index data and the public opinion parameter data.
8. Determine the news praise rate according to the praise data of the news corresponding to the integrated hot search abstract, and acquire the public opinion platform data corresponding to the integrated hot search abstract.
9. Retrieve the retrieval keyword set in the cooperation system to obtain the enterprise name, and acquire enterprise data according to the enterprise name and the definition of the enterprise cooperation prediction result.
10. Input the public opinion data and the enterprise data into a prediction model created based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result, and identify the cooperative enterprises according to the enterprise cooperation prediction result.
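Step 6 of the process above (integrating hot search abstracts whose keyword vectors are similar enough) can be sketched as follows. The description does not fix the similarity measure or the grouping strategy, so the cosine similarity, the greedy grouping against each group's first member, the 0.9 threshold and the toy vectors are all assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_similar(vectors: Dict[str, List[float]], threshold: float = 0.9) -> List[List[str]]:
    """Greedily group abstracts whose vector is close to a group's first member."""
    groups: List[List[str]] = []
    for name, vec in vectors.items():
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(name)  # similar enough: integrate into this group
                break
        else:
            groups.append([name])   # no similar group found: start a new one
    return groups


# Toy keyword-set vectors for three hot search abstracts.
abstract_vectors = {
    "abstract_1": [1.0, 0.0, 0.0],
    "abstract_2": [0.9, 0.1, 0.0],
    "abstract_3": [0.0, 1.0, 0.0],
}
groups = merge_similar(abstract_vectors, threshold=0.9)
```

Each resulting group corresponds to one integrated hot search topic, over which the comment sentiment, praise data and platform data of steps 7 and 8 would then be aggregated.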
In summary, the cooperative enterprise identification method provided by the embodiment of the invention starts from hot news data and uses artificial intelligence algorithms to jointly model multiple dimensions, such as the public opinion dynamics of news topics and the market, risk and bank relationship of the organizations associated with the news. It thereby evaluates the benefit that a joint-name product would bring to the bank and helps managers use public opinion hot spots to seize opportunities for joint-name cooperation with external organizations, bringing better cooperative benefit.
Based on the same inventive concept, the embodiment of the invention also provides a cooperative enterprise identification device. Because the device solves the problem on a principle similar to that of the cooperative enterprise identification method, its implementation can refer to the implementation of the method, and repeated details are omitted.
Fig. 6 is a block diagram showing the structure of the cooperative enterprise identification device in the embodiment of the present invention. As shown in Fig. 6, the cooperative enterprise identification device includes:
a set determining module, used for determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
a public opinion data module, used for integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain the public opinion data corresponding to the integrated hot search abstracts;
a prediction module, used for retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model created based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result;
and an identification module, used for identifying the cooperative enterprises according to the enterprise cooperation prediction result.
To sum up, the cooperative enterprise recognition apparatus according to the embodiment of the present invention determines a hot search keyword set and a search keyword set according to a matching result between a vocabulary of each hot search abstract and a hot search lexicon, integrates each hot search abstract according to a similarity between the keyword sets to obtain corresponding public opinion data, obtains enterprise data according to the search keyword set, and obtains an enterprise cooperation prediction result according to the public opinion data and the enterprise data to recognize a cooperative enterprise, so that a cooperative opportunity can be grasped in time, and a cooperative benefit can be effectively improved.
The embodiment of the invention also provides a specific implementation mode of the computer equipment, which can realize all the steps in the cooperative enterprise identification method in the embodiment. Fig. 7 is a block diagram of a computer device in an embodiment of the present invention, and referring to fig. 7, the computer device specifically includes the following:
a processor 701 and a memory 702.
The processor 701 is configured to call the computer program in the memory 702, and the processor implements all the steps in the collaborative enterprise identification method in the above embodiment when executing the computer program, for example, the processor implements the following steps when executing the computer program:
determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain public opinion data corresponding to the integrated hot search abstracts;
retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model established based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result;
and identifying the cooperative enterprises according to the enterprise cooperation prediction result.
To sum up, the computer device of the embodiment of the invention determines the hot search keyword set and the search keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search lexicon, integrates each hot search abstract according to the similarity between the keyword sets to obtain the corresponding public opinion data, obtains the enterprise data according to the search keyword set, and obtains the enterprise cooperation prediction result according to the public opinion data and the enterprise data to identify the cooperative enterprises, so that the cooperative opportunity can be grasped in time, and the cooperative benefit is effectively improved.
An embodiment of the present invention further provides a computer-readable storage medium capable of implementing all the steps in the collaborative enterprise identification method in the foregoing embodiment, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements all the steps of the collaborative enterprise identification method in the foregoing embodiment, for example, when the processor executes the computer program, the processor implements the following steps:
determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain public opinion data corresponding to the integrated hot search abstracts;
retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model established based on the public opinion training data, the enterprise training data and the prediction result data to obtain an enterprise cooperation prediction result;
and identifying the cooperative enterprises according to the enterprise cooperation prediction result.
To sum up, the computer-readable storage medium of the embodiment of the present invention determines a hot search keyword set and a search keyword set according to a matching result between a vocabulary of each hot search abstract and a hot search lexicon, integrates each hot search abstract according to a similarity between the keyword sets to obtain corresponding public opinion data, obtains enterprise data according to the search keyword set, and obtains an enterprise cooperation prediction result according to the public opinion data and the enterprise data to identify a cooperative enterprise, so that a cooperative opportunity can be grasped in time, and a cooperative benefit can be effectively improved.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, or devices described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.
In one or more exemplary designs, the functions described in the embodiments of the present invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general-purpose or special-purpose computer. For example, such computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Additionally, any connection is properly termed a computer-readable medium; for example, if the software is transmitted from a website, server, or other remote source via coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, those media are included in the definition of medium. Disk and disc, as used herein, include compact discs, laser discs, optical discs, DVDs, floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above may also be included in the computer-readable medium.

Claims (10)

1. A method for identifying a collaborative enterprise, comprising:
determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain public opinion data corresponding to the integrated hot search abstracts;
retrieving the retrieval keyword set to obtain enterprise data, and inputting the public opinion data and the enterprise data into a prediction model established based on public opinion training data, enterprise training data and prediction result data to obtain an enterprise cooperation prediction result;
and identifying a cooperative enterprise according to the enterprise cooperation prediction result.
2. The method of claim 1, wherein determining the hot search keyword set and the retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank comprises:
determining the adjusting parameters of the vocabularies according to the matching results of the vocabularies of the hot search abstracts and the hot search lexicon;
determining frequency data of each vocabulary according to the adjusting parameters, the word frequency and the frequency index of each vocabulary;
and determining a hot search keyword set and a retrieval keyword set according to the frequency data of each vocabulary.
3. The method of claim 1, wherein integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts comprises:
converting the keyword set of each hot search abstract into a word vector;
and when the similarity between the word vectors meets a preset similarity condition, integrating the corresponding hot search abstracts.
4. The method of claim 1, wherein obtaining the public opinion data corresponding to the integrated hot search abstracts comprises:
determining public opinion index data according to the emotion index data of the comments corresponding to the integrated hot search abstract;
determining news approval rate according to approval data of the news corresponding to the integrated hot search abstract;
and acquiring public opinion platform data corresponding to the integrated hot search abstract.
5. The method as claimed in claim 4, wherein the public opinion index data includes public opinion parameter data and public opinion fluctuation data;
determining public opinion index data according to the emotion index data of the comments corresponding to the integrated hot search abstract comprises the following steps:
determining public opinion parameter data according to emotion index data of comments corresponding to the integrated hot search abstract;
and determining the public opinion fluctuation data according to the emotion index data and the public opinion parameter data.
6. The method of claim 1, wherein retrieving the retrieval keyword set to obtain enterprise data comprises:
retrieving the retrieval keyword set in a cooperation system to obtain an enterprise name;
and acquiring enterprise data according to the enterprise name and the definition of the enterprise cooperation prediction result.
7. The method of claim 1, further comprising:
generating original hot search data according to the hot search titles and the corresponding news titles;
and inputting the original hot search data into an abstract identification model created based on historical abstract data to obtain the hot search abstracts.
8. A collaborative enterprise identification apparatus, comprising:
the set determining module is used for determining a hot search keyword set and a retrieval keyword set according to the matching result of the vocabulary of each hot search abstract and the hot search word bank;
the public opinion data module is used for integrating the hot search abstracts according to the similarity among the keyword sets of the hot search abstracts to obtain the public opinion data corresponding to the integrated hot search abstracts;
the prediction module is used for retrieving the retrieval keyword set to obtain enterprise data, inputting the public opinion data and the enterprise data into a prediction model established based on public opinion training data, enterprise training data and prediction result data, and obtaining an enterprise cooperation prediction result;
and the identification module is used for identifying the cooperative enterprises according to the enterprise cooperation prediction result.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executed on the processor, wherein the processor when executing the computer program implements the steps of the method for identifying a collaborative enterprise of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the collaborative enterprise identification method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210099908.7A CN114492432A (en) 2022-01-27 2022-01-27 Cooperative enterprise identification method and device

Publications (1)

Publication Number Publication Date
CN114492432A true CN114492432A (en) 2022-05-13

Family

ID=81475615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210099908.7A Pending CN114492432A (en) 2022-01-27 2022-01-27 Cooperative enterprise identification method and device

Country Status (1)

Country Link
CN (1) CN114492432A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109684481A (en) * 2019-01-04 2019-04-26 深圳壹账通智能科技有限公司 The analysis of public opinion method, apparatus, computer equipment and storage medium
CN109992668A (en) * 2019-04-04 2019-07-09 上海冰鉴信息科技有限公司 A kind of enterprise's the analysis of public opinion method and apparatus based on from attention
CN110689438A (en) * 2019-08-26 2020-01-14 深圳壹账通智能科技有限公司 Enterprise financial risk scoring method and device, computer equipment and storage medium
US10552843B1 (en) * 2016-12-05 2020-02-04 Intuit Inc. Method and system for improving search results by recency boosting customer support content for a customer self-help system associated with one or more financial management systems
CN112581006A (en) * 2020-12-25 2021-03-30 杭州衡泰软件有限公司 Public opinion engine and method for screening public opinion information and monitoring enterprise main body risk level
WO2021136009A1 (en) * 2019-12-31 2021-07-08 阿里巴巴集团控股有限公司 Search information processing method and apparatus, and electronic device
CN113205409A (en) * 2021-05-28 2021-08-03 中国工商银行股份有限公司 Loan transaction processing method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
NAJERA SANCHEZ等: "A Systematic Review of Sustainable Banking through a Co-Word Analysis", 《SUSTAINABILITY》, vol. 12, no. 1, 31 January 2020 (2020-01-31) *
孙超: "面向产业合作的半监督关系抽取", 《中国优秀硕士学位论文全文数据库 信息科技辑》, 15 August 2020 (2020-08-15) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination