CN112269909A - Expert recommendation method based on multi-source information fusion technology - Google Patents

Expert recommendation method based on multi-source information fusion technology

Info

Publication number
CN112269909A
Authority
CN
China
Prior art keywords
expert
subnet
abstract
keyword
executing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010964492.1A
Other languages
Chinese (zh)
Other versions
CN112269909B (en
Inventor
朱全银
方强强
李翔
马甲林
张柯文
王文川
胥心心
王胜标
丁行硕
成洁怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Greater Bay Area Technology Innovation Service Center (Guangzhou) Co.,Ltd.
Guangzhou Jingzhi Information Technology Co ltd
Original Assignee
Huaiyin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huaiyin Institute of Technology
Priority to CN202010964492.1A
Publication of CN112269909A
Application granted
Publication of CN112269909B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/258 Heading extraction; Automatic titling; Numbering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an expert recommendation method based on a multi-source information fusion technology, which comprises the following steps: crawling the scientific and technical papers, invention patents, fund project information, and Web page information of technical experts to construct a knowledge base, and constructing a keyword dictionary keywords from the keyword fields of the knowledge base; extracting the author fields of the knowledge base and performing word-frequency co-occurrence analysis to construct an expert cooperative relationship subnet; extracting the research directions and the personal information of experts from Web pages using a regular expression and a named entity recognition algorithm, respectively, to construct a Web subnet; extracting document-topic and topic-keyword relations from the abstract field of the knowledge base using the LDA algorithm, and extracting the 5 words with the largest weights in the abstract field using the TF-IDF algorithm, to jointly construct a topic subnet; and, taking expert name-institution as the constraint condition, combining the three subnets to construct an expert information network, calculating the expert centrality values in that network, sorting them, and recommending the top 5 experts as the recommendation result.

Description

Expert recommendation method based on multi-source information fusion technology
Technical Field
The invention belongs to the field of multi-source information fusion and expert recommendation, and particularly relates to an expert recommendation method based on a multi-source information fusion technology.
Background
Traditional technical expert recommendation algorithms generally rely on a single data source. Such recommendation is constrained by that source: expert information is easily lost and cannot be displayed comprehensively, the recommended experts are isolated from one another, and the cooperative, regional, and employing-institution relationships among experts cannot be effectively extended, so a researcher can only recommend experts by a single attribute. A multi-source information fusion method can fuse the multi-dimensional attributes of experts according to constraint conditions and display technical expert information comprehensively. The three subnets constructed with the multi-source information fusion technology (the expert cooperative relationship subnet, the Web subnet, and the topic subnet) extend the expert relationship information, making technical expert recommendation more comprehensive and accurate and improving the breadth and depth of the recommendation results.
The existing research bases of Zhu Quanyin, Li Xiang, Feng Wanli, et al. include: Zhaoyang, Zhu Quanyin, Hu Ronglin, Dianthus superbus, a hybrid recommendation algorithm based on an autoencoder and clustering [J], 2018, 35(11): 52-56; Li Xiang, Zhu-Mi, collaborative filtering recommendation based on co-clustering and a shared scoring matrix [J], Computer Science and Exploration, 2014, 8(06): 751-; Liu Jinling, Feng Wanli, Zhang Yao red, a Chinese text clustering method based on rescaling [J], Computer Engineering and Applications, 2012, 48(21): 146-; a Web science and technology news classification extraction algorithm [J], Proceedings of the Huaiyin Institute of Industrial Science and Technology, 2015, 24(5): 18-24.
Patents applied for, published, and granted by Zhu Quanyin, Li Xiang, Feng Wanli, et al. include: Feng Wanli, Zhu Quanyin, Shibenmin, et al., a recommendation method for image review experts based on Pearson similarity and FP-Growth: CN106897370A, 2017.06.27; a logistics recommendation method based on clustering and cosine similarity: CN106886872A, 2017.06.23; Li Xiang, Zhu Quanyin, Hu Ronglin, Zhonghong, a spectral-clustering-based intelligent recommendation method for cold-chain logistics stowage: CN105654267A, 2016.06.08; a university student major recommendation method based on deep learning: CN110188978A, 2019.08.30; Zhu Quanyin, Ji Rui, Nijinxun, et al., an expert combination recommendation method based on image quantity: CN110162638A, 2019.08.23; Zhu Quanyin, in Shi Min, Hu Ronglin, Feng Wanli, et al., an expert combination recommendation method based on a knowledge graph: CN109062961A, 2018.12.21.
Multi-source information fusion technology:
Information fusion, also known as data fusion, sensor information fusion, or multi-sensor information fusion, is an information processing process that associates, correlates, and synthesizes data and information obtained from single and multiple information sources in order to obtain accurate position and identity estimates and to evaluate situations, threats, and their importance comprehensively and in a timely manner. The process continuously refines its estimates and assessments, evaluates the need for additional information sources, and continuously corrects the information processing itself so as to improve the result.
Prior patent applications on the multi-source information fusion technology include: an information fusion method and system based on the information fusion engine of a ship networking gateway: CN 109814444A, 2019.05.28, which addresses redundant backup and the system structure of the information fusion module in a sensor data acquisition system; a transformer substation fire detection system based on multi-sensor information fusion and a detection information fusion method: CN 105185022A, 2015.12.23, which adapts flexibly to complex detection environments, expands the detection range, improves sensitivity, reduces the false alarm rate, and greatly improves both the ability to reliably distinguish true from false fires and the efficiency and accuracy of substation fire early warning; "plum courage", an intelligent information fusion image-type fire detector and detection information fusion method: CN 103630948A, 2014.03.12, which minimizes false alarms and effectively improves the accuracy and reliability of the image-type fire detector; a complex-background target identification method based on multi-dimensional information fusion: CN 109492700A, 2019.03.19, which improves the accuracy and reliability of target identification.
Prior patent applications on expert recommendation include: Suyurong and Lishenghua, an expert recommendation method and system: CN 111160699A, 2020.05.15, which can produce more standardized recommendation results for users on the basis of multiple recommendation systems, but may suffer from information redundancy and cannot avoid information loss; Zyongfeng, Tantaniusau and Lizhenhua, an expert recommendation method and system based on multiple data sources: CN 111008330A, 2020.04.14, which generates recommendation results by adding score fields to experts and sorting the experts by the corresponding values; although that patent involves multi-source data, it only scores the expert fields and fails to mine the implicit expert relationships across the data sources; Wangjian, Sunjiao and forest hongfei, a community question-and-answer expert recommendation method based on a recurrent neural network: CN 108021616A, 2018.05.11, which can effectively represent the syntactic and semantic information of sentences and reduces manual intervention in the recommendation results, but is easily influenced by the original corpus, which in turn affects the recommendation results.
Although the above patents effectively improve the recommendation results, they do not address recommendation based on expert relationships and regional relationships. Recommending experts in isolation cannot comprehensively take into account factors such as regional and cooperative relationships, so the recommendation results are ineffective and cannot be applied in practice.
Disclosure of Invention
Purpose of the invention: to address the problems in the prior art, the invention provides an expert recommendation method based on a multi-source information fusion technology. The method constructs an expert cooperative relationship subnet, a Web subnet, and a topic subnet, fuses the three subnets into an expert information network with expert name-institution as the constraint condition, calculates and sorts the centrality values of the experts in that network, and recommends experts according to the sorted result.
The technical scheme is as follows: in order to solve the technical problem, the invention provides an expert recommendation method based on a multi-source information fusion technology, which comprises the following specific steps:
(1) Crawling technical expert data to construct a knowledge base, and constructing the keyword dictionary keywords.
(2) Extracting the author fields of the knowledge base and performing word-frequency co-occurrence analysis to construct an expert cooperative relationship subnet.
(3) Extracting the research directions and the personal information of experts from Web pages using a regular expression and a named entity recognition algorithm, respectively, to construct an expert Web subnet.
(4) Extracting document-topic and topic-keyword relations from the abstract field of the knowledge base using the LDA algorithm, and extracting the 5 words with the largest weights in the abstract field using the TF-IDF algorithm, to jointly construct a topic subnet.
(5) Taking expert name-institution as the constraint condition, combining the three subnets to construct an expert information network, calculating the expert centrality values in that network, sorting them, and recommending the top 5 experts as the recommendation result.
Further, the specific steps of the step (1) are as follows:
(1.1) acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, and creating an empty keyword dictionary keywords;
(1.2) defining the global loop variable Vi, initialized to 1, for traversing W, with Vi ∈ [1, M], where W_Vi denotes the Vi-th document;
(1.3) judging whether Vi ≤ M; if so, executing step (1.4), and if not, executing step (1.11);
(1.4) defining the loop variable Vij, initialized to 1, denoting the j-th keyword of document W_Vi, with Vij ∈ [1, N], where N is the number of keywords of document W_Vi;
(1.5) judging whether Vij ∈ keywords; if so, executing step (1.6), and if not, executing step (1.10);
(1.6) the keyword Vij already exists in the keyword table keywords, so Vij is not written;
(1.7) letting Vij = Vij + 1;
(1.8) judging whether Vij ≤ N; if so, executing step (1.5), and if not, executing step (1.9);
(1.9) letting Vi = Vi + 1 and executing step (1.3);
(1.10) writing the keyword Vij into the keyword table keywords, and executing step (1.7);
(1.11) obtaining the keyword table keywords containing all keywords.
Further, the specific steps of the step (2) are as follows:
(2.1) acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, with loop variable Vi and W_Vi denoting the Vi-th document;
(2.2) judging whether Vi ≤ M; if so, executing step (2.3), and if not, executing step (2.5);
(2.3) separating the authors of the W_Vi-th scientific and technical article to obtain the relation R = {W_Vi, W_Via}, where W_Via is the a-th author name of the W_Vi-th article;
(2.4) letting Vi = Vi + 1 and executing step (2.2);
(2.5) obtaining all document-author relations R after separation;
(2.6) performing frequency statistics on all authors in the document-author relations R to obtain the author frequency A = {m, Na}, where Na is an author name and m is the total number of occurrences of Na;
(2.7) counting the author co-occurrence frequency G = {n, Nap, Naq}, where G denotes that authors Nap and Naq co-occur n times;
(2.8) converting the author co-occurrence frequency G into a co-occurrence network to obtain the author relationship subnet.
Further, the specific steps of the step (3) are as follows:
(3.1) acquiring expert Web page information from the knowledge base;
(3.2) extracting expert information from the expert Web page through a named entity recognition algorithm;
(3.3) obtaining the expert's personal information;
(3.4) defining a regular expression rule Ru;
(3.5) judging whether the result of matching rule Ru against the Web page is empty; if so, executing step (3.8), and if not, executing step (3.6);
(3.6) obtaining the expert's research direction;
(3.7) constructing the Web subnet from the expert's research direction and personal information;
(3.8) constructing the Web subnet from the expert's personal information.
Further, the specific steps of the step (4) are as follows:
(4.1) acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, with loop variable Vi, and creating an empty abstract text Abstract;
(4.2) judging whether Vi ≤ M; if so, executing step (4.3), and if not, executing step (4.5);
(4.3) writing the abstract of document W_Vi into the abstract text Abstract;
(4.4) letting Vi = Vi + 1 and executing step (4.2);
(4.5) obtaining the abstract text Abstract containing the abstracts of all documents W;
(4.6) loading the keyword dictionary keywords and performing jieba word segmentation on the abstract text Abstract to obtain the segmented abstract text Abstract';
(4.7) performing document-topic and topic-keyword calculation on Abstract' with the LDA algorithm;
(4.8) obtaining the document-topic and topic-keyword results for the abstract text Abstract;
(4.9) calculating weights on Abstract' with the TF-IDF algorithm;
(4.10) obtaining the 5 words with the largest weights in the abstract;
(4.11) constructing the topic subnet from the document-topic relations, the topic-keyword relations, and the 5 words with the largest weights in the abstract.
Further, the specific steps of the step (5) are as follows:
(5.1) taking the expert cooperative relationship subnet, the Web subnet, and the topic subnet;
(5.2) associating the expert cooperative relationship subnet, the Web subnet, and the topic subnet with expert name-institution as the constraint condition;
(5.3) obtaining the expert information network;
(5.4) calculating and sorting the expert centrality values in the expert information network;
(5.5) taking the top 5 experts as the final recommendation result according to the sorted result.
By adopting the technical scheme, the invention has the following beneficial effects:
The invention solves the problems of single-attribute expert representation and insufficient expression of expert relationships that arise from the limited data sources of conventional recommendation systems. It constructs an expert cooperative relationship subnet, a Web subnet, and a topic subnet from multi-source information, and fuses the three subnets with expert name-institution as the constraint condition to construct a technical expert information network. A recommendation system built on this expert information network can extend expert information and strengthen the links between the cooperative and regional relationships among technical experts, so that technical expert recommendation becomes more comprehensive and accurate and the breadth and depth of the recommendation results improve.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is a flow diagram of constructing a keyword dictionary in an exemplary embodiment;
FIG. 3 is a flow diagram of construction of an expert partnership subnet in a specific embodiment;
FIG. 4 is a flow diagram of constructing a Web subnet in an exemplary embodiment;
FIG. 5 is a flow diagram of constructing a topic subnet in a specific embodiment;
FIG. 6 is a flow chart of expert recommendation in an exemplary embodiment.
Detailed Description
The present invention is further illustrated below with reference to specific embodiments, in conjunction with national engineering standards. It should be understood that these embodiments are intended only to illustrate the invention and not to limit its scope; after reading the present invention, modifications of various equivalent forms made by those skilled in the art fall within the scope defined by the appended claims.
As shown in fig. 1 to 6, the expert recommendation method based on the multi-source information fusion technology according to the present invention includes the following steps:
Step 1: crawling technical expert data to construct a knowledge base, and constructing the keyword dictionary keywords:
Step 1.1: acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, and creating an empty keyword dictionary keywords;
Step 1.2: defining the global loop variable Vi, initialized to 1, for traversing W, with Vi ∈ [1, M], where W_Vi denotes the Vi-th document;
Step 1.3: judging whether Vi ≤ M; if so, executing step 1.4, and if not, executing step 1.11;
Step 1.4: defining the loop variable Vij, initialized to 1, denoting the j-th keyword of document W_Vi, with Vij ∈ [1, N], where N is the number of keywords of document W_Vi;
Step 1.5: judging whether Vij ∈ keywords; if so, executing step 1.6, and if not, executing step 1.10;
Step 1.6: the keyword Vij already exists in the keyword table keywords, so Vij is not written;
Step 1.7: letting Vij = Vij + 1;
Step 1.8: judging whether Vij ≤ N; if so, executing step 1.5, and if not, executing step 1.9;
Step 1.9: letting Vi = Vi + 1 and executing step 1.3;
Step 1.10: writing the keyword Vij into the keyword table keywords, and executing step 1.7;
Step 1.11: obtaining the keyword table keywords containing all keywords.
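The loop in steps 1.1 to 1.11 amounts to collecting every distinct keyword across all documents. A minimal Python sketch follows; the function name build_keyword_dictionary and the dictionary-based document layout are assumptions made for illustration and do not come from the patent.

```python
def build_keyword_dictionary(documents):
    """Collect every distinct keyword across documents, in first-seen order."""
    keywords = []   # the keyword table "keywords" (step 1.1)
    seen = set()    # fast membership test for step 1.5
    for doc in documents:                    # outer loop over Vi (steps 1.3, 1.9)
        for kw in doc.get("keywords", []):   # inner loop over Vij (steps 1.4-1.8)
            if kw in seen:                   # step 1.6: already present, skip
                continue
            seen.add(kw)                     # step 1.10: write the new keyword
            keywords.append(kw)
    return keywords                          # step 1.11


# Hypothetical documents; only the "keywords" field is used here.
docs = [
    {"keywords": ["information fusion", "expert recommendation"]},
    {"keywords": ["expert recommendation", "topic model"]},
]
keyword_table = build_keyword_dictionary(docs)
```

The auxiliary set only speeds up the membership test of step 1.5; the resulting table is the same as with a plain list lookup.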
Step 2: extracting the author fields of the knowledge base and performing word-frequency co-occurrence analysis to construct an expert cooperative relationship subnet:
Step 2.1: acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, with loop variable Vi and W_Vi denoting the Vi-th document;
Step 2.2: judging whether Vi ≤ M; if so, executing step 2.3, and if not, executing step 2.5;
Step 2.3: separating the authors of the W_Vi-th scientific and technical article to obtain the relation R = {W_Vi, W_Via}, where W_Via is the a-th author name of the W_Vi-th article;
Step 2.4: letting Vi = Vi + 1 and executing step 2.2;
Step 2.5: obtaining all document-author relations R after separation;
Step 2.6: performing frequency statistics on all authors in the document-author relations R to obtain the author frequency A = {m, Na}, where Na is an author name and m is the total number of occurrences of Na;
Step 2.7: counting the author co-occurrence frequency G = {n, Nap, Naq}, where G denotes that authors Nap and Naq co-occur n times;
Step 2.8: converting the author co-occurrence frequency G into a co-occurrence network to obtain the author relationship subnet.
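Steps 2.6 to 2.8 can be sketched with standard-library counters. The sketch below assumes that each document's author field has already been split into a list of names (step 2.3) and that an author is counted once per document; the function name build_coauthor_subnet and the adjacency-map representation of the network are illustrative choices, not details given in the patent.

```python
from collections import Counter
from itertools import combinations


def build_coauthor_subnet(author_lists):
    """Return author frequencies, pair co-occurrence counts, and an adjacency map."""
    author_freq = Counter()   # A: author name -> number of documents
    pair_freq = Counter()     # G: (author_p, author_q) -> co-occurrence count
    for authors in author_lists:
        unique = sorted(set(authors))   # sorted so each pair has one canonical order
        author_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    # Step 2.8: turn the pair counts into an undirected weighted network.
    network = {}
    for (p, q), n in pair_freq.items():
        network.setdefault(p, {})[q] = n
        network.setdefault(q, {})[p] = n
    return author_freq, pair_freq, network


# Hypothetical author lists for three papers.
papers = [["Zhu", "Li", "Feng"], ["Zhu", "Li"], ["Feng"]]
freq, pairs, net = build_coauthor_subnet(papers)
```

Here the edge weight between two authors is their number of shared papers, so net["Zhu"]["Li"] counts the papers that Zhu and Li wrote together.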
Step 3: extracting the research directions and the personal information of experts from Web pages using a regular expression and a named entity recognition algorithm, respectively, to construct an expert Web subnet:
Step 3.1: acquiring expert Web page information from the knowledge base;
Step 3.2: extracting expert information from the expert Web page through a named entity recognition algorithm;
Step 3.3: obtaining the expert's personal information;
Step 3.4: defining a regular expression rule Ru;
Step 3.5: judging whether the result of matching rule Ru against the Web page is empty; if so, executing step 3.8, and if not, executing step 3.6;
Step 3.6: obtaining the expert's research direction;
Step 3.7: constructing the Web subnet from the expert's research direction and personal information;
Step 3.8: constructing the Web subnet from the expert's personal information.
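The branch in steps 3.4 to 3.8 can be illustrated as follows. The patent does not give the rule Ru, so the pattern below is purely hypothetical, and the named entity recognition of steps 3.2 and 3.3 is not implemented: its output is assumed to arrive as a ready-made dictionary. The function name build_web_record is likewise an assumption.

```python
import re

# Hypothetical rule Ru: capture the text after a "Research direction" label up
# to the end of the line. Real expert pages would need their own patterns.
RU = re.compile(r"Research (?:direction|interests?)\s*[::]\s*(.+)")


def build_web_record(page_text, personal_info):
    """Combine NER-derived personal info with regex-derived research directions."""
    record = dict(personal_info)     # step 3.3: NER output, assumed to be given
    match = RU.search(page_text)     # step 3.5: does Ru match anything?
    if match:                        # steps 3.6-3.7: add the research directions
        parts = re.split(r"[,;,;、]", match.group(1))
        record["research_directions"] = [p.strip() for p in parts if p.strip()]
    return record                    # step 3.8 when Ru matches nothing


page = "Name: Li Xiang\nResearch direction: data mining, recommender systems\n"
info = {"name": "Li Xiang", "institution": "Huaiyin Institute of Technology"}
record = build_web_record(page, info)
```

When the page carries no research-direction label, the record holds only the personal information, which mirrors the fallback of step 3.8.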
Step 4: extracting document-topic and topic-keyword relations from the abstract field of the knowledge base using the LDA algorithm, and extracting the 5 words with the largest weights in the abstract field using the TF-IDF algorithm, to jointly construct a topic subnet:
Step 4.1: acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, with loop variable Vi, and creating an empty abstract text Abstract;
Step 4.2: judging whether Vi ≤ M; if so, executing step 4.3, and if not, executing step 4.5;
Step 4.3: writing the abstract of document W_Vi into the abstract text Abstract;
Step 4.4: letting Vi = Vi + 1 and executing step 4.2;
Step 4.5: obtaining the abstract text Abstract containing the abstracts of all documents W;
Step 4.6: loading the keyword dictionary keywords and performing jieba word segmentation on the abstract text Abstract to obtain the segmented abstract text Abstract';
Step 4.7: performing document-topic and topic-keyword calculation on Abstract' with the LDA algorithm;
Step 4.8: obtaining the document-topic and topic-keyword results for the abstract text Abstract;
Step 4.9: calculating weights on Abstract' with the TF-IDF algorithm;
Step 4.10: obtaining the 5 words with the largest weights in the abstract;
Step 4.11: constructing the topic subnet from the document-topic relations, the topic-keyword relations, and the 5 words with the largest weights in the abstract.
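Steps 4.9 and 4.10 can be sketched in plain Python. The sketch assumes the abstracts have already been segmented into token lists (the jieba segmentation of step 4.6 and the LDA computation of step 4.7 are not shown) and uses the unsmoothed weighting tf × log(N / df); the patent does not state which TF-IDF variant it applies, so this choice and the function name top_tfidf_terms are assumptions.

```python
import math
from collections import Counter


def top_tfidf_terms(tokenized_docs, k=5):
    """Return, for each document, its k terms with the largest TF-IDF weight."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()                 # number of documents containing each term
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))
    results = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Sort by descending weight, then alphabetically for a stable order.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:k])
    return results


# Hypothetical pre-segmented abstracts.
abstracts = [
    ["expert", "network", "fusion", "expert"],
    ["expert", "topic", "model"],
    ["fusion", "sensor", "data"],
]
top_terms = top_tfidf_terms(abstracts, k=2)
```

With k left at its default of 5, the function returns the five highest-weighted words per abstract, matching the count used in step 4.10.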
Step 5: taking expert name-institution as the constraint condition, combining the three subnets to construct an expert information network, calculating the expert centrality values in that network, sorting them, and recommending the top 5 experts as the recommendation result:
Step 5.1: taking the expert cooperative relationship subnet, the Web subnet, and the topic subnet;
Step 5.2: associating the expert cooperative relationship subnet, the Web subnet, and the topic subnet with expert name-institution as the constraint condition;
Step 5.3: obtaining the expert information network;
Step 5.4: calculating and sorting the expert centrality values in the expert information network;
Step 5.5: taking the top 5 experts as the final recommendation result according to the sorted result.
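One way to read steps 5.1 to 5.5 is sketched below. It assumes that each subnet is an edge list, that an expert node is identified by a (name, institution) tuple so the same expert merges into one node across subnets, and that the centrality measure is degree centrality; the patent does not name the measure, so that choice, the node naming, and the function names are all illustrative assumptions.

```python
def fuse_subnets(*subnets):
    """Merge edge lists into one undirected graph; equal node keys become one node."""
    graph = {}
    for edges in subnets:
        for u, v in edges:
            graph.setdefault(u, set()).add(v)
            graph.setdefault(v, set()).add(u)
    return graph


def rank_experts(graph, is_expert, k=5):
    """Rank expert nodes by degree centrality (assumed measure), highest first."""
    denom = max(len(graph) - 1, 1)
    scores = {
        node: len(neighbours) / denom
        for node, neighbours in graph.items()
        if is_expert(node)
    }
    return sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))[:k]


# Hypothetical experts keyed by (name, institution).
zhu, li, feng = ("Zhu", "Inst A"), ("Li", "Inst A"), ("Feng", "Inst A")
cooperation = [(zhu, li), (zhu, feng)]
web = [(zhu, "direction:data mining"), (li, "direction:data mining")]
topic = [(zhu, "topic:recommendation"), (feng, "topic:recommendation")]

network = fuse_subnets(cooperation, web, topic)
ranking = rank_experts(network, is_expert=lambda n: isinstance(n, tuple))
```

Keying experts on the (name, institution) pair keeps two experts who share a name but work at different institutions as separate nodes, which is the role the constraint condition plays in step 5.2.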
The variables involved in the above steps are shown in the following table:
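[Variable table: provided as images (Figure BDA0002681717780000081, Figure BDA0002681717780000091) in the original publication; contents not recoverable from this text.]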
A total of 39382 pieces of data were processed. Expert information, document abstracts, keywords, and Web page information were extracted from the crawled data to construct the knowledge base. The expert cooperative relationship subnet, the Web subnet, and the topic subnet were established through the multi-source information fusion technology; the technical expert information network was constructed with expert name-institution as the constraint condition; and the expert recommendation system was built on that network.
The invention provides an expert recommendation method based on a multi-source information fusion technology that solves the problem of single-attribute representation in existing expert recommendation systems. It constructs an expert cooperative relationship subnet, a Web subnet, and a topic subnet from multi-source information, and fuses the three subnets with expert name-institution as the constraint condition to construct a technical expert information base. A technical expert recommendation system that fuses the three subnets can display expert information comprehensively and perform deep associative recommendation based on the cooperative and regional relationships among technical experts, which improves the breadth and depth of the recommendation range and gives expert recommendation higher accuracy and extensibility.

Claims (6)

1. An expert recommendation method based on a multi-source information fusion technology is characterized by comprising the following specific steps:
(1) crawling technical expert data to construct a knowledge base, and constructing the keyword dictionary keywords;
(2) extracting the author fields of the knowledge base and performing word-frequency co-occurrence analysis to construct an expert cooperative relationship subnet;
(3) extracting the research directions and the personal information of experts from Web pages using a regular expression and a named entity recognition algorithm, respectively, to construct an expert Web subnet;
(4) extracting document-topic and topic-keyword relations from the abstract field of the knowledge base using the LDA algorithm, and extracting the 5 words with the largest weights in the abstract field using the TF-IDF algorithm, to jointly construct a topic subnet;
(5) taking expert name-institution as the constraint condition, combining the three subnets to construct an expert information network, calculating the expert centrality values in that network, sorting them, and recommending the top 5 experts as the recommendation result.
2. The expert recommendation method based on the multi-source information fusion technology according to claim 1, wherein the specific steps of constructing the keyword dictionary keywords in the step (1) are as follows:
(1.1) acquiring the scientific and technical paper documents W from the knowledge base, where W contains M documents in total, and creating an empty keyword dictionary keywords;
(1.2) defining the global loop variable Vi, initialized to 1, for traversing W, with Vi ∈ [1, M], where W_Vi denotes the Vi-th document;
(1.3) judging whether Vi ≤ M; if so, executing step (1.4), and if not, executing step (1.11);
(1.4) defining the loop variable Vij, initialized to 1, denoting the j-th keyword of document W_Vi, with Vij ∈ [1, N], where N is the number of keywords of document W_Vi;
(1.5) judging whether Vij ∈ keywords; if so, executing step (1.6), and if not, executing step (1.10);
(1.6) the keyword Vij already exists in the keyword table keywords, so Vij is not written;
(1.7) letting Vij = Vij + 1;
(1.8) judging whether Vij ≤ N; if so, executing step (1.5), and if not, executing step (1.9);
(1.9) letting Vi = Vi + 1 and executing step (1.3);
(1.10) writing the keyword Vij into the keyword table keywords, and executing step (1.7);
(1.11) obtaining the keyword table keywords containing all keywords.
3. The expert recommendation method based on multi-source information fusion technology according to claim 1, wherein the specific steps in step (2) of extracting the author field of the knowledge base and performing word-frequency co-occurrence analysis to construct the expert cooperation relationship subnet are as follows:
(2.1) acquiring the scientific and technical paper documents W from the knowledge base, where the total number of documents in W is M, with the loop variable Vi and the Vi-th document W_Vi;
(2.2) judging whether Vi ≤ M; if so, executing step (2.3); if not, executing step (2.5);
(2.3) splitting the author field of paper W_Vi to obtain the relation R = {W_Vi, W_Via}, where W_Via is the name of the a-th author of paper W_Vi;
(2.4) letting Vi = Vi + 1 and executing step (2.2);
(2.5) obtaining the document-author relation R for all documents after splitting;
(2.6) performing frequency statistics on all authors in the document-author relation R to obtain the author frequency A = {m, Na}, where Na is an author name and m is the total number of occurrences of Na;
(2.7) counting the author co-occurrence frequency G = {n, Nap, Naq}, where G denotes that authors Nap and Naq co-occur n times;
(2.8) converting the author co-occurrence frequency G into a co-occurrence network to obtain the expert cooperation relationship subnet.
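Steps (2.3) to (2.8) can be illustrated with a short Python sketch. It assumes that each author field is a single string with names separated by semicolons and that a name counts once per paper; both are assumptions, and the function names and sample authors are hypothetical.

```python
from collections import Counter
from itertools import combinations


def author_cooccurrence(author_fields, sep=";"):
    """Return (author frequency A, pair co-occurrence frequency G)."""
    author_freq = Counter()   # A = {m, Na}
    pair_freq = Counter()     # G = {n, Nap, Naq}
    for field in author_fields:
        # Step (2.3): split the author field; sorting gives each pair one canonical order.
        names = sorted({name.strip() for name in field.split(sep) if name.strip()})
        author_freq.update(names)                   # step (2.6)
        pair_freq.update(combinations(names, 2))    # step (2.7)
    return author_freq, pair_freq


def to_network(pair_freq):
    """Step (2.8): turn pair counts into a weighted, undirected adjacency map."""
    network = {}
    for (p, q), n in pair_freq.items():
        network.setdefault(p, {})[q] = n
        network.setdefault(q, {})[p] = n
    return network


# Hypothetical author fields for three papers.
fields = ["Author A; Author B; Author C", "Author A; Author C", "Author D"]
author_freq, pair_freq = author_cooccurrence(fields)
network = to_network(pair_freq)
```

In this sketch an author who never co-authors (here `Author D`) appears in the frequency table but not in the network, since the network is built from pairs only.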
4. The expert recommendation method based on multi-source information fusion technology according to claim 1, wherein the specific steps in step (3) of extracting the experts' research directions and personal information from the expert Web pages, using a regular expression and a named entity recognition algorithm respectively, to construct the expert Web subnet are as follows:
(3.1) acquiring the expert Web page information from the knowledge base;
(3.2) extracting expert information from the expert Web page by a named entity recognition algorithm;
(3.3) obtaining the expert's personal information;
(3.4) defining a regular expression rule Ru;
(3.5) judging whether the result of matching the rule Ru against the Web page is empty; if so, executing step (3.8); if not, executing step (3.6);
(3.6) obtaining the expert's research direction;
(3.7) constructing the Web subnet from the expert's research direction and personal information;
(3.8) constructing the Web subnet from the expert's personal information.
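The branch in steps (3.4) to (3.8) can be sketched in Python as follows. The claim specifies neither the rule Ru nor the named entity recognition model, so the pattern `RU`, the page layout and the `toy_ner` stand-in below are all hypothetical; a real system would pass in a trained recognizer instead.

```python
import re

# Hypothetical rule Ru: a labelled "Research direction" line on the page.
RU = re.compile(r"Research (?:direction|interests?)\s*[::]\s*([^\n]+)", re.IGNORECASE)


def toy_ner(text):
    """Placeholder for the named entity recognition step (3.2); not a real NER model."""
    info = {}
    for label in ("Name", "Organization"):
        found = re.search(label + r"\s*[::]\s*([^\n]+)", text)
        if found:
            info[label.lower()] = found.group(1).strip()
    return info


def build_web_record(page_text, extract_personal_info):
    """Build one expert's entry for the Web subnet."""
    record = {"personal_info": extract_personal_info(page_text)}   # steps (3.2)-(3.3)
    match = RU.search(page_text)                                   # steps (3.4)-(3.5)
    if match:   # rule matched: steps (3.6)-(3.7)
        parts = re.split(r"[,;,;、]", match.group(1))
        record["research_directions"] = [p.strip() for p in parts if p.strip()]
    return record   # without a match this is the step (3.8) case


# Hypothetical page texts.
page = "Name: Author A\nOrganization: Example University\nResearch direction: data mining, text analysis\n"
record = build_web_record(page, toy_ner)
bare = build_web_record("Name: Author B\n", toy_ner)
```

Passing the recognizer in as a function keeps the regular expression branch independent of whichever recognition model is actually used.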
5. The expert recommendation method based on multi-source information fusion technology according to claim 1, wherein the specific steps in step (4) of obtaining the document-topic relation, the topic-keyword relation and the 5 highest-weighted words of the abstract field by the LDA and TF-IDF algorithms are as follows:
(4.1) acquiring the scientific and technical paper documents W from the knowledge base, where the total number of documents in W is M, with the loop variable Vi, and creating an empty abstract text Abstract;
(4.2) judging whether Vi ≤ M; if so, executing step (4.3); if not, executing step (4.5);
(4.3) writing the abstract of document W_Vi into the abstract text Abstract;
(4.4) letting Vi = Vi + 1 and executing step (4.2);
(4.5) obtaining the abstract text Abstract containing the abstracts of all documents W;
(4.6) loading the keyword dictionary keywords and performing jieba word segmentation on the abstract text Abstract to obtain the segmented abstract text Abstract';
(4.7) performing the document-topic and topic-keyword calculations on Abstract' by the LDA algorithm;
(4.8) obtaining the document-topic and topic-keyword relations of the abstract text Abstract;
(4.9) performing weight calculation on Abstract' by the TF-IDF algorithm;
(4.10) obtaining the 5 words with the largest weight in each abstract;
(4.11) constructing the topic subnet from the document-topic relation, the topic-keyword relation and the 5 highest-weighted words of the abstract.
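Steps (4.9) and (4.10) can be illustrated with a small Python sketch. It covers only the TF-IDF part and assumes the abstracts are already segmented into tokens; the jieba segmentation of step (4.6) and the LDA calculation of step (4.7) are omitted. The claim does not state which TF-IDF variant is used, so the unsmoothed form tf × ln(M / df) below is an assumption, as are the function name and the sample tokens.

```python
import math
from collections import Counter


def top_tfidf_words(tokenized_docs, k=5):
    """Return, for each document, its k highest-weighted words by TF-IDF."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))   # number of documents containing each word
    results = []
    for tokens in tokenized_docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights = {
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        }
        # Sort by descending weight; ties are broken alphabetically for determinism.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        results.append(ranked[:k])
    return results


# Hypothetical, already segmented abstracts.
abstracts = [
    ["expert", "recommendation", "network", "expert"],
    ["topic", "model", "network"],
    ["expert", "topic", "fusion"],
]
top_words = top_tfidf_words(abstracts)
```

A word that occurs in every document receives weight 0 under this variant, which is one reason practical implementations often add smoothing.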
6. The expert recommendation method based on multi-source information fusion technology according to claim 1, wherein the specific steps in step (5) of combining the three subnets with the expert name-organization pair as the constraint condition to construct the expert information network, calculating the expert centrality values in that network, and taking the experts whose centrality values rank in the top 5 as the recommendation result are as follows:
(5.1) taking the expert cooperation relationship subnet, the Web subnet and the topic subnet;
(5.2) associating the expert cooperation relationship subnet, the Web subnet and the topic subnet with the expert name-organization pair as the constraint condition;
(5.3) obtaining the expert information network;
(5.4) calculating and sorting the expert centrality values in the expert information network;
(5.5) taking the top 5 ranked experts as the final recommendation result according to the sorting result.
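Steps (5.1) to (5.5) can be sketched in Python as follows. The claim does not name a centrality measure, so degree centrality is used here purely as an assumption. Each subnet is assumed to be a list of edges, with expert nodes represented as (name, organization) tuples so that the same expert in different subnets maps to a single node; all names and data are hypothetical.

```python
from collections import defaultdict


def merge_subnets(*subnets):
    """Steps (5.1)-(5.3): union the edge lists into one undirected graph."""
    graph = defaultdict(set)
    for edges in subnets:
        for u, v in edges:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def top_experts(graph, k=5):
    """Steps (5.4)-(5.5): rank expert nodes by degree centrality (an assumed measure)."""
    n = len(graph)
    if n < 2:
        return []
    scores = {
        node: len(neighbours) / (n - 1)
        for node, neighbours in graph.items()
        if isinstance(node, tuple)   # only (name, organization) nodes are experts
    }
    return sorted(scores, key=lambda e: (-scores[e], e))[:k]


# Hypothetical experts and subnets.
a = ("Author A", "Example University")
b = ("Author B", "Example University")
c = ("Author C", "Example Institute")
cooperation_subnet = [(a, b), (a, c)]
web_subnet = [(a, "data mining"), (b, "data mining")]
topic_subnet = [(a, "topic 0"), (c, "topic 0")]

graph = merge_subnets(cooperation_subnet, web_subnet, topic_subnet)
ranking = top_experts(graph)
```

Keying expert nodes on the name-organization pair is what lets two experts with the same name at different organizations remain separate nodes after the merge.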
CN202010964492.1A 2020-09-15 2020-09-15 Expert recommendation method based on multi-source information fusion technology Active CN112269909B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010964492.1A CN112269909B (en) 2020-09-15 2020-09-15 Expert recommendation method based on multi-source information fusion technology

Publications (2)

Publication Number Publication Date
CN112269909A (en) 2021-01-26
CN112269909B (en) 2022-06-03

Family

ID=74349510

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010964492.1A Active CN112269909B (en) 2020-09-15 2020-09-15 Expert recommendation method based on multi-source information fusion technology

Country Status (1)

Country Link
CN (1) CN112269909B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075942A (en) * 2007-06-22 2007-11-21 清华大学 Method and system for processing social network expert information based on expert value progation algorithm
CN103559262A (en) * 2013-11-04 2014-02-05 北京邮电大学 Community-based author and academic paper recommending system and recommending method
US20160154798A1 (en) * 2014-03-06 2016-06-02 Webfire, Llc Method of automatically constructing content for web sites
CN110688405A (en) * 2019-08-23 2020-01-14 上海科技发展有限公司 Expert recommendation method, device, terminal and medium based on artificial intelligence
CN110990662A (en) * 2019-11-22 2020-04-10 北京市科学技术情报研究所 Domain expert selection method based on citation network and scientific research cooperation network
CN111143690A (en) * 2019-12-31 2020-05-12 中国电子科技集团公司信息科学研究院 Expert recommendation method and system based on associated expert database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
徐硕: "《基于论文和资源的技术机会发现方法》", 31 January 2018 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988951A (en) * 2021-03-16 2021-06-18 福州数据技术研究院有限公司 Scientific research project review expert accurate recommendation method and storage device
CN113537927A (en) * 2021-06-28 2021-10-22 北京航空航天大学 Scientific and technological resource service platform transaction coordination system and method
CN113537927B (en) * 2021-06-28 2024-06-07 北京航空航天大学 Transaction collaboration system and method for scientific and technological resource service platform
CN114547284A (en) * 2022-02-22 2022-05-27 同方知网数字出版技术股份有限公司 Anti-cheating intelligent method for recommending reviewers to system

Also Published As

Publication number Publication date
CN112269909B (en) 2022-06-03

Similar Documents

Publication Publication Date Title
Nagarajan et al. Classifying streaming of Twitter data based on sentiment analysis using hybridization
Xie et al. An improved algorithm for sentiment analysis based on maximum entropy
CN110046260B (en) Knowledge graph-based hidden network topic discovery method and system
CN112269909B (en) Expert recommendation method based on multi-source information fusion technology
CN107766585B (en) Social network-oriented specific event extraction method
Radovanović et al. Text mining: Approaches and applications
CN105824959B (en) Public opinion monitoring method and system
CN109165383B (en) Data aggregation, analysis, mining and sharing method based on cloud platform
Rahimi et al. An overview on extractive text summarization
CN108304552B (en) Named entity linking method based on knowledge base feature extraction
Zeng et al. A classification-based approach for implicit feature identification
CN110728151B (en) Information depth processing method and system based on visual characteristics
CN114254653A (en) Scientific and technological project text semantic extraction and representation analysis method
CN106599824B (en) A kind of GIF animation emotion identification method based on emotion pair
Gan et al. Experimental comparison of three topic modeling methods with LDA, Top2Vec and BERTopic
Lubis et al. Latent Semantic Indexing (LSI) and Hierarchical Dirichlet Process (HDP) Models on News Data
Dung Natural language understanding
Kama et al. Analyzing implicit aspects and aspect dependent sentiment polarity for aspect-based sentiment analysis on informal Turkish texts
Sahono et al. Extrovert and introvert classification based on Myers-Briggs Type Indicator (MBTI) using support vector machine (SVM)
Popovski et al. Food Data Integration by using Heuristics based on Lexical and Semantic Similarities.
Tian et al. Research on image classification based on a combination of text and visual features
Cherif et al. A hybrid optimal weighting scheme and machine learning for rendering sentiments in tweets
CN115687960B (en) Text clustering method for open source security information
Zheng et al. A short-text oriented clustering method for hot topics extraction
Wang et al. Content-based weibo user interest recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20210126

Assignee: JIANGSU AOFAN TECHNOLOGY CO.,LTD.

Assignor: HUAIYIN INSTITUTE OF TECHNOLOGY

Contract record no.: X2022980027215

Denomination of invention: An expert recommendation method based on multi-source information fusion technology

Granted publication date: 20220603

License type: Common License

Record date: 20221229

TR01 Transfer of patent right

Effective date of registration: 20230615

Address after: Room 501, No. 502, No. 894, Tianhe North Road, Tianhe District, Guangzhou, Guangdong 510000

Patentee after: Greater Bay Area Technology Innovation Service Center (Guangzhou) Co.,Ltd.

Address before: 510000 room 432, second floor, unit 2, building 2, No. 24, Jishan new road street, Tianhe District, Guangzhou City, Guangdong Province (office only)

Patentee before: Guangzhou Jingzhi Information Technology Co.,Ltd.

Effective date of registration: 20230615

Address after: 510000 room 432, second floor, unit 2, building 2, No. 24, Jishan new road street, Tianhe District, Guangzhou City, Guangdong Province (office only)

Patentee after: Guangzhou Jingzhi Information Technology Co.,Ltd.

Address before: 223005 Jiangsu Huaian economic and Technological Development Zone, 1 East Road.

Patentee before: HUAIYIN INSTITUTE OF TECHNOLOGY