CN116402046B - Post entry construction method based on recruitment information - Google Patents
- Publication number
- CN116402046B (application CN202310680645.3A)
- Authority
- CN
- China
- Prior art keywords
- post
- sentence
- responsibility
- list
- requirement
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/105—Human resources
- G06Q10/1053—Employment or hiring
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides a post entry construction method based on recruitment information, belonging to the technical field of recruitment information mining and analysis. The method comprises: collecting recruitment information, segmenting and cleaning it, and constructing a recruitment information table; using the keywords in a preset keyword table as head words, segmenting all recruitment information in the recruitment information table into short sentences, obtaining keyword sentences, and constructing a keyword sentence list; taking the post responsibility proper noun table and the post requirement proper noun table respectively as constraints, separating post responsibility short sentences and post requirement short sentences from the keyword sentence list, segmenting and combining their phrases according to the sentence pattern of each short sentence, and constructing post responsibility entries and post requirement entries; and finally repairing both kinds of entries according to entry repair logic and determining the post entries corresponding to the recruitment information. The method can effectively extract the keywords in recruitment information, improves keyword extraction accuracy, and meets the business needs of actual recruitment.
Description
Technical Field
The application belongs to the technical field of recruitment information mining and analysis, and in particular relates to a method for mining task words and skill words from recruitment information to construct post entries.
Background
With the explosive development of the Internet, the volume of information on the network has grown explosively, and acquiring knowledge quickly and accurately from massive information has become a central and urgent need. In the conventional recruitment process, many recruiters must manually handle multiple rounds of interviews and screening, which consumes a great deal of manpower and time. Recruitment text is a special kind of information text: the information it records is scattered, and its vocabulary is affected by the development and adjustment of technology, so new recruitment vocabulary appears easily. This places higher demands on the accuracy and adaptability of recruitment information mining and analysis.
At present, there are also technical schemes that extract keywords for job responsibilities and skill requirements from recruitment information through a keyword model. The common practice is to first crawl the recruitment requirements and responsibilities of data-mining posts from the Zhilian recruitment website, segment the crawled information with a word segmentation technology such as jieba, extract keywords from the recruitment information, and take the words with higher occurrence frequency as the keywords of the recruitment information. However, existing word segmentation technologies such as jieba segment text either with general and common words or with a custom dictionary. Segmentation based on general and common words cuts apart the keywords for job responsibilities and post requirements, so the business requirements cannot be met. Keyword extraction with a custom dictionary can guarantee extraction quality, but the workload is large; moreover, the words in some professional recruitment information are closely connected, and a custom dictionary cannot exhaust the segmentation information of every post's recruitment information, so keyword extraction accuracy is low.
Disclosure of Invention
Therefore, the application provides a post entry construction method based on recruitment information, which helps to solve the problem that existing keyword extraction methods have difficulty effectively extracting keywords from recruitment information, resulting in low keyword extraction accuracy and failure to meet actual recruitment business requirements.
To achieve the above purpose, the application adopts the following technical scheme:
The application provides a post entry construction method based on recruitment information, which comprises the following steps:
collecting recruitment information, segmenting and cleaning the recruitment information according to a first preset serial number list rule, and constructing a recruitment information table;
using the keywords in a preset keyword table as head words, segmenting all recruitment information in the recruitment information table into short sentences, obtaining keyword sentences, and constructing a keyword sentence list; the preset keyword table specifically comprises a post description subject term table, a post responsibility proper noun table, and a post requirement proper noun table;
taking the post responsibility proper noun table as a constraint, separating post responsibility short sentences from the keyword sentence list, and segmenting and combining the phrases of each post responsibility short sentence according to its sentence pattern to construct post responsibility entries;
taking the post requirement proper noun table as a constraint, separating post requirement short sentences from the keyword sentence list, and segmenting and combining the phrases of each post requirement short sentence according to its sentence pattern to construct post requirement entries;
and repairing the post responsibility entries and the post requirement entries according to entry repair logic, and determining the post entries corresponding to the recruitment information.
Further, collecting recruitment information, segmenting and cleaning the recruitment information according to the first preset serial number list rule, and constructing the recruitment information table specifically includes:
presetting a plurality of serial number list rules, and linking the serial number list rules in order to form one complete regular expression, thereby obtaining the first preset serial number list rule;
collecting recruitment information, identifying the serial number formats in the recruitment information according to the first preset serial number list rule, segmenting the recruitment information text in the order of the identified serial numbers, and matching the recruitment information text corresponding to each serial number against the regular expression one by one to form the recruitment information table.
Further, using the keywords in the preset keyword table as head words, segmenting all recruitment information in the recruitment information table into short sentences, obtaining keyword sentences, and constructing the keyword sentence list specifically includes:
predefining a post description subject term table, a post responsibility proper noun table, and a post requirement proper noun table; taking the post description subject terms, the post responsibility proper nouns, and the post requirement proper nouns in these tables as the head words of short sentences; and segmenting all recruitment information in the recruitment information table into short sentences to obtain post description short sentences, post responsibility short sentences, and post requirement short sentences respectively;
performing subject processing on the post responsibility short sentences and the post requirement short sentences respectively, so that each post responsibility short sentence or post requirement short sentence contains only one post responsibility proper noun or post requirement proper noun;
traversing the post responsibility short sentences, determining the start position and end position of the post responsibility proper noun in each post responsibility short sentence, and constructing the array of start and end positions of the preceding and following keywords in the post responsibility short sentence; traversing the post requirement short sentences, determining the start position and end position of the post requirement proper noun in each post requirement short sentence, and constructing the array of start and end positions of the preceding and following keywords in the post requirement short sentence;
filtering the keywords of the post description short sentences, the post responsibility short sentences, and the post requirement short sentences respectively according to preset keyword filtering logic to obtain the keyword sentences of the recruitment information;
based on the keyword sentences, taking each keyword as a key and its keyword sentence as a value, and forming the keyword sentence list as key-value pairs.
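The segmentation and key-value steps above can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the two keyword tables, the function name `build_keyword_sentence_list`, and the English sample text are all assumptions introduced here, and the subject processing and keyword filtering steps are not modeled.

```python
import re

# Hypothetical keyword tables; the patent does not list their contents.
RESPONSIBILITY_TERMS = ["responsible for", "participate in"]
REQUIREMENT_TERMS = ["familiar with", "proficient in"]


def build_keyword_sentence_list(text, keywords):
    """Cut `text` so that every short sentence starts with a keyword,
    and return a list of (keyword, short sentence) pairs."""
    # Longer keywords first, so the longest alternative wins on overlap.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    matches = list(pattern.finditer(text))
    pairs = []
    for i, m in enumerate(matches):
        # m.start() and m.end() are the start and end positions of the
        # keyword; the short sentence runs up to the next keyword.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sentence = text[m.start():end].strip(" ,;.")
        pairs.append((m.group(0), sentence))
    return pairs


sample = ("responsible for data analysis, participate in product design; "
          "familiar with Python, proficient in SQL.")
keyword_sentences = build_keyword_sentence_list(
    sample, RESPONSIBILITY_TERMS + REQUIREMENT_TERMS)
```

Each pair keeps the keyword as the key and the short sentence that begins with it as the value, which mirrors the key-value form described above. Text that precedes the first keyword is dropped in this sketch.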
Further, taking the post responsibility proper noun table as a constraint, separating post responsibility short sentences from the keyword sentence list, and segmenting and combining the phrases of each post responsibility short sentence according to its sentence pattern to construct post responsibility entries specifically includes:
separating the post responsibility short sentences from the keyword sentence list according to the post responsibility proper nouns in the post responsibility proper noun table;
traversing the separated post responsibility short sentences and judging the sentence pattern of each; if a post responsibility short sentence has a punctuation mark sentence pattern, taking the post responsibility proper noun in the short sentence as the first phrase and the original post responsibility short sentence as the first-level task word to construct a post responsibility entry;
if a post responsibility short sentence has a bracket sentence pattern, constructing the post responsibility entry according to the bracket sentence pattern processing logic.
Further, the bracket sentence pattern processing logic is specifically:
if the sentence pattern of the post responsibility short sentence is a standard bracket sentence pattern, setting the first phrase as the post responsibility proper noun, separating the phrases before and after the standard bracket from the text inside the standard bracket, combining the phrases before and after the standard bracket into a first-level task word to obtain a first-level post responsibility phrase, dividing the text inside the standard bracket into second-level task words according to punctuation marks to obtain second-level post responsibility phrases, and combining the first-level post responsibility phrase and the second-level post responsibility phrases to construct a post responsibility entry;
if the sentence pattern of the post responsibility short sentence is a non-standard bracket sentence pattern, setting the matched first phrase as the post responsibility proper noun, taking the phrase before the first preset keyword as the first-level task word, dividing the phrase after the first preset keyword into second-level task words according to punctuation marks, and combining the first-level task word and the second-level task words.
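A minimal Python sketch of the punctuation mark pattern and the standard bracket pattern is given below. It rests on several assumptions that are not in the source: the function name `build_responsibility_entry`, the dictionary layout of an entry, the set of splitting punctuation marks, and the use of a space to join the phrases before and after the bracket (Chinese text would simply be concatenated). The non-standard bracket pattern is left out because the first preset keyword is not specified.

```python
import re

# Matches one pair of full-width or half-width brackets and captures the text inside.
BRACKET = re.compile(r"[（(]([^（()）]*)[）)]")
# Assumed punctuation used to split the bracket content into second-level words.
SPLIT_PUNCT = re.compile(r"[,，、;；/]")


def build_responsibility_entry(short_sentence, proper_noun):
    """Build an entry from a post responsibility short sentence.

    `proper_noun` is the post responsibility proper noun that heads the sentence.
    """
    m = BRACKET.search(short_sentence)
    if m is None:
        # Punctuation mark pattern: the original short sentence is the
        # first-level task word and there are no second-level task words.
        return {"head": proper_noun,
                "first_level": short_sentence.strip(),
                "second_level": []}
    # Standard bracket pattern: join the text before and after the bracket
    # into the first-level task word, split the bracket content by punctuation.
    before = short_sentence[:m.start()].strip()
    after = short_sentence[m.end():].strip()
    first_level = " ".join(part for part in (before, after) if part)
    second_level = [s.strip() for s in SPLIT_PUNCT.split(m.group(1)) if s.strip()]
    return {"head": proper_noun,
            "first_level": first_level,
            "second_level": second_level}


bracket_entry = build_responsibility_entry(
    "responsible for product (app, mini program) design", "responsible for")
plain_entry = build_responsibility_entry(
    "responsible for data analysis", "responsible for")
```

How the first-level and second-level phrases are finally combined into a single entry is not detailed in the text, so the sketch only keeps them side by side.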
Further, taking the post requirement proper noun table as a constraint, separating post requirement short sentences from the keyword sentence list, and segmenting and combining the phrases of each post requirement short sentence according to its sentence pattern to construct post requirement entries specifically includes:
separating the post requirement short sentences from the keyword sentence list according to the post requirement proper nouns in the post requirement proper noun table;
traversing the separated post requirement short sentences and judging the sentence pattern of each; if a post requirement short sentence has a punctuation mark sentence pattern, constructing the post requirement entry according to the punctuation mark sentence pattern processing logic used for post responsibility short sentences;
if a post requirement short sentence has a bracket sentence pattern, constructing the post requirement entry according to the bracket sentence pattern processing logic used for post responsibility short sentences;
if a post requirement short sentence has a double sentence pattern, constructing the post requirement entry according to the double sentence pattern processing logic.
Further, the double sentence pattern processing logic specifically includes:
if the post requirement short sentence contains three phrases, constructing the post requirement entry according to a preset three-segment word rule;
if the post requirement short sentence contains two phrases, constructing the post requirement entry according to a preset two-segment word rule.
Further, repairing the post responsibility entries and the post requirement entries according to the entry repair logic and determining the post entries corresponding to the recruitment information specifically includes:
analyzing the punctuation marks of the post responsibility entries and the post requirement entries respectively, and removing invalid punctuation marks at the beginning and end of each entry;
deleting invalid character strings at the beginning and end of the post responsibility entries and the post requirement entries;
and supplementing the word functions of the post responsibility entries and the post requirement entries respectively, and determining the post entries corresponding to the recruitment information.
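The first two repair steps can be illustrated with a short Python sketch. The punctuation set, the list of invalid strings, and the function name `repair_entry` are hypothetical choices made for this example; the patent does not enumerate them, and the word function supplementing step is not modeled because the text does not describe it in detail.

```python
# Assumed set of punctuation marks that are invalid at either end of an entry.
INVALID_PUNCT = " ,，。.;；:：、"
# Assumed invalid strings; the real list is not given in the source.
INVALID_STRINGS = ["etc", "and so on"]


def repair_entry(entry):
    """Remove invalid punctuation and invalid strings from both ends of an entry."""
    cleaned = entry.strip(INVALID_PUNCT)
    for s in INVALID_STRINGS:
        if cleaned.startswith(s):
            cleaned = cleaned[len(s):]
        if cleaned.endswith(s):
            cleaned = cleaned[:-len(s)]
    # Removing a string may expose more punctuation, so strip once more.
    return cleaned.strip(INVALID_PUNCT)


repaired = repair_entry("; familiar with Python, etc.")
unchanged = repair_entry("data analysis")
```

A plain prefix or suffix check can also trim a legitimate word that happens to start or end with one of the invalid strings, so a production version would likely need word-boundary checks.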
By adopting the above technical scheme, the application has at least the following beneficial effects:
Recruitment information is collected, then segmented and cleaned according to a first preset serial number list rule, to construct a recruitment information table; using the keywords in a preset keyword table as head words, all recruitment information in the recruitment information table is segmented into short sentences, keyword sentences are obtained, and a keyword sentence list is constructed; the preset keyword table specifically comprises a post description subject term table, a post responsibility proper noun table, and a post requirement proper noun table; taking the post responsibility proper noun table as a constraint, post responsibility short sentences are separated from the keyword sentence list, and the phrases of each post responsibility short sentence are segmented and combined according to its sentence pattern to construct post responsibility entries; taking the post requirement proper noun table as a constraint, post requirement short sentences are separated from the keyword sentence list, and the phrases of each post requirement short sentence are segmented and combined according to its sentence pattern to construct post requirement entries; and the post responsibility entries and the post requirement entries are repaired according to entry repair logic to determine the post entries corresponding to the recruitment information. By using the keywords in the preset keyword table as head words, segmenting all recruitment information in the recruitment information table, obtaining the keyword sentences, and constructing the keyword sentence list, the application effectively extracts the keywords in the recruitment information and improves keyword extraction accuracy.
Meanwhile, the post requirement short sentences and the post responsibility short sentences are separated from the keyword sentence list and their phrases are segmented and combined to construct the post requirement entries and the post responsibility entries, which guarantees keyword extraction quality, reduces the workload of recruitment data analysis, and meets the business needs of actual recruitment.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a flowchart illustrating a post entry construction method based on recruitment information according to an example embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail below.
Referring to Fig. 1, which shows a post entry construction method based on recruitment information according to an example embodiment, the method includes:
S1: collecting recruitment information, segmenting and cleaning the recruitment information according to a first preset serial number list rule, and constructing a recruitment information table;
S2: using the keywords in a preset keyword table as head words, segmenting all recruitment information in the recruitment information table into short sentences, obtaining keyword sentences, and constructing a keyword sentence list; the preset keyword table specifically comprises a post description subject term table, a post responsibility proper noun table, and a post requirement proper noun table;
S3: taking the post responsibility proper noun table as a constraint, separating post responsibility short sentences from the keyword sentence list, and segmenting and combining the phrases of each post responsibility short sentence according to its sentence pattern to construct post responsibility entries;
S4: taking the post requirement proper noun table as a constraint, separating post requirement short sentences from the keyword sentence list, and segmenting and combining the phrases of each post requirement short sentence according to its sentence pattern to construct post requirement entries;
S5: repairing the post responsibility entries and the post requirement entries according to entry repair logic, and determining the post entries corresponding to the recruitment information.
Further, in one embodiment, collecting recruitment information, segmenting and cleaning the recruitment information according to the first preset serial number list rule, and constructing the recruitment information table specifically includes:
presetting a plurality of serial number list rules, and linking the serial number list rules in order to form one complete regular expression, thereby obtaining the first preset serial number list rule;
collecting recruitment information, identifying the serial number formats in the recruitment information according to the first preset serial number list rule, segmenting the recruitment information text in the order of the identified serial numbers, and matching the recruitment information text corresponding to each serial number against the regular expression one by one to form the recruitment information table.
The first preset serial number list rule is obtained from a recruitment serial number table. The recruitment serial number table consists of ordered regular expressions for recruitment information text, each expression being one element of the table. The serial number list rules in the recruitment serial number table are as follows:
Rule 1: [0-9]{1,2}[、.]{1} covers recruitment information in serial number formats such as '1、', '12、', '2.', and '16.';
Rule 2: [• ]{1} covers recruitment information in the '•' serial number format;
Rule 3: [（(]{1}[0-9]{1,2}[）)]{1} covers recruitment information in serial number formats such as '(2)' and '(32)', where the parentheses may be full-width or half-width and the serial number may have one or two digits;
Rule 4: [0-9]{1,2}[）)]{1} covers recruitment information in serial number formats such as '3)' and '12)', where the closing parenthesis may be full-width or half-width and the serial number may have one or two digits.
In practice, the recruitment information is segmented and cleaned as follows: the recruitment information text is segmented according to the serial number list rules in the recruitment serial number table, and a new recruitment information table is constructed. The recruitment serial number table is as follows:
['[0-9]{1,2}[、.]{1}', '[• ]{1}', '[（(]{1}[0-9]{1,2}[）)]{1}', '[0-9]{1,2}[）)]{1}']
The recruitment information table is constructed roughly as follows:
(1) The serial number list rules are extracted; each rule is a regular expression.
(2) The rules are linked together to form one complete regular expression, in which the rules are joined by an OR relationship.
(3) The data is segmented to form a sentence-by-sentence recruitment information table.
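The three steps above can be sketched with Python's standard `re` module. This is a hedged illustration rather than the patent's own code: the names `SERIAL_RULES` and `split_recruitment_text` are assumptions, and because the sample text is English, the space inside rule 2 is dropped here so that ordinary word spacing is not treated as a serial number.

```python
import re

# Serial number rules following the recruitment serial number table.
# Assumption: rule 2 omits the space so English sample text is not split on spaces.
SERIAL_RULES = [
    r"[0-9]{1,2}[、.]{1}",
    r"[•]{1}",
    r"[（(]{1}[0-9]{1,2}[）)]{1}",
    r"[0-9]{1,2}[）)]{1}",
]

# Step (2): link the rules with an OR relationship into one regular expression.
SERIAL_PATTERN = re.compile("|".join(SERIAL_RULES))


def split_recruitment_text(text):
    """Step (3): split the text at every serial number and drop empty pieces."""
    return [part.strip() for part in SERIAL_PATTERN.split(text) if part.strip()]


sample = ("Job responsibilities: 1. Build the content system; "
          "2. Run data operations. Job requirements: "
          "(1) Bachelor degree; (2) Strong communication skills.")
recruitment_table = split_recruitment_text(sample)
```

Because the split happens only at serial numbers, a heading such as "Job requirements:" stays attached to the end of the preceding item. Rule 1 would also match a one- or two-digit number followed by a full stop inside running text (for example a decimal), so real input may need extra guarding.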
Specifically, the application also provides an example of segmenting and cleaning recruitment information, as follows:
(1) Input data:
Job responsibilities: 1. Responsible for building the communication game content/author understanding system, cooperating with the algorithm, R&D, data, and review teams to build a complete content-understanding optimization pipeline and improve content recognition capability and efficiency; 2. Responsible for the scientific operation of game content data, able to provide solutions for different ecosystem goals; 3. Responsible for optimizing content operation strategies, able to use data to support decisions and set up a reasonable data effect evaluation system, linking the modules through experiments and iterating and verifying continuously to explore and find new methods that improve key metrics and empower game user growth. Job requirements: 1. Bachelor's degree or above in economics, statistics, mathematics, physics, information technology, or a related major; 2 or more years of related work experience in content middle platform/content strategy/content ecosystem preferred; hands-on experience with data-driven business preferred, consulting company background preferred, Internet business analysis/BI background preferred; 2. Strong content understanding, attention to industry/competitor dynamics, Internet product thinking, and clear logic; 3. Excellent data analysis skills (familiar with big data tools), familiar with A/B testing theory and process, and knowledgeable about common machine learning and deep learning algorithms; good at using data to drive requirements or guide decisions; 4. Strong communication and collaboration skills, project management skills, goal orientation, self-motivation, and strong curiosity and learning ability.
(2) Output result:
["Job responsibilities:",
"Responsible for building the communication game content/author understanding system, cooperating with the algorithm, R&D, data, and review teams to build a complete content-understanding optimization pipeline and improve content recognition capability and efficiency;",
"Responsible for the scientific operation of game content data, able to provide solutions for different ecosystem goals;",
"Responsible for optimizing content operation strategies, able to use data to support decisions and set up a reasonable data effect evaluation system, linking the modules through experiments and iterating and verifying continuously to explore and find new methods that improve key metrics and empower game user growth. Job requirements:",
"Bachelor's degree or above in economics, statistics, mathematics, physics, information technology, or a related major; 2 or more years of related work experience in content middle platform/content strategy/content ecosystem preferred; hands-on experience with data-driven business preferred, consulting company background preferred, Internet business analysis/BI background preferred;",
"Strong content understanding, attention to industry/competitor dynamics, Internet product thinking, and clear logic;",
"Excellent data analysis skills (familiar with big data tools), familiar with A/B testing theory and process, and knowledgeable about common machine learning and deep learning algorithms; good at using data to drive requirements or guide decisions;",
"Strong communication and collaboration skills, project management skills, goal orientation, self-motivation, and strong curiosity and learning ability."]
Further, in one embodiment, the step of performing short sentence segmentation on all recruitment information in the recruitment information list with the keywords in the preset keyword list as phrase heads, obtaining key sentences and constructing a keyword sentence list specifically includes:
defining in advance a post description subject word list, a post responsibility proper noun list and a post requirement proper noun list; taking the post description subject words, the post responsibility proper nouns and the post requirement proper nouns in these lists as the heads of short sentences, and performing short sentence segmentation on all recruitment information in the recruitment information list to obtain post description short sentences, post responsibility short sentences and post requirement short sentences respectively;
performing thematic processing on the post responsibility short sentences and the post requirement short sentences respectively, so that each post responsibility short sentence or post requirement short sentence contains only one post responsibility proper noun or post requirement proper noun;
traversing the post responsibility short sentences, determining the start and end sequence numbers of the post responsibility proper noun in each post responsibility short sentence, and constructing the sequence number group of the preceding and following keywords in the post responsibility short sentence; traversing the post requirement short sentences, determining the start and end sequence numbers of the post requirement proper noun in each post requirement short sentence, and constructing the sequence number group of the preceding and following keywords in the post requirement short sentence;
performing keyword filtering on the post description short sentences, the post responsibility short sentences and the post requirement short sentences according to the preset keyword filtering logic to obtain the key sentences of the recruitment information;
based on the key sentences, forming the keyword sentence list as key-value pairs, with the keyword as the key and the key sentence as the value.
The post description subject word list consists of the subject words that introduce work responsibilities and work requirements in recruitment information; each subject word is one element of the list. The post responsibility proper noun list consists of the proper nouns that define job responsibility target tasks in recruitment information; each proper noun is one member of the list. The invention borrows the proper noun concept from English to denote a special kind of verb in recruitment information, and a proper noun sentence is a phrase in recruitment information headed by such a verb. The post requirement proper noun list consists of the proper nouns that define the proficiency level of work requirements in recruitment information; each proper noun is one member of the list. Key sentence construction takes the keywords in the keyword lists as phrase heads, splits each element of the input list (i.e. the recruitment information list) and builds the keyword sentence list. Each element of the keyword sentence list is a key-value pair whose key is a keyword and whose value is the short sentence beginning with that keyword.
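The segmentation described above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `build_key_sentences` and the three English keyword lists are hypothetical stand-ins for the Chinese keyword tables, and word boundaries are used only because the sample text is English.

```python
import re

# Hypothetical English stand-ins for the three preset keyword tables.
SUBJECT_WORDS = ["post responsibilities", "post requirements"]
DUTY_WORDS = ["responsible for", "participate in"]
REQUIREMENT_WORDS = ["familiar with", "have"]


def build_key_sentences(text, keywords):
    """Cut `text` at every keyword and return a list of {keyword: phrase} pairs."""
    # Try longer keywords first so the longest keyword wins at a given position.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in ordered) + r")\b",
        re.IGNORECASE,
    )
    matches = list(pattern.finditer(text))
    key_sentences = []
    for i, match in enumerate(matches):
        # A phrase runs from its keyword to the start of the next keyword.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        phrase = text[match.start():end].strip()
        key_sentences.append({match.group(0).lower(): phrase})
    return key_sentences
```

Each returned element is a single key-value pair, so subject-word elements such as "Post requirements:" can later be treated as boundaries between the responsibility and requirement phrases.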
Specifically, in one embodiment, the post description subject matter list, post responsibility proper noun list and post requirement proper noun list set by the application have the following keyword phrase contents:
(1) Post description subject word list: [ 'position requirement', 'position responsibility', 'position description', 'working responsibility', 'working requirement', 'job requirements', 'job qualifications', 'job categories', 'job responsibilities', 'job qualifications' ].
(2) Post responsibility proper noun list: [ 'responsible', 'participating', 'dominant', 'evaluation and testing', 'periodic dominance', 'assistance organization', 'coordination and dominance', 'engage', 'analyze and track', 'insight', 'mine', 'track', 'explore', 'and explore' ].
(3) Post requirement proper noun list: [ 'knowledge', 'well-known', 'familiar', 'mastery', 'proficiency', 'possessing', 'have', 'understand', 'deep participate', 'deep understand', 'skilled use', 'skilled application', 'skilled reading', 'deep understanding', 'skilled mastering', 'deep understanding', 'deep participation or leading', 'deep participation' ].
Further, the process of constructing the key sentence according to the key words provided in the above embodiment is specifically as follows:
(1) Thematic processing is performed on the post responsibility short sentences and the post requirement short sentences, i.e. each short sentence contains only one proper noun, and that proper noun is always at the very front of the short sentence.
(2) The post responsibility short sentences are traversed to determine the start and end sequence numbers of the post responsibility proper noun in each short sentence, and the sequence number group of the preceding and following keywords is constructed; the post requirement short sentences are traversed in the same way to determine the start and end sequence numbers of the post requirement proper noun in each short sentence and to construct the sequence number group of the preceding and following keywords.
(3) When the two keywords "deep participation" and "deep participation or leading" occur at the same time, the shorter keyword ("deep participation") is filtered out and only the longer keyword ("deep participation or leading") is kept.
(4) The two proper nouns "skilled mastery" and "mastery" are handled as follows: when "skilled mastery" is encountered, both proper nouns are located during positioning, and the position number of "skilled mastery" precedes that of "mastery". "Mastery" is then an invalid proper noun in this sentence and is skipped when the key sentence is intercepted, so the position number of the next proper noun is used as the position of the second proper noun.
(5) When two post description subject words are adjacent, the position of the shorter one is skipped during segmentation.
(6) When a keyword occurs in the middle of a short sentence, it must be skipped; a second keyword is treated as a phrase head only if the character immediately before it is a punctuation mark.
(7) The character string of each short sentence is intercepted according to the value range of its sequence number group and used as a member of the newly constructed keyword sentence list.
Wherein, the above processes (3), (4), (5) and (6) are preset keyword filtering logic of the present application. The application provides an example analysis result for constructing a key statement list, which is specifically as follows:
(1) Input data:
['Post responsibilities:', 'Responsible for frontier research and technical innovation in computer vision, focusing on representation learning, face analysis, detection and recognition, generation technology and the like; related results may be submitted to conferences;', 'Responsible for the research, development and optimization of face-related algorithms, including face recognition, detection, key point positioning, 3D reconstruction, GAN and the like;', 'Participate in and promote the landing of the related technology on product lines such as Toutiao and Douyin. Post requirements:', 'Majors such as mathematics, computer science, electronics and automation;', 'Familiar with C++ or Python, with strong code development ability;', 'Papers published at top international conferences or journals (including but not limited to CVPR, ICCV, ECCV, NeurIPS, ICML, AAAI, TPAMI, IJCV, etc.) are preferred;', 'Able to guarantee at least 4 days of attendance per week and a continuous internship of more than 3 months; long-term internships are very welcome, and the consent of the supervisor is required.']
(2) Output result:
[ {'post responsibilities': 'Post responsibilities:'}, {'responsible': 'Responsible for frontier research and technical innovation in computer vision, focusing on representation learning, face analysis, detection and recognition, generation technology and the like; related results may be submitted to conferences;'}, {'responsible': 'Responsible for the research, development and optimization of face-related algorithms, including face recognition, detection, key point positioning, 3D reconstruction, GAN and the like;'}, {'participate': 'Participate in and promote the landing of the related technology on product lines such as Toutiao and Douyin.'}, {'post requirements': 'Post requirements:'}, {'familiar': 'Familiar with C++ or Python,'}, {'have': 'have strong code development ability;'} ]
Keyword sentence element instances: {'participate': 'Participate in and promote the landing of the related technology on product lines such as Toutiao and Douyin.'} and {'post requirements': 'Post requirements:'}. Of the elements above, three are used to build post responsibility entries (also referred to as task entries in the application) and two are used to build post requirement entries (also referred to as skill entries in the application). The elements keyed by subject words (i.e. key sentence elements such as "work responsibility", "post requirement" and "post responsibility") serve only as boundary members and are not used when building task entries and skill entries.
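The keyword filtering logic of items (3), (4) and (6) can be illustrated with a short Python sketch. The helper names `locate_keywords` and `is_phrase_head` and the punctuation set are assumptions made for illustration; the patent does not disclose its own code for this step.

```python
import re

# Assumed punctuation set; the source does not list one for this step.
PUNCTUATION = ",;:.、,;:。"


def locate_keywords(phrase, keywords):
    """Return (start, end, keyword) spans, keeping the longest of overlapping hits."""
    spans = []
    for keyword in keywords:
        for match in re.finditer(re.escape(keyword), phrase):
            spans.append((match.start(), match.end(), keyword))
    # Earliest start first; at the same start, the longer keyword first.
    spans.sort(key=lambda s: (s[0], -(s[1] - s[0])))
    kept = []
    for span in spans:
        if kept and span[0] < kept[-1][1]:
            # Overlaps a kept span, e.g. "mastery" inside "skilled mastery".
            continue
        kept.append(span)
    return kept


def is_phrase_head(phrase, start):
    """A keyword opens a new phrase only at the start or right after punctuation."""
    before = phrase[:start].rstrip()
    return not before or before[-1] in PUNCTUATION
```

With these two helpers, a keyword hit is kept as a cut point only when it survives the overlap filter and `is_phrase_head` returns true for its start position.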
Further, in one embodiment, the step of separating post responsibility short sentences from the keyword sentence list under the constraint of the post responsibility proper noun list, and performing phrase segmentation and combination on the post responsibility short sentences according to their sentence patterns to construct post responsibility entries, specifically includes:
separating the post responsibility short sentences from the keyword sentence list according to the post responsibility proper nouns in the post responsibility proper noun list;
traversing the separated post responsibility short sentences and determining the sentence pattern of each; if a post responsibility short sentence is a punctuation sentence pattern, taking the post responsibility proper noun in it as the first phrase and the original post responsibility short sentence as the first-level task word to construct the post responsibility entry;
if a post responsibility short sentence is a bracket sentence pattern, constructing the post responsibility entry according to the bracket sentence pattern processing logic.
The sentence pattern processing logic of the brackets specifically comprises:
if the post responsibility short sentence is a standard bracket sentence pattern, the matched first phrase is set as the post responsibility proper noun, and the phrases before and after the standard brackets are separated from the text inside the brackets; the phrases before and after the brackets are combined into a first-level task word to obtain the first-level post responsibility phrase, the text inside the brackets is split by punctuation marks into second-level task words to obtain the second-level post responsibility phrases, and the first-level and second-level post responsibility phrases are combined to construct the post responsibility entries;
if the post responsibility short sentence is a non-standard bracket sentence pattern, the matched first phrase is the post responsibility proper noun, the phrase before the first preset keyword is used as the first-level task word, the phrase after the first preset keyword is split by punctuation marks into second-level task words, and the first-level task word and the second-level task words are combined.
Specifically, in one embodiment, post responsibility entry construction separates entries from the elements of the keyword sentence list under the constraint of the post responsibility proper noun list. Each post responsibility entry has the structure: post responsibility proper noun: first-level term -> second-level term (optional). The post responsibility proper noun list is as follows: [ 'responsible', 'participating', 'dominant', 'evaluation and testing', 'periodic dominance', 'assistance organization', 'coordination and dominance', 'engage', 'analyze and track', 'insight', 'mine', 'track', 'explore' ];
the post responsibility entries are constructed as follows: post responsibility short sentences come in a bracket sentence pattern and a punctuation sentence pattern, and the bracket sentence pattern is further divided into a "()" pattern and a non-"()" pattern, i.e. a standard bracket sentence pattern and a non-standard bracket sentence pattern. Each sentence pattern is processed separately.
1) Standard bracket sentence pattern analysis: the analyzed short sentence is traversed, the matched first phrase is taken as the post responsibility proper noun, and the phrases before and after "()" and the information inside "()" are output. The phrases before and after "()" are separated by rule and then combined into a new phrase, which forms the first-level post responsibility phrase; the information inside "()" is split by punctuation marks into second-level post responsibility phrases.
Related example: original post responsibility short sentence: Responsible for novel software and hardware architecture design such as lightweight virtualization (similar industry products KataContainer, Firecracker, gVisor, etc.) scenario virtualization.
After separation and combination, the three resulting post responsibility entries are:
responsible for: novel software and hardware architecture design such as lightweight virtualization scenario virtualization -> similar industry product KataContainer;
responsible for: novel software and hardware architecture design such as lightweight virtualization scenario virtualization -> Firecracker;
responsible for: novel software and hardware architecture design such as lightweight virtualization scenario virtualization -> gVisor.
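A minimal Python sketch of the standard bracket rule follows. The function name `build_bracket_entries`, the separator set and the English sample phrase are assumptions for illustration; the sketch handles only a single bracket pair and does not strip trailing words such as "etc.".

```python
import re

# One bracket pair, ASCII or full-width; text before, inside and after it.
BRACKET = re.compile(r"^(.*?)[((](.*?)[))](.*)$")
# Assumed separator set for splitting the bracket content.
SEPARATORS = re.compile(r"[、,,;;/]")


def build_bracket_entries(phrase, head_word):
    """Build 'head: first-level -> second-level' entries from a bracket phrase."""
    body = phrase[len(head_word):] if phrase.startswith(head_word) else phrase
    body = body.strip()
    match = BRACKET.match(body)
    if match is None:
        # No brackets: the whole phrase is the first-level term.
        return [f"{head_word}: {body}"]
    before, inside, after = (part.strip() for part in match.groups())
    # Phrases before and after the brackets merge into the first-level term.
    first_level = " ".join(part for part in (before, after) if part)
    # The bracket content is split into second-level terms.
    second_level = [s.strip() for s in SEPARATORS.split(inside) if s.strip()]
    return [f"{head_word}: {first_level} -> {s}" for s in second_level]
```

Called on "responsible for lightweight virtualization (KataContainer, Firecracker, gVisor) architecture design" with head word "responsible for", the function yields one entry per product name, mirroring the three-entry result in the example above.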
2) Non-standard bracket sentence pattern analysis: the matched first phrase is the post responsibility proper noun. The proper noun and an identifier such as "including", "for example" or "such as" (i.e. the first preset keyword) are matched, and the phrases before and after the identifier are output. In this sentence pattern the phrase before the identifier is the first-level task word, and the phrase after the identifier is split by punctuation marks into second-level task words.
Example: original post responsibility short sentence: Participate in the ecological chain construction of the operating system, such as technical exploration, open source, industry certification and the like.
After separation and combination, the three resulting post responsibility entries are:
participating in the ecological chain construction of the operating system -> technical exploration;
participating in the ecological chain construction of the operating system -> open source;
participating in the ecological chain construction of the operating system -> industry certification.
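The non-standard bracket rule can be sketched the same way. The marker list, the separator set and the function name `build_marker_entries` are illustrative assumptions, with English markers standing in for the Chinese identifiers.

```python
import re

# Assumed English stand-ins for the first preset keywords ("including", "such as", ...).
MARKER = re.compile(r"\b(?:including|such as|for example)\b[::]?", re.IGNORECASE)
SEPARATORS = re.compile(r"[、,,;;/]")


def build_marker_entries(phrase, head_word):
    """Split a phrase at its first marker into first-level and second-level terms."""
    body = phrase[len(head_word):] if phrase.startswith(head_word) else phrase
    body = body.strip()
    match = MARKER.search(body)
    if match is None:
        return [f"{head_word}: {body}"]
    # Text before the marker is the first-level term.
    first_level = body[:match.start()].strip(" ,,")
    # Text after the marker is split into second-level terms.
    pieces = [s.strip(" .。") for s in SEPARATORS.split(body[match.end():])]
    return [f"{head_word}: {first_level} -> {s}" for s in pieces if s]
```

Only the first marker in a phrase is used as the cut point here; how the original method treats several markers in one phrase is not stated in the text.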
3) Punctuation sentence pattern analysis: i.e. sentence patterns divided by punctuation marks such as [、,;/].
For this sentence pattern the phrase content does not need to be split; the original phrase is the first-level task word and there is no second-level task word.
Related example:
Original text: Explore new technologies of microkernel, macrokernel and exokernel systems, and evolve them around business scenarios and hardware systems;
Result: explore new technologies of microkernel, macrokernel and exokernel systems, and evolve them around business scenarios and hardware systems.
Furthermore, the application also provides a post responsibility entry construction example as a whole, which is as follows:
(1) The input is a list of keywords and sentences as follows:
[ {'post responsibilities': 'Post responsibilities: corporate public cloud, private cloud and hybrid cloud business,'}, {'dominant': 'Leading/focusing on capability building and technical innovation in the operating system field,'}, {'post responsibilities': 'Post responsibilities include but are not limited to:'}, {'responsible': 'Responsible for the OS design of diverse computing chips such as CPU, DPU and GPU, realizing a unified standard for integrated Host-device heterogeneous deployment;'}, {'responsible': 'Responsible for the architecture design of the data-plane OS under diverse computing hardware, realizing computing power scheduling by load and achieving the best resource utilization and performance.'}, {'post requirements': 'Post requirements:'}, {'proficiency': 'Proficient in computer architecture,'}, {'proficiency': 'Proficient in operating system theory, with rich design and development experience in OS kernels such as Linux or Windows;'} ].
(2) The output is a post responsibility entry list as follows:
[ 'Leading/focusing on capability building and technical innovation in the operating system field', 'Responsible for the OS design of diverse computing chips such as CPU, DPU and GPU, realizing a unified standard for integrated Host-device heterogeneous deployment', 'Responsible for the architecture design of the data-plane OS under diverse computing hardware, realizing computing power scheduling by load and achieving the best resource utilization and performance' ].
Further, in one embodiment, the step of separating post requirement short sentences from the keyword sentence list under the constraint of the post requirement proper noun list, and performing phrase segmentation and combination on the post requirement short sentences according to their sentence patterns to construct post requirement entries, specifically includes:
separating the post requirement short sentences from the keyword sentence list according to the post requirement proper nouns in the post requirement proper noun list;
traversing the separated post requirement short sentences and determining the sentence pattern of each; if a post requirement short sentence is a punctuation sentence pattern, constructing the post requirement entry according to the punctuation sentence pattern processing logic of the post responsibility short sentences;
if a post requirement short sentence is a bracket sentence pattern, constructing the post requirement entry according to the bracket sentence pattern processing logic of the post responsibility short sentences;
if a post requirement short sentence is a double-"have" sentence pattern, constructing the post requirement entry according to the double-"have" sentence pattern processing logic.
The double-"have" sentence pattern processing logic specifically includes:
if the post requirement short sentence contains three phrases, constructing the post requirement entry according to a preset three-segment word rule;
if the post requirement short sentence contains two phrases, constructing the post requirement entry according to a preset two-segment word rule.
Specifically, in one embodiment, post requirement entry construction separates entries from the elements of the keyword sentence list under the constraint of the post requirement proper noun list. Each post requirement entry has the structure: post requirement proper noun: first-level term -> second-level term (optional). The post requirement proper noun list is as follows: [ 'knowledge', 'well-known', 'familiar', 'mastery', 'proficiency', 'possessing', 'have', 'understand', 'deep participate', 'deep understand', 'skilled use', 'skilled application', 'skilled reading', 'deep understanding', 'skilled mastering', 'deep understanding', 'deep participation or leading', 'deep participation' ].
The process of constructing post requirement entries according to the post requirement proper noun list is as follows:
(1) Post requirement short sentences come in a bracket sentence pattern, a double-"have" sentence pattern and a punctuation sentence pattern. The bracket sentence pattern is processed with the same logic as the bracket sentence pattern of the post responsibility short sentences.
(2) Double-"have" sentence pattern, for example: has experience and management ability. There are two kinds of key word tables for this pattern, a two-segment word table and a three-segment word table, structured as follows:
Two-segment word table: jbjy_2 = [ ['have', 'project experience'], ['possess', 'project experience'],
['have', 'project ability'], ['possess', 'project ability'],
['have', 'management experience'], ['possess', 'management experience'],
['have', 'management ability'], ['possess', 'management ability'] ]
Three-segment word table: jbjy_3 = [ ['have', 'channel resources', 'experience'], ['possess', 'channel resources', 'experience'] ]
The post requirement entries for the double-"have" sentence pattern are constructed as follows:
1) Three-segment word rule:
The three-segment word rule means that a short sentence matching the three-segment sentence pattern contains phrase 1, phrase 2 and phrase 3, which correspond respectively to the key words in the three-segment word table (for example, phrase 1 is 'have', phrase 2 is 'channel resources' and phrase 3 is 'experience'). The text between phrase 1 and phrase 2 and the text between phrase 2 and phrase 3 are the matched values, i.e. the words in the recruitment information that relate to the post requirement.
The matched value between phrase 1 and phrase 2 is split again by the punctuation rule [、,;//]{1}, and K post requirement second-level entries are output.
A complete post requirement entry consists of phrase 1 + phrase 3 + post requirement second-level entry. The corresponding three-segment word rule logic code is as follows: ptn=jndc_ +' (.?.
Example: Have channel resources such as securities investment and credit investment, and more than 3 years of work experience or sales experience in channel expansion related to banking or financial cloud business.
The post requirement entries built according to the three-segment word rule are:
1. have securities investment channel resources, and more than 3 years of work experience or sales experience in channel expansion related to banking or financial cloud business;
2. have credit investment channel resources, and more than 3 years of work experience or sales experience in channel expansion related to banking or financial cloud business.
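The three-segment rule can be approximated with a regular expression that captures the text between the three phrases. The sketch below is not a reconstruction of the truncated pattern in the source: the function name `apply_three_segment_rule`, the non-greedy captures and the English word order of the sample are all assumptions.

```python
import re

# Hypothetical English stand-in for the three-segment word table.
THREE_SEGMENT_TABLE = [["have", "channel resources", "experience"]]
SEPARATORS = re.compile(r"[、,,;;/]")


def apply_three_segment_rule(phrase, table=THREE_SEGMENT_TABLE):
    """Return one entry per item found between phrase 1 and phrase 2."""
    for p1, p2, p3 in table:
        pattern = re.compile(
            re.escape(p1) + r"(.*?)" + re.escape(p2) + r"(.*?)" + re.escape(p3)
        )
        match = pattern.search(phrase)
        if match is None:
            continue
        # The value between phrase 1 and phrase 2 is split into K items.
        items = [s.strip() for s in SEPARATORS.split(match.group(1)) if s.strip()]
        # The value between phrase 2 and phrase 3 is shared by every entry.
        shared = match.group(2)
        return [f"{p1} {item} {p2}{shared}{p3}" for item in items]
    return []
```

Each item between phrase 1 and phrase 2 produces its own entry, while the text between phrase 2 and phrase 3 is repeated in all of them, as in the two entries of the example above.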
2) Two-segment word rule:
The two-segment word rule means that a short sentence matching the two-segment sentence pattern contains phrase 1 and phrase 2, and the matched value between phrase 1 and phrase 2 is output.
The matched value between phrase 1 and phrase 2 is split again by the punctuation rule [、,;//]{1}, and K post requirement entries are output.
A complete post requirement entry consists of phrase 1 + post requirement entry. The corresponding two-segment word rule logic code is as follows: ptn=jndc++' (..
Example: Have good team cooperation spirit, strong communication ability and research ability.
After the short sentence is processed according to the two-segment word rule, the constructed post requirement entries are:
1. have good team cooperation spirit;
2. have strong communication ability and research ability.
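A comparable sketch for the two-segment rule is given below. It assumes that phrase 2 stays attached to the end of the matched value and that the match is greedy, so the last occurrence of phrase 2 closes it; neither detail is confirmed by the truncated pattern in the source, and the table and function name are hypothetical.

```python
import re

# Hypothetical English stand-in for the two-segment word table.
TWO_SEGMENT_TABLE = [["have", "ability"], ["have", "experience"]]
SEPARATORS = re.compile(r"[、,,;;/]")


def apply_two_segment_rule(phrase, table=TWO_SEGMENT_TABLE):
    """Return one 'phrase 1 + item' entry per punctuation-separated item."""
    for head, tail in table:
        # Greedy match: the value runs up to the last occurrence of phrase 2.
        match = re.search(re.escape(head) + r"(.*)" + re.escape(tail), phrase)
        if match is None:
            continue
        value = match.group(1) + tail
        items = [s.strip() for s in SEPARATORS.split(value) if s.strip()]
        return [f"{head} {item}" for item in items]
    return []
```

The first table row that matches decides the result, so the order of rows in the table matters in this sketch.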
(3) Punctuation sentence pattern analysis: short sentences divided by punctuation marks such as [、,;//] are split at those marks, and K post requirement entries are output. Each post requirement entry is formed from the post requirement proper noun and a split post requirement phrase. For the specific phrase segmentation process, refer to the punctuation sentence pattern processing logic in the construction of post responsibility entries. The matching rule code for the punctuation sentence pattern is as follows:
ptn = '[、,;//]{1}'
jn = re.split(r'[、,,;]{1}', jndy_t[0])
in a specific practical process, the application also provides a post requirement entry construction example as a whole, and the concrete steps are as follows:
(1) The input value is a list of keywords and sentences as follows:
[ {'work responsibilities': 'Work responsibilities:'}, {'responsible': 'Responsible for the development of Sync Assistant/Tencent Album Manager iOS, LemonMac, etc.;'}, {'undertake': 'Undertake requirement analysis, solution design and development, performance tuning, and key/difficult technical problems of the iOS client;'}, {'responsible': 'Responsible for architecture design at the system and module level;'}, {'undertake': 'Undertake and promote technology sharing among team members.'}, {'work requirements': 'Work requirements:'}, {'master': 'Master the use of iOS development tools and testing tools;'}, {'familiar': 'Familiar with object-oriented programming ideas and design patterns,'}, {'have': 'have certain architecture design ability;'}, {'have': 'have good problem analysis and solving ability;'}, {'have': 'have good team cooperation spirit, strong communication ability and research ability;'} ].
(2) The output value is a list of post requirement entries as follows:
[ 'master the use of iOS development tools and testing tools', 'familiar with object-oriented programming ideas and design patterns', 'have certain architecture design ability', 'have good problem analysis and solving ability', 'have good team cooperation spirit', 'have strong communication ability and research ability' ].
Further, in one embodiment, the sentence pattern of the phrase may be analyzed during the term construction process by constructing a sentence pattern identification table. The sentence pattern identification table is composed of key word groups capable of expressing a sentence pattern, and one word group is a member in the list.
Sentence members include: 'the following (including:' e.g., 'A', 'B', 'A includes', 'A such as', 'A'.
The members of the have/possess sentence pattern include:
Three-segment sentence members: ['have', 'channel resources', 'experience'], ['possess', 'channel resources', 'experience'].
Two-segment sentence members: ['have', 'project experience'], ['possess', 'project experience'],
['have', 'project ability'], ['possess', 'project ability'],
['have', 'management experience'], ['possess', 'management experience'],
['have', 'management ability'], ['possess', 'management ability'].
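One way to use such an identification table is a small classifier that decides which processing logic a short sentence should go through. The sketch below is an assumption-laden illustration: the function names, the order of the checks and the English table members are not taken from the patent.

```python
import re

# Hypothetical English stand-ins for the sentence pattern identification table.
BRACKET = re.compile(r"[((].*?[))]")
MARKERS = ("including", "such as", "for example")
THREE_SEGMENT = [("have", "channel resources", "experience")]
TWO_SEGMENT = [("have", "project experience"), ("have", "management ability")]


def contains_in_order(phrase, parts):
    """True if every part occurs in the phrase, each after the previous one."""
    position = 0
    for part in parts:
        index = phrase.find(part, position)
        if index < 0:
            return False
        position = index + len(part)
    return True


def classify_sentence_pattern(phrase):
    """Return the name of the sentence pattern that the phrase matches first."""
    if BRACKET.search(phrase):
        return "standard_bracket"
    if any(marker in phrase for marker in MARKERS):
        return "nonstandard_bracket"
    if any(contains_in_order(phrase, row) for row in THREE_SEGMENT):
        return "three_segment"
    if any(contains_in_order(phrase, row) for row in TWO_SEGMENT):
        return "two_segment"
    # Anything else is handled by plain punctuation splitting.
    return "punctuation"
```

Checking the three-segment rows before the two-segment rows keeps a sentence that contains all three phrases from being claimed by a shorter rule.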
Further, in one embodiment, repairing the post responsibility entries and the post requirement entries according to the entry repair logic and determining the post entries corresponding to the recruitment information specifically includes:
performing punctuation analysis on the post responsibility entries and the post requirement entries respectively, and removing invalid punctuation marks before and after them, such as commas, periods, colons and semicolons; the set can be configured according to actual needs.
Deleting invalid character strings before and after the post responsibility entries and the post requirement entries, including but not limited to words such as "parallel", "add items" and "add capabilities"; the set can be configured according to actual business requirements.
Performing word function supplementation on the post responsibility entries and the post requirement entries respectively, and determining the post entries corresponding to the recruitment information. Entry repair judges the completeness of an entry according to preset rules and, based on the result, restores its completeness using word functions.
Further, in one embodiment, an entry repair table may also be constructed to facilitate the repair of entries. The entry repair table consists of the punctuation marks to be deleted before and after an entry, an entry tail supplement table, an adjective list and other deletable vocabulary.
Punctuation marks to be deleted before and after an entry include: ',', ';', '.', ':', '+', '-', '·', '\' and the like.
Entry tail supplement dictionary: { 'have deep': 'knowledge' }.
Adjective list: 'rich', 'and ability', 'certain', 'large', 'excellent', 'stronger', 'common', 'sharp', 'intense', 'excellent', 'complex', 'rich', 'mainstream', 'good', 'active', 'excellent condition', 'communication coordination', 'no violation', 'occupational moral', 'active participation', 'parallel', 'like'.
Other deletable vocabulary: 'at least one', 'more than three', 'at least 2', 'one or more', '1 to 2', 'and', 'a person recording', 'a person priority recording', 'a candidate priority', 'one or more gates', 'a score', 'a priority', 'a person, a person priority'.
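The trimming part of entry repair can be sketched as follows. The punctuation set and the deletable word list are small hypothetical samples rather than the full tables, '+' and '-' are deliberately left out of the set so that a term such as "C++" at the end of an entry is not damaged, and tail supplementation is not covered.

```python
# Assumed sample of edge punctuation; the real table is configurable.
EDGE_PUNCTUATION = ",.;:、,。;:· "
# Assumed sample of deletable words at the start or end of an entry.
DELETABLE_WORDS = ["at least one", "is preferred", "preferred"]


def repair_entry(entry):
    """Strip invalid punctuation and deletable words from both ends of an entry."""
    cleaned = entry.strip(EDGE_PUNCTUATION)
    changed = True
    while changed:
        # Repeat until a full pass removes nothing more.
        changed = False
        for word in DELETABLE_WORDS:
            if cleaned.startswith(word):
                cleaned = cleaned[len(word):].strip(EDGE_PUNCTUATION)
                changed = True
            if cleaned.endswith(word):
                cleaned = cleaned[:-len(word)].strip(EDGE_PUNCTUATION)
                changed = True
    return cleaned
```

The loop repeats because removing one word can expose further punctuation or another deletable word at the same end of the entry.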
The realization method provided by the application supports the expansion of all word lists, and the word list members are derived from recruitment information texts.
Further, in other embodiments of the present application, the term repairing process may also be performed during the building process of the post responsibility term and the post requirement term, which may further simplify the processing flow of the method.
According to the application, all recruitment information in the recruitment information list is segmented with the keywords in the preset keyword list as phrase heads, key sentences are obtained and the keyword sentence list is constructed, so that the keywords in the recruitment information are extracted effectively and keyword extraction accuracy is improved. Meanwhile, the post requirement short sentences and the post responsibility short sentences are separated from the keyword sentence list and subjected to phrase segmentation and combination to construct the post requirement entries and the post responsibility entries, which ensures keyword extraction quality, reduces the workload of recruitment data analysis and meets the business needs of actual recruitment.
Compared with existing recruitment information keyword extraction technology, the method and the device remain applicable to information mining and analysis of new recruitment vocabulary while guaranteeing keyword extraction quality, and solve the problems that existing keyword extraction technology struggles to meet business requirements and that custom dictionary technology cannot exhaust post recruitment word segmentation information. They can provide technical support and subsequent data analysis for the talent training plans of institutions such as schools and enterprises and for the adjustment of course outlines and teaching tasks, and support personalized learning guidance according to industry requirements, enabling autonomous learning and lifelong learning for industry talent.
It is to be understood that the same or similar parts of the above embodiments may be cross-referenced, and that content not described in detail in one embodiment may be found in the same or similar parts of other embodiments.
In the description of the present specification, a description referring to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present application. While embodiments of the present application have been shown and described above, it will be understood that the above embodiments are illustrative and are not to be construed as limiting the application, and that changes, modifications, substitutions and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the application.
Claims (4)
1. The post entry construction method based on recruitment information is characterized by comprising the following steps:
collecting recruitment information, dividing and cleaning the recruitment information according to a first preset serial number list rule, and constructing a recruitment information list;
taking the keywords in the preset keyword list as word heads, performing short sentence segmentation on all recruitment information in the recruitment information list, obtaining keyword sentences and constructing a keyword sentence list; the preset keyword list specifically comprises a post description subject term list, a post responsibility proper noun list and a post requirement proper noun list;
taking the post responsibility proper noun list as a constraint, separating post responsibility short sentences from the keyword sentence list, and carrying out phrase segmentation and combination on each post responsibility short sentence according to the sentence pattern of the post responsibility short sentence to construct a post responsibility entry; the method specifically comprises the following steps: separating the post responsibility short sentences from the keyword sentence list according to the post responsibility proper nouns in the post responsibility proper noun list; traversing the separated post responsibility short sentences and judging the sentence pattern of each post responsibility short sentence; if the post responsibility short sentence is a punctuation mark sentence pattern, taking the post responsibility proper noun in the post responsibility short sentence as a first phrase, and constructing a post responsibility entry by taking the original post responsibility short sentence as a first-level task word; if the post responsibility short sentence is a bracket sentence pattern, constructing a post responsibility entry according to the bracket sentence pattern processing logic;
taking the post requirement proper noun list as a constraint, separating post requirement short sentences from the keyword sentence list, and carrying out phrase segmentation and combination on each post requirement short sentence according to the sentence pattern of the post requirement short sentence to construct a post requirement entry; the method specifically comprises the following steps: separating the post requirement short sentences from the keyword sentence list according to the post requirement proper nouns in the post requirement proper noun list; traversing the separated post requirement short sentences and judging the sentence pattern of each post requirement short sentence; if the post requirement short sentence is a punctuation mark sentence pattern, constructing a post requirement entry according to the punctuation mark sentence pattern processing logic of the post responsibility short sentence; if the post requirement short sentence is a bracket sentence pattern, constructing a post requirement entry according to the bracket sentence pattern processing logic of the post responsibility short sentence; if the post requirement short sentence is a double sentence pattern, constructing a post requirement entry according to the double sentence pattern processing logic;
repairing the post responsibility entry and the post requirement entry according to entry repair logic, and determining the post entry corresponding to the recruitment information;
the bracket sentence pattern processing logic specifically comprises: if the sentence pattern of the post responsibility short sentence is a standard bracket sentence pattern, setting the first phrase as the post responsibility proper noun, separating the phrases before and after the standard brackets from the text information inside the standard brackets, combining the phrases before and after the standard brackets into a first-level task word to obtain a first-level post responsibility phrase, dividing the text information inside the standard brackets into second-level task words according to punctuation marks to obtain second-level post responsibility phrases, and combining the first-level post responsibility phrase and the second-level post responsibility phrases to construct a post responsibility entry; if the sentence pattern of the post responsibility short sentence is a non-standard bracket sentence pattern, matching the first phrase as the post responsibility proper noun, taking the phrase before the first preset keyword as a first-level task word, dividing the phrase after the first preset keyword into second-level task words according to punctuation marks, and combining the first-level task word and the second-level task words; the first preset keywords specifically comprise "for example", "such as" and "as follows";
the double sentence pattern processing logic specifically comprises: if the post requirement short sentence contains three phrases, constructing a post requirement entry according to a preset three-section word rule; if the post requirement short sentence contains two phrases, constructing a post requirement entry according to a preset two-section word rule.
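The bracket sentence pattern processing logic of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `build_entries`, the tuple layout `(proper noun, first-level task word, second-level task word)`, the punctuation set used for splitting, and the use of space-delimited English text are all assumptions made for illustration. Only the three cue phrases come from the claim.

```python
import re

# Standard bracket pattern: text before, inside and after one pair of brackets
# (ASCII or full-width). Hypothetical pattern, not taken from the patent.
BRACKET = re.compile(r"^(?P<before>[^(（]*)[(（](?P<inside>[^)）]*)[)）](?P<after>.*)$")
# Punctuation assumed to separate second-level task words.
SPLIT_PUNCT = re.compile(r"[,，、;；/]")
# The "first preset keywords" named in claim 1.
CUE_WORDS = ("for example", "such as", "as follows")


def build_entries(noun, clause):
    """Return (noun, first-level task word, second-level task word) tuples."""
    match = BRACKET.match(clause)
    if match:
        # Standard bracket pattern: join the text around the brackets.
        first = " ".join((match["before"] + " " + match["after"]).split())
        detail = match["inside"]
    else:
        for cue in CUE_WORDS:
            head, sep, tail = clause.partition(cue)
            if sep:
                # Non-standard bracket pattern: split at the cue phrase.
                first, detail = head.strip(" ,:："), tail
                break
        else:
            # Punctuation pattern: the whole clause is the first-level task word.
            return [(noun, clause.strip(), None)]
    second = [s.strip(" .:：") for s in SPLIT_PUNCT.split(detail)]
    return [(noun, first, s) for s in second if s]


entries = build_entries(
    "develop", "develop backend services (Java, Go) for the trading platform"
)
```

With the sample clause, the text around the brackets becomes the first-level task word and each item inside the brackets becomes a second-level task word attached to it.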
2. The recruitment information-based post entry construction method of claim 1, wherein the steps of collecting recruitment information and dividing and cleaning the recruitment information according to a first preset serial number list rule, and constructing a recruitment information list comprise the following steps:
presetting a plurality of serial number list rules, and sequentially linking the serial number list rules to form a complete regular expression, so as to obtain the first preset serial number list rule;
collecting recruitment information, identifying the serial number format in the recruitment information according to the first preset serial number list rule, performing data segmentation on the recruitment information text according to the order of the identified serial numbers, and converting text modes of the recruitment information corresponding to each serial number into regular expressions one by one to form the recruitment information list.
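As a rough illustration of claim 2, the sketch below links several serial number rules into one regular expression and splits a recruitment text at each recognized serial number. The individual rules in `SERIAL_RULES` and the function name `split_recruitment_text` are hypothetical, since the claim does not enumerate the rules. A rule such as `\d+[.、)）]` would also split inside a decimal like "3.5", so a real rule set would need to be stricter.

```python
import re

# Hypothetical serial number rules; the patent text does not list them.
SERIAL_RULES = [
    r"\d+[.、)）]",      # 1.  2、  3)
    r"[(（]\d+[)）]",    # (1)  （2）
    r"[①-⑩]",           # circled numerals
]

# Link the individual rules into one complete regular expression.
SERIAL_PATTERN = re.compile("|".join(f"(?:{rule})" for rule in SERIAL_RULES))


def split_recruitment_text(text):
    """Split the text at each serial number and drop empty fragments."""
    parts = SERIAL_PATTERN.split(text)
    return [part.strip() for part in parts if part.strip()]


items = split_recruitment_text(
    "1. Responsible for backend development; "
    "2. Write unit tests; "
    "(3) Bachelor degree or above"
)
```

Each element of `items` corresponds to one numbered item of the original posting and would become one row of the recruitment information list.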
3. The recruitment information-based post entry construction method according to claim 1, wherein the step of taking the keywords in the preset keyword list as word heads, performing short sentence segmentation on all recruitment information in the recruitment information list, obtaining keyword sentences and constructing a keyword sentence list comprises:
predefining a post description subject term list, a post responsibility proper noun list and a post requirement proper noun list, taking the post description subject terms in the post description subject term list, the post responsibility proper nouns in the post responsibility proper noun list and the post requirement proper nouns in the post requirement proper noun list as the word heads of short sentences, and carrying out short sentence segmentation on all recruitment information in the recruitment information list to respectively obtain post description short sentences, post responsibility short sentences and post requirement short sentences;
performing subject processing on the post responsibility short sentences and the post requirement short sentences respectively, so that each post responsibility short sentence or post requirement short sentence contains only one post responsibility proper noun or post requirement proper noun;
traversing the post responsibility short sentences, determining the starting sequence number and the ending sequence number of the post responsibility proper noun in each post responsibility short sentence, and constructing a starting sequence number group of the preceding and following keywords in the post responsibility short sentence; traversing the post requirement short sentences, determining the starting sequence number and the ending sequence number of the post requirement proper noun in each post requirement short sentence, and constructing the starting sequence number and the ending sequence number of the preceding and following keywords in the post requirement short sentence;
respectively carrying out keyword filtering processing on the post description short sentences, the post responsibility short sentences and the post requirement short sentences according to preset keyword filtering logic to obtain the keyword sentences of the recruitment information;
based on the keyword sentences, forming the keyword sentence list as key-value pairs, with the keywords as keys and the keyword sentences as values.
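The word-head segmentation and key-value construction of claim 3 can be sketched as follows. This is a simplified illustration under stated assumptions: the keyword list `KEYWORDS` is invented, the function names are hypothetical, a single list stands in for the three word lists, and the subject processing and keyword filtering steps are omitted.

```python
import re

# Hypothetical keyword list standing in for the three preset word lists.
KEYWORDS = ["responsible for", "familiar with", "proficient in"]


def segment_by_word_head(text, keywords):
    """Cut the text so that every short sentence starts with a keyword.

    Returns (keyword, short sentence, start index, end index) tuples, where
    the indices locate the keyword inside the original text.
    """
    # Longest keywords first so a longer keyword wins over its own prefix.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    matches = list(pattern.finditer(text))
    segments = []
    for i, m in enumerate(matches):
        stop = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        clause = text[m.start():stop].strip(" ,;.，；。")
        segments.append((m.group(0), clause, m.start(), m.end()))
    return segments


def build_keyword_sentence_list(text, keywords):
    """Group the short sentences as key-value pairs: keyword -> sentences."""
    table = {}
    for keyword, clause, _, _ in segment_by_word_head(text, keywords):
        table.setdefault(keyword, []).append(clause)
    return table


sample = "responsible for API design, familiar with MySQL and Redis; proficient in Python."
table = build_keyword_sentence_list(sample, KEYWORDS)
```

Storing a list of sentences per keyword is a design choice of this sketch, made so that a keyword occurring several times in one posting does not overwrite earlier sentences.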
4. The recruitment information-based post entry construction method according to claim 1, wherein the step of repairing the post responsibility entry and the post requirement entry according to the entry repair logic, and determining the post entry corresponding to the recruitment information, comprises:
performing punctuation mark analysis on the post responsibility entry and the post requirement entry respectively, and removing invalid punctuation marks before and after the post responsibility entry and the post requirement entry;
deleting invalid character strings before and after the post responsibility entry and the post requirement entry;
and respectively supplementing word functions of the post responsibility entry and the post requirement entry, and determining the post entry corresponding to the recruitment information.
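The first two repair steps of claim 4 (removing invalid leading and trailing punctuation, then invalid leading and trailing strings) might look like the sketch below. The character set `EDGE_CHARS` and the strings in `INVALID_EDGE_STRINGS` are assumptions, as the claim does not define what counts as invalid. The third step, supplementing word functions, is not specified in enough detail to sketch and is left out. `str.removeprefix` and `str.removesuffix` require Python 3.9 or later.

```python
# Hypothetical set of punctuation and whitespace treated as invalid at the edges.
EDGE_CHARS = " \t,，.。;；:：、!！?？"
# Hypothetical filler strings treated as invalid at the edges of an entry.
INVALID_EDGE_STRINGS = ("etc", "and so on")


def repair_entry(entry):
    """Strip invalid punctuation and filler strings from both ends of an entry."""
    cleaned = entry.strip(EDGE_CHARS)
    previous = None
    # Repeat until nothing changes, since removing a filler string can expose
    # more punctuation (and vice versa).
    while cleaned != previous:
        previous = cleaned
        for junk in INVALID_EDGE_STRINGS:
            cleaned = cleaned.removeprefix(junk).removesuffix(junk).strip(EDGE_CHARS)
    return cleaned


repaired = repair_entry("、 familiar with Linux, etc.。")
```

The plain prefix and suffix checks do not look at word boundaries, so a production version would need a stricter match to avoid trimming a legitimate word that happens to end in one of the filler strings.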
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310680645.3A CN116402046B (en) | 2023-06-09 | 2023-06-09 | Post entry construction method based on recruitment information |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310680645.3A CN116402046B (en) | 2023-06-09 | 2023-06-09 | Post entry construction method based on recruitment information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116402046A (en) | 2023-07-07 |
CN116402046B (en) | 2023-08-18 |
Family
ID=87020301
Family Applications (1)
Application Number | Status | Priority Date | Filing Date | Title |
---|---|---|---|---|
CN202310680645.3A CN116402046B (en) | Active | 2023-06-09 | 2023-06-09 | Post entry construction method based on recruitment information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116402046B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0602955A2 (en) * | 1992-12-17 | 1994-06-22 | Xerox Corporation | Text recognition |
JP2007183796A (en) * | 2006-01-06 | 2007-07-19 | Pma:Kk | Business evaluation value calculation system |
CN111241361A (en) * | 2020-01-09 | 2020-06-05 | 福州数据技术研究院有限公司 | Intelligent referral system and method for enterprises and colleges based on cloud platform |
CN111428469A (en) * | 2020-02-27 | 2020-07-17 | 宋继华 | Sentence pattern structure diagram analysis oriented interactive labeling method and system |
CN112486919A (en) * | 2020-11-13 | 2021-03-12 | 北京北大千方科技有限公司 | Document management method, system and storage medium |
CN114138979A (en) * | 2021-10-29 | 2022-03-04 | 中南民族大学 | Cultural relic safety knowledge map creation method based on word expansion unsupervised text classification |
CN114817450A (en) * | 2022-05-18 | 2022-07-29 | 珠海金山办公软件有限公司 | Keyword recognition method, device, equipment and medium |
CN115186050A (en) * | 2022-09-08 | 2022-10-14 | 粤港澳大湾区数字经济研究院(福田) | Method, system and related equipment for recommending selected questions based on natural language processing |
Non-Patent Citations (1)
Title |
---|
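Design and Implementation of an Internet Recruitment Data Analysis and Visualization System; Tian Shuli (田书丽); China Master's Theses Full-text Database, Social Sciences II (No. 12); H126-19 * |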
Also Published As
Publication number | Publication date |
---|---|
CN116402046A (en) | 2023-07-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110807328B (en) | Named entity identification method and system for legal document multi-strategy fusion | |
Yang et al. | Adversarial learning for chinese ner from crowd annotations | |
WO2020010834A1 (en) | Faq question and answer library generalization method, apparatus, and device | |
CN107004000A (en) | A kind of language material generating means and method | |
Firmani et al. | Towards Knowledge Discovery from the Vatican Secret Archives. In Codice Ratio-Episode 1: Machine Transcription of the Manuscripts. | |
Kumar et al. | IIT-TUDA: System for sentiment analysis in Indian languages using lexical acquisition | |
CN103729402A (en) | Method for establishing mapping knowledge domain based on book catalogue | |
CN114417851B (en) | Emotion analysis method based on keyword weighted information | |
Kanan et al. | Improving arabic text classification using p-stemmer | |
CN111563372B (en) | Typesetting document content self-duplication checking method based on teaching book publishing | |
Tarride et al. | Large-scale genealogical information extraction from handwritten Quebec parish records | |
CN113377916A (en) | Extraction method of main relations in multiple relations facing legal text | |
Thorvaldsen et al. | A tale of two transcriptions. Machine-assisted transcription of historical sources | |
CN116821351A (en) | Span information-based end-to-end power knowledge graph relation extraction method | |
CN112257442A (en) | Policy document information extraction method based on corpus expansion neural network | |
CN116402046B (en) | Post entry construction method based on recruitment information | |
CN110347812A (en) | A kind of search ordering method and system towards judicial style | |
Xu et al. | Exploiting lists of names for named entity identification of financial institutions from unstructured documents | |
Paju et al. | Towards an ontology and epistemology of text reuse | |
Ammirati et al. | In codice ratio: Machine transcription of medieval manuscripts | |
CN117473971A (en) | Automatic generation method and system for bidding documents based on purchasing text library | |
Bhoir et al. | Resume Parser using hybrid approach to enhance the efficiency of Automated Recruitment Processes | |
Dikow et al. | Developing responsible AI practices at the Smithsonian Institution | |
Maheswari et al. | Rule based morphological variation removable stemming algorithm | |
Hamza et al. | Text mining: A survey of Arabic root extraction algorithms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||