Disclosure of Invention
In view of the above, the invention provides a question bank generation method, a server and a computer-readable storage medium, which can automatically generate industry questions from the description information of a recruitment position and store them by industry classification, thereby saving manpower.
Firstly, in order to achieve the above object, the present invention provides a question bank generating method, which is applied to a server, and the method includes:
acquiring the description information of the recruitment position of the recruitment website;
converting the job description information into a factual sentence with complete information;
generating an industry question from the factual sentence according to a question generation template;
and storing the industry questions according to industry classification to generate question banks for different industries.
Optionally, the step of converting the job description information into a factual sentence with complete information specifically includes the following steps:
initially treating each word of the job description information as its own category;
merging the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, thereby separating the job description information into several factual sentences with complete information.
Optionally, the step of generating an industry question according to the fact sentence by the question generation template specifically includes:
decomposing the fact sentence into a tree-shaped grammar structure so as to obtain a syntax tree of the fact sentence;
parsing the syntax tree to obtain a subject part of the question;
retrieving a question generation template pre-stored by the server;
and matching the subject part of the question to the question generation template to generate an industry question.
Optionally, the step of retrieving a question generation template pre-stored by the server, matching the subject part of the question, and generating an industry question further includes the following steps:
constructing a deep-learning text classification model trained on labeled samples;
checking, by using the text classification model, whether the industry question is fluent;
and if the sentence of the industry question is not fluent, adjusting the industry question by using the text classification model.
Optionally, the step of storing the industry questions according to industry categories to generate question banks of different industries specifically includes the following steps:
constructing a term frequency-inverse document frequency (TFIDF) model;
checking, by using the TFIDF model, whether a question similar to the industry question already exists in the question bank;
and if a similar industry question exists, deleting the currently generated industry question.
Optionally, the question bank generating method further includes the following steps:
extracting keywords of the job description;
expanding the keywords into related words;
and generating an industry question according to the related words.
In addition, in order to achieve the above object, the present invention further provides a server, where the server includes a memory and a processor, the memory stores a question bank generating system operable on the processor, and the question bank generating system implements the following steps when executed by the processor:
acquiring the description information of the recruitment position of the recruitment website;
converting the job description information into a factual sentence with complete information;
generating an industry question from the factual sentence according to a question generation template;
and storing the industry questions according to industry classification to generate question banks for different industries.
Optionally, the step of converting the job description information into a factual sentence with complete information specifically includes the following steps:
initially treating each word of the job description information as its own category;
merging the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, thereby separating the job description information into several factual sentences with complete information.
Optionally, the step of generating an industry question according to the fact sentence by the question generation template specifically includes:
decomposing the fact sentence into a tree-shaped grammar structure so as to obtain a syntax tree of the fact sentence;
parsing the syntax tree to obtain a subject part of the question;
retrieving a question generation template pre-stored by the server;
and matching the subject part of the question to the question generation template to generate an industry question.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a question bank generating system, which is executable by at least one processor, so as to make the at least one processor execute the steps of the question bank generating method.
Compared with the prior art, the server, the question bank generating method and the computer-readable storage medium provided by the invention first obtain the description information of the recruitment position from the recruitment website; then convert the job description information into a factual sentence with complete information; then generate an industry question from the factual sentence according to a question generation template; and finally store the industry questions according to industry classification to generate question banks for different industries. This avoids the need to manually prepare a question bank for interviews, written tests or online tests according to the recruitment position: industry questions are generated automatically from the description information of the recruitment position and stored by industry classification, which saves labor.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Fig. 1 is a schematic diagram of an alternative hardware architecture of the server 2 according to the present invention.
In this embodiment, the server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 1 only shows the server 2 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 11 includes at least one type of readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the server 2, such as a hard disk or a memory of the server 2. In other embodiments, the memory 11 may also be an external storage device of the server 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the server 2. Of course, the memory 11 may also include both an internal storage unit of the server 2 and an external storage device thereof. In this embodiment, the memory 11 is generally used for storing an operating system installed in the server 2 and various types of application software, such as the program code of the question bank generating system 200. Furthermore, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the server 2. In this embodiment, the processor 12 is configured to run the program codes stored in the memory 11 or process data, for example, run the question bank generating system 200.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is generally used for establishing communication connection between the server 2 and other electronic devices.
The application environment and the hardware structure and function of the related devices of the various embodiments of the present invention have been described in detail so far. Hereinafter, various embodiments of the present invention will be proposed based on the above-described application environment and related devices.
First, the present invention provides a question bank generating system 200.
Referring to fig. 2, a block diagram of a first embodiment of the question bank generating system 200 of the present invention is shown.
In this embodiment, the question bank generating system 200 includes a series of computer program instructions stored in the memory 11, and when the computer program instructions are executed by the processor 12, the question bank generating operation according to the embodiments of the present invention can be implemented. In some embodiments, the question bank generating system 200 may be divided into one or more modules based on the particular operations implemented by the various portions of the computer program instructions. For example, in fig. 2, the question bank generating system 200 may be divided into an acquisition module 201, a conversion module 202, a generating module 203, and a storage module 204. Wherein:
the acquisition module 201 is configured to obtain description information of the recruitment position of the recruitment website.
Generally, professional questions for interviews or written tests are drawn up manually by the recruiter according to the information of each position. However, when the requirements of a position change or a new position is added, the questions must be drawn up again and reprinted in the case of a written test, or drawn up again and the test system updated in the case of an online test, which causes great inconvenience to the recruiter.
Therefore, in this embodiment, the server 2 logs in the recruitment website, and the server 2 acquires the job description information of the recruitment website through the acquisition module 201.
The conversion module 202 is configured to convert the job description information into a factual sentence with complete information.
Specifically, the server 2 analyzes the job description information, extracts keywords of the job description information, and separates the job description information into several fact sentences having complete information.
In the present embodiment, the server 2 uses reference resolution to associate the words in the job description information that refer to the same entity. Each word is first treated as its own category; the words most likely to refer to the same entity are then merged according to the degree of closeness among the words; the words are clustered by successive aggregation; and the job description information is thereby separated into several fact sentences with complete information. For example, the description of a certain position reads: "Understand common statistical modeling, data mining and machine learning methods, and be able to apply them to build models that solve practical problems and develop data products; be familiar with software such as SAS, SPSSModeller, R and Python; be proficient in Oracle, SQLServer, Mysql, etc." Reference resolution determines that "them" in the job description refers to "statistical modeling, data mining and machine learning methods". After the words referring to the same entity have been associated, four facts are derived from the job description information: (1) Understand common statistical modeling, data mining and machine learning methods. (2) Apply common statistical modeling, data mining and machine learning methods to build models that solve practical problems and develop data products. (3) Be familiar with software such as SAS, SPSSModeller, R and Python. (4) Be proficient in Oracle, SQLServer, Mysql, etc.
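The successive-aggregation step can be illustrated with a toy agglomerative clustering of mentions. This is only a sketch under stated assumptions: the specification does not say how "closeness" is measured, so the `closeness` function below (a pronoun is close to a nearby preceding noun phrase, otherwise word overlap) and the names `PRONOUNS`, `cluster_mentions` and `threshold` are hypothetical.

```python
from itertools import combinations

PRONOUNS = {"them", "they", "it"}  # assumed pronoun list, for illustration only


def closeness(a, b):
    """Hypothetical closeness score between two mentions given as (position, text)."""
    (pos_a, text_a), (pos_b, text_b) = a, b
    # Normalise so that, if exactly one mention is a pronoun, it is `b`.
    if text_a in PRONOUNS and text_b not in PRONOUNS:
        (pos_a, text_a), (pos_b, text_b) = (pos_b, text_b), (pos_a, text_a)
    if text_b in PRONOUNS and text_a not in PRONOUNS:
        # Assumption: a pronoun refers back to a nearby *preceding* noun phrase.
        return 1.0 / (pos_b - pos_a) if pos_a < pos_b else 0.0
    # Otherwise fall back to word overlap (Jaccard similarity).
    words_a, words_b = set(text_a.split()), set(text_b.split())
    return len(words_a & words_b) / (len(words_a | words_b) or 1)


def cluster_mentions(mentions, threshold=0.4):
    """Start with one cluster per mention and repeatedly merge the closest pair."""
    clusters = [[m] for m in mentions]
    while len(clusters) > 1:
        # Single-link: two clusters are as close as their closest pair of mentions.
        score, i, j = max(
            (max(closeness(a, b) for a in clusters[i] for b in clusters[j]), i, j)
            for i, j in combinations(range(len(clusters)), 2)
        )
        if score < threshold:
            break  # nothing left that is close enough to merge
        clusters[i].extend(clusters.pop(j))
    return clusters


mentions = [(0, "machine learning methods"), (1, "them"),
            (2, "SAS software"), (3, "Oracle")]
clusters = cluster_mentions(mentions)
```

In this toy input the pronoun "them" is merged with the preceding noun phrase, while the unrelated mentions stay in their own clusters; a real system would use a trained coreference model rather than these hand-written scores.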
The generating module 203 is configured to generate an industry question from the fact sentence according to a question generating template.
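Specifically, a question generation template is pre-stored in the memory of the server 2. Taking the fact sentence "understand common statistical modeling, data mining and machine learning methods" as an example, the server 2 uses a rule template matching module to select an appropriate question generation template according to the tree query expression (VP < ((VV < adept | understanding | familiar | mastering | understanding | possess | including | meeting) $. (NP | IP = unmov))), and finally generates the question "Please talk about statistical modeling, data mining and machine learning methods."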
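As a rough illustration of template-based generation, the sketch below swaps the tree query for a plain regular expression: it looks for a trigger verb at the start of a fact sentence and inserts the remainder into a template. The trigger list, the template wording and the function name `generate_question` are assumptions made for the example, not the templates actually stored by the server.

```python
import re

# Hypothetical English trigger phrases standing in for the verbs in the tree query.
TRIGGERS = ("understand", "be familiar with", "master", "be proficient in")
TEMPLATE = "Please talk about {subject}."


def generate_question(fact):
    """Return a question built from the fact sentence, or None if no trigger matches."""
    for trigger in TRIGGERS:
        pattern = rf"{re.escape(trigger)}\s+(?:common\s+)?(.+?)\.?$"
        match = re.match(pattern, fact.strip(), re.IGNORECASE)
        if match:
            return TEMPLATE.format(subject=match.group(1))
    return None


question = generate_question(
    "Understand common statistical modeling, data mining and machine learning methods."
)
```

A regular expression only sees the surface string; the tree query described in the text operates on the parsed structure, which is more robust to word order and nesting.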
The storage module 204 is configured to store the industry problems according to industry categories to generate question banks of different industries.
Specifically, the question banks stored in the memory of the server 2 are classified by industry, such as IT, medical treatment, intellectual property, and the like. The server 2 identifies the industry to which each industry question belongs and stores the question in the corresponding industry question bank, thereby updating the question banks in the memory.
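A minimal sketch of storing questions by industry follows. The specification does not state how the industry is identified, so the keyword lookup in `identify_industry`, the contents of `INDUSTRY_KEYWORDS` and the "uncategorized" fallback are all assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real system would likely use a trained classifier.
INDUSTRY_KEYWORDS = {
    "IT": {"python", "machine learning", "data mining", "oracle"},
    "medical": {"clinical", "nursing", "pharmacy"},
    "intellectual property": {"patent", "trademark", "copyright"},
}


def identify_industry(question):
    """Pick the industry whose keywords occur most often in the question."""
    text = question.lower()
    scores = {industry: sum(keyword in text for keyword in keywords)
              for industry, keywords in INDUSTRY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


def store_question(banks, question):
    """Append the question to the bank of its industry and return that industry."""
    industry = identify_industry(question)
    if question not in banks[industry]:
        banks[industry].append(question)
    return industry


banks = defaultdict(list)
first = store_question(banks, "Please talk about data mining and machine learning methods.")
second = store_question(banks, "Please describe the patent examination procedure.")
third = store_question(banks, "What are your hobbies?")
```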
In other embodiments of the present invention, the server 2 further classifies the generated industry questions according to their degree of specialization, mainly into general-purpose questions, more specialized questions, and highly specialized questions.
In other embodiments of the present invention, the server 2 may further extract keywords from the job description, expand them into related words within the industry, and generate industry questions for the related words.
Through the above program modules 201-204, the question bank generating system 200 provided by the invention first obtains the description information of the recruitment position from the recruitment website; then converts the job description information into a factual sentence with complete information; then generates an industry question from the factual sentence according to a question generation template; and finally stores the industry questions according to industry classification to generate question banks for different industries. This avoids the need to manually prepare a question bank for interviews, written tests or online tests according to the recruitment position: industry questions are generated automatically from the description information of the recruitment position and stored by industry classification, which saves labor.
Further, a second embodiment of the invention (as shown in fig. 3) is proposed based on the above-mentioned first embodiment of the question bank generating system 200 of the invention. In this embodiment, the question bank generating system 200 further includes a parsing module 205, a retrieval module 206, and a matching module 207, wherein:
the parsing module 205 is configured to decompose the fact sentence into a tree-like grammar structure, so as to obtain a syntax tree of the fact sentence, and is further configured to parse the syntax tree to obtain the subject part of the question.
Specifically, the server 2 decomposes the fact sentence, through the parsing module 205, into a tree-like grammar structure of subject, predicate and object, and analyzes the grammar structure. In this embodiment, the grammar structure is analyzed using a PCFG (Probabilistic Context-Free Grammar), which consists of common grammar rules and the probability corresponding to each rule. Several candidate syntax trees may be generated for each fact sentence; the server 2 calculates the generation probability of each syntax tree according to the PCFG and selects the syntax tree with the highest probability as the syntax tree of the fact sentence.
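The selection of the most probable syntax tree can be sketched as follows. Under a PCFG, the probability of a tree is the product of the probabilities of all rules used in it. The toy grammar `RULES`, the tuple encoding of trees and the two candidate trees are invented for illustration and are not taken from the specification.

```python
from math import prod

# Toy PCFG: (left-hand side, right-hand side) -> probability (assumed values).
RULES = {
    ("VP", ("VV", "NP")): 0.75,
    ("VP", ("VV", "NP", "NP")): 0.25,
    ("NP", ("NN", "NN", "NN")): 0.25,
    ("NP", ("NN", "NN")): 0.5,
    ("NP", ("NN",)): 0.25,
    ("VV", ("understand",)): 1.0,
    ("NN", ("machine",)): 0.25,
    ("NN", ("learning",)): 0.25,
    ("NN", ("methods",)): 0.5,
}


def tree_probability(tree, rules):
    """Probability of a tree encoded as (label, child, ...); leaves are plain strings."""
    if isinstance(tree, str):
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_p = rules.get((label, rhs), 0.0)  # unknown rules get probability 0
    return rule_p * prod(tree_probability(c, rules) for c in children)


# Two candidate parses of "understand machine learning methods".
tree_a = ("VP", ("VV", "understand"),
          ("NP", ("NN", "machine"), ("NN", "learning"), ("NN", "methods")))
tree_b = ("VP", ("VV", "understand"),
          ("NP", ("NN", "machine")),
          ("NP", ("NN", "learning"), ("NN", "methods")))

best_tree = max([tree_a, tree_b], key=lambda t: tree_probability(t, RULES))
```

Here `tree_a`, which keeps "machine learning methods" as a single noun phrase, receives the higher probability and is selected. A production parser would search over candidate trees with dynamic programming instead of enumerating them by hand.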
The parsing module 205 further queries the syntax tree using the tree query language Tregex to select the subject part of the question. For example, with the Tregex expression "(VP < (VV $ IP))", the subject part "statistical modeling, data mining and machine learning methods" can be selected from the fact sentence "understand common statistical modeling, data mining and machine learning methods".
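The idea behind such a tree query can be shown without a Tregex engine. The sketch below is a hand-written stand-in, not Tregex itself: it walks a tree encoded as nested tuples, finds a VP that directly contains a VV, and returns the words under the VV's NP or IP sister. The tuple encoding and the function names are assumptions for the example.

```python
def leaves(tree):
    """Collect the words under a tree encoded as (label, child, ...)."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


def find_subject(tree, verb_label="VV", target_labels=("NP", "IP")):
    """Return the text of the NP/IP sister of a VV directly under a VP, else None."""
    if isinstance(tree, str):
        return None
    label, *children = tree
    subtrees = [c for c in children if not isinstance(c, str)]
    if label == "VP" and any(c[0] == verb_label for c in subtrees):
        for child in subtrees:
            if child[0] in target_labels:
                return " ".join(leaves(child))
    for child in subtrees:  # otherwise keep searching deeper in the tree
        found = find_subject(child, verb_label, target_labels)
        if found:
            return found
    return None


sentence_tree = ("IP",
                 ("VP", ("VV", "understand"),
                  ("NP", ("NN", "machine"), ("NN", "learning"), ("NN", "methods"))))
subject = find_subject(sentence_tree)
```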
The retrieval module 206 is configured to retrieve a question generation template pre-stored in the server 2;
the matching module 207 is configured to match the subject part of the question to the question generation template, so as to generate an industry question.
Specifically, as described in the first embodiment, the server 2 stores the question generation template in advance. In this embodiment, once the parsing module 205 has obtained the subject part of the question, the retrieval module 206 retrieves the question generation template pre-stored in the memory of the server 2, and the matching module 207 matches the subject part of the question to the question generation template, so as to generate an industry question.
Through the above program modules 205-207, the question bank generating system 200 provided by the invention can decompose the fact sentence into a tree-like grammar structure to obtain a syntax tree of the fact sentence, parse the syntax tree to obtain the subject part of the question, and then match the subject part of the question to a question generation template to generate an industry question, thereby generating industry questions automatically and avoiding manual work.
Further, a third embodiment of the invention (as shown in fig. 4) is proposed based on the above-mentioned first embodiment of the question bank generating system 200 of the invention. In this embodiment, the question bank generating system 200 further includes a building module 208 and a checking module 209, wherein:
the building module 208 is configured to build a deep-learning text classification model trained on labeled samples;
the checking module 209 is configured to check, by using the text classification model, whether the industry question is fluent, and, if the sentence of the industry question is not fluent, to adjust the industry question by using the text classification model.
Specifically, as described above, the industry question is generated automatically by the server 2, with the matching module 207 matching the subject part of the question to the question generation template. Because the industry question is generated automatically, its sentence may not be fluent. In this embodiment, to avoid this defect, the server 2 builds a deep-learning text classification model trained on labeled samples through the building module 208, and checks through the checking module 209, by using the text classification model, whether the industry question is fluent; if the sentence of the industry question is not fluent, the checking module 209 adjusts the industry question by using the text classification model.
The building module 208 is further configured to build a term frequency-inverse document frequency (TFIDF) model;
the checking module 209 is further configured to check, by using the TFIDF model, whether a question similar to the industry question already exists in the question bank, and to delete the currently generated industry question if a similar question exists.
Generally, the same question can be worded in different ways, which leads to duplicate questions in the question bank. To avoid duplicates, when a question is generated, the server 2 further builds a TFIDF model through the building module 208 and uses it to check whether a similar question already exists in the question bank stored in the server 2. In this embodiment, each text is represented as a vector by the TFIDF model; the pairwise similarity of all questions in each professional category is then calculated using cosine similarity, and the questions are grouped according to a set threshold. If a similar industry question already exists in the question bank, the currently generated industry question is deleted, so that in the end only the semantically richest question is kept in each group.
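The duplicate check can be sketched in plain Python. The example below is an illustration under stated assumptions: it uses a smoothed IDF formula, a whitespace-and-punctuation tokenizer suited to English text, and a threshold of 0.8, none of which are specified in the text; the function names are hypothetical.

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Represent each document as a sparse {term: tf-idf weight} dictionary."""
    tokenized = [re.findall(r"\w+", doc.lower()) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed IDF (an assumption): log((1 + N) / (1 + df)) + 1
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def is_duplicate(new_question, bank, threshold=0.8):
    """True if some question already in the bank is at least `threshold` similar."""
    if not bank:
        return False
    vectors = tfidf_vectors(bank + [new_question])
    new_vector = vectors[-1]
    return any(cosine(new_vector, old) >= threshold for old in vectors[:-1])


bank = ["Please talk about machine learning methods.",
        "Please describe the patent examination procedure."]
repeat = is_duplicate("Please talk about machine learning methods", bank)
fresh = is_duplicate("How do you tune an Oracle database?", bank)
```

A newly generated question would be discarded when `is_duplicate` returns True and stored otherwise. Choosing which question in a group of near-duplicates is "semantically richest" is a separate step that this sketch does not cover.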
Through the above program modules 208-209, the question bank generating system 200 provided by the invention can also check whether the industry question is fluent by building the deep-learning text classification model trained on labeled samples, and can check whether a question similar to the industry question already exists in the question bank by building the TFIDF model, deleting the currently generated industry question if so, thereby ensuring that the industry questions in the question bank are fluent and not duplicated.
In addition, the invention also provides a question bank generating method.
Fig. 5 is a schematic flow chart showing a first embodiment of the method for generating a question bank according to the present invention. In this embodiment, the execution order of the steps in the flowchart shown in fig. 5 may be changed and some steps may be omitted according to different requirements.
Step S301, acquiring description information of the recruitment position of the recruitment website.
Generally, professional questions for interviews or written tests are drawn up manually by the recruiter according to the information of each position. However, when the requirements of a position change or a new position is added, the questions must be drawn up again and reprinted in the case of a written test, or drawn up again and the test system updated in the case of an online test, which causes great inconvenience to the recruiter.
Therefore, in this embodiment, the server 2 logs in the recruitment website, and the server 2 acquires the job description information of the recruitment website.
Step S302, converting the position description information into a factual sentence with complete information.
Specifically, the server 2 analyzes the job description information, extracts keywords of the job description information, and separates the job description information into several fact sentences having complete information.
In the present embodiment, the server 2 uses reference resolution to associate the words in the job description information that refer to the same entity. Each word is first treated as its own category; the words most likely to refer to the same entity are then merged according to the degree of closeness among the words; the words are clustered by successive aggregation; and the job description information is thereby separated into several fact sentences with complete information. For example, the description of a certain position reads: "Understand common statistical modeling, data mining and machine learning methods, and be able to apply them to build models that solve practical problems and develop data products; be familiar with software such as SAS, SPSSModeller, R and Python; be proficient in Oracle, SQLServer, Mysql, etc." Reference resolution determines that "them" in the job description refers to "statistical modeling, data mining and machine learning methods". After the words referring to the same entity have been associated, four facts are derived from the job description information: (1) Understand common statistical modeling, data mining and machine learning methods. (2) Apply common statistical modeling, data mining and machine learning methods to build models that solve practical problems and develop data products. (3) Be familiar with software such as SAS, SPSSModeller, R and Python. (4) Be proficient in Oracle, SQLServer, Mysql, etc.
Step S303, generating an industry question from the fact sentence according to a question generation template.
Specifically, a question generation template is pre-stored in the memory of the server 2. Taking the fact sentence "understand common statistical modeling, data mining and machine learning methods" as an example, the server 2 uses a rule template matching module to select an appropriate question generation template according to the tree query expression (VP < ((VV < adept | understanding | familiar | mastering | understanding | possess | including | meeting) $. (NP | IP = unmov))), and finally generates the question "Please talk about statistical modeling, data mining and machine learning methods."
Step S304, storing the industry questions according to industry classification to generate question banks for different industries.
Specifically, the question banks stored in the memory of the server 2 are classified by industry, such as IT, medical treatment, intellectual property, and the like. The server 2 identifies the industry to which each industry question belongs and stores the question in the corresponding industry question bank, thereby updating the question banks in the memory.
In other embodiments of the present invention, the server 2 further classifies the generated industry questions according to their degree of specialization, mainly into general-purpose questions, more specialized questions, and highly specialized questions.
In other embodiments of the present invention, the server 2 may further extract keywords from the job description, expand them into related words within the industry, and generate industry questions for the related words.
Through the above steps S301-S304, the question bank generating method provided by the invention first obtains the description information of the recruitment position from the recruitment website; then converts the job description information into a factual sentence with complete information; then generates an industry question from the factual sentence according to a question generation template; and finally stores the industry questions according to industry classification to generate question banks for different industries. This avoids the need to manually prepare a question bank for interviews, written tests or online tests according to the recruitment position: industry questions are generated automatically from the description information of the recruitment position and stored by industry classification, which saves labor.
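How steps S302 to S304 chain together can be sketched as a small pipeline that takes the three stages as interchangeable functions. The function `build_question_banks` and the toy stand-ins passed to it are hypothetical; they only show the order of the steps, and the job descriptions are assumed to have been fetched already in step S301.

```python
def build_question_banks(descriptions, to_facts, to_question, to_industry):
    """Run S302-S304 over job descriptions that were already fetched in S301."""
    banks = {}
    for description in descriptions:
        for fact in to_facts(description):        # S302: description -> fact sentences
            question = to_question(fact)           # S303: fact sentence -> question
            if question is None:
                continue                           # no template matched this fact
            industry = to_industry(question)       # S304: choose the industry bank
            banks.setdefault(industry, []).append(question)
    return banks


# Toy stand-ins for the three stages, for illustration only.
def split_facts(description):
    return [part.strip() for part in description.split(";") if part.strip()]


def ask(fact):
    return f"Please talk about {fact}."


banks = build_question_banks(
    ["statistical modeling; data mining"], split_facts, ask, lambda question: "IT"
)
```

Passing the stages in as functions keeps the step order fixed while allowing each stage to be replaced, for example by a parser-based question generator.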
Further, based on the above first embodiment of the method for generating a question bank according to the present invention, a second embodiment of the method for generating a question bank according to the present invention is provided.
Fig. 6 is a schematic flow chart of a method for generating a question bank according to a second embodiment of the present invention.
In this embodiment, the step of generating an industry question from the fact sentence according to a question generation template specifically includes the following steps:
Step S401, decomposing the fact sentence into a tree-like grammar structure to obtain a syntax tree of the fact sentence.
Step S402, parsing the syntax tree to obtain the subject part of the question.
Specifically, the server 2 decomposes the fact sentence into a tree-like grammar structure of subject, predicate and object, and analyzes the grammar structure. In this embodiment, the grammar structure is analyzed using a PCFG (Probabilistic Context-Free Grammar), which consists of common grammar rules and the probability corresponding to each rule. Several candidate syntax trees may be generated for each fact sentence; the server 2 calculates the generation probability of each syntax tree according to the PCFG and selects the syntax tree with the highest probability as the syntax tree of the fact sentence.
Furthermore, the server 2 queries the syntax tree using the tree query language Tregex to select the subject part of the question. For example, with the Tregex expression "(VP < (VV $ IP))", the subject part "statistical modeling, data mining and machine learning methods" can be selected from the fact sentence "understand common statistical modeling, data mining and machine learning methods".
Step S403, retrieving a question generation template pre-stored in the server 2.
Step S404, matching the subject part of the question to the question generation template to generate an industry question.
Specifically, as described in the first embodiment, the server 2 stores the question generation template in advance. In this embodiment, once the server 2 has obtained the subject part of the question through parsing, the server 2 retrieves the question generation template pre-stored in its memory and matches the subject part of the question to the question generation template, so as to generate an industry question.
Through the above steps S401-S404, the question bank generating method provided by the present invention can decompose the fact sentence into a tree-like grammar structure to obtain a syntax tree of the fact sentence, parse the syntax tree to obtain the subject part of the question, and then match the subject part of the question to a question generation template to generate an industry question, thereby generating industry questions automatically and avoiding manual work.
Further, a third embodiment of the question bank generating method of the present invention is provided based on the above-mentioned first embodiment.
Fig. 7 is a flow chart illustrating a third embodiment of the method for generating a question bank according to the present invention. In this embodiment, the method further includes:
Step S501, constructing a deep learning text classification model trained on labeled samples.
Step S502, checking whether the industry problem is fluent by using the text classification model.
Step S503, if the industry problem is not fluent, adjusting the industry problem by using the text classification model.
Specifically, as described in the first embodiment, the industry problem is generated automatically by the server 2, which uses the matching module 206 to match the main part of the problem with the question generation template. Because the industry problem is generated automatically, its wording may not be fluent. To avoid this defect, in this embodiment the server 2 constructs, through the construction module 208, a deep learning text classification model trained on labeled samples, and checks, through the checking module 209, whether the industry problem is fluent by using that model. If the industry problem is not fluent, the checking module 209 adjusts the industry problem by using the text classification model.
Step S504, a term frequency-inverse document frequency (TFIDF) model is constructed.
Step S505, checking, by using the TFIDF model, whether a similar problem already exists in the question bank.
Step S506, if a similar industry problem exists, deleting the currently generated industry problem.
Generally, the same problem may be described in different ways, which leads to duplicate problems in the question bank. To avoid duplicates, when a problem is generated, the server 2 further constructs a TFIDF model through the construction module 208 and uses it to check whether a similar problem already exists in the question bank stored in the server 2. In this embodiment, each problem text is represented as a vector by the TFIDF model, the pairwise similarity of all problems in each professional category is calculated with a cosine similarity algorithm, and the problems are grouped according to a set threshold. If a similar industry problem already exists in the question bank, the currently generated industry problem is deleted, so that in the end only the one problem with the richest semantics is kept in each group of similar problems.
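The duplicate check described above can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the embodiment's implementation: tokens are obtained by whitespace splitting (text in other languages would first need word segmentation), the smoothed inverse document frequency formula is one common variant, and the threshold of 0.8 and the names tfidf_vectors, cosine and is_duplicate are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) so that no term gets zero weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    return [
        {t: (count / len(tokens)) * idf[t]
         for t, count in Counter(tokens).items()}
        for tokens in tokenized
    ]

def cosine(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def is_duplicate(new_question, bank, threshold=0.8):
    """True if new_question is at least `threshold`-similar to a bank entry."""
    vectors = tfidf_vectors(bank + [new_question])
    new_vector = vectors[-1]
    return any(cosine(new_vector, v) >= threshold for v in vectors[:-1])

bank = ["what is data mining", "describe machine learning methods"]
repeated = is_duplicate("what is data mining", bank)
unrelated = is_duplicate("how do you manage a sales team", bank)
```

A newly generated problem for which is_duplicate returns True would be discarded instead of being stored; the threshold controls how close two wordings must be before they count as the same problem.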
Through steps S501 to S506, the question bank generating method provided by the present invention checks whether the industry problem is fluent by constructing a deep learning text classification model trained on labeled samples, and checks whether a similar problem already exists in the question bank by constructing the TFIDF model. If a similar problem exists, the currently generated industry problem is deleted, which ensures that the industry problems in the question bank are fluent and not repeated.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.