
CN107943881B - Question bank generating method, server and computer readable storage medium - Google Patents


Info

Publication number
CN107943881B
CN107943881B CN201711130606.7A CN201711130606A CN107943881B CN 107943881 B CN107943881 B CN 107943881B CN 201711130606 A CN201711130606 A CN 201711130606A CN 107943881 B CN107943881 B CN 107943881B
Authority
CN
China
Prior art keywords
industry
question
server
generating
question bank
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711130606.7A
Other languages
Chinese (zh)
Other versions
CN107943881A (en)
Inventor
徐国强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Financial Technology Co Ltd Shanghai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Financial Technology Co Ltd Shanghai
Priority to CN201711130606.7A
Publication of CN107943881A
Application granted
Publication of CN107943881B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/105 Human resources
    • G06Q10/1053 Employment or hiring

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a question bank generating method comprising the following steps: acquiring job description information for positions posted on a recruitment website; converting the job description information into fact sentences carrying complete information; generating industry questions from the fact sentences according to a question generation template; and storing the industry questions by industry classification to generate question banks for different industries. The invention also provides a server and a computer-readable storage medium. The question bank generating method, server and computer-readable storage medium provided by the invention automatically generate industry questions from the job description information and store them by industry classification, thereby saving labor.

Description

Question bank generating method, server and computer readable storage medium
Technical Field
The present invention relates to the field of computer applications, and in particular, to a question bank generating method, a server, and a computer-readable storage medium.
Background
In the recruitment process, interviews and written examinations make it possible to assess an applicant's knowledge, professional skills, logical thinking and other qualities comprehensively. At present, whether for on-site written tests, online tests or interviews, the recruiter must draw up the questions in advance for each position; the questions cannot be generated automatically from the position, which causes the recruiter great inconvenience.
Disclosure of Invention
In view of the above, the invention provides a question bank generating method, a server and a computer-readable storage medium, which can automatically generate industry questions from the job description information of a recruitment position and store the industry questions by industry classification, thereby saving labor.
First, to achieve the above object, the present invention provides a question bank generating method, which is applied to a server and includes:
acquiring job description information for positions posted on a recruitment website;
converting the job description information into fact sentences carrying complete information;
generating industry questions from the fact sentences according to a question generation template;
and storing the industry questions by industry classification to generate question banks for different industries.
Optionally, the step of converting the job description information into fact sentences carrying complete information specifically includes:
initially placing each word of the job description information in a category of its own;
combining the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, thereby separating the job description information into several fact sentences carrying complete information.
Optionally, the step of generating industry questions from the fact sentences according to the question generation template specifically includes:
decomposing the fact sentence into a tree-shaped grammatical structure to obtain a syntax tree of the fact sentence;
analyzing the syntax tree to obtain the main part of the question;
retrieving a question generation template pre-stored by the server;
and matching the main part of the question to the question generation template to generate an industry question.
Optionally, after the step of retrieving the question generation template pre-stored by the server and matching the main part of the question to generate an industry question, the method further includes:
constructing a deep-learning text classification model trained on labeled samples;
checking whether the industry question reads fluently by using the text classification model;
and if the industry question does not read fluently, adjusting the industry question by using the text classification model.
Optionally, the step of storing the industry questions by industry classification to generate question banks for different industries specifically includes:
constructing a term frequency-inverse document frequency (TFIDF) model;
checking, by using the TFIDF model, whether a question similar to the industry question already exists in the question bank;
and if a similar industry question exists, deleting the currently generated industry question.
Optionally, the question bank generating method further includes:
extracting keywords from the job description;
expanding the keywords into related words;
and generating industry questions from the related words.
In addition, to achieve the above object, the present invention further provides a server including a memory and a processor, the memory storing a question bank generating system operable on the processor, and the question bank generating system implementing the following steps when executed by the processor:
acquiring job description information for positions posted on a recruitment website;
converting the job description information into fact sentences carrying complete information;
generating industry questions from the fact sentences according to a question generation template;
and storing the industry questions by industry classification to generate question banks for different industries.
Optionally, the step of converting the job description information into fact sentences carrying complete information specifically includes:
initially placing each word of the job description information in a category of its own;
combining the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, thereby separating the job description information into several fact sentences carrying complete information.
Optionally, the step of generating industry questions from the fact sentences according to the question generation template specifically includes:
decomposing the fact sentence into a tree-shaped grammatical structure to obtain a syntax tree of the fact sentence;
analyzing the syntax tree to obtain the main part of the question;
retrieving a question generation template pre-stored by the server;
and matching the main part of the question to the question generation template to generate an industry question.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a question bank generating system executable by at least one processor, so as to cause the at least one processor to execute the steps of the question bank generating method described above.
Compared with the prior art, the server, question bank generating method and computer-readable storage medium provided by the invention first acquire job description information for positions posted on a recruitment website; then convert the job description information into fact sentences carrying complete information; then generate industry questions from the fact sentences according to a question generation template; and finally store the industry questions by industry classification to generate question banks for different industries. This avoids the need to draw up question banks for interviews, written tests or online tests manually for each recruitment position: industry questions are generated automatically from the job description information and stored by industry classification, thereby saving labor.
Drawings
FIG. 1 is a schematic diagram of an alternative hardware architecture for a server according to the present invention;
FIG. 2 is a block diagram of a first embodiment of the question bank generating system according to the present invention;
FIG. 3 is a block diagram of a second embodiment of the question bank generating system according to the present invention;
FIG. 4 is a schematic diagram of program modules of a question bank generating system according to a third embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for generating a question bank according to a first embodiment of the present invention;
FIG. 6 is a flowchart illustrating a second embodiment of the method for generating a question bank according to the present invention;
FIG. 7 is a flowchart illustrating a method for generating a question bank according to a third embodiment of the present invention.
Reference numerals: (table provided as figures GDA0002733020950000041 and GDA0002733020950000051 in the original publication)
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that descriptions involving "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, provided that the combination can be realized by a person skilled in the art; when the technical solutions are contradictory or cannot be realized, such a combination should be considered not to exist and falls outside the protection scope of the present invention.
Fig. 1 is a schematic diagram of an alternative hardware architecture of the server 2 according to the present invention.
In this embodiment, the server 2 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which may be communicatively connected to each other through a system bus. It is noted that fig. 1 only shows the server 2 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 11 includes at least one type of readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the server 2, such as a hard disk or a memory of the server 2. In other embodiments, the memory 11 may also be an external storage device of the server 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, provided on the server 2. Of course, the memory 11 may also include both an internal storage unit of the server 2 and an external storage device thereof. In this embodiment, the memory 11 is generally used for storing the operating system installed in the server 2 and various types of application software, such as the program code of the question bank generating system 200. Furthermore, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is typically used to control the overall operation of the server 2. In this embodiment, the processor 12 is configured to run the program codes stored in the memory 11 or process data, for example, run the question bank generating system 200.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is generally used for establishing communication connection between the server 2 and other electronic devices.
The application environment and the hardware structure and function of the related devices of the various embodiments of the present invention have been described in detail so far. Hereinafter, various embodiments of the present invention will be proposed based on the above-described application environment and related devices.
First, the present invention provides a question bank generating system 200.
Referring to fig. 2, a block diagram of a first embodiment of the question bank generating system 200 of the present invention is shown.
In this embodiment, the question bank generating system 200 includes a series of computer program instructions stored in the memory 11; when these instructions are executed by the processor 12, the question bank generating operations of the embodiments of the present invention are implemented. In some embodiments, the question bank generating system 200 may be divided into one or more modules according to the particular operations implemented by the various portions of the computer program instructions. For example, in fig. 2, the question bank generating system 200 is divided into an acquisition module 201, a conversion module 202, a generation module 203 and a storage module 204, wherein:
the acquisition module 201 is configured to acquire job description information for positions posted on a recruitment website.
Generally, the professional questions for a written test or an interview are drawn up manually by the recruiter for each position. When the requirements of a position change or a new position is added, the questions for a written test must be redrafted and reprinted, and the questions for an online test must be redrafted and the test system updated, which causes the recruiter great inconvenience.
Therefore, in this embodiment, the server 2 logs in to the recruitment website and acquires the job description information from the recruitment website through the acquisition module 201.
The conversion module 202 is configured to convert the job description information into fact sentences carrying complete information.
Specifically, the server 2 analyzes the job description information, extracts its keywords, and separates it into several fact sentences carrying complete information.
In the present embodiment, the server 2 uses coreference resolution to associate the words in the job description information that refer to the same entity. Each word is first placed in a category of its own; the words most likely to refer to the same entity are then combined according to the degree of closeness among the words; the words are clustered by successive aggregation; and the job description information is finally separated into several fact sentences carrying complete information. For example, suppose the description of a certain position reads: "understand common statistical modeling, data mining and machine learning methods, and be able to apply them to build models that solve practical problems and develop data products; familiar with software such as SAS, SPSSModeller, R and Python; proficient in Oracle, SQLServer, Mysql, etc." Coreference resolution determines that "them" in the job description refers to "statistical modeling, data mining and machine learning methods", and, after the words referring to the same entity have been associated, four facts are derived from the job description information: (1) Understand common statistical modeling, data mining and machine learning methods. (2) Apply common statistical modeling, data mining and machine learning methods to build models that solve practical problems and develop data products. (3) Familiar with software such as SAS, SPSSModeller, R and Python. (4) Proficient in Oracle, SQLServer, Mysql, etc.
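The successive aggregation described above can be illustrated with a minimal sketch. It is not the patented implementation: the function name `agglomerate`, the average-link rule, the threshold and the hand-written closeness scores are all assumptions made for illustration, since the text does not specify how closeness between words is measured.

```python
from itertools import combinations


def agglomerate(mentions, closeness, threshold):
    """Successively merge the two closest clusters until no pair reaches the threshold."""
    clusters = [[m] for m in mentions]  # every mention starts in a cluster of its own

    def link(a, b):
        # Average pairwise closeness between two clusters (an assumed linkage rule).
        return sum(closeness(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break  # remaining clusters are too far apart to share an entity
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


# Hypothetical closeness scores; a real system would compute them from the text.
SCORES = {
    frozenset(("machine learning methods", "them")): 0.9,
    frozenset(("SAS", "Oracle")): 0.2,
}


def toy_closeness(a, b):
    return SCORES.get(frozenset((a, b)), 0.0)


groups = agglomerate(
    ["machine learning methods", "them", "SAS", "Oracle"], toy_closeness, 0.5
)
print(groups)
```

With these toy scores the pronoun "them" ends up in the same cluster as "machine learning methods", while "SAS" and "Oracle" stay separate because their closeness is below the threshold.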
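The generation module 203 is configured to generate industry questions from the fact sentences according to a question generation template.
Specifically, the memory of the server 2 pre-stores question generation templates. Taking the fact sentence "understand common statistical modeling, data mining and machine learning methods" as an example, the server 2 uses a rule template matching module to select an appropriate question generation template according to the tree query language pattern (VP < ((VV < adept | understanding | familiar | mastering | understanding | possess | including | meeting) $. (NP | IP = unmov))), and finally generates the question "Please talk about statistical modeling, data mining and machine learning methods."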
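As a rough illustration of template filling, the sketch below replaces the tree query with a plain regular expression over English text. This is a simplification chosen for brevity, not the method of the embodiment; the verb list `SKILL_VERBS`, the template string and the function name `generate_question` are hypothetical.

```python
import re

# Hypothetical English stand-ins for the skill verbs listed in the tree query.
SKILL_VERBS = ("understand", "familiar with", "proficient in", "master")
TEMPLATE = "Please talk about {subject}."

_PATTERN = re.compile(
    r"^(?:%s)\s+(?:common\s+)?(.+?)[.;]?$" % "|".join(map(re.escape, SKILL_VERBS)),
    flags=re.IGNORECASE,
)


def generate_question(fact_sentence):
    """Fill the template with whatever follows a leading skill verb, or return None."""
    match = _PATTERN.match(fact_sentence.strip())
    if match is None:
        return None  # the sentence does not fit this template
    return TEMPLATE.format(subject=match.group(1))


print(generate_question(
    "Understand common statistical modeling, data mining and machine learning methods."
))
```

A sentence that does not start with one of the listed verbs yields `None`, which a caller could use to try a different template.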
The storage module 204 is configured to store the industry questions by industry classification to generate question banks for different industries.
Specifically, the question banks stored in the memory of the server 2 are classified by industry, such as IT, medical care, intellectual property, and the like. The server 2 identifies the industry to which each industry question belongs and stores the question in the corresponding industry question bank, thereby updating the question banks in the memory.
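A minimal sketch of per-industry storage follows. The text does not say how the industry of a question is identified, so the keyword lookup in `INDUSTRY_KEYWORDS`, the fallback label and both function names are assumptions used only to make the example concrete.

```python
from collections import defaultdict

# Hypothetical keyword lists; a real system might use a trained classifier instead.
INDUSTRY_KEYWORDS = {
    "IT": ("data mining", "machine learning", "oracle"),
    "medical care": ("nursing", "clinical"),
    "intellectual property": ("patent", "trademark"),
}


def classify_industry(question):
    """Return the first industry whose keywords appear in the question."""
    text = question.lower()
    for industry, keywords in INDUSTRY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return industry
    return "uncategorized"


def store_question(bank, question):
    """Append the question to its industry's bank, skipping exact repeats."""
    industry = classify_industry(question)
    if question not in bank[industry]:
        bank[industry].append(question)
    return industry


question_bank = defaultdict(list)
store_question(question_bank, "Please talk about data mining methods.")
store_question(question_bank, "Describe a nursing care plan.")
print(dict(question_bank))
```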
In other embodiments of the present invention, the server 2 further classifies the generated questions by degree of specialization, mainly into general-purpose questions and questions with a high degree of specialization.
In other embodiments of the present invention, the server 2 may further extract keywords from the job description, expand them into related words in the industry, and generate industry questions from the related words.
Through the above program modules 201-204, the question bank generating system 200 provided by the invention first acquires job description information for positions posted on a recruitment website; then converts the job description information into fact sentences carrying complete information; then generates industry questions from the fact sentences according to a question generation template; and finally stores the industry questions by industry classification to generate question banks for different industries. This avoids the need to draw up question banks for interviews, written tests or online tests manually for each recruitment position: industry questions are generated automatically from the job description information and stored by industry classification, thereby saving labor.
Further, a second embodiment of the invention (as shown in fig. 3) is proposed based on the first embodiment of the question bank generating system 200 described above. In this embodiment, the question bank generating system 200 further includes a parsing module 205, a retrieval module 206 and a matching module 207, wherein:
the parsing module 205 is configured to decompose the fact sentence into a tree-shaped grammatical structure to obtain a syntax tree of the fact sentence, and is further configured to analyze the syntax tree to obtain the main part of the question.
Specifically, the server 2 decomposes the fact sentence into a tree-shaped subject-predicate-object grammatical structure through the parsing module 205 and analyzes that structure. In this embodiment, the syntactic analysis uses a PCFG (Probabilistic Context-Free Grammar), which consists of common grammar rules and the probability associated with each rule. Each fact sentence may yield several candidate syntax trees; the server 2 calculates the generation probability of each candidate according to the PCFG and selects the syntax tree with the highest probability as the syntax tree of the fact sentence.
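The selection of the most probable syntax tree can be sketched as follows. Under a PCFG the probability of a tree is the product of the probabilities of the rules used to build it. The nested-tuple tree format, the toy grammar `RULES` and its probabilities are invented for illustration and do not come from the embodiment.

```python
from math import prod


def tree_probability(tree, rule_probs):
    """Multiply the probabilities of every rule used in the tree.

    A tree is a nested tuple (label, child, ...); a leaf is a plain string.
    """
    if isinstance(tree, str):
        return 1.0  # words themselves carry no rule
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_p = rule_probs.get((label, rhs), 0.0)  # unknown rules get probability 0
    return rule_p * prod(tree_probability(c, rule_probs) for c in children)


def best_tree(candidates, rule_probs):
    """Pick the candidate syntax tree with the highest generation probability."""
    return max(candidates, key=lambda t: tree_probability(t, rule_probs))


# Toy grammar with made-up probabilities.
RULES = {
    ("VP", ("VV", "NP")): 0.7,
    ("VP", ("VV", "IP")): 0.3,
    ("IP", ("NP",)): 1.0,
    ("VV", ("understand",)): 1.0,
    ("NP", ("methods",)): 0.5,
}

tree_a = ("VP", ("VV", "understand"), ("NP", "methods"))
tree_b = ("VP", ("VV", "understand"), ("IP", ("NP", "methods")))
print(best_tree([tree_a, tree_b], RULES))
```

Here `tree_a` scores 0.7 × 1.0 × 0.5 and `tree_b` scores 0.3 × 1.0 × 1.0 × 0.5, so `tree_a` is selected.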
The parsing module 205 further queries the syntax tree with the tree query language Tregex to select the main part of the question. For example, the main part "statistical modeling, data mining and machine learning methods" can be selected from the fact sentence "understand common statistical modeling, data mining and machine learning methods" with the Tregex pattern (VP < (VV $ IP)).
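The idea behind such a tree query can be approximated without Tregex. The sketch below is a hand-written matcher, not Tregex itself: it looks for a VP node in which a VV child containing a skill verb is immediately followed by an NP or IP sibling, and returns the words under that sibling. The tuple tree format, the label set and the function names are assumptions.

```python
def leaves(tree):
    """Collect the words under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


def find_question_subject(tree, verbs):
    """Return the words of the NP/IP that directly follows a matching VV inside a VP."""
    if isinstance(tree, str):
        return None
    label, *children = tree
    if label == "VP":
        for left, right in zip(children, children[1:]):
            if (not isinstance(left, str) and left[0] == "VV"
                    and set(leaves(left)) & verbs
                    and not isinstance(right, str) and right[0] in ("NP", "IP")):
                return " ".join(leaves(right))
    for child in children:  # otherwise keep searching deeper in the tree
        found = find_question_subject(child, verbs)
        if found is not None:
            return found
    return None


sentence_tree = (
    "IP",
    ("VP",
     ("VV", "understand"),
     ("NP", ("NN", "statistical"), ("NN", "modeling"))),
)
print(find_question_subject(sentence_tree, {"understand", "master"}))
```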
The retrieval module 206 is configured to retrieve a question generation template pre-stored in the server 2;
the matching module 207 is configured to match the main part of the question to the question generation template, so as to generate an industry question.
Specifically, as described in the first embodiment, the server 2 stores the question generation templates in advance. In this embodiment, once the parsing module 205 has obtained the main part of the question, the question generation template pre-stored in the memory of the server 2 is retrieved through the retrieval module 206, and the main part of the question is matched to the question generation template through the matching module 207, so as to generate an industry question.
Through the above program modules 205-207, the question bank generating system 200 provided by the invention can decompose the fact sentence into a tree-shaped grammatical structure to obtain a syntax tree of the fact sentence, analyze the syntax tree to obtain the main part of the question, and match the main part of the question to a question generation template to generate an industry question, thereby generating industry questions automatically and avoiding manual work.
Further, a third embodiment of the invention (as shown in fig. 4) is proposed based on the first embodiment of the question bank generating system 200 described above. In this embodiment, the question bank generating system 200 further includes a construction module 208 and a checking module 209, wherein:
the construction module 208 is configured to construct a deep-learning text classification model trained on labeled samples;
the checking module 209 is configured to check whether the industry question reads fluently by using the text classification model, and, if the industry question does not read fluently, to adjust the industry question by using the text classification model.
Specifically, as described above, the industry question is generated automatically by the server 2, which matches the main part of the question to the question generation template through the matching module 207. Because the industry question is generated automatically, its wording may not read fluently. To avoid this defect, in this embodiment the server 2 constructs a deep-learning text classification model trained on labeled samples through the construction module 208, and checks through the checking module 209 whether the industry question reads fluently by using that model; if the industry question does not read fluently, the checking module 209 adjusts the industry question by using the text classification model.
The construction module 208 is further configured to construct a term frequency-inverse document frequency (TFIDF) model;
the checking module 209 is further configured to check, by using the TFIDF model, whether a question similar to the industry question already exists in the question bank, and to delete the currently generated industry question if a similar question exists.
Generally, the same question can be worded in different ways, which leads to duplicate questions in the question bank. To avoid duplicates, when a question is generated, the server 2 further constructs a TFIDF model through the construction module 208 and uses it to check whether a similar question already exists in the question bank stored in the server 2. In this embodiment, each question text is represented as a vector by the TFIDF model; the pairwise similarity of all questions within each professional category is then calculated with the cosine similarity algorithm, and the questions are grouped according to a set threshold. If a similar industry question already exists in the question bank, the currently generated industry question is deleted, so that in the end only the semantically richest question is kept in each group.
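The duplicate check can be sketched with the standard library alone. The sketch covers only the comparison of a new question against the bank, not the selection of the semantically richest question. The tokenizer, the smoothed IDF formula, the threshold of 0.8 and the function names are assumptions; the embodiment only states that TFIDF vectors are compared by cosine similarity against a set threshold.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def tfidf_vectors(texts):
    """Represent each text as a sparse {term: weight} TFIDF vector."""
    docs = [tokenize(t) for t in texts]
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            # Smoothed IDF keeps weights positive even for terms found in every text.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in term_freq.items()
        })
    return vectors


def cosine(u, v):
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def is_duplicate(new_question, bank, threshold=0.8):
    """True if the new question is at least `threshold` similar to any banked question."""
    vectors = tfidf_vectors(list(bank) + [new_question])
    return any(cosine(vectors[-1], v) >= threshold for v in vectors[:-1])


bank = ["Please talk about data mining methods."]
print(is_duplicate("please talk about data mining methods", bank))
print(is_duplicate("Describe a nursing care plan.", bank))
```

A question flagged by `is_duplicate` would be discarded rather than stored, which matches the deletion step described above.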
Through the above program modules 208-209, the question bank generating system 200 provided by the invention can also check whether the industry question reads fluently by constructing a deep-learning text classification model trained on labeled samples, and can check whether a similar question already exists in the question bank by constructing a TFIDF model, deleting the currently generated industry question if one does, thereby ensuring that the industry questions in the question bank read fluently and are not duplicated.
In addition, the invention also provides a question bank generating method.
Fig. 5 is a schematic flow chart showing a first embodiment of the method for generating a question bank according to the present invention. In this embodiment, the execution order of the steps in the flowchart shown in fig. 5 may be changed and some steps may be omitted according to different requirements.
Step S301, acquiring job description information for positions posted on a recruitment website.
Generally, the professional questions for a written test or an interview are drawn up manually by the recruiter for each position. When the requirements of a position change or a new position is added, the questions for a written test must be redrafted and reprinted, and the questions for an online test must be redrafted and the test system updated, which causes the recruiter great inconvenience.
Therefore, in this embodiment, the server 2 logs in to the recruitment website and acquires the job description information from the recruitment website.
Step S302, converting the position description information into a factual sentence with complete information.
Specifically, the server 2 analyzes the job description information, extracts keywords of the job description information, and separates the job description information into several fact sentences having complete information.
In the present embodiment, the server 2 associates words relating to the same entity in the job description information by using coreference resolution. Each word is first treated as a class of its own; the words most likely to refer to the same entity are then merged according to the degree of closeness among the words, the words are clustered by successive aggregation, and the description is finally separated into several fact sentences with complete information. For example, the description of a certain position is: "Understands common statistical modeling, data mining, and machine learning methods, and can apply them to build models that solve practical problems and develop data products; familiar with software such as SAS, SPSSModeller, R, and Python; skilled with Oracle, SQLServer, Mysql, etc." Coreference resolution determines that "them" in the job description refers to "statistical modeling, data mining, and machine learning methods". After the words about the same entity are associated, four fact sentences are derived from the job description information: (1) Understands common statistical modeling, data mining, and machine learning methods. (2) Can apply common statistical modeling, data mining, and machine learning methods to build models that solve practical problems and develop data products. (3) Familiar with software such as SAS, SPSSModeller, R, and Python. (4) Skilled with Oracle, SQLServer, Mysql, etc.
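The successive aggregation described above can be illustrated with a minimal sketch. It assumes a standard bottom-up (agglomerative) scheme in which every word starts in its own cluster; the `closeness` function, the average-link rule, the threshold, and the toy scores are all hypothetical choices made for illustration, since the patent does not specify them.

```python
from itertools import combinations
from typing import Callable, List, Sequence


def agglomerate(words: Sequence[str],
                closeness: Callable[[str, str], float],
                threshold: float) -> List[List[str]]:
    """Merge the two closest clusters repeatedly until no pair is close enough."""
    clusters = [[w] for w in words]  # every word starts as its own cluster

    def link(a: List[str], b: List[str]) -> float:
        # Average-link closeness between two clusters (an assumed choice).
        return sum(closeness(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), link(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if best < threshold:
            break
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Toy closeness table (hypothetical): the pronoun is close to its antecedent.
SCORES = {frozenset(("them", "machine learning methods")): 0.9}


def toy_closeness(a: str, b: str) -> float:
    return SCORES.get(frozenset((a, b)), 0.0)


groups = agglomerate(["machine learning methods", "them", "SAS"],
                     toy_closeness, threshold=0.5)
```

In a real system the closeness score would come from a coreference model rather than a lookup table; the sketch only shows how merging proceeds once such scores are available.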
Step S303, generating an industry question from the fact sentence according to a question generation template.
Specifically, the server 2 pre-stores question generation templates in the memory. Taking the fact sentence "understands common statistical modeling, data mining, and machine learning methods" as an example, the rule template matching module selects an appropriate question generation template according to the tree query language expression (VP < ((VV < adept | understanding | familiar | mastering | understanding | possess | including | meeting) $. (NP | IP = unmov))), and finally generates the question "Please talk about statistical modeling, data mining, and machine learning methods."
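As a rough illustration of template-based generation, the following sketch works on the surface string instead of a syntax tree: a regular expression stands in for the tree query, and a single template is filled with the captured topic. The trigger verbs, the `TRIGGER` pattern, and the `generate_question` helper are assumptions for this example and are not taken from the patent.

```python
import re
from typing import Optional

# Hypothetical trigger verbs, loosely mirroring the verb list in the tree query.
TRIGGER = re.compile(
    r"^(?:understands|familiar with|skilled with|masters)\s+(?P<topic>.+?)\.?$",
    re.IGNORECASE,
)
# The only template wording given in the text is "Please talk about ...".
TEMPLATE = "Please talk about {topic}."


def generate_question(fact_sentence: str) -> Optional[str]:
    """Fill the template with the topic that follows a trigger verb, if any."""
    match = TRIGGER.match(fact_sentence.strip())
    if match is None:
        return None  # no template applies to this fact sentence
    return TEMPLATE.format(topic=match.group("topic"))


question = generate_question(
    "Understands statistical modeling, data mining, and machine learning methods."
)
```

A surface pattern like this is brittle compared with a query over a parsed tree, which is presumably why the described method parses the sentence first.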
Step S304, storing the industry questions by industry classification to generate question banks for different industries.
Specifically, the question bank stored in the memory of the server 2 is classified by industry, such as IT, medical treatment, intellectual property, and the like. The server 2 identifies the industry to which each industry question belongs and stores the question in the corresponding industry question bank, thereby updating the question bank in the memory.
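The storage step can be sketched as a dictionary of per-industry lists. How the server identifies the industry is not described, so the keyword lookup below, the `INDUSTRY_KEYWORDS` table, and the "uncategorized" fallback are purely hypothetical stand-ins.

```python
from collections import defaultdict
from typing import DefaultDict, List

# Hypothetical keyword lists; a real system would likely use a trained classifier.
INDUSTRY_KEYWORDS = {
    "IT": ("machine learning", "data mining", "sql"),
    "medical": ("clinical", "nursing"),
    "intellectual property": ("patent", "trademark"),
}


def identify_industry(question: str) -> str:
    """Pick the industry whose keywords occur most often in the question."""
    text = question.lower()
    scores = {
        industry: sum(keyword in text for keyword in keywords)
        for industry, keywords in INDUSTRY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


def store_question(bank: DefaultDict[str, List[str]], question: str) -> str:
    """Append the question to its industry's bank and return the industry."""
    industry = identify_industry(question)
    if question not in bank[industry]:
        bank[industry].append(question)
    return industry


bank: DefaultDict[str, List[str]] = defaultdict(list)
first = store_question(bank, "Please talk about data mining and machine learning methods.")
second = store_question(bank, "Please describe the patent examination process.")
```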
In other embodiments of the present invention, the server 2 further classifies the generated questions by degree of specialization, mainly into general questions, more specialized questions, and highly specialized questions.
In other embodiments of the present invention, the server 2 may further extract the keywords of the job description, expand them into related industry terms, and generate industry questions for the related terms.
Through steps S301 to S304, the question bank generating method provided by the invention first acquires the job description information of recruitment positions on a recruitment website; then converts the job description information into fact sentences with complete information; then generates industry questions from the fact sentences according to a question generation template; and finally stores the industry questions by industry classification to generate question banks for different industries. This avoids the need to compile question banks for interviews, written tests, or online tests manually for each recruitment position: industry questions are generated automatically from the job description information and stored by industry classification, which saves labor.
Further, based on the above first embodiment of the method for generating a question bank according to the present invention, a second embodiment of the method for generating a question bank according to the present invention is provided.
Fig. 6 is a schematic flow chart of a method for generating a question bank according to a second embodiment of the present invention.
In this embodiment, the step of generating the industry question from the fact sentence according to the question generation template specifically includes the following steps:
Step S401, decomposing the fact sentence into a tree-like syntax structure and analyzing it to obtain a syntax tree of the fact sentence.
Step S402, analyzing the syntax tree to obtain the main part of the question.
Specifically, the server 2 decomposes the fact sentence into a tree-like syntax structure of subject, predicate, and object, and analyzes that structure. In this embodiment, a PCFG (Probabilistic Context-Free Grammar), which consists of common syntax rules and the probability associated with each rule, is used for the syntax analysis. Each fact sentence may yield several candidate syntax trees; the server calculates the generation probability of each candidate according to the PCFG and selects the syntax tree with the highest probability as the syntax tree of the fact sentence.
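To make the selection rule concrete, the sketch below scores a candidate tree as the product of the probabilities of the rules it uses and keeps the highest-scoring candidate. The tuple encoding of trees, the `RULES` table, and its probabilities are invented for illustration; enumerating the candidate trees (the parsing itself) is outside the scope of this example.

```python
from math import prod
from typing import Dict, Tuple, Union

# A tree is either a word (str) or a tuple: (label, child, child, ...).
Tree = Union[str, tuple]
Rule = Tuple[str, Tuple[str, ...]]


def tree_probability(tree: Tree, rule_probs: Dict[Rule, float]) -> float:
    """Probability of a tree under a PCFG: product of its rule probabilities."""
    if isinstance(tree, str):
        return 1.0  # a bare word contributes no rule of its own
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_p = rule_probs.get((label, rhs), 0.0)  # unknown rule: probability 0
    return rule_p * prod(tree_probability(c, rule_probs) for c in children)


# Toy grammar with made-up probabilities.
RULES: Dict[Rule, float] = {
    ("VP", ("VV", "NP")): 0.7,
    ("VP", ("VV", "IP")): 0.3,
    ("VV", ("understands",)): 1.0,
    ("NP", ("methods",)): 0.5,
    ("IP", ("NP",)): 0.4,
}

tree_a = ("VP", ("VV", "understands"), ("NP", "methods"))
tree_b = ("VP", ("VV", "understands"), ("IP", ("NP", "methods")))

# Keep the candidate with the highest generation probability.
best_tree = max([tree_a, tree_b], key=lambda t: tree_probability(t, RULES))
```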
Furthermore, the server 2 queries the syntax tree with the tree query language Tregex to select the main part of the question. For example, for the fact sentence "understands common statistical modeling, data mining, and machine learning methods", the tree query expression "VP < (VV $ IP)" selects the main part "statistical modeling, data mining, and machine learning methods".
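The idea behind such a query can be sketched without Tregex itself. In Tregex, `A < B` means that A immediately dominates B and `A $ B` means that A is a sister of B; the hand-written matcher below looks for a VP that has a VV child and returns the words under a sister NP or IP. The tuple tree encoding, the `find_main_part` helper, and the example tree are assumptions for illustration, not the Tregex API.

```python
from typing import List, Optional, Sequence, Union

# A tree is either a word (str) or a tuple: (label, child, child, ...).
Tree = Union[str, tuple]


def leaves(tree: Tree) -> List[str]:
    """Collect the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


def find_main_part(tree: Tree,
                   verb_label: str = "VV",
                   target_labels: Sequence[str] = ("IP", "NP")) -> Optional[str]:
    """Return the words of the first NP/IP that is a sister of a VV under a VP."""
    if isinstance(tree, str):
        return None
    label, *children = tree
    subtrees = [c for c in children if not isinstance(c, str)]
    if label == "VP" and any(c[0] == verb_label for c in subtrees):
        for c in subtrees:
            if c[0] in target_labels:
                return " ".join(leaves(c))
    for c in subtrees:  # otherwise keep searching deeper in the tree
        found = find_main_part(c, verb_label, target_labels)
        if found is not None:
            return found
    return None


fact_tree = ("IP",
             ("VP",
              ("VV", "understands"),
              ("NP", ("NN", "statistical"), ("NN", "modeling"))))
main_part = find_main_part(fact_tree)
```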
In step S403, a question generation template pre-stored in the server 2 is called.
Step S404, matching the main part of the question to the question generation template to generate an industry question.
Specifically, as described in the first embodiment, the server 2 stores the question generation template in advance. In this embodiment, once the server 2 has obtained the main part of the question through analysis, it calls the question generation template pre-stored in its memory and matches the main part of the question to the template, thereby generating an industry question.
Through steps S401 to S404, the question bank generating method provided by the present invention can decompose the fact sentence into a tree-like syntax structure to obtain the syntax tree of the fact sentence, analyze the syntax tree to obtain the main part of the question, and then match the main part of the question to a question generation template to generate an industry question, so that industry questions are generated automatically and manual work is avoided.
Further, a third embodiment of the question bank generating method of the present invention is provided based on the above-mentioned first embodiment of the question bank generating method of the present invention.
Fig. 7 is a flow chart illustrating a third embodiment of the method for generating a question bank according to the present invention. In this embodiment, the method further includes:
Step S501, constructing a deep-learning text classification model trained on labeled samples.
Step S502, checking whether the industry question is fluent by using the deep-learning text classification model.
Step S503, if the industry question is not fluent, adjusting the industry question by using the deep-learning text classification model.
Specifically, as described above for the first embodiment, the server 2 generates an industry question automatically by matching the main part of the question to the question generation template through the matching module 206. Because the industry question is generated automatically, its wording may not read fluently. To avoid this defect, in this embodiment the server 2 constructs a deep-learning text classification model trained on labeled samples through the construction module 208, and the checking module 209 uses this model to check whether the industry question is fluent; if it is not, the checking module 209 adjusts the industry question by using the same model.
Step S504, a term frequency-inverse document frequency (TFIDF) model is constructed.
Step S505, checking whether a similar question already exists in the question bank by using the TFIDF model.
Step S506, if a similar industry question exists, deleting the currently generated industry question.
Generally, the same question may be phrased in different ways, which leads to duplicate questions in the question bank. To avoid duplicates, when a question is generated, the server 2 further constructs a TFIDF model through the construction module 208 and uses it to check whether a similar question already exists in the question bank stored in the server 2. In this embodiment, each question text is represented as a vector through the TFIDF model, the pairwise similarity of all questions within each professional category is then calculated with the cosine similarity algorithm, and the questions are grouped according to a set threshold. If a similar industry question already exists in the question bank, the currently generated industry question is deleted, so that only the one question with the richest semantics is kept in each group.
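The duplicate check can be sketched in plain Python as follows. The whitespace tokenizer, the smoothed IDF formula, and the 0.8 threshold are assumptions made for this example; the patent only states that TFIDF vectors are compared by cosine similarity against a set threshold.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector per document (smoothed IDF, assumed)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def is_duplicate(new_question: str, bank: List[str], threshold: float = 0.8) -> bool:
    """True if the new question is at least `threshold`-similar to a stored one."""
    vectors = tfidf_vectors(bank + [new_question])
    new_vec = vectors[-1]
    return any(cosine(new_vec, vec) >= threshold for vec in vectors[:-1])


stored = ["please talk about data mining methods",
          "describe your experience with oracle"]
repeat = is_duplicate("please talk about data mining methods", stored)
fresh = is_duplicate("what is a patent claim", stored)
```

Questions written in Chinese would need a word segmenter in place of `split()`, and the threshold would have to be tuned on real data.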
Through steps S501 to S506, the question bank generating method provided by the present invention can also check whether an industry question is fluent by constructing a deep-learning text classification model trained on labeled samples, and can check whether a similar question already exists in the question bank by constructing the TFIDF model. If a similar question exists, the currently generated industry question is deleted, which ensures that the industry questions in the question bank are fluent and not duplicated.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (8)

1. A question bank generating method, applied to a server, characterized by comprising the following steps:
acquiring the job description information of recruitment positions on a recruitment website;
converting the job description information into fact sentences with complete information;
generating an industry question from the fact sentence according to a question generation template;
storing the industry questions by industry classification to generate question banks for different industries;
wherein the step of converting the job description information into fact sentences with complete information specifically comprises the following steps:
classifying each word of the job description information as one category;
merging the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, and then separating them into several fact sentences with complete information.
2. The question bank generating method of claim 1, wherein the step of generating the industry question from the fact sentence according to the question generation template specifically comprises:
decomposing the fact sentence into a tree-like syntax structure and analyzing it to obtain a syntax tree of the fact sentence;
analyzing the syntax tree to obtain the main part of the question;
calling a question generation template pre-stored by the server;
matching the main part of the question to the question generation template to generate an industry question.
3. The question bank generating method of claim 2, wherein the step of retrieving the question generating template pre-stored by the server, matching the main part of the question, and generating an industry question further comprises the following steps:
constructing a deep-learning text classification model trained on labeled samples;
checking whether the industry question is fluent by using the deep-learning text classification model;
and if the industry question is not fluent, adjusting the industry question by using the deep-learning text classification model.
4. The question bank generating method of claim 1, wherein the step of storing the industry questions according to industry categories to generate question banks of different industries comprises the following steps:
constructing a term frequency-inverse document frequency (TFIDF) model;
checking whether a similar question already exists in the question bank by using the TFIDF model;
and if a similar industry question exists, deleting the currently generated industry question.
5. The question bank generating method according to any one of claims 1 to 3, wherein said question bank generating method further comprises the steps of:
extracting keywords of the job description;
expanding the keywords into related words;
and generating an industry question according to the related words.
6. A server, comprising a memory, a processor, the memory having a question bank generating system stored thereon, the question bank generating system being executable on the processor, the question bank generating system when executed by the processor performing the steps of:
acquiring the job description information of recruitment positions on a recruitment website;
converting the job description information into fact sentences with complete information;
generating an industry question from the fact sentence according to a question generation template;
storing the industry questions by industry classification to generate question banks for different industries;
wherein the step of converting the job description information into fact sentences with complete information specifically comprises the following steps:
classifying each word of the job description information as one category;
merging the words most likely to refer to the same entity according to the degree of closeness among the words;
and clustering the words by successive aggregation, and then separating them into several fact sentences with complete information.
7. The server according to claim 6, wherein the step of generating the industry question from the fact sentence according to the question generation template specifically comprises:
decomposing the fact sentence into a tree-like syntax structure and analyzing it to obtain a syntax tree of the fact sentence;
analyzing the syntax tree to obtain the main part of the question;
calling a question generation template pre-stored by the server;
matching the main part of the question to the question generation template to generate an industry question.
8. A computer-readable storage medium storing a question bank generating system executable by at least one processor to cause the at least one processor to perform the steps of the question bank generating method of any one of claims 1-5.
CN201711130606.7A 2017-11-15 2017-11-15 Question bank generating method, server and computer readable storage medium Active CN107943881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711130606.7A CN107943881B (en) 2017-11-15 2017-11-15 Question bank generating method, server and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711130606.7A CN107943881B (en) 2017-11-15 2017-11-15 Question bank generating method, server and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN107943881A CN107943881A (en) 2018-04-20
CN107943881B true CN107943881B (en) 2020-12-15

Family

ID=61931258

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711130606.7A Active CN107943881B (en) 2017-11-15 2017-11-15 Question bank generating method, server and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN107943881B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241534B (en) * 2018-09-12 2022-12-27 重庆工业职业技术学院 Examination question automatic generation method and device based on text AI learning
CN115881121A (en) * 2020-06-22 2023-03-31 广州小鹏汽车科技有限公司 Voice interaction method, server and computer-readable storage medium
CN113298488B (en) * 2021-04-30 2023-06-06 北京五八赶集信息技术有限公司 Industry problem library construction method, device, electronic equipment and computer readable medium
CN114240348A (en) * 2021-12-08 2022-03-25 中信银行股份有限公司 Method and system for testing matching degree of development talents based on AI word segmentation calculation

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101261690A (en) * 2008-04-18 2008-09-10 北京百问百答网络技术有限公司 A system and method for automatic problem generation
US8370278B2 (en) * 2010-03-08 2013-02-05 Microsoft Corporation Ontological categorization of question concepts from document summaries
US20130196305A1 (en) * 2012-01-30 2013-08-01 International Business Machines Corporation Method and apparatus for generating questions
GB2531720A (en) * 2014-10-27 2016-05-04 Ibm Automatic question generation from natural text
CN104978396A (en) * 2015-06-02 2015-10-14 百度在线网络技术(北京)有限公司 Knowledge database based question and answer generating method and apparatus
CN105512864A (en) * 2016-01-28 2016-04-20 丁沂 Method for automatically acquiring post professional ability requirements based on internet

Also Published As

Publication number Publication date
CN107943881A (en) 2018-04-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20180604

Address after: 518000 Room 201, building A, No. 1, Qian Wan Road, Qianhai Shenzhen Hong Kong cooperation zone, Shenzhen, Guangdong (Shenzhen Qianhai business secretary Co., Ltd.)

Applicant after: Shenzhen one ledger Intelligent Technology Co., Ltd.

Address before: 200030 Xuhui District, Shanghai Kai Bin Road 166, 9, 10 level.

Applicant before: Shanghai Financial Technologies Ltd

REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1250818

Country of ref document: HK

GR01 Patent grant