CN117473096A - Knowledge point labeling method fusing LATEX labels and model thereof - Google Patents
- Publication number
- CN117473096A (application CN202311834982.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/048—Activation functions
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a knowledge point labeling method fusing LATEX labels, and a model thereof, comprising the following steps: constructing a data set; inputting the original problem text in the constructed data set into a sentence encoder module and obtaining its outputs; feeding those outputs into a discipline knowledge fusion module, whose computed results serve respectively as the final semantic representations; inputting the final semantic representations into a gating screening module, whose output is the information of the original problem text finally retained under the influence of discipline knowledge information; and feeding that output into a linear layer with a sigmoid function to obtain the final classification probability vector, which is converted into predicted labels by a threshold classifier. The beneficial effects of the invention are as follows: two kinds of finer-grained discipline knowledge, namely LATEX label concepts and term types, are introduced, providing key information for labeling most knowledge points under unbalanced sample distribution.
Description
Technical Field
The invention relates to the field related to multi-label text classification tasks, in particular to a knowledge point labeling method and a model thereof fused with LATEX labels.
Background
Since the late 1990s, with the development of the internet and the massive generation of digitized information, researchers have extensively explored text classification, moving from traditional single-label methods to multi-label methods. In recent years, with the expansion of internet education and the growing demand for online learning, big data technology has become increasingly important in the education field, and problems (exercises) play a very important role in course teaching. Students' mastery of knowledge points is evaluated by analyzing the problems they solve, but accurately labeling the knowledge points examined by each problem is a key issue for optimizing problem base construction and personalized learning.
In the field of mathematics, mathematical knowledge points are the basic units of organization and transmission in mathematical education information, used to describe and express the core concepts and key points of the discipline. The problem knowledge point labeling task aims to label the core concepts and key points examined in a problem. Because these are not unique, the task can be regarded as a multi-label text classification task. However, the task suffers from unbalanced sample distribution, hierarchical labels, and a restricted domain. More critically, the specificity of mathematical discipline knowledge prevents models from deeply understanding the semantics of problem text. For example, problems exhibit symbolization, formulation, logical complexity, and condensed expression, which are the core difficulties of the problem knowledge point labeling task.
The number of knowledge point labels in the automatic labeling task is large, and statistics on sampled data show that most problem instances contain only 1 to 3 knowledge points, so the label space is sparse. This label sparsity causes existing models to label knowledge points with few training examples poorly, making model performance difficult to improve.
Most traditional knowledge point labeling methods combine statistics with machine learning algorithms; later works generate space vectors based on vector space models (Vector Space Model, VSM) and label knowledge points of domain texts by computing text similarity. However, such methods rely only on shallow features, ignore the contextual information of the text, depend excessively on a corpus, and generalize poorly. Deep learning methods based on word vector representations have therefore been proposed in recent years, but their word vectors are static and cannot effectively learn contextual representations for newly added training problems. With the advent of BERT (a deep learning model based on the attention mechanism), the word vector characterization problem was alleviated, and more and more works improve domain-model performance by embedding a pre-training framework.
Although directly embedding a pre-training framework is very powerful for vocabulary and semantic expression, its semantic encoding of domain-specific prior knowledge is poor, especially in the mathematical discipline domain. Therefore, recent work combines pre-trained models with the specificity of mathematical text, integrating prior knowledge such as mathematical symbols, formulas, and problem analysis, and thereby further improves performance on the problem knowledge point labeling task. However, when fusing prior knowledge, these models directly concatenate (Concat) the knowledge vector representation with the original problem text representation and send the concatenated result to the classifier; such explicit fusion actually introduces noise that interferes with the original semantic representation of the problem. Other methods label knowledge points on an intermediate representation obtained by cleaning and replacing the original problem text in advance using domain knowledge, which damages the complete semantic representation of the original text and loses effective-information features during classification.
Disclosure of Invention
In order to solve the problems, the invention provides a knowledge point labeling method and a model thereof integrating LATEX labels, which take the particularities of formulation, expression refining and the like of the representation of mathematical discipline knowledge into consideration, introduce two kinds of finer discipline knowledge, namely information of LATEX label concepts and term types, and further provide key information for labeling most knowledge points under the condition of unbalanced sample distribution.
The technical scheme of the invention is as follows: a knowledge point labeling method integrating LATEX labels comprises the following steps:
step S1, constructing a data set: collecting problems from junior middle school mathematics test papers and preprocessing the collected problems; labeling the knowledge points of the collected problems after preprocessing; and finally obtaining a problem data set, in which each problem is called an original problem text w;
step S2, inputting the original problem text w constructed in the step S1, and LATEX label concept text lc and term type text tt in the original problem text w into a sentence encoder module of the knowledge point automatic labeling model, and outputting an original problem text representation e and a LATEX label concept representation e as the output results lc And the term type represents e tt ;
Step S3, inputting the output results obtained in step S2 into the discipline knowledge fusion module, which uses a cross-attention mechanism to fuse the LATEX label concept representation e^lc and the term type representation e^tt respectively with the original problem text representation e, and outputs the deep semantic representation M^lc of the LATEX label concept and the deep semantic representation M^tt of the term type; the results of the average pooling operation in the discipline knowledge fusion module serve respectively as the final semantic representations of the LATEX label concept and the term type, namely the pooled representation p^lc of the LATEX label concept and the pooled representation p^tt of the term type;
Step S4, inputting the final semantic representations of step S3 into the gating screening module, which, through a gating screening mechanism that implicitly fuses the two kinds of discipline knowledge, retains with few parameters the key information related to discipline knowledge in the original problem text representation e; the output of the gating screening module is the information of the original problem text w finally retained under the influence of the LATEX label concept information and the term type information, referred to as the finally retained information e^cls-remain2;
Step S5, taking the finally retained information e^cls-remain2 output by the gating screening module in step S4 as the input of the prediction module, and passing it through a linear layer with a sigmoid function to obtain the final classification probability vector, which is the representation of the predicted labels and is converted into the predicted labels by a threshold classifier.
Further, in step S1, the data set is constructed specifically as follows:
step S11, collecting 16226 problems from 800 junior middle school mathematics test papers, the collected problems covering all knowledge points of junior middle school mathematics and four problem types: multiple-choice, fill-in-the-blank, free-response, and true-or-false;
step S12, preprocessing the collected problems: first performing invalid-character removal, deduplication, and completion cleaning operations on the problems to obtain 14200 problems; then using a mathematical formula recognition tool to convert formulas that exist as pictures into a formula format supported by Word;
step S13, after preprocessing, labeling the knowledge points of the problems in an automated manner; the labeled knowledge points come from two sources: on the one hand, query results from an online education platform, and on the other hand, a knowledge point grading standard constructed with reference to the People's Education Press junior middle school textbooks;
step S14, finally obtaining a data set containing 12073 problems through problem preprocessing and knowledge point labeling.
Further, in step S13, knowledge points of the problem are labeled, specifically:
step S131, finding the third-level knowledge points corresponding to each problem by means of the problem query function of the online education platform;
step S132, querying the first-, second-, and third-level knowledge points corresponding to the problems in the knowledge point grading standard;
step S133, taking the third-level knowledge points obtained from the online education platform as primary, filtering the third-level knowledge points queried from the knowledge point grading standard, and then querying the first-level and second-level knowledge points to which each third-level knowledge point belongs;
step S134, judging the similarity of the knowledge point labeling results of all problems by means of the Levenshtein similarity algorithm and a semantic similarity model, unifying labeling results with high similarity, and ensuring that the labeled knowledge points are not redundant;
and step S135, removing the knowledge points not examined in the senior high school entrance examination, together with their corresponding problems, according to the examination syllabus provided by junior middle school mathematics education experts.
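The similarity merging of step S134 can be sketched in pure Python with a normalized Levenshtein similarity. The `merge_labels` helper and the 0.85 threshold below are illustrative assumptions (the patent additionally uses a semantic similarity model, omitted here):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def similarity(a: str, b: str) -> float:
    # Normalized Levenshtein similarity in [0, 1].
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

def merge_labels(labels, threshold=0.85):
    # Greedily map each label to the first earlier label it closely
    # matches, so near-duplicate knowledge points are unified.
    canonical, mapping = [], {}
    for lab in labels:
        for c in canonical:
            if similarity(lab, c) >= threshold:
                mapping[lab] = c
                break
        else:
            canonical.append(lab)
            mapping[lab] = lab
    return mapping
```

A label differing from an earlier one only by a trailing character collapses onto it, while unrelated labels stay distinct.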
Further, in step S2, the sentence encoder module specifically includes:
step S21, the sentence encoder module selects RoBERTa, a robustly optimized BERT approach, as the pre-trained language model; the inputs of the sentence encoder module comprise the original problem text w, the LATEX label concept text lc, and the term type text tt, and the three share the parameters of the RoBERTa pre-trained language model;
step S22, treating the RoBERTa pre-trained language model as a function, with w_i the original problem text of the ith index, lc_i the LATEX label concept text of the ith index, and tt_i the term type text of the ith index, the specific calculation is shown in formula (1);

e_i = RoBERTa(w_i), e_i^lc = RoBERTa(lc_i), e_i^tt = RoBERTa(tt_i) (1);

wherein e_i is the vector representation obtained by passing the original problem text w_i of the ith index through the RoBERTa pre-trained language model, called the original problem text representation e_i of the ith index; e_i^lc is the vector representation of the LATEX label concept text of the ith index, called the LATEX label concept representation e_i^lc of the ith index; and e_i^tt is the vector representation of the term type text of the ith index, called the term type representation e_i^tt of the ith index;
step S23, extracting the last-layer hidden states of the pre-trained model as the text word vector representations, i.e. the original problem text representation e_i, the LATEX label concept representation e_i^lc, and the term type representation e_i^tt of the ith index.
Further, in step S3, the discipline knowledge fusion module specifically includes:
step S31, taking as input the last-layer text word vector representations output by the pre-trained model in the sentence encoder module;
step S32, using a cross-attention mechanism to fuse the LATEX label concept representation e_i^lc and the term type representation e_i^tt of the ith index respectively with the original problem text representation e_i of the ith index, and outputting the deep semantic representation M_i^lc of the LATEX label concept and the deep semantic representation M_i^tt of the term type of the ith index;
step S33, meanwhile, to let the knowledge point automatic labeling model learn stable feature representations in multiple independent feature spaces, a multi-head attention mechanism is introduced; the attention calculation is shown in formula (2) and formula (3);

head_ij^lc = softmax( (e_i^lc W_j^Q)(e_i W_j^K)^T / sqrt(d_K) ) (e_i W_j^V) (2);

M_i^lc = [head_i1^lc, ..., head_ih^lc], M_i^tt = [head_i1^tt, ..., head_ih^tt] (3);
wherein head_ij^lc is the feature representation of the jth attention calculation for the LATEX label concept representation of the ith index; softmax is the activation function converting unnormalized scores into a probability distribution; W_j^Q, W_j^K, W_j^V are the projection parameter matrices of the query, key, and value vectors in the jth attention calculation; T denotes the transpose; and d_K is the size of the second dimension of the original problem text representation e_i of the ith index;

head_ij^tt is the feature representation of the jth attention calculation for the term type representation of the ith index;

M_i^lc is the deep semantic representation of the LATEX label concept obtained by concatenating the results of h attention calculations on the LATEX label concept representation of the ith index, called the deep semantic representation M_i^lc of the LATEX label concept of the ith index; [·, ·] denotes the concatenation operation, and h is the number of attention calculations;

M_i^tt is the deep semantic representation of the term type obtained by concatenating the results of h attention calculations on the term type representation of the ith index, called the deep semantic representation M_i^tt of the term type of the ith index;
step S34, extracting the average pooling of the model's last-layer embedding vectors as the sentence information representation: the deep semantic representation M_i^lc of the LATEX label concept and the deep semantic representation M_i^tt of the term type of the ith index are average-pooled, and the results serve respectively as the final semantic characterizations of the LATEX label concept and the term type, as shown in formula (4);

p_i^lc = AvgPool(M_i^lc), p_i^tt = AvgPool(M_i^tt) (4);

wherein p_i^lc is the result of average pooling the deep semantic representation of the LATEX label concept of the ith index, called the pooled representation p_i^lc of the LATEX label concept of the ith index; p_i^tt is the result of average pooling the deep semantic representation of the term type of the ith index, called the pooled representation p_i^tt of the term type of the ith index; and AvgPool denotes the average pooling operation applied to M_i^lc and M_i^tt respectively.
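The fusion of formulas (2)–(4) can be sketched in NumPy as a single-example computation (shapes and weight tensors below are hypothetical; the real model operates on RoBERTa hidden states):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_head(e_know, e_text, W_q, W_k, W_v):
    # One head of formula (2): queries come from the knowledge text
    # (LATEX label concept or term type), keys/values from the
    # original problem text representation.
    Q, K, V = e_know @ W_q, e_text @ W_k, e_text @ W_v
    scores = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
    return scores @ V

def fuse_and_pool(e_know, e_text, heads):
    # Formula (3): concatenate the h heads into M_i; formula (4):
    # average-pool over the sequence dimension to get p_i.
    M = np.concatenate(
        [cross_attention_head(e_know, e_text, *w) for w in heads], axis=-1)
    return M.mean(axis=0)
```

With h = 2 heads of size 4, for example, this yields an 8-dimensional pooled vector p_i.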
Further, in step S4, the gating and screening module specifically includes:
step S41, the input data are the pooled representation p_i^lc of the LATEX label concept of the ith index and the pooled representation p_i^tt of the term type of the ith index;
step S42, a gating mechanism acting on the pooled representation p_i^lc of the LATEX label concept of the ith index and the CLS tag vector e_cls (a specially position-encoded vector representing the whole sequence or sentence meaning, used here as the sentence representation of the original problem text) calculates the proportion of original problem text information to be kept under the influence of the LATEX label concept information, so as to screen out the related key information in the original problem text; the calculation is shown in formula (5);

r_i^lc = σ(W_lc [e_cls ; p_i^lc] + b_lc), e_i^cls-remain1 = r_i^lc ⊙ e_cls (5);

wherein r_i^lc is the weight retained under the influence of the LATEX label concept information of the ith index; σ is the sigmoid activation function; W_lc is the learnable matrix applied to the concatenation of the CLS tag vector e_cls and the pooled representation p_i^lc of the LATEX label concept of the ith index; b_lc is a bias vector; and [e_cls ; p_i^lc] is the result of concatenating e_cls with p_i^lc;

e_i^cls-remain1 is the result of the element-wise multiplication of the retained weight r_i^lc and the CLS tag vector e_cls; it represents the information of the original problem text retained under the influence of the LATEX label concept information of the ith index, called the preliminarily retained information e_i^cls-remain1;
step S43, the information e_i^cls-remain2 of the original problem text finally retained under the influence of the LATEX label concept information and the term type information of the ith index is calculated as shown in formula (6);

r_i^tt = σ(W_tt [e_i^cls-remain1 ; p_i^tt] + b_tt), e_i^cls-remain2 = r_i^tt ⊙ e_i^cls-remain1 (6);

wherein r_i^tt is the weight retained under the influence of the term type information of the ith index; σ is the sigmoid activation function; the input is the preliminarily retained information e_i^cls-remain1; W_tt is the learnable matrix applied to the concatenation of e_i^cls-remain1 and the pooled representation p_i^tt of the term type of the ith index; b_tt is a bias vector; and [e_i^cls-remain1 ; p_i^tt] is the result of concatenating them;

e_i^cls-remain2 is the final output of the gating screening module, obtained by the element-wise multiplication of e_i^cls-remain1 and r_i^tt; it represents the information of the original problem text finally retained under the influence of the LATEX label concept information and the term type information of the ith index, called the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the ith index;
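The two-stage gate of formulas (5) and (6) amounts to two sigmoid gates applied in sequence. A minimal NumPy sketch with hypothetical dimensions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate(kept, pooled, W, b):
    # Formulas (5)/(6): a sigmoid gate over the concatenation of the
    # currently kept text vector and one pooled knowledge vector; the
    # gate value rescales the kept vector element-wise.
    r = sigmoid(W @ np.concatenate([kept, pooled]) + b)
    return r * kept

def gated_screening(e_cls, p_lc, p_tt, W_lc, b_lc, W_tt, b_tt):
    remain1 = gate(e_cls, p_lc, W_lc, b_lc)    # e^cls-remain1
    return gate(remain1, p_tt, W_tt, b_tt)     # e^cls-remain2
```

Because each gate value lies strictly in (0, 1), the module can only attenuate components of the sentence vector, never amplify them — the "screening" behavior the text describes.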
Step S5, taking the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the ith index, output by the gating screening module, as the input of the prediction module, and passing it through a linear layer with a sigmoid function to obtain the final classification probability vector, which is the representation of the predicted labels; a threshold classifier converts the classification probability vector into the predicted labels.
Further, the prediction module in step S5 specifically includes:
step S51, after the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the ith index, output by the gating screening module, is input into a linear layer with a sigmoid function, the final classification probability vector is obtained as shown in formula (7);

q_j = sigmoid(W_c e_i^cls-remain2 + b_c)_j (7);

wherein q_j is the jth classification probability obtained from the linear layer with the sigmoid function, sigmoid is the activation function, W_c is the learnable matrix applied to the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the ith index, and b_c is a bias vector;
step S52, introducing a classification threshold δ: the jth knowledge point label ŷ_j of the current problem is obtained by comparing the jth classification probability q_j with the classification threshold δ, as in formula (8);

ŷ_j = 1 if q_j > δ, otherwise ŷ_j = 0 (8);
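Formulas (7) and (8) together form the prediction head. A sketch in NumPy (W_c, b_c, and the δ = 0.5 default below are illustrative, not the patent's trained values):

```python
import numpy as np

def predict_labels(e_remain, W_c, b_c, delta=0.5):
    # Formula (7): a linear layer with a sigmoid gives one probability
    # per knowledge-point label; formula (8): threshold each at delta.
    probs = 1.0 / (1.0 + np.exp(-(W_c @ e_remain + b_c)))
    return (probs > delta).astype(int), probs
```

A label is emitted exactly when its sigmoid probability exceeds the threshold, so a single problem can receive several knowledge-point labels at once.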
step S53, adopting the distribution-balanced loss to balance the number of examples among all knowledge point labels; the specific loss function is calculated as shown in formula (9);

L_DB = (1/C) Σ_k Σ_j r̂_j^k [ y_j^k log(1 + e^{-(z_j^k - v_j)}) + (1/λ)(1 - y_j^k) log(1 + e^{λ(z_j^k - v_j)}) ] (9);

wherein L_DB is the resulting distribution-balanced loss; C is the total number of knowledge points; k indexes the kth problem in the data set; r̂_j^k is a rebalancing weight coefficient added during training to close the gap between the expected and actual sampling probabilities; y_j^k ∈ {0,1} is the true label of the jth knowledge point of the kth problem; log denotes the logarithm; z_j^k is the predicted logit of the jth knowledge point of the kth problem; v_j is a class-specific bias representing the intrinsic model bias; and λ is a decisive factor influencing the loss gradient, representing the degree of "tolerance" toward the classification logit z_j^k.
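A sketch of the per-problem distribution-balanced loss of formula (9), following the negative-tolerant form described above (variable names are illustrative; r_hat stands for the rebalancing weights):

```python
import numpy as np

def db_loss(z, y, r_hat, v, lam):
    # z: predicted logits per knowledge point; y: 0/1 true labels;
    # r_hat: rebalancing weights; v: class-specific bias; lam: the
    # "tolerance" factor applied to negative labels.
    pos = y * np.log1p(np.exp(-(z - v)))                     # positive-label term
    neg = (1.0 - y) / lam * np.log1p(np.exp(lam * (z - v)))  # tolerant negative term
    return float(np.mean(r_hat * (pos + neg)))
```

Logits that agree with the labels (large for positives, small for negatives) give a near-zero loss, while flipped logits are heavily penalized; raising lam makes the penalty on negative-label logits steeper.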
Further, the knowledge point automatic labeling model fusing LATEX labels, applied to the above knowledge point labeling method fusing LATEX labels, is mainly divided into four modules: the sentence encoder module, the discipline knowledge fusion module, the gating screening module, and the prediction module; the sentence encoder module is the first module of the knowledge point automatic labeling model, and the four modules are connected sequentially in series.
The invention has the advantages that: (1) Considering that the representation of mathematical discipline knowledge has particularities such as formulation and condensed expression, the invention introduces two kinds of finer-grained discipline knowledge, namely LATEX label concepts and term types, providing key information for labeling most knowledge points under the unbalanced distribution of the constructed problem data set.
(2) The invention designs a gating mechanism for the implicit fusion of discipline knowledge, which uses fewer parameters to retain the key information related to the two kinds of discipline knowledge in the original problem text representation, thereby reducing the noise generated during feature fusion.
(3) The problem knowledge point automatic labeling model fusing discipline knowledge introduces two refined kinds of mathematical discipline knowledge, LATEX label concepts and term types, as prompts; it updates their deep semantic characterizations with an attention mechanism, then implicitly fuses them through a gating mechanism without disturbing the original problem text representation, and balances the number of examples among all knowledge point labels with the distribution-balanced loss.
Drawings
FIG. 1 is a diagram of an overall model framework of the present invention.
Detailed Description
The invention constructs a junior middle school problem knowledge point labeling data set. First, text is collected from People's Education Press junior middle school mathematics textbooks and test papers to construct the data set; a large number of preprocessing operations clean and templatize the problems, and several experts carry out multiple rounds of knowledge point labeling, with a labeling consistency rate of 96.02%. Then, detailed experiments are carried out on the data set. The results show that the proposed knowledge point automatic labeling model: (1) improves the micro-F1, macro-F1 and weighted-F1 evaluation indexes by 1.99%, 2.99% and 2.12% respectively over the reference model; (2) improves the labeling effect for knowledge points with fewer training examples; and (3) exceeds the selected baselines in F1 value (an indicator for evaluating the performance of a classification model) in four sets of baseline comparison experiments based on different pre-trained models.
The technical scheme of the invention is as follows: a knowledge point labeling method integrating LATEX labels comprises the following steps:
step S1, constructing a data set, collecting problems in a junior middle school mathematics test paper, and preprocessing the collected problems; marking the knowledge points of the collected problems after pretreatment; finally, obtaining a problem data set, wherein any problem in the problem data set comprises two parts, one part is an original problem text w, and the other part is a real label Q;
Step S2, inputting the original problem text w constructed in step S1, together with the LATEX label concept text lc and the term type text tt in the original problem text w, into the sentence encoder module of the knowledge point automatic labeling model; the output results are the original problem text representation e, the LATEX label concept representation e^lc and the term type representation e^tt;
Step S3, inputting the output result obtained in step S2 into the discipline knowledge fusion module, and using a cross-attention mechanism to fuse the LATEX label concept representation e^lc and the term type representation e^tt respectively with the original problem text representation e; the output results are the deep semantic representation M^lc of the LATEX label concept and the deep semantic representation M^tt of the term type. The calculation results after the average pooling operation in the discipline knowledge fusion module serve respectively as the final semantic characterizations of the LATEX label concept and the term type, namely the pooling representation ē^lc of the LATEX label concept and the pooling representation ē^tt of the term type;
Step S4, inputting the final semantic characterizations from step S3 into the gating and screening module, which, through a gating and screening mechanism that implicitly fuses the two kinds of discipline knowledge, retains with few parameters the key information related to the discipline knowledge in the original problem text representation e; the output result of the gating and screening module is the information of the original problem text w finally retained under the influence of the LATEX label concept information and the term type information, abbreviated as the finally retained information e^cls-remain2;
Step S5, taking the finally retained information e^cls-remain2 output by the gating and screening module in step S4 as the input of the prediction module; the input passes through a linear layer with a sigmoid function to obtain the final classification probability vector, which is a representation of the predicted label and is converted into the predicted label by a threshold classifier.
Further, in step S1, the data set is constructed specifically as follows:
step S11, collecting 16226 problems from 800 junior middle school mathematics test papers, wherein the collected problems cover all knowledge points related to junior middle school mathematics and comprise four problem types: multiple-choice problems, fill-in-the-blank problems, free-response problems and true-or-false problems;
step S12, preprocessing the collected problems: first performing invalid-character removal, deduplication and completion cleaning operations on the problems to obtain 14200 problems, then adopting a mathematical formula recognition tool to convert the formulas existing in picture form into a formula format supported by Word;
step S13, labeling the knowledge points of the problems in an automated manner after preprocessing, wherein the labeled knowledge points are derived from two sources: on the one hand, the query results of an online education platform, and on the other hand, a knowledge point grading standard constructed with reference to the People's Education Press junior middle school textbooks;
Step S14, finally obtaining a data set containing 12073 problems through problem preprocessing and knowledge point labeling.
Further, in step S13, knowledge points of the problem are labeled, specifically:
step S131, finding a plurality of three-level knowledge points corresponding to the problems by means of the problem query function of the online education platform;
step S132, inquiring first, second and third knowledge points corresponding to the problems in the knowledge point grading standard;
step S133, taking the three-level knowledge points obtained from the online education platform as the primary ones, screening them against the three-level knowledge points queried from the knowledge point grading standard, and querying, from the three-level knowledge points, the first-level and second-level knowledge points to which they belong;
step S134, judging the similarity of the knowledge point labeling results of all problems by means of the Levenshtein similarity algorithm and a semantic similarity model, and unifying labeling results with high similarity to ensure that the labeled knowledge points are not redundant;
step S135, removing the knowledge points that are not examined in the high school entrance examination, together with the corresponding problems, according to the examination syllabus provided by junior middle school mathematics education experts.
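The similarity unification of step S134 can be sketched in plain Python. The Levenshtein ratio and the `unify_labels` helper below are illustrative stand-ins (the patent also combines a semantic similarity model, omitted here), with the 0.95 similarity threshold taken from the experimental settings later in the text.

```python
# Illustrative sketch of step S134: merging near-duplicate knowledge-point
# labels by Levenshtein similarity. Helper names are hypothetical.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]: 1 - distance / max length."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))

def unify_labels(labels, threshold=0.95):
    """Map each label to the first earlier label whose similarity exceeds the threshold."""
    canonical, mapping = [], {}
    for lab in labels:
        for c in canonical:
            if similarity(lab, c) >= threshold:
                mapping[lab] = c
                break
        else:
            canonical.append(lab)
            mapping[lab] = lab
    return mapping
```

For example, "solving linear equations" and "solving linear equation" differ by one edit over 24 characters (similarity ≈ 0.958), so they would be unified under the 0.95 threshold.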
Further, in step S2, the sentence encoder module specifically includes:
step S21, the sentence encoder module selects RoBERTa as the pre-trained language model, wherein RoBERTa is a Robustly Optimized BERT Pretraining Approach; the inputs of the sentence encoder module comprise the original problem text w, the LATEX label concept text lc and the term type text tt, the three sharing the parameters of the RoBERTa pre-trained language model;
Step S22, treating the RoBERTa pre-trained language model as a function, with w_i the original problem text of the i-th index, lc_i the LATEX label concept text of the i-th index and tt_i the term type text of the i-th index, the specific calculation process is shown in formula (1);
e_i = RoBERTa(w_i),  e_i^lc = RoBERTa(lc_i),  e_i^tt = RoBERTa(tt_i)    (1);
wherein e_i is the vector representation of the original problem text w_i of the i-th index obtained through the RoBERTa pre-trained language model, called the original problem text representation e_i of the i-th index; e_i^lc is the vector representation of the LATEX label concept text of the i-th index obtained through the RoBERTa pre-trained language model, called the LATEX label concept representation e_i^lc of the i-th index; e_i^tt is the vector representation of the term type text of the i-th index obtained through the RoBERTa pre-trained language model, called the term type representation e_i^tt of the i-th index;
Step S23, extracting the output of the last layer of the natural language processing model as the text word vector representations, namely the original problem text representation e_i of the i-th index, the LATEX label concept representation e_i^lc of the i-th index and the term type representation e_i^tt of the i-th index.
Further, in step S3, the discipline knowledge fusion module specifically comprises:
step S31, inputting the text word vector representations output by the last layer of the natural language processing model in the sentence encoder module;
step S32, using a cross-attention mechanism to fuse the LATEX label concept representation e_i^lc of the i-th index and the term type representation e_i^tt of the i-th index respectively with the original problem text representation e_i of the i-th index, and outputting as results the deep semantic representation M_i^lc of the LATEX label concept of the i-th index and the deep semantic representation M_i^tt of the term type of the i-th index;
step S33, meanwhile, to let the knowledge point automatic labeling model learn stable feature representations in several independent feature spaces, a multi-head attention mechanism is introduced; the final attention calculation process is shown in formula (2) and formula (3);
head_ij^lc = softmax( (e_i W_j^Q)(e_i^lc W_j^K)^T / √d_K ) (e_i^lc W_j^V),  head_ij^tt = softmax( (e_i W_j^Q)(e_i^tt W_j^K)^T / √d_K ) (e_i^tt W_j^V)    (2);
M_i^lc = head_i1^lc ⊕ head_i2^lc ⊕ … ⊕ head_ih^lc,  M_i^tt = head_i1^tt ⊕ head_i2^tt ⊕ … ⊕ head_ih^tt    (3);
wherein head_ij^lc is the feature representation of the j-th attention calculation for the LATEX label concept representation of the i-th index; softmax, as an activation function, converts the input unnormalized scores into a probability distribution; W_j^Q, W_j^K and W_j^V are the projection parameter matrices of the query, key and value vectors in the j-th attention calculation; T denotes the transpose applied to the product of the LATEX label concept representation e_i^lc of the i-th index and the key projection matrix W_j^K; d_K is the size of the second dimension of the original problem text representation e_i of the i-th index;
head_ij^tt is the feature representation of the j-th attention calculation for the term type representation of the i-th index;
M_i^lc is the deep semantic representation of the LATEX label concept obtained by cascading the results of h attention calculations on the LATEX label concept representation of the i-th index, called the deep semantic representation M_i^lc of the LATEX label concept of the i-th index; ⊕ represents the cascade (concatenation) operation, and h represents the number of attention calculations;
M_i^tt is the deep semantic representation of the term type obtained by cascading the results of h attention calculations on the term type representation of the i-th index, called the deep semantic representation M_i^tt of the term type of the i-th index;
Step S34, extracting the average pooling result of the last-layer embedding vectors of the natural language processing model as the sentence information representation: the deep semantic representation M_i^lc of the LATEX label concept of the i-th index and the deep semantic representation M_i^tt of the term type of the i-th index are average-pooled, and the calculation results serve respectively as the final semantic characterizations of the LATEX label concept and the term type, as shown in formula (4);
ē_i^lc = AvgPool(M_i^lc),  ē_i^tt = AvgPool(M_i^tt)    (4);
wherein ē_i^lc is the result of average pooling the deep semantic representation of the LATEX label concept of the i-th index, called the pooling representation ē_i^lc of the LATEX label concept of the i-th index; ē_i^tt is the result of average pooling the deep semantic representation of the term type of the i-th index, called the pooling representation ē_i^tt of the term type of the i-th index; AvgPool denotes the average pooling operation applied respectively to the deep semantic representation M_i^lc of the LATEX label concept of the i-th index and the deep semantic representation M_i^tt of the term type of the i-th index.
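The multi-head cross-attention and average pooling of formulas (2)-(4) can be sketched in NumPy. The embedding dimension (768) and head count (h = 6) match the experimental settings, but the random weights, token counts and function names are illustrative assumptions, and the attention direction (queries from the problem text, keys/values from the discipline-knowledge text) is an assumption consistent with the transpose described after formula (2); this is not the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax over the chosen axis.
    x = x - x.max(axis=axis, keepdims=True)
    ex = np.exp(x)
    return ex / ex.sum(axis=axis, keepdims=True)

def cross_attention_head(e, e_know, Wq, Wk, Wv):
    # One head of formula (2): queries from the problem text e,
    # keys/values from the discipline-knowledge text (LATEX label concept or term type).
    Q, K, V = e @ Wq, e_know @ Wk, e_know @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

def deep_semantic_representation(e, e_know, heads):
    # Formula (3): cascade (concatenate) the h attention heads.
    return np.concatenate([cross_attention_head(e, e_know, *w) for w in heads], axis=-1)

d, h = 768, 6                              # 768-dim embeddings, 6 heads
d_head = d // h
e = rng.standard_normal((12, d))           # original problem text tokens
e_lc = rng.standard_normal((4, d))         # LATEX label concept tokens
heads = [tuple(0.02 * rng.standard_normal((d, d_head)) for _ in range(3)) for _ in range(h)]

M_lc = deep_semantic_representation(e, e_lc, heads)  # deep semantic representation M^lc
pooled_lc = M_lc.mean(axis=0)                        # formula (4): AvgPool → pooling representation
```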
Further, in step S4, the gating and screening module specifically includes:
step S41, the input data are the pooling representation ē_i^lc of the LATEX label concept of the i-th index and the pooling representation ē_i^tt of the term type of the i-th index;
step S42, a gate acting on the pooling representation ē_i^lc of the LATEX label concept of the i-th index and the CLS tag vector e^cls (the CLS tag vector is a specially position-coded vector representing the meaning of the whole sequence or sentence, here used as the sentence representation replacing the original problem text) calculates the proportion of the original problem text information to be kept under the influence of the LATEX label concept information, so as to screen out the key information related to the original problem text; the calculation process is shown in formula (5);
r_i^lc = σ(W^lc [e^cls, ē_i^lc] + b^lc),  e_i^cls-remain1 = r_i^lc ⊙ e^cls    (5);
wherein r_i^lc is the weight value retained under the influence of the LATEX label concept information of the i-th index; σ is the activation function; W^lc is the learnable matrix applied to the concatenation of the CLS tag vector e^cls and the pooling representation ē_i^lc of the LATEX label concept of the i-th index; b^lc is a bias vector; [e^cls, ē_i^lc] is the result of concatenating the CLS tag vector e^cls and the pooling representation ē_i^lc of the LATEX label concept of the i-th index;
e_i^cls-remain1 is the result of multiplying the weight value r_i^lc, retained under the influence of the LATEX label concept information of the i-th index, by the CLS tag vector e^cls; it represents the information of the original problem text retained under the influence of the LATEX label concept information of the i-th index, abbreviated as the preliminarily retained information e_i^cls-remain1;
Step S43, the information of the original problem text finally retained under the influence of the LATEX label concept information of the i-th index and the term type information of the i-th index is e_i^cls-remain2; the calculation process is shown in formula (6);
r_i^tt = σ(W^tt [e_i^cls-remain1, ē_i^tt] + b^tt),  e_i^cls-remain2 = r_i^tt ⊙ e_i^cls-remain1    (6);
wherein r_i^tt is the weight value retained under the influence of the term type information of the i-th index; σ represents the sigmoid activation function, whose input is the preliminarily retained information e_i^cls-remain1; W^tt is the learnable matrix applied to the concatenation of the preliminarily retained information e_i^cls-remain1 and the pooling representation ē_i^tt of the term type of the i-th index; b^tt is a bias vector; [e_i^cls-remain1, ē_i^tt] is the result of concatenating the preliminarily retained information e_i^cls-remain1 and the pooling representation ē_i^tt of the term type of the i-th index;
e_i^cls-remain2 is then the final output of the gating and screening module, obtained by multiplying the preliminarily retained information e_i^cls-remain1 by r_i^tt; it represents the information of the original problem text finally retained under the influence of the LATEX label concept information of the i-th index and the term type information of the i-th index, called the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the i-th index;
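The two-stage gating of formulas (5) and (6) amounts to screening the CLS sentence vector first by the LATEX-label-concept gate and then by the term-type gate. A minimal NumPy sketch follows; the random matrices stand in for the learnable W^lc, W^tt and biases b^lc, b^tt, so the numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 768

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate(kept, pooled, W, b):
    # r = sigmoid(W [kept ; pooled] + b), then element-wise screening r * kept.
    r = sigmoid(W @ np.concatenate([kept, pooled]) + b)
    return r * kept

e_cls = rng.standard_normal(d)       # CLS tag vector of the problem text
pooled_lc = rng.standard_normal(d)   # pooling representation of the LATEX label concept
pooled_tt = rng.standard_normal(d)   # pooling representation of the term type

W_lc, b_lc = 0.02 * rng.standard_normal((d, 2 * d)), np.zeros(d)
W_tt, b_tt = 0.02 * rng.standard_normal((d, 2 * d)), np.zeros(d)

e_remain1 = gate(e_cls, pooled_lc, W_lc, b_lc)      # formula (5): preliminarily retained
e_remain2 = gate(e_remain1, pooled_tt, W_tt, b_tt)  # formula (6): finally retained
```

Because each gate value lies in (0, 1), every component of the retained vector can only shrink, which is how the mechanism keeps key information without letting the fused knowledge overwrite the sentence representation.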
Step S5, taking the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the i-th index, output by the gating and screening module, as the input of the prediction module; the input passes through a linear layer with a sigmoid function to obtain the final classification probability vector, which is a representation of the predicted label and can be converted into the predicted label by a threshold classifier.
Further, the prediction module in step S5 specifically includes:
step S51, the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the i-th index, output by the gating and screening module, is input into a linear layer with a sigmoid function to obtain the final classification probability vector, as shown in formula (7);
ẑ_j = sigmoid(W^c e_i^cls-remain2 + b^c)    (7);
wherein ẑ_j is the j-th classification probability obtained from the linear layer with the sigmoid function; sigmoid is the activation function; W^c is the learnable matrix applied to the finally retained information e_i^cls-remain2 under the influence of the discipline knowledge information of the i-th index; b^c is a bias vector;
step S52, a classification threshold δ is introduced; by judging the magnitude relation between the j-th classification probability ẑ_j, corresponding to the j-th knowledge point label of the current problem, and the classification threshold δ, the j-th knowledge point label ŷ_j corresponding to the current problem is obtained, as in formula (8);
ŷ_j = 1 if ẑ_j ≥ δ, and ŷ_j = 0 otherwise    (8);
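Formulas (7) and (8) amount to a sigmoid linear layer followed by thresholding, sketched below in NumPy. The label count C = 10 and the random weights are toy assumptions; the threshold δ = 0.5 follows the experimental settings stated later in the text.

```python
import numpy as np

rng = np.random.default_rng(2)
d, C = 768, 10   # C: number of knowledge point labels (toy value)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W_c = 0.02 * rng.standard_normal((C, d))   # stand-in for the learnable matrix W^c
b_c = np.zeros(C)                          # stand-in for the bias vector b^c
e_remain2 = rng.standard_normal(d)         # finally retained information from the gating module

z_hat = sigmoid(W_c @ e_remain2 + b_c)     # formula (7): classification probability vector
delta = 0.5                                # classification threshold
y_hat = (z_hat >= delta).astype(int)       # formula (8): threshold classifier
```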
step S53, adopting the distribution-balanced loss to balance the number of examples among all knowledge point labels, wherein the calculation of the specific loss function is shown in formula (9);
L_DB = (1/C) Σ_{j=1}^{C} r̂_j^k [ y_j^k log(1 + e^{−(z_j^k − v_j)}) + (1/λ)(1 − y_j^k) log(1 + e^{λ(z_j^k − v_j)}) ]    (9);
wherein L_DB represents the resulting distribution-balanced loss, C represents the total number of knowledge points, k indexes the k-th problem in the data set, r̂_j^k is a weighting coefficient added during training to close the gap between the expected and actual sampling probabilities, y_j^k represents the true label of the j-th knowledge point corresponding to the k-th problem, y_j^k ∈ {0,1}, log represents the logarithm, z_j^k represents the prediction output for the j-th knowledge point of the k-th problem, and v_j is a class-specific bias representing the intrinsic bias of the model; λ is a decisive factor influencing the loss gradient, representing the degree of "tolerance" toward the classification output z_j^k.
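The loss of formula (9) can be sketched in NumPy as follows, following the cited Distribution-Balanced Loss formulation: a re-balancing weight r̂ per label, a class-specific bias v_j, and a tolerance factor λ that softens the penalty on negative labels. The inputs below are toy values, not experiment data.

```python
import numpy as np

def db_loss(z, y, r_hat, v, lam):
    # Positive-label term: y * log(1 + e^{-(z - v)})
    pos = y * np.log1p(np.exp(-(z - v)))
    # Negative-label term with tolerance: (1/lam) * (1 - y) * log(1 + e^{lam * (z - v)})
    neg = (1.0 - y) / lam * np.log1p(np.exp(lam * (z - v)))
    # Average the re-weighted per-label penalties over the C knowledge points.
    return float(np.mean(r_hat * (pos + neg)))

z = np.array([2.0, -1.0, 0.5])   # model outputs for C = 3 knowledge points
y = np.array([1.0, 0.0, 1.0])    # true labels y_j^k
r_hat = np.ones(3)               # uniform re-balancing weights (toy case)
v = np.zeros(3)                  # class-specific biases v_j
loss = db_loss(z, y, r_hat, v, lam=2.0)
```

A larger λ lowers the penalty on well-separated negative labels, which is the "tolerance" the text describes.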
Further, the knowledge point automatic labeling model fusing LATEX labels is applied to the above knowledge point automatic labeling method fusing LATEX labels. It is divided into four modules arranged sequentially in series: a sentence encoder module, a discipline knowledge fusion module, a gating and screening module and a prediction module, with the sentence encoder module serving as the first module of the knowledge point automatic labeling model.
As shown in FIG. 1, the data required by the sentence encoder module are constructed first: the original problem text w is taken from the constructed mathematical data set and input, together with the LATEX label concept text lc and the term type text tt of the problem, into the sentence encoder module, the three sharing the module's parameters; after processing by the sentence encoder module, the output of the last layer of the natural language processing model (Transformer) is obtained as the text word vector representations, comprising the LATEX label concept representation e^lc, the term type representation e^tt and the original problem text representation e.
Then, the LATEX label concept representation e^lc, the term type representation e^tt and the original problem text representation e are input into the discipline knowledge fusion module, and a cross-attention mechanism fuses the LATEX label concept representation e^lc and the term type representation e^tt respectively with the original problem text representation e; the output results serve as the deep semantic representations updated by the two kinds of discipline knowledge, namely the deep semantic representation M^lc of the LATEX label concept and the deep semantic representation M^tt of the term type. Meanwhile, to let the model learn stable feature representations in several independent feature spaces, the invention introduces a multi-head attention mechanism, and average-pools the deep semantic representation M^lc of the LATEX label concept and the deep semantic representation M^tt of the term type to obtain respectively the pooling representation ē^lc of the LATEX label concept and the pooling representation ē^tt of the term type, which serve respectively as the final semantic characterizations of the LATEX label concept and the term type.
The pooling representation ē^lc of the LATEX label concept, the pooling representation ē^tt of the term type and the CLS tag vector e^cls are input into the gating and screening module. Here, multiple gating mechanisms are used in turn to control the amount of effective information of the original problem text that should be retained. First, a gate acting on the pooling representation ē^lc of the LATEX label concept and the CLS tag vector e^cls calculates the proportion of the original problem text information to be kept under the influence of the LATEX label concept information, so as to screen out the key information related to the original problem text; similarly, another gating mechanism considers the influence of the term type information and preserves the key information in the sentence representation, its input being the output of the previous gating mechanism. The finally retained information e^cls-remain2 then serves as the final output of the gating and screening module.
The classifier serves as the final prediction module: the finally retained information e^cls-remain2 output by the gating and screening module is input into a linear layer with a sigmoid activation function to obtain the j-th classification probability ẑ_j; a threshold classifier is introduced, and the predicted knowledge points are finally obtained through a label decoder.
Because most knowledge point labels in the data set correspond to relatively little problem data, and several knowledge point labels have only single-digit numbers of examples, the unbalanced label distribution greatly increases the complexity of the multi-knowledge-point labeling task. Therefore, the classification probability vector ẑ and the real labels Q of the problems shown in FIG. 1 are fed into the Distribution-Balanced Loss function (Distribution-Balanced Loss for Multi-Label Classification in Long-Tailed Datasets, DB-Loss) to balance the number of instances between knowledge point labels, where the loss is L_DB.
In the experiments, the knowledge point automatic labeling model adopts PyTorch as the deep learning framework. The text embedding dimensions of the original problem text, the LATEX label concept and the term type are all 768. The similarity threshold is set to 0.95, the number of heads h of the multi-head attention mechanism is set to 6, the initial learning rate is set to 0.00003, and the classification threshold δ is set to 0.5.
Claims (8)
1. A knowledge point labeling method integrating LATEX labels is characterized by comprising the following steps of: the method comprises the following steps:
Step S1, constructing a data set, collecting problems in junior middle school mathematics test papers, and preprocessing the collected problems; labeling the knowledge points of the collected problems after preprocessing; finally obtaining a problem data set, wherein each problem in the problem data set is called an original problem text w;
step S2, inputting the original problem text w constructed in step S1, together with the LATEX label concept text lc and the term type text tt in the original problem text w, into the sentence encoder module of the knowledge point automatic labeling model, and outputting the original problem text representation e, the LATEX label concept representation e^lc and the term type representation e^tt;
Step S3, inputting the output result obtained in step S2 into the discipline knowledge fusion module, and using a cross-attention mechanism to fuse the LATEX label concept representation e^lc and the term type representation e^tt respectively with the original problem text representation e; the output results are the deep semantic representation M^lc of the LATEX label concept and the deep semantic representation M^tt of the term type; the calculation results after the average pooling operation in the discipline knowledge fusion module serve respectively as the final semantic characterizations of the LATEX label concept and the term type, namely the pooling representation ē^lc of the LATEX label concept and the pooling representation ē^tt of the term type;
Step S4, inputting the final semantic characterizations from step S3 into the gating and screening module, which, through a gating and screening mechanism that implicitly fuses the two kinds of discipline knowledge, retains with few parameters the key information related to the discipline knowledge in the original problem text representation e; the output result of the gating and screening module is the information of the original problem text w finally retained under the influence of the LATEX label concept information and the term type information, abbreviated as the finally retained information e^cls-remain2;
Step S5, taking the finally retained information e^cls-remain2 output by the gating and screening module in step S4 as the input of the prediction module; the input passes through a linear layer with a sigmoid function to obtain the final classification probability vector, which is a representation of the predicted label and is converted into the predicted label by a threshold classifier.
2. The knowledge point labeling method fused with LATEX labels according to claim 1, wherein the method comprises the following steps: in step S1, the data set is constructed specifically as follows:
step S11, collecting 16226 problems from 800 junior middle school mathematics test papers, wherein the collected problems cover all knowledge points related to junior middle school mathematics and comprise four problem types: multiple-choice problems, fill-in-the-blank problems, free-response problems and true-or-false problems;
step S12, preprocessing the collected problems: first performing invalid-character removal, deduplication and completion cleaning operations on the problems to obtain 14200 problems, then adopting a mathematical formula recognition tool to convert the formulas existing in picture form into a formula format supported by Word;
step S13, labeling the knowledge points of the problems in an automated manner after preprocessing, wherein the labeled knowledge points are derived from two sources: on the one hand, the query results of an online education platform, and on the other hand, a knowledge point grading standard constructed with reference to the People's Education Press junior middle school textbooks;
step S14, finally obtaining a data set containing 12073 problems through problem preprocessing and knowledge point labeling.
3. The knowledge point labeling method fused with LATEX labels according to claim 2, wherein the method comprises the following steps: in step S13, the knowledge points of the problem are labeled, specifically:
step S131, finding a plurality of three-level knowledge points corresponding to the problems by means of the problem query function of the online education platform;
step S132, inquiring first, second and third knowledge points corresponding to the problems in the knowledge point grading standard;
step S133, taking the three-level knowledge points obtained from the online education platform as the primary ones, screening them against the three-level knowledge points queried from the knowledge point grading standard, and querying, from the three-level knowledge points, the first-level and second-level knowledge points to which they belong;
step S134, judging the similarity of the knowledge point labeling results of all problems by means of the Levenshtein similarity algorithm and a semantic similarity model, and unifying labeling results with high similarity to ensure that the labeled knowledge points are not redundant;
step S135, removing the knowledge points that are not examined in the high school entrance examination, together with the corresponding problems, according to the examination syllabus provided by junior middle school mathematics education experts.
4. A method for labeling knowledge points by fusing LATEX labels according to claim 3, wherein: the sentence encoder module in step S2 specifically includes:
step S21, the sentence encoder module selects RoBERTa as the pre-trained language model, wherein RoBERTa is a Robustly Optimized BERT Pretraining Approach; the inputs of the sentence encoder module comprise the original problem text w, the LATEX label concept text lc and the term type text tt, the three sharing the parameters of the RoBERTa pre-trained language model;
step S22, treating the RoBERTa pre-trained language model as a function, with w_i the original problem text of the i-th index, lc_i the LATEX label concept text of the i-th index and tt_i the term type text of the i-th index, the specific calculation process is shown in formula (1);
e_i = RoBERTa(w_i),  e_i^lc = RoBERTa(lc_i),  e_i^tt = RoBERTa(tt_i)    (1);
wherein e_i is the vector representation of the original problem text w_i of the i-th index obtained through the RoBERTa pre-trained language model, called the original problem text representation e_i of the i-th index; e_i^lc is the vector representation of the LATEX label concept text of the i-th index obtained through the RoBERTa pre-trained language model, called the LATEX label concept representation e_i^lc of the i-th index; e_i^tt is the vector representation of the term type text of the i-th index obtained through the RoBERTa pre-trained language model, called the term type representation e_i^tt of the i-th index;
Step S23, extracting the output of the last layer of the natural language processing model as the text word vector representations, namely the original problem text representation e_i of the i-th index, the LATEX label concept representation e_i^lc of the i-th index and the term type representation e_i^tt of the i-th index.
5. The knowledge point labeling method fusing LATEX labels according to claim 4, characterized in that the discipline knowledge fusion module in step S3 specifically comprises:
step S31, inputting the text word vector representations output by the last layer of the natural language processing model in the sentence encoder module;
step S32, using a cross-attention mechanism to fuse the LATEX label concept representation e_i^lc of the i-th index and the term type representation e_i^tt of the i-th index respectively with the original problem text representation e_i of the i-th index, and outputting as results the deep semantic representation M_i^lc of the LATEX label concept of the i-th index and the deep semantic representation M_i^tt of the term type of the i-th index;
step S33, meanwhile, to let the knowledge point automatic labeling model learn stable feature representations in several independent feature spaces, a multi-head attention mechanism is introduced; the final attention calculation process is shown in formula (2) and formula (3);
head_ij^lc = softmax( (e_i W_j^Q)(e_i^lc W_j^K)^T / √d_K ) (e_i^lc W_j^V),  head_ij^tt = softmax( (e_i W_j^Q)(e_i^tt W_j^K)^T / √d_K ) (e_i^tt W_j^V)    (2);
M_i^lc = head_i1^lc ⊕ head_i2^lc ⊕ … ⊕ head_ih^lc,  M_i^tt = head_i1^tt ⊕ head_i2^tt ⊕ … ⊕ head_ih^tt    (3);
wherein head_ij^lc is the feature representation of the j-th attention calculation for the LATEX label concept representation of the i-th index; softmax, as an activation function, converts the input unnormalized scores into a probability distribution; W_j^Q, W_j^K and W_j^V are the projection parameter matrices of the query, key and value vectors in the j-th attention calculation; T denotes the transpose applied to the product of the LATEX label concept representation e_i^lc of the i-th index and the key projection matrix W_j^K; d_K is the size of the second dimension of the original problem text representation e_i of the i-th index;
head_ij^tt is the feature representation of the j-th attention calculation for the term type representation of the i-th index;
M_i^lc is the deep semantic representation of the LATEX label concept obtained by cascading the results of h attention calculations on the LATEX label concept representation of the i-th index, called the deep semantic representation M_i^lc of the LATEX label concept of the i-th index; ⊕ represents the cascade (concatenation) operation, and h represents the number of attention calculations;
M_i^tt is the deep semantic representation of the term type obtained by cascading the results of h attention calculations on the term type representation of the i-th index, called the deep semantic representation M_i^tt of the term type of the i-th index;
Step S34, extracting the average pooling result of the last-layer embedding vectors of the natural language processing model as the sentence information representation: the deep semantic representation M_i^lc of the LATEX label concept of the i-th index and the deep semantic representation M_i^tt of the term type of the i-th index are average-pooled, and the calculation results serve respectively as the final semantic characterizations of the LATEX label concept and the term type, as shown in formula (4);
$\bar{e}_i^{lc} = \mathrm{AvgPool}(M_i^{lc})$, $\quad \bar{e}_i^{tt} = \mathrm{AvgPool}(M_i^{tt})$ (4);
wherein $\bar{e}_i^{lc}$ is the result of average pooling of the deep semantic representation of the LATEX label concept of the i-th index, called the pooled representation $\bar{e}_i^{lc}$ of the LATEX label concept of the i-th index; $\bar{e}_i^{tt}$ is the result of average pooling of the deep semantic representation of the term type of the i-th index, called the pooled representation $\bar{e}_i^{tt}$ of the term type of the i-th index; AvgPool denotes the average pooling operation applied respectively to the deep semantic representation $M_i^{lc}$ of the LATEX label concept of the i-th index and the deep semantic representation $M_i^{tt}$ of the term type of the i-th index.
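The average pooling of formula (4) reduces a sequence of token vectors to a single semantic characterization; a one-line NumPy sketch (the shape convention is an assumption):

```python
import numpy as np

def avg_pool(M):
    # M: (L, d) deep semantic representation over L tokens; returns the
    # (d,) mean vector used as the final semantic characterization.
    return M.mean(axis=0)
```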
6. The knowledge point labeling method fusing LATEX labels according to claim 5, characterized in that in step S4 the gating and screening module specifically comprises:
step S41, the input data are the pooled representation $\bar{e}_i^{lc}$ of the LATEX label concept of the i-th index and the pooled representation $\bar{e}_i^{tt}$ of the term type of the i-th index;
Step S42, a gate acting on the pooled representation $\bar{e}_i^{lc}$ of the LATEX label concept of the i-th index and the CLS tag vector $e_{cls}$ calculates the proportion of the original problem text information to be kept under the influence of the LATEX label concept information, so as to screen out the key information related to the original problem text; the calculation process is shown in formula (5);
$r_i^{lc} = \sigma\!\left(W_{lc}\,[e_{cls}, \bar{e}_i^{lc}] + b_{lc}\right)$, $\quad e_i^{cls\text{-}remain1} = r_i^{lc} \odot e_{cls}$ (5);
wherein $r_i^{lc}$ is the weight value retained under the influence of the LATEX label concept information of the i-th index; $\sigma$ is the activation function; $W_{lc}$ is the learnable matrix applied to the concatenation of the CLS tag vector $e_{cls}$ and the pooled representation $\bar{e}_i^{lc}$ of the LATEX label concept of the i-th index; $b_{lc}$ is a bias vector; $[e_{cls}, \bar{e}_i^{lc}]$ is the result of concatenating the CLS tag vector $e_{cls}$ and the pooled representation $\bar{e}_i^{lc}$ of the LATEX label concept of the i-th index;
$e_i^{cls\text{-}remain1}$ is the result of multiplying the weight value $r_i^{lc}$ retained under the influence of the LATEX label concept information of the i-th index by the CLS tag vector $e_{cls}$; it represents the information of the original problem text retained under the influence of the LATEX label concept information of the i-th index, and is referred to as the preliminary retained information $e_i^{cls\text{-}remain1}$;
Step S43, the information $e_i^{cls\text{-}remain2}$ of the original problem text finally retained under the influence of the LATEX label concept information of the i-th index and the term type information of the i-th index is computed; the calculation process is shown in formula (6);
$r_i^{tt} = \sigma\!\left(W_{tt}\,[e_i^{cls\text{-}remain1}, \bar{e}_i^{tt}] + b_{tt}\right)$, $\quad e_i^{cls\text{-}remain2} = r_i^{tt} \odot e_i^{cls\text{-}remain1}$ (6);
wherein $r_i^{tt}$ is the weight value retained under the influence of the term type information of the i-th index; $\sigma$ denotes the sigmoid activation function, whose input is built from the preliminary retained information $e_i^{cls\text{-}remain1}$; $W_{tt}$ is the learnable matrix applied to the concatenation of the preliminary retained information $e_i^{cls\text{-}remain1}$ and the pooled representation $\bar{e}_i^{tt}$ of the term type of the i-th index; $b_{tt}$ is a bias vector; $[e_i^{cls\text{-}remain1}, \bar{e}_i^{tt}]$ is the result of concatenating the preliminary retained information $e_i^{cls\text{-}remain1}$ and the pooled representation $\bar{e}_i^{tt}$ of the term type of the i-th index;
$e_i^{cls\text{-}remain2}$ is the final output of the gating and screening module, obtained by multiplying the preliminary retained information $e_i^{cls\text{-}remain1}$ by $r_i^{tt}$; it represents the information of the original problem text finally retained under the influence of the LATEX label concept information of the i-th index and the term type information of the i-th index, and is referred to as the final retained information $e_i^{cls\text{-}remain2}$ under the influence of the subject knowledge information of the i-th index;
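The two-stage gated screening of formulas (5) and (6) can be sketched as follows; weight shapes and variable names are illustrative assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_screen(e_cls, pooled_lc, pooled_tt, W_lc, b_lc, W_tt, b_tt):
    # Stage 1, formula (5): gate the CLS vector by the pooled
    # LATEX label concept representation.
    r_lc = sigmoid(W_lc @ np.concatenate([e_cls, pooled_lc]) + b_lc)
    remain1 = r_lc * e_cls  # preliminary retained information
    # Stage 2, formula (6): gate the preliminary retained information
    # by the pooled term type representation.
    r_tt = sigmoid(W_tt @ np.concatenate([remain1, pooled_tt]) + b_tt)
    return r_tt * remain1   # final retained information
```

With zero weights and biases both gates equal 0.5, so the output is 0.25 times the CLS vector, which makes the element-wise gating behavior easy to verify.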
Step S5, the final retained information $e_i^{cls\text{-}remain2}$ under the influence of the subject knowledge information of the i-th index, output by the gating and screening module, is used as the input of the prediction module; it is passed through a linear layer with a sigmoid function to obtain the final classification probability vector, which is a representation of the predicted label and can be converted into the predicted label by a threshold classifier.
7. The knowledge point labeling method fusing LATEX labels according to claim 6, characterized in that the prediction module in step S5 comprises the following specific steps:
step S51, the final retained information $e_i^{cls\text{-}remain2}$ under the influence of the subject knowledge information of the i-th index, output by the gating and screening module, is input to a linear layer with a sigmoid function to obtain the final classification probability vector, as shown in formula (7);
$p_j = \mathrm{sigmoid}\!\left(W_c\, e_i^{cls\text{-}remain2} + b_c\right)_j$ (7);
wherein $p_j$ is the j-th classification probability obtained from the linear layer with the sigmoid function; sigmoid is the activation function; $W_c$ is the learnable matrix applied to the final retained information $e_i^{cls\text{-}remain2}$ under the influence of the subject knowledge information of the i-th index; $b_c$ is a bias vector;
step S52, a classification threshold $\delta$ is introduced; the j-th classification probability $p_j$ obtained from the linear layer with the sigmoid function, corresponding to the j-th knowledge point label of the current problem, is compared with the classification threshold $\delta$ to obtain the j-th knowledge point label $\hat{y}_j$ of the current problem, as in formula (8);
$\hat{y}_j = \begin{cases} 1, & p_j > \delta \\ 0, & p_j \le \delta \end{cases}$ (8);
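The prediction step, linear layer plus sigmoid followed by thresholding, is a few lines of NumPy; shapes and the default threshold value are illustrative assumptions:

```python
import numpy as np

def predict(e_remain, W_c, b_c, delta=0.5):
    # e_remain: (d,) final retained information; W_c: (C, d) learnable
    # matrix over C knowledge points; b_c: (C,) bias.
    # Returns per-label probabilities (formula (7)) and the 0/1
    # knowledge point labels after thresholding at delta (formula (8)).
    p = 1.0 / (1.0 + np.exp(-(W_c @ e_remain + b_c)))
    return p, (p > delta).astype(int)
```

For example, with zero weights and biases of 2 and -2, the two logits are 2 and -2, giving probabilities on either side of 0.5 and labels 1 and 0.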
step S53, a distribution-balanced loss is adopted to balance the number of instances among all knowledge point labels; the specific loss function is calculated as shown in formula (9);
$L_{DB} = -\frac{1}{C}\sum_{j=1}^{C}\hat{r}_j^{\,k}\left[y_j^{k}\log\!\left(\frac{1}{1+e^{-(z_j^{k}-v_j)}}\right)+\frac{1}{\lambda}\,(1-y_j^{k})\log\!\left(\frac{1}{1+e^{\lambda(z_j^{k}-v_j)}}\right)\right]$ (9);
wherein $L_{DB}$ denotes the resulting distribution-balanced loss; C denotes the total number of knowledge points; k denotes the k-th problem in the data set; $\hat{r}$ is a weighting coefficient added in training to close the gap between the expected and actual sampling probabilities; $y_j^{k}$ denotes the true label of the j-th knowledge point of the k-th problem, $y_j^{k} \in \{0,1\}$; log denotes the logarithm; $z_j^{k}$ denotes the predicted probability of the j-th knowledge point of the k-th problem; $v_j$ is a class-specific bias representing the bias of the natural model; $\lambda$ is a determining factor influencing the loss gradient, representing the degree of "tolerance" to the classification probability $z_j^{k}$.
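A sketch of a distribution-balanced loss with the ingredients named above (rebalancing weights $\hat{r}$, class-specific biases $v_j$, tolerance factor $\lambda$). This follows the commonly used rebalanced, negative-tolerant binary cross-entropy formulation; it is an assumption-laden illustration, not the patent's exact expression, and the weights r_hat are taken as given:

```python
import numpy as np

def db_loss(z, y, r_hat, v, lam=2.0):
    # z: raw scores z_j^k for one problem k, shape (C,)
    # y: 0/1 true labels y_j^k, shape (C,)
    # r_hat: per-label rebalancing weights, shape (C,)
    # v: class-specific biases v_j; lam: tolerance factor lambda.
    # Positive labels use a standard log-sigmoid term; negative labels
    # use a lambda-scaled (tolerant) log-sigmoid term.
    pos = y * np.log(1.0 / (1.0 + np.exp(-(z - v))))
    neg = (1.0 - y) / lam * np.log(1.0 / (1.0 + np.exp(lam * (z - v))))
    return -np.sum(r_hat * (pos + neg)) / z.size
```

As a sanity check, with z = v, unit weights and lam = 1, every label contributes log 2 regardless of its sign, so the loss is exactly log 2.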
8. An automatic knowledge point labeling model fusing LATEX labels, applied to the knowledge point labeling method fusing LATEX labels, characterized in that: it mainly comprises four modules, namely a sentence encoder module, a subject knowledge fusion module, a gating and screening module and a prediction module; the sentence encoder module is the first module of the automatic knowledge point labeling model, and the four modules are connected sequentially in a serial structure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311834982.XA CN117473096B (en) | 2023-12-28 | 2023-12-28 | Knowledge point labeling method fusing LATEX labels and model thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117473096A true CN117473096A (en) | 2024-01-30 |
CN117473096B CN117473096B (en) | 2024-03-15 |
Family
ID=89638326
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311834982.XA Active CN117473096B (en) | 2023-12-28 | 2023-12-28 | Knowledge point labeling method fusing LATEX labels and model thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117473096B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160063323A1 (en) * | 2014-09-02 | 2016-03-03 | Abbyy Development Llc | Methods and systems for processing of images of mathematical expressions |
CN109299281A (en) * | 2018-07-06 | 2019-02-01 | 浙江学海教育科技有限公司 | The mask method of knowledge point label |
JP2020161111A (en) * | 2019-03-27 | 2020-10-01 | ワールド ヴァーテックス カンパニー リミテッド | Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus |
CN112580361A (en) * | 2020-12-18 | 2021-03-30 | 蓝舰信息科技南京有限公司 | Formula based on unified attention mechanism and character recognition model method |
CN113420543A (en) * | 2021-05-11 | 2021-09-21 | 江苏大学 | Automatic mathematical test question labeling method based on improved Seq2Seq model |
CN116244445A (en) * | 2022-12-29 | 2023-06-09 | 中国航空综合技术研究所 | Aviation text data labeling method and labeling system thereof |
CN116578665A (en) * | 2022-12-29 | 2023-08-11 | 成都索贝数码科技股份有限公司 | Method and equipment for jointly extracting extensible text information based on prompt learning |
Non-Patent Citations (3)
Title |
---|
MINGWEN WANG 等: "Improved Chinese Word Segmentation Algorithm of Quantitative Units in Elementary Mathematics Application Problems", 《 2021 7TH ANNUAL INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC)》, 8 April 2022 (2022-04-08), pages 493 - 9 * |
LUO WENBING et al.: "Robust Extraction of Middle School Mathematical Terms Based on Dependency Structure Learning", Journal of Chinese Information Processing, 14 December 2023 (2023-12-14), pages 75 - 85 *
GUO CHONGHUI; LYU ZHENGDA: "A Multi-Knowledge-Point Labeling Method for Test Questions Based on Ensemble Learning", Operations Research and Management Science, no. 02, 25 February 2020 (2020-02-25), pages 133 - 140 *
Also Published As
Publication number | Publication date |
---|---|
CN117473096B (en) | 2024-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107273490B (en) | Combined wrong question recommendation method based on knowledge graph | |
CN113656570B (en) | Visual question-answering method and device based on deep learning model, medium and equipment | |
CN106469560B (en) | Voice emotion recognition method based on unsupervised domain adaptation | |
CN110532557B (en) | Unsupervised text similarity calculation method | |
CN112508334A (en) | Personalized paper combining method and system integrating cognitive characteristics and test question text information | |
CN113962219A (en) | Semantic matching method and system for knowledge retrieval and question answering of power transformer | |
CN114969275A (en) | Conversation method and system based on bank knowledge graph | |
CN114818717A (en) | Chinese named entity recognition method and system fusing vocabulary and syntax information | |
CN111582506A (en) | Multi-label learning method based on global and local label relation | |
CN113420543B (en) | Mathematical test question automatic labeling method based on improved Seq2Seq model | |
CN113343690A (en) | Text readability automatic evaluation method and device | |
CN112347780B (en) | Judicial fact finding generation method, device and medium based on deep neural network | |
CN115659947A (en) | Multi-item selection answering method and system based on machine reading understanding and text summarization | |
CN118152547B (en) | Robot answer method, medium and system according to understanding capability of questioner | |
CN114722833A (en) | Semantic classification method and device | |
CN112966518B (en) | High-quality answer identification method for large-scale online learning platform | |
CN113901224A (en) | Knowledge distillation-based secret-related text recognition model training method, system and device | |
CN112749566B (en) | Semantic matching method and device for English writing assistance | |
CN117034921B (en) | Prompt learning training method, device and medium based on user data | |
CN117473096B (en) | Knowledge point labeling method fusing LATEX labels and model thereof | |
CN116306653A (en) | Regularized domain knowledge-aided named entity recognition method | |
CN114239575B (en) | Statement analysis model construction method, statement analysis method, device, medium and computing equipment | |
CN116362247A (en) | Entity extraction method based on MRC framework | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN117216617A (en) | Text classification model training method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||