
CN113505589B - MOOC learner cognitive behavior recognition method based on BERT model - Google Patents

MOOC learner cognitive behavior recognition method based on BERT model

Info

Publication number
CN113505589B
CN113505589B (application CN202110802482.2A)
Authority
CN
China
Prior art keywords
mooc
learner
bert
model
cognitive
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110802482.2A
Other languages
Chinese (zh)
Other versions
CN113505589A (en)
Inventor
刘智
刘三女牙
杨宗凯
孔玺
陈浩
戴志诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central China Normal University
Original Assignee
Central China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central China Normal University
Priority to CN202110802482.2A
Publication of CN113505589A
Application granted
Publication of CN113505589B
Legal status: Active (current)
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/10 - Text processing
    • G06F40/166 - Editing, e.g. inserting or deleting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a MOOC learner cognitive behavior recognition method based on a BERT model, which comprises the following steps: acquiring learner discussion text data from MOOC forums and generating a specialized corpus of the MOOC comment domain; preprocessing the corpus to generate pre-training data containing MOOC-domain knowledge; retraining the BERT model on the pre-training data with the MLM and NSP strategies to obtain MOOC-BERT; constructing a MOOC learner cognitive behavior annotation data set; and fine-tuning the parameters and weights in MOOC-BERT with the annotated data set to generate a cognitive behavior recognition model for MOOC learners. The method improves the recognition of learners' cognitive behaviors in online learning environments and can effectively help teachers analyze the cognitive behavior types of MOOC learners in large-scale settings.

Description

MOOC learner cognitive behavior recognition method based on BERT model
Technical Field
The invention belongs to the fields of natural language processing technology and education data mining, and particularly relates to a MOOC learner cognitive behavior recognition method based on a BERT model.
Background
Learner cognitive behavior refers to the behavioral tendencies abstracted from interactive utterances; it characterizes the thinking activities that occur during learning interaction, such as the application of cognitive skills like analysis, decomposition, and reconstruction when solving problems. Discussion forums, instant dialogues, and similar channels are the main carriers through which learners achieve cognitive processing via collaborative discussion, and the rich cognitive behavior information generated during interaction is recorded in the discussion texts. Recognizing and analyzing the cognitive behaviors contained in these texts helps teachers better understand learners' cognitive processing.
MOOCs provide learners with rich learning resources, interactive contexts, and flexible learning time, and offer an important avenue for informal learning. However, because teachers and students are separated in time and space in online learning, problems such as the lack of effective teacher-student interaction and cognitive guidance, low interaction quality, and high dropout rates persist. MOOC forums provide an important place for teacher-student communication, and the large-scale text data generated during interaction offer a reliable data source for recognizing cognitive behaviors. Automatic analysis of discussion texts makes it possible to track how MOOC learners cognitively process the course content, which informs teachers' intervention strategies and supports improving cognitive guidance for online learners, raising interaction quality, and reducing dropout rates.
Current cognitive behavior recognition methods require a variety of features to be manually extracted from large amounts of text in advance and then fed into machine learning models such as Support Vector Machines (SVM), Naive Bayes, and Random Forest. However, MOOC courses are huge in scale and diverse in type, and their Chinese text is rich in information; moreover, recognizing complex learner behaviors such as cognition from utterances depends heavily on contextual information. Existing methods suffer from cumbersome feature extraction and difficulty in capturing latent semantics, so their recognition performance is hard to improve effectively.
The pre-trained language model BERT, one of the best-performing natural language processing algorithms at present, makes accurate recognition of MOOC learners' cognitive behaviors possible. BERT achieves top performance on many tasks through unsupervised training on large-scale corpora. However, because the education domain contains a large number of abstract concepts and terms, a BERT model trained only on general corpora still leaves room for improvement on the learner cognitive behavior recognition task.
Disclosure of Invention
In view of the shortcomings of the prior art and the need for improvement, the invention provides a MOOC learner cognitive behavior recognition method based on a BERT model, which is intended to improve the recognition of learners' cognitive behaviors in online learning environments. The invention has two main objectives: first, to retrain the BERT model with a MOOC-comment-domain corpus; and second, to use the retrained model to recognize and analyze the cognitive behaviors of MOOC learners.
A MOOC learner cognitive behavior recognition method based on a BERT model comprises the following steps:
(1) Obtain a large amount of unlabeled learner discussion text from major Chinese MOOC forums using a web crawler, and preprocess the collected large-scale discussion text to obtain the MOOC-comment-domain corpus required for BERT model pre-training.
(2) Process the domain corpus generated in step (1) to generate pre-training data.
(3) Starting from the Chinese BERT checkpoint BERT-base-chinese, jointly pre-train the model on the training data from step (2) with the Masked Language Model (MLM) and Next Sentence Prediction (NSP) objectives to obtain the MOOC-BERT model.
(4) Collect and annotate a training corpus for the learner cognitive behavior recognition model, assigning a cognitive behavior label to each sentence-level text to form a cognitive behavior data set.
(5) Fine-tune the parameters in the MOOC-BERT model with the cognitive behavior data set from step (4). During training, dropout is added to prevent overfitting, and a Softmax regression function is added after BERT to classify the cognitive behavior labels of sentence-level discussion texts (a minimal sketch of this classifier head is given below). A model for identifying the cognitive behaviors of MOOC learners is thus ultimately obtained.
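As a concrete illustration of step (5), the following is a minimal sketch of such a classifier head in PyTorch, assuming the Hugging Face transformers library; the checkpoint path "path/to/mooc-bert" and the dropout rate are illustrative assumptions rather than values fixed by the invention.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class CognitiveBehaviorClassifier(nn.Module):
    """BERT encoder followed by dropout and a softmax over the cognitive behavior labels."""

    def __init__(self, encoder_name="path/to/mooc-bert", num_labels=4, dropout=0.1):
        super().__init__()
        self.bert = BertModel.from_pretrained(encoder_name)   # assumed local MOOC-BERT checkpoint
        self.dropout = nn.Dropout(dropout)                    # added to reduce overfitting
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(input_ids=input_ids,
                            attention_mask=attention_mask,
                            token_type_ids=token_type_ids)
        pooled = self.dropout(outputs.pooler_output)          # [CLS] sentence representation
        logits = self.classifier(pooled)
        return torch.softmax(logits, dim=-1)                  # probabilities over the labels
```

During training one would feed the pre-softmax logits to a cross-entropy loss; the softmax output shown here corresponds to the label probabilities used at prediction time.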
In the above technical scheme, the specific process of generating the MOOC-BERT model in the step (3) is as follows:
(3-1) Read the pre-training data tfrecord file; during pre-training, the parameters and weights of the model's embedding layers and output layers are jointly trained with MLM and NSP. MLM randomly masks some characters and predicts them so that each token representation captures contextual information; NSP randomly samples two sentences and predicts whether the second follows the first. Through these two training tasks, the model learns the characteristics of the MOOC-domain corpus (a minimal code sketch of this joint pre-training follows step (3-3));
(3-2) Train with a layer-wise learning-rate decay scheme, where the bottom layers of the model use a low learning rate to search for the optimal solution and the top layers use a high learning rate to accelerate learning;
(3-3) Compute the Loss value with the cross-entropy loss function in each iteration and iterate until the Loss is minimized, finally obtaining a BERT model that incorporates MOOC-domain knowledge.
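A minimal sketch of the joint MLM + NSP retraining described in step (3-1), using the Hugging Face transformers library as a stand-in for the original TensorFlow BERT pre-training scripts; the example sentence pair and the masked position are illustrative placeholders only.

```python
import torch
from transformers import BertTokenizer, BertForPreTraining

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertForPreTraining.from_pretrained("bert-base-chinese")   # MLM + NSP heads

# A toy sentence pair standing in for a MOOC forum post and its reply.
enc = tokenizer("这门课程的案例讲得很清楚", "老师的分析帮助我理解了这个概念",
                return_tensors="pt")

# MLM labels: mask one position and mark every other position as ignored (-100).
mlm_labels = torch.full_like(enc["input_ids"], -100)
masked_pos = 3                                                    # illustrative position
mlm_labels[0, masked_pos] = enc["input_ids"][0, masked_pos]
enc["input_ids"][0, masked_pos] = tokenizer.mask_token_id

# NSP label: 0 means the second sentence really follows the first, 1 means it is random.
nsp_label = torch.tensor([0])

out = model(**enc, labels=mlm_labels, next_sentence_label=nsp_label)
out.loss.backward()                                               # joint MLM + NSP loss
```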
In the above technical scheme, the specific process of obtaining and labeling the training corpus of the learner cognitive behavior recognition model in the step (4) is as follows:
(4-1) Select the forum data of a suitable MOOC course, preprocess it, and select the data to be annotated to form a data set;
(4-2) Have two coders label the sentence-level texts according to the cognitive behavior coding framework, with each text receiving one label;
(4-3) Perform a consistency test after labeling to ensure annotation quality, finally obtaining the learner cognitive behavior data set.
In the above scheme, the specific process of performing fine tuning training on parameters in the MOOC-BERT model in the step (5) is as follows:
(5-1) Preprocess the text in the cognitive behavior data set, adding [CLS] before each text sequence and [SEP] at the end of the sentence, where [CLS] and [SEP] are the tokens marking the start and end in BERT training, respectively;
(5-2) Split the data set into training, validation, and test sets (6:2:2), fine-tune MOOC-BERT on it, and add a Softmax regression function on the BERT output layer to output the cognitive behavior label, finally obtaining the learner cognitive behavior recognition model;
(5-3) Verify the model with the F value and accuracy to demonstrate the effectiveness of the learner cognitive behavior recognition method.
The invention builds a large-scale unlabeled corpus by collecting learner discussion data from Chinese MOOC forums, uses this corpus to further pre-train the Chinese BERT checkpoint (BERT-base-chinese), and obtains a pre-trained language model, MOOC-BERT, that incorporates knowledge from Chinese MOOC learner discussions. The parameters in MOOC-BERT are then fine-tuned with the labeled learner cognitive behavior data set, and the final accuracy of recognizing MOOC learners' cognitive behaviors exceeds 85%. The method requires no manual feature extraction and thus avoids the feature-engineering burden of traditional machine learning methods. In addition, MOOC-BERT learns more domain-specific high-level semantic features, achieves better recognition performance on tasks such as cognitive behavior recognition that depend heavily on context, and generalizes well. In actual deployment, the learner's cognitive behavior type can be output for any given text to be analyzed, as sketched below. The method can effectively help teachers analyze the cognitive behavior types of MOOC learners at scale and discover students' cognitive processing patterns, thereby providing a basis for formulating personalized learning strategies and intervention strategies for MOOC learners.
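A minimal deployment sketch of this prediction step, assuming the fine-tuned model has been saved locally; the path "path/to/mooc-bert-finetuned", the example post, and the English label names are illustrative assumptions.

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

LABELS = ["triggering", "exploration", "integration", "resolution"]  # cognitive behavior phases

tokenizer = BertTokenizer.from_pretrained("path/to/mooc-bert-finetuned")   # assumed local path
model = BertForSequenceClassification.from_pretrained("path/to/mooc-bert-finetuned")
model.eval()

text = "我查阅了相关文献，发现两种解释可以结合起来回答这个问题"   # example forum post to be analyzed
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print(LABELS[int(probs.argmax())], probs.tolist())                 # predicted behavior + probabilities
```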
Drawings
Fig. 1 is a workflow diagram of the method for recognizing cognitive behaviors of a MOOC learner based on a BERT model according to the present invention.
FIG. 2 is a diagram of the model for recognizing the cognitive behaviors of MOOC learners in accordance with the present invention.
FIG. 3 is a graph of recognition results of cognitive behaviors of MOOC learners in accordance with the present invention.
FIG. 4 is a diagram of the cognitive behavior coding framework for MOOC learners in accordance with the present invention.
Detailed Description
For a clearer description of the objects, technical solutions, procedures and advantages of the present invention, the details of the present invention will be further described with reference to the accompanying drawings in conjunction with the specific embodiments. It should be understood that the embodiments described herein are for better explaining the present invention, and are not limiting of the present invention.
The embodiment of the invention is a method for identifying cognitive behaviors of a MOOC learner based on a BERT model, which is further described with reference to FIG. 1, and comprises the following steps:
(1) Large-scale learner forum discussion data are obtained from the Chinese University MOOC forums to form the BERT pre-training corpus:
(1-1) The Chinese University MOOC course forums record a large amount of text data generated when learners discuss and communicate with peers and teachers. To obtain these text data, a crawler program is designed with Python modules such as requests and re, and the various discussion data (topics, discussions, replies) in the Chinese University MOOC forums are downloaded locally. In total, 10.7 GB of discussion text data from course categories such as economics, science and technology, humanities, psychology, art, and computer science are obtained and saved as .txt files.
(1-2) Based on the above discussion text data, regular expressions are built with Python's re module to delete blank lines, html code blocks, special characters, and the like from the saved files, yielding the MOOC comment corpus used to generate the pre-training data.
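A hedged sketch of this collection-and-cleaning step using requests and re; the forum URL, the HTML handling, and the character filters below are illustrative assumptions, since the real page layout and cleaning rules are not specified in the patent.

```python
import re
import requests

def fetch_page(url: str) -> str:
    """Download the raw HTML of one forum page."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text

def clean_text(raw_html: str) -> str:
    """Strip html blocks, special characters, and blank lines from a page."""
    text = re.sub(r"<[^>]+>", "", raw_html)                       # drop html tags / code blocks
    text = re.sub(r"[^\u4e00-\u9fa5A-Za-z0-9，。！？、：；\s]", "", text)  # drop special characters
    text = re.sub(r"\n\s*\n+", "\n", text)                        # collapse blank lines
    return text.strip()

if __name__ == "__main__":
    page = fetch_page("https://www.icourse163.org/forum/example-thread")   # hypothetical thread URL
    with open("mooc_corpus.txt", "a", encoding="utf-8") as f:
        f.write(clean_text(page) + "\n")
```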
(2) Generating pre-training data from the corpus obtained in the step (1):
(2-1) Sentences in the corpus are segmented into single characters, and the Chinese characters are vectorized. The beginning of a sentence is marked with [CLS], and two sentences are separated by [SEP].
(2-2) When the sentence length is less than 256, [PAD] is used for filling; when it exceeds 256, the sentence is truncated to the predetermined number of characters.
(2-3) To improve training efficiency, 10% of the characters in each sentence are randomly masked. Of the masked positions, 80% are replaced with [MASK], 10% are replaced with other characters, and 10% keep the original character unchanged. The processed data are finally saved as mooc_documents.tfrecord.
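A minimal sketch of the masking rule in step (2-3): mask 10% of the characters in a sentence, and of those masked positions replace 80% with [MASK], 10% with a random character, and leave 10% unchanged. The placeholder vocabulary and example sentence are illustrative assumptions.

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", vocab=None, mask_ratio=0.10):
    vocab = vocab or ["的", "课", "程", "学", "习"]       # placeholder vocabulary for random replacement
    tokens = list(tokens)
    labels = [None] * len(tokens)                         # original character at each masked position
    n_to_mask = max(1, int(len(tokens) * mask_ratio))
    for pos in random.sample(range(len(tokens)), n_to_mask):
        labels[pos] = tokens[pos]
        r = random.random()
        if r < 0.8:
            tokens[pos] = mask_token                      # 80%: replace with [MASK]
        elif r < 0.9:
            tokens[pos] = random.choice(vocab)            # 10%: replace with another character
        # remaining 10%: keep the original character unchanged
    return tokens, labels

masked, labels = mask_tokens(list("这门课程的案例分析很有帮助"))
print(masked, labels)
```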
(3) Pre-training the BERT again by using the pre-training data obtained in the step (2) to generate a MOOC-BERT model fusing the knowledge in the MOOC field:
(3-1) The BERT model exists in multiple language versions; this embodiment selects the BERT-base-chinese version. Because bert-base-chinese is trained on general corpora such as Chinese Wikipedia, it lacks knowledge of the education-related domain; retraining it on the MOOC corpus allows it to perform better on education-related natural language processing tasks.
(3-2) Read mooc_documents.tfrecord, feed the data in the file into the initialized BERT model, and start pre-training. In this process, the parameters and weights of every layer of the model are adjusted with the combination of the MLM and NSP tasks, where MLM predicts the masked characters, i.e., the parts replaced by [MASK], and NSP predicts the ordering relation between two sentences.
(3-3) To keep the learning rate from staying too high, the learning rate is decayed exponentially with the number of training steps so that the gradient-descent step length converges. During training, the parameters of each layer of the model satisfy formula (1):
θ_m^(n) = θ_m^(n-1) - λ_m · g_m^(n) (formula 1)
where θ_m^(n) denotes the parameters of the m-th layer of the model at time step n, λ_m denotes the learning rate of the m-th layer, and g_m^(n) is the gradient of the loss with respect to those parameters. The layer-wise learning rates λ_m are required to satisfy formula (2):
λ_(m-1) = α × λ_m (formula 2)
where α is the decay factor, set to α = 0.96; the initial learning rate is set to 2e-5 and is reduced continuously after a fixed number of iterations (a code sketch of this schedule follows step (3-4)).
(3-4) The loss value is computed throughout training and approaches its minimum when the model has iterated to 500,000 steps. The loss in the MLM task is masked_lm_loss = 0.12968734 with masked_lm_accuracy = 0.9648933; the loss in the NSP task is next_sentence_loss = 5.480657e-05 with next_sentence_accuracy = 1.0. The MOOC-BERT model that incorporates MOOC-domain knowledge is thus obtained and used for the subsequent cognitive behavior recognition task.
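A hedged sketch of the schedule in formulas (1) and (2), written in PyTorch; the mapping of layers to parameter groups via the Hugging Face naming convention ("bert.embeddings.", "bert.encoder.layer.<m>.") is an assumption, since the patent's own implementation is TensorFlow-based.

```python
import torch
from transformers import BertForPreTraining

model = BertForPreTraining.from_pretrained("bert-base-chinese")
alpha, top_lr = 0.96, 2e-5
num_layers = model.config.num_hidden_layers           # 12 for bert-base

param_groups = []
for name, param in model.named_parameters():
    if "embeddings" in name:
        depth = 0                                      # bottom of the network
    elif "encoder.layer." in name:
        depth = int(name.split("encoder.layer.")[1].split(".")[0]) + 1
    else:
        depth = num_layers + 1                         # pooler / pre-training heads (top)
    # formula (2): each layer below the top runs at alpha times the rate of the layer above it
    param_groups.append({"params": [param], "lr": top_lr * alpha ** (num_layers + 1 - depth)})

optimizer = torch.optim.AdamW(param_groups)            # optimizer.step() performs the per-layer update of formula (1)
# exponential decay of every group's learning rate, stepped at a fixed training interval
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=alpha)
```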
(4) Selecting a forum text in a proper MOOC course to carry out labeling work of a learner cognitive behavior training corpus:
(4-1) This embodiment selects the forum data of the sixth offering of a psychology course on the Chinese University MOOC platform. The learners in this course discuss with each other actively and generated 198,112 forum posts in total. After preprocessing operations such as deduplication, cleaning, filtering, and html code block removal, 12,000 discussion posts are randomly selected as annotation samples.
(4-2) Six coders are recruited, trained, and paired into three groups, with 4,000 texts to be annotated by each group. Each coder labels every text according to the Community of Inquiry theory; the coding framework is shown in Fig. 4. Cognitive presence is an important theoretical framework for characterizing learners' cognitive behaviors, with four dimensions running from low-order to high-order cognition: triggering, exploration, integration, and resolution.
(4-3) Each group first labels a small number of samples, and SPSS is used to check whether the inter-rater consistency (Kappa coefficient) between the two coders exceeds 0.75. The Kappa coefficients of all three groups are above 0.8, indicating good labeling consistency (a code sketch of this check follows step (4-4)). Kappa judgment criteria: Kappa ≥ 0.75 indicates good agreement between the two coders; 0.4 ≤ Kappa < 0.75 indicates moderate agreement; Kappa < 0.4 indicates poor agreement.
(4-4) Summarizing the marked texts to form a learner cognitive behavior data set.
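A minimal sketch of the inter-rater consistency check in step (4-3), computing Cohen's kappa with scikit-learn as a stand-in for SPSS; the example label lists are placeholders, and the thresholds mirror the criteria listed above.

```python
from sklearn.metrics import cohen_kappa_score

coder_a = ["triggering", "exploration", "integration", "resolution", "exploration"]   # placeholder labels
coder_b = ["triggering", "exploration", "integration", "exploration", "exploration"]

kappa = cohen_kappa_score(coder_a, coder_b)
if kappa >= 0.75:
    verdict = "good agreement"
elif kappa >= 0.4:
    verdict = "moderate agreement"
else:
    verdict = "poor agreement"
print(f"kappa = {kappa:.3f}: {verdict}")
```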
(5) Fine tuning parameters in the MOOC-BERT model to obtain a learner cognitive behavior recognition model:
(5-1) Each row of the data set is first preprocessed, adding [CLS] at the beginning of the sentence and [SEP] at the end.
(5-2) The 12,000 labeled samples are divided into a training set, a validation set, and a test set in a 6:2:2 ratio. A Softmax function is appended to the MOOC-BERT output layer, fine-tuning is then carried out to adjust the parameters in MOOC-BERT, and the model outputs the probability of each learner cognitive behavior category (triggering, exploration, integration, resolution) for a discussion text; the model structure is shown in Fig. 2.
(5-3) This embodiment uses accuracy and the F value (F-measure) as evaluation metrics, and the model parameters are optimized until the best result is obtained. Compared with traditional machine learning algorithms, the proposed learner cognitive behavior recognition model achieves the highest accuracy and F value, 85.68% and 85.65% respectively, which demonstrates its effectiveness. The comparison results are shown in Fig. 3 (a code sketch of the fine-tuning and evaluation steps follows step (5-4)).
(5-4) Given the discussion texts of learners to be analyzed, their implicit cognitive behavior types can be identified quickly, accurately, and in batches. At any stage of MOOC teaching, a teacher can understand learners' cognitive behaviors and learning states by analyzing their discussion data, can further study the trends and structure of learners' cognitive behaviors with analysis tools such as statistical analysis and cognitive network analysis, and can formulate personalized teaching strategies to promote learner-centered learning.
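A hedged sketch of steps (5-2) and (5-3), assuming the Hugging Face transformers and scikit-learn libraries; the placeholder posts, the checkpoint path "path/to/mooc-bert", the single-example training loop, and macro-averaged F are illustrative simplifications of the actual setup (12,000 samples, batched training, unspecified F averaging).

```python
import torch
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from transformers import BertTokenizer, BertForSequenceClassification

# Placeholder data: the embodiment uses 12,000 labeled posts over 4 label ids.
texts = ["占位讨论文本 %d" % i for i in range(10)]
labels = [i % 4 for i in range(10)]

# 6:2:2 split into training / validation / test sets.
train_x, rest_x, train_y, rest_y = train_test_split(texts, labels, test_size=0.4, random_state=42)
val_x, test_x, val_y, test_y = train_test_split(rest_x, rest_y, test_size=0.5, random_state=42)

tokenizer = BertTokenizer.from_pretrained("path/to/mooc-bert")                 # assumed local checkpoint
model = BertForSequenceClassification.from_pretrained("path/to/mooc-bert", num_labels=4)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for text, label in zip(train_x, train_y):                                      # one example per step, for brevity
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    loss = model(**batch, labels=torch.tensor([label])).loss                   # cross-entropy over the 4 labels
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Evaluation on the held-out test set with accuracy and macro F-measure.
model.eval()
preds = []
with torch.no_grad():
    for text in test_x:
        batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
        preds.append(int(model(**batch).logits.argmax(dim=-1)))
print("accuracy:", accuracy_score(test_y, preds))
print("macro F:", f1_score(test_y, preds, average="macro"))
```

In practice the fine-tuning would run over batched DataLoader iterations for several epochs, with the validation set used to pick the best parameters; the loop above only illustrates the forward and backward step.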
Aiming at the difficulties of feature extraction and the limited recognition accuracy of current cognitive behavior recognition methods, the invention further pre-trains BERT on a large-scale unlabeled corpus collected from the Chinese University MOOC platform, thereby constructing MOOC-BERT, which is applied to the cognitive behavior analysis of MOOC learners after fine-tuning. On the one hand, the invention avoids the feature-engineering problems of traditional machine learning algorithms such as SVM, Naive Bayes, and Random Forest; on the other hand, through learning from the large-scale unlabeled corpus, BERT acquires specialized knowledge of the MOOC and education domains, and its semantic representations are enhanced. Therefore, on tasks such as cognitive behavior recognition that depend on contextual information, the model achieves higher recognition accuracy and stability.
What is not described in detail in this specification is prior art known to those skilled in the art. The foregoing description of the preferred embodiments of the invention is not intended to limit the invention to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the invention are intended to be included within the scope of the invention.

Claims (4)

1. The MOOC learner cognitive behavior recognition method based on the BERT model is characterized by comprising the following steps of:
(1) Acquiring learner discussion text data in the MOOC forum, and generating professional corpus in the MOOC comment field;
(2) Preprocessing the corpus to generate MOOC domain expertise pre-training data;
(3) Retraining the BERT model by using an MLM and NSP strategy in combination with the pre-training data to obtain MOOC-BERT; the method specifically comprises the following steps:
(3-1) training parameters and weights of Embedding layers and output layers of the model in a combined manner by using MLM and NSP in the pre-training process; training by using a learning rate layer-by-layer attenuation mode, searching an optimal solution by adopting a low learning rate at a bottom layer of a model, and accelerating learning by adopting a high learning rate at a top layer of the model; in the training process, parameters of each layer of the model need to meet the formula (1):
θ_m^(n) = θ_m^(n-1) - λ_m · g_m^(n) (formula 1)
wherein θ_m^(n) represents the parameters of the m-th layer of the model at time step n, λ_m represents the learning rate of the m-th layer, and g_m^(n) is the gradient; λ_m is then required to satisfy formula (2):
λ_(m-1) = α × λ_m (formula 2)
wherein α represents the decay factor, set to α = 0.96; the initial learning rate is set to 2e-5, and the learning rate is continuously reduced after a fixed number of iterations;
(3-2) calculating a Loss value in each iteration process by using a cross entropy Loss function, iterating until the Loss value is reduced to the minimum, and finally obtaining a BERT model fused with the professional knowledge in the MOOC field;
(4) Constructing a MOOC learner cognitive behavior annotation data set;
(5) And fine-tuning parameters and weights in the MOOC-BERT by using the annotation data set to generate a cognitive behavior recognition model for the MOOC learner, wherein the model can effectively recognize the cognitive behavior type implicit in the interaction words of the MOOC learner.
2. The method for identifying cognitive behaviors of a MOOC learner based on a BERT model according to claim 1, wherein the step (1) of generating a specialized corpus in the field of MOOC comments specifically comprises:
(1-1) using the requests and re modules in Python to design a crawler program and download the various discussion data in the MOOC forum locally;
(1-2) after merging all data, removing html code blocks, blank lines, and irrelevant characters to form a corpus containing MOOC-domain knowledge, and storing the corpus as a .txt file.
3. The method for recognizing cognitive behaviors of a MOOC learner based on a BERT model according to claim 1, wherein generating the MOOC-domain expertise pre-training data in the step (2) specifically comprises:
(2-1) segmenting sentences into individual characters and vectorizing the Chinese characters;
(2-2) setting the maximum length to 256, i.e., when the sentence length is less than 256 the remainder is filled with [PAD] symbols, and any portion beyond 256 is truncated;
(2-3) randomly masking 10% of the characters in each sentence, replacing 80% of the masked characters with [MASK], keeping 10% unchanged, and randomly replacing 10% with other characters, and finally generating a tfrecord file for storage.
4. The method for recognizing cognitive behaviors of a MOOC learner based on a BERT model according to claim 1, wherein the "fine-tuning parameters and weights in MOOC-BERT using the labeled data set" in the step (5) specifically comprises:
(5-1) splitting the data set into training set : validation set : test set = 6:2:2, prefixing each text sequence with [CLS] and appending [SEP] at the end of the sentence;
(5-2) adding a Softmax regression function to the BERT output layer so that it outputs a cognitive behavior label, finally obtaining the learner cognitive behavior recognition model, and verifying the effect with the F value and accuracy.
CN202110802482.2A 2021-07-15 2021-07-15 MOOC learner cognitive behavior recognition method based on BERT model Active CN113505589B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110802482.2A CN113505589B (en) 2021-07-15 2021-07-15 MOOC learner cognitive behavior recognition method based on BERT model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110802482.2A CN113505589B (en) 2021-07-15 2021-07-15 MOOC learner cognitive behavior recognition method based on BERT model

Publications (2)

Publication Number Publication Date
CN113505589A (en) 2021-10-15
CN113505589B (en) 2024-07-23

Family

ID=78012944

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110802482.2A Active CN113505589B (en) 2021-07-15 2021-07-15 MOOC learner cognitive behavior recognition method based on BERT model

Country Status (1)

Country Link
CN (1) CN113505589B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114398256B (en) * 2021-12-06 2024-06-04 南京行者易智能交通科技有限公司 Big data automatic test method based on Bert model
CN116258390B (en) * 2022-12-22 2024-04-05 华中师范大学 Teacher online teaching feedback-oriented cognitive support quality evaluation method and system
CN117892799B (en) * 2024-03-15 2024-06-04 中国科学技术大学 Financial intelligent analysis model training method and system with multi-level tasks as guidance

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2493265A1 (en) * 2004-02-26 2005-08-26 At&T Corp. System and method for augmenting spoken language understanding by correcting common errors in linguistic performance
AU2020103654A4 (en) * 2019-10-28 2021-01-14 Nanjing Normal University Method for intelligent construction of place name annotated corpus based on interactive and iterative learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112579444B (en) * 2020-12-10 2024-05-07 华南理工大学 Automatic analysis modeling method, system, device and medium based on text cognition
CN112613273B (en) * 2020-12-16 2022-09-23 上海交通大学 Compression method and system of multi-language BERT sequence labeling model

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2493265A1 (en) * 2004-02-26 2005-08-26 At&T Corp. System and method for augmenting spoken language understanding by correcting common errors in linguistic performance
AU2020103654A4 (en) * 2019-10-28 2021-01-14 Nanjing Normal University Method for intelligent construction of place name annotated corpus based on interactive and iterative learning

Also Published As

Publication number Publication date
CN113505589A (en) 2021-10-15

Similar Documents

Publication Publication Date Title
CN110852087B (en) Chinese error correction method and device, storage medium and electronic device
Yang et al. Identifying semantic edit intentions from revisions in wikipedia
Tahsin Mayeesha et al. Deep learning based question answering system in Bengali
CN113505589B (en) MOOC learner cognitive behavior recognition method based on BERT model
CN112149421A (en) Software programming field entity identification method based on BERT embedding
Sanyal et al. Resume parser with natural language processing
CN114238653B (en) Method for constructing programming education knowledge graph, completing and intelligently asking and answering
CN114218379B (en) Attribution method for question answering incapacity of intelligent question answering system
CN110968708A (en) Method and system for labeling education information resource attributes
CN114756681B (en) Evaluation and education text fine granularity suggestion mining method based on multi-attention fusion
CN116882402A (en) Multi-task-based electric power marketing small sample named entity identification method
CN106897274B (en) Cross-language comment replying method
CN115659947A (en) Multi-item selection answering method and system based on machine reading understanding and text summarization
CN112711666B (en) Futures label extraction method and device
CN114491209A (en) Method and system for mining enterprise business label based on internet information capture
Malhar et al. Deep learning based Answering Questions using T5 and Structured Question Generation System’
Riza et al. Natural language processing and levenshtein distance for generating error identification typed questions on TOEFL
Nomponkrang et al. The comparison of algorithms for Thai-sentence classification
CN110991160A (en) Intelligent automatic creation system for study leaving documents
Žitko et al. Automatic question generation using semantic role labeling for morphologically rich languages
CN114579706B (en) Automatic subjective question review method based on BERT neural network and multi-task learning
Cherrat et al. Sentiment Analysis from Texts Written in Standard Arabic and Moroccan Dialect based on Deep Learning Approaches.
CN112085985B (en) Student answer automatic scoring method for English examination translation questions
Yang et al. Graphusion: Leveraging large language models for scientific knowledge graph fusion and construction in nlp education
Marfani et al. Analysis of learners’ sentiments on MOOC forums using natural language processing techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant