
CN108595602A - Question text classification method combining a shallow model and a deep model - Google Patents

Question text classification method combining a shallow model and a deep model

Info

Publication number
CN108595602A
CN108595602A
Authority
CN
China
Prior art keywords
question sentence
model
feature
convolution
shallow
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810357603.5A
Other languages
Chinese (zh)
Inventor
黄青松
余慧
郭勃
刘利军
冯旭鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN201810357603.5A
Publication of CN108595602A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a question text classification method that combines a shallow model with a deep model, and belongs to the technical field of computer natural language processing. The method first extracts the feature word set of the question text; after vectorization, the normalized word vector of each feature word serves as its weight, forming one part of the input to the shallow linear model. A convolutional network applies convolution kernels of several different window sizes to the question text, and the feature vectors extracted by the different kernels that share the same convolution window length are rearranged and fed into corresponding recurrent neural networks. The outputs of the recurrent neural networks are finally concatenated into the syntactic-semantic feature vector of the question, which serves as the other part of the input to the shallow linear model. The shallow model then produces the question's classification result from the input formed by splicing the feature word vectors with the output of the deep model. The invention overcomes the shortcomings of a single deep model and effectively improves the accuracy of question classification.

Description

Question text classification method combining a shallow model and a deep model
Technical field
The present invention relates to a question text classification method combining a shallow model with a deep model, and belongs to the technical field of computer natural language processing.
Background technology
Question text classification is a form of short-text classification and plays an important role in automatic question-answering systems. It assigns a question to a category by analyzing the question's content. Early work used rule-based methods, classifying questions through correspondences between question types and the question's interrogative words or grammatical patterns. Such methods work well for questions with obvious interrogatives or class-indicating feature words, but perform poorly on more complex questions or questions whose text contains no obvious class-indicating words; they are also inflexible, labor-intensive, and highly subjective. With the development of machine learning, machine-learning-based question classification became mainstream: Zhang et al. (26th international ACM conference, 2003) used support vector machines (SVM) with extracted syntactic features of the sentence to classify questions, substantially improving accuracy over earlier methods. Rules have also been combined with machine learning: Li et al. (Journal of Chinese Information Processing, 2008) combined interrogative-word and head-word rules with SVM to further improve classification accuracy. Classification precision, however, depends on technologies such as syntactic analysis; the variable form and complex sentence structure of Chinese text make Chinese syntactic analysis difficult, and current syntactic analysis technology is not mature enough, which limits the accuracy of question text classification.
More recently, with the rise of deep learning, various deep learning frameworks have been widely applied to image processing and natural language processing, achieving breakthrough improvements over conventional methods. For sentence and document modeling and classification, the convolutional neural network (CNN) and the recurrent neural network (RNN) have become the two most common deep learning architectures. Kim (Eprint Arxiv, 2014) modeled sentence text with a convolutional neural network and used the resulting features for text classification; the model is structurally simple yet achieves good classification results and has become a baseline method for text classification. Tang et al. (natural language processing conference, 2015) likewise obtained good results with recurrent neural networks on sentiment classification tasks. Deep models effectively resolve problems of conventional machine learning methods such as complex feature extraction, poor portability, and sparse short-text feature representations. However, because of their overly strong learning capacity, Cheng et al. (arXiv.org, 2016) point out that deep models have difficulty learning effective feature-vector representations for low-frequency features, a problem that arises widely, whereas shallow models such as SVMs and linear models can learn better from features that occur rarely; Cheng et al. combined a logistic regression model with a multilayer perceptron network to improve recommendation accuracy for software applications in Google's application store. Training data in text classification tasks are often imbalanced, and a single deep model struggles to learn effective feature representations for classes with little data.
Summary of the invention
The present invention provides a question text classification method combining a shallow model with a deep model. It addresses the problems a single deep model faces with imbalanced training data by exploiting the strong feature memorization characteristic of traditional shallow models, effectively improving the accuracy of question classification.
The technical scheme of the invention is as follows. The question text classification method combining a shallow model with a deep model comprises the following steps:
Step 1: crawl question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital; then preprocess the corpus texts;
Step 2: extract the feature word set of the question texts in the corpus using the chi-square test (CHI) method, convert each feature word into word-vector form, and use the feature word's normalized word-vector value as its weight, thereby obtaining one part of the shallow linear model's input, Input1;
Step 3: increase the word-vector weight of the question keywords, then feed the question text vector formed from the word-vector matrix into the first part of the deep model, a convolutional network; convolution kernels of several different window sizes are applied to the question text to extract local phrase features of the sentence, and the feature vectors extracted by the different kernels sharing the same convolution window length are rearranged;
Step 4: feed the feature vectors produced in Step 3 into the corresponding recurrent neural networks; through its chain structure, a recurrent neural network can capture the sentence's historical information and learn the long-term dependencies of sequence data, and the output of the last time step contains the feature information of the whole sentence; the outputs of the several recurrent neural networks are concatenated as the final feature of the question, yielding the other part of the shallow linear model's input, Input2;
Step 5: splice Input1 obtained in Step 2 with Input2, the final output of the deep model in Step 4, to form the input of the shallow model; the shallow model part uses a multiple-linear-regression structure and finally produces the question's classification result.
The specific steps of Step 1 are as follows:
Step 1.1: first write a crawler by hand and crawl from Baidu Zhidao the question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital;
Step 1.2: filter and deduplicate the crawled corpus to obtain a duplicate-free question corpus, and store it in a database;
Step 1.3: preprocess the question corpus in the database by word segmentation and stop-word removal.
The specific steps of Step 2 are as follows:
Step 2.1: extract the feature word set of the question texts using the chi-square test (CHI) method;
Step 2.2: convert each feature word from Step 2.1 into word-vector form, using a distributed word-vector representation;
Step 2.3: use each feature word's normalized word-vector value as its weight, finally obtaining the non-syntactic feature representation of the question text, which serves as one part of the shallow model's input, Input1.
The specific steps of Step 3 are as follows:
Step 3.1: extract the question keywords with the tf-idf-based method provided by the jieba toolkit in Python; after every word in the question has been represented as a word vector, repeat each question keyword's word vector once on its left and once on its right, which increases the keyword's weight within the sentence and yields a word-vector matrix;
Step 3.2: feed the question text vector represented by the word-vector matrix from Step 3.1 into the first part of the deep model, the convolutional network, where the number of matrix rows is the number of words in the sentence and the number of columns is the word-vector dimension; two convolution kernels of each of the three convolution window lengths 2, 3, and 4 perform vertical convolution over the question, extracting local features at different positions in the sentence and producing several groups of feature vectors;
Step 3.3: rearrange, by position in the sequence, the feature vectors extracted by the different kernels that share the same convolution window size, so that the features obtained by different kernels at the same sentence position are spliced together.
The specific steps of Step 4 are as follows:
Step 4.1: feed the rearranged features obtained in Step 3.3 from the three convolution window lengths, in sentence order, into the three corresponding recurrent neural networks; LSTM recurrent neural networks are used here to better capture the sentence's earlier historical information and learn the long-term dependencies of the sequence data, and the output of the last time step contains the feature information of the entire question;
Step 4.2: concatenate the outputs of the three recurrent neural networks from Step 4.1 as the final feature representation of the question, obtaining the other part of the shallow linear model's input, Input2.
The specific steps of Step 5 are as follows:
Step 5.1: splice Input1 obtained in Step 2.3 with Input2, the final output from Step 4.2, to form the input of the shallow model; the shallow model here uses a multiple-linear-regression structure, i.e., an ordinary neural network with one final fully connected layer followed by a softmax function;
Step 5.2: pass the input-layer content from Step 5.1 through one hidden layer, then feed the hidden layer's output into the softmax function to obtain the final question classification result.
The deep model part consists of a convolutional network layer and recurrent neural network layers. In the convolutional layer, the text feature obtained by the k-th convolution kernel of window length h is $w_{kh} = [c_{k1}, \dots, c_{k(l-h+1)}]$, where $c_{ki}$ denotes the convolution feature of the k-th kernel at the i-th position of the question text, and $c_{ki} = \mathrm{ReLU}(o_{ki} + b)$, where $o_{ki}$ is the value computed by the convolution: $o_{ki} = [x_i, x_{i+1}, \dots, x_{i+h-1}] * f_{kh}$. Here $x_i$ is the word vector of the i-th word in the sentence, h is the kernel window length, $[x_i, x_{i+1}, \dots, x_{i+h-1}]$ is the word-vector matrix formed by the h words from the i-th to the (i+h-1)-th, $f_{kh}$ is the k-th convolution kernel of window length h, and * denotes elementwise multiplication of the two matrices followed by summation. The feature vectors produced by the convolutional layer are rearranged and then fed into three different LSTM recurrent neural network layers, forming the final feature vector $V = [v_2, v_3, v_4]$, where $v_2, v_3, v_4$ correspond to convolution window lengths 2, 3, and 4. The input layer of the whole model is formed by splicing the feature word vectors of the shallow part with the deep model's output V, giving an m-dimensional vector $X = [w_{f1}, \dots, w_{fn}, V]$.
The shallow model's final classification method is the softmax function.
The beneficial effects of the invention are as follows:
1. The invention trains word vectors with the word2vec module of gensim. Because a word's vector is computed from its neighboring words, the vector implicitly carries semantic information, making it suitable for semantic feature extraction. Representing text with word vectors as the model's input effectively improves model performance.
2. During data preprocessing, the word-vector weight of question keywords is increased for the deep model's input. Keywords in a question often contribute more to judging the sentence's category; after every word in the question is represented as a word vector, the word vector of each question keyword in the training corpus is repeated once on its left and once on its right, increasing the keyword's weight within the sentence and further improving the model's classification performance.
3. The question text classification model combining a shallow model with a deep model unites the respective advantages of deep models and traditional shallow machine-learning models. The deep model combines a convolutional neural network with LSTM recurrent neural networks; to better learn the syntactic-semantic features of the text, convolution kernels of several different window sizes are applied to the question text in the convolutional network. At the same time, because a deep model struggles to learn effective feature-vector representations for a class whose training corpus is relatively small, the invention combines a shallow model with the deep model, exploiting the traditional shallow model's strong memorization of features. Question classification accuracy improves on both imbalanced and balanced training data; on imbalanced training data in particular, performance clearly exceeds that of the other models.
In summary, this question text classification method combining a shallow model with a deep model composes a deep model from a convolutional neural network and recurrent neural networks, which learns and extracts the syntactic-semantic features of the question as one part of the shallow linear model's input; the deep model's input increases the word-vector weight of question keywords, and the convolutional network uses kernels of several different window sizes. The feature word vectors serve as the shallow model's other input, and the advantages of the shallow model overcome the deficiencies of a single deep model when the training corpus is imbalanced. The final unified model effectively improves the accuracy of question classification.
Description of the drawings
Fig. 1 shows the structure of the question classification model of the present invention;
Fig. 2 shows the structure of the deep model part of the present invention;
Fig. 3 compares question classification accuracy under different treatments of the convolutional network output in the present invention;
Fig. 4 compares how the performance of different neural network models changes as training iterations increase.
Specific embodiments
Embodiment 1: as shown in Figs. 1-4, a question text classification method combining a shallow model with a deep model comprises the following specific steps:
Step 1: crawl question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital; then preprocess the corpus texts;
Further, the specific steps of Step 1 are as follows:
Step 1.1: first write a crawler by hand and crawl from Baidu Zhidao the question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital;
Step 1.2: filter and deduplicate the crawled corpus to obtain a duplicate-free question corpus, and store it in a database;
Using the crawler, the invention crawled 5000 question corpora from Baidu Zhidao for each of the five categories: economy and finance, laws and regulations, sports, health care, and electronic digital. These form the first prepared corpus, the balanced corpus. In addition, 3000 corpora each were removed from health care and electronic digital, leaving 2000 each, with the other three categories unchanged, forming the second prepared corpus, the imbalanced corpus. For each corpus, one tenth is taken as the test set and the rest as the training set. Since the crawled question corpora may contain duplicates, which add workload without much benefit, filtering and deduplication are applied to the prepared corpora to obtain duplicate-free question texts, which are stored in a database to facilitate data management and use.
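As an illustration only, the corpus assembly and the one-tenth test split described above can be sketched in Python as follows; the category names and the load_questions() helper are hypothetical placeholders, not part of the invention.

```python
# Sketch of assembling the balanced and imbalanced corpora described above.
# load_questions() is a hypothetical helper returning one category's deduplicated
# questions from the database; the category names are illustrative only.
import random

CATEGORIES = ["economy_finance", "laws_regulations", "sports",
              "health_care", "electronic_digital"]

def build_corpus(per_class):
    corpus = []
    for cat, n in per_class.items():
        corpus += [(q, cat) for q in load_questions(cat)[:n]]
    random.shuffle(corpus)
    n_test = len(corpus) // 10          # one tenth held out as the test set
    return corpus[n_test:], corpus[:n_test]

# Corpus 1 (balanced): 5000 questions per category.
train1, test1 = build_corpus({c: 5000 for c in CATEGORIES})

# Corpus 2 (imbalanced): health care and electronic digital reduced to 2000.
sizes = {c: 5000 for c in CATEGORIES}
sizes["health_care"] = sizes["electronic_digital"] = 2000
train2, test2 = build_corpus(sizes)
```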
Step 1.3: preprocess the question corpus in the database by word segmentation and stop-word removal.
Splitting text directly into strings of individual characters would lose the linguistic information between the words of the original text, so the question corpus is preprocessed, including Chinese word segmentation with the jieba tool and stop-word removal, to facilitate subsequent work.
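A minimal preprocessing sketch with jieba follows; the stop-word file path is an assumption, and train1 refers to the corpus sketch above.

```python
# Segmentation and stop-word removal sketch using jieba.
# "stopwords.txt" (one stop word per line) is an assumed file path.
import jieba

with open("stopwords.txt", encoding="utf-8") as f:
    STOPWORDS = {line.strip() for line in f}

def preprocess(question):
    """Segment a Chinese question and drop stop words and whitespace tokens."""
    return [t for t in jieba.lcut(question) if t.strip() and t not in STOPWORDS]

tokenized_corpus = [(preprocess(q), cat) for q, cat in train1]
```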
Step 2: extract the feature word set of the question texts in the corpus using the chi-square test (CHI) method, convert each feature word into word-vector form, and use the feature word's normalized word-vector value as its weight, thereby obtaining one part of the shallow linear model's input, Input1;
Further, the specific steps of Step 2 are as follows:
Step 2.1: extract the feature word set of the question texts using the chi-square test (CHI) method;
For feature selection in the multiple-linear-regression part, the invention takes words as the basic feature items and uses no syntactic or grammatical features. The chi-square test, an effective and widely used feature selection method, extracts the feature words of the question texts, and each question text is represented by its feature word set.
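The patent does not name an implementation of the chi-square statistic; as one possible realization, the sketch below scores words with scikit-learn's chi2 and keeps the highest-scoring ones. The cutoff of 2000 feature words is an assumption, and tokenized_corpus comes from the preprocessing sketch above.

```python
# Chi-square (CHI) feature-word selection sketch using scikit-learn.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

docs = [" ".join(tokens) for tokens, _ in tokenized_corpus]
labels = [cat for _, cat in tokenized_corpus]

# token_pattern r"\S+" keeps short Chinese tokens that the default pattern would drop
vectorizer = CountVectorizer(token_pattern=r"\S+")
X = vectorizer.fit_transform(docs)          # document-term count matrix
scores, _ = chi2(X, labels)                 # one CHI score per vocabulary word
vocab = vectorizer.get_feature_names_out()
feature_words = [vocab[i] for i in np.argsort(scores)[::-1][:2000]]  # cutoff assumed
```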
Step 2.2: convert each feature word from Step 2.1 into word-vector form, using a distributed representation of word vectors;
For text vectorization, the invention considers the limitations of the traditional one-hot representation and chooses a distributed representation. This kind of word vector not only resolves the dimensional sparsity of one-hot encoding, but also places the vectors of similar words close together, carrying a degree of semantic information; representing text with word vectors as the model's input helps improve model performance. The invention trains word vectors with the word2vec module of gensim.
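A word-vector training sketch with gensim's word2vec module follows. The hyperparameters are assumptions, and the L2 normalization shown is one reading of the "normalized word-vector value" used as a feature word's weight in Step 2.3.

```python
# word2vec training sketch with gensim (hyperparameters are assumptions).
import numpy as np
from gensim.models import Word2Vec

sentences = [tokens for tokens, _ in tokenized_corpus]   # from the preprocessing sketch
w2v = Word2Vec(sentences=sentences, vector_size=100,     # gensim >= 4; older versions use size=
               window=5, min_count=1, sg=1, epochs=10)

def weighted_feature_vector(word):
    """L2-normalize a feature word's vector: one reading of the normalized weight."""
    v = w2v.wv[word]
    return v / np.linalg.norm(v)

# Input1 sketch: splice the weighted vectors of the selected feature words.
input1 = np.concatenate([weighted_feature_vector(w) for w in feature_words
                         if w in w2v.wv])
```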
Step 2.3: use each feature word's normalized word-vector value as its weight, finally obtaining the non-syntactic feature representation of the question text, which serves as one part of the shallow model's input, Input1.
Different feature words are assigned different weights: the invention uses the simple and effective normalized word-vector value as each feature word's weight, and the non-syntactic feature representation of the question text, given by the feature word vectors, forms one part of the shallow linear model's input.
Step 3: increase the word-vector weight of the question keywords, then feed the question text vector formed from the word-vector matrix into the first part of the deep model, a convolutional network; convolution kernels of several different window sizes are applied to the question text to extract local phrase features of the sentence, and the feature vectors extracted by the different kernels sharing the same convolution window length are rearranged;
Further, the specific steps of Step 3 are as follows:
Step 3.1: extract the question keywords with the tf-idf-based method provided by the jieba toolkit in Python; after every word in the question has been represented as a word vector, repeat each question keyword's word vector once on its left and once on its right, which increases the keyword's weight within the sentence and yields a word-vector matrix;
At the deep model's input, some words in a question often contribute more to judging the sentence's category. For example, in the question "When was the sport of basketball invented?", the noun "basketball" plays the key role in identifying the question as the sports category. Therefore, after every word in the question is represented as a word vector, the word vector of each question keyword in the training corpus is repeated once on its left and once on its right; the original sentence in effect becomes "basketball basketball basketball: when was the sport invented?", and the keyword's weight within the question increases. To verify that increasing keyword weight can further improve the model's classification performance, i.e., that keywords play a key role in the classification result, a comparison experiment was run, shown in Table 1:
Table 1

	Without increased keyword weight	With increased keyword weight
Accuracy	0.9219	0.9226
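The keyword step can be sketched with jieba's tf-idf interface, jieba.analyse.extract_tags; the number of keywords per question (topK=3) is an assumption.

```python
# Keyword up-weighting sketch: extract tf-idf keywords with jieba.analyse and
# repeat each keyword once on its left and once on its right (three copies total).
import jieba.analyse

def boost_keywords(tokens, top_k=3):        # top_k is an assumption
    keywords = set(jieba.analyse.extract_tags("".join(tokens), topK=top_k))
    boosted = []
    for t in tokens:
        boosted.extend([t, t, t] if t in keywords else [t])
    return boosted

# ['篮球', '运动', '是', '什么', '时候', '发明', '的']
# becomes ['篮球', '篮球', '篮球', '运动', ...]: the keyword's in-sentence weight grows
```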
Step 3.2: feed the question text vector represented by the word-vector matrix from Step 3.1 into the first part of the deep model, the convolutional network, where the number of matrix rows is the number of words in the sentence and the number of columns is the word-vector dimension; two convolution kernels of each of the three convolution window lengths 2, 3, and 4 perform vertical convolution over the question, extracting local features at different positions in the sentence and producing several groups of feature vectors;
As shown in Fig. 2, to better learn the syntactic-semantic features of the question text, the convolutional part uses two convolution kernels for each of the three convolution window lengths 2, 3, and 4. The convolution window length is the number of words a single convolution operation covers in the sentence. Each kernel slides over the sentence, extracting local features at different positions and producing a group of feature vectors.
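The convolutional branch can be sketched as follows. The patent does not name a framework; Keras is assumed here, and the padded sentence length and embedding size are placeholders. With filters=2, the two kernels per window size slide together, and each output time step already pairs their features for that position.

```python
# Convolutional-branch sketch (Keras assumed): two kernels for each of the
# window lengths 2, 3 and 4 slide vertically over the word-vector matrix.
from tensorflow.keras import layers

SENT_LEN, EMB_DIM = 30, 100                     # assumed padded length and embedding size
words = layers.Input(shape=(SENT_LEN, EMB_DIM))
conv_outputs = {}
for h in (2, 3, 4):
    # 'valid' padding yields the l - h + 1 positions of the feature map w_kh
    conv_outputs[h] = layers.Conv1D(filters=2, kernel_size=h, padding="valid",
                                    activation="relu")(words)
# conv_outputs[h] has shape (batch, SENT_LEN - h + 1, 2): the channel axis holds
# both kernels' features at each position, matching the rearrangement of Step 3.3
```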
Step 3.3: rearrange, by position in the sequence, the feature vectors extracted by the different kernels that share the same convolution window size, so that the features obtained by different kernels at the same sentence position are spliced together.
To verify the effectiveness of the invention's treatment of the convolutional network output and its choice of convolution windows, another strategy for processing the convolutional output before the recurrent network, along with different window-size selection strategies, was compared on classification performance. The second linking method works as follows: after the convolution features are rearranged, the feature length produced by the largest convolution window is taken as the reference, the rearranged features from the other two window lengths are truncated to that length, features at the same sentence position are linked together, and the result is fed into a single LSTM recurrent network. This structure is denoted M2:cl2,3,4. The linking strategy used by the deep part of the invention's shallow-deep combined model is denoted M1:Cl2,3,4. Models with a single window length were also compared, denoted S:Cl2, S:Cl3, and S:Cl4 for window sizes 2, 3, and 4. Experiments on corpus 1 are shown in Fig. 3. The invention's M1:Cl2,3,4 strategy clearly performs best, while the classification accuracy of M2:cl2,3,4 falls below both the single-window models and M1:Cl2,3,4; the likely reason is that the truncated features disturb the final feature sequence, preventing the LSTM from capturing a high-quality sequence. Among the single window lengths, window length 3 gives the highest classification accuracy.
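For clarity, the M1 rearrangement can also be shown in isolation with numpy: the feature sequences of two kernels sharing one window length are spliced position by position. This is a sketch of one reading of the rearrangement, not the patent's literal implementation.

```python
# Rearrangement sketch: splice the features of two same-width kernels position-wise.
import numpy as np

def rearrange(feat_a, feat_b):
    """feat_a, feat_b: (positions,) feature maps from two kernels of one window length.
    Returns (positions, 2): each step carries both kernels' features at that position."""
    return np.stack([feat_a, feat_b], axis=-1)

a = np.array([0.1, 0.5, 0.3])       # kernel 1, window length 3
b = np.array([0.7, 0.2, 0.9])       # kernel 2, window length 3
print(rearrange(a, b))              # [[0.1 0.7] [0.5 0.2] [0.3 0.9]]
```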
Step 4: feed the feature vectors produced in Step 3 into the corresponding recurrent neural networks; through its chain structure, a recurrent neural network can capture the sentence's historical information and learn the long-term dependencies of sequence data, and the output of the last time step contains the feature information of the whole sentence; the outputs of the several recurrent neural networks are concatenated as the final feature of the question, yielding the other part of the shallow linear model's input, Input2;
Further, the specific steps of Step 4 are as follows:
Step 4.1: feed the rearranged features obtained in Step 3.3 from the three convolution window lengths, in sentence order, into the three corresponding recurrent neural networks; LSTM recurrent neural networks are used here to better capture the sentence's earlier historical information and learn the long-term dependencies of the sequence data, and the output of the last time step contains the feature information of the entire question;
To better learn the syntactic-semantic features of the sentence, the recurrent networks in the second part of the deep model are long short-term memory (LSTM) networks: a basic recurrent neural network loses information from the front of a longer sentence, and the LSTM recurrent neural network model was invented to overcome this shortcoming, remembering earlier historical information better than a traditional recurrent network.
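Continuing the convolutional sketch above, one LSTM per window length reads the rearranged feature sequence, and its last time step summarizes the question; the unit count is an assumption.

```python
# One LSTM branch per window length (continuing the Keras sketch above).
lstm_outputs = []
for h in (2, 3, 4):
    v_h = layers.LSTM(units=64)(conv_outputs[h])   # units=64 is an assumption;
    lstm_outputs.append(v_h)                       # LSTM returns the last time step
V = layers.Concatenate()(lstm_outputs)             # V = [v2, v3, v4], i.e. Input2
```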
Step 4.2: concatenate the outputs of the three recurrent neural networks from Step 4.1 as the final feature representation of the question, obtaining the other part of the shallow linear model's input, Input2.
The outputs of the three LSTMs are spliced together to form the final feature vector of the question, $V = [v_2, v_3, v_4]$, where $v_2, v_3, v_4$ correspond to convolution window lengths 2, 3, and 4. The multi-window convolution-recurrent combined deep model is shown in Fig. 2.
Step 5: splice Input1 obtained in Step 2 with Input2, the final output of the deep model in Step 4, to form the input of the shallow model; the shallow model part uses a multiple-linear-regression structure and finally produces the question's classification result.
Further, the specific steps of Step 5 are as follows:
Step 5.1: splice Input1 obtained in Step 2.3 with Input2, the final output from Step 4.2, to form the input of the shallow model; the shallow model here uses a multiple-linear-regression structure, i.e., an ordinary neural network with one final fully connected layer followed by a softmax function;
Step 5.2: pass the input-layer content from Step 5.1 through one hidden layer, then feed the hidden layer's output into the softmax function to obtain the final question classification result.
Further, the deep model part consists of a convolutional network layer and recurrent neural network layers. In the convolutional layer, the text feature obtained by the k-th convolution kernel of window length h is $w_{kh} = [c_{k1}, \dots, c_{k(l-h+1)}]$, where $c_{ki}$ denotes the convolution feature of the k-th kernel at the i-th position of the question text, and $c_{ki} = \mathrm{ReLU}(o_{ki} + b)$, where $o_{ki}$ is the value computed by the convolution: $o_{ki} = [x_i, x_{i+1}, \dots, x_{i+h-1}] * f_{kh}$. Here $x_i$ is the word vector of the i-th word in the sentence, h is the kernel window length, $[x_i, x_{i+1}, \dots, x_{i+h-1}]$ is the word-vector matrix formed by the h words from the i-th to the (i+h-1)-th, $f_{kh}$ is the k-th convolution kernel of window length h, and * denotes elementwise multiplication of the two matrices followed by summation. The feature vectors produced by the convolutional layer are rearranged and then fed into three different LSTM recurrent neural network layers, forming the final feature vector $V = [v_2, v_3, v_4]$, where $v_2, v_3, v_4$ correspond to convolution window lengths 2, 3, and 4. The input layer of the whole model is formed by splicing the feature word vectors of the shallow part with the deep model's output V, giving an m-dimensional vector $X = [w_{f1}, \dots, w_{fn}, V]$.
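The convolution equations above can be checked with a small numpy sketch on toy sizes:

```python
# Worked sketch of o_ki = [x_i, ..., x_{i+h-1}] * f_kh and c_ki = ReLU(o_ki + b).
import numpy as np

l, h, d = 7, 3, 4                        # toy sentence length, window length, embedding dim
X = np.random.randn(l, d)                # word-vector matrix of the question
f_kh = np.random.randn(h, d)             # k-th convolution kernel of window length h
b = 0.1

w_kh = np.empty(l - h + 1)               # feature map [c_k1, ..., c_k(l-h+1)]
for i in range(l - h + 1):
    o_ki = np.sum(X[i:i + h] * f_kh)     # elementwise product of the two matrices, summed
    w_kh[i] = max(0.0, o_ki + b)         # ReLU
print(w_kh.shape)                        # (5,), i.e. l - h + 1 positions
```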
Further, the shallow model's final classification method is the softmax function.
To compare the question classification performance of the shallow-deep combined model against conventional machine learning models, a convolutional neural network model, a recurrent neural network model, and a multi-window convolution-recurrent combined network model, three conventional machine learning methods were chosen: SVM, maximum entropy, and naive Bayes. The model of the present invention and the other three neural network models are denoted WD, CNN, RNN, and M:cnn+rnn respectively. Accuracy is compared on the balanced corpus 1 and the imbalanced corpus 2; the results are shown in Tables 2 and 3.
Table 2
Table 3
Table 2 shows clearly that on the balanced corpus the WD model achieves the highest accuracy among the conventional machine learning models; although its accuracy declines on the imbalanced corpus, the drop is small compared with the other models.
Table 3 shows that the overall performance of the deep models still exceeds the conventional models, but their accuracy drops relatively sharply on the imbalanced corpus; the reason is that when the corpus for some class is small, a deep model's overly strong learning capacity increases the difficulty of learning effective classification features.
To compare ordinary deep models with the invention's shallow-deep combined model further, Fig. 4 shows how classification accuracy changes on the imbalanced corpus 2 as the number of training iterations grows. The question classification accuracy of all four models rises steadily with training iterations and essentially stops changing at around 200 iterations. The shallow-deep combined model surpasses the other three models in final classification performance, and the figure also shows that on short text the convolutional network slightly outperforms the recurrent neural network.
In the present invention, the question text classification model combining a shallow model with a deep model consists of a shallow model part and a deep model part; the overall structure is shown in Fig. 1.
Input layer
The input layer is formed by splicing the feature word vectors of the shallow part with the deep model's output V, giving an m-dimensional vector denoted $X = [w_{f1}, \dots, w_{fn}, V]$.
Softmax layer
The softmax layer is equivalent to an ordinary fully connected neural network with one hidden layer. The input-layer content passes through one hidden layer, and the hidden layer's output is fed into the softmax function to obtain the final classification result. The hidden layer has k neural units, and the input layer and the hidden layer are fully connected, computed as $O = XW$, where W is an m-by-k matrix whose elements are randomly initialized to nonzero values and then continually updated during training. O is a one-dimensional vector of k values, each representing the output value of one class, and it is passed to the softmax function, $s_k = e^{O_k} / \sum_{j} e^{O_j}$, where $O_k$ is the network's output value for class k and $s_k$ is the probability that the text belongs to class k.
To train the whole model, a suitable loss function must be defined; the Adam optimization method (Computer Science, 2014) is used to minimize the loss function and train the whole model. For classification problems, cross-entropy is generally used as the loss function: $H_{y'}(y) = -\sum_i y_i' \log y_i$, where $y_i'$ is the true probability distribution (i.e., the class labels of the training corpus) and $y_i$ is the probability distribution predicted by the model. The whole model is trained by minimizing the value of $H_{y'}(y)$.
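Putting the pieces together, a hedged end-to-end sketch of the combined model follows, continuing the Keras sketches above: the shallow input (the weighted feature-word vectors) is spliced with the deep output V, passed through one hidden layer, and classified by softmax, trained with Adam and cross-entropy as described. The layer sizes are assumptions.

```python
# Combined shallow-deep model sketch (continuing the Keras sketches above).
from tensorflow.keras import Model, layers, optimizers

INPUT1_DIM = 1000                                   # assumed size of the flattened Input1
feat_words = layers.Input(shape=(INPUT1_DIM,))      # shallow input: weighted feature-word vectors
x = layers.Concatenate()([feat_words, V])           # X = [w_f1, ..., w_fn, V]
hidden = layers.Dense(128)(x)                       # hidden layer of k units, cf. O = XW (k=128 assumed)
probs = layers.Dense(5, activation="softmax")(hidden)  # s_k over the five question categories

model = Model(inputs=[words, feat_words], outputs=probs)
model.compile(optimizer=optimizers.Adam(),
              loss="categorical_crossentropy",      # H_y'(y) = -sum_i y'_i log y_i
              metrics=["accuracy"])
model.summary()
```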
The specific embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; various changes can be made within the knowledge of a person skilled in the art without departing from the concept of the invention.

Claims (8)

1. A question text classification method combining a shallow model with a deep model, characterized in that the specific steps of the method are as follows:
Step 1: crawl question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital; then preprocess the corpus texts;
Step 2: extract the feature word set of the question texts in the corpus using the chi-square test (CHI) method, convert each feature word into word-vector form, and use the feature word's normalized word-vector value as its weight, thereby obtaining one part of the shallow linear model's input, Input1;
Step 3: increase the word-vector weight of the question keywords, then feed the question text vector formed from the word-vector matrix into the first part of the deep model, a convolutional network; convolution kernels of several different window sizes are applied to the question text to extract local phrase features of the sentence, and the feature vectors extracted by the different kernels sharing the same convolution window length are rearranged;
Step 4: feed the feature vectors produced in Step 3 into the corresponding recurrent neural networks; through its chain structure, a recurrent neural network can capture the sentence's historical information and learn the long-term dependencies of sequence data, and the output of the last time step contains the feature information of the whole sentence; the outputs of the several recurrent neural networks are concatenated as the final feature of the question, yielding the other part of the shallow linear model's input, Input2;
Step 5: splice Input1 obtained in Step 2 with Input2, the final output of the deep model in Step 4, to form the input of the shallow model; the shallow model part uses a multiple-linear-regression structure and finally produces the question's classification result.
2. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that the specific steps of Step 1 are as follows:
Step 1.1: first write a crawler by hand and crawl from Baidu Zhidao the question corpora of five categories: economy and finance, laws and regulations, sports, health care, and electronic digital;
Step 1.2: filter and deduplicate the crawled corpus to obtain a duplicate-free question corpus, and store it in a database;
Step 1.3: preprocess the question corpus in the database by word segmentation and stop-word removal.
3. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that the specific steps of Step 2 are as follows:
Step 2.1: extract the feature word set of the question texts using the chi-square test (CHI) method;
Step 2.2: convert each feature word from Step 2.1 into word-vector form, using a distributed word-vector representation;
Step 2.3: use each feature word's normalized word-vector value as its weight, finally obtaining the non-syntactic feature representation of the question text, which serves as one part of the shallow model's input, Input1.
4. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that the specific steps of Step 3 are as follows:
Step 3.1: extract the question keywords with the tf-idf-based method provided by the jieba toolkit in Python; after every word in the question has been represented as a word vector, repeat each question keyword's word vector once on its left and once on its right, which increases the keyword's weight within the sentence and yields a word-vector matrix;
Step 3.2: feed the question text vector represented by the word-vector matrix from Step 3.1 into the first part of the deep model, the convolutional network, where the number of matrix rows is the number of words in the sentence and the number of columns is the word-vector dimension; two convolution kernels of each of the three convolution window lengths 2, 3, and 4 perform vertical convolution over the question, extracting local features at different positions in the sentence and producing several groups of feature vectors;
Step 3.3: rearrange, by position in the sequence, the feature vectors extracted by the different kernels that share the same convolution window size, so that the features obtained by different kernels at the same sentence position are spliced together.
5. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that the specific steps of Step 4 are as follows:
Step 4.1: feed the rearranged features obtained in Step 3.3 from the three convolution window lengths, in sentence order, into the three corresponding recurrent neural networks; LSTM recurrent neural networks are used here to better capture the sentence's earlier historical information and learn the long-term dependencies of the sequence data, and the output of the last time step contains the feature information of the entire question;
Step 4.2: concatenate the outputs of the three recurrent neural networks from Step 4.1 as the final feature representation of the question, obtaining the other part of the shallow linear model's input, Input2.
6. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that the specific steps of Step 5 are as follows:
Step 5.1: splice Input1 obtained in Step 2.3 with Input2, the final output from Step 4.2, to form the input of the shallow model; the shallow model here uses a multiple-linear-regression structure, i.e., an ordinary neural network with one final fully connected layer followed by a softmax function;
Step 5.2: pass the input-layer content from Step 5.1 through one hidden layer, then feed the hidden layer's output into the softmax function to obtain the final question classification result.
7. The question text classification method combining a shallow model with a deep model according to claim 1, characterized in that: the deep model part consists of a convolutional network layer and recurrent neural network layers; in the convolutional layer, the text feature obtained by the k-th convolution kernel of window length h is $w_{kh} = [c_{k1}, \dots, c_{k(l-h+1)}]$, where $c_{ki}$ denotes the convolution feature of the k-th kernel at the i-th position of the question text, and $c_{ki} = \mathrm{ReLU}(o_{ki} + b)$, where $o_{ki}$ is the value computed by the convolution: $o_{ki} = [x_i, x_{i+1}, \dots, x_{i+h-1}] * f_{kh}$; $x_i$ is the word vector of the i-th word in the sentence, h is the kernel window length, $[x_i, x_{i+1}, \dots, x_{i+h-1}]$ is the word-vector matrix formed by the h words from the i-th to the (i+h-1)-th, $f_{kh}$ is the k-th convolution kernel of window length h, and * denotes elementwise multiplication of the two matrices followed by summation; the feature vectors produced by the convolutional layer are rearranged and then fed into three different LSTM recurrent neural network layers, forming the final feature vector $V = [v_2, v_3, v_4]$, where $v_2, v_3, v_4$ correspond to convolution window lengths 2, 3, and 4; the input layer of the whole model is formed by splicing the feature word vectors of the shallow part with the deep model's output V, giving an m-dimensional vector $X = [w_{f1}, \dots, w_{fn}, V]$.
8. The question text classification method combining a shallow model with a deep model according to claim 6, characterized in that: the shallow model's final classification method is the softmax function.
CN201810357603.5A 2018-04-20 2018-04-20 Question text classification method combining a shallow model and a deep model Pending CN108595602A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810357603.5A CN108595602A (en) 2018-04-20 2018-04-20 Question text classification method combining a shallow model and a deep model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810357603.5A CN108595602A (en) 2018-04-20 2018-04-20 Question text classification method combining a shallow model and a deep model

Publications (1)

Publication Number Publication Date
CN108595602A true CN108595602A (en) 2018-09-28

Family

ID=63613629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810357603.5A Question text classification method combining a shallow model and a deep model 2018-04-20 2018-04-20 Pending CN108595602A (en)

Country Status (1)

Country Link
CN (1) CN108595602A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101320374A (en) * 2008-07-10 2008-12-10 昆明理工大学 Field question classification method combining syntax structural relationship and field characteristic
CN105912528A (en) * 2016-04-18 2016-08-31 深圳大学 Question classification method and system
CN107832312A (en) * 2017-01-03 2018-03-23 北京工业大学 A kind of text based on deep semantic discrimination recommends method
CN107038480A (en) * 2017-05-12 2017-08-11 东华大学 A kind of text sentiment classification method based on convolutional neural networks
CN107608999A (en) * 2017-07-17 2018-01-19 南京邮电大学 A kind of Question Classification method suitable for automatically request-answering system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHUHAN WU et al.: "THU NGN at NAACL-2018 Metaphor Shared Task: Neural Metaphor Detecting with CNN-LSTM Model", https://www.researchgate.net *
RONG ZHANG et al.: "Deep and Shallow Model for Insurance Churn Prediction Service", 2017 IEEE 14th International Conference on Services Computing *

Cited By (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110991161A (en) * 2018-09-30 2020-04-10 北京国双科技有限公司 Similar text determination method, neural network model obtaining method and related device
CN110991161B (en) * 2018-09-30 2023-04-18 北京国双科技有限公司 Similar text determination method, neural network model obtaining method and related device
CN109739956A (en) * 2018-11-08 2019-05-10 第四范式(北京)技术有限公司 Corpus cleaning method, device, equipment and medium
CN109739956B (en) * 2018-11-08 2020-04-10 第四范式(北京)技术有限公司 Corpus cleaning method, apparatus, device and medium
CN109271493A (en) * 2018-11-26 2019-01-25 腾讯科技(深圳)有限公司 A kind of language text processing method, device and storage medium
CN111382244B (en) * 2018-12-29 2023-04-14 深圳市优必选科技有限公司 Deep retrieval matching classification method and device and terminal equipment
CN111382244A (en) * 2018-12-29 2020-07-07 深圳市优必选科技有限公司 Deep retrieval matching classification method and device and terminal equipment
CN109857852A (en) * 2019-01-24 2019-06-07 安徽商贸职业技术学院 A kind of the screening judgment method and system of electric business online comment training set feature
CN109857852B (en) * 2019-01-24 2021-02-23 安徽商贸职业技术学院 Method and system for screening and judging characteristics of E-commerce online comment training set
CN110046233A (en) * 2019-02-12 2019-07-23 阿里巴巴集团控股有限公司 Problem distributing method and device
TWI717826B (en) * 2019-02-13 2021-02-01 開曼群島商創新先進技術有限公司 Method and device for extracting main words through reinforcement learning
CN109918507A (en) * 2019-03-08 2019-06-21 北京工业大学 One kind being based on the improved file classification method of TextCNN
CN109871904A (en) * 2019-03-11 2019-06-11 广东工业大学 Inscriptions on bones or tortoise shells word identification model and training method, system, equipment, computer media
CN110009027A (en) * 2019-03-28 2019-07-12 腾讯科技(深圳)有限公司 Comparison method, device, storage medium and the electronic device of image
CN110110372A (en) * 2019-04-09 2019-08-09 华东师范大学 A kind of user's timing behavior automatic segmentation prediction technique
CN110110372B (en) * 2019-04-09 2023-04-18 华东师范大学 Automatic segmentation prediction method for user time sequence behavior
CN110309860A (en) * 2019-06-06 2019-10-08 昆明理工大学 The method classified based on grade malignancy of the convolutional neural networks to Lung neoplasm
CN110298036A (en) * 2019-06-06 2019-10-01 昆明理工大学 A kind of online medical text symptom identification method based on part of speech increment iterative
CN110298036B (en) * 2019-06-06 2022-07-22 昆明理工大学 Online medical text symptom identification method based on part-of-speech incremental iteration
CN110245353A (en) * 2019-06-20 2019-09-17 腾讯科技(深圳)有限公司 Natural language representation method, device, equipment and storage medium
CN110245353B (en) * 2019-06-20 2022-10-28 腾讯科技(深圳)有限公司 Natural language expression method, device, equipment and storage medium
CN110442720A (en) * 2019-08-09 2019-11-12 中国电子技术标准化研究院 A kind of multi-tag file classification method based on LSTM convolutional neural networks
CN110516070A (en) * 2019-08-28 2019-11-29 上海海事大学 A kind of Chinese Question Classification method based on text error correction and neural network
CN112992356B (en) * 2021-03-30 2022-04-26 太原理工大学 Heart failure prediction method and device based on convolutional layer feature rearrangement and SVM
CN112992356A (en) * 2021-03-30 2021-06-18 太原理工大学 Heart failure prediction method and device based on convolutional layer feature rearrangement and SVM
CN112989052B (en) * 2021-04-19 2022-03-08 北京建筑大学 Chinese news long text classification method based on combination-convolution neural network
CN112989052A (en) * 2021-04-19 2021-06-18 北京建筑大学 Chinese news text classification method based on combined-convolutional neural network
CN113553844A (en) * 2021-08-11 2021-10-26 四川长虹电器股份有限公司 Domain identification method based on prefix tree features and convolutional neural network
CN113553844B (en) * 2021-08-11 2023-07-25 四川长虹电器股份有限公司 Domain identification method based on prefix tree features and convolutional neural network
CN113869458A (en) * 2021-10-21 2021-12-31 成都数联云算科技有限公司 Training method of text classification model, text classification method and related device
WO2024045247A1 (en) * 2022-08-31 2024-03-07 福建天甫电子材料有限公司 Production management and control system for ammonium fluoride production and control method therefor

Similar Documents

Publication Publication Date Title
CN108595602A (en) The question sentence file classification method combined with depth model based on shallow Model
CN109376242B (en) Text classification method based on cyclic neural network variant and convolutional neural network
Rajayogi et al. Indian food image classification with transfer learning
CN105975573B (en) A kind of file classification method based on KNN
CN108829818A (en) A kind of file classification method
CN110442684A (en) A kind of class case recommended method based on content of text
CN110188272B (en) Community question-answering website label recommendation method based on user background
CN107239529A (en) A kind of public sentiment hot category classification method based on deep learning
CN110083700A A kind of enterprise's public sentiment sensibility classification method and system based on convolutional neural networks
CN107516110A (en) A kind of medical question and answer Semantic Clustering method based on integrated convolutional encoding
CN107169035A (en) A kind of file classification method for mixing shot and long term memory network and convolutional neural networks
CN110222163A (en) A kind of intelligent answer method and system merging CNN and two-way LSTM
CN107291822A (en) The problem of based on deep learning disaggregated model training method, sorting technique and device
CN109635108A (en) A kind of remote supervisory entity relation extraction method based on human-computer interaction
CN107766324A (en) A kind of text coherence analysis method based on deep neural network
CN110321563A (en) Text emotion analysis method based on mixing monitor model
CN108121702A (en) Mathematics subjective item reads and appraises method and system
CN108052504A (en) Mathematics subjective item answers the structure analysis method and system of result
CN110413791A (en) File classification method based on CNN-SVM-KNN built-up pattern
CN109101584A (en) A kind of sentence classification improved method combining deep learning with mathematical analysis
CN110263174A (en) - subject categories the analysis method based on focus
CN110298036A (en) A kind of online medical text symptom identification method based on part of speech increment iterative
CN110188195A (en) A kind of text intension recognizing method, device and equipment based on deep learning
CN107480141A It is a kind of that allocating method is aided in based on the software defect of text and developer's liveness
Qian Exploration of machine algorithms based on deep learning model and feature extraction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20180928)