CN103928023A - Voice scoring method and system - Google Patents
- Publication number
- CN103928023A CN103928023A CN201410178813.XA CN201410178813A CN103928023A CN 103928023 A CN103928023 A CN 103928023A CN 201410178813 A CN201410178813 A CN 201410178813A CN 103928023 A CN103928023 A CN 103928023A
- Authority
- CN
- China
- Prior art keywords
- voice
- examination paper
- scoring
- marked
- speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a voice scoring method comprising the steps of: first, recording an examinee's examination-paper speech; second, preprocessing the examination-paper speech to obtain an examination-paper speech corpus; third, extracting feature parameters from the corpus; fourth, matching the feature parameters against a standard speech template using a speech recognition method based on a hybrid HMM/ANN model, identifying the content of the examination-paper speech, and assigning a preliminary score; fifth, if the preliminary score is below a threshold, taking the preliminary score as the final score, and otherwise scoring sub-indices such as accuracy, fluency, speech rate, rhythm, stress, and intonation; sixth, combining the various scores to obtain the final score of the examination-paper speech. The invention also discloses a voice scoring system. The hybrid-model recognition method makes identification more accurate, and by classifying the evaluation criteria the system can objectively score spoken-examination recordings stored as files after the examinee's recording.
Description
Technical field
The present invention relates to speech recognition and assessment technology, and in particular to a speech assessment method and system.
Background technology
Speech recognition technology is conventionally divided into two classes of application: speaker-dependent recognition and speaker-independent recognition. Speaker-dependent recognition is tailored to one specific person — simply put, it recognises only that person's voice — and is therefore unsuited to a broad population. Speaker-independent recognition, by contrast, can meet the recognition needs of different people and is suitable for large-scale use.
At present, IBM's speech research group holds the leading position in large-vocabulary speech recognition. Bell Laboratories of AT&T has also carried out a series of experiments on speaker-independent recognition, and its results established methods for building standard templates for speaker-independent speech recognition.
The major advances of this period include:
(1) the maturation and continual refinement of hidden Markov model (Hidden Markov Models, HMM) technology, which became the mainstream approach to speech recognition;
(2) in continuous speech recognition, the growing use — beyond acoustic information — of linguistic knowledge such as morphology, syntax, semantics, and dialogue context to further recognise and understand speech; at the same time, statistical language models emerged in speech recognition research;
(3) the rise of applied research on artificial neural networks in speech recognition. Most of this work adopted multilayer perceptron networks trained with the back-propagation (BP) algorithm; there were also feedforward networks, which are structurally simple, easy to implement, and free of feedback signals, and feedback networks, whose connections between neurons are closely tied to system stability and associative memory. Artificial neural networks can discriminate complex classification boundaries, which clearly makes them well suited to pattern classification.
In addition, continuous-speech dictation technology for personal use has steadily improved; the most representative systems are IBM's ViaVoice and Dragon's Dragon Dictate. These systems offer speaker adaptation: a new user need not train the entire vocabulary, and recognition accuracy improves continually with use.
In China, speech recognition has been developed at research institutes and universities such as the Institute of Acoustics and the Institute of Automation of the Chinese Academy of Sciences, Tsinghua University, and Northern Jiaotong University, with Harbin Institute of Technology, the University of Science and Technology of China, Sichuan University, and others following. Many domestic speech recognition systems have been developed successfully, each with its own strengths. In large-vocabulary isolated-word recognition, the most representative is the THED-919 speaker-dependent real-time recognition and understanding system jointly developed by the Department of Electronic Engineering of Tsinghua University and China Electronic Devices Corporation. In continuous speech recognition, the computer centre of Sichuan University has implemented a topic-constrained, speaker-dependent continuous English-Chinese speech translation demonstration system on a microcomputer. In speaker-independent recognition, the voice-controlled telephone directory system developed by the Department of Computer Science and Technology of Tsinghua University has been put into practical use.
In addition, iFLYTEK, the largest intelligent-speech technology provider in China, released the world's first mobile-Internet intelligent voice interaction platform, the iFLYTEK Voice Cloud, in 2010, declaring the arrival of the mobile-Internet voice dictation era.
iFLYTEK's long-term research in intelligent speech has produced world-leading results in Chinese speech synthesis, speech recognition, speech evaluation, and related areas. Speech synthesis and speech recognition realise spoken communication between people and machines, and are the two key technologies needed to build a voice system that can both speak and listen. Automatic speech recognition (Auto Speech Recognize, ASR) aims to let computers "understand" human speech by extracting the textual information it contains. Speech evaluation technology, also called computer-assisted language learning (Computer Assisted Language Learning) technology, is a research frontier of intelligent speech processing: it automatically scores pronunciation, detects errors, and provides corrective feedback. Voiceprint recognition, also called speaker recognition (Speaker Recognition), extracts features that characterise a speaker's identity from the speech signal (for example, fundamental-frequency features reflecting glottal closure frequency, and spectral features reflecting oral-cavity shape and vocal-tract length) and uses them to identify the speaker. Natural language has been an indispensable element of human life, work, and study for thousands of years, and the computer is one of the great inventions of the twentieth century; using computers to process and even understand human natural language, giving them the abilities to listen, speak, read, and write, has long been a research focus actively pursued by institutions at home and abroad.
Summary of the invention
The technical problem to be solved by the invention is to provide a speech assessment method and system that can grade examination papers quickly and accurately, scoring examinees against objective criteria. The invention combines the advantages of existing objective voice-quality evaluation models to obtain a better-performing speech recognition model and training model and a more accurate spoken-language scoring scheme, and can objectively score speech answers stored as files using a multi-index evaluation system. The invention is more stable and more efficient, lays a foundation for putting the research results into practice, and helps achieve automatic grading of large-scale spoken-English examinations.
To solve the above technical problem, the invention provides a speech assessment method comprising the steps of:
S1, recording the examinee's examination-paper speech;
S2, preprocessing the examinee's examination-paper speech to obtain an examination-paper speech corpus;
S3, extracting feature parameters from the examination-paper speech corpus;
S4, using a speech recognition method based on a hybrid HMM/ANN model, matching the feature parameters of the corpus against a standard speech template, identifying the content of the examination-paper speech, and assigning a preliminary score;
S5, if the preliminary score is below a preset threshold, taking it as the final score of the examination-paper speech and marking the paper as a problem paper; if the preliminary score exceeds the threshold, scoring the examination-paper speech on the sub-indices of accuracy, fluency, speech rate, rhythm, stress, and intonation;
S6, weighting the sub-index scores to obtain the final score of the examination-paper speech.
Further, before step S1 the method also comprises a step S0, which specifically comprises the steps of:
S01, recording experts' standard speech;
S02, preprocessing the standard speech to obtain a standard-speech corpus;
S03, extracting feature parameters from the standard-speech corpus;
S04, training a model on the feature parameters of the standard-speech corpus to obtain the standard speech template.
Further, in step S4 the concrete steps of the speech recognition method based on the hybrid HMM/ANN model are:
S41, building an HMM of the feature parameters of the examination-paper speech corpus and obtaining the cumulative probabilities of all states of the HMM;
S42, feeding all the state cumulative probabilities to an ANN classifier as input features, which outputs the recognition result;
S43, matching the recognition result against the standard speech template, thereby identifying the content of the examination-paper speech.
Further, the preprocessing in step S2 specifically comprises pre-emphasis, framing, windowing, noise reduction, endpoint detection, and word segmentation. The concrete noise-reduction step is to use the blank speech segment of the recording as the noise baseline and denoise the subsequent speech against it.
Further, the word segmentation specifically comprises the steps of:
S21, extracting the MFCC parameters of each phoneme in the speech and building an HMM for the corresponding phoneme;
S22, roughly segmenting the speech to obtain effective speech segments;
S23, recognising the words of the speech segments with the phoneme HMMs, thereby converting the speech into a set of words.
Further, the feature extraction in step S3 specifically extracts MFCC feature parameters; the concrete steps are to apply a fast Fourier transform, triangular filterbank filtering, a logarithm, and a discrete cosine transform to the preprocessed corpus to obtain the MFCC feature parameters.
Further, the concrete steps of the accuracy scoring in step S5 are:
The push-pull (interpolation/decimation) method is used to normalise the speech sentence to be scored to a duration close to that of the standard sentence; short-time energy is used as the feature to extract the intensity curves of the sentence to be scored and the standard sentence; scoring is based on the degree of fit between the two intensity curves.
Further, the concrete steps of the fluency scoring in step S5 are:
The speech to be scored is cut into a front part and a rear part, and each part is word-segmented to obtain the effective speech segments; the length of the effective speech in each part is divided by the total length of the speech to be scored, and each ratio is compared with its corresponding threshold — if both exceed their thresholds the speech is judged fluent, otherwise it is judged not fluent.
The concrete steps of the speech-rate scoring are: calculating the ratio of the voiced portion to the total duration of the speech to be scored, and scoring the speech rate according to that ratio.
The concrete steps of the rhythm scoring are: calculating the rhythm of the speech to be scored with an improved dPVI parameter formula.
The concrete steps of the stress scoring are: on the normalised intensity curve, dividing stress units using a double threshold (a stress threshold and a non-stress threshold) together with the stressed-vowel duration as features, and pattern-matching the sentence to be scored against the standard sentence with the DTW algorithm to score stress.
The concrete steps of the intonation scoring are: extracting the formants of the speech to be scored and of the standard speech, and scoring intonation by the degree of fit between the formant trend of the speech to be scored and that of the standard speech.
The present invention also provides a speech assessment system comprising:
a voice recording module for recording the examinee's examination-paper speech;
a preprocessing module for preprocessing the examinee's examination-paper speech to obtain an examination-paper speech corpus;
a feature extraction module for extracting the feature parameters of the examination-paper speech corpus;
a speech recognition module for matching the feature parameters of the corpus against the standard speech template using the speech recognition method based on the hybrid HMM/ANN model, identifying the content of the examination-paper speech, and assigning a preliminary score;
a speech evaluation module for scoring accuracy, fluency, speech rate, rhythm, stress, and intonation for examination-paper speech whose preliminary score exceeds the set threshold; and
a comprehensive scoring module for combining the accuracy, fluency, speech-rate, rhythm, stress, and intonation scores to obtain the final score of examination-paper speech whose preliminary score exceeds the set threshold.
Implementing the present invention has the following beneficial effects:
1. practical noise reduction and word segmentation are added to the preprocessing module, yielding a higher-quality speech corpus;
2. the speech recognition method based on the hybrid HMM/ANN model performs better and recognises more accurately;
3. multi-index analysis of speech rate, rhythm, stress, and intonation diversifies the scoring indices beyond the original read-aloud questions, making results more objective;
4. the combined analysis of accuracy and fluency extends scoring, previously limited to read-aloud questions, to non-read-aloud question types such as translation, question-and-answer, and repetition, establishing a reasonably complete speech assessment method and system that grades papers quickly and accurately against objective criteria;
5. the invention is more stable, more efficient, practical, and widely applicable; applied to the grading of spoken-English tests, it markedly shortens grading time, improves processing efficiency, and improves the objectivity of grading.
Brief description of the drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the speech assessment method provided by an embodiment of the present invention;
Fig. 2 is a flow diagram of the concrete steps of step S0;
Fig. 3 is a flow diagram of the concrete preprocessing steps in Fig. 1;
Fig. 4 is a flow diagram of the concrete word-segmentation steps in Fig. 3;
Fig. 5 is a flow diagram of the concrete steps of MFCC feature extraction;
Fig. 6 is a flow diagram of the concrete steps of the speech recognition method based on the hybrid HMM/ANN model;
Fig. 7 is a structural diagram of the speech assessment system provided by an embodiment of the present invention.
Embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments herein without creative effort fall within the protection scope of the present invention.
An embodiment of the present invention provides a speech assessment method which, as shown in Fig. 1, comprises the steps of:
S1, recording the examinee's examination-paper speech;
S2, preprocessing the examinee's examination-paper speech to obtain an examination-paper speech corpus;
S3, extracting feature parameters from the examination-paper speech corpus;
S4, using a speech recognition method based on a hybrid model of hidden Markov models (Hidden Markov Models, HMM) and artificial neural networks (Artificial Neural Networks, ANN), matching the feature parameters of the examination-paper speech corpus against a standard speech template, identifying the content of the examination-paper speech, and assigning a preliminary score;
S5, if the preliminary score is below a preset threshold, taking it as the final score of the examination-paper speech and marking the paper as a problem paper; if the preliminary score exceeds the threshold, scoring the examination-paper speech on the sub-indices of accuracy, fluency, speech rate, rhythm, stress, and intonation;
S6, weighting the sub-index scores to obtain the final score of the examination-paper speech.
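The threshold decision of step S5 and the weighted combination of step S6 can be sketched as follows. This is a minimal sketch: the threshold value, the weights, and the function and index names are illustrative assumptions, since the patent specifies none of them.

```python
THRESHOLD = 60.0  # assumed pass threshold for the preliminary recognition score

# Assumed weights for the six sub-indices; the patent only says the
# sub-scores are weighted, not how.
WEIGHTS = {
    "accuracy": 0.30, "fluency": 0.20, "speed": 0.10,
    "rhythm": 0.15, "stress": 0.10, "intonation": 0.15,
}

def final_score(raw_score, sub_scores=None):
    """Return (final score, is_problem_paper) per steps S5-S6."""
    if raw_score < THRESHOLD:
        # Below threshold: the preliminary score is final and the
        # paper is flagged as a problem paper.
        return raw_score, True
    # Otherwise the six sub-index scores are combined by their weights.
    weighted = sum(WEIGHTS[k] * sub_scores[k] for k in WEIGHTS)
    return weighted, False
```

With these assumed weights (which sum to 1), uniform sub-scores pass through unchanged, which makes the aggregation easy to sanity-check.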
Further, before step S1 the method also comprises a step S0 which, as shown in Fig. 2, specifically comprises the steps of:
S01, recording experts' standard speech;
The standard speech is recorded by a number of professional speakers in a controlled environment, and its content corresponds to the content of the spoken-English examination;
S02, preprocessing the standard speech to obtain a standard-speech corpus;
S03, extracting feature parameters from the standard-speech corpus;
S04, training a model on the feature parameters of the standard-speech corpus to obtain the standard speech template.
Here, training the standard-speech model means deriving, according to a chosen criterion, the model parameters that characterise the essential features of the pattern — that is, the standard speech template — from a large number of known patterns. Concretely, the training process iteratively adjusts the parameters of the system template (including the state-transition matrix probabilities and the variances, means, and weights of the Gaussian mixture models) on the initial data until the recognition system approaches an optimum. Because professional speakers' standard speech differs to some extent from examinees' speech, and the invention scores ordinary people, the corpus is expanded as far as possible: from specific professionals to ordinary speakers, from a controlled environment to ordinary environments, and to include voices of speakers of different sexes, ages, and accents.
Each step is described in detail below.
1. Preprocessing
As shown in Fig. 3, the preprocessing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection, and word segmentation. The purpose of preprocessing is to eliminate the effects on speech quality of the speaker's vocal organs and of the recording equipment, providing high-quality parameters for feature extraction and thereby improving the quality of subsequent speech processing.
The concrete noise-reduction step is to use the blank speech segment of the recording as the noise baseline and denoise the subsequent speech against it. Research shows that before an examinee starts speaking there is usually a short interval with no voicing; this stretch of the recording is not blank but contains noise. By extracting the audio of this stretch as the noise baseline, the subsequent recording can be denoised, and the noise in unvoiced segments is removed at the same time.
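The noise-baseline idea can be sketched as a simple subtraction in the energy domain: estimate the noise floor from the leading speech-free frames and subtract it from the rest, clamping at zero. This is a sketch under assumptions — a real system would typically subtract per frequency bin in the spectral domain, and the frame energies and leading-segment length here are hypothetical.

```python
def denoise(frame_energies, noise_frames=5):
    """Subtract a noise floor estimated from the leading speech-free
    frames (the 'noise baseline' of the text), clamping at zero."""
    noise_base = sum(frame_energies[:noise_frames]) / noise_frames
    return [max(e - noise_base, 0.0) for e in frame_energies]
```

A side effect matching the text: frames whose energy is at or below the baseline (unvoiced segments) come out as exactly zero.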
Word segmentation means cutting a sentence into individual words or phrases, so that the machine can "understand" the examinee's utterance by recognising them one by one, preparing for the later stages in which the computer analyses the bonus and penalty factors and computes the final automatic score. As shown in Fig. 4, word segmentation specifically comprises the steps of:
S21, extracting the Mel-frequency cepstral coefficient (Mel Frequency Cepstrum Coefficient, MFCC) parameters of each phoneme in the speech and building an HMM for the corresponding phoneme;
S22, roughly segmenting the speech to obtain effective speech segments;
Rough segmentation has two purposes: first, to reduce the amount of computation and thereby the segmentation time; second, to increase segmentation accuracy. Rough segmentation uses the double-threshold method to cut out the obviously blank stretches, but with relatively low thresholds, so as to retain the effective speech segments;
S23, recognising the words of the speech segments with the phoneme HMMs, thereby converting the speech into a set of words.
This word-segmentation method has the advantages of high recognition rate, high accuracy, and small error: 1) the number of recognition templates is fixed, so the HMM accuracy is very high, and no output-probability threshold needs to be set, which greatly improves the recognition rate; 2) segmentation yields the pronunciation of each word, which can assist keyword matching and thus reduces word-matching errors.
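The double-threshold rough segmentation of step S22 can be sketched on a per-frame energy sequence: a segment is triggered where energy crosses a high threshold and extended outward while it stays above a low one. The threshold values and data here are illustrative assumptions; the patent gives no numbers.

```python
def rough_segments(energy, low, high):
    """Double-threshold rough segmentation: a segment starts where
    energy exceeds `high` and is extended both ways while energy
    stays above `low`. Returns (start, end) frame-index pairs."""
    segs, n, i = [], len(energy), 0
    while i < n:
        if energy[i] > high:
            s = i
            while s > 0 and energy[s - 1] > low:   # extend left
                s -= 1
            e = i
            while e + 1 < n and energy[e + 1] > low:  # extend right
                e += 1
            segs.append((s, e))
            i = e + 1
        else:
            i += 1
    return segs
```

Using low thresholds, as the text suggests, keeps low-energy edges of a word inside the effective segment instead of cutting them off.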
2. Feature extraction
The feature extraction in step S3 specifically extracts MFCC feature parameters; as shown in Fig. 5, the concrete steps are to apply a fast Fourier transform, triangular filterbank filtering, a logarithm, and a discrete cosine transform to the preprocessed corpus to obtain the MFCC feature parameters. MFCC parameters are used because they take account of the auditory properties of the human ear: the spectrum is converted to a nonlinear Mel-frequency spectrum and then transformed to the cepstral domain. Without any prior assumptions, the auditory properties of the ear are simulated mathematically by a bank of triangular filters densely placed in the low-frequency region, capturing the spectral information of the speech. In addition, MFCC parameters are robust to noise and spectral distortion, which improves the recognition performance of the system.
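The FFT → triangular filtering → log → DCT chain can be illustrated end to end with the toy sketch below. It is not a production MFCC front end: it uses a naive DFT in place of the FFT and evenly spaced triangular filters rather than true Mel-scale spacing, so it shows only the structure of the computation.

```python
import math

def power_spectrum(frame):
    """Naive DFT power spectrum (stands in for the FFT step)."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(-2 * math.pi * k * t / n) for t, x in enumerate(frame))
        im = sum(x * math.sin(-2 * math.pi * k * t / n) for t, x in enumerate(frame))
        spec.append((re * re + im * im) / n)
    return spec

def triangular_filterbank(n_filters, n_bins):
    """Evenly spaced triangular filters; a real implementation spaces
    the centres on the Mel scale. Assumes n_bins is large enough that
    adjacent filter centres do not collide."""
    centers = [round(i * (n_bins - 1) / (n_filters + 1)) for i in range(n_filters + 2)]
    banks = []
    for i in range(n_filters):
        lo, mid, hi = centers[i], centers[i + 1], centers[i + 2]
        f = [0.0] * n_bins
        for b in range(lo, hi + 1):
            f[b] = (b - lo) / (mid - lo) if b <= mid else (hi - b) / (hi - mid)
        banks.append(f)
    return banks

def mfcc(frame, n_filters=4, n_ceps=3):
    """FFT -> triangular filtering -> log -> DCT, as in step S3."""
    spec = power_spectrum(frame)
    banks = triangular_filterbank(n_filters, len(spec))
    energies = [max(sum(w * s for w, s in zip(f, spec)), 1e-10) for f in banks]
    logs = [math.log(e) for e in energies]
    # DCT-II of the log filterbank energies gives the cepstral coefficients.
    return [sum(l * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, l in enumerate(logs)) for c in range(n_ceps)]
```

The `1e-10` floor keeps the logarithm defined for filters that capture no energy.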
3. Speech content recognition
Step S4 adopts a speech recognition method based on a hybrid HMM/ANN model. The HMM approach requires prior statistical knowledge of the speech signal, has weak classification and decision ability, a complex structure, and needs large numbers of training samples and heavy computation. The ANN, although advantageous in decision-making, describes dynamic time signals poorly, and neural-network speech recognition suffers from long training and recognition times. To overcome these respective shortcomings, the invention organically combines the HMM, with its strong temporal modelling ability, and the ANN, with its strong classification ability, further improving the robustness and accuracy of speech recognition. The method not only overcomes the overlap between pattern classes that the HMM alone cannot resolve, improving the recognition of easily confused words, but also overcomes the ANN's restriction to fixed-length input patterns, saving complex time-normalisation computation. Concretely, as shown in Fig. 6, the concrete steps of the speech recognition method based on the hybrid HMM/ANN model in step S4 are:
S41, building an HMM of the feature parameters of the examination-paper speech corpus and obtaining the cumulative probabilities of all states of the HMM;
S42, feeding all the state cumulative probabilities to an ANN classifier (specifically a self-organising neural network) as input features, which outputs the recognition result;
S43, matching the recognition result against the standard speech template, thereby identifying the content of the examination-paper speech.
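A minimal sketch of the S41-S42 pipeline, under simplifying assumptions: `hmm_forward` computes the forward ("cumulative") state probabilities of a discrete HMM over an observation sequence, and a single linear layer stands in for the ANN classifier (the patent specifies a self-organising network, which is not reproduced here). All names and values are illustrative.

```python
def hmm_forward(obs_prob, trans, init):
    """Forward algorithm (step S41): cumulative probability of ending in
    each state after the whole observation sequence.
    obs_prob[t][s] = P(observation at time t | state s)."""
    n = len(init)
    alpha = [p * obs_prob[0][s] for s, p in enumerate(init)]
    for t in range(1, len(obs_prob)):
        alpha = [obs_prob[t][j] * sum(alpha[i] * trans[i][j] for i in range(n))
                 for j in range(n)]
    return alpha

def ann_classify(features, weight_rows):
    """Stand-in for the ANN stage (S42): the HMM state probabilities
    become the network's input features; here one linear layer + argmax."""
    scores = [sum(w * f for w, f in zip(row, features)) for row in weight_rows]
    return max(range(len(scores)), key=scores.__getitem__)
```

The key idea of the hybrid is visible in the interface: the HMM handles the variable-length time dimension, so the ANN always receives a fixed-length feature vector.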
4. Speech evaluation
In practice, some examinees do not complete the spoken test within the allotted time, so the resulting examination-paper speech contains large blanks or unrecognisable content; such recordings are marked as problem papers. Problem papers include blank recordings and various unrecognisable recordings, such as recordings in languages other than English or recordings with excessive noise. Step S4 therefore not only identifies the content the examinee read but also detects problem papers and assigns them a low score as appropriate; for problem papers there is no need to score accuracy, fluency, speech rate, rhythm, stress, and intonation. Only when the preliminary score exceeds the preset threshold is further speech evaluation carried out.
(1) The concrete steps of the accuracy scoring in step S5 are: using the push-pull (interpolation/decimation) method to normalise the sentence to be scored to a duration close to that of the standard sentence; extracting the intensity curves of the sentence to be scored and the standard sentence using short-time energy as the feature; and scoring by the degree of fit between the two intensity curves.
The intensity of a sentence reflects the variation of the speech signal over time: the loudness of stressed syllables appears as energy intensity in the time domain, and stressed syllables show high energy. However, because different people at different times pronounce the same sentence with different durations and intensities, directly template-matching the intensity curves of the sentence to be scored and the standard sentence would compromise the objectivity of the evaluation. The invention therefore adapts an intensity-curve extraction method based on the standard sentence: when the sentence to be scored is shorter than the standard sentence, interpolation is used to lengthen it; when it is longer, decimation is used to shorten it; finally, the intensity curve of the sentence to be scored is normalised against the peak intensity of the standard sentence's curve.
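The duration and intensity normalisation just described can be sketched as linear resampling to the reference length (interpolation lengthens, decimation shortens) followed by peak scaling. The helper names are illustrative; the patent does not specify the interpolation scheme.

```python
def time_normalize(curve, target_len):
    """Resample an intensity curve to the reference length by linear
    interpolation/decimation (assumes target_len > 1)."""
    n = len(curve)
    if n == target_len:
        return list(curve)
    out = []
    for i in range(target_len):
        pos = i * (n - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(curve[lo] * (1 - frac) + curve[hi] * frac)
    return out

def intensity_normalize(curve, ref_peak):
    """Scale so the curve's peak matches the standard curve's peak."""
    peak = max(curve)
    return [x * ref_peak / peak for x in curve]
```

After both normalisations the two curves share a time base and a peak, so their pointwise fit can be compared meaningfully.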
(2) The concrete steps of the fluency scoring are: cutting the speech to be scored into a front part and a rear part, and word-segmenting each to obtain the effective speech segments; dividing the length of the effective speech in each part by the total length of the speech to be scored, and comparing each ratio with its corresponding threshold — if both exceed their thresholds the speech is judged fluent, otherwise it is judged not fluent.
Sentence-level fluency combines a measure of how smoothly the sentence is expressed with a rhythm score of the pronunciation computed against the standard speech, fusing the two into a fluency diagnostic model for the sentence. This sentence-level method can also be applied to passage-level fluency scoring. Because it considers the speaker's smoothness while delivering the sentence, it correlates better with human judgements than classic methods, and is therefore suitable for use in a speech assessment system.
(3) The speech-rate scoring steps are: compute the ratio of the pronounced portion of the speech to be scored to the total duration of the speech to be scored, and score the speech rate according to that ratio.
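A minimal sketch of such ratio-based speech-rate scoring, assuming full marks at an ideal voiced ratio with a linear fall-off on either side (both the ideal and the tolerance are illustrative values, not from the patent):

```python
def speech_rate_score(voiced_frames, total_frames, ideal=0.65, tolerance=0.35):
    """Score speech rate from the fraction of the utterance that is
    actual pronunciation: 100 at the assumed ideal ratio, dropping
    linearly to 0 as the ratio drifts `tolerance` away from it."""
    ratio = voiced_frames / total_frames
    score = max(0.0, 1.0 - abs(ratio - ideal) / tolerance)
    return round(score * 100, 1)
```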
(4) The rhythm scoring steps are: compute the rhythm of the speech to be scored with an improved distinct Pairwise Variability Index (dPVI) parameter formula. Based on the durational variability of speech units, dPVI compares the syllable-unit segment durations of the standard sentence and of the sentence to be scored pair by pair, and the resulting parameter serves as the basis for objective evaluation and feedback guidance.
where d_k is the duration of the k-th voice unit segment into which a sentence is divided, m = min(number of units in the standard sentence, number of units in the sentence to be scored), and Len_std is the duration of the standard sentence. Because the sentence to be scored has already been time-normalized to approximately the standard sentence's duration before the PVI computation, only Len_std needs to be used as the normalizing unit in the calculation.
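The dPVI comparison can be sketched as below. The patent's exact formula appears only in a figure, so this follows the variable definitions given in the text: pair the k-th unit durations up to m = min of the two unit counts, sum the absolute differences, and normalize by Len_std (the x100 scaling is an assumption borrowed from the classic PVI):

```python
def improved_dpvi(std_units, test_units, len_std):
    """Hedged sketch of the improved dPVI described above.
    std_units / test_units: per-unit segment durations (seconds) of the
    standard sentence and the (already time-normalized) test sentence;
    len_std: duration of the standard sentence, used as the normalizer."""
    m = min(len(std_units), len(test_units))
    diff = sum(abs(std_units[k] - test_units[k]) for k in range(m))
    return 100.0 * diff / len_std
```

A perfectly matched rhythm gives 0; larger values mean the examinee's unit durations deviate more from the standard pattern.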
(5) The stress scoring steps are: on the normalized intensity curve, divide out stress units using a stress threshold and a non-stress threshold as a double threshold, together with the stressed-vowel duration as a feature; then apply the dynamic time warping (DTW) algorithm to pattern-match the sentence to be scored against the standard sentence, yielding the stress score.
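A hedged sketch of the double-threshold stress-unit division on the normalized intensity curve; the threshold values and the minimum-length check (standing in for the stressed-vowel duration feature) are assumptions, not values from the patent:

```python
def find_stress_units(intensity, high, low, min_len=3):
    """Double-threshold stress detection sketch: a stress unit starts
    when intensity rises to the stress threshold `high` and extends
    while it stays above the non-stress threshold `low`; units shorter
    than `min_len` frames (a stand-in for the stressed-vowel duration
    check) are discarded. Returns (start, end) frame index pairs."""
    units, start = [], None
    for i, v in enumerate(intensity):
        if start is None:
            if v >= high:
                start = i
        elif v < low:
            if i - start >= min_len:
                units.append((start, i))
            start = None
    # Close a unit still open at the end of the curve.
    if start is not None and len(intensity) - start >= min_len:
        units.append((start, len(intensity)))
    return units
```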
Stress refers to the emphasized sounds in words, phrases and sentences. The basic principle of DTW is dynamic time warping: it aligns a test template and a reference template whose time spans do not originally match. Similarity is computed with the conventional Euclidean distance; if the reference and test templates are R and T, a smaller distance D[T, R] means higher similarity. The drawback of classical DTW is that during template matching every frame carries the same weight and every template must be matched, so the computation is heavy and grows especially quickly as the number of templates increases. The present invention therefore adopts an improved DTW algorithm for matching the sentence to be scored against the standard sentence: it remedies the shortcoming of classical DTW by weighting frames by importance, which greatly reduces the computation and makes the result more accurate.
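Classic DTW, which the improved algorithm builds on, can be sketched as below; the optional per-frame weights stand in for the prioritized frame weighting, whose exact scheme the patent does not disclose (uniform weights reproduce classical DTW):

```python
def dtw_distance(test, ref, weights=None):
    """Classic DTW between two 1-D sequences, with an optional
    per-frame weight on the test side. Returns the cumulative
    alignment distance (0 = perfect match after warping)."""
    n, m = len(test), len(ref)
    INF = float("inf")
    if weights is None:
        weights = [1.0] * n  # uniform weights = classical DTW
    # d[i][j] = best cumulative cost aligning test[:i+1] with ref[:j+1]
    d = [[INF] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = weights[i] * abs(test[i] - ref[j])
            if i == 0 and j == 0:
                d[i][j] = cost
            else:
                best = INF
                if i > 0:
                    best = min(best, d[i - 1][j])
                if j > 0:
                    best = min(best, d[i][j - 1])
                if i > 0 and j > 0:
                    best = min(best, d[i - 1][j - 1])
                d[i][j] = cost + best
    return d[n - 1][m - 1]
```

Identical sequences give distance 0, and the warping absorbs tempo differences such as a repeated frame, which is exactly why unequal-length sentences can still be compared.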
(6) The intonation scoring steps are: extract the formants of the speech to be scored and of the standard speech, and score the intonation according to how well the variation trend of the test speech's formants fits the variation trend of the standard speech's formants.
Intonation is an important indicator of expressive ability in spoken English communication; it reflects the speaker's overall state of language use, and is heard as the relative weight and pacing of the speech and the rise and fall of its tone.
In digital speech signal processing research, the formants of a speech signal are a very important performance parameter. A formant is a region of the spectrum where acoustic energy is relatively concentrated; while not the sole determinant of timbre, it reflects the physical characteristics of the vocal tract (the resonant cavity). As sound passes through the resonant cavity it is filtered by it: the energy at different frequencies is redistributed, part of it reinforced by the cavity's resonance and part attenuated, and the reinforced frequencies appear as dense dark bands on a time-frequency spectrogram. Because the energy distribution is uneven, the strong parts look like mountain peaks, hence the name formant. Formants are a key feature of the vocal tract's resonance characteristics and a direct carrier of pronunciation information, which people exploit in speech perception, so they are among the most important characteristic parameters in speech signal processing. Formants are the set of resonant frequencies produced when a quasi-periodic pulse excitation enters the vocal tract. Formant parameters comprise the formant frequency and its bandwidth and are important for distinguishing different vowels. Since formant information is contained in the spectral envelope, the key to extracting formant parameters is estimating the natural speech spectral envelope; the maxima of the envelope are generally taken to be the formants.
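One plausible reading of "fitting degree of the variation tendency" is the correlation of the two formant trajectories; the patent does not fix the measure, so Pearson correlation is used here as an illustrative stand-in:

```python
def trend_similarity(test_track, std_track):
    """Score how well the test formant track's shape fits the standard
    track's shape using Pearson correlation (an assumed measure of
    'fitting degree of the variation tendency'). Both tracks must have
    the same number of frames; returns a value in [-1, 1]."""
    n = len(test_track)
    mean_t = sum(test_track) / n
    mean_s = sum(std_track) / n
    num = sum((t - mean_t) * (s - mean_s)
              for t, s in zip(test_track, std_track))
    den_t = sum((t - mean_t) ** 2 for t in test_track) ** 0.5
    den_s = sum((s - mean_s) ** 2 for s in std_track) ** 0.5
    if den_t == 0 or den_s == 0:
        return 0.0  # a flat track carries no trend information
    return num / (den_t * den_s)
```

A rising test track against a rising standard track scores near 1; an inverted contour scores near -1, flagging wrong intonation even when the absolute frequencies differ.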
The present invention also provides a speech assessment system, as shown in Figure 7, comprising:
a voice recording module 101, for recording the examinee's examination paper voice;
a preprocessing module 102, for preprocessing the examinee's examination paper voice to obtain an examination paper speech corpus;
a characteristic parameter extraction module 103, for extracting the characteristic parameters of the examination paper speech corpus;
a speech recognition module 104, for performing feature matching between the characteristic parameters of the examination paper speech corpus and the standard speech template by the speech recognition method based on the HMM-ANN hybrid model, recognizing the content of the examination paper voice, and giving a preliminary score;
a speech scoring module 105, for performing accuracy, fluency, speech-rate, rhythm, stress and intonation scoring on examination paper voice whose preliminary score exceeds the set threshold;
a comprehensive scoring module 106, for combining the accuracy, fluency, speech-rate, rhythm, stress and intonation scores to obtain the final score of the examination paper voice whose preliminary score exceeds the set threshold.
Since the speech assessment system corresponds to the speech assessment method, the specific processing of each module can be found in the corresponding steps of the speech assessment method and is not repeated here.
Implementing the present invention has the following beneficial effects:
(1) practical noise-reduction and word-segmentation methods are added to the preprocessing module, yielding a higher-quality speech corpus;
(2) the speech recognition method based on the HMM-ANN hybrid model performs better and recognizes more accurately;
(3) multi-index analysis of speech rate, rhythm, stress and intonation makes the scoring indices more diverse than the original read-aloud scoring, so the results are more objective;
(4) the dual analysis of accuracy and fluency extends scoring, which previously covered only read-aloud questions, to non-read-aloud question types such as translation, question-and-answer and repetition questions, establishing a reasonably complete speech assessment method and system that grades papers quickly and accurately against objective scoring criteria;
(5) the present invention is more stable, more efficient, practical and widely applicable; applied to the grading of spoken-English tests, it markedly shortens grading time, improves processing efficiency and also improves the objectivity of grading.
The above discloses only a preferred embodiment of the present invention, which of course cannot limit the scope of its claims; equivalent variations made according to the claims of the present invention therefore remain within the scope covered by the present invention.
Claims (10)
1. A speech assessment method, characterized by comprising the steps of:
S1, recording the examinee's examination paper voice;
S2, preprocessing the examinee's examination paper voice to obtain an examination paper speech corpus;
S3, extracting characteristic parameters of the examination paper speech corpus;
S4, performing feature matching between the characteristic parameters of the examination paper speech corpus and a standard speech template by a speech recognition method based on an HMM-ANN hybrid model, recognizing the content of the examination paper voice, and giving a preliminary score;
S5, if the preliminary score is below a preset threshold, taking the preliminary score as the final score of the examination paper voice and marking the examination paper voice as a problem paper; if the preliminary score is above the preset threshold, scoring the examination paper voice on the sub-indices of accuracy, fluency, speech rate, rhythm, stress and intonation;
S6, weighting the sub-index scores to obtain the final score of the examination paper voice.
2. The speech assessment method of claim 1, characterized in that step S1 is preceded by a step S0, which specifically comprises the steps of:
S01, recording an expert's standard speech;
S02, preprocessing the standard speech to obtain a standard speech corpus;
S03, extracting characteristic parameters of the standard speech corpus;
S04, training a model on the characteristic parameters of the standard speech corpus to obtain the standard speech template.
3. The speech assessment method of claim 1, characterized in that the specific steps of the speech recognition method based on the HMM-ANN hybrid model in step S4 are:
S41, building an HMM model of the characteristic parameters of the examination paper speech corpus and obtaining the cumulative probabilities of all states in the HMM model;
S42, feeding all the state cumulative probabilities to an ANN classifier as input features, thereby outputting a recognition result;
S43, performing feature matching between the recognition result and the standard speech template, thereby recognizing the content of the examination paper voice.
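Steps S41-S42 can be illustrated with a toy discrete HMM whose forward (cumulative) state log-probabilities feed a single-layer classifier standing in for the ANN; all dimensions, weights and the discrete observation alphabet are illustrative assumptions, not the patent's models:

```python
import math

def logsumexp(xs):
    """Numerically stable log of a sum of exponentials."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def forward_log_probs(obs, log_init, log_trans, log_emit):
    """HMM forward pass: per-state cumulative log-probabilities after
    consuming the observation sequence (the 'state cumulative
    probabilities' of step S41)."""
    n = len(log_init)
    alpha = [log_init[s] + log_emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [logsumexp([alpha[p] + log_trans[p][s] for p in range(n)])
                 + log_emit[s][o] for s in range(n)]
    return alpha

def ann_classify(features, weights, biases):
    """Single linear layer standing in for the ANN classifier of step
    S42: the cumulative probabilities are its input features; returns
    the index of the winning class."""
    scores = [b + sum(w * f for w, f in zip(row, features))
              for row, b in zip(weights, biases)]
    return scores.index(max(scores))
```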
4. The speech assessment method of claim 1, characterized in that the preprocessing in step S2 specifically comprises noise reduction, pre-emphasis, framing, windowing, endpoint detection and word segmentation, wherein the specific step of the noise reduction is to take a blank (silent) segment of the speech as the noise baseline and denoise the subsequent speech against it.
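The noise-reduction idea of claim 4 can be sketched in the energy domain: estimate the noise baseline from the leading blank segment, then subtract it from later frames. This is a simplified stand-in for full spectral subtraction, and the per-frame magnitude representation is an assumption:

```python
def denoise(frames, noise_frames=5):
    """Use the leading blank segment (first `noise_frames` frames) as
    the noise baseline and subtract it from every frame, clamping at
    zero. `frames` is a list of per-frame average magnitudes."""
    noise = sum(frames[:noise_frames]) / noise_frames
    return [max(0.0, f - noise) for f in frames]
```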
5. The speech assessment method of claim 4, characterized in that the word segmentation specifically comprises the steps of:
S21, extracting the MFCC parameters of each phoneme in the speech and building an HMM model for the corresponding phoneme;
S22, coarsely segmenting the speech to obtain effective speech segments;
S23, recognizing the words of the speech segments from the phoneme HMM models, thereby converting the speech into a set of words.
6. The speech assessment method of claim 1, characterized in that extracting the characteristic parameters in step S3 is specifically extracting MFCC characteristic parameters, the specific steps being: applying a fast Fourier transform (FFT), triangular-window (mel filterbank) filtering, a logarithm operation and a discrete cosine transform to the corpus obtained after preprocessing, yielding the MFCC characteristic parameters.
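The MFCC pipeline of claim 6 can be sketched end to end for a single frame; the toy sizes, the plain DFT used in place of an FFT, and the reading of the garbled "quarter window filtering" as triangular mel-filterbank filtering are all assumptions:

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sample_rate=8000, n_filters=8, n_ceps=4):
    """Minimal MFCC sketch for one windowed frame, following the
    claimed pipeline: DFT -> triangular mel filterbank -> log -> DCT.
    Sizes are toy values for illustration."""
    n = len(frame)
    half = n // 2 + 1
    # Power spectrum via a plain DFT (a real FFT would be used in practice).
    power = []
    for k in range(half):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        power.append((re * re + im * im) / n)
    # Triangular filters equally spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2)
    mel_pts = [i * top / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(round(mel_to_hz(m) * n / sample_rate)) for m in mel_pts]
    energies = []
    for f in range(1, n_filters + 1):
        lo, mid, hi = bins[f - 1], bins[f], bins[f + 1]
        e = 0.0
        for k in range(lo, min(hi, half)):
            if k < mid and mid > lo:
                e += power[k] * (k - lo) / (mid - lo)
            elif k >= mid and hi > mid:
                e += power[k] * (hi - k) / (hi - mid)
        energies.append(math.log(max(e, 1e-10)))  # log of filterbank energy
    # DCT-II of the log energies gives the cepstral coefficients.
    return [sum(energies[j] * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j in range(n_filters)) for c in range(n_ceps)]
```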
7. The speech assessment method of claim 1, characterized in that the specific steps of the accuracy scoring in step S5 are:
normalizing the sentence to be scored to a duration close to that of the standard sentence by interpolation and decimation; extracting the intensity curves of the sentence to be scored and of the standard sentence using short-time energy as the feature; and scoring by comparing the fitting degree of the two intensity curves.
8. The speech assessment method of claim 1, characterized in that the specific steps of the fluency scoring in step S5 are:
cutting the speech to be scored into a front part and a back part, and applying word segmentation to each part to obtain the effective speech segments; dividing the length of the effective speech in each part by the total length of the speech to be scored, and comparing each resulting ratio with its corresponding threshold; if both ratios exceed their thresholds, the speech is judged fluent, otherwise it is judged not fluent.
9. The speech assessment method of claim 1, characterized in that, in step S5,
the specific steps of the speech-rate scoring are: computing the ratio of the pronounced portion of the speech to be scored to its total duration, and scoring the speech rate according to that ratio;
the specific steps of the rhythm scoring are: computing the rhythm of the speech to be scored with an improved dPVI parameter formula;
the specific steps of the stress scoring are: on the normalized intensity curve, dividing out stress units using a stress threshold and a non-stress threshold as a double threshold together with the stressed-vowel duration as a feature, and applying the DTW algorithm to pattern-match the sentence to be scored against the standard sentence, yielding the stress score;
the specific steps of the intonation scoring are: extracting the formants of the speech to be scored and of the standard speech, and scoring the intonation according to how well the variation trend of the test speech's formants fits the variation trend of the standard speech's formants.
10. A speech assessment system, characterized by comprising:
a voice recording module, for recording the examinee's examination paper voice;
a preprocessing module, for preprocessing the examinee's examination paper voice to obtain an examination paper speech corpus;
a characteristic parameter extraction module, for extracting the characteristic parameters of the examination paper speech corpus;
a speech recognition module, for performing feature matching between the characteristic parameters of the examination paper speech corpus and a standard speech template by a speech recognition method based on an HMM-ANN hybrid model, recognizing the content of the examination paper voice, giving a preliminary score and marking whether the paper is a problem paper;
a speech scoring module, for performing accuracy, fluency, speech-rate, rhythm, stress and intonation scoring on non-problem examination paper voice whose preliminary score exceeds the preset threshold;
a comprehensive scoring module, for combining the accuracy, fluency, speech-rate, rhythm, stress and intonation scores to obtain the final score of the examination paper voice whose preliminary score exceeds the set threshold.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410178813.XA CN103928023B (en) | 2014-04-29 | 2014-04-29 | A kind of speech assessment method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103928023A true CN103928023A (en) | 2014-07-16 |
CN103928023B CN103928023B (en) | 2017-04-05 |
Family
ID=51146222
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410178813.XA Expired - Fee Related CN103928023B (en) | 2014-04-29 | 2014-04-29 | A kind of speech assessment method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103928023B (en) |
Cited By (61)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104361896A (en) * | 2014-12-04 | 2015-02-18 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104361895A (en) * | 2014-12-04 | 2015-02-18 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104464423A (en) * | 2014-12-19 | 2015-03-25 | 科大讯飞股份有限公司 | Calibration optimization method and system for speaking test evaluation |
CN104485105A (en) * | 2014-12-31 | 2015-04-01 | 中国科学院深圳先进技术研究院 | Electronic medical record generating method and electronic medical record system |
CN104505103A (en) * | 2014-12-04 | 2015-04-08 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104732352A (en) * | 2015-04-02 | 2015-06-24 | 张可 | Method for question bank quality evaluation |
CN104732977A (en) * | 2015-03-09 | 2015-06-24 | 广东外语外贸大学 | On-line spoken language pronunciation quality evaluation method and system |
CN104810017A (en) * | 2015-04-08 | 2015-07-29 | 广东外语外贸大学 | Semantic analysis-based oral language evaluating method and system |
CN105608960A (en) * | 2016-01-27 | 2016-05-25 | 广东外语外贸大学 | Spoken language formative teaching method and system based on multi-parameter analysis |
CN105632488A (en) * | 2016-02-23 | 2016-06-01 | 深圳市海云天教育测评有限公司 | Voice evaluation method and device |
CN105654785A (en) * | 2016-03-18 | 2016-06-08 | 上海语知义信息技术有限公司 | Personalized spoken foreign language learning system and method |
CN105681920A (en) * | 2015-12-30 | 2016-06-15 | 深圳市鹰硕音频科技有限公司 | Network teaching method and system with voice recognition function |
CN105825852A (en) * | 2016-05-23 | 2016-08-03 | 渤海大学 | Oral English reading test scoring method |
CN105989839A (en) * | 2015-06-03 | 2016-10-05 | 乐视致新电子科技(天津)有限公司 | Speech recognition method and speech recognition device |
CN106531182A (en) * | 2016-12-16 | 2017-03-22 | 上海斐讯数据通信技术有限公司 | Language learning system |
CN106548673A (en) * | 2016-10-25 | 2017-03-29 | 合肥东上多媒体科技有限公司 | A kind of Teaching Management Method based on intelligent Matching |
CN106652622A (en) * | 2017-02-07 | 2017-05-10 | 广东小天才科技有限公司 | Text training method and device |
CN106710348A (en) * | 2016-12-20 | 2017-05-24 | 江苏前景信息科技有限公司 | Civil air defense interactive experience method and system |
CN106971711A (en) * | 2016-01-14 | 2017-07-21 | 芋头科技(杭州)有限公司 | A kind of adaptive method for recognizing sound-groove and system |
CN107221318A (en) * | 2017-05-12 | 2017-09-29 | 广东外语外贸大学 | Oral English Practice pronunciation methods of marking and system |
CN107230171A (en) * | 2017-05-31 | 2017-10-03 | 中南大学 | A kind of student, which chooses a job, is orientated evaluation method and system |
CN107239897A (en) * | 2017-05-31 | 2017-10-10 | 中南大学 | A kind of personality occupation type method of testing and system |
CN107274738A (en) * | 2017-06-23 | 2017-10-20 | 广东外语外贸大学 | Chinese-English translation teaching points-scoring system based on mobile Internet |
CN107293286A (en) * | 2017-05-27 | 2017-10-24 | 华南理工大学 | A kind of speech samples collection method that game is dubbed based on network |
CN107292496A (en) * | 2017-05-31 | 2017-10-24 | 中南大学 | A kind of work values cognitive system and method |
CN107578778A (en) * | 2017-08-16 | 2018-01-12 | 南京高讯信息科技有限公司 | A kind of method of spoken scoring |
CN107785011A (en) * | 2017-09-15 | 2018-03-09 | 北京理工大学 | Word speed estimates training, word speed method of estimation, device, equipment and the medium of model |
CN107818797A (en) * | 2017-12-07 | 2018-03-20 | 苏州科达科技股份有限公司 | Voice quality assessment method, apparatus and its system |
CN108428382A (en) * | 2018-02-14 | 2018-08-21 | 广东外语外贸大学 | It is a kind of spoken to repeat methods of marking and system |
CN108429932A (en) * | 2018-04-25 | 2018-08-21 | 北京比特智学科技有限公司 | Method for processing video frequency and device |
CN108831503A (en) * | 2018-06-07 | 2018-11-16 | 深圳习习网络科技有限公司 | A kind of method and device for oral evaluation |
CN108986786A (en) * | 2018-07-27 | 2018-12-11 | 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) | Interactive voice equipment ranking method, system, computer equipment and storage medium |
CN109036429A (en) * | 2018-07-25 | 2018-12-18 | 浪潮电子信息产业股份有限公司 | A kind of voice match scoring querying method and system based on cloud service |
CN109147823A (en) * | 2018-10-31 | 2019-01-04 | 河南职业技术学院 | Oral English Practice assessment method and Oral English Practice assessment device |
CN109214616A (en) * | 2017-06-29 | 2019-01-15 | 上海寒武纪信息科技有限公司 | A kind of information processing unit, system and method |
CN109493658A (en) * | 2019-01-08 | 2019-03-19 | 上海健坤教育科技有限公司 | Situated human-computer dialogue formula spoken language interactive learning method |
WO2019075828A1 (en) * | 2017-10-20 | 2019-04-25 | 深圳市鹰硕音频科技有限公司 | Voice evaluation method and apparatus |
CN109727608A (en) * | 2017-10-25 | 2019-05-07 | 香港中文大学深圳研究院 | A kind of ill voice appraisal procedure based on Chinese speech |
CN109979484A (en) * | 2019-04-03 | 2019-07-05 | 北京儒博科技有限公司 | Pronounce error-detecting method, device, electronic equipment and storage medium |
CN110135492A (en) * | 2019-05-13 | 2019-08-16 | 山东大学 | Equipment fault diagnosis and method for detecting abnormality and system based on more Gauss models |
CN110211607A (en) * | 2019-07-04 | 2019-09-06 | 山东中医药高等专科学校 | A kind of English learning system based on sensing network |
CN110600052A (en) * | 2019-08-19 | 2019-12-20 | 天闻数媒科技(北京)有限公司 | Voice evaluation method and device |
CN111294468A (en) * | 2020-02-07 | 2020-06-16 | 普强时代(珠海横琴)信息技术有限公司 | Tone quality detection and analysis system for customer service center calling |
CN111358428A (en) * | 2020-01-20 | 2020-07-03 | 书丸子(北京)科技有限公司 | Observation capability test evaluation method and device |
CN111583961A (en) * | 2020-05-07 | 2020-08-25 | 北京一起教育信息咨询有限责任公司 | Stress evaluation method and device and electronic equipment |
CN111599234A (en) * | 2020-05-19 | 2020-08-28 | 黑龙江工业学院 | Automatic English spoken language scoring system based on voice recognition |
CN111612324A (en) * | 2020-05-15 | 2020-09-01 | 深圳看齐信息有限公司 | Multi-dimensional assessment method based on oral English examination |
CN111612352A (en) * | 2020-05-22 | 2020-09-01 | 北京易华录信息技术股份有限公司 | Student expression ability assessment method and device |
CN111640452A (en) * | 2019-03-01 | 2020-09-08 | 北京搜狗科技发展有限公司 | Data processing method and device and data processing device |
CN111696524A (en) * | 2020-04-21 | 2020-09-22 | 厦门快商通科技股份有限公司 | Character-overlapping voice recognition method and system |
CN111816169A (en) * | 2020-07-23 | 2020-10-23 | 苏州思必驰信息科技有限公司 | Method and device for training Chinese and English hybrid speech recognition model |
CN112349300A (en) * | 2020-11-06 | 2021-02-09 | 北京乐学帮网络技术有限公司 | Voice evaluation method and device |
CN112634692A (en) * | 2020-12-15 | 2021-04-09 | 成都职业技术学院 | Emergency evacuation deduction training system for crew cabins |
CN112750465A (en) * | 2020-12-29 | 2021-05-04 | 昆山杜克大学 | Cloud language ability evaluation system and wearable recording terminal |
CN113035238A (en) * | 2021-05-20 | 2021-06-25 | 北京世纪好未来教育科技有限公司 | Audio evaluation method, device, electronic equipment and medium |
WO2021196475A1 (en) * | 2020-04-01 | 2021-10-07 | 深圳壹账通智能科技有限公司 | Intelligent language fluency recognition method and apparatus, computer device, and storage medium |
CN113571043A (en) * | 2021-07-27 | 2021-10-29 | 广州欢城文化传媒有限公司 | Dialect simulation force evaluation method and device, electronic equipment and storage medium |
CN113807813A (en) * | 2021-09-14 | 2021-12-17 | 广东德诚科教有限公司 | Grading system and method based on man-machine conversation examination |
US11656910B2 (en) | 2017-08-21 | 2023-05-23 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
US11687467B2 (en) | 2018-04-28 | 2023-06-27 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
US11726844B2 (en) | 2017-06-26 | 2023-08-15 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102354495A (en) * | 2011-08-31 | 2012-02-15 | 中国科学院自动化研究所 | Testing method and system of semi-opened spoken language examination questions |
CN102800314A (en) * | 2012-07-17 | 2012-11-28 | 广东外语外贸大学 | English sentence recognizing and evaluating system with feedback guidance and method of system |
CN103559894A (en) * | 2013-11-08 | 2014-02-05 | 安徽科大讯飞信息科技股份有限公司 | Method and system for evaluating spoken language |
CN103617799A (en) * | 2013-11-28 | 2014-03-05 | 广东外语外贸大学 | Method for detecting English statement pronunciation quality suitable for mobile device |
Non-Patent Citations (4)
Title |
---|
Meng Ping: "Design and Implementation of an Automatic Pronunciation Assessment System", China Master's Theses Full-text Database, Information Science and Technology *
Zhang Wenzhong et al.: "A Quantitative Study of the Development of Second-Language Oral Fluency", Modern Foreign Languages (Quarterly) *
Li Xinguang et al.: "Research on an Objective English Sentence Evaluation System Examining Stress and Prosody", Computer Engineering and Applications *
Li Jingjiao et al.: "A Hybrid Model Combining HMM and Self-Organizing Neural Networks in Speech Recognition", Journal of Northeastern University *
Cited By (79)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104505103B (en) * | 2014-12-04 | 2018-07-03 | 上海流利说信息技术有限公司 | Voice quality assessment equipment, method and system |
CN104361895A (en) * | 2014-12-04 | 2015-02-18 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104505103A (en) * | 2014-12-04 | 2015-04-08 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104361896A (en) * | 2014-12-04 | 2015-02-18 | 上海流利说信息技术有限公司 | Voice quality evaluation equipment, method and system |
CN104361896B (en) * | 2014-12-04 | 2018-04-13 | 上海流利说信息技术有限公司 | Voice quality assessment equipment, method and system |
CN104361895B (en) * | 2014-12-04 | 2018-12-18 | 上海流利说信息技术有限公司 | Voice quality assessment equipment, method and system |
CN104464423A (en) * | 2014-12-19 | 2015-03-25 | 科大讯飞股份有限公司 | Calibration optimization method and system for speaking test evaluation |
CN104485105A (en) * | 2014-12-31 | 2015-04-01 | 中国科学院深圳先进技术研究院 | Electronic medical record generating method and electronic medical record system |
CN104485105B (en) * | 2014-12-31 | 2018-04-13 | 中国科学院深圳先进技术研究院 | A kind of electronic health record generation method and electronic medical record system |
CN104732977A (en) * | 2015-03-09 | 2015-06-24 | 广东外语外贸大学 | On-line spoken language pronunciation quality evaluation method and system |
CN104732977B (en) * | 2015-03-09 | 2018-05-11 | 广东外语外贸大学 | A kind of online spoken language pronunciation quality evaluating method and system |
CN104732352A (en) * | 2015-04-02 | 2015-06-24 | 张可 | Method for question bank quality evaluation |
CN104810017B (en) * | 2015-04-08 | 2018-07-17 | 广东外语外贸大学 | Oral evaluation method and system based on semantic analysis |
CN104810017A (en) * | 2015-04-08 | 2015-07-29 | 广东外语外贸大学 | Semantic analysis-based oral language evaluating method and system |
CN105989839B (en) * | 2015-06-03 | 2019-12-13 | 乐融致新电子科技(天津)有限公司 | Speech recognition method and device |
CN105989839A (en) * | 2015-06-03 | 2016-10-05 | 乐视致新电子科技(天津)有限公司 | Speech recognition method and speech recognition device |
CN105681920A (en) * | 2015-12-30 | 2016-06-15 | 深圳市鹰硕音频科技有限公司 | Network teaching method and system with voice recognition function |
CN105681920B (en) * | 2015-12-30 | 2017-03-15 | 深圳市鹰硕音频科技有限公司 | A kind of Network teaching method and system with speech identifying function |
CN106971711A (en) * | 2016-01-14 | 2017-07-21 | 芋头科技(杭州)有限公司 | A kind of adaptive method for recognizing sound-groove and system |
CN105608960A (en) * | 2016-01-27 | 2016-05-25 | 广东外语外贸大学 | Spoken language formative teaching method and system based on multi-parameter analysis |
CN105632488A (en) * | 2016-02-23 | 2016-06-01 | 深圳市海云天教育测评有限公司 | Voice evaluation method and device |
CN105654785A (en) * | 2016-03-18 | 2016-06-08 | 上海语知义信息技术有限公司 | Personalized spoken foreign language learning system and method |
CN105825852A (en) * | 2016-05-23 | 2016-08-03 | 渤海大学 | Oral English reading test scoring method |
CN106548673A (en) * | 2016-10-25 | 2017-03-29 | 合肥东上多媒体科技有限公司 | A kind of Teaching Management Method based on intelligent Matching |
CN106531182A (en) * | 2016-12-16 | 2017-03-22 | 上海斐讯数据通信技术有限公司 | Language learning system |
CN106710348A (en) * | 2016-12-20 | 2017-05-24 | 江苏前景信息科技有限公司 | Civil air defense interactive experience method and system |
CN106652622A (en) * | 2017-02-07 | 2017-05-10 | 广东小天才科技有限公司 | Text training method and device |
CN107221318B (en) * | 2017-05-12 | 2020-03-31 | 广东外语外贸大学 | English spoken language pronunciation scoring method and system |
CN107221318A (en) * | 2017-05-12 | 2017-09-29 | 广东外语外贸大学 | Oral English Practice pronunciation methods of marking and system |
CN107293286A (en) * | 2017-05-27 | 2017-10-24 | 华南理工大学 | A kind of speech samples collection method that game is dubbed based on network |
CN107230171A (en) * | 2017-05-31 | 2017-10-03 | 中南大学 | A kind of student, which chooses a job, is orientated evaluation method and system |
CN107292496A (en) * | 2017-05-31 | 2017-10-24 | 中南大学 | A kind of work values cognitive system and method |
CN107239897A (en) * | 2017-05-31 | 2017-10-10 | 中南大学 | A kind of personality occupation type method of testing and system |
CN107274738A (en) * | 2017-06-23 | 2017-10-20 | 广东外语外贸大学 | Chinese-English translation teaching points-scoring system based on mobile Internet |
US11726844B2 (en) | 2017-06-26 | 2023-08-15 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
US11537843B2 (en) | 2017-06-29 | 2022-12-27 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
CN109214616A (en) * | 2017-06-29 | 2019-01-15 | 上海寒武纪信息科技有限公司 | A kind of information processing unit, system and method |
CN107578778A (en) * | 2017-08-16 | 2018-01-12 | 南京高讯信息科技有限公司 | A kind of method of spoken scoring |
US11656910B2 (en) | 2017-08-21 | 2023-05-23 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
CN107785011B (en) * | 2017-09-15 | 2020-07-03 | 北京理工大学 | Training method, device, equipment and medium of speech rate estimation model and speech rate estimation method, device and equipment |
CN107785011A (en) * | 2017-09-15 | 2018-03-09 | 北京理工大学 | Word speed estimates training, word speed method of estimation, device, equipment and the medium of model |
WO2019075828A1 (en) * | 2017-10-20 | 2019-04-25 | 深圳市鹰硕音频科技有限公司 | Voice evaluation method and apparatus |
CN109727608A (en) * | 2017-10-25 | 2019-05-07 | 香港中文大学深圳研究院 | A pathological voice assessment method based on Chinese speech |
CN109727608B (en) * | 2017-10-25 | 2020-07-24 | 香港中文大学深圳研究院 | Pathological voice evaluation system based on Chinese speech |
CN107818797A (en) * | 2017-12-07 | 2018-03-20 | 苏州科达科技股份有限公司 | Voice quality assessment method, apparatus and system |
CN108428382A (en) * | 2018-02-14 | 2018-08-21 | 广东外语外贸大学 | A spoken-repetition scoring method and system |
CN108429932A (en) * | 2018-04-25 | 2018-08-21 | 北京比特智学科技有限公司 | Video processing method and device |
US11687467B2 (en) | 2018-04-28 | 2023-06-27 | Shanghai Cambricon Information Technology Co., Ltd | Data sharing system and data sharing method therefor |
CN108831503A (en) * | 2018-06-07 | 2018-11-16 | 深圳习习网络科技有限公司 | A method and device for spoken-language evaluation |
CN109036429A (en) * | 2018-07-25 | 2018-12-18 | 浪潮电子信息产业股份有限公司 | A cloud-service-based voice match scoring query method and system |
CN108986786A (en) * | 2018-07-27 | 2018-12-11 | 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) | Voice interaction device ranking method, system, computer equipment and storage medium |
CN109147823A (en) * | 2018-10-31 | 2019-01-04 | 河南职业技术学院 | Spoken English assessment method and device |
CN109493658A (en) * | 2019-01-08 | 2019-03-19 | 上海健坤教育科技有限公司 | Scenario-based human-computer dialogue spoken-language interactive learning method |
CN111640452A (en) * | 2019-03-01 | 2020-09-08 | 北京搜狗科技发展有限公司 | Data processing method and device and data processing device |
CN111640452B (en) * | 2019-03-01 | 2024-05-07 | 北京搜狗科技发展有限公司 | Data processing method and device for data processing |
CN109979484B (en) * | 2019-04-03 | 2021-06-08 | 北京儒博科技有限公司 | Pronunciation error detection method and device, electronic equipment and storage medium |
CN109979484A (en) * | 2019-04-03 | 2019-07-05 | 北京儒博科技有限公司 | Pronunciation error detection method and device, electronic equipment and storage medium |
CN110135492A (en) * | 2019-05-13 | 2019-08-16 | 山东大学 | Equipment fault diagnosis and anomaly detection method and system based on multiple Gaussian models |
CN110211607A (en) * | 2019-07-04 | 2019-09-06 | 山东中医药高等专科学校 | An English learning system based on a sensor network |
CN110600052A (en) * | 2019-08-19 | 2019-12-20 | 天闻数媒科技(北京)有限公司 | Voice evaluation method and device |
CN111358428A (en) * | 2020-01-20 | 2020-07-03 | 书丸子(北京)科技有限公司 | Observation capability test evaluation method and device |
CN111294468A (en) * | 2020-02-07 | 2020-06-16 | 普强时代(珠海横琴)信息技术有限公司 | Tone quality detection and analysis system for customer service center calling |
WO2021196475A1 (en) * | 2020-04-01 | 2021-10-07 | 深圳壹账通智能科技有限公司 | Intelligent language fluency recognition method and apparatus, computer device, and storage medium |
CN111696524A (en) * | 2020-04-21 | 2020-09-22 | 厦门快商通科技股份有限公司 | Character-overlapping voice recognition method and system |
CN111696524B (en) * | 2020-04-21 | 2023-02-14 | 厦门快商通科技股份有限公司 | Character-overlapping voice recognition method and system |
CN111583961A (en) * | 2020-05-07 | 2020-08-25 | 北京一起教育信息咨询有限责任公司 | Stress evaluation method and device and electronic equipment |
CN111612324A (en) * | 2020-05-15 | 2020-09-01 | 深圳看齐信息有限公司 | Multi-dimensional assessment method based on oral English examination |
CN111612324B (en) * | 2020-05-15 | 2021-02-19 | 深圳看齐信息有限公司 | Multi-dimensional assessment method based on oral English examination |
CN111599234A (en) * | 2020-05-19 | 2020-08-28 | 黑龙江工业学院 | Automatic English spoken language scoring system based on voice recognition |
CN111612352A (en) * | 2020-05-22 | 2020-09-01 | 北京易华录信息技术股份有限公司 | Student expression ability assessment method and device |
CN111816169A (en) * | 2020-07-23 | 2020-10-23 | 苏州思必驰信息科技有限公司 | Method and device for training Chinese and English hybrid speech recognition model |
CN112349300A (en) * | 2020-11-06 | 2021-02-09 | 北京乐学帮网络技术有限公司 | Voice evaluation method and device |
CN112634692A (en) * | 2020-12-15 | 2021-04-09 | 成都职业技术学院 | Emergency evacuation deduction training system for crew cabins |
CN112750465A (en) * | 2020-12-29 | 2021-05-04 | 昆山杜克大学 | Cloud language ability evaluation system and wearable recording terminal |
CN112750465B (en) * | 2020-12-29 | 2024-04-30 | 昆山杜克大学 | Cloud language ability evaluation system and wearable recording terminal |
CN113035238A (en) * | 2021-05-20 | 2021-06-25 | 北京世纪好未来教育科技有限公司 | Audio evaluation method, device, electronic equipment and medium |
CN113571043A (en) * | 2021-07-27 | 2021-10-29 | 广州欢城文化传媒有限公司 | Dialect imitation ability evaluation method and device, electronic equipment and storage medium |
CN113571043B (en) * | 2021-07-27 | 2024-06-04 | 广州欢城文化传媒有限公司 | Dialect imitation ability evaluation method and device, electronic equipment and storage medium |
CN113807813A (en) * | 2021-09-14 | 2021-12-17 | 广东德诚科教有限公司 | Scoring system and method based on human-machine dialogue examination |
Also Published As
Publication number | Publication date |
---|---|
CN103928023B (en) | 2017-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103928023B (en) | A speech assessment method and system | |
CN102800314B (en) | English sentence recognizing and evaluating system with feedback guidance and method | |
US11335324B2 (en) | Synthesized data augmentation using voice conversion and speech recognition models | |
US20100004931A1 (en) | Apparatus and method for speech utterance verification | |
CN104050965A (en) | English phonetic pronunciation quality evaluation system with emotion recognition function and method thereof | |
CN109285535A (en) | Speech synthesis method based on front-end design | |
CN106548775A (en) | A speech recognition method and system | |
Duan et al. | A Preliminary study on ASR-based detection of Chinese mispronunciation by Japanese learners | |
Liu et al. | AI recognition method of pronunciation errors in oral English speech with the help of big data for personalized learning | |
Goyal et al. | A comparison of Laryngeal effect in the dialects of Punjabi language | |
TW201411602A (en) | Speaking-rate controlled prosodic-information generating device and speaking-rate dependent hierarchical prosodic module | |
TWI467566B (en) | Polyglot speech synthesis method | |
Farooq et al. | Mispronunciation detection in articulation points of Arabic letters using machine learning | |
Sinha et al. | Empirical analysis of linguistic and paralinguistic information for automatic dialect classification | |
CN112133292A (en) | End-to-end automatic voice recognition method for civil aviation land-air communication field | |
CN102880906A (en) | Chinese vowel pronunciation method based on the DIVA neural network model | |
Dai | [Retracted] An Automatic Pronunciation Error Detection and Correction Mechanism in English Teaching Based on an Improved Random Forest Model | |
Hacioglu et al. | Parsing speech into articulatory events | |
Sharma et al. | Soft-Computational Techniques and Spectro-Temporal Features for Telephonic Speech Recognition: an overview and review of current state of the art | |
CN117012230A (en) | Evaluation model for singing pronunciation and enunciation | |
Dalva | Automatic speech recognition system for Turkish spoken language | |
Vyas et al. | Study of Speech Recognition Technology and its Significance in Human-Machine Interface | |
Sinha et al. | Spectral and prosodic features-based speech pattern classification | |
CN111179902B (en) | Speech synthesis method, equipment and medium for simulating resonance cavity based on Gaussian model | |
Miyazaki et al. | Connectionist temporal classification-based sound event encoder for converting sound events into onomatopoeic representations |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 2017-04-05; termination date: 2020-04-29 |