CN103778914B - Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching - Google Patents

Info

Publication number
CN103778914B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410040474.9A
Other languages
Chinese (zh)
Other versions
CN103778914A (en
Inventor
宁更新
吴丽菲
宁小娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT

Abstract

The invention discloses an anti-noise speech recognition method and device based on signal-to-noise-ratio (SNR) weighted template feature matching. The method comprises the following steps: (1) the input speech signal is preprocessed and a phase coefficient is obtained; (2) the feature of the input speech, namely the phase MFCC, is calculated; (3) SNR-based template feature matching is carried out. The invention further discloses a device that implements the method. The device comprises a power supply module, a display module, a storage module, a DSP/ARM digital processing module, a microphone, an A/D converter and a USB interface. The method and device offer a wide range of application, high accuracy, low cost, convenient use and strong adaptability.

Description

Anti-noise speech recognition method and device based on SNR-weighted template feature matching
Technical field
The present invention relates to speech signal processing technology, and in particular to an anti-noise speech recognition method and device based on signal-to-noise-ratio (SNR) weighted template feature matching.
Background art
Speech recognition is applied very widely and touches almost every aspect of daily life, for example voice dialing systems, seat reservation systems, medical services, banking services, dictation machines, computer control, industrial control and voice communication systems. Speech recognition technology is profoundly changing daily life in fields such as industry, household appliances, communications, medical care and home services. Real environments place increasingly high demands on the noise robustness of speech recognition, so extracting feature vectors that are robust and strongly discriminative is of great importance to a speech recognition system.
The features currently used for speech recognition are all based on the power spectrum of the speech signal, which describes how the signal energy is distributed over the frequency domain. When external noise is present, this energy distribution also contains the energy of the noise. The resulting feature vectors are therefore very sensitive to external noise, and speech recognition systems perform poorly in noisy environments.
There are two main approaches to reducing the sensitivity of feature vectors to external noise: feature-based and model-based. Feature-based methods work at the front end of the speech recognition system and make the generated feature vectors as independent of noise as possible. Model-based methods work at the back end: using a small amount of adaptation data from the test environment, they adjust the model parameters and gradually transform them toward the actual environment, thereby improving the recognition rate. Feature-based solutions include spectral subtraction and RASTA processing. Model-based methods include parallel model combination (PMC), adaptation based on vector Taylor series (VTS) and signal decomposition.
Two kinds of speech feature parameters are mainly extracted for speech recognition at present: linear prediction cepstral coefficients (LPCC) and Mel-frequency cepstral coefficients (MFCC). LPCC parameters represent speech effectively and are fast to compute, but they do not take into account how the human auditory system processes speech. The Mel frequency band division is an engineering simulation of human auditory characteristics, so MFCC models the way the human ear processes speech to a certain extent.
However, neither MFCC nor LPCC, nor any other existing speech recognition feature, performs well in low-SNR environments. To overcome this weakness, the present invention first proposes a new feature that is more robust at low SNR, obtained by changing the correlation measure: the angle between two time-delayed signal vectors is used as the correlation measure. Because the angle is a nonlinear transformation of the traditional autocorrelation coefficient (a scalar product), using this phase strengthens the role of the peaks in the frequency domain, and peaks are relatively robust to noise. Then, since the traditional feature suits high SNR and the new feature suits low SNR, a template matching calculation weighted by SNR is proposed. Finally, a corresponding device is proposed.
Summary of the invention
The primary object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing an anti-noise speech recognition method based on SNR-weighted template feature matching. The method has a wide range of application and high accuracy.
Another object of the present invention is to overcome the shortcomings and deficiencies of the prior art by providing a device that implements the anti-noise speech recognition method based on SNR-weighted template feature matching. The method runs on a DSP/ARM7 chip and can be implemented with the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
The primary object of the present invention is achieved through the following technical solution: an anti-noise speech recognition method based on SNR-weighted template feature matching, comprising the following steps:
Step one: preprocess the input speech signal and obtain the phase coefficient.
The digitized speech signal $s[n]$ is divided into frames and windowed with a Hamming window. It is divided into T frames,
$$\{s_0[n], s_1[n], \ldots, s_t[n], \ldots, s_{T-1}[n]\}$$
where
$$s_t[n] = \{s[Kt], s[Kt+1], \ldots, s[Kt+N-1]\}$$
K is the frame shift, N is the frame length, and $s_t[n]$ is the frame signal sequence at time t.
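The framing and windowing step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names `hamming` and `split_frames` are hypothetical, the window uses the common 0.54/0.46 Hamming coefficients, and dropping a trailing partial frame is an assumption. The values N = 160 and K = 80 correspond to 20 ms frames with a 10 ms shift at 8 kHz, as in Embodiment 1.

```python
import math

def hamming(N):
    """Hamming window of length N (common 0.54 - 0.46*cos form)."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]

def split_frames(s, N, K):
    """Split s into frames s_t[n] = s[K*t + n], n = 0..N-1, each Hamming-windowed.

    Assumption: a trailing partial frame is dropped rather than zero-padded.
    """
    if len(s) < N:
        return []
    T = 1 + (len(s) - N) // K          # number of complete frames
    w = hamming(N)
    return [[s[K * t + n] * w[n] for n in range(N)] for t in range(T)]

# Toy signal standing in for digitized speech.
signal = [float(i % 7) for i in range(400)]
frames = split_frames(signal, N=160, K=80)
```

With 400 samples this yields four overlapping frames of 160 samples each; every frame is then processed independently in the following steps.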
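Speech signals are short-term stationary, so each frame can be treated as stationary. Each frame is periodically extended to give $\tilde{s}_t[n]$, and the autocorrelation function is
$$R[k] = \sum_{n=0}^{N-1} \tilde{s}_t[n]\,\tilde{s}_t[n+k], \quad k = 0, 1, \ldots, N-1$$
As this formula shows, $R[k]$ is the dot product of two N-dimensional vectors,
$$x_0 = \{\tilde{s}_t[0], \tilde{s}_t[1], \ldots, \tilde{s}_t[N-1]\}$$
$$x_k = \{\tilde{s}_t[k], \ldots, \tilde{s}_t[N-1], \tilde{s}_t[0], \ldots, \tilde{s}_t[k-1]\}$$
$$R[k] = x_0^T x_k = \|x\|^2 \cos(\theta_k)$$
where $\|x\|^2 = \|x_0\|^2 = \|x_k\|^2$ is the frame energy and $\theta_k$ is the angle between the vectors $x_0$ and $x_k$ in N-dimensional space.
Applying the nonlinear arccosine transformation to the normalized autocorrelation coefficient gives the phase coefficient
$$P[k] = \theta_k = \cos^{-1}\left(\frac{R[k]}{\|x\|^2}\right)$$
$P[k]$ takes values between 0 and π. Normalizing it to between 0 and 1 gives the normalized phase autocorrelation function
$$P_n[k] = \frac{P[k]}{\pi} = \frac{\cos^{-1}(R_n[k])}{\pi} = \frac{1}{\pi}\cos^{-1}\left(\frac{R[k]}{\|x\|^2}\right)$$
where $R_n[k] = R[k]/\|x\|^2$ is the normalized autocorrelation coefficient.
$P_n[k]$ improves robustness at low SNR, but at high SNR, and especially for clean speech, it performs worse than $R_n[k]$.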
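A short sketch of the phase coefficient computation for one frame follows. It assumes the autocorrelation is taken over the periodically extended frame (a circular lag), normalized by the frame energy, passed through the arccosine and divided by π. The function name `phase_autocorrelation`, the clamping of rounding errors and the error raised for a zero-energy frame are illustrative choices that are not in the source.

```python
import math

def phase_autocorrelation(frame):
    """Return (R_n, P_n) for one frame using circular (periodically extended) lags."""
    N = len(frame)
    energy = sum(x * x for x in frame)          # ||x||^2, equal to R[0]
    if energy == 0.0:
        # Assumption: the angle is undefined for an all-zero frame, so refuse it.
        raise ValueError("zero-energy frame")
    R_n, P_n = [], []
    for k in range(N):
        # Dot product of x_0 with its circular shift x_k, normalized by the energy.
        r = sum(frame[n] * frame[(n + k) % N] for n in range(N)) / energy
        r = max(-1.0, min(1.0, r))              # guard acos against rounding error
        R_n.append(r)
        P_n.append(math.acos(r) / math.pi)      # angle theta_k scaled into [0, 1]
    return R_n, P_n

frame = [1.0, 0.0, -1.0, 0.0]
R_n, P_n = phase_autocorrelation(frame)
```

For this toy frame the shifted vector is identical to the original at lag 0, orthogonal at lags 1 and 3, and opposite at lag 2, so the normalized angles come out as 0, 0.5, 1 and 0.5.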
Step two: calculate the feature of the input speech, namely the phase MFCC.
A DFT is applied to $P_n[k]$ to obtain the phase power spectrum $S_p[l]$:
$$S_p[l] = \sum_{k=0}^{N-1} P_n[k] \exp\left(-j\frac{2\pi}{N}kl\right)$$
Here $S_p[l]$ is called the phase power spectrum, and the Mel-frequency cepstral coefficients derived from it are called the phase MFCC: the spectrum is filtered by a Mel-scale filter bank and the logarithm is then taken. Once the information in each frequency band has been separated, the discrete cosine transform (DCT) converts the frequency-domain features to the time domain, giving the phase MFCC parameters.
The phase MFCC parameters consist of L static cepstral coefficients together with their first and second derivatives, 3L dimensions in total.
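The chain from the normalized phase coefficients to static cepstral coefficients (DFT, Mel filter bank, logarithm, DCT) might look like the sketch below. Several details are assumptions that the text does not specify: the magnitude of the DFT is used so that the logarithm is defined, the Mel mapping is the common 2595·log10(1 + f/700) form, the filters are triangular, and the defaults of 20 filters and the energy floor are arbitrary. All function names are hypothetical.

```python
import cmath
import math

def dft_magnitude(p):
    """|S_p[l]| for l = 0..N/2, where S_p[l] = sum_k p[k] * exp(-j*2*pi*k*l/N)."""
    N = len(p)
    return [abs(sum(p[k] * cmath.exp(-2j * math.pi * k * l / N) for k in range(N)))
            for l in range(N // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centres are equally spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    pos = [f / (sample_rate / 2.0) * (n_bins - 1) for f in hz]  # fractional bin index
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = pos[m - 1], pos[m], pos[m + 1]
        row = []
        for b in range(n_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising edge
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def dct2(x, n_out):
    """Unnormalized DCT-II, first n_out coefficients."""
    M = len(x)
    return [sum(x[m] * math.cos(math.pi * c * (m + 0.5) / M) for m in range(M))
            for c in range(n_out)]

def phase_mfcc(p_n, sample_rate=8000, n_filters=20, n_ceps=13, floor=1e-10):
    """Static phase-MFCC coefficients of one frame of normalized phase coefficients."""
    spectrum = dft_magnitude(p_n)
    bank = mel_filterbank(n_filters, len(spectrum), sample_rate)
    # Floor the band energies so the logarithm stays finite (assumption).
    log_e = [math.log(max(floor, sum(w * s for w, s in zip(row, spectrum))))
             for row in bank]
    return dct2(log_e, n_ceps)
```

Feeding the autocorrelation-based spectrum through the same `mel_filterbank`, logarithm and `dct2` chain would give the traditional MFCC, so one code path can serve both features.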
Step three: SNR-based template feature matching.
The reference database contains j reference speech templates, each comprising a 3M-dimensional MFCC feature and a 3L-dimensional phase MFCC feature. The Euclidean distance between the test template and the i-th reference template is $D_{Mi}$ for the 3M-dimensional MFCC feature vector and $P_{Li}$ for the 3L-dimensional phase MFCC feature vector, i = 0, 1, ..., j-1.
The 3L-dimensional phase MFCC feature vector is known to be more robust at low SNR, whereas at high SNR, and especially for clean speech, the 3M-dimensional MFCC feature vector is more robust.
Accordingly, the present invention adopts an SNR-weighted method: different weights are used under different SNR conditions to obtain the weighted distance $C_i$ of the template distances of the two feature vectors.
$$C_i = (1-w)\,D_{Mi} + w\,P_{Li}, \quad i = 0, 1, \ldots, j-1 \quad \text{(formula 5)}$$
Template matching is a search over the j reference templates for the template that attains $\min\{C_i\}$, i = 0, 1, ..., j-1.
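The weighted matching of formula 5 can be sketched as below. The sketch assumes each template is a single fixed-length feature vector per feature type; how the patent aligns utterances of different lengths is not stated. The names `euclidean` and `match_template` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_template(test_mfcc, test_pmfcc, references, w):
    """Return (best_index, costs) for C_i = (1 - w) * D_Mi + w * P_Li.

    references is a list of (mfcc_vector, phase_mfcc_vector) pairs;
    w is the weight given to the phase-MFCC distance.
    """
    costs = []
    for ref_mfcc, ref_pmfcc in references:
        d_m = euclidean(test_mfcc, ref_mfcc)       # D_Mi
        p_l = euclidean(test_pmfcc, ref_pmfcc)     # P_Li
        costs.append((1.0 - w) * d_m + w * p_l)    # C_i
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs

# Two toy reference templates with 2-dimensional features.
references = [([0.0, 0.0], [1.0, 1.0]), ([3.0, 4.0], [0.0, 0.0])]
best_clean, _ = match_template([0.0, 0.0], [0.0, 0.0], references, w=0.0)
best_noisy, _ = match_template([0.0, 0.0], [0.0, 0.0], references, w=1.0)
```

In this toy case the MFCC distance favours template 0 and the phase-MFCC distance favours template 1, so the chosen template flips as w moves from 0 to 1, which is the behaviour the SNR weighting relies on.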
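w is the weight of the phase MFCC template distance. Its value is determined by the signal-to-noise ratio SNR, which is obtained as follows:
$$\mathrm{SNR} = \log_{10}\left(\frac{\|Y\|^2}{\|N\|^2}\right) \cong \log_{10}\left(\frac{\|Y\|^2}{\overline{\|N\|^2}}\right) \quad \text{(formula 6)}$$
$$\|Y\|^2 = \|X\|^2 + \|N\|^2 \cong \|X\|^2 + \overline{\|N\|^2} \quad \text{(formula 7)}$$
$\|Y\|^2$ is the frame energy of the speech in the actual environment, $\|N\|^2$ is the energy of the noise signal sampled in the actual environment, and $\overline{\|N\|^2}$ is the estimate of that energy.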
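A minimal sketch of the SNR estimate follows. It assumes the noise energy estimate is the mean energy of frames recorded while no speech is present, and that the speech energy is likewise averaged over the frames of the utterance; the averaging scheme and the names `frame_energy` and `estimate_snr` are illustrative. Following the description, the value is a plain base-10 logarithm of the energy ratio, with no factor of 10.

```python
import math

def frame_energy(frame):
    """Sum of squared samples, i.e. the squared norm of the frame."""
    return sum(x * x for x in frame)

def estimate_snr(speech_frames, noise_frames):
    """log10(||Y||^2 / noise estimate), using mean per-frame energies (assumption)."""
    noise_est = sum(frame_energy(f) for f in noise_frames) / len(noise_frames)
    if noise_est <= 0.0:
        raise ValueError("noise energy estimate must be positive")
    y_energy = sum(frame_energy(f) for f in speech_frames) / len(speech_frames)
    return math.log10(y_energy / noise_est)

noise = [[0.1] * 4, [0.1] * 4]     # frames captured with no speech present
speech = [[1.0] * 4]               # frames captured while speech is present
snr = estimate_snr(speech, noise)
```

Here the noisy-speech energy is 100 times the noise estimate, so the estimate comes out at about 2 on this logarithmic scale.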
The value of w is determined by the signal-to-noise ratio SNR,
$$w = f(\mathrm{SNR}) \quad \text{(formula 8)}$$
f(SNR) expresses the relation between the weight coefficient w and the SNR. f(SNR) takes values in (0,1), and w is negatively correlated with the SNR; the relation may be linear or nonlinear. It can be expressed in either of the following two ways:
Mode one:
$$w = f(\mathrm{SNR}) = \exp\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR})$$
Mode two:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1+\exp\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]}$$
u(·) is the step function. α takes values in (1,5) and is the SNR threshold: when the SNR is below α, the weight coefficient w is 1; when the SNR is above α, w is negatively correlated with the SNR and decays exponentially, gradually converging to 0 as the SNR grows. β takes values in (1,10) and is the critical SNR at which the traditional MFCC and the phase MFCC have equal weight. γ and θ take values in (0.1,1) and both adjust the speed of change: the larger the value, the slower the change.
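The two weighting modes can be sketched as follows. Mode one is written from the description of α: w is 1 below the threshold and decays as exp(-(SNR - α)/γ) above it. Mode two is a falling logistic curve that equals 0.5 at SNR = β. The default parameters are the values used in Embodiments 1 and 2; the function names and the rearranged, overflow-safe form of the logistic are illustrative choices.

```python
import math

def weight_mode_one(snr, alpha=3.0, gamma=0.5):
    """w = 1 below the threshold alpha, exponential decay toward 0 above it."""
    if snr < alpha:
        return 1.0
    return math.exp(-(snr - alpha) / gamma)

def weight_mode_two(snr, beta=3.0, theta=0.5):
    """w = 1 - 1 / (1 + exp(-(snr - beta) / theta)), computed without overflow."""
    z = (snr - beta) / theta
    if z >= 0.0:
        e = math.exp(-z)
        return e / (1.0 + e)            # algebraically the same expression
    return 1.0 / (1.0 + math.exp(z))    # same expression, safe for very negative z

weights_one = [weight_mode_one(s) for s in (2.0, 3.0, 3.5, 5.0)]
weights_two = [weight_mode_two(s) for s in (2.0, 3.0, 4.0)]
```

Both curves give the phase MFCC most of the weight at low SNR and hand it to the traditional MFCC as the SNR grows. Mode one switches abruptly at α, while mode two changes smoothly around β.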
Another object of the present invention is achieved through the following technical solution: a device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching, comprising a power supply module, a display module, a storage module, a DSP/ARM digital processing module, a microphone, an A/D converter and a USB interface. The storage module, the USB interface, the display module, the power supply module and one end of the A/D converter are all electrically connected to the DSP/ARM digital processing module, and the microphone is electrically connected to the other end of the A/D converter. The microphone is used to input the test speech, the A/D converter digitizes the test speech, the DSP/ARM chip extracts the features and performs template matching, the storage module stores the reference database, the display module displays the result, and the USB interface connects to a computer.
The A/D converter is an ADC0832 chip; the DSP/ARM digital processing module is a DSP/ARM7 chip.
The DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
Building on the MFCC parameters computed from the traditional autocorrelation coefficient, the present invention additionally replaces the autocorrelation coefficient with the phase coefficient to obtain the phase MFCC parameters and the corresponding feature vector, and proposes a template matching calculation weighted by SNR.
Compared with the prior art, the present invention has the following advantages and effects:
First, a wide range of application. The invention can be applied very widely, touching almost every aspect of daily life.
Second, high accuracy. The invention exploits the fact that the phase MFCC is more robust at low SNR, while the traditional MFCC is more robust at high SNR, especially for clean speech. This improves the way the feature distance is measured and raises recognition accuracy, particularly at low SNR.
Third, low cost. All of the computation can be done on an ordinary DSP or ARM chip.
Fourth, ease of use. The device plugs into any equipment with a USB interface and is plug-and-play.
Fifth, strong adaptability. The device places no special requirements on the operating environment and works normally in most environments.
Brief description of the drawings
Fig. 1 is a module block diagram of the device of the invention.
Fig. 2 is a flowchart of preprocessing and feature extraction in the device of the invention.
Fig. 3 is a flowchart of the template matching module of the device of the invention.
Fig. 4 is a hardware structure diagram of the device of the invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to embodiments and the accompanying drawings, but embodiments of the invention are not limited thereto.
Embodiment 1
As shown in Fig. 1, the test speech first enters the preprocessing module and then the feature extraction module, which produces the feature vectors of the test speech. The MFCC and PAC-MFCC (phase MFCC) vectors are input to the template matching module, which computes the weighted distance to match them against the templates in the reference database (the template matching flow is shown in Fig. 3) and obtains the matching template with the smallest weighted distance. The result is finally output to the display module.
The preprocessing and feature extraction flows are shown in Fig. 2. Preprocessing performs pre-emphasis, digitization, framing and windowing. Feature extraction then extracts the features of each test speech frame: the autocorrelation coefficient and the phase coefficient are computed, an FFT is applied, the result is passed through the Mel filter bank, and a logarithmic transformation followed by the discrete cosine transform (DCT) yields the traditional MFCC and the phase MFCC. In addition, the noise energy is estimated while no test speech is present in the actual environment, and the SNR of that environment is obtained.
The speech recognition device is implemented through the following steps:
Step 1: Digitize the test speech at a sampling frequency of 8 kHz, then apply pre-emphasis. Each frame is 20 ms long with a frame shift of 10 ms, and the window is a Hamming window.
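The parameters of Step 1 translate into sample counts as sketched below, together with a first-order pre-emphasis filter. The filter form y[n] = s[n] - a·s[n-1] and the coefficient 0.97 are common choices assumed here; the source only says that pre-emphasis is applied. The names are hypothetical.

```python
SAMPLE_RATE = 8000                        # Hz, from Step 1
FRAME_LEN = SAMPLE_RATE * 20 // 1000      # 20 ms frame -> 160 samples
FRAME_SHIFT = SAMPLE_RATE * 10 // 1000    # 10 ms shift -> 80 samples

def pre_emphasis(s, coeff=0.97):
    """y[n] = s[n] - coeff * s[n-1]; 0.97 is a common value, not from the source."""
    if not s:
        return []
    return [s[0]] + [s[n] - coeff * s[n - 1] for n in range(1, len(s))]

emphasized = pre_emphasis([1.0, 1.0, 1.0])
```

A constant input is almost cancelled after the first sample, which shows how the filter attenuates low frequencies relative to high ones before framing.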
Step 2: Analyze each speech frame. First perform periodic extension, then obtain the normalized autocorrelation coefficient and the phase coefficient according to (formula 1-3).
Step 3: Apply an FFT to the coefficients obtained to get the corresponding power spectra. Pass the two spectra through a 13th-order Mel-scale filter bank, then apply the logarithmic transformation and the DCT to obtain 13 static cepstral coefficients of the MFCC and 13 static cepstral coefficients of the phase MFCC. Take the first and second derivatives of both to obtain the 39-dimensional MFCC parameters and the 39-dimensional phase MFCC parameters as the feature vectors.
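Expanding 13 static coefficients to 39 dimensions with first and second derivatives can be sketched as follows. The source does not say how the derivatives are computed; this sketch assumes the widely used regression formula over ±2 frames with edge replication. The names `delta` and `add_derivatives` are hypothetical.

```python
def delta(features, width=2):
    """Regression-based time derivative over +/- width frames (edges replicated)."""
    T = len(features)
    denom = 2.0 * sum(d * d for d in range(1, width + 1))
    out = []
    for t in range(T):
        row = []
        for c in range(len(features[t])):
            acc = 0.0
            for d in range(1, width + 1):
                nxt = features[min(T - 1, t + d)][c]   # clamp at the last frame
                prv = features[max(0, t - d)][c]       # clamp at the first frame
                acc += d * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out

def add_derivatives(static):
    """Concatenate static, first-derivative and second-derivative coefficients."""
    d1 = delta(static)
    d2 = delta(d1)
    return [s + a + b for s, a, b in zip(static, d1, d2)]

# Six frames of 13 static coefficients that rise by 1 per frame.
static = [[float(t)] * 13 for t in range(6)]
full = add_derivatives(static)
```

Each output frame holds 13 static, 13 first-derivative and 13 second-derivative values. For the linear ramp above, the first derivative of the interior frames is exactly 1.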
Step 4: While no test speech is present, collect the noise signal in the actual environment and obtain the noise energy. Then estimate the SNR of the actual environment with test speech present using (formula 6) and (formula 7).
Step 5: Compute the Euclidean distance $D_M$ between the test template and the reference template for the 39-dimensional MFCC feature vector, and the Euclidean distance $P_L$ between the test template and the reference template for the 39-dimensional phase MFCC feature vector.
Step 6: Compute the weight for the template distances of the two feature vectors according to (formula 8), then obtain the weighted distance C according to (formula 5).
The weight is calculated as follows:
$$w = f(\mathrm{SNR}) = \exp\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR})$$
with the parameters α = 3, γ = 0.5.
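As a worked illustration of these parameter values, assume the mode-one weight equals 1 below the threshold α and decays as exp(-(SNR - α)/γ) above it, as the description of α states. The three SNR values below are arbitrary examples, not values from the source.

```latex
\begin{align*}
\mathrm{SNR} = 2 < \alpha:\quad & w = 1,
  & C &= P_L \\
\mathrm{SNR} = 3.5:\quad & w = \exp\!\left(-\tfrac{3.5-3}{0.5}\right) = e^{-1} \approx 0.368,
  & C &\approx 0.632\,D_M + 0.368\,P_L \\
\mathrm{SNR} = 5:\quad & w = \exp\!\left(-\tfrac{5-3}{0.5}\right) = e^{-4} \approx 0.018,
  & C &\approx 0.982\,D_M + 0.018\,P_L
\end{align*}
```

Below the threshold the match relies entirely on the phase MFCC distance, and two units above it the traditional MFCC distance carries almost all of the weight.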
As shown in Fig. 4, a device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching comprises a power supply module, a display module, a storage module, a DSP/ARM digital processing module, a microphone, an A/D converter and a USB interface. The storage module, the USB interface, the display module, the power supply module and one end of the A/D converter are all electrically connected to the DSP/ARM digital processing module, and the microphone is electrically connected to the other end of the A/D converter. The microphone is used to input the test speech, the A/D converter digitizes the test speech, the DSP/ARM chip extracts the features and performs template matching, the storage module stores the reference database, the display module displays the result, and the USB interface connects to a computer. The A/D converter is an ADC0832 chip, and the DSP/ARM digital processing module is a DSP/ARM7 chip. The DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.
Embodiment 2
This embodiment is the same as Embodiment 1 except for the following:
The weight is calculated as follows:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1+\exp\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]}$$
with the parameters β = 3, θ = 0.5.
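As a worked illustration of these parameter values, assume the mode-two weight is the falling logistic curve w = 1 - 1/(1 + exp[-(SNR - β)/θ]). The three SNR values below are arbitrary examples, not values from the source.

```latex
\begin{align*}
\mathrm{SNR} = 2:\quad & w = 1 - \frac{1}{1 + e^{2}} \approx 0.881 \\
\mathrm{SNR} = 3 = \beta:\quad & w = 1 - \frac{1}{1 + e^{0}} = 0.5 \\
\mathrm{SNR} = 4:\quad & w = 1 - \frac{1}{1 + e^{-2}} \approx 0.119
\end{align*}
```

The weights are symmetric about β: one unit below it the phase MFCC distance receives about 88% of the weight, and one unit above it about 12%.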
The embodiments above are preferred embodiments of the present invention, but embodiments of the invention are not limited by them. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the invention is an equivalent replacement and falls within the scope of protection of the invention.

Claims (8)

1. An anti-noise speech recognition method based on SNR-weighted template feature matching, characterized in that it comprises the following steps:
Step one: preprocess the input speech signal and obtain the phase coefficient;
Step two: calculate the feature of the input speech, namely the phase MFCC;
Step three: perform SNR-based template feature matching;
characterized in that step one comprises the following steps:
Step A: divide the digitized speech signal $s[n]$ into frames while windowing it with a Hamming window, giving T frames:
$$\{s_0[n], s_1[n], \ldots, s_t[n], \ldots, s_{T-1}[n]\},$$
where:
$s_t[n] = \{s[Kt], s[Kt+1], \ldots, s[Kt+N-1]\}$, K is the frame shift, N is the frame length, and $s_t[n]$ is the frame signal sequence at time t;
Step B: periodically extend each frame and obtain the autocorrelation function:
$$R[k] = \sum_{n=0}^{N-1} \tilde{s}_t[n]\,\tilde{s}_t[n+k], \quad k = 0, 1, \ldots, N-1;$$
from the expression of the autocorrelation function, $R[k]$ is the dot product of two N-dimensional vectors,
$$x_0 = \{\tilde{s}_t[0], \tilde{s}_t[1], \ldots, \tilde{s}_t[N-1]\},$$
$$x_k = \{\tilde{s}_t[k], \ldots, \tilde{s}_t[N-1], \tilde{s}_t[0], \ldots, \tilde{s}_t[k-1]\},$$
$$R[k] = x_0^T x_k = \|x\|^2 \cos(\theta_k),$$
where $\|x\|^2 = \|x_0\|^2 = \|x_k\|^2$ is the frame energy and $\theta_k$ is the angle between the vectors $x_0$ and $x_k$ in N-dimensional space;
Step C: apply the nonlinear arccosine transformation to the normalized autocorrelation coefficient to obtain the phase coefficient:
$$P[k] = \theta_k = \cos^{-1}\left(\frac{R[k]}{\|x\|^2}\right),$$
$P[k]$ takes values between 0 and π; normalizing it to between 0 and 1 gives the normalized phase autocorrelation function:
$$P_n[k] = \frac{P[k]}{\pi} = \frac{\cos^{-1}(R_n[k])}{\pi} = \frac{1}{\pi}\cos^{-1}\left(\frac{R[k]}{\|x\|^2}\right),$$
where $P_n[k]$ is used to improve robustness at low SNR.
2. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that step two comprises the following steps:
Step I: apply a DFT to $P_n[k]$ to obtain the phase power spectrum $S_p[l]$:
$$S_p[l] = \sum_{k=0}^{N-1} P_n[k] \exp\left(-j\frac{2\pi}{N}kl\right),$$
where $S_p[l]$ denotes the phase power spectrum, and the Mel-frequency cepstral coefficients derived from this formula are called the phase MFCC, that is: the spectrum is filtered by a Mel-scale filter bank and the logarithm is then taken;
Step II: after the information in each frequency band has been separated, convert the frequency-domain features to the time domain with the discrete cosine transform to obtain the phase MFCC parameters; the phase MFCC parameters consist of L static cepstral coefficients together with their first and second derivatives, 3L dimensions in total.
3. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that step three comprises the following steps:
Step 1: the reference database contains j reference speech templates, each comprising a 3M-dimensional MFCC feature vector and a 3L-dimensional phase MFCC feature vector; the Euclidean distance between the test template and the i-th reference template is $D_{Mi}$ for the 3M-dimensional MFCC feature vector and $P_{Li}$ for the 3L-dimensional phase MFCC feature vector, i = 0, 1, ..., j-1;
Step 2: under different SNR conditions, use different weights to obtain the weighted distance $C_i$ of the template distances of the two feature vectors:
$$C_i = (1-w)\,D_{Mi} + w\,P_{Li}, \quad i = 0, 1, \ldots, j-1,$$
where w is the weight of the phase MFCC template distance; template matching means searching the j reference templates for the template that attains $\min\{C_i\}$, i = 0, 1, ..., j-1;
the signal-to-noise ratio SNR is obtained from:
$$\mathrm{SNR} = \log_{10}\left(\frac{\|Y\|^2}{\|N\|^2}\right) \cong \log_{10}\left(\frac{\|Y\|^2}{\overline{\|N\|^2}}\right),$$
$$\|Y\|^2 = \|X\|^2 + \|N\|^2 \cong \|X\|^2 + \overline{\|N\|^2},$$
where $\|Y\|^2$ is the frame energy of the speech in the actual environment, $\|N\|^2$ is the energy of the noise signal sampled in the actual environment, and $\overline{\|N\|^2}$ is the estimate of that energy;
the value of w is determined by the signal-to-noise ratio SNR:
$$w = f(\mathrm{SNR}),$$
where f(SNR) expresses the relation between the weight coefficient w and the SNR, f(SNR) takes values in (0,1), and w is negatively correlated with the SNR, linearly or nonlinearly.
4. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 3, characterized in that the relation between w and the SNR is expressed as follows:
$$w = f(\mathrm{SNR}) = \exp\left(-\frac{\mathrm{SNR}-\alpha}{\gamma}\right) u(\mathrm{SNR}-\alpha) + u(\alpha-\mathrm{SNR}),$$
where u(·) is the step function, α takes values in (1,5) and is the SNR threshold; when the SNR is below α, the weight coefficient w is 1; when the SNR is above α, w is negatively correlated with the SNR and decays exponentially, gradually converging to 0 as the SNR grows.
5. The anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 3, characterized in that the relation between w and the SNR is expressed as follows:
$$w = f(\mathrm{SNR}) = 1 - \frac{1}{1+\exp\left[-\frac{\mathrm{SNR}-\beta}{\theta}\right]},$$
where β takes values in (1,10) and is the critical SNR at which the traditional MFCC and the phase MFCC have equal weight; γ and θ take values in (0.1,1) and are used to adjust the speed of change: the larger the value of γ or θ, the slower the change.
6. A device implementing the anti-noise speech recognition method based on SNR-weighted template feature matching according to claim 1, characterized in that it comprises: a power supply module, a display module, a storage module, a DSP/ARM digital processing module, a microphone, an A/D converter and a USB interface; the storage module, the USB interface, the display module, the power supply module and one end of the A/D converter are all electrically connected to the DSP/ARM digital processing module, and the microphone is electrically connected to the other end of the A/D converter; the microphone is used to input the test speech, the A/D converter digitizes the test speech, the DSP/ARM chip extracts the features and performs template matching, the storage module stores the reference database, the display module displays the result, and the USB interface connects to a computer.
7. The device according to claim 6, characterized in that the A/D converter is an ADC0832 chip, and the DSP/ARM digital processing module is a DSP/ARM7 chip.
8. The device according to claim 7, characterized in that the DSP/ARM7 chip is the TI TMS320C6711 or the Samsung ARM7 S3C44B0.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410040474.9A CN103778914B (en) 2014-01-27 2014-01-27 Anti-noise voice identification method and device based on signal-to-noise ratio weighing template characteristic matching

Publications (2)

Publication Number Publication Date
CN103778914A (en) 2014-05-07
CN103778914B (en) 2017-02-15

Family

ID=50571083

Country Status (1)

Country Link
CN (1) CN103778914B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170215

Termination date: 20220127