CN103811003B - Speech recognition method and electronic device - Google Patents
- Publication number
- CN103811003B (CN201210454965.9A)
- Authority
- CN
- China
- Prior art keywords
- voice information
- recognition result
- preset condition
- recognition model
- recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Telephonic Communication Services (AREA)
- Telephone Function (AREA)
Abstract
The present invention provides a speech recognition method and an electronic device. The method is applied in an electronic device that has a speech recognition service, and comprises: obtaining first voice information; recognizing the first voice information with a first recognition model to obtain a first recognition result; judging whether the first recognition result meets a first preset condition; when the first recognition result meets the first preset condition, recognizing the first voice information with a second recognition model different from the first recognition model to obtain a second recognition result; and, based on the second recognition result, controlling the electronic device to execute the corresponding control instruction.
Description
Technical field
The present invention relates to the field of electronic technology, and in particular to a speech recognition method and an electronic device.
Background
With the development of electronic technology, and for the convenience of human-computer interaction, more and more electronic devices integrate a speech recognition service. The user can then control the device conveniently by voice, without relying on physical control devices such as a mouse or keyboard.
In the prior art, a speech recognition service usually works as follows. A sound input device, such as a microphone, records sound information in real time and passes it to a speech recognition module as it is recorded. The speech recognition module then performs a series of processing steps on the sound information. It first pre-processes the signal; pre-processing includes filtering, sampling and quantization, windowing, and so on. It then extracts feature parameters from the pre-processed speech signal to obtain a feature vector, compares the feature vector for similarity against each template in a template library, and outputs the template with the highest similarity as the recognition result. The templates in the template library are trained in advance: each word in the vocabulary is spoken once, and its feature vector is stored in the template library as a template. Finally, the corresponding operation instruction is obtained from the mapping between recognition results and operation instructions, and the corresponding operation is carried out according to that instruction.
However, in the course of implementing the present invention, the inventors found that the prior-art scheme runs the above recognition process on whatever sound information is recorded, until a result is obtained, whether or not that result corresponds to an operation instruction. In practice, the sound recorded by the microphone is sometimes not the user's voice, and may not even be a human voice. If such input is still put through the full recognition process, genuinely valid voice commands make up a low proportion of everything that is recognized, that is, the speech recognition rate is low, and recognition efficiency drops as well.
Summary of the invention
The present invention provides a speech recognition method and an electronic device, to solve the technical problem in the prior art that running the complete recognition process on all sound information leads to a low speech recognition rate and low recognition efficiency.
One aspect of the present invention provides a speech recognition method applied in an electronic device that has a speech recognition service. The method comprises: obtaining first voice information; recognizing the first voice information with a first recognition model to obtain a first recognition result; judging whether the first recognition result meets a first preset condition; when the first recognition result meets the first preset condition, recognizing the first voice information with a second recognition model different from the first recognition model to obtain a second recognition result; and, based on the second recognition result, controlling the electronic device to execute the corresponding control instruction. The speech recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, controlling the electronic device to execute the corresponding control instruction based on the second recognition result comprises: executing the wake-up instruction to wake up the speech recognition service.
Optionally, when the first recognition result does not meet the first preset condition, the method further comprises: discarding the first voice information.
Optionally, before the first voice information is recognized with the first recognition model, the method further comprises: judging whether the first voice information meets a second preset condition; when the first voice information does not meet the second preset condition, discarding the first voice information; and when the first voice information meets the second preset condition, executing the step of recognizing the first voice information with the first recognition model.
Optionally, recognizing the first voice information with the first recognition model to obtain the first recognition result is specifically: identifying whether the user corresponding to the first voice information is a preset user, to obtain the first recognition result. When the user corresponding to the first voice information is not the preset user, the first voice information does not meet the first preset condition; when the user corresponding to the first voice information is the preset user, the first voice information meets the first preset condition.
Optionally, obtaining the first voice information specifically includes: performing endpoint detection on the first voice information to obtain the first voice information after detection.
Optionally, recognizing the first voice information with the second recognition model, different from the first recognition model, when the first recognition result meets the first preset condition, to obtain the second recognition result, is specifically: recognizing the first voice information with the second recognition model to obtain a third recognition result; and obtaining the second recognition result based on the first recognition result and the third recognition result.
Another aspect of the present invention provides an electronic device that has a speech recognition service. The electronic device includes: a circuit board; a sound acquisition device, connected to the circuit board, for obtaining first voice information; a processing chip, arranged on the circuit board, for recognizing the first voice information with a first recognition model to obtain a first recognition result, judging whether the first recognition result meets a first preset condition, and, when the first recognition result meets the first preset condition, recognizing the first voice information with a second recognition model different from the first recognition model to obtain a second recognition result; and a control chip, arranged on the circuit board, for controlling the electronic device to execute the corresponding control instruction based on the second recognition result. The speech recognition service is in a closed state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip is specifically configured to execute the wake-up instruction and wake up the speech recognition service.
Optionally, the processing chip is further configured to discard the first voice information when the first recognition result does not meet the first preset condition.
Optionally, the processing chip includes a first sub-processing chip and a second sub-processing chip. The first sub-processing chip is specifically configured to judge whether the first voice information meets a second preset condition, and to discard the first voice information when it does not. When the first voice information meets the second preset condition, the second sub-processing chip is specifically configured to recognize the first voice information with the first recognition model.
Optionally, the processing chip further includes a third sub-processing chip, specifically configured to identify whether the user corresponding to the first voice information is a preset user, to obtain the first recognition result. When the user corresponding to the first voice information is not the preset user, the first voice information does not meet the first preset condition; when the user corresponding to the first voice information is the preset user, the first voice information meets the first preset condition.
Optionally, the sound acquisition device further includes a detection chip for performing endpoint detection on the first voice information to obtain the first voice information after detection.
Optionally, the processing chip further includes a fourth sub-processing chip, configured to recognize the first voice information with the second recognition model when the first recognition result meets the first preset condition, to obtain a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In an embodiment of the present invention, voice information first undergoes a first recognition step with the first recognition model. Whether the result of that step meets the first preset condition is then judged, that is, whether recognition should continue. Only when the preset condition is met does the second recognition model carry out the next recognition step, producing a recognition result according to which the corresponding control instruction is executed. Because the first step screens the input and only qualifying input continues to be recognized, a higher proportion of the final recognition results are valid ones, that is, the recognition rate improves. Voice information intercepted at the first step needs no further recognition work, so recognition efficiency improves as well.
Further, in one embodiment of the invention, voice information that does not meet the preset condition is discarded directly and receives no subsequent processing. This greatly reduces unnecessary computation, and because the second recognition model does not have to run, it also saves power.
Further, in one embodiment of the invention, an additional judgment is made before the first recognition model is used: whether the voice information itself meets a second preset condition. If it does not, the first voice information is discarded directly without going through the first recognition model, which saves further power and computation.
Further, in one embodiment of the invention, the second recognition result finally obtained through the first and second recognition models is used only to determine whether the corresponding control instruction is a wake-up instruction. If it is, the speech recognition service is woken up so that it can recognize subsequent voice commands. If it is not, monitoring continues until a wake-up instruction is heard. In the meantime the actual speech recognition service remains idle, which greatly saves power and computation.
Brief description of the drawings
Fig. 1 is a flow chart of the speech recognition method in one embodiment of the invention;
Fig. 2 is an architecture diagram of the electronic device in one embodiment of the invention.
Detailed description of the embodiments
The embodiments of the present invention provide a speech recognition method and an electronic device, solving the technical problem in the prior art that running the complete recognition process on all sound information leads to a low speech recognition rate and low recognition efficiency.
To solve the above technical problem, the general idea of the technical solution in the embodiments of the present invention is as follows:
Voice information first undergoes a first recognition step with the first recognition model. Whether the result of that step meets the first preset condition is then judged, that is, whether recognition should continue. Only when the preset condition is met does the second recognition model carry out the next recognition step, producing a recognition result according to which the corresponding control instruction is executed. Because the first step screens the input and only qualifying input continues to be recognized, a higher proportion of the final recognition results are valid ones, that is, the recognition rate improves. Voice information intercepted at the first step needs no further recognition work, so recognition efficiency improves as well.
For a better understanding of the above technical solution, it is described in detail below with reference to the accompanying drawings and specific embodiments.
One embodiment of the invention provides a speech recognition method applied in an electronic device, for example a mobile phone, a PDA (personal digital assistant), a tablet computer or a laptop. The electronic device has a speech recognition service.
Referring to Fig. 1, which is a flow chart of the speech recognition method in this embodiment, the method comprises:
Step 101: obtain first voice information;
Step 102: recognize the first voice information with a first recognition model to obtain a first recognition result;
Step 103: judge whether the first recognition result meets a first preset condition;
Step 104: when the first recognition result meets the first preset condition, recognize the first voice information with a second recognition model different from the first recognition model to obtain a second recognition result;
Step 105: based on the second recognition result, control the electronic device to execute the corresponding control instruction.
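Steps 101 to 105 can be sketched as a short gated pipeline. This is only an illustration: the function name `recognize` and its callable parameters are hypothetical and do not come from the patent, which leaves the concrete models open.

```python
from typing import Any, Callable, Optional


def recognize(
    voice: Any,
    first_model: Callable[[Any], Any],
    meets_first_condition: Callable[[Any], bool],
    second_model: Callable[[Any], Any],
    execute: Callable[[Any], Any],
) -> Optional[Any]:
    """Run steps 102-105 on voice information already obtained in step 101."""
    first_result = first_model(voice)            # step 102: cheap first pass
    if not meets_first_condition(first_result):  # step 103: gate on the result
        return None                              # the second model never runs
    second_result = second_model(voice)          # step 104: second recognition
    return execute(second_result)                # step 105: run the instruction
```

The point of the structure is that the second model is only invoked for input that passes the gate, so rejected input costs one first-pass evaluation and nothing more.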
The implementation of the speech recognition method in this embodiment is described in detail below for different application scenarios.
In the first embodiment, it is assumed that the speech recognition service is already on. In step 101, voice information may for example be recorded in real time through a microphone. In a specific implementation, obtaining the first voice information may also include performing endpoint detection on it, for example based on short-time energy and short-time average zero-crossing rate, so that the start and end points of speech are determined accurately from the captured sound signal and speech is separated from non-speech. This reduces the amount of first voice information collected, saves work in later steps, excludes interference from silent or noisy segments, and improves the performance of the speech recognition service. In the following embodiments, the first voice information may be either voice information that has undergone endpoint detection or voice information that has not; the subsequent steps are carried out in a similar way.
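As a rough illustration of endpoint detection based on short-time energy and zero-crossing rate, the sketch below marks a frame as speech when its energy is high, or when its energy is above a lower floor and its zero-crossing rate is high. The frame length and all thresholds are illustrative assumptions, not values from the patent, and real systems typically use overlapping, windowed frames.

```python
def frame_features(samples, frame_len):
    """Yield (short-time energy, zero-crossing rate) per non-overlapping frame.

    frame_len must be at least 2.
    """
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        yield energy, crossings / (frame_len - 1)


def detect_endpoints(samples, frame_len=4, energy_high=0.01,
                     energy_low=0.001, zcr_min=0.3):
    """Return (start, end) sample indices of the speech span, or None."""
    speech = [
        i for i, (energy, zcr) in enumerate(frame_features(samples, frame_len))
        if energy >= energy_high or (energy >= energy_low and zcr >= zcr_min)
    ]
    if not speech:
        return None  # only silence or low-level noise was found
    return speech[0] * frame_len, (speech[-1] + 1) * frame_len
```

Only the samples between the returned indices would need to be passed on to step 102.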
Step 102 is then executed: the obtained first voice information is recognized with the first recognition model to obtain the first recognition result. In a specific implementation the first recognition model can take many forms; several examples are given below.
First, the first recognition model is, for example, a voice recognition model for a specific user. When the first voice information is obtained in step 101, the first recognition model identifies whether the user corresponding to it is a preset user, that is, whether the first voice information was uttered by the preset user. Specifically, this may be done by voiceprint comparison, checking whether the voiceprint similarity exceeds a preset value. In this embodiment the first preset condition is, for example, that the similarity is greater than or equal to 98%. Suppose the recognition result for the first voice information is a similarity of 99%: comparing 99% with the first preset condition of 98% shows it is greater, which means the first voice information was uttered by the preset user. Suppose instead the result is a similarity of 97%: comparing 97% with 98% shows it is smaller, which means the first voice information was not uttered by the preset user.
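The threshold comparison in this example can be written down directly. In the sketch below, the 98% threshold comes from the text; representing voiceprints as numeric vectors and scoring them with cosine similarity is an assumption made only for illustration, since the patent does not say how the similarity is computed.

```python
import math

SIMILARITY_THRESHOLD = 0.98  # first preset condition from the example above


def cosine_similarity(a, b):
    """Similarity between two voiceprint vectors (assumed representation)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_preset_user(similarity, threshold=SIMILARITY_THRESHOLD):
    """True when the first recognition result meets the first preset condition."""
    return similarity >= threshold
```

With these definitions, a similarity of 0.99 passes and 0.97 is rejected, matching the two cases described in the text.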
Second, the first recognition model is a simple recognition model that recognizes only one or two features of the first voice information and yields a recognition result for those one or two features. In this embodiment the first preset condition is, for example, that the matching score for the one or two features reaches a certain threshold. When the matching score in the first recognition result is greater than or equal to the threshold, the first recognition result is determined to meet the first preset condition. Because only one or two features are recognized, the amount of computation is small.
Third, the first recognition model is again a simple recognition model. Unlike the second case, the simple recognition model here recognizes all sound features but uses a fuzzy algorithm, that is, a relatively simple algorithm that performs fuzzy matching, so the computation is much smaller than that of exact calculation and exact matching. From the first recognition result obtained with such a model, it can be judged whether the likelihood that the first voice information is a voice command exceeds a threshold, which is the first preset condition. If the likelihood is greater than or equal to the threshold, the first recognition result meets the first preset condition.
Three cases of the first recognition model have been described above. In practice the first recognition model may also be another model, as long as its computation is smaller than that of the single-pass complete recognition process in the prior art; this application places no restriction on it.
When recognition with the first recognition model above has been carried out and the first recognition result is judged to meet the first preset condition, step 104 is executed: the first voice information is recognized further with the second recognition model. The second recognition model is illustrated below for each of the three first recognition models described above.
First, when the first voice information is determined to have been uttered by the preset user, it was issued by an authorized user and can be recognized further. The second recognition model is then enabled to recognize the first voice information. For example, feature parameters are first extracted to obtain a feature vector; the feature vector is then compared for similarity against each template in the template library, and the template with the highest similarity is output as the recognition result. This is the same as the recognition process in the prior art, and it yields the second recognition result.
Second, the second recognition model is a complex recognition model that recognizes the other features not yet recognized by the first recognition model, for example three, five or even more features; alternatively, it may recognize all features once more. It finally produces one recognition result, the second recognition result. Specifically, if only the remaining features are analyzed, the first recognition result and the result obtained with the second recognition model can be considered together, for example by taking into account the score and weight of each feature, to arrive at the second recognition result.
Third, the second recognition model is a complex recognition model. Unlike the complex model in the second case, the one here performs exact matching with an exact algorithm, so a more accurate recognition result, the second recognition result, can be obtained. The first recognition result may of course also be taken into account, for example by assigning different weights to the two recognition results and then determining the second recognition result corresponding to the first voice information.
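One way to read "assigning different weights to the two recognition results" is a weighted sum of per-command scores. The sketch below makes that assumption explicit; the weights 0.3 and 0.7, the dictionary representation and the name `fuse` are all hypothetical, as the patent does not specify how the results are combined.

```python
def fuse(first_scores, second_scores, w_first=0.3, w_second=0.7):
    """Combine per-command scores from both models; return (command, score).

    Only commands scored by both models are considered. Raises ValueError
    if the two models share no candidate command.
    """
    commands = first_scores.keys() & second_scores.keys()
    fused = {
        c: w_first * first_scores[c] + w_second * second_scores[c]
        for c in commands
    }
    best = max(fused, key=fused.get)
    return best, fused[best]
```

Giving the second model the larger weight reflects that it is the more accurate of the two, while the first result can still tip the decision between close candidates.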
Likewise, the three forms of the second recognition model above are only examples and do not limit the present invention. Any model is suitable as long as recognition with it yields a result from which the voice command can be determined.
After the second recognition result is obtained by the above or other methods, step 105 is executed: based on the second recognition result, the electronic device is controlled to execute the corresponding control instruction. In a specific implementation, the corresponding voice command is first determined from the second recognition result, and the corresponding control instruction is then executed according to that voice command. The voice command corresponding to the second recognition result is, for example, a command to make a phone call or to edit a text message; in practice it may also be another command, and this application places no restriction on it.
As the above description shows, because the first step screens the input and only qualifying input continues to be recognized, a higher proportion of the final recognition results are valid ones, that is, the recognition rate improves. Voice information intercepted at the first step needs no further recognition work, so recognition efficiency improves as well.
In a further embodiment, when the judgment in step 103 is that the first recognition result does not meet the first preset condition, the first voice information is discarded directly and no subsequent recognition is performed. This greatly reduces unnecessary computation, and because the second recognition model does not have to run, it also saves power.
To save further power and computation, this embodiment also judges directly, before step 102 is executed, whether the first voice information meets a second preset condition. When the first voice information does not meet the second preset condition, it is discarded; when it does, step 102 is executed.
Specifically, it can be judged whether the first voice information is a human voice rather than noise such as wind, metallic sounds from a construction site, or animal sounds such as barking or meowing. If the first voice information is a human voice, step 102 is executed. If not, the first voice information can be discarded directly, which saves the computation of both the first and the second recognition model and, because neither model has to run, also reduces power consumption.
In another embodiment, the second preset condition may also be, as described above, that the user corresponding to the first voice information is the preset user. If the judgment shows that the user corresponding to the first voice information is not the preset user, that user has no control authority over the electronic device, so step 102 and the subsequent steps are not carried out and the voice information is discarded directly.
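The ordering of the two checks can be sketched as follows: the second preset condition is tested on the raw voice information first, and the first recognition model runs only if that check passes. The function `screen` and the `is_human_voice` predicate are hypothetical names; how a human voice is told apart from noise is not specified in the text.

```python
def screen(voice, is_human_voice, first_model, meets_first_condition):
    """Return the stage at which the input was discarded, or "passed"."""
    if not is_human_voice(voice):                      # second preset condition
        return "discarded before step 102"             # first model never runs
    if not meets_first_condition(first_model(voice)):  # steps 102 and 103
        return "discarded at step 103"                 # second model never runs
    return "passed"                                    # continue with step 104
```

Input rejected by the first check costs neither a first-model nor a second-model evaluation, which is the saving the embodiment describes.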
In a second embodiment, it is assumed that the speech recognition service is not turned on. If the speech recognition service were always running, the speech recognition process would be carried out continuously, causing high power consumption and computation. In this embodiment, therefore, a small wake-up applet resides in the background of the electronic device's operating system. The wake-up applet identifies whether the user's instruction is a wake-up instruction and, if so, starts the speech recognition service. The implementation of the speech recognition method in this embodiment is illustrated below with specific examples.
The wake-up applet continuously monitors the sound recorded by the sound input device, which corresponds to step 101, obtaining the first voice information. Step 102 is then executed. In this embodiment the first recognition model may be any of the three models described in the first embodiment, or it may simply judge whether the first voice information is a human voice and, if so, proceed to step 104. When the judgment in step 103 is that the first preset condition is met, the second recognition model is used for recognition and the second recognition result is obtained. The second recognition result is then checked to see whether it is a wake-up instruction. In this embodiment the wake-up applet can be set to contain only two voice commands, one to turn the speech recognition service on and one to turn it off. The second recognition result therefore needs at most two comparisons to determine whether it corresponds to a wake-up instruction, so the comparison is fast, the computation is small, and power is saved.
When the second recognition result corresponds to a wake-up instruction, step 105 is specifically to execute the wake-up instruction and wake up the speech recognition service. The speech recognition service then starts, and the user can interact with the electronic device by voice. The speech recognition service can likewise be turned off in this way to save power, after which the wake-up applet continues to monitor until a wake-up instruction is heard and the speech recognition service is woken up again.
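The wake-up applet's handling of its two commands can be sketched as a small state holder. The command labels and the class name `WakeApplet` are hypothetical placeholders; the text only states that the applet knows one command to turn the service on and one to turn it off.

```python
OPEN_SERVICE = "open speech recognition service"    # hypothetical command label
CLOSE_SERVICE = "close speech recognition service"  # hypothetical command label


class WakeApplet:
    """Tracks whether the speech recognition service is running."""

    def __init__(self):
        self.service_running = False  # the service starts in the closed state

    def handle(self, second_result):
        """Apply a second recognition result and return the service state."""
        if second_result == OPEN_SERVICE:      # first comparison
            self.service_running = True
        elif second_result == CLOSE_SERVICE:   # second comparison
            self.service_running = False
        # Any other result is ignored and the applet keeps monitoring.
        return self.service_running
```

Because only two commands are compared, each second recognition result is resolved in at most two equality checks.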
For example, suppose the speech recognition service is currently closed and the user speaks the wake-up phrase to the electronic device. The wake-up applet hears it and may first carry out the judgment of the second preset condition described above. The judgment finds that it is a human voice, so step 102 is executed: the first recognition model performs recognition and produces a recognition result. For example, fuzzy recognition finds that the input may be a wake-up instruction, so the second recognition model is used for exact recognition and the second recognition result is obtained. This confirms that the input is indeed a wake-up instruction, so step 105 is executed: the wake-up instruction is executed and the electronic device is controlled to start the speech recognition service.
As another example, suppose the user has not spoken and there is only a cat meowing in the room. When the wake-up applet hears the sound, it judges that it is not a human voice, discards the voice information directly, and continues to monitor.
As another example, suppose the preliminary judgment passes and the sound is a human voice. A further judgment can then be made through step 101; if, for example, it finds that the voice information was not uttered by the user, the voice information is still discarded and monitoring continues.
As another example, suppose that after step 104 is completed, the comparison shows that the second recognition result is not a wake-up instruction. The wake-up applet then continues to monitor the sound information recorded by the sound input device, and wakes up the speech recognition service only once "Mytip" is heard.
The above embodiments may be implemented individually or in combination, and those skilled in the art can choose according to the actual situation.
Third embodiment. In this embodiment, the second identification model of the first embodiment is the speech recognition service of the second embodiment, and the first identification model of the first embodiment is the wake-up applet of the second embodiment. When the wake-up applet judges that the first recognition result meets the first preset condition, for example that the user of the first voice information is the predetermined user, that is, the voice command was indeed issued by the predetermined user, the second identification model is woken up so that it can enter the working state and further identify which voice command the first voice information corresponds to, for example a command to make a phone call. If the user is not the predetermined user, the second identification model is not woken up. Therefore, in this embodiment, after step 103 and before step 104, the method further comprises the step of: waking up the second identification model when the first recognition result meets the first preset condition.
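The additional step between step 103 and step 104 can be illustrated with a short sketch. It is a sketch under stated assumptions, not the embodiment itself: the `SecondModel` class, its `awake` flag, the `process` function and the dictionary keys `speaker` and `command` are all hypothetical, and the speaker comparison stands in for the first identification model.

```python
from typing import Optional


class SecondModel:
    """Hypothetical second identification model that starts in a non-working state."""

    def __init__(self) -> None:
        self.awake = False

    def wake(self) -> None:
        self.awake = True

    def recognize(self, utterance: dict) -> str:
        if not self.awake:
            raise RuntimeError("second model must be woken before use")
        return utterance["command"]  # stand-in for real command recognition


def process(utterance: dict, predetermined_user: str,
            second_model: SecondModel) -> Optional[str]:
    """Wake the second model only when the first preset condition is met."""
    # Stand-in for the first identification model: a speaker check.
    if utterance["speaker"] != predetermined_user:
        return None                       # second model is left asleep
    second_model.wake()                   # the added step after step 103
    return second_model.recognize(utterance)  # step 104
```

With this arrangement an utterance from another speaker never causes the second model to enter the working state.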
Based on the same inventive concept, the specific architecture of an electronic device that implements the above speech recognition method in an embodiment of the present invention is described below. Referring to FIG. 2, the electronic device includes: a circuit board 201; a sound acquisition device 202, connected to the circuit board 201 and configured to obtain the first voice information; a processing chip 203, arranged on the circuit board 201 and configured to recognize the first voice information by the first identification model to obtain the first recognition result, to judge whether the first recognition result meets the first preset condition, and, when the first recognition result meets the first preset condition, to recognize the first voice information by the second identification model, which is different from the first identification model, to obtain the second recognition result; and a control chip 204, arranged on the circuit board 201 and configured to control, based on the second recognition result, the electronic device to execute the corresponding control instruction.
Further, the processing chip 203 is also configured to discard the first voice information when the first recognition result does not meet the first preset condition.
In one embodiment, the processing chip 203 includes a first sub-processing chip and a second sub-processing chip. The first sub-processing chip is configured to judge whether the first voice information meets the second preset condition and to discard the first voice information when it does not. When the first voice information meets the second preset condition, the second sub-processing chip is configured to recognize the first voice information by the first identification model.
Further, the processing chip 203 also includes a third sub-processing chip, configured to identify whether the user corresponding to the first voice information is the predetermined user, to obtain the first recognition result. When the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition; when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
Further, the processing chip 203 also includes a fourth sub-processing chip, configured to recognize the first voice information by the second identification model when the first recognition result meets the first preset condition, to obtain a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
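The description does not specify how the first and third recognition results are combined. One possible reading, given purely as an assumption, is that the first result carries the identified user and the third result carries the recognized command, and the second result pairs the two. The type names and the `combine` function below are hypothetical.

```python
from typing import NamedTuple


class FirstResult(NamedTuple):
    user: str        # assumed: the user identified by the first identification model


class ThirdResult(NamedTuple):
    command: str     # assumed: the command recognized by the second identification model


class SecondResult(NamedTuple):
    user: str
    command: str


def combine(first: FirstResult, third: ThirdResult) -> SecondResult:
    """Build the second recognition result from the first and third results."""
    return SecondResult(user=first.user, command=third.command)
```

Under this assumption the control instruction that is finally executed can depend both on who spoke and on what was said.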
In another embodiment, the sound acquisition device 202 further includes a detection chip, configured to perform endpoint detection on the first voice information to obtain the detected first voice information. The detection chip may also be arranged on the circuit board 201.
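The description does not state which endpoint detection technique the detection chip uses. As an illustration only, a simple short-time energy threshold is assumed here; the function name `detect_endpoints`, the frame length and the threshold value are hypothetical choices, not values from the embodiment.

```python
from typing import List, Optional, Tuple


def detect_endpoints(samples: List[float], frame_len: int = 4,
                     threshold: float = 0.01) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the active region, or None.

    A frame counts as active when its mean squared amplitude reaches
    the threshold; the region spans the first to the last active frame.
    """
    active_starts = []
    for i in range(0, len(samples), frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(s * s for s in frame) / len(frame)
        if energy >= threshold:
            active_starts.append(i)
    if not active_starts:
        return None  # nothing above the threshold: no endpoints found
    start = active_starts[0]
    end = min(active_starts[-1] + frame_len, len(samples))
    return start, end
```

The slice `samples[start:end]` would then play the role of the detected first voice information that is passed on for recognition.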
In another embodiment, the speech recognition service is in an off state; when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip 204 is configured to execute the wake-up instruction and wake up the speech recognition service.
The sound acquisition device is, for example, a microphone; it may be a single microphone or a microphone array.
In addition, the processing chip 203 and the control chip 204 may be two separate chips, or may be integrated on the same chip.
Likewise, the first, second, third and fourth sub-processing chips of the processing chip 203 may be four independent chips, or may be integrated on the same chip.
The variations and specific examples of the speech recognition method in the foregoing embodiments apply equally to the electronic device of this embodiment. From the foregoing detailed description of the speech recognition method, those skilled in the art can clearly understand how the electronic device in this embodiment is implemented, so for brevity of the specification the details are not repeated here.
The one or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
In an embodiment of the present invention, the voice information first undergoes a first recognition step by the first identification model. Whether the result of that step meets the first preset condition is then judged, that is, whether recognition should continue. Only when the preset condition is met is the next recognition step carried out by the second identification model, yielding a recognition result according to which the corresponding control instruction is executed. Because the first step screens the input and only input that meets the condition proceeds to further recognition, a higher proportion of the recognition results finally obtained are valid, that is, the recognition rate is improved; and voice information intercepted by the first step requires no further recognition work, so recognition efficiency is improved.
Further, in an embodiment of the present invention, voice information that does not meet the preset condition is discarded directly and undergoes no subsequent processing, so unnecessary computation is greatly reduced; and because the second identification model does not need to run, power is also saved.
Further, in an embodiment of the present invention, an additional judgment condition is set before recognition by the first identification model: whether the voice information itself meets the second preset condition is judged directly. When the second preset condition is not met, the first voice information is discarded directly without being recognized by the first identification model, which further saves power and reduces computation.
Further, in an embodiment of the present invention, the second recognition result finally obtained through the first identification model and the second identification model is used only to determine whether the corresponding control instruction is a wake-up instruction. When it is, the speech recognition service is woken up to recognize subsequent voice commands; when it is not, monitoring continues until a wake-up instruction is heard. The actual speech recognition service therefore remains in a non-working state during this time, which greatly saves power and computation.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (8)
1. A speech recognition method, applied to an electronic device having a speech recognition service, characterized in that the method comprises:
obtaining first voice information;
judging whether the first voice information meets a second preset condition, wherein the second preset condition is used to judge whether the first voice information is speech;
when the first voice information does not meet the second preset condition, discarding the first voice information;
when the first voice information meets the second preset condition, recognizing the first voice information by a first identification model to obtain a first recognition result;
judging whether the first recognition result meets a first preset condition;
when the first recognition result does not meet the first preset condition, discarding the first voice information; when the first recognition result meets the first preset condition, recognizing the first voice information by a second identification model different from the first identification model to obtain a second recognition result;
based on the second recognition result, controlling the electronic device to execute a corresponding control instruction;
wherein the speech recognition service is in an off state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the controlling, based on the second recognition result, the electronic device to execute the corresponding control instruction comprises:
executing the wake-up instruction to wake up the speech recognition service.
2. The method according to claim 1, characterized in that the recognizing the first voice information by the first identification model to obtain the first recognition result is specifically:
identifying whether the user corresponding to the first voice information is a predetermined user, to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
3. The method according to claim 1, characterized in that the obtaining the first voice information specifically comprises:
performing endpoint detection on the first voice information to obtain the detected first voice information.
4. The method according to claim 1, characterized in that, when the first recognition result meets the first preset condition, the recognizing the first voice information by the second identification model different from the first identification model to obtain the second recognition result is specifically:
recognizing the first voice information by the second identification model to obtain a third recognition result;
obtaining the second recognition result based on the first recognition result and the third recognition result.
5. An electronic device having a speech recognition service, characterized in that the electronic device comprises:
a circuit board;
a sound acquisition device, connected to the circuit board and configured to obtain first voice information;
a processing chip, arranged on the circuit board and configured to: recognize the first voice information by a first identification model to obtain a first recognition result; judge whether the first recognition result meets a first preset condition; when the first recognition result does not meet the first preset condition, discard the first voice information; and when the first recognition result meets the first preset condition, recognize the first voice information by a second identification model different from the first identification model to obtain a second recognition result;
a control chip, arranged on the circuit board and configured to control, based on the second recognition result, the electronic device to execute a corresponding control instruction;
wherein the speech recognition service is in an off state, and when the control instruction corresponding to the second recognition result is a wake-up instruction, the control chip is specifically configured to execute the wake-up instruction to wake up the speech recognition service;
the processing chip comprises a first sub-processing chip and a second sub-processing chip; the first sub-processing chip is specifically configured to judge whether the first voice information meets a second preset condition and, when the first voice information does not meet the second preset condition, to discard the first voice information; when the first voice information meets the second preset condition, the second sub-processing chip is specifically configured to recognize the first voice information by the first identification model;
wherein the second preset condition is used to judge whether the first voice information is speech.
6. The electronic device according to claim 5, characterized in that the processing chip further comprises a third sub-processing chip, specifically configured to identify whether the user corresponding to the first voice information is a predetermined user, to obtain the first recognition result; wherein, when the user corresponding to the first voice information is not the predetermined user, the first voice information does not meet the first preset condition, and when the user corresponding to the first voice information is the predetermined user, the first voice information meets the first preset condition.
7. The electronic device according to claim 5, characterized in that the sound acquisition device further comprises a detection chip, configured to perform endpoint detection on the first voice information to obtain the detected first voice information.
8. The electronic device according to claim 5, characterized in that the processing chip further comprises a fourth sub-processing chip, configured to recognize the first voice information by the second identification model when the first recognition result meets the first preset condition, to obtain a third recognition result, and to obtain the second recognition result based on the first recognition result and the third recognition result.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210454965.9A CN103811003B (en) | 2012-11-13 | 2012-11-13 | A kind of audio recognition method and electronic equipment |
US14/079,219 US9959865B2 (en) | 2012-11-13 | 2013-11-13 | Information processing method with voice recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210454965.9A CN103811003B (en) | 2012-11-13 | 2012-11-13 | A kind of audio recognition method and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103811003A (en) | 2014-05-21 |
CN103811003B (en) | 2019-09-24 |
Family
ID=50707680
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210454965.9A Active CN103811003B (en) | 2012-11-13 | 2012-11-13 | A kind of audio recognition method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103811003B (en) |
Families Citing this family (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104282307A (en) * | 2014-09-05 | 2015-01-14 | 中兴通讯股份有限公司 | Method, device and terminal for awakening voice control system |
CN105529025B (en) * | 2014-09-28 | 2019-12-24 | 联想(北京)有限公司 | Voice operation input method and electronic equipment |
CN106033331B (en) * | 2015-03-16 | 2019-07-26 | 联想(北京)有限公司 | Information processing method and electronic equipment |
CN106469553A (en) * | 2015-08-13 | 2017-03-01 | 中兴通讯股份有限公司 | Audio recognition method and device |
CN105206271A (en) * | 2015-08-25 | 2015-12-30 | 北京宇音天下科技有限公司 | Intelligent equipment voice wake-up method and system for realizing method |
CN105976814B (en) * | 2015-12-10 | 2020-04-10 | 乐融致新电子科技(天津)有限公司 | Control method and device of head-mounted equipment |
CN105609103A (en) * | 2015-12-18 | 2016-05-25 | 合肥寰景信息技术有限公司 | Speech instant recognition system |
CN107767860B (en) * | 2016-08-15 | 2023-01-13 | 中兴通讯股份有限公司 | Voice information processing method and device |
CN107767861B (en) * | 2016-08-22 | 2021-07-02 | 科大讯飞股份有限公司 | Voice awakening method and system and intelligent terminal |
CN107767863B (en) * | 2016-08-22 | 2021-05-04 | 科大讯飞股份有限公司 | Voice awakening method and system and intelligent terminal |
CN106157950A (en) * | 2016-09-29 | 2016-11-23 | 合肥华凌股份有限公司 | Speech control system and awakening method, Rouser and household electrical appliances, coprocessor |
CN106653031A (en) * | 2016-10-17 | 2017-05-10 | 海信集团有限公司 | Voice wake-up method and voice interaction device |
CN106448663B (en) * | 2016-10-17 | 2020-10-23 | 海信集团有限公司 | Voice awakening method and voice interaction device |
CN106782569A (en) * | 2016-12-06 | 2017-05-31 | 深圳增强现实技术有限公司 | A kind of augmented reality method and device based on voiceprint registration |
CN108335695B (en) * | 2017-06-27 | 2020-10-30 | 腾讯科技(深圳)有限公司 | Voice control method, device, computer equipment and storage medium |
CN107680590B (en) * | 2017-09-18 | 2020-10-02 | 北京小蓦机器人技术有限公司 | Method, device and storage medium for processing natural language command |
CN108511002B (en) * | 2018-01-23 | 2020-12-01 | 太仓鸿羽智能科技有限公司 | Method for recognizing sound signal of dangerous event, terminal and computer readable storage medium |
CN108461084A (en) * | 2018-03-01 | 2018-08-28 | 广东美的制冷设备有限公司 | Speech recognition system control method, control device and computer readable storage medium |
CN108717851B (en) * | 2018-03-28 | 2021-04-06 | 深圳市三诺数字科技有限公司 | Voice recognition method and device |
CN108665889B (en) * | 2018-04-20 | 2021-09-28 | 百度在线网络技术(北京)有限公司 | Voice signal endpoint detection method, device, equipment and storage medium |
CN109065036A (en) * | 2018-08-30 | 2018-12-21 | 出门问问信息科技有限公司 | Method, apparatus, electronic equipment and the computer readable storage medium of speech recognition |
US11482215B2 (en) * | 2019-03-27 | 2022-10-25 | Samsung Electronics Co., Ltd. | Multi-modal interaction with intelligent assistants in voice command devices |
CN110223672B (en) * | 2019-05-16 | 2021-04-23 | 九牧厨卫股份有限公司 | Offline multi-language voice recognition method |
CN112116926A (en) * | 2019-06-19 | 2020-12-22 | 北京猎户星空科技有限公司 | Audio data processing method and device and model training method and device |
CN110299139A (en) * | 2019-06-29 | 2019-10-01 | 联想(北京)有限公司 | A kind of sound control method, device and electronic equipment |
CN110598762A (en) * | 2019-08-26 | 2019-12-20 | Oppo广东移动通信有限公司 | Audio-based trip mode detection method and device and mobile terminal |
CN110675869A (en) * | 2019-08-28 | 2020-01-10 | 紫光云(南京)数字技术有限公司 | Method and device for controlling applications in smart city app through voice |
CN110853633A (en) * | 2019-09-29 | 2020-02-28 | 联想(北京)有限公司 | Awakening method and device |
CN111767793B (en) * | 2020-05-25 | 2024-07-26 | 联想(北京)有限公司 | Data processing method and device |
CN111951793B (en) * | 2020-08-13 | 2021-08-24 | 北京声智科技有限公司 | Method, device and storage medium for awakening word recognition |
CN114220427A (en) * | 2021-10-29 | 2022-03-22 | 深圳市锐明技术股份有限公司 | Method and device for identifying call command, terminal equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101441869A (en) * | 2007-11-21 | 2009-05-27 | 联想(北京)有限公司 | Method and terminal for speech recognition of terminal user identification |
CN201307938Y (en) * | 2008-09-02 | 2009-09-09 | 宇龙计算机通信科技(深圳)有限公司 | Mobile terminal |
CN102316227A (en) * | 2010-07-06 | 2012-01-11 | 宏碁股份有限公司 | Data processing method for voice call process |
CN102549653A (en) * | 2009-10-02 | 2012-07-04 | 独立行政法人情报通信研究机构 | Speech translation system, first terminal device, speech recognition server device, translation server device, and speech synthesis server device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI235358B (en) * | 2003-11-21 | 2005-07-01 | Acer Inc | Interactive speech method and system thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103811003B (en) | A kind of audio recognition method and electronic equipment | |
WO2021093449A1 (en) | Wakeup word detection method and apparatus employing artificial intelligence, device, and medium | |
CN107481718B (en) | Audio recognition method, device, storage medium and electronic equipment | |
CN109979438A (en) | Voice awakening method and electronic equipment | |
CN106448663B (en) | Voice awakening method and voice interaction device | |
CN109087669B (en) | Audio similarity detection method and device, storage medium and computer equipment | |
US9653069B2 (en) | System and method for personalization of acoustic models for automatic speech recognition | |
Rossi et al. | AmbientSense: A real-time ambient sound recognition system for smartphones | |
CN110570873B (en) | Voiceprint wake-up method and device, computer equipment and storage medium | |
CN110047485B (en) | Method and apparatus for recognizing wake-up word, medium, and device | |
CN105009204A (en) | Speech recognition power management | |
CN103543979A (en) | Voice outputting method, voice interaction method and electronic device | |
CN105810213A (en) | Typical abnormal sound detection method and device | |
CN110825446B (en) | Parameter configuration method and device, storage medium and electronic equipment | |
CN103543814B (en) | Signal processing apparatus and signal processing method | |
CN110223687B (en) | Instruction execution method and device, storage medium and electronic equipment | |
CN109215647A (en) | Voice awakening method, electronic equipment and non-transient computer readable storage medium | |
CN107102713A (en) | It is a kind of to reduce the method and device of power consumption | |
CN110246502A (en) | Voice noise reduction method and device and terminal equipment | |
CN109688271A (en) | The method, apparatus and terminal device of contact information input | |
CN109979446A (en) | Sound control method, storage medium and device | |
CN108231074A (en) | A kind of data processing method, voice assistant equipment and computer readable storage medium | |
EP3503093B1 (en) | Method for associating a device with a speaker in a gateway, corresponding computer program computer and apparatus | |
CN112687298A (en) | Voice wake-up optimization method, device, system, storage medium and electronic equipment | |
CN110415729A (en) | Voice activity detection method, device, medium and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||