CN111128155B - Awakening method, device, equipment and medium for intelligent equipment - Google Patents
- Publication number: CN111128155B
- Application number: CN201911236140.8A
- Authority: CN (China)
- Prior art keywords: awakening, threshold, voice, wake, target
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Electric Clocks (AREA)
Abstract
The invention discloses a wake-up method, apparatus, device, and medium for a smart device, which address the low wake-up rate and high false wake-up rate of existing methods for waking up a smart device. In embodiments of the invention, if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, the currently stored target wake-up threshold is adjusted, and whether to wake up the smart device is determined according to the adjusted target wake-up threshold. Because the target wake-up threshold can be adjusted according to whether the voice segment satisfies the threshold adjustment condition, this avoids the situation in which voice information that is likely to be wake-up voice information fails to wake up the smart device, which raises the wake-up rate. It also avoids the situation in which voice information that is likely to be non-wake-up voice information is mistakenly detected as wake-up voice information and wakes up the smart device, which lowers the false wake-up rate.
Description
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a method, an apparatus, a device, and a medium for waking up an intelligent device.
Background
In recent years, with the development of natural language processing technology, smart devices and voice interaction applications have become increasingly popular, and how to wake up smart devices efficiently and accurately has become a research hotspot.
In the prior art, waking up a smart device relies mainly on the accuracy of waveform coding analysis of the wake-up word. Fig. 1 is a schematic flow diagram of waking up a smart device in the prior art. As shown in Fig. 1, after acquiring a voice segment containing the wake-up word from the voice information, the smart device uses a wake-up word semantic similarity model to obtain the similarity between the voice segment and the preset wake-up word, and determines whether the similarity is greater than a set wake-up threshold. If so, the smart device is woken up; otherwise, it is not.
The wake-up threshold is the key to raising the wake-up rate and lowering the false wake-up rate, but these two goals conflict. To raise the wake-up rate, that is, to prevent wake-up voice information from being mistakenly detected as non-wake-up voice information, the wake-up threshold can be set lower. However, the smart device also collects non-wake-up voice information during people's ordinary conversation, and if that voice information contains the wake-up word, a lower threshold makes the smart device prone to being falsely woken up.
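The trade-off described above can be illustrated with a minimal Python sketch of the fixed-threshold decision of Fig. 1. The function name `should_wake_fixed` and the threshold value 0.85 are illustrative assumptions, and the similarity is assumed to have already been produced by the similarity model as a number between 0 and 1.

```python
def should_wake_fixed(similarity: float, wake_threshold: float = 0.85) -> bool:
    """Prior-art style decision: wake only if the similarity exceeds a fixed threshold.

    Both the function name and the default threshold are hypothetical.
    """
    return similarity > wake_threshold


# The same borderline score is rejected at 0.85 but accepted at 0.75.
borderline = 0.8
rejected = should_wake_fixed(borderline)        # threshold 0.85
accepted = should_wake_fixed(borderline, 0.75)  # lowered threshold
```

With a fixed threshold, lowering the value lets the borderline score wake the device whether it came from a genuine wake-up utterance or from ordinary conversation that happens to contain the wake-up word.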
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for waking up intelligent equipment, which are used for solving the problems of low wake-up rate and high false wake-up rate caused by the existing method for waking up the intelligent equipment.
The embodiment of the invention provides a method for waking up intelligent equipment, which comprises the following steps:
acquiring a voice section containing a wakeup word in voice information to be processed;
if the attribute characteristics of the voice section meet the preset threshold adjustment condition, adjusting the size of a currently stored target awakening threshold;
and determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.
Further, the determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the semantic model of the wake-up word and the adjusted target wake-up threshold includes:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the adjusted target awakening threshold value or not;
if yes, the intelligent equipment is confirmed to be awakened;
otherwise, the intelligent device is determined not to be awakened.
Further, if the attribute characteristics of the voice segment do not satisfy the preset threshold adjustment condition, the method further includes:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the currently stored target awakening threshold value or not;
if yes, the intelligent equipment is confirmed to be awakened;
otherwise, the intelligent device is determined not to be awakened.
Further, the attribute feature of the speech segment is a voiceprint feature of the speech segment or a signal energy of the speech segment.
Further, if the attribute feature of the voice segment is the voiceprint feature of the voice segment, and if the attribute feature of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, reducing the currently stored target wake-up threshold.
Further, the method further comprises:
if the smart device is woken up and the voiceprint feature of the voice segment matches any currently stored target voiceprint feature, updating the wake-up count corresponding to that target voiceprint feature, and adjusting the priority corresponding to that target voiceprint feature according to the updated wake-up count.
Further, if the attribute characteristic of the voice segment is the signal energy of the voice segment, and if the attribute characteristic of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold value, and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold value, reducing the size of the currently stored target awakening threshold value; otherwise, increasing the size of the currently stored target awakening threshold.
The embodiment of the invention provides a wake-up device of intelligent equipment, which comprises:
an acquisition unit, configured to acquire a voice segment containing a wake-up word in the voice information to be processed;
the adjusting unit is used for adjusting the size of the currently stored target awakening threshold value if the attribute characteristics of the voice section meet the preset threshold value adjusting condition;
and the determining unit is used for determining whether to awaken the intelligent equipment according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.
Further, the determining unit is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
Further, the determining unit is further configured to, if the attribute characteristics of the voice segment do not satisfy a preset threshold adjustment condition, determine whether a similarity between the voice segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
Further, the adjusting unit is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, reduce the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.
Further, the adjusting unit is further configured to update the wake-up times corresponding to the target voiceprint feature if the smart device is woken up and the voiceprint feature of the voice segment is matched with any one of the currently stored target voiceprint features, and adjust the priority corresponding to the target voiceprint feature according to the updated wake-up times.
Further, the adjusting unit is specifically configured to reduce the size of the currently stored target wake-up threshold if the attribute feature of a speech segment is the signal energy of the speech segment, and if the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.
An embodiment of the present invention provides an electronic device, where the electronic device includes a processor, and the processor is configured to implement the steps of the method for waking up an intelligent device as described above when executing a computer program stored in a memory.
An embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for waking up an intelligent device as described above.
In embodiments of the invention, if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, the currently stored target wake-up threshold is adjusted, and whether to wake up the smart device is determined according to the adjusted target wake-up threshold. Because the target wake-up threshold can be adjusted according to whether the voice segment satisfies the threshold adjustment condition, this avoids the situation in which voice information that is likely to be wake-up voice information fails to wake up the smart device, which raises the wake-up rate. It also avoids the situation in which voice information that is likely to be non-wake-up voice information is mistakenly detected as wake-up voice information and wakes up the smart device, which lowers the false wake-up rate.
Drawings
Fig. 1 is a schematic flow chart illustrating a wake-up process of a smart device according to the prior art;
Fig. 2 is a schematic diagram of a wake-up process of a smart device according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for a smart device according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for a smart device according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for a smart device according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a wake-up apparatus of a smart device according to an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to raise the wake-up rate and reduce the false wake-up rate of the smart device, embodiments of the present invention provide a wake-up method, apparatus, device, and medium for the smart device.
Example 1:
Fig. 2 is a schematic diagram of a wake-up process of a smart device according to an embodiment of the present invention. The wake-up process includes the following steps:
S201: Acquire a voice segment containing the wake-up word in the voice information to be processed.
The wake-up method provided by the embodiment of the present invention is applied to an electronic device. The electronic device may be the smart device that is to be woken up, or another device that performs wake-up recognition and controls the smart device to wake up. The smart device may be a mobile terminal or a smart home device such as a smart speaker or a smart air conditioner; the other device may be a server, a mobile terminal, or the like.
After the voice information to be processed is collected, the electronic device can perform recognition and semantic analysis processing on the voice information to be processed, acquire a voice segment containing the awakening word, and perform subsequent processing based on the acquired voice segment. If the voice segment containing the awakening word cannot be acquired from the voice information to be processed, the voice information can be directly ignored without subsequent processing.
It should be noted that, a specific method for detecting whether the collected voice information includes the voice segment of the wakeup word is the prior art, and is not described herein again.
S202: If the attribute feature of the voice segment satisfies the preset threshold adjustment condition, adjust the currently stored target wake-up threshold.
In order to increase the wake-up rate and reduce the false wake-up rate, in the embodiment of the present invention, the target wake-up threshold of the smart device may be adjusted. In order to determine whether to adjust the target wake-up threshold, a threshold adjustment condition may be preset, and if the attribute characteristics of the voice segment satisfy the preset threshold adjustment condition, it is determined to adjust the target wake-up threshold, and if the attribute characteristics of the voice segment do not satisfy the preset threshold adjustment condition, it is determined not to adjust the currently stored target wake-up threshold.
The preset threshold adjustment condition may be, for example, whether the voice information to be processed contains a plurality of consecutive voice segments that each contain the wake-up word: when the electronic device detects at least two such voice segments, it determines that the preset threshold adjustment condition is satisfied and adjusts the currently stored target wake-up threshold.
When the attribute feature of the voice segment satisfies the preset threshold adjustment condition, the target wake-up threshold may be either increased or decreased; the direction of the adjustment depends on which threshold adjustment condition the attribute feature satisfies.
S203: Determine whether to wake up the smart device according to the similarity between the voice segment and the wake-up word, obtained through the wake-up word semantic model, and the adjusted target wake-up threshold.
In the embodiment of the present invention, the electronic device stores the wake-up word semantic model, through which the similarity between an input voice segment and the wake-up word can be obtained. After the adjusted target wake-up threshold is obtained, the electronic device judges whether the similarity between the voice segment and the wake-up word is greater than the adjusted target wake-up threshold, and thereby determines whether to wake up the smart device.
The specific method for obtaining the similarity between the voice segment and the awakening word through the awakening word semantic model belongs to the prior art, and is not described herein again.
In embodiments of the invention, if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, the currently stored target wake-up threshold is adjusted, and whether to wake up the smart device is determined according to the adjusted target wake-up threshold. Because the target wake-up threshold can be adjusted according to whether the voice segment satisfies the threshold adjustment condition, this avoids the situation in which voice information that is likely to be wake-up voice information fails to wake up the smart device, which raises the wake-up rate. It also avoids the situation in which voice information that is likely to be non-wake-up voice information is mistakenly detected as wake-up voice information and wakes up the smart device, which lowers the false wake-up rate.
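Steps S202 and S203 can be summarized in a short Python sketch. It assumes that the similarity has already been obtained from the wake-up word semantic model and that the threshold adjustment condition has already been evaluated; the function name `decide_wake` and the signed `delta` parameter (negative to lower the threshold, positive to raise it) are hypothetical and are not taken from the embodiment.

```python
def decide_wake(
    similarity: float,
    stored_threshold: float,
    condition_met: bool,
    delta: float,
) -> bool:
    """Sketch of S202 and S203 with hypothetical parameter names."""
    # S202: adjust the stored threshold only when the adjustment condition is met.
    if condition_met:
        threshold = stored_threshold + delta
    else:
        threshold = stored_threshold
    # S203: wake up only if the similarity exceeds the threshold in effect.
    return similarity > threshold
```

For instance, with a stored threshold of 0.85 and a similarity of 0.8, the device would be woken up only when the condition is met and `delta` lowers the threshold below 0.8.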
Example 2:
In order to raise the wake-up rate and reduce the false wake-up rate of the smart device, on the basis of the above embodiments, in an embodiment of the present invention the attribute feature of the voice segment is a voiceprint feature of the voice segment or a signal energy of the voice segment.
The voiceprint feature is information which uniquely identifies the voice feature of the user, and the identity of the user who awakens the intelligent device can be determined according to the voiceprint feature of the voice section, so that whether the target awakening threshold value is adjusted or not can be determined according to the identity of the user.
In order to further accurately determine whether to wake up the smart device, if the attribute feature of the voice segment is the voiceprint feature of the voice segment, and if the attribute feature of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, reducing the currently stored target wake-up threshold.
Because the voiceprint feature identifies the user, the priority of each voiceprint feature can be determined according to the user's identity. For example, the voiceprint feature of the male head of the household is voiceprint feature A, with priority 4; the voiceprint feature of the female head of the household is voiceprint feature B, with priority 3; and the voiceprint feature of the child is voiceprint feature C, with priority 2. Priority 4 ranks above priority 3, which ranks above priority 2, so voiceprint feature A has a higher priority than voiceprint feature B, and voiceprint feature B has a higher priority than voiceprint feature C.
Therefore, when determining whether to adjust the target wake-up threshold, the electronic device may judge whether the voiceprint feature of the voice segment matches any currently stored voiceprint feature and, if a match exists, obtain the priority of the matched voiceprint feature. To decide whether the target wake-up threshold is adjusted, a priority range is preset, and any priority within that range is treated as a higher priority.
Continuing the above example, priority 4 and priority 3 may be set as the priority range in which the target wake-up threshold can be adjusted; that is, priority 4 and priority 3 are the higher priorities. Therefore, if the matched voiceprint feature is voiceprint feature B, the target wake-up threshold is adjusted; if the priority of the matched voiceprint feature is not a higher priority, for example when the matched voiceprint feature is voiceprint feature C, the target wake-up threshold is not adjusted.
A higher priority indicates that the user corresponding to the voiceprint feature is more important. To raise the rate at which such a user successfully wakes up the smart device, the target wake-up threshold can be appropriately reduced when the voiceprint feature of the voice segment is that user's voiceprint feature. The amount of the reduction can be determined according to the priority of the matched voiceprint feature: the higher the priority, the larger the reduction, and the lower the priority, the smaller the reduction. Alternatively, the target wake-up threshold can be reduced by the same value regardless of the priority of the matched voiceprint feature.
For example, the preset priority range is priority 4 and priority 3, and if the priority of the matched voiceprint feature is within this range, the target wake-up threshold is lowered by 0.1. Suppose the voiceprint feature of the voice segment acquired by the electronic device matches the currently stored voiceprint feature A, whose priority is 4. The currently stored target wake-up threshold of 0.85 is then reduced by 0.1, and the adjusted target wake-up threshold is 0.75.
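The voiceprint-based adjustment can be sketched in Python as follows. The sketch assumes that voiceprint matching has already been performed elsewhere and has produced either an identifier of the matched stored voiceprint feature or `None`; the function name `adjust_for_voiceprint`, the dictionary layout, and the fixed reduction of 0.1 are illustrative assumptions built from the example values above.

```python
from typing import Dict, Optional, Set


def adjust_for_voiceprint(
    stored_threshold: float,
    matched_id: Optional[str],
    priorities: Dict[str, int],
    higher_priorities: Set[int],
    reduction: float = 0.1,
) -> float:
    """Lower the threshold only when the matched voiceprint has a higher priority.

    All names here are hypothetical; the matching step itself is not shown.
    """
    if matched_id is None:
        # No stored voiceprint matched, so the threshold stays as stored.
        return stored_threshold
    if priorities.get(matched_id) in higher_priorities:
        return stored_threshold - reduction
    return stored_threshold


# Example values from the text: A=4, B=3, C=2, with priorities 4 and 3 in range.
priorities = {"A": 4, "B": 3, "C": 2}
adjusted = adjust_for_voiceprint(0.85, "A", priorities, {4, 3})
```

A priority-dependent reduction, which the text also permits, could be obtained by replacing the fixed `reduction` with a lookup keyed on the matched priority.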
In order to further raise the wake-up rate and reduce the false wake-up rate of the smart device, the wake-up method of the smart device further includes:
if the smart device is woken up and the voiceprint feature of the voice segment matches any currently stored target voiceprint feature, updating the wake-up count corresponding to that target voiceprint feature, and adjusting the priority corresponding to that target voiceprint feature according to the updated wake-up count.
The priority can be preset, and in order to further improve the awakening rate, in the embodiment of the present invention, the priority of the target voiceprint feature can be determined according to the number of times that the intelligent device is awakened by any target voiceprint feature. If the number of times that a certain target voiceprint feature wakes up the intelligent device is more, the priority of the target voiceprint feature is higher; if the number of times that the target voiceprint feature wakes up the smart device is smaller, the priority of the target voiceprint feature is lower.
For example, if voiceprint feature A, voiceprint feature B, and voiceprint feature C have woken up the smart device 16, 15, and 21 times respectively, the three voiceprint features rank, from highest to lowest priority, as voiceprint feature C, voiceprint feature A, and voiceprint feature B. Therefore, each time the smart device is woken up, the electronic device judges whether the voiceprint feature of the voice segment matches any currently stored target voiceprint feature and, if so, updates the wake-up count corresponding to that target voiceprint feature. After the update, all currently stored target voiceprint features are sorted by wake-up count, and the priority corresponding to each target voiceprint feature is adjusted accordingly.
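The count update and re-ranking can be sketched in Python as follows, using the counts from the example. The names `rank_voiceprints` and `record_wake` and the use of a plain dictionary keyed by a voiceprint identifier are assumptions made for illustration only.

```python
from typing import Dict, List


def rank_voiceprints(wake_counts: Dict[str, int]) -> List[str]:
    """Return voiceprint identifiers ordered from most to fewest wake-ups."""
    return sorted(wake_counts, key=lambda vid: wake_counts[vid], reverse=True)


def record_wake(wake_counts: Dict[str, int], matched_id: str) -> List[str]:
    """Increment the count of a matched stored voiceprint, then re-rank.

    Unknown identifiers are ignored, mirroring the requirement that the
    voiceprint must match a currently stored target voiceprint feature.
    """
    if matched_id in wake_counts:
        wake_counts[matched_id] += 1
    return rank_voiceprints(wake_counts)


counts = {"A": 16, "B": 15, "C": 21}
initial_ranking = rank_voiceprints(counts)  # C first, then A, then B
```

How ties between equal counts should be broken is not specified in the text; this sketch simply keeps the dictionary's existing order for tied entries.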
To determine the priority corresponding to each wake-up count, the range of counts corresponding to each priority may be set in advance. In addition, because the smart device keeps receiving voice information, the wake-up count corresponding to each target voiceprint feature keeps changing; the count range corresponding to each priority can therefore be determined according to the total number of wake-ups and the number of preset priorities.
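The text does not specify how the count ranges are derived from the total number of wake-ups and the number of priorities. One possible scheme, shown here purely as an assumption, divides the interval from zero to the total into equal-width ranges, one per priority. The function name `priority_for_count` and the equal-width rule are hypothetical.

```python
def priority_for_count(wake_count: int, total_wakes: int, num_priorities: int) -> int:
    """Map a wake-up count to a priority from 1 (lowest) to num_priorities (highest).

    Assumed rule: split [0, total_wakes] into num_priorities equal-width ranges.
    """
    if num_priorities < 1:
        raise ValueError("num_priorities must be at least 1")
    if total_wakes <= 0:
        # No wake-ups recorded yet, so every voiceprint gets the lowest priority.
        return 1
    # Integer arithmetic avoids floating-point edge effects at range boundaries.
    bucket = wake_count * num_priorities // total_wakes
    # Clamp so that a count equal to the total falls into the top priority.
    return min(num_priorities, max(1, bucket + 1))
```

Because the ranges scale with the running total, the same absolute count can map to a lower priority later on, which matches the observation that the counts keep changing as more voice information is received.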
In order to further accurately determine whether to wake up the smart device, if the attribute feature of the voice segment is the signal energy of the voice segment, and if the attribute feature of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold value, and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold value, reducing the size of the currently stored target awakening threshold value; otherwise, increasing the size of the currently stored target awakening threshold.
In normal voice information, the signal energies of adjacent voice frames are correlated and influence each other. In the embodiment of the present invention, the signal energy of the voice segment can therefore be used to determine whether the voice segment is influenced by other voice segments in the voice information to be processed, and hence whether the voice information to be processed contains only the voice segment of the wake-up word.
The voice information to be processed acquired by the electronic device may or may not contain the wake-up word. When it does contain the wake-up word, it may consist only of the voice segment of the wake-up word, or it may contain other speech as well, in which case the voice segment containing the wake-up word is part of a conversation.
Generally, voice information that contains only the voice segment of the wake-up word is more likely to be wake-up voice information, while voice information that contains other speech in addition to the wake-up word is more likely to be non-wake-up voice information. Therefore, in the embodiment of the present invention, an energy detection method may be used to identify whether the voice information to be processed contains only the voice segment of the wake-up word.
When the voice information to be processed contains only the voice segment of the wake-up word, no other voice segment can affect the signal energy of that voice segment, so the energy at both its starting endpoint and its ending endpoint is weak. When the voice information to be processed also contains other speech, that is, when the voice segment containing the wake-up word is part of a conversation, the signal energy of the voice segment is affected by the other voice segments, so the energy at its starting endpoint and/or ending endpoint is strong.
Based on this, whether to adjust the target wake-up threshold can be determined according to the signal energy of the voice segment. In the embodiment of the present invention, a first energy threshold and a second energy threshold may be preset. It is then judged whether the signal energy at the starting endpoint of the voice segment is less than the first energy threshold and whether the signal energy at the ending endpoint of the voice segment is less than the second energy threshold, so as to determine how to adjust the currently stored target wake-up threshold.
In a specific implementation, if the signal energy at the starting endpoint of the voice segment is less than the set first energy threshold and the signal energy at the ending endpoint of the voice segment is less than the set second energy threshold, the voice information to be processed contains only the voice segment of the wake-up word and is very likely to be wake-up voice information, so the currently stored target wake-up threshold is reduced. Otherwise, the voice information to be processed also contains other voice segments, that is, the voice segment containing the wake-up word is part of a conversation, and the voice information to be processed is likely to be non-wake-up voice information, so the currently stored target wake-up threshold is increased.
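The energy-based rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the default step sizes, and the clamping of the threshold to the range [0, 1] are hypothetical and are not specified in the text.

```python
def adjust_threshold_by_energy(threshold, start_energy, end_energy,
                               first_energy_threshold, second_energy_threshold,
                               step_down=0.05, step_up=0.05):
    """Return the adjusted target wake-up threshold.

    Assumption: similarity scores and thresholds lie in [0, 1];
    step_down and step_up are illustrative values only.
    """
    # Weak energy at BOTH endpoints suggests the wake-up word was spoken alone.
    spoken_alone = (start_energy < first_energy_threshold
                    and end_energy < second_energy_threshold)
    if spoken_alone:
        # Likely wake-up voice information: make waking easier.
        return max(0.0, threshold - step_down)
    # Likely part of a conversation: make waking harder.
    return min(1.0, threshold + step_up)
```

For example, with both endpoint energies below their thresholds the function lowers a threshold of 0.70 to about 0.65; if either endpoint energy is at or above its threshold, it raises the threshold to about 0.75.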
In order to determine the first energy threshold and the second energy threshold, in an embodiment of the present invention, a large number of voice segments containing the wake-up word may be analyzed. The signal energy at the starting endpoint and the ending endpoint of each voice segment is obtained according to the following formula, and the first energy threshold and the second energy threshold are determined from the results:
$$E_n = \sum_{m=-\infty}^{\infty} \left[ x(m)\, W(n-m) \right]^2$$
where x(m) is the signal of the m-th frame, n is the n-th time instant, W(n-m) is the window function sequence, and E_n is the short-time energy of the signal when the window function is applied starting at the n-th time instant. Obtaining the signal energy at the starting endpoint and the ending endpoint of a voice segment by the above formula belongs to the prior art and is not described herein again.
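The short-time energy described by these symbols can be sketched in Python as follows, assuming the standard definition in which E_n is the sum over m of [x(m) W(n-m)] squared and the window is nonzero only for 0 <= n-m < N. The function names and the way the two endpoint positions are chosen are assumptions made for this illustration, not details given in the text.

```python
def short_time_energy(samples, n, window):
    """E_n = sum over m of (x(m) * W(n - m)) ** 2.

    `window` holds W(0), W(1), ..., W(N-1); W is taken to be zero elsewhere.
    """
    energy = 0.0
    for k, w in enumerate(window):  # k plays the role of n - m
        m = n - k
        if 0 <= m < len(samples):   # samples outside the segment contribute nothing
            energy += (samples[m] * w) ** 2
    return energy


def endpoint_energies(samples, window):
    """Return (energy at starting endpoint, energy at ending endpoint).

    Assumption: the starting endpoint is covered by the first full window
    and the ending endpoint by the window that ends at the last sample.
    """
    first_n = min(len(window), len(samples)) - 1
    last_n = len(samples) - 1
    return (short_time_energy(samples, first_n, window),
            short_time_energy(samples, last_n, window))
```

With a rectangular window `[1.0, 1.0]` and samples `[1, 2, 3]`, the starting-endpoint energy is 1 + 4 = 5 and the ending-endpoint energy is 4 + 9 = 13. Collecting these two values over many recorded wake-up word segments is one way the two energy thresholds could be chosen.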
The amount by which the target wake-up threshold is raised can be set to different values for different scenarios. If the false wake-up rate of the smart device is to be reduced as much as possible, the target wake-up threshold can be raised by a large amount; if wake-up voice information is not to be mistakenly detected as non-wake-up voice information, the target wake-up threshold can be raised by only a small amount.
Correspondingly, the amount by which the target wake-up threshold is lowered can also be set to different values for different scenarios. If the wake-up rate of the smart device is to be improved as much as possible, the target wake-up threshold can be lowered by a large amount; if non-wake-up voice information is not to be mistakenly detected as wake-up voice information, the target wake-up threshold can be lowered by only a small amount.
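One way to express this scenario-dependent choice is a small table of step sizes. The sketch below is hypothetical: the profile names and the numeric values are invented for illustration and do not come from the text.

```python
# Hypothetical step-size profiles; names and values are illustrative only.
STEP_PROFILES = {
    # Prefer few false wake-ups: raise a lot, lower only a little.
    "minimize_false_wakes": {"raise_by": 0.10, "lower_by": 0.02},
    # Prefer a high wake-up rate: raise only a little, lower a lot.
    "maximize_wake_rate": {"raise_by": 0.02, "lower_by": 0.10},
}


def step_sizes(profile):
    """Return (raise_by, lower_by) for the named scenario profile."""
    steps = STEP_PROFILES[profile]
    return steps["raise_by"], steps["lower_by"]
```

A device deployed in a noisy living room might use the first profile, while a device in a quiet single-user setting might use the second.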
According to the embodiment of the invention, the target wake-up threshold is adjusted according to whether the attribute characteristics of the voice segment meet the preset threshold adjustment condition. This makes it less likely that the smart device is woken by non-wake-up voice information and makes it easier for probable wake-up voice information to wake the smart device, thereby reducing the false wake-up rate and improving the wake-up rate of the smart device.
Example 3:
In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, on the basis of the above embodiments, in the embodiment of the present invention, determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold includes:
judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold;
if so, determining to wake up the smart device;
otherwise, determining not to wake up the smart device.
Based on the above embodiment, after it is determined whether the target wake-up threshold needs to be adjusted, the target wake-up threshold used to decide whether to wake up the smart device is known. Whether to wake up the smart device is then determined according to this target wake-up threshold and the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model.
If the target wake-up threshold has been adjusted, the target wake-up threshold stored in the electronic device is the adjusted target wake-up threshold. In the embodiment of the invention, after the adjusted target wake-up threshold is obtained, it is judged whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold, so as to determine whether to wake up the smart device.
If the similarity between the voice segment acquired through the locally stored awakening word semantic model and the awakening word is larger than the currently stored target awakening threshold value, which indicates that the voice information to be processed is awakening voice information, the intelligent equipment is determined to be awakened; and if the similarity between the voice segment acquired through the locally stored awakening word semantic model and the awakening word is not greater than the currently stored target awakening threshold value, which indicates that the voice information to be processed is non-awakening voice information, determining not to awaken the intelligent equipment.
If the electronic device that controls the wake-up of the smart device is the smart device itself, the smart device wakes up directly when the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the adjusted target wake-up threshold, and does not wake up when the similarity is not greater than the adjusted target wake-up threshold.
If the electronic device that controls the wake-up of the smart device is another device, then when the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the adjusted target wake-up threshold, the other device determines to wake up the smart device and sends a control instruction to the smart device so as to wake it up; when the similarity is not greater than the adjusted target wake-up threshold, the other device determines not to wake up the smart device.
Obtaining the similarity between the voice segment and the wake-up word through the wake-up word semantic model belongs to the prior art and is not described herein again.
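The comparison and the two controller cases described above can be sketched as follows. The function names, the returned status strings, and the `send_wake_command` callback are hypothetical; the similarity score is assumed to have been computed elsewhere by the wake-up word semantic model.

```python
def should_wake(similarity, target_threshold):
    """Wake only when the similarity is strictly greater than the threshold."""
    return similarity > target_threshold


def handle_decision(similarity, target_threshold,
                    controller_is_smart_device, send_wake_command):
    """Apply the wake decision for either kind of controlling device.

    `send_wake_command` is a hypothetical callback that delivers the
    control instruction to the smart device when another device decides.
    """
    if not should_wake(similarity, target_threshold):
        return "no_wake"
    if controller_is_smart_device:
        return "wake_self"            # the smart device wakes up directly
    send_wake_command()               # another device instructs the smart device
    return "wake_command_sent"
```

Note that a similarity exactly equal to the threshold does not wake the device, which matches the "not greater than" wording of the text.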
In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, if the attribute characteristics of the voice segment do not meet the preset threshold adjustment condition, the method further includes:
judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold;
if so, determining to wake up the smart device;
otherwise, determining not to wake up the smart device.
If the target wake-up threshold is not adjusted, the target wake-up threshold stored in the electronic device is the unadjusted target wake-up threshold, which may be a preset threshold. After it is determined that the target wake-up threshold is not adjusted, it is judged whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold, so as to determine whether to wake up the smart device.
If the similarity between the voice segment acquired through the locally stored awakening word semantic model and the awakening word is larger than the currently stored target awakening threshold value, which indicates that the voice information to be processed is awakening voice information, the intelligent equipment is determined to be awakened; and if the similarity between the voice segment acquired through the locally stored awakening word semantic model and the awakening word is not greater than the currently stored target awakening threshold value, which indicates that the voice information to be processed is non-awakening voice information, determining not to awaken the intelligent equipment.
If the electronic device that controls the wake-up of the smart device is the smart device itself, the smart device wakes up directly when the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the currently stored target wake-up threshold, and does not wake up when the similarity is not greater than the currently stored target wake-up threshold.
If the electronic device that controls the wake-up of the smart device is another device, then when the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the currently stored target wake-up threshold, the other device determines to wake up the smart device and sends a control instruction to the smart device so as to wake it up; when the similarity is not greater than the currently stored target wake-up threshold, the other device determines not to wake up the smart device.
Obtaining the similarity between the voice segment and the wake-up word through the wake-up word semantic model belongs to the prior art and is not described herein again.
Example 4:
Fig. 3 is a schematic diagram of an implementation flow of a specific wake-up method for a smart device according to an embodiment of the present invention. The execution subject is the smart device, and the attribute characteristic of the voice segment is the signal energy of the voice segment. The flow is described in detail below:
S301: the smart device acquires a voice segment containing a wake-up word in the voice information to be processed.
S302: the intelligent device judges whether the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold value or not and whether the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold value or not, if so, S303 is executed, and if not, S304 is executed.
S303: the smart device reduces the size of the currently saved target wake-up threshold and then executes S305.
S304: the smart device raises the size of the currently saved target wake-up threshold and then executes S305.
S305: the smart device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold; if so, S306 is executed, otherwise S307 is executed.
S306: the intelligent device wakes up directly.
S307: the smart device does not wake up.
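Steps S302 to S307 can be traced with a short Python sketch. The function name, the single `step` parameter, and the example values are assumptions for illustration; acquiring the voice segment (S301) and computing the similarity score are taken as given inputs.

```python
def fig3_flow(similarity, start_energy, end_energy, threshold,
              first_energy_threshold, second_energy_threshold, step=0.05):
    """Return (woke, adjusted_threshold) for one voice segment."""
    # S302: are both endpoint energies below their thresholds?
    if (start_energy < first_energy_threshold
            and end_energy < second_energy_threshold):
        threshold -= step   # S303: reduce the target wake-up threshold
    else:
        threshold += step   # S304: raise the target wake-up threshold
    # S305: compare the similarity against the adjusted threshold
    woke = similarity > threshold   # S306 if True, S307 if False
    return woke, threshold
```

With a stored threshold of 0.70 and a step of 0.05, a segment with similarity 0.68 and weak endpoint energies wakes the device (threshold lowered to about 0.65), while a segment with similarity 0.72 and a strong starting endpoint does not (threshold raised to about 0.75).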
Fig. 4 is a schematic diagram of an implementation flow of a specific wake-up method for a smart device according to an embodiment of the present invention. The execution subject is another device, and the attribute characteristic of the voice segment is the voiceprint feature of the voice segment. The flow is described in detail below:
S401: the other device acquires a voice segment containing a wake-up word in the voice information to be processed.
S402: the other device judges whether the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority; if so, S403 is executed, otherwise S404 is executed.
S403: the other device reduces the size of the currently saved target wake-up threshold and then executes S405.
S404: the other device determines that the voiceprint feature of the voice segment does not meet the preset threshold adjustment condition, does not adjust the currently stored target wake-up threshold, and then executes S406.
S405: the other device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold; if so, S407 is executed, otherwise S408 is executed.
S406: the other device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold; if so, S407 is executed, otherwise S408 is executed.
S407: the other devices determine to wake up the smart device and send a wake-up command to the smart device to control the smart device to wake up, and then execute S409.
S408: the other devices determine not to wake up the smart device.
S409: the other device judges whether the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features; if so, S410 is executed, otherwise S411 is executed.
S410: the other device updates the number of wake-ups corresponding to the target voiceprint feature, and adjusts the priority corresponding to the target voiceprint feature according to the updated number of wake-ups.
S411: other devices save the voiceprint characteristics of the speech segment as target voiceprint characteristics.
Fig. 5 is a schematic diagram of an implementation flow of a specific wake-up method for an intelligent device according to an embodiment of the present invention, and as shown in fig. 5, after acquiring voice information to be processed and extracting a voice segment containing a wake-up word from the voice information to be processed, an electronic device may determine whether to adjust a target wake-up threshold value by determining whether voiceprint characteristics of the voice segment satisfy a preset threshold adjustment condition, or determine whether to adjust the target wake-up threshold value by determining whether signal energy of the voice segment satisfies the preset threshold adjustment condition. And after the adjusted target awakening threshold value is obtained, judging whether the similarity between the voice segment obtained through the awakening word semantic model and the awakening word is greater than the adjusted target awakening threshold value or not, and thus determining whether the intelligent equipment is awakened or not.
Example 5:
Fig. 6 is a schematic structural diagram of a wake-up apparatus of a smart device according to an embodiment of the present invention, where the wake-up apparatus includes:
an acquiring unit 61, configured to acquire a voice segment containing a wake-up word in the voice information to be processed;
an adjusting unit 62, configured to adjust the size of a currently stored target wake-up threshold if the attribute characteristics of the voice segment meet a preset threshold adjustment condition;
a determining unit 63, configured to determine whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold.
The determining unit 63 is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
In another possible implementation manner, the determining unit 63 is further configured to determine, if the attribute characteristics of the speech segment do not meet a preset threshold adjustment condition, whether a similarity between the speech segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
In a possible embodiment, the adjusting unit 62 is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.
Further, the adjusting unit 62 is further configured to update the wake-up times corresponding to the target voiceprint feature if the smart device is woken up and the voiceprint feature of the voice segment is matched with any one of the currently stored target voiceprint features, and adjust the priority corresponding to the target voiceprint feature according to the updated wake-up times.
In another possible embodiment, the adjusting unit 62 is specifically configured to decrease the size of the currently saved target wake-up threshold when the attribute characteristic of a speech segment is the signal energy of the speech segment, and when the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.
According to the embodiment of the invention, if the attribute characteristics of the voice section meet the preset threshold adjustment condition, the size of the currently stored target awakening threshold is adjusted, and whether the intelligent equipment is awakened or not is determined according to the adjusted target awakening threshold. The target awakening threshold of the intelligent device can be adjusted according to whether the voice section meets the threshold adjusting condition or not, so that the situation that the intelligent device cannot be awakened by the voice information which possibly serves as awakening voice information is avoided, the awakening rate of the intelligent device is improved, meanwhile, the situation that the voice information which possibly serves as non-awakening voice information is mistakenly detected as the awakening voice information to awaken the intelligent device is avoided, and the mistaken awakening rate of the intelligent device is reduced.
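The division of the apparatus into an acquiring unit, an adjusting unit, and a determining unit can be mirrored in software. The sketch below is one possible arrangement, not the apparatus itself: the class name, the injected callables, and the default threshold of 0.7 are hypothetical, and the wake-up word detector and semantic model are represented only by placeholder functions supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class WakeApparatus:
    # Role of acquiring unit 61: return the wake-up word segment, or None.
    acquire: Callable[[Sequence[float]], Optional[Sequence[float]]]
    # Role of adjusting unit 62: return the (possibly adjusted) threshold.
    adjust: Callable[[Sequence[float], float], float]
    # Stand-in for the wake-up word semantic model used by determining unit 63.
    similarity: Callable[[Sequence[float]], float]
    target_threshold: float = 0.7  # illustrative preset value

    def process(self, audio: Sequence[float]) -> bool:
        """Return True if the smart device should be woken."""
        segment = self.acquire(audio)
        if segment is None:
            return False
        # The adjusted value replaces the currently stored threshold.
        self.target_threshold = self.adjust(segment, self.target_threshold)
        return self.similarity(segment) > self.target_threshold
```

Because this sketch writes the adjusted value back as the stored threshold, adjustments accumulate across calls; an implementation that prefers a fixed baseline could instead compute the adjusted value per segment without saving it.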
Example 6:
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, and on the basis of the foregoing embodiments, an electronic device according to an embodiment of the present invention further includes a processor 71 and a memory 72;
the processor 71 is adapted to carry out the steps of the wake-up method of the smart device described above when executing the computer program stored in the memory 72.
Alternatively, the processor 71 may be a CPU (central processing unit), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or a CPLD (Complex Programmable Logic Device).
The processor 71 is configured to perform the following steps according to the computer program stored in the memory 72:
acquiring a voice section containing a wakeup word in voice information to be processed;
if the attribute characteristics of the voice section meet the preset threshold adjustment condition, adjusting the size of a currently stored target awakening threshold;
and determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.
The processor 71 is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
In another possible implementation manner, the processor 71 is further configured to determine, if the attribute characteristics of the voice segment do not meet a preset threshold adjustment condition, whether a similarity between the voice segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
In a possible embodiment, the processor 71 is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.
Further, the processor 71 is further configured to update the wake-up times corresponding to the target voiceprint feature if the smart device is woken up and the voiceprint feature of the voice segment is matched with any one of the currently stored target voiceprint features, and adjust the priority corresponding to the target voiceprint feature according to the updated wake-up times.
In another possible embodiment, the processor 71 is specifically configured to decrease the size of the currently saved target wake-up threshold when the attribute characteristic of a speech segment is the signal energy of the speech segment, and when the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.
According to the embodiment of the invention, if the attribute characteristics of the voice section meet the preset threshold adjustment condition, the size of the currently stored target awakening threshold is adjusted, and whether the intelligent equipment is awakened or not is determined according to the adjusted target awakening threshold. The target awakening threshold of the intelligent device can be adjusted according to whether the voice section meets the threshold adjusting condition or not, so that the situation that the intelligent device cannot be awakened by the voice information which possibly serves as awakening voice information is avoided, the awakening rate of the intelligent device is improved, meanwhile, the situation that the voice information which possibly serves as non-awakening voice information is mistakenly detected as the awakening voice information to awaken the intelligent device is avoided, and the mistaken awakening rate of the intelligent device is reduced.
Example 8:
on the basis of the foregoing embodiments, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program executable by an electronic device is stored, and when the program is run on the electronic device, the electronic device is caused to execute the following steps:
acquiring a voice section containing a wakeup word in voice information to be processed;
if the attribute characteristics of the voice section meet the preset threshold adjustment condition, adjusting the size of a currently stored target awakening threshold;
and determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.
In a possible implementation manner, the determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold includes:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the adjusted target awakening threshold value or not;
if yes, the intelligent equipment is confirmed to be awakened;
otherwise, the intelligent device is determined not to be awakened.
In another possible implementation manner, if the attribute feature of the speech segment does not satisfy the preset threshold adjustment condition, the method further includes:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the currently stored target awakening threshold value or not;
if yes, the intelligent equipment is confirmed to be awakened;
otherwise, the intelligent device is determined not to be awakened.
Specifically, the attribute feature of the speech segment is a voiceprint feature of the speech segment, or a signal energy of the speech segment.
In a possible implementation manner, if the attribute feature of a speech segment is a voiceprint feature of the speech segment, and if the attribute feature of the speech segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
and if the voiceprint features of the voice section are matched with the currently stored voiceprint features with higher priority, reducing the size of the currently stored target awakening threshold.
Further, the method further comprises:
and if the intelligent equipment is awakened and the voiceprint characteristics of the voice section are matched with any one of the currently stored target voiceprint characteristics, updating the awakening times corresponding to the target voiceprint characteristics, and adjusting the priority corresponding to the target voiceprint characteristics according to the updated awakening times.
In another possible implementation manner, if the attribute feature of a speech segment is the signal energy of the speech segment, and if the attribute feature of the speech segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold value, and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold value, reducing the size of the currently stored target awakening threshold value; otherwise, increasing the size of the currently stored target awakening threshold.
According to the embodiment of the invention, if the attribute characteristics of the voice section meet the preset threshold adjustment condition, the size of the currently stored target awakening threshold is adjusted, and whether the intelligent equipment is awakened or not is determined according to the adjusted target awakening threshold. The target awakening threshold of the intelligent device can be adjusted according to whether the voice section meets the threshold adjusting condition or not, so that the situation that the intelligent device cannot be awakened by the voice information which possibly serves as awakening voice information is avoided, the awakening rate of the intelligent device is improved, meanwhile, the situation that the voice information which possibly serves as non-awakening voice information is mistakenly detected as the awakening voice information to awaken the intelligent device is avoided, and the mistaken awakening rate of the intelligent device is reduced.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.
Claims (13)
1. A wake-up method of a smart device, the method comprising:
acquiring a voice section containing a wakeup word in voice information to be processed;
if the attribute characteristics of the voice section meet the preset threshold adjustment condition, adjusting the size of a currently stored target awakening threshold;
determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word obtained through the awakening word semantic model and the adjusted target awakening threshold;
wherein, if the attribute feature of the voice segment is the signal energy of the voice segment, the adjusting of the size of the currently stored target awakening threshold if the attribute feature of the voice segment meets the preset threshold adjustment condition comprises:
if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold value, and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold value, reducing the size of the currently stored target awakening threshold value; otherwise, increasing the size of the currently stored target awakening threshold.
2. The method according to claim 1, wherein the determining whether to wake up the smart device according to the similarity between the voice segments and the wake-up words obtained through the semantic model of the wake-up words and the adjusted target wake-up threshold comprises:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the adjusted target awakening threshold value or not;
if so, determining to awaken the intelligent equipment;
otherwise, determining not to awaken the intelligent equipment.
3. The method according to claim 1, wherein if the property feature of the speech segment does not satisfy the predetermined threshold adjustment condition, the method further comprises:
judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the currently stored target awakening threshold value or not;
if so, determining to awaken the intelligent equipment;
otherwise, determining not to awaken the intelligent equipment.
4. A method according to any one of claims 1 to 3, wherein the attribute feature of the voice segment is a voiceprint feature of the voice segment, or the signal energy of the voice segment.
5. A method according to any one of claims 1 to 3, wherein if the attribute feature of a speech segment is a voiceprint feature of the speech segment, and if the attribute feature of the speech segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:
if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, reducing the size of the currently stored target awakening threshold.
6. The method of claim 5, further comprising:
if the intelligent equipment is awakened and the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features, updating the wake-up count corresponding to that target voiceprint feature, and adjusting the priority corresponding to that target voiceprint feature according to the updated wake-up count.
7. A wake-up apparatus of a smart device, the apparatus comprising:
an acquisition unit, configured to acquire a voice segment containing a wakeup word in voice information to be processed;
the adjusting unit is used for adjusting the size of the currently stored target awakening threshold value if the attribute characteristics of the voice section meet the preset threshold value adjusting condition;
the determining unit is used for determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold;
the adjusting unit is specifically configured to reduce the size of the currently stored target wake-up threshold if the attribute feature of a voice segment is the signal energy of the voice segment, and if the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.
8. The apparatus according to claim 7, wherein the determining unit is specifically configured to determine whether a similarity between the speech segment obtained through the wakeup word semantic model and a wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
9. The apparatus according to claim 7, wherein the determining unit is further configured to, if the attribute feature of the speech segment does not satisfy a preset threshold adjustment condition, determine whether a similarity between the speech segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.
10. The apparatus according to any one of claims 7 to 9, wherein the adjusting unit is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.
11. The apparatus according to claim 10, wherein the adjusting unit is further configured to update the wake-up times corresponding to the target voiceprint feature if the smart device is woken up and the voiceprint feature of the speech segment matches any one of the currently stored target voiceprint features, and adjust the priority corresponding to the target voiceprint feature according to the updated wake-up times.
12. An electronic device, characterized in that the electronic device comprises a processor for implementing the steps of the wake-up method of the smart device according to any of claims 1-6 when executing a computer program stored in a memory.
13. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the wake-up method of a smart device according to any one of claims 1-6.
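The voiceprint branch of the claims (lowering the threshold for a higher-priority stored voiceprint, then updating the wake-up count and priority after a successful wake-up) can be sketched as follows. This is a hedged illustration, not the claimed implementation: matching by cosine similarity, the `match_threshold`, the `high_priority` level, the adjustment `step`, and deriving priority as `wake_count // wakes_per_level` are all assumptions made here, because the claims do not specify how voiceprints are compared or how the wake-up count maps to priority.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class VoiceprintRecord:
    """A stored target voiceprint (hypothetical structure)."""
    embedding: List[float]
    wake_count: int = 0
    priority: int = 0


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_match(embedding, records, match_threshold=0.8) -> Optional[VoiceprintRecord]:
    """Return the best-matching stored voiceprint, or None if none is close enough."""
    best, best_score = None, match_threshold
    for record in records:
        score = cosine_similarity(embedding, record.embedding)
        if score >= best_score:
            best, best_score = record, score
    return best


def adjust_threshold_by_voiceprint(target_threshold, embedding, records,
                                   high_priority=1, step=0.05):
    """Lower the threshold only when the speaker matches a higher-priority voiceprint."""
    match = find_match(embedding, records)
    if match is not None and match.priority >= high_priority:
        return max(0.0, target_threshold - step)
    return target_threshold


def record_wake(embedding, records, wakes_per_level=10):
    """After a successful wake-up, update the matched voiceprint's count and priority."""
    match = find_match(embedding, records)
    if match is not None:
        match.wake_count += 1
        match.priority = match.wake_count // wakes_per_level  # assumed mapping
    return match


if __name__ == "__main__":
    stored = [VoiceprintRecord(embedding=[1.0, 0.0], wake_count=9)]
    speaker = [1.0, 0.0]
    print(adjust_threshold_by_voiceprint(0.6, speaker, stored))  # unchanged: priority 0
    record_wake(speaker, stored)                                 # tenth wake-up
    print(adjust_threshold_by_voiceprint(0.6, speaker, stored))  # lowered: priority 1
```

Under these assumptions, a speaker who wakes the device often gains priority over time, so later utterances from that speaker are checked against a lower threshold, while unknown speakers keep the stored threshold.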
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911236140.8A CN111128155B (en) | 2019-12-05 | 2019-12-05 | Awakening method, device, equipment and medium for intelligent equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911236140.8A CN111128155B (en) | 2019-12-05 | 2019-12-05 | Awakening method, device, equipment and medium for intelligent equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111128155A (en) | 2020-05-08 |
CN111128155B (en) | 2020-12-01 |
Family
ID=70497610
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911236140.8A Active CN111128155B (en) | 2019-12-05 | 2019-12-05 | Awakening method, device, equipment and medium for intelligent equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111128155B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111816178B (en) * | 2020-07-07 | 2024-09-06 | 云知声智能科技股份有限公司 | Control method, device and equipment of voice equipment |
CN111949323B (en) * | 2020-08-31 | 2024-06-11 | 深圳市欧瑞博科技股份有限公司 | Optimization method and device for waking up intelligent equipment, intelligent equipment and storage medium |
CN113628620A (en) * | 2021-08-12 | 2021-11-09 | 云知声(上海)智能科技有限公司 | A smart device wake-up method, device, electronic device and storage medium |
CN113921014A (en) * | 2021-10-11 | 2022-01-11 | 云知声(上海)智能科技有限公司 | Intelligent device voice broadcast interruption prevention method and system, storage medium and terminal |
CN114141233A (en) * | 2021-12-08 | 2022-03-04 | 科大讯飞股份有限公司 | Voice awakening method and related equipment thereof |
WO2023240649A1 (en) * | 2022-06-17 | 2023-12-21 | 北京小米移动软件有限公司 | Method and apparatus for updating wake-up priority |
CN115472161A (en) * | 2022-07-27 | 2022-12-13 | 北京声智科技有限公司 | Voice wake-up method, device, device and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109378000A (en) * | 2018-12-19 | 2019-02-22 | 科大讯飞股份有限公司 | Voice awakening method, device, system, equipment, server and storage medium |
CN109473092A (en) * | 2018-12-03 | 2019-03-15 | 珠海格力电器股份有限公司 | Voice endpoint detection method and device |
EP3540730A1 (en) * | 2018-03-16 | 2019-09-18 | Wistron Corporation | Speech service control apparatus and method thereof |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9354687B2 (en) * | 2014-09-11 | 2016-05-31 | Nuance Communications, Inc. | Methods and apparatus for unsupervised wakeup with time-correlated acoustic events |
CN106653010B (en) * | 2015-11-03 | 2020-07-24 | 络达科技股份有限公司 | Electronic device and method for waking up through voice recognition |
US10573329B2 (en) * | 2017-05-31 | 2020-02-25 | Dell Products L.P. | High frequency injection for improved false acceptance reduction |
CN108320733B (en) * | 2017-12-18 | 2022-01-04 | 上海科大讯飞信息科技有限公司 | Voice data processing method and device, storage medium and electronic equipment |
CN108735209B (en) * | 2018-04-28 | 2021-01-08 | 广东美的制冷设备有限公司 | Wake-up word binding method, intelligent device and storage medium |
CN108847219B (en) * | 2018-05-25 | 2020-12-25 | 台州智奥通信设备有限公司 | Awakening word preset confidence threshold adjusting method and system |
CN109920418B (en) * | 2019-02-20 | 2021-06-22 | 北京小米移动软件有限公司 | Method and device for adjusting awakening sensitivity |
CN110047493A (en) * | 2019-03-13 | 2019-07-23 | 深圳市酷开网络科技有限公司 | Control method, device and storage medium based on Application on Voiceprint Recognition priority |
CN110428810B (en) * | 2019-08-30 | 2020-10-30 | 北京声智科技有限公司 | Voice wake-up recognition method and device and electronic equipment |
Non-Patent Citations (2)
Title |
---|
Këpuska, V. Z. A novel wake-up-word speech recognition system, wake-up-word recognition task, technology and evaluation.《Nonlinear Analysis: Theory, Methods & Applications》.2009, * |
面向语音环境的情感补偿推荐模型及方法研究;张希翔;《中国优秀博士学位论文全文数据库信息科技辑》;20180630;I138-129 * |
Also Published As
Publication number | Publication date |
---|---|
CN111128155A (en) | 2020-05-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111128155B (en) | Awakening method, device, equipment and medium for intelligent equipment | |
CN105654949B (en) | A kind of voice awakening method and device | |
CN111968644B (en) | Intelligent device awakening method and device and electronic device | |
CN108509225B (en) | Information processing method and electronic equipment | |
CN111161728B (en) | Awakening method, awakening device, awakening equipment and awakening medium of intelligent equipment | |
CN111312222A (en) | Awakening and voice recognition model training method and device | |
CN106157950A (en) | Speech control system and awakening method, Rouser and household electrical appliances, coprocessor | |
CN113160815B (en) | Intelligent control method, device, equipment and storage medium for voice wakeup | |
CN105575395A (en) | Voice wake-up method and apparatus, terminal, and processing method thereof | |
CN112489648A (en) | Wake-up processing threshold adjustment method, voice home appliance, and storage medium | |
CN111949323B (en) | Optimization method and device for waking up intelligent equipment, intelligent equipment and storage medium | |
CN111429901B (en) | IoT chip-oriented multi-stage voice intelligent awakening method and system | |
CN110910878B (en) | Voice wake-up control method and device, storage medium and household appliance | |
CN111161714A (en) | Voice information processing method, electronic equipment and storage medium | |
CN103811014B (en) | Voice interference filtering method and voice interference filtering system | |
CN110895930B (en) | Voice recognition method and device | |
CN111081251B (en) | Voice wake-up method and device | |
CN111124512B (en) | Awakening method, device, equipment and medium for intelligent equipment | |
CN111739515B (en) | Speech recognition method, equipment, electronic equipment, server and related system | |
CN111179924A (en) | Method and system for optimizing awakening performance based on mode switching | |
CN111161745A (en) | Awakening method, device, equipment and medium for intelligent equipment | |
CN114220418A (en) | Awakening word recognition method and device for target speaker | |
CN111696555A (en) | Method and system for confirming awakening words | |
CN116386676A (en) | Voice awakening method, voice awakening device and storage medium | |
CN111028830A (en) | Local hot word bank updating method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||