CN111128155B

CN111128155B - Awakening method, device, equipment and medium for intelligent equipment

Info

Publication number: CN111128155B
Application number: CN201911236140.8A
Authority: CN
Inventors: 高杰
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2019-12-05
Filing date: 2019-12-05
Publication date: 2020-12-01
Anticipated expiration: 2039-12-05
Also published as: CN111128155A

Abstract

The invention discloses a wake-up method, apparatus, device and medium for an intelligent device, which address the low wake-up rate and high false wake-up rate of conventional wake-up methods. According to the embodiment of the invention, if the attribute features of the voice segment meet a preset threshold adjustment condition, the currently stored target wake-up threshold is adjusted, and whether to wake up the intelligent device is determined according to the adjusted target wake-up threshold. Because the target wake-up threshold can be adjusted according to whether the voice segment meets the threshold adjustment condition, the situation in which voice information that is likely to be wake-up voice information fails to wake the intelligent device is avoided, which improves the wake-up rate. At the same time, the situation in which voice information that is likely to be non-wake-up voice information is falsely detected as wake-up voice information and wakes the intelligent device is also avoided, which reduces the false wake-up rate.

Description

Awakening method, device, equipment and medium for intelligent equipment

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a method, an apparatus, a device, and a medium for waking up an intelligent device.

Background

In recent years, with the development of natural language processing technology, smart devices and voice interaction applications have gradually become widespread, and how to wake up smart devices efficiently and accurately has become a research hotspot.

In the prior art, waking up an intelligent device mainly depends on the accuracy of waveform coding analysis of the wake-up word. Fig. 1 is a schematic flow diagram of waking up an intelligent device in the prior art. As shown in fig. 1, after acquiring a voice segment containing the wake-up word from the voice information, the intelligent device obtains the similarity between the voice segment and the preset wake-up word through a wake-up word semantic similarity model and determines whether the similarity is greater than a set wake-up threshold. If so, the intelligent device is woken up; otherwise, it is not.

The wake-up threshold is the key to raising the wake-up rate and lowering the false wake-up rate, but the two goals conflict. To raise the wake-up rate, that is, to avoid falsely detecting wake-up voice information as non-wake-up voice information, the wake-up threshold of the intelligent device can be set lower. However, the intelligent device also collects non-wake-up voice information during people's normal conversations, and if that information contains the wake-up word, a lower threshold makes the device easy to wake by mistake.

Disclosure of Invention

The embodiment of the invention provides a method, a device, equipment and a medium for waking up intelligent equipment, which are used for solving the problems of low wake-up rate and high false wake-up rate caused by the existing method for waking up the intelligent equipment.

The embodiment of the invention provides a method for waking up intelligent equipment, which comprises the following steps:

acquiring a voice section containing a wakeup word in voice information to be processed;

if the attribute characteristics of the voice section meet the preset threshold adjustment condition, adjusting the size of a currently stored target awakening threshold;

and determining whether to awaken the intelligent equipment or not according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.

Further, the determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the semantic model of the wake-up word and the adjusted target wake-up threshold includes:

judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the adjusted target awakening threshold value or not;

if yes, the intelligent equipment is confirmed to be awakened;

otherwise, the intelligent device is determined not to be awakened.

Further, if the attribute characteristics of the voice segment do not satisfy the preset threshold adjustment condition, the method further includes:

judging whether the similarity between the voice segment acquired through the awakening word semantic model and the awakening word is greater than the currently stored target awakening threshold value or not;

if yes, the intelligent equipment is confirmed to be awakened;

otherwise, the intelligent device is determined not to be awakened.

Further, the attribute feature of the speech segment is a voiceprint feature of the speech segment or a signal energy of the speech segment.

Further, if the attribute feature of the voice segment is the voiceprint feature of the voice segment, and if the attribute feature of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:

and if the voiceprint features of the voice section are matched with the currently stored voiceprint features with higher priority, reducing the size of the currently stored target awakening threshold.

Further, the method further comprises:

and if the intelligent equipment is awakened and the voiceprint characteristics of the voice section are matched with any one of the currently stored target voiceprint characteristics, updating the awakening times corresponding to the target voiceprint characteristics, and adjusting the priority corresponding to the target voiceprint characteristics according to the updated awakening times.

Further, if the attribute characteristic of the voice segment is the signal energy of the voice segment, and if the attribute characteristic of the voice segment meets a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold includes:

if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold value, and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold value, reducing the size of the currently stored target awakening threshold value; otherwise, increasing the size of the currently stored target awakening threshold.

The embodiment of the invention provides a wake-up device of intelligent equipment, which comprises:

an acquisition unit, configured to acquire a voice segment containing a wake-up word in voice information to be processed;

the adjusting unit is used for adjusting the size of the currently stored target awakening threshold value if the attribute characteristics of the voice section meet the preset threshold value adjusting condition;

and the determining unit is used for determining whether to awaken the intelligent equipment according to the similarity between the voice segment and the awakening word acquired through the awakening word semantic model and the adjusted target awakening threshold.

Further, the determining unit is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

Further, the determining unit is further configured to, if the attribute characteristics of the voice segment do not satisfy a preset threshold adjustment condition, determine whether a similarity between the voice segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

Further, the adjusting unit is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, reduce the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.

Further, the adjusting unit is further configured to update the wake-up times corresponding to the target voiceprint feature if the smart device is woken up and the voiceprint feature of the voice segment is matched with any one of the currently stored target voiceprint features, and adjust the priority corresponding to the target voiceprint feature according to the updated wake-up times.

Further, the adjusting unit is specifically configured to reduce the size of the currently stored target wake-up threshold if the attribute feature of a speech segment is the signal energy of the speech segment, and if the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.

An embodiment of the present invention provides an electronic device, where the electronic device includes a processor, and the processor is configured to implement the steps of the method for waking up an intelligent device as described above when executing a computer program stored in a memory.

An embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for waking up an intelligent device as described above.

According to the embodiment of the invention, if the attribute characteristics of the voice section meet the preset threshold adjustment condition, the size of the currently stored target awakening threshold is adjusted, and whether the intelligent equipment is awakened or not is determined according to the adjusted target awakening threshold. The target awakening threshold of the intelligent device can be adjusted according to whether the voice section meets the threshold adjusting condition or not, so that the situation that the intelligent device cannot be awakened by the voice information which possibly serves as awakening voice information is avoided, the awakening rate of the intelligent device is improved, meanwhile, the situation that the voice information which possibly serves as non-awakening voice information is mistakenly detected as the awakening voice information to awaken the intelligent device is avoided, and the mistaken awakening rate of the intelligent device is reduced.

Drawings

FIG. 1 is a schematic flow chart illustrating a wake-up process of a smart device according to the prior art;

FIG. 2 is a schematic diagram of a wake-up process of an intelligent device according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for an intelligent device according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for an intelligent device according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart illustrating an implementation flow of a specific wake-up method for an intelligent device according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a wake-up apparatus of an intelligent device according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In order to improve the wake-up rate and reduce the false wake-up rate of the intelligent device, embodiments of the present invention provide a wake-up method, apparatus, device, and medium for the intelligent device.

Example 1:

fig. 2 is a schematic diagram of a wake-up process of an intelligent device according to an embodiment of the present invention, where the wake-up process includes the following steps:

S201: Acquire a voice segment containing the wake-up word in the voice information to be processed.

The wake-up method provided by the embodiment of the invention is applied to an electronic device. The electronic device may be the intelligent device to be woken up, or another device that performs wake-up recognition and controls the intelligent device to wake up. The intelligent device may be a mobile terminal, a smart speaker, a smart air conditioner or another smart home device; the other device may be a server, a mobile terminal or the like.

After the voice information to be processed is collected, the electronic device can perform recognition and semantic analysis processing on the voice information to be processed, acquire a voice segment containing the awakening word, and perform subsequent processing based on the acquired voice segment. If the voice segment containing the awakening word cannot be acquired from the voice information to be processed, the voice information can be directly ignored without subsequent processing.

It should be noted that, a specific method for detecting whether the collected voice information includes the voice segment of the wakeup word is the prior art, and is not described herein again.

S202: If the attribute features of the voice segment meet the preset threshold adjustment condition, adjust the currently stored target wake-up threshold.

In order to increase the wake-up rate and reduce the false wake-up rate, in the embodiment of the present invention, the target wake-up threshold of the smart device may be adjusted. In order to determine whether to adjust the target wake-up threshold, a threshold adjustment condition may be preset, and if the attribute characteristics of the voice segment satisfy the preset threshold adjustment condition, it is determined to adjust the target wake-up threshold, and if the attribute characteristics of the voice segment do not satisfy the preset threshold adjustment condition, it is determined not to adjust the currently stored target wake-up threshold.

The preset threshold adjustment condition may be, for example, that the voice information to be processed includes a plurality of consecutive voice segments containing the wake-up word: when the electronic device detects at least two such voice segments, it can determine that the preset threshold adjustment condition is met and adjust the currently stored target wake-up threshold.

When the attribute features of the voice segment meet the preset threshold adjustment condition, the target wake-up threshold may be either increased or decreased; the direction is determined by the specific threshold adjustment condition that the attribute features meet.

S203: Determine whether to wake up the intelligent device according to the similarity between the voice segment and the wake-up word, obtained through the wake-up word semantic model, and the adjusted target wake-up threshold.

In the embodiment of the invention, the electronic device stores the wake-up word semantic model, through which the similarity between an input voice segment and the wake-up word can be obtained. After the adjusted target wake-up threshold is obtained, it is determined whether the similarity between the voice segment and the wake-up word is greater than the adjusted target wake-up threshold, and thereby whether to wake up the intelligent device.

The specific method for obtaining the similarity between the voice segment and the awakening word through the awakening word semantic model belongs to the prior art, and is not described herein again.

According to the embodiment of the invention, if the attribute features of the voice segment meet the preset threshold adjustment condition, the currently stored target wake-up threshold is adjusted, and whether to wake up the intelligent device is determined according to the adjusted target wake-up threshold. Because the target wake-up threshold can be adjusted according to whether the voice segment meets the threshold adjustment condition, the situation in which voice information that is likely to be wake-up voice information fails to wake the intelligent device is avoided, which improves the wake-up rate. At the same time, the situation in which voice information that is likely to be non-wake-up voice information is falsely detected as wake-up voice information and wakes the intelligent device is also avoided, which reduces the false wake-up rate.
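Steps S202 and S203 can be illustrated with a minimal Python sketch. The function name `decide_wake`, the `adjust` callback and the assumption that the similarity lies in [0, 1] are hypothetical and introduced only for illustration; the similarity itself is assumed to come from the wake-up word semantic model, which is not modelled here.

```python
from typing import Callable, Optional, Tuple


def decide_wake(
    similarity: float,
    stored_threshold: float,
    adjust: Optional[Callable[[float], float]] = None,
) -> Tuple[bool, float]:
    """Apply S202 and S203 to one wake-word voice segment.

    similarity: score from the wake-up word semantic model (assumed in [0, 1]).
    stored_threshold: the currently stored target wake-up threshold.
    adjust: hypothetical callback that returns the adjusted threshold;
        None means the threshold adjustment condition was not met.
    """
    # S202: adjust the stored threshold only when the condition is met.
    threshold = stored_threshold if adjust is None else adjust(stored_threshold)
    # S203: wake only when the similarity is greater than the threshold.
    return similarity > threshold, threshold


# A segment scoring 0.80 is rejected at 0.85 but accepted once the
# threshold has been lowered by 0.1.
print(decide_wake(0.80, 0.85))
print(decide_wake(0.80, 0.85, lambda t: t - 0.1))
```

Passing the adjustment as a callback keeps the decision step independent of which attribute feature (voiceprint or signal energy) triggered the adjustment.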

Example 2:

In order to improve the wake-up rate and reduce the false wake-up rate of the smart device, on the basis of the above embodiments, in an embodiment of the present invention, the attribute feature of the voice segment is a voiceprint feature of the voice segment or a signal energy of the voice segment.

The voiceprint feature is information which uniquely identifies the voice feature of the user, and the identity of the user who awakens the intelligent device can be determined according to the voiceprint feature of the voice section, so that whether the target awakening threshold value is adjusted or not can be determined according to the identity of the user.

In order to determine more accurately whether to wake up the smart device, if the attribute feature of the voice segment is the voiceprint feature of the voice segment, then adjusting the currently stored target wake-up threshold when the attribute feature meets the preset threshold adjustment condition includes:

if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, reducing the currently stored target wake-up threshold.

A voiceprint feature can identify the user, so the priority of each voiceprint feature can be determined according to the user's identity. For example, voiceprint feature A of the male owner of the household has priority 4, voiceprint feature B of the female owner has priority 3, and voiceprint feature C of the child has priority 2. Priority 4 is higher than priority 3, which is higher than priority 2, so voiceprint feature A has a higher priority than voiceprint feature B, and voiceprint feature B has a higher priority than voiceprint feature C.

Therefore, when determining whether to adjust the target wake-up threshold, it may first be determined whether the voiceprint feature of the voice segment matches any currently stored voiceprint feature. If a matching voiceprint feature exists, its priority is obtained. To decide whether the target wake-up threshold is adjusted, a priority range is preset, and priorities within that range are treated as higher priorities.

Continuing the above example, priority 4 and priority 3 may be set as the priority range in which the target wake-up threshold can be adjusted, that is, priority 4 and priority 3 are the higher priorities. Therefore, if the matched voiceprint feature is voiceprint feature B, the target wake-up threshold is adjusted; if the priority of the matched voiceprint feature is not a higher priority, for example the matched voiceprint feature is voiceprint feature C, the target wake-up threshold is not adjusted.

The voiceprint feature has a higher priority, which indicates that the user corresponding to the voiceprint feature is more important, and in order to increase the wake-up rate of the user for waking up the smart device, when the voiceprint feature of the voice segment is the voiceprint feature corresponding to the user, the target wake-up threshold value can be appropriately reduced. The reduction amount can be determined according to the priority of the matched voiceprint feature, if the priority of the matched voiceprint feature is higher, the size of the target wake-up threshold is reduced more, if the priority of the matched voiceprint feature is low, the size of the target wake-up threshold can be reduced less, and of course, the size of the target wake-up threshold can be reduced by the same value no matter the priority of the matched voiceprint feature is high or low.

For example, the preset priority range is priority 4 and priority 3, and if the priority of the matched voiceprint feature is within this range, the target wake-up threshold is lowered by 0.1. Suppose the voiceprint feature of the voice segment acquired by the electronic device matches the currently stored voiceprint feature A, whose priority is priority 4. The currently stored target wake-up threshold of 0.85 is then reduced by 0.1, and the adjusted target wake-up threshold is 0.75.
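The worked example above (a threshold of 0.85 lowered by 0.1 for a voiceprint in the higher-priority range) can be sketched in Python as follows. The function name `adjust_for_voiceprint` and its parameters are hypothetical, and voiceprint matching itself is assumed to have been done elsewhere, yielding either the priority of the matched feature or `None`.

```python
def adjust_for_voiceprint(stored_threshold, matched_priority,
                          high_priorities=frozenset({3, 4}), step=0.1):
    """Lower the threshold when the matched voiceprint has a higher priority.

    matched_priority is None when no stored voiceprint matched the segment.
    high_priorities and step mirror the example values (priorities 3 and 4,
    reduction of 0.1); both are illustrative assumptions.
    """
    if matched_priority is not None and matched_priority in high_priorities:
        # Rounding only suppresses floating-point noise in the subtraction.
        return round(stored_threshold - step, 10)
    # No match, or a priority outside the preset range: leave it unchanged.
    return stored_threshold


print(adjust_for_voiceprint(0.85, 4))     # priority inside the range
print(adjust_for_voiceprint(0.85, 2))     # priority outside the range
print(adjust_for_voiceprint(0.85, None))  # no matching voiceprint
```

A priority-dependent reduction, as described earlier, could be obtained by replacing the single `step` with a mapping from priority to step size.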

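In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, the wake-up method of the smart device further includes:

if the smart device is woken up and the voiceprint feature of the voice segment matches any currently stored target voiceprint feature, updating the wake-up count corresponding to that target voiceprint feature and adjusting its priority according to the updated wake-up count.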

The priority can be preset, and in order to further improve the awakening rate, in the embodiment of the present invention, the priority of the target voiceprint feature can be determined according to the number of times that the intelligent device is awakened by any target voiceprint feature. If the number of times that a certain target voiceprint feature wakes up the intelligent device is more, the priority of the target voiceprint feature is higher; if the number of times that the target voiceprint feature wakes up the smart device is smaller, the priority of the target voiceprint feature is lower.

For example, voiceprint feature A, voiceprint feature B and voiceprint feature C have woken up the smart device 16 times, 15 times and 21 times respectively, so the priorities of the three voiceprint features, from high to low, are voiceprint feature C, voiceprint feature A and voiceprint feature B. Therefore, each time the intelligent device is woken up, it is determined whether the voiceprint feature of the voice segment matches any currently stored target voiceprint feature, and if so, the wake-up count corresponding to that target voiceprint feature is updated. After the update, all currently stored target voiceprint features are sorted by wake-up count, and the priorities corresponding to the target voiceprint features are adjusted accordingly.
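The count update and re-ranking described above can be sketched in Python. The function name `record_wake` and the dictionary layout are hypothetical; the tie-breaking rule (equal counts keep their stored order) is an assumption, since the text does not say how ties are resolved.

```python
def record_wake(wake_counts, matched_feature):
    """Increment the wake-up count of the matched voiceprint and re-rank.

    wake_counts maps a voiceprint feature name to its wake-up count.
    Returns the feature names ordered from highest to lowest priority.
    """
    if matched_feature in wake_counts:
        wake_counts[matched_feature] += 1
    # More wake-ups means higher priority. sorted() is stable, so features
    # with equal counts keep their stored order (an assumption).
    return sorted(wake_counts, key=wake_counts.get, reverse=True)


counts = {"A": 16, "B": 15, "C": 21}
print(record_wake(counts, "B"))  # B rises to 16 and ties with A
print(record_wake(counts, "B"))  # B rises to 17 and overtakes A
```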

In order to determine the priority corresponding to each wake-up count, the count range corresponding to each priority may be set in advance. In addition, because the amount of voice information received by the intelligent device keeps growing, the wake-up count of each target voiceprint feature keeps changing; the count range corresponding to each priority can therefore also be determined according to the total number of wake-ups and the number of preset priorities.

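In order to determine more accurately whether to wake up the smart device, if the attribute feature of the voice segment is the signal energy of the voice segment, then adjusting the currently stored target wake-up threshold when the attribute feature meets the preset threshold adjustment condition includes:

if the signal energy at the starting endpoint of the voice segment is less than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is less than a set second energy threshold, reducing the currently stored target wake-up threshold; otherwise, increasing the currently stored target wake-up threshold.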

In normal voice information, the signal energy of every two adjacent voice frames is correlated and mutually influenced. In the embodiment of the present invention, the signal energy of the voice segment can therefore be used to determine whether the voice segment is influenced by other voice segments in the voice information to be processed, and thus whether the voice information to be processed contains only the voice segment with the wake-up word.

The voice information to be processed acquired by the electronic device may or may not contain the wake-up word. When it contains the wake-up word, it may contain only the voice segment of the wake-up word, or it may contain other speech as well, that is, the voice segment containing the wake-up word is part of a conversation.

Generally, voice information that contains only the voice segment of the wake-up word is relatively likely to be wake-up voice information, whereas voice information that contains other speech besides the wake-up word segment is relatively likely to be non-wake-up voice information. Therefore, to make it easy to identify whether the voice information to be processed contains only the voice segment of the wake-up word, an energy detection method may be used in the embodiment of the present invention.

When the voice information to be processed contains only the voice segment of the wake-up word, no other voice segment can affect the signal energy of that segment, so the energy at its starting endpoint and ending endpoint is weak. When the voice information to be processed contains other speech as well, that is, the voice segment containing the wake-up word is part of a conversation, the signal energy of the segment is affected by the other voice segments, so the energy at its starting endpoint and/or ending endpoint is strong.

Based on this, whether to adjust the size of the target wake-up threshold can be determined according to the signal energy of the voice segment. In order to further accurately determine whether to wake up the smart device, in the embodiment of the present invention, a first energy threshold and a second energy threshold may be preset, and it is determined whether the signal energy at the starting endpoint of the voice segment is smaller than the set first energy threshold, and whether the signal energy at the ending endpoint of the voice segment is smaller than the set second energy threshold, so as to determine whether to adjust the currently stored target wake-up threshold.

In a specific implementation, if the signal energy at the starting endpoint of the voice segment is less than the set first energy threshold and the signal energy at the ending endpoint of the voice segment is less than the set second energy threshold, this indicates that the voice information to be processed contains only the voice segment of the wake-up word and is most likely wake-up voice information, so the currently stored target wake-up threshold is reduced. Otherwise, this indicates that the voice information to be processed contains other speech as well, that is, the voice segment containing the wake-up word is part of a conversation, and the voice information is likely to be non-wake-up voice information, so the currently stored target wake-up threshold is increased.
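The endpoint-energy rule above can be sketched in Python as follows. The function name `adjust_for_energy`, the step sizes and the numeric energy values in the usage lines are hypothetical; the endpoint energies are assumed to have been computed beforehand.

```python
def adjust_for_energy(stored_threshold, start_energy, end_energy,
                      first_energy_threshold, second_energy_threshold,
                      step_down=0.1, step_up=0.1):
    """Adjust the target wake-up threshold from the endpoint energies.

    Weak energy at both endpoints suggests the wake-up word was spoken on
    its own, so the threshold is lowered; otherwise the wake-up word is
    assumed to be part of a conversation and the threshold is raised.
    step_down and step_up are illustrative values, not taken from the text.
    """
    isolated = (start_energy < first_energy_threshold
                and end_energy < second_energy_threshold)
    if isolated:
        return round(stored_threshold - step_down, 10)
    return round(stored_threshold + step_up, 10)


# Both endpoints quiet: the threshold is lowered.
print(adjust_for_energy(0.85, 0.01, 0.02, 0.05, 0.05))
# Strong energy at the starting endpoint: the threshold is raised.
print(adjust_for_energy(0.85, 0.20, 0.02, 0.05, 0.05))
```

Separate `step_down` and `step_up` values allow the two directions to be tuned independently for different scenarios.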

In order to determine the first energy threshold and the second energy threshold, in an embodiment of the present invention, a large number of speech segments containing a wakeup word may be analyzed by obtaining signal energies of a start end point and an end point of the speech segment according to the following formulas to determine the first energy threshold and the second energy threshold:

where x(m) is the signal of the m-th frame, n is the n-th time instant, W(n-m) is the window function sequence, and E_n is the short-time energy of the signal when the window function starts at the n-th time instant. The process of obtaining and analyzing the signal energy at the starting endpoint and the ending endpoint of the voice segment by the above formula belongs to the prior art and is not described here again.
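The symbol definitions above are consistent with the standard short-time energy of a windowed signal. Assuming that this standard definition is the intended one (an assumption, since only the symbols are defined here), it can be written as:

```latex
E_n = \sum_{m=-\infty}^{\infty} \left[ x(m)\, W(n-m) \right]^2
```

For a window of finite length, the sum reduces to the samples covered by the window at time n.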

The amount by which the target wake-up threshold is increased can be set differently for different scenarios. If the false wake-up rate of the intelligent device should be reduced as much as possible, the target wake-up threshold can be raised by a large amount; if falsely detecting wake-up voice information as non-wake-up voice information should be avoided, the target wake-up threshold can be raised by only a small amount.

Correspondingly, the amount by which the target wake-up threshold is reduced can also be set differently for different scenarios. If the wake-up rate of the intelligent device should be improved as much as possible, the target wake-up threshold can be lowered by a large amount; if falsely detecting non-wake-up voice information as wake-up voice information should be avoided, the target wake-up threshold can be lowered by only a small amount.

According to the embodiment of the invention, the target wake-up threshold is adjusted according to whether the attribute features of the voice segment meet the preset threshold adjustment condition. This reduces the cases in which the intelligent device is woken up by non-wake-up voice information and makes it easier for voice information that is likely to be wake-up voice information to wake the device, so that the false wake-up rate is reduced and the wake-up rate is improved.

Example 3:

In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, on the basis of the above embodiments, in the embodiment of the present invention, determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold includes:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.

Based on the above embodiment, after determining whether the target wake-up threshold needs to be adjusted, the target wake-up threshold for determining whether to wake up the smart device may be determined according to the determination result, so as to determine whether to wake up the smart device according to the target wake-up threshold and the similarity between the voice segment and the wake-up word obtained by the semantic model of the wake-up word.

If the target wake-up threshold has been adjusted, the target wake-up threshold stored in the electronic device is the adjusted target wake-up threshold. In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, in the embodiment of the invention, after the adjusted target wake-up threshold is obtained, it is judged whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold, so as to determine whether to wake up the smart device.

If the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the currently stored target wake-up threshold, the voice information to be processed is wake-up voice information, and it is determined to wake up the smart device. If the similarity is not greater than the currently stored target wake-up threshold, the voice information to be processed is non-wake-up voice information, and it is determined not to wake up the smart device.

When the electronic device that controls the wake-up of the smart device is the smart device itself, the smart device wakes up directly if the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the adjusted target wake-up threshold, and does not wake up if the similarity is not greater than the adjusted target wake-up threshold.

When the electronic device that controls the wake-up of the smart device is another device, and the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the adjusted target wake-up threshold, the other device determines to wake up the smart device and sends the smart device a control instruction for waking it up; if the similarity is not greater than the adjusted target wake-up threshold, the other device determines not to wake up the smart device.

Obtaining the similarity between the voice segment and the wake-up word through the wake-up word semantic model belongs to the prior art and is not described again here.

In order to further improve the wake-up rate and reduce the false wake-up rate of the smart device, if the attribute feature of the voice segment does not satisfy the preset threshold adjustment condition, the method further includes:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.

If the target wake-up threshold has not been adjusted, the target wake-up threshold stored in the electronic device is the unadjusted target wake-up threshold, which may be a preset threshold. In order to determine whether to wake up the smart device, after it is determined that the target wake-up threshold is not adjusted, it is judged whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold, so as to determine whether to wake up the smart device.

When the electronic device that controls the wake-up of the smart device is the smart device itself, the smart device wakes up directly if the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the currently stored target wake-up threshold, and does not wake up if the similarity is not greater than the currently stored target wake-up threshold.

When the electronic device that controls the wake-up of the smart device is another device, and the similarity between the voice segment and the wake-up word obtained through the locally stored wake-up word semantic model is greater than the currently stored target wake-up threshold, the other device determines to wake up the smart device and sends the smart device a control instruction for waking it up; if the similarity is not greater than the currently stored target wake-up threshold, the other device determines not to wake up the smart device.
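Both the adjusted and the unadjusted cases reduce to the same strict comparison against whichever target wake-up threshold is currently stored, followed by either a direct wake-up or a control instruction. A minimal sketch, assuming a stand-in `SmartDevice` class and a made-up `"WAKE"` instruction string (neither is defined in the embodiment):

```python
from dataclasses import dataclass


@dataclass
class SmartDevice:
    """Stand-in for the smart device; only tracks whether it is awake."""
    awake: bool = False

    def wake(self):
        self.awake = True

    def handle_instruction(self, instruction):
        # "WAKE" is a made-up control instruction used for illustration.
        if instruction == "WAKE":
            self.wake()


def decide_and_wake(similarity, stored_threshold, device, controller_is_device):
    """Wake the device only if similarity is strictly greater than the threshold."""
    if not similarity > stored_threshold:
        return False  # "not greater than": the device is not woken up
    if controller_is_device:
        device.wake()  # the smart device wakes up directly
    else:
        device.handle_instruction("WAKE")  # another device sends the instruction
    return True
```

Note that a similarity exactly equal to the threshold does not wake the device, matching the "not greater than" wording.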

Example 4:

Fig. 3 is a schematic flowchart of a specific wake-up method for a smart device according to an embodiment of the present invention, in which the execution subject is the smart device and the attribute feature of the voice segment is the signal energy of the voice segment. The flow is described in detail below:

S301: the smart device acquires a voice segment containing a wake-up word in the voice information to be processed.

S302: the smart device judges whether the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold; if so, S303 is executed, and otherwise S304 is executed.

S303: the smart device reduces the size of the currently saved target wake-up threshold and then executes S305.

S304: the smart device raises the size of the currently saved target wake-up threshold and then executes S305.

S305: the smart device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold; if so, S306 is executed, and otherwise S307 is executed.

S306: the intelligent device wakes up directly.

S307: the smart device does not wake up.
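Steps S302 to S307 can be sketched as a single function. This is a minimal illustration: the function name, the fixed `step` of 0.05, and the choice to return the adjusted threshold rather than persist it are assumptions; the similarity is taken as an input because the wake-up word semantic model is not specified here.

```python
def wake_by_signal_energy(start_energy, end_energy, similarity, threshold,
                          first_energy_threshold, second_energy_threshold,
                          step=0.05):
    """Follow S302-S307 and return (woke, adjusted_threshold)."""
    # S302: both endpoint energies must be below their thresholds.
    if (start_energy < first_energy_threshold
            and end_energy < second_energy_threshold):
        threshold -= step  # S303: lower the target wake-up threshold
    else:
        threshold += step  # S304: raise the target wake-up threshold
    # S305: strict comparison against the adjusted threshold.
    woke = similarity > threshold
    return woke, threshold  # S306 if woke, S307 otherwise
```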

Fig. 4 is a schematic flowchart of a specific wake-up method for a smart device according to an embodiment of the present invention, in which the execution subject is another device and the attribute feature of the voice segment is the voiceprint feature of the voice segment. The flow is described in detail below:

S401: the other device acquires a voice segment containing a wake-up word in the voice information to be processed.

S402: the other device judges whether the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority; if so, S403 is executed, and otherwise S404 is executed.

S403: the other device reduces the size of the currently saved target wake-up threshold and then executes S405.

S404: the other device determines that the voiceprint feature of the voice segment does not satisfy the preset threshold adjustment condition, does not adjust the currently stored target wake-up threshold, and then executes S406.

S405: the other device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold; if so, S407 is executed, and otherwise S408 is executed.

S406: the other device judges whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold; if so, S407 is executed, and otherwise S408 is executed.

S407: the other device determines to wake up the smart device, sends a wake-up instruction to the smart device to control the smart device to wake up, and then executes S409.

S408: the other device determines not to wake up the smart device.

S409: the other device judges whether the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features; if so, S410 is executed, and otherwise S411 is executed.

S410: the other device updates the wake-up count corresponding to the matched target voiceprint feature and adjusts the priority corresponding to that target voiceprint feature according to the updated wake-up count.

S411: the other device saves the voiceprint feature of the voice segment as a target voiceprint feature.
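The voiceprint-based flow S402 to S411 can be sketched as follows. Several simplifications are assumptions rather than part of the embodiment: voiceprints are represented by plain string labels and matched by equality, "higher priority" is modelled as a wake-up count at or above `high_priority_min_wakes`, a newly saved voiceprint starts with a count of 1, and the adjusted threshold is used for the current decision only.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceprintStore:
    """Stored target voiceprints with their wake-up counts."""
    wake_counts: dict = field(default_factory=dict)
    high_priority_min_wakes: int = 3  # hypothetical priority rule

    def is_high_priority(self, voiceprint):
        return self.wake_counts.get(voiceprint, 0) >= self.high_priority_min_wakes


def wake_by_voiceprint(voiceprint, similarity, threshold, store, step=0.05):
    """Follow S402-S411 and return True if the smart device is woken up."""
    if store.is_high_priority(voiceprint):      # S402
        threshold -= step                       # S403: lower the threshold
    # S404: otherwise the stored threshold is left unchanged.
    woke = similarity > threshold               # S405 / S406
    if woke:                                    # S407
        if voiceprint in store.wake_counts:     # S409
            store.wake_counts[voiceprint] += 1  # S410: update count (priority)
        else:
            store.wake_counts[voiceprint] = 1   # S411: save new voiceprint
    return woke                                 # False corresponds to S408
```

With these assumptions, a speaker who has woken the device three times is treated as high priority, so a later utterance with a slightly lower similarity can still wake the device.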

Fig. 5 is a schematic flowchart of a specific wake-up method for a smart device according to an embodiment of the present invention. As shown in Fig. 5, after acquiring the voice information to be processed and extracting from it a voice segment containing a wake-up word, the electronic device may determine whether to adjust the target wake-up threshold by judging whether the voiceprint feature of the voice segment satisfies the preset threshold adjustment condition, or by judging whether the signal energy of the voice segment satisfies the preset threshold adjustment condition. After the adjusted target wake-up threshold is obtained, it is judged whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold, so as to determine whether to wake up the smart device.

Example 5:

Fig. 6 is a schematic structural diagram of a wake-up apparatus of a smart device according to an embodiment of the present invention, where the wake-up apparatus includes:

an acquiring unit 61, configured to acquire a voice segment containing a wake-up word in the voice information to be processed;

an adjusting unit 62, configured to adjust the size of a currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition;

a determining unit 63, configured to determine whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold.

The determining unit 63 is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

In another possible implementation manner, the determining unit 63 is further configured to determine, if the attribute characteristics of the speech segment do not meet a preset threshold adjustment condition, whether a similarity between the speech segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

In a possible embodiment, the adjusting unit 62 is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.

Further, the adjusting unit 62 is further configured to, if the smart device is woken up and the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features, update the wake-up count corresponding to that target voiceprint feature and adjust the priority corresponding to that target voiceprint feature according to the updated wake-up count.

In another possible embodiment, the adjusting unit 62 is specifically configured to decrease the size of the currently saved target wake-up threshold when the attribute characteristic of a speech segment is the signal energy of the speech segment, and when the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.
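The division into an acquiring unit, an adjusting unit and a determining unit can be sketched as three small classes composed by the apparatus. This is a minimal illustration of the signal-energy variant only; the class and field names, the `contains_wake_word` flag, and the use of a precomputed `similarity` in place of the wake-up word semantic model are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class VoiceSegment:
    """Hypothetical container for one candidate segment."""
    contains_wake_word: bool
    start_energy: float
    end_energy: float
    similarity: float  # stand-in for the semantic model's output


class AcquiringUnit:
    def acquire(self, candidates):
        """Return the first segment that contains the wake-up word, if any."""
        return next((s for s in candidates if s.contains_wake_word), None)


class AdjustingUnit:
    def __init__(self, first_energy_threshold, second_energy_threshold, step):
        self.first = first_energy_threshold
        self.second = second_energy_threshold
        self.step = step

    def adjust(self, segment, threshold):
        """Lower the threshold if both endpoint energies are low, else raise it."""
        if segment.start_energy < self.first and segment.end_energy < self.second:
            return threshold - self.step
        return threshold + self.step


class DeterminingUnit:
    def determine(self, segment, threshold):
        return segment.similarity > threshold


class WakeUpApparatus:
    def __init__(self, acquiring, adjusting, determining, threshold):
        self.acquiring = acquiring
        self.adjusting = adjusting
        self.determining = determining
        self.threshold = threshold  # stored target wake-up threshold

    def process(self, candidates):
        segment = self.acquiring.acquire(candidates)
        if segment is None:
            return False  # no segment contains the wake-up word
        adjusted = self.adjusting.adjust(segment, self.threshold)
        return self.determining.determine(segment, adjusted)
```

In this sketch the stored threshold is left unchanged between calls and the adjustment applies to the current decision only; whether the adjusted value should be persisted is a design choice that the sketch does not settle.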

Example 6:

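Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. On the basis of the foregoing embodiments, an embodiment of the present invention further provides an electronic device, which includes a processor 71 and a memory 72;

the processor 71 is configured to implement the steps of the above wake-up method of the smart device when executing the computer program stored in the memory 72.

Optionally, the processor 71 may be a CPU (Central Processing Unit), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or a CPLD (Complex Programmable Logic Device).

The processor 71 is configured to perform the following steps according to the computer program stored in the memory 72:

acquiring a voice segment containing a wake-up word in the voice information to be processed;

if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold;

determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold.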

The processor 71 is specifically configured to determine whether a similarity between the voice segment obtained through the wakeup word semantic model and the wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

In another possible implementation manner, the processor 71 is further configured to determine, if the attribute characteristics of the voice segment do not meet a preset threshold adjustment condition, whether a similarity between the voice segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

In a possible embodiment, the processor 71 is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.

Further, the processor 71 is further configured to, if the smart device is woken up and the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features, update the wake-up count corresponding to that target voiceprint feature and adjust the priority corresponding to that target voiceprint feature according to the updated wake-up count.

In another possible embodiment, the processor 71 is specifically configured to decrease the size of the currently saved target wake-up threshold when the attribute characteristic of a speech segment is the signal energy of the speech segment, and when the signal energy at the starting endpoint of the speech segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the speech segment is smaller than a set second energy threshold; otherwise, increasing the size of the currently stored target awakening threshold.

Example 8:

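On the basis of the foregoing embodiments, an embodiment of the present invention provides a computer-readable storage medium that stores a computer program executable by an electronic device. When the program runs on the electronic device, the electronic device is caused to perform the following steps:

acquiring a voice segment containing a wake-up word in the voice information to be processed;

if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold;

determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold.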

In a possible implementation manner, the determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold includes:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.

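In another possible implementation manner, if the attribute feature of the voice segment does not satisfy the preset threshold adjustment condition, the method further includes:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.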

Specifically, the attribute feature of the speech segment is a voiceprint feature of the speech segment, or a signal energy of the speech segment.

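In a possible implementation manner, if the attribute feature of a voice segment is a voiceprint feature of the voice segment, then adjusting the size of the currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition includes:

if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, decreasing the size of the currently stored target wake-up threshold.

Further, the method further comprises:

if the smart device is woken up and the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features, updating the wake-up count corresponding to that target voiceprint feature, and adjusting the priority corresponding to that target voiceprint feature according to the updated wake-up count.

In another possible implementation manner, if the attribute feature of a voice segment is the signal energy of the voice segment, then adjusting the size of the currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition includes:

if the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold, decreasing the size of the currently stored target wake-up threshold; otherwise, increasing the size of the currently stored target wake-up threshold.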

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

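1. A wake-up method of a smart device, the method comprising:

acquiring a voice segment containing a wake-up word in voice information to be processed;

if the attribute feature of the voice segment satisfies a preset threshold adjustment condition, adjusting the size of the currently stored target wake-up threshold;

determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold;

wherein, if the attribute feature of the voice segment is the signal energy of the voice segment, the adjusting the size of the currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition comprises:

if the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold, decreasing the size of the currently stored target wake-up threshold; otherwise, increasing the size of the currently stored target wake-up threshold.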

2. The method according to claim 1, wherein the determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold comprises:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the adjusted target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.

3. The method according to claim 1, wherein if the attribute feature of the voice segment does not satisfy the preset threshold adjustment condition, the method further comprises:

judging whether the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model is greater than the currently stored target wake-up threshold;

if yes, determining to wake up the smart device;

otherwise, determining not to wake up the smart device.

4. A method according to any of claims 1-3, wherein the attribute of the speech segments is a voiceprint characteristic of the speech segments, or a signal energy of the speech segments.

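5. A method according to any one of claims 1 to 3, wherein if the attribute feature of a voice segment is a voiceprint feature of the voice segment, the adjusting the size of the currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition comprises:

if the voiceprint feature of the voice segment matches a currently stored voiceprint feature with a higher priority, decreasing the size of the currently stored target wake-up threshold.

6. The method of claim 5, further comprising:

if the smart device is woken up and the voiceprint feature of the voice segment matches any one of the currently stored target voiceprint features, updating the wake-up count corresponding to that target voiceprint feature, and adjusting the priority corresponding to that target voiceprint feature according to the updated wake-up count.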

7. A wake-up apparatus of a smart device, the apparatus comprising:

the acquiring unit is used for acquiring a voice segment containing a wake-up word in voice information to be processed;

the adjusting unit is used for adjusting the size of the currently stored target wake-up threshold if the attribute feature of the voice segment satisfies a preset threshold adjustment condition;

the determining unit is used for determining whether to wake up the smart device according to the similarity between the voice segment and the wake-up word obtained through the wake-up word semantic model and the adjusted target wake-up threshold;

the adjusting unit is specifically configured to, if the attribute feature of a voice segment is the signal energy of the voice segment, reduce the size of the currently stored target wake-up threshold if the signal energy at the starting endpoint of the voice segment is smaller than a set first energy threshold and the signal energy at the ending endpoint of the voice segment is smaller than a set second energy threshold; otherwise, increase the size of the currently stored target wake-up threshold.

8. The apparatus according to claim 7, wherein the determining unit is specifically configured to determine whether a similarity between the speech segment obtained through the wakeup word semantic model and a wakeup word is greater than the adjusted target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

9. The apparatus according to claim 7, wherein the determining unit is further configured to, if the attribute feature of the speech segment does not satisfy a preset threshold adjustment condition, determine whether a similarity between the speech segment and a wakeup word, which is obtained through a wakeup word semantic model, is greater than the currently stored target wakeup threshold; if yes, the intelligent equipment is confirmed to be awakened; otherwise, the intelligent device is determined not to be awakened.

10. The apparatus according to any one of claims 7 to 9, wherein the adjusting unit is specifically configured to, when the attribute feature of a speech segment is a voiceprint feature of the speech segment, decrease the size of the currently stored target wake-up threshold if the voiceprint feature of the speech segment matches a currently stored voiceprint feature with a higher priority.

11. The apparatus according to claim 10, wherein the adjusting unit is further configured to, if the smart device is woken up and the voiceprint feature of the speech segment matches any one of the currently stored target voiceprint features, update the wake-up count corresponding to that target voiceprint feature and adjust the priority corresponding to that target voiceprint feature according to the updated wake-up count.

12. An electronic device, characterized in that the electronic device comprises a processor for implementing the steps of the wake-up method of the smart device according to any of claims 1-6 when executing a computer program stored in a memory.

13. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the wake-up method of a smart device according to any one of claims 1-6.