
CN108735209B - Wake-up word binding method, intelligent device and storage medium - Google Patents


Info

Publication number
CN108735209B
CN108735209B
Authority
CN
China
Prior art keywords: word, awakening, information, voiceprint, user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810407844.6A
Other languages
Chinese (zh)
Other versions
CN108735209A (en)
Inventor
何瑞澄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Midea Group Co Ltd
GD Midea Air Conditioning Equipment Co Ltd
Original Assignee
Midea Group Co Ltd
GD Midea Air Conditioning Equipment Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Midea Group Co Ltd and GD Midea Air Conditioning Equipment Co Ltd
Priority to CN201810407844.6A
Publication of CN108735209A
Application granted
Publication of CN108735209B
Legal status: Active


Classifications

    All under G (PHYSICS) > G10 (MUSICAL INSTRUMENTS; ACOUSTICS) > G10L (SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING):
    • G10L 15/00 Speech recognition
    • G10L 15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L 15/08 Speech classification or search
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 2015/025 Phonemes, fenemes or fenones being the recognition units
    • G10L 2015/027 Syllables being the recognition units
    • G10L 2015/088 Word spotting
    • G10L 2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Reverberation, Karaoke And Other Acoustics (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a wake-up word binding method, which comprises the following steps: step S1, collecting a voice signal uttered by a user; step S2, extracting wake-up word information and user information from the voice signal; and step S3, binding the user information and the wake-up word information with the user. The invention also provides an intelligent device and a storage medium. The invention does not require recording a large amount of voice, reduces operation, is convenient to use, and improves the degree of intelligence.

Description

Wake-up word binding method, intelligent device and storage medium
Technical Field
The invention relates to the technical field of voice recognition, and in particular to a wake-up word binding method, an intelligent device, and a storage medium.
Background
Speech recognition technology is a high technology that lets a machine convert speech signals into corresponding text or commands through recognition and understanding, that is, it lets the machine understand human speech. Also known as Automatic Speech Recognition (ASR), its goal is to convert the lexical content of human speech into computer-readable input, such as keystrokes, binary codes, or character sequences. In recent years, speech recognition has entered fields such as home appliances, communications, electronic products, and home services to provide near-field or far-field control of home appliances or electronic products, and wake-up word binding technology is a prerequisite for such near-field or far-field control.
The main technique for wake-up word binding is software wake-up. However, software operation presupposes that the system has started; to ensure that the user's voice commands can be received anytime and anywhere, a speech recognition engine must run and listen in the background at all times, so the system cannot enter a dormant standby power-saving state, and power consumption is high. To reduce system power consumption, low-power voice wake-up techniques have been developed, in which a large amount of voice data is recorded and trained into a fixed wake-up word, so that the system is woken when the wake-up word is recognized in the user's voice command.
However, the inventors have found that the above techniques have at least the following technical problem:
user-defined wake-up words require recording a very large amount of voice data, and the method is complex to operate, inconvenient to use, and poor in degree of intelligence.
Disclosure of Invention
An embodiment of the invention provides a wake-up word binding method, which solves the technical problems that existing user-defined wake-up words require recording a very large amount of voice data and are complex to operate, inconvenient to use, and poor in degree of intelligence.
An embodiment of the invention provides a wake-up word binding method, which comprises the following steps:
step S1, collecting a voice signal uttered by a user;
step S2, extracting wake-up word information and user information from the voice signal;
and step S3, binding the user information and the wake-up word information with the user.
Optionally, step S3 includes:
step S31, obtaining a wake-up word model registered by the user with the voice recognition system, and binding the user information and the wake-up word with the wake-up word model.
Optionally, when the user information is voiceprint information, step S31 includes:
step S311, collecting wake-up word voice signals input by the user multiple times;
step S312, acquiring the rhythm features, tone features, and phoneme features in each input wake-up word voice signal;
step S313, performing acoustic feature processing on the rhythm and tone features acquired each time, and registering the processed rhythm and tone feature information as voiceprint data of the user;
step S314, ordering and combining the phoneme features acquired each time based on a preset acoustic model to obtain the wake-up word model;
and step S315, storing the voiceprint data and the wake-up word in association with the wake-up word model.
Optionally, step S2 includes:
step S21, when a voice signal is received, judging whether the volume value of the voice signal is greater than a preset volume value;
and step S22, if so, acquiring the wake-up word information in the voice signal based on an acoustic model and grammatical structure, and acquiring the voiceprint information in the voice signal based on voiceprint recognition.
Optionally, after step S3, the method further includes:
step S4, receiving a wake-up voice signal, and extracting the wake-up word in the wake-up voice signal;
and step S5, when the wake-up word matches a preset wake-up word in the voice recognition system, responding to the wake-up word voice signal.
Optionally, after step S4, the method further includes:
step S6, adjusting the recognition threshold of the preset wake-up word in the voice recognition system;
and step S7, when the wake-up word matches the adjusted preset wake-up word, responding to the wake-up word voice signal.
Optionally, the user information is voiceprint information, and step S6 includes:
step S61, extracting the voiceprint information in the wake-up word voice signal;
step S62, when no voiceprint data matching the voiceprint information exists in the voice recognition system, raising the wake-up word recognition threshold of the voice recognition system;
and step S63, when voiceprint data matching the voiceprint information exists in the voice recognition system, lowering the wake-up word recognition threshold of the voice recognition system.
Optionally, after step S61, the method further includes:
step S64, calculating the similarity between the voiceprint information and the voiceprint data registered in the voice recognition system according to a preset voiceprint model;
step S65, when the similarity is within a preset range, determining that voiceprint data matching the voiceprint information exists in the voice recognition system;
and step S66, when the similarity is outside the preset range, determining that no voiceprint data matching the voiceprint information exists in the voice recognition system.
The invention further provides a storage medium storing a wake-up word binding program which, when executed by a processor, implements the steps of the wake-up word binding method described above.
According to the invention, the wake-up word is bound with the user by acquiring the wake-up word information in the received voice signal, instead of blindly recording a large amount of voice. After the wake-up word is recorded it is bound with the user information, so the user and the wake-up word can be identified directly during subsequent recognition. This improves recognition accuracy, removes the need to record a large amount of voice, reduces operation, is convenient to use, and improves the degree of intelligence.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those skilled in the art, other drawings can be obtained from these drawings without creative effort.
FIG. 1 is a schematic structural diagram of a hardware operating environment related to a smart device according to the present invention;
FIG. 2 is a flowchart illustrating a method for binding wake words according to a first embodiment of the present invention;
FIG. 3 is a schematic flow chart of acquiring a wake-up word model registered by the user with the voice recognition system and binding the user information and the wake-up word with the wake-up word model according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a detailed process of step S20 according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a wake-up word binding method according to a second embodiment of the present invention;
FIG. 6 is a flowchart illustrating a method for binding wake words according to a third embodiment of the present invention;
FIG. 7 is a flow chart illustrating adjusting recognition threshold according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating a process of determining voiceprint information according to an embodiment of the invention;
FIG. 9 is a flowchart illustrating a detailed process of step S203 according to an embodiment of the present invention;
FIG. 10 is a flowchart illustrating a detailed process of step S70 according to an embodiment of the present invention.
The reference numerals illustrate:
Reference numeral   Name                  Reference numeral   Name
100                 Intelligent device    101                 Radio frequency unit
102                 WiFi module           103                 Audio output unit
104                 A/V input unit        1041                Graphics processor
1042                Microphone            105                 Sensor
106                 Display unit          1061                Display interface
107                 User input unit       1071                Control interface
1072                Other input devices   108                 Interface unit
109                 Memory                110                 Processor
111                 Power supply
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component", or "unit" are used to denote elements only to facilitate the description of the invention and have no specific meaning in themselves. Thus, "module", "component", and "unit" may be used interchangeably.
Smart devices may be implemented in various forms. For example, the smart device described in the present invention may be implemented by a mobile terminal having a display interface, such as a mobile phone, a tablet computer, a notebook computer, a palm top computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, a smart speaker, or the like, or may be implemented by a fixed terminal having a display interface, such as a Digital TV, a desktop computer, an air conditioner, a refrigerator, a water heater, a dust collector, or the like.
While the following description will be given by way of example of a smart device, it will be appreciated by those skilled in the art that the configuration according to the embodiment of the present invention can be applied to a fixed type smart device, in addition to elements particularly used for mobile purposes.
Referring to fig. 1, which is a schematic diagram of a hardware structure of an intelligent device for implementing various embodiments of the present invention, the intelligent device 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, a/V (audio/video) input unit 104, sensor 105, display area 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the smart device architecture shown in FIG. 1 does not constitute a limitation of a smart device, which may include more or fewer components than shown, or some components in combination, or a different arrangement of components.
The following describes each component of the smart device in detail with reference to fig. 1:
the radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call, and specifically, receive downlink information of a base station and then process the downlink information to the processor 110; in addition, the uplink data is transmitted to the base station. Typically, radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000(Code Division Multiple Access2000 ), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division duplex Long Term Evolution), and TDD-LTE (Time Division duplex Long Term Evolution).
WiFi is a short-range wireless transmission technology. Through the WiFi module 102, the intelligent device can help the user send and receive e-mail, browse web pages, access streaming media, and the like, providing wireless broadband Internet access. Although fig. 1 shows the WiFi module 102, it is not an essential part of the intelligent device and may be omitted as needed without changing the essence of the invention. For example, in this embodiment, the intelligent device 100 may establish a synchronization association with an App terminal based on the WiFi module 102.
The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the smart device 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the smart device 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like. As in the present embodiment, when a prompt to re-input a voice signal is output, the prompt may be a voice prompt, a vibration prompt based on a buzzer, or the like.
The A/V input unit 104 is used to receive audio or video signals. The A/V input unit 104 may include a graphics processor (GPU) 1041 and a microphone 1042. The graphics processor 1041 processes image data of still pictures or video obtained by an image capture device (e.g., a camera) in video capture mode or image capture mode, and the processed image frames may be displayed on the display area 106. The image frames processed by the graphics processor 1041 may be stored in the memory 109 (or another storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 may receive sound (audio data) in phone call mode, recording mode, voice recognition mode, and the like, and can process such sound into audio data. In phone call mode, the processed audio (voice) data may be converted into a format transmittable to a mobile communication base station via the radio frequency unit 101. The microphone 1042 may implement various noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated while receiving and transmitting audio signals.
The smart device 100 also includes at least one sensor 105, such as light sensors, motion sensors, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display interface 1061 according to the brightness of ambient light, and a proximity sensor that can turn off the display interface 1061 and/or backlight when the smart device 100 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The display area 106 is used to display information input by the user or information provided to the user. The Display area 106 may include a Display interface 1061, and the Display interface 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the smart device. In particular, the user input unit 107 may include a manipulation interface 1071 and other input devices 1072. The control interface 1071, also referred to as a touch screen, may collect touch operations by a user (e.g., operations by a user on or near the control interface 1071 using a finger, a stylus, or any other suitable object or attachment) and drive the corresponding connection device according to a predetermined program. The manipulation interface 1071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the manipulation interface 1071 can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. In addition to the manipulation interface 1071, the user input unit 107 may include other input devices 1072. In particular, other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like, and are not limited to these specific examples.
Further, the manipulation interface 1071 may overlay the display interface 1061, and when the manipulation interface 1071 detects a touch operation thereon or nearby, transmit to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display interface 1061 according to the type of the touch event. Although in fig. 1, the control interface 1071 and the display interface 1061 are two separate components to implement the input and output functions of the smart device, in some embodiments, the control interface 1071 and the display interface 1061 may be integrated to implement the input and output functions of the smart device, which is not limited herein.
The interface unit 108 serves as an interface through which at least one external device is connected to the smart device 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from an external device and transmit the received input to one or more elements within the smart device 100 or may be used to transmit data between the smart device 100 and the external device.
The memory 109 may be used to store software programs as well as various data. The memory 109 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a voice recognition system) required for at least one function, and the like; the storage data area may store data created according to the use of the smart device (such as voiceprint data, a wakeup word model, user information, etc.), and the like. Further, the memory 109 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
The processor 110 is a control center of the smart device, connects various parts of the entire smart device using various interfaces and lines, and performs various functions of the smart device and processes data by operating or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby performing overall monitoring of the smart device. Processor 110 may include one or more processing units; preferably, the processor 110 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
The smart device 100 may further include a power source 111 (such as a battery) for supplying power to various components, and preferably, the power source 111 may be logically connected to the processor 110 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system.
Although not shown in fig. 1, the smart device 100 may further include a bluetooth module and the like capable of establishing a communication connection with other terminals, which will not be described herein.
Based on the above hardware structure, the intelligent device provided by the embodiment of the invention carries a voice recognition system; the wake-up word is bound with the user by acquiring the wake-up word information in the received voice signal, instead of blindly recording a large amount of voice, and after the wake-up word is recorded it is bound with the user information.
As shown in fig. 1, the memory 109, as a computer storage medium, may include an operating system and a wake-up word binding program.
In the intelligent device 100 shown in fig. 1, the WiFi module 102 is mainly used to connect to a background server or big-data cloud, perform data communication with it, and establish communication connections with other terminal devices; the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S1, collecting a voice signal uttered by a user;
step S2, extracting wake-up word information and user information from the voice signal;
and step S3, binding the user information and the wake-up word information with the user.
Optionally, step S3 includes:
step S31, obtaining a wake-up word model registered by the user with the voice recognition system, and binding the user information and the wake-up word with the wake-up word model.
Further, when the user information is voiceprint information, the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S311, collecting wake-up word voice signals input by the user multiple times;
step S312, acquiring the rhythm features, tone features, and phoneme features in each input wake-up word voice signal;
step S313, performing acoustic feature processing on the rhythm and tone features acquired each time, and registering the processed rhythm and tone feature information as voiceprint data of the user;
step S314, ordering and combining the phoneme features acquired each time based on a preset acoustic model to obtain the wake-up word model;
and step S315, storing the voiceprint data and the wake-up word in association with the wake-up word model.
Further, the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S21, when a voice signal is received, judging whether the volume value of the voice signal is greater than a preset volume value;
and step S22, if so, acquiring the wake-up word information in the voice signal based on an acoustic model and grammatical structure, and acquiring the voiceprint information in the voice signal based on voiceprint recognition.
Further, after step S3, the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S4, receiving a wake-up voice signal, and extracting the wake-up word in the wake-up voice signal;
and step S5, when the wake-up word matches a preset wake-up word in the voice recognition system, responding to the wake-up word voice signal.
Further, after step S4, the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S6, adjusting the recognition threshold of the preset wake-up word in the voice recognition system;
and step S7, when the wake-up word matches the adjusted preset wake-up word, responding to the wake-up word voice signal.
Further, the user information is voiceprint information, and the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S61, extracting the voiceprint information in the wake-up word voice signal;
step S62, when no voiceprint data matching the voiceprint information exists in the voice recognition system, raising the wake-up word recognition threshold of the voice recognition system;
and step S63, when voiceprint data matching the voiceprint information exists in the voice recognition system, lowering the wake-up word recognition threshold of the voice recognition system.
Further, after step S61, the processor 110 may be configured to call the wake-up word binding program stored in the memory 109 and perform the following operations:
step S64, calculating the similarity between the voiceprint information and the voiceprint data registered in the voice recognition system according to a preset voiceprint model;
step S65, when the similarity is within a preset range, determining that voiceprint data matching the voiceprint information exists in the voice recognition system;
and step S66, when the similarity is outside the preset range, determining that no voiceprint data matching the voiceprint information exists in the voice recognition system.
The invention further provides a wake-up word binding method, applied to waking a voice recognition system or an intelligent device loaded with a voice recognition system.
Referring to fig. 2, fig. 2 is a flowchart illustrating a method for binding a wakeup word according to a first embodiment of the present invention.
In this embodiment, the method for binding the wake word includes the following steps:
s10: collecting voice signals sent by a user;
in this embodiment, when the user wakes up the speech recognition system with the self-defined wake-up word for the first time or needs to input the wake-up word of the user, in order to avoid wake-up failure and improve the wake-up rate, the user-defined wake-up word model needs to be trained, so as to respond when receiving the user input including the wake-up word corresponding to the wake-up word model. The method comprises the steps that a user sends out voice signals, the voice signals sent out by the user are collected, the voice signals can comprise an air conditioner, a dehumidifier, a fan and the like, and can also comprise a power-on state, a temperature increasing state, a first-gear wind speed increasing state and the like, and information serving as a wake-up word is set in advance.
S20, extracting awakening word information and user information in the voice signal;
after a voice signal input by a user is acquired, awakening word information and user information in the voice signal are extracted; the user information may be user identity information, user voiceprint data, and the like, which may be used to identify the user. And the awakening words and the user information are extracted, the voice signals are converted into text information through conversion, and the awakening words and the sentences carrying the user information are extracted from the text information.
S30, binding the user information and the awakening word information with the user.
Specifically, a user-defined awakening word sound signal is collected, for example, a user can input a voice signal of an 'air conditioner' for multiple times, and after the voice signal of the 'air conditioner' is picked up by the intelligent device based on a microphone or an audio sensor, an awakening word model of the user registered to a voice recognition system is obtained, and the user information and the awakening word are bound with the awakening word model.
In order to facilitate more accurate adjustment of the awakening word identification threshold value according to the identified voiceprint data in the follow-up process, after the voiceprint data of the registered user and the registered awakening word model are obtained, the voiceprint data and the awakening word model are further associated, and an association relation between the voiceprint data and the awakening word model is established.
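As a minimal illustration of this flow, the sketch below strings steps S10-S30 together in Python; the helper names and data shapes are assumptions for illustration, since the patent does not specify an implementation.

```python
# Minimal sketch of steps S10-S30 with stubbed capture and extraction;
# helper names and the registry layout are assumptions, not the patent's API.

def collect_voice_signal():
    # S10: stand-in for microphone capture of the user's voice signal
    return {"text": "air conditioner", "voiceprint": [0.12, 0.55, 0.31]}

def extract_info(signal):
    # S20: extract wake-up word information and user (voiceprint) information
    return signal["text"], signal["voiceprint"]

def bind(registry, user_id, wake_word, voiceprint):
    # S30: bind the user information and wake-up word information with the user
    registry[user_id] = {"wake_word": wake_word, "voiceprint": voiceprint}

registry = {}
word, vp = extract_info(collect_voice_signal())
bind(registry, "user-001", word, vp)
print(registry["user-001"]["wake_word"])   # air conditioner
```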
According to this embodiment, the wake-up word is bound with the user by acquiring the wake-up word information in the received voice signal, instead of blindly recording a large amount of voice. After the wake-up word is recorded it is bound with the user information, so the user and the wake-up word can be identified directly during subsequent recognition. This improves recognition accuracy, removes the need to record a large amount of voice, reduces operation, is convenient to use, and improves the degree of intelligence.
Further, referring to fig. 3, based on the wake-up word binding method of the foregoing embodiment, the step of obtaining the wake-up word model registered by the user with the voice recognition system and binding the user information and the wake-up word with the wake-up word model includes:
S100: collecting wake-up word voice signals input by the user multiple times;
In this embodiment, the user information is described taking user voiceprint data as an example. To improve the binding accuracy of the wake-up word, in the sampling stage the method may collect the wake-up word voice signal input by the user multiple times, and then derive an optimal wake-up word model and voiceprint data from the signals collected over those inputs.
S200: acquiring the rhythm features, tone features, and phoneme features in each input wake-up word voice signal;
When obtaining the user's voiceprint data and the wake-up word model registered with the voice recognition system from the wake-up word voice signals collected multiple times, specifically: after each wake-up word voice signal input by the same user is converted into a digital voice signal, the rhythm and tone features in the signal are acquired based on voiceprint recognition, and the phoneme features in the signal are obtained based on an acoustic model and grammatical structure, for example by locating the start and end points of the signal's segments (such as phonemes, syllables, and morphemes) through end-point detection and excluding the unvoiced segments from the signal.
S300: performing acoustic feature processing on the rhythm and tone features acquired each time, and registering the processed rhythm and tone feature information as voiceprint data of the user;
After the wake-up word voice signal input the first time is obtained, rhythm feature 1 and tone feature 1 are obtained based on voiceprint recognition; then rhythm feature 2 and tone feature 2 in the signal input the second time are obtained. When the difference between them is large, rhythm feature 1 is optimized using rhythm feature 2 and tone feature 1 is optimized using tone feature 2, and so on, until the differences between the newly acquired rhythm feature n and tone feature n and the current rhythm feature n-1 and tone feature n-1 fall within a preset range; the current rhythm and tone features, after acoustic feature processing, are then registered as the user's voiceprint data.
S400: ordering and combining the phoneme features acquired each time based on a preset acoustic model to obtain the wake-up word model;
Similarly, after the wake-up word voice signal input the first time is obtained, phoneme features 1 are derived based on the acoustic model and grammatical structure; then phoneme features 2 in the signal input the second time are obtained, together with the position of each shared phoneme in the arrangement. When the first and second inputs differ, phoneme features 3 in the signal input the third time are obtained, and so on, until the position of every phoneme in the model's preset phoneme arrangement is determined, yielding the wake-up word model.
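As a sketch of this step, the code below fixes each phoneme's position by majority vote across the repeated inputs; position-wise voting is an assumption, since the patent only says positions are resolved over successive inputs.

```python
# Sketch of S400: resolve each position of the wake-word model by majority
# vote over repeated inputs (the voting rule itself is an assumption).
from collections import Counter

def build_wake_word_model(phoneme_sequences):
    model = []
    for position in zip(*phoneme_sequences):     # phonemes at the same position
        model.append(Counter(position).most_common(1)[0][0])
    return model

inputs = [
    ["k", "o", "n", "g", "t", "i", "a", "o"],    # "kongtiao" (air conditioner)
    ["k", "o", "n", "g", "t", "i", "a", "o"],
    ["k", "o", "n", "k", "t", "i", "a", "o"],    # one noisy sample
]
print(build_wake_word_model(inputs))             # the noisy phoneme is outvoted
```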
S500: storing the voiceprint data and the wake-up word in association with the wake-up word model.
After the user's voiceprint data and the registered wake-up word model are obtained, the voiceprint data, the wake-up word, and the wake-up word model are stored in the voice recognition system in association with the user's information, such as a user account or user number, so that during a subsequent wake-up the wake-up word model corresponding to the user can be determined from the recognized voiceprint data and then recognized. Associating the voiceprint data and the wake-up word with the wake-up word model makes recognition and wake-up via voiceprint data more accurate.
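A minimal sketch of this associated storage, assuming a plain in-memory mapping keyed by user account (the patent does not prescribe a storage layout):

```python
# Sketch of S500: store voiceprint data and the wake-up word in association
# with the wake-word model, keyed by user information such as an account.
speech_recognition_system = {}

def register(account, voiceprint, wake_word, wake_word_model):
    speech_recognition_system[account] = {
        "voiceprint": voiceprint,
        "wake_word": wake_word,
        "model": wake_word_model,
    }

register("account-42", [0.12, 0.55], "air conditioner",
         ["k", "o", "n", "g", "t", "i", "a", "o"])
print(speech_recognition_system["account-42"]["model"])
```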
Further, referring to fig. 4, based on the method for binding a wake word in the foregoing embodiment, step S20 includes:
S20a: when a voice signal is received, judging whether the volume value of the voice signal is greater than a preset volume value;
In this embodiment, since a voiceprint is a sound-wave spectrum carrying speech information, the voiceprint itself is closely related to amplitude, frequency, pitch contour, formant bandwidth, and the like. Because sound attenuates over the transmission distance, the volume of the received voice signal falls as the distance grows, and the amplitude varies with the volume value, so the voiceprint is related to the volume of the received signal. In addition, the speech recognition engine of the voice recognition system only recognizes speech whose volume reaches a preset threshold. Therefore, to improve the accuracy of voiceprint recognition and speech recognition, it is necessary to judge whether the volume of the received voice signal is greater than a preset volume value, namely the minimum volume required for voiceprint recognition and speech recognition.
S20b: if so, acquiring the wake-up word information in the voice signal based on an acoustic model and grammatical structure, and acquiring the voiceprint information in the voice signal based on voiceprint recognition.
When the volume of the received voice signal is greater than the preset volume value, the signal is judged valid, and voiceprint recognition and acoustic-model analysis can proceed: for example, the silent sections among segments such as phonemes, syllables, and morphemes are excluded based on end-point detection, the voiceprint information is then obtained from the syllable features of the signal, and the wake-up word information is obtained from the morpheme and phoneme features together with the acoustic model and grammatical structure.
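The volume gate itself is simple; the sketch below uses a peak-amplitude proxy for volume and an assumed threshold (neither the proxy nor the value 0.2 comes from the patent).

```python
# Sketch of S20a/S20b: only extract features when the signal volume exceeds
# the preset minimum needed for voiceprint and speech recognition.
PRESET_VOLUME = 0.2          # assumed scale of 0.0-1.0

def volume_of(samples):
    return max(abs(s) for s in samples)     # simple peak-volume proxy

def handle_signal(samples):
    if volume_of(samples) <= PRESET_VOLUME:
        return None                          # too quiet: not a valid signal
    # S20b would run here: acoustic model + grammar for the wake-up word,
    # voiceprint recognition for the user information.
    return "extract wake-up word and voiceprint"

print(handle_signal([0.01, -0.05, 0.03]))    # None
print(handle_signal([0.01, -0.45, 0.30]))    # proceeds to extraction
```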
Further, referring to fig. 5, after step S30, the method for binding a wake word according to the foregoing embodiment further includes:
S40: receiving a wake-up voice signal, and extracting the wake-up word in the wake-up voice signal;
S50: when the wake-up word matches a preset wake-up word in the voice recognition system, responding to the wake-up word voice signal.
Once the user has a bound wake-up word, a received wake-up word voice signal triggers the wake-up operation: the wake-up word is extracted from the wake-up voice signal, and when it matches the preset wake-up word stored for that user in the voice recognition system, the response operation is executed, achieving accurate wake-up.
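A one-line matching rule is enough to sketch this check (the stored-preset lookup is assumed):

```python
# Sketch of S40/S50: respond only when the extracted wake-up word matches a
# preset wake-up word stored for the user in the voice recognition system.
def on_wake_signal(extracted_word, preset_words):
    return "respond" if extracted_word in preset_words else "ignore"

print(on_wake_signal("air conditioner", {"air conditioner", "fan"}))  # respond
print(on_wake_signal("television", {"air conditioner", "fan"}))       # ignore
```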
Further, in order to better perform the wake-up and reduce the error rate, referring to fig. 6, after the step S40, the method further includes:
S60: adjusting the recognition threshold of a preset wake-up word in the voice recognition system;
S70: when the wake-up word matches the adjusted preset wake-up word, responding to the wake-up word voice signal and executing the response operation.
The recognition threshold is adjusted rather than fixed, and the adjustment follows different user conditions. Specifically, referring to fig. 7, the adjusting process includes:
S201: extracting the voiceprint information in the wake-up word voice signal;
After the wake-up word information is extracted, voiceprint information is extracted from the wake-up word voice signal. This addresses the low wake-up rate that occurs when a user wakes the voice recognition system, or an intelligent device loaded with it, using a personalized or customized wake-up word. Since the core of wake-up word binding and speech recognition is a training model and a recognition model, improving the wake-up rate requires that the corresponding wake-up word model and voiceprint data be registered in the voice recognition system in advance, so the user can wake the system by inputting a matching voice signal. To further improve the wake-up rate and avoid false wake-ups caused by environmental noise, the system may first judge whether voiceprint data matching the voiceprint information exists in the voice recognition system. If it exists, step S202 is executed; if not, step S203 is executed.
S202: lowering the wake-up word recognition threshold of the voice recognition system;
When voiceprint data matching the voiceprint information exists in the voice recognition system, the current user of the intelligent device can be determined to be a registered user from the voiceprint data registered in the system, ruling out false wake-up by environmental noise or other sounds. The wake-up word recognition threshold for the user corresponding to that voiceprint data is therefore lowered, raising the probability that this user wakes the voice recognition system.
S203: raising the wake-up word recognition threshold of the voice recognition system.
When no voiceprint data matching the voiceprint information exists in the voice recognition system, it can be inferred that the voice signal may be environmental noise or may come from an unregistered user. To avoid false wake-ups caused by environmental noise and to improve the security of the voice recognition system, the wake-up word recognition threshold can be raised accordingly, increasing the wake-up difficulty.
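The two branches reduce to a small adjustment rule; the step size and bounds below are assumptions for illustration.

```python
# Sketch of S201-S203: lower the threshold for a matched (registered)
# voiceprint, raise it otherwise. Step size and [0, 1] bounds are assumed.
def adjust_threshold(threshold, voiceprint_matched, step=0.05):
    if voiceprint_matched:
        return max(0.0, threshold - step)    # registered user: easier to wake
    return min(1.0, threshold + step)        # noise/unregistered: harder to wake

print(adjust_threshold(0.6, True))    # lowered: easier to wake
print(adjust_threshold(0.6, False))   # raised: harder to wake
```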
Further, referring to fig. 8, after step S201, the wake-up word binding method of the foregoing embodiment further includes:
S204: calculating the similarity between the voiceprint information and the voiceprint data registered in the voice recognition system according to a preset voiceprint model;
In this embodiment, when judging whether voiceprint data matching the voiceprint information in the voice signal exists in the voice recognition system, the similarity between the two may be calculated based on a preset voiceprint model, to improve the accuracy of voiceprint recognition and thus the subsequent wake-up rate. Specifically, tone A in the voiceprint information may be segmented into syllable states based on the preset voiceprint model, tone S in the voiceprint data is segmented the same way, and the coincidence degree of the corresponding state syllables of tone A and tone S is compared; that coincidence degree is the similarity. In other embodiments, the similarity may instead be calculated by comparing rhythm B in the voiceprint information of the voice signal with rhythm D in the voiceprint data.
S205: when the similarity is within a preset range, determining that voiceprint data matching the voiceprint information exists in the voice recognition system;
When the coincidence degree of the state syllables of tone A and tone S is within the preset range, it is determined that voiceprint data matching the voiceprint information exists in the voice recognition system.
S206: when the similarity is outside the preset range, determining that no voiceprint data matching the voiceprint information exists in the voice recognition system.
When the coincidence degree of the state syllables of tone A and tone S is outside the preset range, it is determined that no voiceprint data matching the voiceprint information exists in the voice recognition system.
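Read as code, the coincidence-of-states comparison might look like the sketch below; the state segmentation is stubbed out, standing in for the preset voiceprint model.

```python
# Sketch of S204-S206: per-state coincidence of two syllable-state sequences
# as the similarity; the segmentation into states is assumed already done.
def coincidence(states_a, states_s):
    hits = sum(1 for a, s in zip(states_a, states_s) if a == s)
    return hits / max(len(states_a), len(states_s))

def voiceprint_matches(info_states, data_states, low=0.8, high=1.0):
    return low <= coincidence(info_states, data_states) <= high   # preset range

print(voiceprint_matches(["s1", "s2", "s3", "s4"],
                         ["s1", "s2", "s3", "s5"]))   # 0.75 -> False
```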
Further, referring to fig. 9, in the wake-up word binding method of the foregoing embodiment, step S203 includes:
S2031: when no voiceprint data matching the voiceprint information exists in the voice recognition system, acquiring the current user's state information and image information;
In this embodiment, when the coincidence degree of the state syllables of tone A and tone S is outside the preset range, it is determined that no voiceprint data matching the voiceprint information exists in the voice recognition system. At this point the user may have input an unregistered wake-up word, or the signal may be environmental noise, so the current user's state information and image information must be acquired to determine whether the current user is registered and whether the received voice signal is environmental noise.
S2032: when it is detected that the current user is not speaking, is outside the recognition range of the voice recognition system, or is not registered, raising the wake-up word recognition threshold of the voice recognition system.
When the acquired state information shows that the user is not speaking, or that the user is outside the recognition range of the voice recognition system, the received voice signal is judged to be environmental noise; to reduce false wake-ups caused by such noise, the wake-up word recognition threshold is raised, increasing the wake-up difficulty and lowering the false wake-up rate. Likewise, when the acquired image information shows that the current user is unregistered, the wake-up word recognition threshold is raised to increase the wake-up difficulty and improve the security of voice recognition.
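The decision can be sketched as below; the three boolean checks stand in for the state and image analysis, which the patent leaves unspecified.

```python
# Sketch of S2031/S2032: with no voiceprint match, classify the signal as
# noise or an unregistered user and raise the threshold in either case.
def on_no_voiceprint_match(threshold, user_speaking, in_range, registered):
    if not user_speaking or not in_range:
        reason = "environmental noise"
    elif not registered:
        reason = "unregistered user"
    else:
        return threshold, "inconclusive"       # checks did not settle the case
    return min(1.0, threshold + 0.05), reason  # assumed step size and bound

print(on_no_voiceprint_match(0.6, user_speaking=False,
                             in_range=True, registered=True))
```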
Further, referring to fig. 10, based on the wake-up word binding method of the foregoing embodiment, step S70 includes:
S71: counting the matching degree between the wake-up word information in the received voice signal and a wake-up word model registered with the voice recognition system;
In this embodiment, the wake-up word information in the voice signal is mainly matched against the wake-up word model, and the specific matching may be the degree of agreement between phoneme arrangements. For example, when the wake-up word model contains 48 phonemes, the wake-up word information in the received voice signal is counted, that is, its phoneme features are tallied; it is then checked whether the number of matching phonemes reaches a preset count, and the arrangement of the phonemes is further compared.
S72: when the matching degree reaches the lowered or raised wake-up word recognition threshold, waking the voice recognition system or the intelligent device where it is located.
When the phonemes in the wake-up word information reach the preset count and the coincidence rate of their arrangement is greater than the preset threshold, the matching degree between the wake-up word information and the wake-up word model has reached the lowered or raised recognition threshold. The voice signal can then be responded to, for example by waking the voice recognition system or the intelligent device where it is located, so that subsequent voice control or interaction instructions from the user are recognized and the corresponding control or interaction actions are performed, improving the intelligence of the intelligent device.
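A sketch of this final check follows; the preset-count rule and the threshold comparison are modeled on the description above, with the exact formulas assumed.

```python
# Sketch of S71/S72: count matching phonemes and their ordering agreement,
# then wake only if the matching degree reaches the adjusted threshold.
def matching_degree(heard, model):
    count_ok = len(set(heard) & set(model)) >= len(set(model)) - 1  # assumed preset count
    order_hits = sum(1 for h, m in zip(heard, model) if h == m)
    return order_hits / len(model) if count_ok else 0.0

def should_wake(heard, model, adjusted_threshold):
    return matching_degree(heard, model) >= adjusted_threshold

model = ["k", "o", "n", "g", "t", "i", "a", "o"]
print(should_wake(["k", "o", "n", "g", "t", "i", "a", "o"], model, 0.55))  # True
```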
In addition, an embodiment of the present invention further provides a storage medium storing a wake-up word binding program which, when executed by a processor, implements the steps of the wake-up word binding method described above.
For the methods implemented when the wake-up word binding program is executed, reference may be made to the embodiments of the wake-up word binding method of the present invention, which are not repeated here.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should be noted that, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names.
While preferred embodiments of the present invention have been described, additional variations and modifications to those embodiments may occur to those skilled in the art once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made to the present invention without departing from its spirit and scope. Thus, if such modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is intended to include them as well.

Claims (5)

1. A method for binding a wake-up word, the method comprising:
step S1, collecting a voice signal uttered by a user;
step S2, extracting wake-up word information and user information from the voice signal;
step S3, binding the user information and the wake-up word information to the user;
the step S3 includes:
step S31, acquiring a wake-up word model registered by the user in a voice recognition system, and binding the user information and the wake-up word to the wake-up word model;
when the user information is voiceprint information, the step S31 includes:
step S311, collecting wake-up word voice signals input by the user multiple times;
step S312, acquiring the rhythm features, tone features, and phoneme features in the wake-up word voice signal of each input;
step S313, performing acoustic feature processing on the rhythm features and tone features acquired each time, and registering the processed rhythm feature information and tone feature information as voiceprint data of the user; specifically, after the first input of the wake-up word voice signal, rhythm feature 1 and tone feature 1 are obtained by voiceprint recognition, and rhythm feature 2 and tone feature 2 are obtained from the second input; when the difference between them is large, rhythm feature 1 is optimized using rhythm feature 2 and tone feature 1 is optimized using tone feature 2, and this continues until the difference between the newly acquired rhythm feature n and the current rhythm feature n-1, and between tone feature n and tone feature n-1, falls within a preset range; the current rhythm feature and tone feature, after acoustic feature processing, are then registered as the voiceprint data of the user;
step S314, ordering and combining the phoneme features acquired from each input based on a preset acoustic model to obtain the wake-up word model;
step S315, storing the voiceprint data and the wake-up word in association with the wake-up word model;
after the step S3, the method further includes:
step S4, receiving a wake-up voice signal, and extracting the wake-up word from the wake-up voice signal;
step S5, when the wake-up word matches a preset wake-up word in the voice recognition system, responding to the wake-up voice signal;
after the step S4, the method further includes:
step S6, adjusting the recognition threshold of the preset wake-up word in the voice recognition system, so that the wake-up word is not fixed but adapts to different users and conditions;
step S7, when the wake-up word matches the adjusted preset wake-up word, responding to the wake-up voice signal;
the step S6 includes:
step S61, extracting voiceprint information from the wake-up voice signal;
step S62, when no voiceprint data matching the voiceprint information exists in the voice recognition system, raising the wake-up word recognition threshold of the voice recognition system;
and step S63, when voiceprint data matching the voiceprint information exists in the voice recognition system, lowering the wake-up word recognition threshold of the voice recognition system.
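For illustration only (not part of the claim language), the following is a minimal sketch of the iterative registration loop of steps S311 to S315. The functions extract_features and build_wake_word_model, the constants DIFF_THRESHOLD and ALPHA, and the blending update rule are all hypothetical stand-ins; the claim does not fix a particular feature extractor or optimization method.

```python
# Minimal sketch of the step S311-S315 registration loop, under assumptions:
# extract_features, build_wake_word_model, DIFF_THRESHOLD, and ALPHA are
# illustrative placeholders, not the patented acoustic processing itself.
import numpy as np

DIFF_THRESHOLD = 0.1  # hypothetical "preset range" for feature convergence
ALPHA = 0.5           # hypothetical blending weight used to "optimize" features

def extract_features(audio: np.ndarray):
    """Placeholder for step S312: returns (rhythm_vec, tone_vec, phonemes)."""
    raise NotImplementedError("replace with a real prosody/voiceprint front end")

def build_wake_word_model(phoneme_seqs):
    """Placeholder for step S314: order and combine phoneme features
    with a preset acoustic model to form the wake-up word model."""
    raise NotImplementedError

def register_wake_word(utterances):
    rhythm = tone = None
    phoneme_seqs = []
    for audio in utterances:                      # step S311: repeated inputs
        r, t, phonemes = extract_features(audio)  # step S312
        phoneme_seqs.append(phonemes)
        if rhythm is None:                        # first input seeds the features
            rhythm, tone = r, t
            continue
        # step S313: while successive inputs still differ too much, blend the
        # new features into the current estimate and keep collecting
        if (np.linalg.norm(r - rhythm) > DIFF_THRESHOLD
                or np.linalg.norm(t - tone) > DIFF_THRESHOLD):
            rhythm = (1 - ALPHA) * rhythm + ALPHA * r
            tone = (1 - ALPHA) * tone + ALPHA * t
    voiceprint = np.concatenate([rhythm, tone])   # registered voiceprint data
    model = build_wake_word_model(phoneme_seqs)   # step S314
    return voiceprint, model                      # stored together, step S315
```

The point the claim captures is convergence: each new recording refines the stored rhythm and tone features until consecutive inputs agree to within the preset range, and only then is the voiceprint registered alongside the wake-up word model.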
2. The wake-up word binding method according to claim 1, wherein the step S2 comprises:
step S21, when a voice signal is received, judging whether the volume value of the voice signal is greater than a preset volume value;
and step S22, if so, acquiring wake-up word information from the voice signal based on an acoustic model and a grammar structure, and acquiring voiceprint information from the voice signal based on voiceprint recognition.
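A minimal sketch of the volume gate of steps S21 and S22 follows; it is illustrative only. The RMS-in-dBFS loudness measure, the -30 dBFS preset value, and the two recognizer stubs are assumptions, since the claim only requires that extraction proceeds when the volume exceeds a preset value.

```python
# Minimal sketch of the step S21-S22 volume gate; the dBFS metric, the
# preset value, and the recognizer stubs are illustrative assumptions.
import numpy as np

PRESET_VOLUME_DB = -30.0   # hypothetical preset volume value (dBFS)

def recognize_wake_word(samples: np.ndarray) -> str:
    raise NotImplementedError("acoustic model + grammar structure, step S22")

def extract_voiceprint(samples: np.ndarray) -> np.ndarray:
    raise NotImplementedError("voiceprint recognition, step S22")

def extract_if_audible(samples: np.ndarray):
    rms = np.sqrt(np.mean(np.square(samples)))
    level_db = 20 * np.log10(max(rms, 1e-10))      # guard against log(0)
    if level_db <= PRESET_VOLUME_DB:               # step S21: volume gate
        return None                                # too quiet, ignore signal
    return recognize_wake_word(samples), extract_voiceprint(samples)
```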
3. The wake-up word binding method according to claim 1, wherein after the step S61, the method further comprises:
step S64, calculating the similarity between the voiceprint information and the voiceprint data registered in the voice recognition system according to a preset voiceprint model;
step S65, when the similarity is within a preset range, determining that voiceprint data matching the voiceprint information exists in the voice recognition system;
and step S66, when the similarity is outside the preset range, determining that no voiceprint data matching the voiceprint information exists in the voice recognition system.
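As an illustration of steps S64 to S66 combined with the threshold adjustment of steps S62 and S63, the sketch below uses cosine similarity and concrete threshold values; all of these are assumptions, since the claims leave the "preset voiceprint model", the "preset range", and the threshold magnitudes open.

```python
# Minimal sketch of steps S64-S66 plus the S62/S63 threshold adjustment;
# cosine similarity, the similarity range, and the threshold values are
# hypothetical stand-ins for the unspecified preset voiceprint model.
import numpy as np

SIM_LOW, SIM_HIGH = 0.8, 1.0   # hypothetical preset similarity range
STRICT, RELAXED = 0.70, 0.35   # hypothetical wake-up word recognition thresholds

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def adjusted_threshold(voiceprint: np.ndarray,
                       registered: list[np.ndarray]) -> float:
    """Lower the threshold for registered speakers, raise it for unknown ones."""
    matched = any(SIM_LOW <= cosine_similarity(voiceprint, v) <= SIM_HIGH
                  for v in registered)     # steps S64-S65 (else step S66)
    return RELAXED if matched else STRICT  # step S63 / step S62
```

The effect is that a registered speaker wakes the device more easily (lower threshold), while an unknown speaker must match the preset wake-up word more strictly (higher threshold).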
4. A smart device loaded with a voice recognition system, the smart device further comprising a memory, a processor, and a wake-up word binding application program stored in the memory and executable on the processor, the voice recognition system being coupled to the processor, wherein:
the voice recognition system is configured to respond to voice signals that meet a wake-up condition;
and the wake-up word binding application program, when executed by the processor, performs the steps of the wake-up word binding method according to any one of claims 1 to 3.
5. A storage medium storing a wake-up word binding application program which, when executed by a processor, implements the steps of the wake-up word binding method according to any one of claims 1 to 3.
CN201810407844.6A 2018-04-28 2018-04-28 Wake-up word binding method, intelligent device and storage medium Active CN108735209B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810407844.6A CN108735209B (en) 2018-04-28 2018-04-28 Wake-up word binding method, intelligent device and storage medium

Publications (2)

Publication Number Publication Date
CN108735209A (en) 2018-11-02
CN108735209B (en) 2021-01-08

Family

ID=63939486

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810407844.6A Active CN108735209B (en) 2018-04-28 2018-04-28 Wake-up word binding method, intelligent device and storage medium

Country Status (1)

Country Link
CN (1) CN108735209B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109273005B (en) * 2018-12-11 2024-10-01 胡应章 Sound control output device
CN109887508A (en) * 2019-01-25 2019-06-14 广州富港万嘉智能科技有限公司 A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print
CN110062309B (en) * 2019-04-28 2021-04-27 百度在线网络技术(北京)有限公司 Method and device for controlling intelligent loudspeaker box
CN110556107A (en) * 2019-08-23 2019-12-10 宁波奥克斯电气股份有限公司 control method and system capable of automatically adjusting voice recognition sensitivity, air conditioner and readable storage medium
CN110600029A (en) * 2019-09-17 2019-12-20 苏州思必驰信息科技有限公司 User-defined awakening method and device for intelligent voice equipment
CN111128155B (en) * 2019-12-05 2020-12-01 珠海格力电器股份有限公司 Awakening method, device, equipment and medium for intelligent equipment
CN111124512B (en) * 2019-12-10 2020-12-08 珠海格力电器股份有限公司 Awakening method, device, equipment and medium for intelligent equipment
CN111240634A (en) * 2020-01-08 2020-06-05 百度在线网络技术(北京)有限公司 Sound box working mode adjusting method and device
CN111261171A (en) * 2020-01-17 2020-06-09 厦门快商通科技股份有限公司 Method and system for voiceprint verification of customizable text
CN112053689A (en) * 2020-09-11 2020-12-08 深圳市北科瑞声科技股份有限公司 Method and system for operating equipment based on eyeball and voice instruction and server
CN112382288B (en) * 2020-11-11 2024-04-02 湖南常德牌水表制造有限公司 Method, system, computer device and storage medium for voice debugging device
CN112530424A (en) * 2020-11-23 2021-03-19 北京小米移动软件有限公司 Voice processing method and device, electronic equipment and storage medium
CN112700782A (en) * 2020-12-25 2021-04-23 维沃移动通信有限公司 Voice processing method and electronic equipment
CN113327593B (en) * 2021-05-25 2024-04-30 上海明略人工智能(集团)有限公司 Device and method for corpus acquisition, electronic equipment and readable storage medium
CN113380257B (en) * 2021-06-08 2024-09-24 深圳市同行者科技有限公司 Response method, device, equipment and storage medium of multi-terminal intelligent home
CN113808585A (en) * 2021-08-16 2021-12-17 百度在线网络技术(北京)有限公司 Earphone awakening method, device, equipment and storage medium
CN114333828A (en) * 2022-03-08 2022-04-12 深圳市华方信息产业有限公司 Quick voice recognition system for digital product
CN115132195B (en) * 2022-05-12 2024-03-12 腾讯科技(深圳)有限公司 Voice wakeup method, device, equipment, storage medium and program product
CN117153166B (en) * 2022-07-18 2024-07-12 荣耀终端有限公司 Voice wakeup method, equipment and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1567431A (en) * 2003-07-10 2005-01-19 上海优浪信息科技有限公司 Method and system for identifying status of speaker
CN105575395A (en) * 2014-10-14 2016-05-11 中兴通讯股份有限公司 Voice wake-up method and apparatus, terminal, and processing method thereof
CN106338924A (en) * 2016-09-23 2017-01-18 广州视源电子科技股份有限公司 Method and device for automatically adjusting equipment operation parameter threshold
CN106358061A (en) * 2016-11-11 2017-01-25 四川长虹电器股份有限公司 Television voice remote control system and television voice remote control method
CN107945806A (en) * 2017-11-10 2018-04-20 北京小米移动软件有限公司 User identification method and device based on sound characteristic

Also Published As

Publication number Publication date
CN108735209A (en) 2018-11-02

Similar Documents

Publication Publication Date Title
CN108735209B (en) Wake-up word binding method, intelligent device and storage medium
CN108711430B (en) Speech recognition method, intelligent device and storage medium
CN109427333B (en) Method for activating speech recognition service and electronic device for implementing said method
CN108320742B (en) Voice interaction method, intelligent device and storage medium
CN110890093B (en) Intelligent equipment awakening method and device based on artificial intelligence
CN107360327B (en) Speech recognition method, apparatus and storage medium
US10825453B2 (en) Electronic device for providing speech recognition service and method thereof
CN110570840B (en) Intelligent device awakening method and device based on artificial intelligence
CN108494947B (en) Image sharing method and mobile terminal
CN108712566B (en) Voice assistant awakening method and mobile terminal
CN109509473B (en) Voice control method and terminal equipment
CN112735418B (en) Voice interaction processing method, device, terminal and storage medium
CN109065060B (en) Voice awakening method and terminal
CN106203235B (en) Living body identification method and apparatus
KR20180047801A (en) Electronic apparatus and controlling method thereof
CN111522592A (en) Intelligent terminal awakening method and device based on artificial intelligence
WO2022227507A1 (en) Wake-up degree recognition model training method and speech wake-up degree acquisition method
CN109754823A (en) A kind of voice activity detection method, mobile terminal
CN110830368A (en) Instant messaging message sending method and electronic equipment
CN113782012A (en) Wake-up model training method, wake-up method and electronic equipment
CN111292727B (en) Voice recognition method and electronic equipment
CN111526244A (en) Alarm clock processing method and electronic equipment
CN114093357A (en) Control method, intelligent terminal and readable storage medium
CN111880988B (en) Voiceprint wake-up log collection method and device
CN111479005B (en) Volume adjusting method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant