CN110418245B

CN110418245B - Method and device for reducing reaction delay of Bluetooth sound box and terminal equipment

Info

Publication number: CN110418245B
Application number: CN201810401906.2A
Authority: CN
Inventors: 程雯; 吴海全; 唐大勇; 张恩勤; 曹磊; 师瑞文
Original assignee: Shenzhen Grandsun Electronics Co Ltd
Current assignee: Shenzhen Grandsun Electronics Co Ltd
Priority date: 2018-04-28
Filing date: 2018-04-28
Publication date: 2021-03-19
Anticipated expiration: 2038-04-28
Also published as: CN110418245A; WO2019206235A1

Abstract

The invention is applicable to the technical field of information processing and provides a method, an apparatus, a terminal device, and a computer-readable storage medium for reducing the reaction delay of a Bluetooth speaker. The method comprises: after receiving wake-up information, executing a wake-up interrupt operation to wake up the Bluetooth speaker; after the Bluetooth speaker is woken up, caching the received voice data; detecting whether invalid data exists in the cached voice data; and, if invalid data exists in the cached voice data, determining that the data other than the invalid data in the cached voice data is valid data. The method and the apparatus can reduce the reaction delay of the Bluetooth speaker when it establishes a Bluetooth connection and transmits data, and are highly practical.

Description

Method and device for reducing reaction delay of Bluetooth sound box and terminal equipment

Technical Field

The invention belongs to the technical field of information processing, and particularly relates to a method and a device for reducing reaction delay of a Bluetooth sound box, terminal equipment and a computer readable storage medium.

Background

With the rapid development of voice recognition technology, most existing smart speakers have a voice interaction function, for example Wi-Fi-based smart speakers (Wi-Fi speakers for short) and Bluetooth-based smart speakers (Bluetooth speakers for short). Compared with Wi-Fi speakers, Bluetooth speakers are favored by more and more users because of their small size, low power consumption, and good portability.

However, a Bluetooth speaker cannot connect directly to the internet. When a user needs to interact with the Bluetooth speaker by voice, the speaker must establish a Bluetooth connection with a terminal such as a mobile phone and send the user's voice data to the mobile phone over Bluetooth; the mobile phone then uploads the voice data to a server for analysis. After the server completes the analysis, the returned voice data must likewise be sent to the mobile phone first, and the mobile phone then transmits it to the Bluetooth speaker over Bluetooth for playback. Compared with a Wi-Fi speaker, the Bluetooth speaker spends additional time establishing the Bluetooth connection and transmitting data over Bluetooth, so its reaction delay is larger than that of the Wi-Fi speaker.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, a terminal device, and a computer-readable storage medium for reducing a reaction delay of a bluetooth speaker, so as to solve a problem that the reaction delay is large when an existing bluetooth speaker performs bluetooth connection and data transmission.

The first aspect of the embodiments of the present invention provides a method for reducing a reaction delay of a bluetooth speaker, including:

after receiving the awakening information, executing awakening interruption operation to awaken the Bluetooth sound box;

after the Bluetooth sound box is awakened, caching the received voice data;

detecting whether invalid data exists in the cached voice data;

and if invalid data exists in the cached voice data, determining that the data except the invalid data in the cached voice data is valid data.

A second aspect of the embodiments of the present invention provides an apparatus for reducing a reaction delay of a bluetooth speaker, including:

the awakening module is used for executing awakening interrupt operation to awaken the Bluetooth sound box after receiving the awakening information;

the buffer module is used for buffering the received voice data after the Bluetooth sound box is awakened;

the detection module is used for detecting whether invalid data exists in the cached voice data;

and the processing module is used for determining that the data except the invalid data in the cached voice data is valid data if the invalid data exists in the cached voice data.

A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the method for reducing the reaction delay of the bluetooth speaker as described above when executing the computer program.

A fourth aspect of an embodiment of the present invention provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the method for reducing the reaction delay of the bluetooth speaker as described above.

Compared with the prior art, the embodiments of the invention have the following beneficial effects: after receiving the wake-up information, the embodiment of the invention executes the wake-up interrupt operation to wake up the Bluetooth speaker; after the Bluetooth speaker is woken up, the received voice data is cached; whether invalid data exists in the cached voice data is detected; and if invalid data exists in the cached voice data, the data other than the invalid data in the cached voice data is determined to be valid data. By detecting and identifying the invalid data in the cached voice data, the embodiment of the invention can effectively reduce the reaction delay of the Bluetooth speaker when it establishes a Bluetooth connection and transmits data, which greatly improves the user experience and gives the method and the apparatus strong usability and practicability.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from these drawings without inventive effort.

Fig. 1 is a schematic flow chart of an implementation of the method for reducing the reaction delay of a Bluetooth speaker according to a first embodiment of the present invention;

Fig. 2 is a schematic flow chart of an implementation of the method for reducing the reaction delay of a Bluetooth speaker according to a second embodiment of the present invention;

Fig. 3 is a schematic diagram of an apparatus for reducing the reaction delay of a Bluetooth speaker according to a third embodiment of the present invention;

Fig. 4 is a schematic diagram of a terminal device according to a fourth embodiment of the present invention.

Detailed Description

In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.

It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.

As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".

In order to explain the technical means of the present invention, the following description will be given by way of specific examples.

Fig. 1 is a schematic flow chart of an implementation of the method for reducing the reaction delay of a Bluetooth speaker according to a first embodiment of the present invention. As shown in Fig. 1, the method may include the following steps:

Step S101, after receiving the wake-up information, executing a wake-up interrupt operation to wake up the Bluetooth speaker.

For example, in the embodiment of the present invention, the wake-up information may be obtained from a wake-up word, a wake-up instruction, a key-switch wake-up, a timed wake-up, and the like. For instance, a voice signal collected by the microphone of the Bluetooth speaker may be converted into a digital signal by an analog-to-digital converter; the digital signal is sent to a microcontroller or a digital signal processor and processed by an acoustic echo cancellation (AEC) algorithm; the processed voice data is then checked for a preset wake-up condition, such as a wake-up word or a wake-up instruction, and if the condition is present, the wake-up information is obtained. The wake-up information may also be obtained through wake-up modes such as key-switch wake-up and timed wake-up. The wake-up interrupt operation may, for example, generate a trigger signal, such as a level change, after the wake-up information is received, so as to wake up the Bluetooth speaker.

Step S102, after the Bluetooth speaker is woken up, caching the received voice data.

Optionally, after the Bluetooth speaker is woken up, a request to establish a Bluetooth connection may be sent to the terminal device. The Bluetooth connection may be a Bluetooth Synchronous Connection-Oriented (SCO) connection, so that voice information can be transmitted synchronously.

Step S103, detecting whether invalid data exists in the cached voice data.

Step S104, if invalid data exists in the cached voice data, determining that data except the invalid data in the cached voice data is valid data.

It should be noted that, in the embodiment of the present invention, the detection may proceed in either direction. Voice data may be judged for invalidity, in which case data that is not invalid is treated as valid; alternatively, voice data may be judged for validity, in which case data that is not valid is determined to be invalid. Either approach serves to detect whether invalid data exists in the cached voice data.

Optionally, after the invalid data and/or the valid data in the cached voice data has been determined, the invalid data and/or the valid data may be further processed. For example, the invalid data may be deleted, or an instruction not to transmit the invalid data may be generated.

Optionally, in this embodiment of the present invention, detecting whether invalid data exists in the cached voice data, and, if invalid data exists, determining that the data other than the invalid data is valid data, may include: calculating, in the order in which the voice data was cached, the absolute value of the amplitude of each single frame of the voice data, and comparing the absolute value with a first predetermined value; if the absolute value of the amplitude of the Nth frame data is less than or equal to the first predetermined value, determining that the Nth frame data is invalid data; otherwise, if the absolute value of the amplitude of the Nth frame data is greater than the first predetermined value, determining that the Nth frame data and the M frames of data before it are valid data, and that the data of the 1st frame through the (N-M)th frame is invalid data, where N and M are integers not less than zero and M is less than N.

To prevent abrupt jumps in the data, the Nth frame data and the M frames of data before it are all determined to be valid data. This leaves a margin before the first frame that exceeds the threshold, so that valid information at the start of the speech is not lost.

Optionally, when the absolute value of the amplitude of the Nth frame data is detected to be greater than the first predetermined value, the detection of whether invalid data exists in the cached voice data may be stopped. The detection thus covers only the data cached between the start of caching and the moment the user is detected to start speaking, that is, the data that carries no valid voice input.
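The amplitude-threshold check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `split_by_amplitude`, the representation of each frame as a single amplitude value, and the treatment of all frames after the first loud frame as valid (because detection stops there) are assumptions made for the example.

```python
from typing import List, Tuple


def split_by_amplitude(
    frames: List[float], threshold: float, m: int
) -> Tuple[List[float], List[float]]:
    """Split cached frames into (invalid, valid) with a per-frame threshold.

    frames:    one amplitude per frame, in caching order (assumed layout).
    threshold: the "first predetermined value" from the text.
    m:         number of frames kept before the first loud frame (M).
    """
    for idx, amplitude in enumerate(frames):
        if abs(amplitude) > threshold:
            # Frame idx plays the role of the Nth frame: keep it together
            # with up to m frames before it as a safety margin.
            start = max(0, idx - m)
            return frames[:start], frames[start:]
    # No frame exceeded the threshold, so nothing cached so far is valid.
    return list(frames), []
```

For instance, with a threshold of 0.1 and M = 1, the frames `[0.0, 0.01, -0.02, 0.5, 0.3]` would be split into the invalid part `[0.0, 0.01]` and the valid part `[-0.02, 0.5, 0.3]`.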

Optionally, in this embodiment of the present invention, detecting whether invalid data exists in the cached voice data, and, if invalid data exists, determining that the data other than the invalid data is valid data, may include: grouping the cached voice data; and calculating the root mean square value of the amplitude of each group of voice data using the following formula:

$$D_e = \sqrt{\frac{1}{K}\sum_{n=1}^{K} D_n^{2}}$$

where De is the root mean square value of the amplitude of the group of voice data, K is the number of frames in the group of voice data, K is an integer greater than zero, and Dn is the amplitude of the nth frame data in the group of voice data; if the root mean square value is less than a second predetermined value, determining that the group of voice data is invalid data; and if the root mean square value is greater than or equal to the second predetermined value, determining that the group of voice data is valid data.

Optionally, when the root mean square value of the amplitude of a group of voice data is detected to be greater than or equal to the second predetermined value, the detection of whether invalid data exists in the cached voice data may be stopped.
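As a rough sketch of the grouped root-mean-square check, the following example assumes that each frame is represented by one amplitude value and that groups have a fixed size; the names `group_rms` and `classify_groups` and the fixed group size are hypothetical and are not specified in the description.

```python
import math
from typing import List, Tuple


def group_rms(group: List[float]) -> float:
    """Root mean square of the frame amplitudes in one group (De)."""
    return math.sqrt(sum(d * d for d in group) / len(group))


def classify_groups(
    frames: List[float], group_size: int, threshold: float
) -> List[Tuple[List[float], bool]]:
    """Group the cached frames and flag each group as valid or invalid.

    threshold is the "second predetermined value"; a group is valid when
    its RMS is greater than or equal to it, and invalid otherwise.
    """
    results = []
    for start in range(0, len(frames), group_size):
        group = frames[start:start + group_size]
        results.append((group, group_rms(group) >= threshold))
    return results
```

Compared with a single-frame comparison, averaging over a group makes the decision less sensitive to an isolated spike in one frame, at the cost of a coarser boundary between invalid and valid data.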

Optionally, in this embodiment of the present invention, detecting whether invalid data exists in the cached voice data, and, if invalid data exists, determining that the data other than the invalid data is valid data, may include: acquiring, in the order in which the voice data was cached, the amplitude of each single frame of the voice data, and feeding the acquired amplitude as input data into the following formula to obtain output data:

Dp(n) = (1-a) × Dp(n-1) + a × |D(n)|

where Dp(n) is the output data of the nth frame, D(n) is the amplitude of the nth frame data in the voice data, and a is a constant; if the Nth frame output data is less than a third predetermined value, determining that the Nth frame output data is invalid data; otherwise, if the Nth frame output data is greater than or equal to the third predetermined value, determining that the Nth frame output data and the M frames of data before it are valid data, and that the output data of the 1st frame through the (N-M)th frame is invalid data, where N and M are integers not less than zero and M is less than N.

The amplitude of each acquired single frame may be processed by an infinite impulse response (IIR) low-pass filter to obtain the output data. The IIR low-pass filter removes high-frequency components from the voice data and thereby performs envelope detection. The value of the constant a may be preset, or a value of a that satisfies a preset condition may be obtained through testing.

Optionally, when the absolute value of the amplitude of the Nth frame data is detected to be greater than the first predetermined value, the detection of whether invalid data exists in the cached voice data may be stopped. The detection thus covers only the data cached between the start of caching and the moment the user is detected to start speaking, that is, the data that carries no valid voice input.
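The envelope-detection variant can be illustrated with a first-order IIR low-pass filter that follows the recurrence Dp(n) = (1-a) × Dp(n-1) + a × |D(n)|. The sketch below is illustrative only: the initial value Dp(0) = 0, the function names `envelope` and `first_valid_index`, and the zero-based indexing are assumptions, since the description does not fix them.

```python
from typing import List, Optional


def envelope(amplitudes: List[float], a: float, initial: float = 0.0) -> List[float]:
    """Apply Dp(n) = (1 - a) * Dp(n - 1) + a * |D(n)| to each frame.

    initial is the assumed starting value Dp(0); the text does not give one.
    """
    outputs = []
    previous = initial
    for d in amplitudes:
        previous = (1 - a) * previous + a * abs(d)
        outputs.append(previous)
    return outputs


def first_valid_index(
    amplitudes: List[float], a: float, threshold: float, m: int
) -> Optional[int]:
    """Return the zero-based index of the first frame to keep, or None.

    threshold is the "third predetermined value"; m frames before the
    first frame whose envelope reaches the threshold are also kept.
    """
    for idx, value in enumerate(envelope(amplitudes, a)):
        if value >= threshold:
            return max(0, idx - m)
    return None
```

A smaller value of a gives a smoother envelope that reacts more slowly to the onset of speech, which is one reason the margin of M earlier frames is useful.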

In the embodiment of the invention, after the wake-up information is received, the wake-up interrupt operation is executed to wake up the Bluetooth speaker; after the Bluetooth speaker is woken up, the received voice data is cached; whether invalid data exists in the cached voice data is detected; and if invalid data exists in the cached voice data, the data other than the invalid data in the cached voice data is determined to be valid data. By detecting and identifying the invalid data in the cached voice data, the embodiment of the invention can effectively reduce the reaction delay of the Bluetooth speaker when it establishes a Bluetooth connection and transmits data, which greatly improves the user experience and gives the method and the apparatus strong usability and practicability.

Fig. 2 is a schematic flow chart of an implementation of the method for reducing the reaction delay of a Bluetooth speaker according to a second embodiment of the present invention. As shown in Fig. 2, the method may include the following steps:

Step S201, after receiving the wake-up information, executing a wake-up interrupt operation to wake up the Bluetooth speaker.

Step S202, after the Bluetooth speaker is woken up, caching the received voice data.

Step S203, detecting whether invalid data exists in the cached voice data.

Step S204, if invalid data exists in the cached voice data, determining that data other than the invalid data in the cached voice data is valid data.

Steps S201, S202, S203, and S204 in this embodiment are the same as steps S101, S102, S103, and S104, and specific reference may be made to the description of steps S101, S102, S103, and S104, which is not repeated herein.

Step S205, if it is detected that the Bluetooth speaker starts to send voice data, stopping the detection of whether invalid data exists in the cached voice data.

Optionally, in this embodiment of the present invention, after detecting whether invalid data exists in the cached voice data, the method may further include:

If connection information indicating that the Bluetooth speaker has successfully established a Bluetooth connection with the terminal is received, and the cached voice data includes valid data, the Bluetooth speaker sends the valid data to the terminal with which the Bluetooth connection has been established.

In the embodiment of the invention, the terminal may be a mobile phone, a tablet computer, or another device capable of establishing a Bluetooth connection with the Bluetooth speaker. The connection information indicating that the Bluetooth speaker and the terminal have successfully established the Bluetooth connection may be notification information that the terminal sends to the Bluetooth speaker after the connection is established, or information obtained by detecting the Bluetooth connection state of the Bluetooth speaker, or the like.

Optionally, after detecting whether invalid data exists in the cached voice data, the method may further include:

If the duration or the number of frames of the cached voice data reaches a predetermined threshold and the cached voice data does not contain valid data, the Bluetooth speaker enters a dormant state and waits for the next wake-up.

In this embodiment of the present invention, the predetermined threshold may be set according to whether the quantity being judged is the duration of the voice data or the number of frames of the voice data. For example, the predetermined threshold for the duration of the voice data may be set to 10 seconds, 30 seconds, or the like, and the predetermined threshold for the number of frames of the voice data may be 1000 frames or the like, which is not limited herein.

By putting the Bluetooth speaker into the dormant state when the duration or the number of frames of the cached voice data reaches the predetermined threshold and no valid data is detected in the cached voice data, the power consumption of the Bluetooth speaker can be reduced in situations such as a false wake-up or the user leaving temporarily. In addition, identifying and handling the large amount of invalid data cached in the Bluetooth speaker reduces the reaction delay.
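The dormancy condition can be sketched as a simple predicate. The `CacheStatus` type, its field names, and the function `should_sleep` are hypothetical; the default limits of 10 seconds and 1000 frames are taken from the example values mentioned in the description and are not mandatory.

```python
from dataclasses import dataclass


@dataclass
class CacheStatus:
    """Snapshot of the cache; all field names are illustrative assumptions."""
    duration_s: float      # duration of the cached voice data, in seconds
    frame_count: int       # number of cached frames
    has_valid_data: bool   # whether any valid data has been detected


def should_sleep(
    status: CacheStatus, max_duration_s: float = 10.0, max_frames: int = 1000
) -> bool:
    """Return True when the speaker should go dormant.

    The speaker sleeps only if a limit (duration or frame count) has been
    reached and the cache still contains no valid data.
    """
    limit_reached = (
        status.duration_s >= max_duration_s or status.frame_count >= max_frames
    )
    return limit_reached and not status.has_valid_data
```

A caller might evaluate this predicate each time new frames are cached and trigger the sleep transition when it first returns true.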

An implementation of the embodiment of the present invention is illustrated below by taking a specific embodiment as an example.

For example, suppose the Bluetooth speaker starts caching the received voice data at time t1 and is detected to start sending voice data at time t2. There is necessarily a time difference between t1 and t2, and the Bluetooth speaker caches the voice data received during this period so that it is not lost. At the same time, the Bluetooth speaker incurs a reaction delay of t2-t1, and data sent after the cached data carries the same delay. Suppose it is detected that the voice data cached by the Bluetooth speaker during the period from t1 to t2 contains L frames in total, and that the 1st through (N-M)th frames of these L frames are invalid data, where 1 ≤ M ≤ N ≤ L; the (N-M)th through Lth frames are then determined to be valid data. If the caching time of the (N-M)th frame is t(N-M), with t1 ≤ t(N-M) ≤ t2, then, upon receiving connection information indicating that the Bluetooth connection with the terminal has been successfully established, the Bluetooth speaker may send only the (N-M)th through Lth frames to that terminal, so that the reaction delay of the Bluetooth speaker is reduced from t2-t1 to t2-t(N-M). If the Bluetooth speaker is detected to start sending voice data, the detection of whether invalid data exists in the cached voice data, and the processing of the invalid data, are stopped.

In the embodiment of the invention, after the wake-up information is received, the wake-up interrupt operation is executed to wake up the Bluetooth speaker; after the Bluetooth speaker is woken up, the received voice data is cached; whether invalid data exists in the cached voice data is detected; if invalid data exists in the cached voice data, the data other than the invalid data in the cached voice data is determined to be valid data; and if the Bluetooth speaker is detected to start sending voice data, the detection of whether invalid data exists in the cached voice data is stopped. The embodiment of the invention detects whether invalid data exists in the voice data cached during the period between the start of caching and the moment the Bluetooth speaker starts to send voice data, and distinguishes the valid data from the invalid data in the cached voice data, thereby reducing the reaction delay caused by establishing the connection between the Bluetooth speaker and the terminal device, which greatly improves the user experience and offers high practicability.
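The arithmetic of this example can be checked with a short sketch. The function names `reaction_delay` and `frames_to_send`, the use of seconds as the time unit, and the one-based frame numbering mapped onto a Python list are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def reaction_delay(t1: float, t2: float, first_valid_time: Optional[float] = None) -> float:
    """Delay between the first transmitted frame's caching time and t2.

    Without trimming, the first cached frame (time t1) is sent first, so
    the delay is t2 - t1. With trimming, sending starts at the frame
    cached at first_valid_time, i.e. t(N-M), so the delay is t2 - t(N-M).
    """
    if first_valid_time is None:
        return t2 - t1
    if not t1 <= first_valid_time <= t2:
        raise ValueError("first_valid_time must lie between t1 and t2")
    return t2 - first_valid_time


def frames_to_send(cache: Sequence[int], n: int, m: int) -> List[int]:
    """Return frames (N-M) through L, using one-based frame numbers."""
    if not 1 <= m < n <= len(cache):
        raise ValueError("expected 1 <= M < N <= L")
    return list(cache[n - m - 1:])
```

For instance, with t1 = 0.0, t2 = 2.0, and t(N-M) = 1.5 (illustrative values in seconds), the delay drops from 2.0 to 0.5; with L = 10, N = 8, and M = 2, frames 6 through 10 would be sent.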

It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.

Fig. 3 is a schematic diagram of an apparatus for reducing the reaction delay of a bluetooth speaker according to a third embodiment of the present invention. For convenience of explanation, only portions related to the embodiments of the present invention are shown.

The device for reducing the reaction delay of the Bluetooth sound box comprises:

the wake-up module 31 is configured to, after receiving the wake-up information, execute a wake-up interrupt operation to wake up the bluetooth speaker;

the cache module 32 is configured to cache the received voice data after the bluetooth speaker is awakened;

a detecting module 33, configured to detect whether invalid data exists in the cached voice data;

the processing module 34 is configured to determine that data other than the invalid data in the cached voice data is valid data if the cached voice data contains invalid data.

Optionally, the detection module 33 is specifically configured to:

calculating, in the order in which the voice data was cached, the absolute value of the amplitude of each single frame of the voice data, and comparing the absolute value with a first predetermined value;

if the absolute value of the amplitude of the Nth frame data is less than or equal to the first predetermined value, determining that the Nth frame data is invalid data; otherwise, if the absolute value of the amplitude of the Nth frame data is greater than the first predetermined value, determining that the Nth frame data and the M frames of data before it are valid data, and that the data of the 1st frame through the (N-M)th frame is invalid data, where N and M are integers not less than zero and M is less than N.

Optionally, the detection module 33 is specifically configured to:

grouping the cached voice data;

calculating the root mean square value of the amplitude of each group of voice data using the following formula:

$$D_e = \sqrt{\frac{1}{K}\sum_{n=1}^{K} D_n^{2}}$$

where De is the root mean square value of the amplitude of the group of voice data, K is the number of frames in the group of voice data, K is an integer greater than zero, and Dn is the amplitude of the nth frame data in the group of voice data;

if the root mean square value is less than a second predetermined value, determining that the group of voice data is invalid data; and if the root mean square value is greater than or equal to the second predetermined value, determining that the group of voice data is valid data.

Optionally, the detection module 33 is specifically configured to:

acquiring, in the order in which the voice data was cached, the amplitude of each single frame of the voice data, and feeding the acquired amplitude as input data into the following formula to obtain output data:

Dp(n) = (1-a) × Dp(n-1) + a × |D(n)|

where Dp(n) is the output data of the nth frame, D(n) is the amplitude of the nth frame data in the voice data, and a is a constant;

if the Nth frame output data is less than a third predetermined value, determining that the Nth frame output data is invalid data; otherwise, if the Nth frame output data is greater than or equal to the third predetermined value, determining that the Nth frame output data and the M frames of data before it are valid data, and that the output data of the 1st frame through the (N-M)th frame is invalid data, where N and M are integers not less than zero and M is less than N.

Optionally, the apparatus for reducing the reaction delay of the bluetooth speaker further includes:

a control module, configured to stop detecting whether invalid data exists in the cached voice data if it is detected that the Bluetooth speaker starts to send voice data;

a sending module, configured to send the valid data to the terminal with which the Bluetooth speaker has successfully established a Bluetooth connection, if connection information indicating that the Bluetooth speaker and the terminal have successfully established the Bluetooth connection is received and the cached voice data includes valid data;

a dormancy module, configured to enter a dormant state and wait for the next wake-up if the duration or the number of frames of the cached voice data reaches a predetermined threshold and the cached voice data does not contain valid data.

Fig. 4 is a schematic diagram of a terminal device according to a fourth embodiment of the present invention. As shown in Fig. 4, the terminal device 4 of this embodiment includes a processor 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in the above embodiments of the method for reducing the reaction delay of a Bluetooth speaker, such as steps S101 to S104 shown in Fig. 1. Alternatively, when executing the computer program 42, the processor 40 implements the functions of the modules/units in the above apparatus embodiments, such as the functions of modules 31 to 34 shown in Fig. 3.

Illustratively, the computer program 42 may be partitioned into one or more modules/units that are stored in the memory 41 and executed by the processor 40 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into a wake-up module, a cache module, a detection module, and a processing module, and the specific functions of each module are as follows:

The terminal device 4 may be a speaker (such as a Bluetooth speaker), a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The terminal device may include, but is not limited to, the processor 40 and the memory 41. Those skilled in the art will appreciate that Fig. 4 is merely an example of the terminal device 4 and does not constitute a limitation on it; the terminal device 4 may include more or fewer components than shown, may combine some components, or may use different components. For example, the terminal device may also include input/output devices, network access devices, buses, and the like.

The processor 40 may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or an internal memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash memory card (Flash Card) provided on the terminal device 4. Further, the memory 41 may include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used for storing the computer program and other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is to be output.

It will be apparent to those skilled in the art that, for convenience and brevity of description, the above division into functional units and modules is given only as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, and the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the protection scope of the present application. For the specific working processes of the units and modules in the system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not described herein again.

In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.

Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. The above-described embodiments of the apparatus/terminal device are merely illustrative. For example, the division into modules or units is only one logical division, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.

The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow of the methods in the above embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium; when the computer program is executed by a processor, the steps of the method embodiments are implemented. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunications signals.

The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims

1. A method for reducing latency in the response of a Bluetooth speaker, comprising:

after the Bluetooth sound box is awakened, caching the received voice data;

detecting whether invalid data exists in the cached voice data;

if invalid data exists in the cached voice data, determining that data except the invalid data in the cached voice data is valid data;

the detecting whether invalid data exists in the cached voice data, and if invalid data exists in the cached voice data, determining that data other than the invalid data in the cached voice data is valid data includes:

2. The method of claim 1, wherein the detecting whether invalid data exists in the buffered voice data, and if invalid data exists in the buffered voice data, determining that data other than the invalid data in the buffered voice data is valid data comprises:

grouping the cached voice data;

3. The method of claim 1, wherein the detecting whether invalid data exists in the buffered voice data, and if invalid data exists in the buffered voice data, determining that data other than the invalid data in the buffered voice data is valid data comprises:

Dp(n) = (1 - a) × Dp(n-1) + a × |D(n)|

if the nth frame output data is smaller than a third preset value, determining that the nth frame output data is invalid data; otherwise, if the nth frame output data is greater than or equal to the third preset value, determining that the nth frame output data and the M frames of data preceding it are valid data, and that the 1st to (n-M)th frames of output data are invalid data, wherein n and M are integers not less than zero, and M is less than n.
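By way of illustration only, the smoothing and threshold test recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the patented implementation: each frame is assumed to be reduced to a single number D(n), the initial value Dp(0) is assumed to be 0, and the function name `split_invalid_valid` and its parameters `a`, `threshold` (standing in for the third preset value), and `m` (standing in for M) are hypothetical. The claim wording leaves open whether frame n-M itself is retained; this sketch retains it.

```python
from typing import List, Sequence, Tuple


def split_invalid_valid(
    frames: Sequence[float],
    a: float = 0.5,
    threshold: float = 100.0,
    m: int = 2,
) -> Tuple[List[float], List[float]]:
    """Split cached frames into (invalid, valid) lists.

    Applies Dp(n) = (1 - a) * Dp(n-1) + a * |D(n)| frame by frame and
    stops at the first frame whose smoothed output reaches the threshold.
    """
    dp = 0.0  # assumption: Dp(0) = 0
    for n, d in enumerate(frames, start=1):  # n is the 1-based frame number
        dp = (1.0 - a) * dp + a * abs(d)
        if dp >= threshold:
            # Keep frame n and the m frames before it as valid data.
            first_valid = max(n - m, 1)
            cut = first_valid - 1  # 0-based slice position
            return list(frames[:cut]), list(frames[cut:])
    # No frame reached the threshold: everything cached so far is invalid.
    return list(frames), []


if __name__ == "__main__":
    invalid, valid = split_invalid_valid([0, 0, 0, 0, 400, 400])
    print(invalid, valid)
```

Retaining the M frames before the first above-threshold frame gives a small margin so that the quiet onset of speech is less likely to be clipped; a larger smoothing coefficient `a` makes the detector react faster but also makes it more sensitive to short noise bursts.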

4. The method of claim 1, wherein after said detecting whether invalid data exists in said buffered speech data, further comprising:

and if the Bluetooth sound box is detected to start to send voice data, stopping detecting whether invalid data exists in the cached voice data.
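By way of illustration only, the stopping condition recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function `detect_until_sending` and the callbacks `is_invalid` and `sending_started` are hypothetical names, and how the sound box actually signals the start of transmission is not specified here.

```python
from typing import Callable, Iterable, List


def detect_until_sending(
    frames: Iterable[float],
    is_invalid: Callable[[float], bool],
    sending_started: Callable[[], bool],
) -> List[float]:
    """Drop invalid frames only until sending of voice data begins."""
    kept: List[float] = []
    detecting = True
    for frame in frames:
        # Once sending has started, detection is switched off for good.
        if detecting and sending_started():
            detecting = False
        if detecting and is_invalid(frame):
            continue  # discard invalid data found while still detecting
        kept.append(frame)
    return kept
```

Once detection stops, every later frame is passed through unchanged, so no further per-frame checking cost is incurred while voice data is being sent.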

5. The method of claim 1, wherein after detecting whether invalid data exists in the buffered voice data, further comprising:

6. The method according to any one of claims 1 to 5, wherein after detecting whether invalid data exists in the buffered voice data, further comprising:

7. An apparatus for reducing latency in the response of a Bluetooth speaker, comprising:

the processing module is used for determining that data except invalid data in the cached voice data is valid data if the cached voice data contains invalid data;

the detection module is specifically configured to:

8. The apparatus of claim 7, further comprising:

9. The apparatus of claim 7 or 8, further comprising:

10. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when executing the computer program.

11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.