
CN117636891A - Filter updating method and device, storage medium and electronic equipment - Google Patents

Filter updating method and device, storage medium and electronic equipment

Info

Publication number
CN117636891A
Authority
CN
China
Prior art keywords
signal
filter
power
target
updating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311676608.1A
Other languages
Chinese (zh)
Inventor
杨钦雲
李倩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bestechnic Shanghai Co Ltd
Original Assignee
Bestechnic Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bestechnic Shanghai Co Ltd
Priority to CN202311676608.1A
Publication of CN117636891A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The application discloses a filter updating method, a device, a storage medium and electronic equipment. The filter updating method comprises: collecting a voice signal and converting the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal; inputting the frequency domain signal into a fixed beamformer and a blocking matrix module respectively to generate a second target signal and a third target signal; calculating a filter update factor based on the first target signal, the second target signal and the third target signal; and inputting the second target signal, the third target signal and the filter update factor into a target filter to update the filter coefficients of the target filter. The scheme can reduce the degree of speech distortion.

Description

Filter updating method and device, storage medium and electronic equipment
Technical Field
The embodiment of the application relates to the technical field of signal processing, in particular to a filter updating method, a device, a storage medium and electronic equipment.
Background
Adaptive filters are widely used in speech recognition systems. For example, in a far-field speech recognition system, an adaptive filter can be used to eliminate noise, reverberation and other interference in the far-field speech signal, improve speech quality, and improve recognition performance of the speech recognition system. The adaptive filter updates the filter coefficients in real time according to a specific criterion, and automatically and continuously works on a given input signal to obtain a desired output.
The generalized sidelobe canceller (Generalized Sidelobe Cancellation, GSC) algorithm is applied in adaptive filters because it converts a constrained optimization problem into an unconstrained one. However, at low signal-to-noise ratios, the power ratio between the fixed beamformer output and the blocking matrix output calculated by the traditional GSC algorithm is inaccurate, which easily causes misjudgment and leads to a high degree of speech distortion.
Disclosure of Invention
The embodiment of the application provides a filter updating method, a device, a storage medium and electronic equipment, which can reduce the voice distortion degree.
In a first aspect, an embodiment of the present application provides a method for updating a filter, including:
collecting a voice signal and converting the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal;
the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal;
calculating based on the first target signal, the second target signal and the third target signal to obtain a filter update factor;
and inputting the second target signal, the third target signal and the filter updating factor into a target filter to update the filter coefficient of the target filter.
In the method for updating a filter provided in the embodiment of the present application, the calculating based on the first target signal, the second target signal and the third target signal to obtain a filter update factor includes:
respectively calculating the power of the first target signal, the second target signal and the third target signal to obtain first signal power, second signal power and third signal power;
and calculating according to the first signal power, the second signal power and the third signal power to obtain a filter update factor.
In the method for updating a filter provided in the embodiment of the present application, the calculating according to the first signal power, the second signal power, and the third signal power to obtain a filter update factor includes:
performing first-order smoothing processing on the first signal power, the second signal power and the third signal power respectively to generate first smoothing power, second smoothing power and third smoothing power;
and calculating according to the first smooth power, the second smooth power and the third smooth power to obtain a filter updating factor.
In the method for updating a filter provided in the embodiment of the present application, the calculating according to the first smooth power, the second smooth power and the third smooth power to obtain a filter updating factor includes:
comparing the second smoothed power to the first smoothed power to generate a first power ratio;
comparing the second smoothed power to the third smoothed power to generate a second power ratio;
and calculating according to the first power ratio and the second power ratio to obtain a filter updating factor.
In the method for updating a filter provided in the embodiment of the present application, the calculating according to the first power ratio and the second power ratio to obtain a filter updating factor includes:
judging the first power ratio by adopting a first judgment algorithm to obtain a first updating factor;
judging the second power ratio by adopting a second judgment algorithm to obtain a second updating factor;
and multiplying the first updating factor by the second updating factor to obtain a filter updating factor.
In the method for updating a filter provided in the embodiment of the present application, the collecting a voice signal includes: collecting voice signals through a plurality of microphones;
the first target signal is a frequency domain signal of a voice signal collected by a microphone nearest to the target voice signal.
In the method for updating a filter provided in the embodiment of the present application, the converting the speech signal into a frequency domain signal includes:
performing a Fourier transform on the speech signal to convert the speech signal into a frequency domain signal.
In a second aspect, an embodiment of the present application provides a filter updating apparatus, including:
the acquisition unit is used for collecting a voice signal and converting the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal;
the generation unit is used for inputting the frequency domain signals into the fixed beam former and the blocking matrix module respectively to generate a second target signal and a third target signal;
the calculating unit is used for calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor;
and the updating unit is used for inputting the second target signal, the third target signal and the filter updating factor into a target filter so as to update the filter coefficient of the target filter.
In a third aspect, embodiments of the present application provide a storage medium storing a plurality of instructions adapted to be loaded by a processor to perform a filter updating method according to any one of the preceding claims.
In a fourth aspect, an embodiment of the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor implements the filter updating method of any one of the above when executing the computer program.
In summary, the method for updating the filter provided in the embodiment of the present application includes collecting a voice signal, and converting the voice signal into a frequency domain signal, where the frequency domain signal includes a first target signal; the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal; calculating based on the first target signal, the second target signal and the third target signal to obtain a filter update factor; and inputting the second target signal, the third target signal and the filter updating factor into a target filter to update the filter coefficient of the target filter. The scheme can reduce the voice distortion degree.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of a filter updating method according to an embodiment of the present application.
Fig. 2 is a schematic structural diagram of a filter update architecture according to an embodiment of the present application.
Fig. 3 is another flow chart of a filter updating method according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a filter updating apparatus according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, the element defined by the phrase "comprising one … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element, and furthermore, elements having the same name in different embodiments of the present application may have the same meaning or may have different meanings, a particular meaning of which is to be determined by its interpretation in this particular embodiment or by further combining the context of this particular embodiment.
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
In the following description, suffixes such as "module", "component", or "unit" for representing elements are used only for facilitating the description of the present application, and are not of specific significance per se. Thus, "module," "component," or "unit" may be used in combination.
In the description of the present application, it should be noted that the positional or positional relationship indicated by the terms such as "upper", "lower", "left", "right", "inner", "outer", etc. are based on the positional or positional relationship shown in the drawings, are merely for convenience of describing the present application and simplifying the description, and do not indicate or imply that the apparatus or element in question must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present application. Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
The generalized sidelobe canceller (Generalized Sidelobe Cancellation, GSC) algorithm is applied in adaptive filters because it converts a constrained optimization problem into an unconstrained one. However, at low signal-to-noise ratios, the power ratio between the fixed beamformer output and the blocking matrix output calculated by the traditional GSC algorithm is inaccurate, which easily causes misjudgment and leads to a high degree of speech distortion.
Based on this, the embodiment of the application provides a method, a device, a storage medium and an electronic device for updating a filter, where the device for updating a filter may be integrated in the electronic device, and the electronic device may be a server or a terminal; the terminal can comprise embedded equipment such as a mobile phone, a wearable intelligent device, a tablet personal computer and the like, and can also be a notebook computer, a personal computer (Personal Computer, PC) and the like; the server can be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, and can also be a cloud server for providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, basic cloud computing services such as big data and artificial intelligent platforms and the like.
The technical solutions shown in the present application will be described in detail below through specific examples. The following description of the embodiments is not intended to limit the priority of the embodiments.
Referring to fig. 1, fig. 1 is a flowchart of a filter updating method according to an embodiment of the present application. The specific flow of the filter updating method can be as follows:
101. A voice signal is collected and converted into a frequency domain signal, where the frequency domain signal includes a first target signal.
Specifically, the voice signal may be collected by a microphone array. The microphone array may be a regular array, such as a linear array, a planar array, a circular array or a three-dimensional array, or an irregular array; it may include a plurality of microphones; and its structural information (array configuration and microphone spacing) is known. The speech signals received by the microphone array may include device local noise, external interference, and target speech signals (such as user speech instructions), among others.
In this embodiment of the present application, the first target signal is a frequency domain signal of a voice signal collected by a microphone nearest to the target voice signal.
In an implementation, a Fourier transform may be applied to the voice signal to convert it into a frequency domain signal.
102. The frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal.
In the implementation process, the frequency domain signal can be transmitted to the fixed beam former and the blocking matrix module through two branches for signal processing, so as to generate a second target signal for primarily enhancing the target voice signal and a third target signal (noise reference signal) after blocking the target voice signal.
103. And calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor.
Specifically, the power of the first target signal, the second target signal and the third target signal can be calculated respectively to obtain the first signal power, the second signal power and the third signal power; and calculating according to the first signal power, the second signal power and the third signal power to obtain a filter updating factor.
In some embodiments, the second signal power may be directly compared to the first signal power to generate a first power ratio; the second signal power is compared to the third signal power to generate a second power ratio. And then, calculating according to the first power ratio and the second power ratio to obtain a filter updating factor.
In another embodiment, to improve continuity between successive frames, first-order smoothing may be performed on the first signal power, the second signal power and the third signal power to generate a first smoothed power, a second smoothed power and a third smoothed power, and the filter update factor is then calculated from the first smoothed power, the second smoothed power and the third smoothed power. Specifically, the second smoothed power may be compared to the first smoothed power to generate a first power ratio; the second smoothed power may be compared to the third smoothed power to generate a second power ratio; and the filter update factor is calculated according to the first power ratio and the second power ratio.
In this embodiment of the present application, the calculating according to the first power ratio and the second power ratio may specifically be that: judging the first power ratio by adopting a first judgment algorithm to obtain a first updating factor; judging a second power ratio by adopting a second judgment algorithm to obtain a second updating factor; and multiplying the first updating factor by the second updating factor to obtain the filter updating factor.
The first update factor and the second update factor are two factor matrices; multiplying them yields a filter update factor that can be regarded as approximating a speech absence probability.
104. The second target signal, the third target signal, and the filter update factor are input to the target filter to update the filter coefficients of the target filter.
In the embodiment of the application, the filter update factor is input into the target filter, so that the target filter can slow down the update speed when the voice existence probability is high, and speed up the update speed when the voice non-existence probability is high, thereby reducing the convergence time. In addition, the filter updating factor combines the first updating factor and the second updating factor, so that the robustness is stronger, the voice distortion rate can be better reduced, and the noise suppression degree is increased.
In summary, the method for updating the filter provided in the embodiment of the present application includes collecting a voice signal, and converting the voice signal into a frequency domain signal, where the frequency domain signal includes a first target signal; the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal; calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor; the second target signal, the third target signal, and the filter update factor are input to the target filter to update the filter coefficients of the target filter. According to the scheme, the filter update factor is input into the target filter, so that the update speed of the target filter can be reduced when the existence probability of voice is high, and the update speed is increased when the non-existence probability of voice is high, and the convergence time is reduced. In addition, the filter updating factor combines the first updating factor and the second updating factor, so that the robustness is stronger, the voice distortion rate can be better reduced, and the noise suppression degree is increased.
In order to describe the above filter updating method in detail, the embodiment of the present application further provides a filter updating method. It should be noted that, in the filter updating architecture shown in fig. 2, the microphone array is taken to be a dual microphone array for illustration, and the voice signals collected by the dual microphone array are denoted s1 and s2 respectively. The specific flow of the filter updating method can be as shown in fig. 3:
201. The speech signals s1 and s2 are acquired by a dual microphone array and converted into frequency domain signals X1 and X2.
The voice signal received by the dual microphone array may include device local noise, external interference, target voice signal (such as user voice command), and the like. In the present embodiment, the dual microphone array includes a first microphone (microphone 1) and a second microphone (microphone 2). The first microphone is closer to the originating point of the target speech signal than the second microphone.
In an implementation, a Fourier transform may be applied to the speech signals s1 and s2 to convert them into frequency domain signals X1 and X2. Specifically, X1=fft(s1, fft_size) and X2=fft(s2, fft_size).
Here fft_size is the number of discrete Fourier transform points, X1 and X2 are the frequency domain signals of s1 and s2 respectively, and fft() is the Fourier transform function.
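As an illustrative sketch only, the transform of step 201 could be written in Python with NumPy as below. The Hann window, the 16 kHz sample rate, the synthetic test tones and the helper name to_frequency_domain are assumptions made for the example and do not come from the application; a one-sided transform is used so that a 512-point transform yields 257 bins.

```python
import numpy as np

def to_frequency_domain(frame, fft_size=512):
    """Return the one-sided spectrum of one real-valued time-domain frame."""
    frame = np.asarray(frame, dtype=float)
    # The Hann window is an assumption; the text only specifies the FFT itself.
    window = np.hanning(len(frame))
    # rfft keeps bins 0 .. fft_size/2, i.e. fft_size // 2 + 1 bins.
    return np.fft.rfft(frame * window, n=fft_size)

fft_size = 512
t = np.arange(fft_size) / 16000.0          # 16 kHz sample rate (assumed)
s1 = np.sin(2 * np.pi * 1000.0 * t)        # stand-in for the first microphone
s2 = 0.5 * np.sin(2 * np.pi * 1000.0 * t)  # stand-in for the second microphone
X1 = to_frequency_domain(s1, fft_size)
X2 = to_frequency_domain(s2, fft_size)
```

In a streaming implementation the same function would presumably be called once per frame for each microphone.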
202. The frequency domain signals X1 and X2 are input to the fixed beamformer and the blocking matrix module, respectively, to generate a second target signal Y1 and a third target signal Y3.
The frequency domain signals X1 and X2 are input to a fixed beam former for first signal processing, and a second target signal Y1 which is primarily enhanced for the target voice signal is generated. The frequency domain signals X1 and X2 are input to the blocking matrix module for second signal processing, and a third target signal Y3 (noise reference signal) after blocking the target speech signal is generated.
It should be noted that the first signal processing and the second signal processing have no fixed order: they may be performed simultaneously; the first signal processing may be performed first and the second signal processing afterwards; or the second signal processing may be performed first and the first signal processing afterwards.
In this embodiment, the first signal processing may specifically be: Y1=W1(X1, X2). The second signal processing may specifically be: Y3=W3(X1, X2).
Where W1 denotes the filter coefficients of the fixed beamformer and W3 denotes the filter coefficients of the blocking matrix.
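The application does not give numeric values for W1 and W3. Purely as an assumed example for a two-microphone array with the target arriving at both microphones without relative delay, the fixed beamformer can be sketched as a delay-and-sum average and the blocking matrix as a channel difference. The weights [0.5, 0.5] and [1, -1] and the toy spectra are hypothetical.

```python
import numpy as np

def fixed_beamformer(X1, X2):
    """Delay-and-sum with zero steering delay (assumed W1 = [0.5, 0.5])."""
    return 0.5 * (X1 + X2)

def blocking_matrix(X1, X2):
    """Difference of the two channels (assumed W3 = [1, -1])."""
    return X1 - X2

# Toy spectra: identical target component in both channels, opposite noise.
target = np.array([1.0 + 1.0j, 2.0 - 0.5j])
noise1 = np.array([0.1 + 0.0j, 0.0 + 0.2j])
noise2 = np.array([-0.1 + 0.0j, 0.0 - 0.2j])
X1, X2 = target + noise1, target + noise2

Y1 = fixed_beamformer(X1, X2)  # target kept; the opposite noise cancels in this toy case
Y3 = blocking_matrix(X1, X2)   # target cancelled; only a noise reference remains
```

With real signals the noise would not cancel exactly in Y1, which is why the adaptive stage that follows is needed.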
203. The powers of the first target signal Y2 (the frequency domain signal of the microphone nearest to the target voice signal, as defined in step 101), the second target signal Y1 and the third target signal Y3 are calculated respectively to obtain a first signal power Y2_out, a second signal power Y1_out and a third signal power Y3_out.
In a specific implementation, the first power ratio calculation module may compute the powers of the first target signal Y2 and the second target signal Y1 to generate the first signal power Y2_out and the second signal power Y1_out, and the second power ratio calculation module may compute the powers of the second target signal Y1 and the third target signal Y3 to generate the second signal power Y1_out and the third signal power Y3_out.
Here Y1_out=Y1*conj(Y1), Y2_out=Y2*conj(Y2) and Y3_out=Y3*conj(Y3), where conj denotes complex conjugation.
204. First-order smoothing is performed on the first signal power Y2_out, the second signal power Y1_out and the third signal power Y3_out, respectively, to generate a first smoothed power Y2_smooth, a second smoothed power Y1_smooth and a third smoothed power Y3_smooth.
In the implementation process, the first power ratio calculation module may perform first-order smoothing on the first signal power Y2_out and the second signal power Y1_out to generate the first smoothed power Y2_smooth and the second smoothed power Y1_smooth, and the second power ratio calculation module may perform first-order smoothing on the second signal power Y1_out and the third signal power Y3_out to generate the second smoothed power Y1_smooth and the third smoothed power Y3_smooth. Specifically:
(1)Y1_smooth=alpha*Y1_smooth+(1-alpha)*Y1_out;
(2)Y2_smooth=alpha*Y2_smooth+(1-alpha)*Y2_out;
(3)Y3_smooth=alpha*Y3_smooth+(1-alpha)*Y3_out.
where alpha is the smoothing factor of the first-order recursive smoothing.
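Steps 203 and 204 can be sketched in Python as follows. The value alpha = 0.9, the zero initial state and the toy spectrum are assumptions for illustration; the application does not state them.

```python
import numpy as np

def power(Y):
    """Per-bin power Y * conj(Y), returned as real values."""
    return (Y * np.conj(Y)).real

def smooth(prev_smooth, current, alpha=0.9):
    """First-order recursive smoothing, as in formulas (1) to (3)."""
    return alpha * prev_smooth + (1.0 - alpha) * current

Y1 = np.array([3.0 + 4.0j, 1.0 + 0.0j])
Y1_out = power(Y1)                    # per-bin power of the toy spectrum
Y1_smooth = np.zeros_like(Y1_out)     # initial state (assumed to be zero)
for _ in range(3):                    # feed the same frame three times
    Y1_smooth = smooth(Y1_smooth, Y1_out)
```

A larger alpha gives a smoother but slower-reacting power estimate; starting from zero, the estimate after n identical frames equals (1 - alpha^n) times the instantaneous power.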
205. The second smoothed power Y1_smooth is divided by the first smoothed power Y2_smooth to generate a first power ratio PowerRatio_1.
In an implementation, step 205 is performed in the first power ratio module.
206. The second smoothed power Y1_smooth is divided by the third smoothed power Y3_smooth to generate a second power ratio PowerRatio_2.
In an implementation, step 206 is performed in the second power ratio module.
In the embodiment of the present application, Y2_smooth and Y3_smooth must be non-zero so that a zero denominator does not produce an abnormal value when the power ratios are calculated. Therefore, after Y2_smooth and Y3_smooth are obtained by (2) and (3) above, they are floored to non-zero values as follows:
(4)Y2_smooth=max(Y2_smooth,EPS);
(5)Y3_smooth=max(Y3_smooth,EPS).
Here EPS is a very small number, taken as 0.00001 in this embodiment, which prevents abnormal values caused by a zero denominator in the power ratio.
The first smoothed power Y2_smooth and the third smoothed power Y3_smooth obtained through (4) and (5) are then used as denominators for the second smoothed power Y1_smooth to generate the first power ratio PowerRatio_1 and the second power ratio PowerRatio_2:
PowerRatio_1=min(Y1_smooth/Y2_smooth,1);
PowerRatio_2=min(Y1_smooth/Y3_smooth,SDB).
Here SDB is the maximum limit on the signal-to-noise ratio between the second target signal Y1 and the third target signal Y3; in the embodiment of the present application, SDB=10^(20/10), that is, 100 in linear power or 20 dB.
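The floor and the two clamped ratios can be sketched in Python as below. EPS and SDB take the values given in the text; the function name power_ratios and the sample powers are hypothetical.

```python
import numpy as np

EPS = 1e-5                 # floor for the denominators, value from the text
SDB = 10.0 ** (20.0 / 10)  # upper limit on the second ratio: 20 dB in linear power

def power_ratios(Y1_smooth, Y2_smooth, Y3_smooth):
    """Return (PowerRatio_1, PowerRatio_2) for every frequency bin."""
    Y2_smooth = np.maximum(Y2_smooth, EPS)  # formula (4)
    Y3_smooth = np.maximum(Y3_smooth, EPS)  # formula (5)
    ratio_1 = np.minimum(Y1_smooth / Y2_smooth, 1.0)
    ratio_2 = np.minimum(Y1_smooth / Y3_smooth, SDB)
    return ratio_1, ratio_2

Y1_s = np.array([4.0, 1.0, 2.0])
Y2_s = np.array([8.0, 0.5, 0.0])   # last bin would divide by zero without the floor
Y3_s = np.array([2.0, 0.0, 1.0])   # middle bin would divide by zero without the floor
r1, r2 = power_ratios(Y1_s, Y2_s, Y3_s)
```

The zero-power bins show the purpose of both limits: the floor avoids a division by zero, and the caps keep the resulting very large ratios bounded.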
207. And judging the first power ratio PowerRatio_1 by adopting a first judging algorithm to obtain a first updating factor f1.
Step 207 is performed in a first decision module.
208. And judging the second power ratio PowerRatio_2 by adopting a second judging algorithm to obtain a second updating factor f2.
Wherein step 208 is performed in a second decision module.
209. The first update factor f1 is multiplied by the second update factor f2 to obtain a filter update factor f.
Step 209 is performed in a multiplier module.
It should be noted that a specific process of evaluating the first power ratio PowerRatio_1 with the first decision algorithm may be as follows:
The specific process of evaluating the second power ratio PowerRatio_2 with the second decision algorithm may be as follows:
f2(i)=1/(PowerRatio_2(i)+1).
Here i is the frequency bin index; when the discrete Fourier transform has 512 points, i ranges from 1 to 257. The thresholds threshold_high and threshold_low are empirical values between 0 and 1, with threshold_low smaller than threshold_high.
The filter update factor f may be regarded as something like a speech absence probability.
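The second decision and the product in step 209 can be sketched as follows. Only the formula for f2 appears in the text, so the sketch takes f1 as an already computed input; the all-ones f1 in the usage lines is a placeholder, not the first decision algorithm. The function names are hypothetical.

```python
import numpy as np


def second_update_factor(power_ratio_2):
    """f2(i) = 1 / (PowerRatio_2(i) + 1), evaluated per frequency bin."""
    return 1.0 / (np.asarray(power_ratio_2, dtype=float) + 1.0)


def filter_update_factor(f1, power_ratio_2):
    """Step 209: f = f1 * f2. f1 is supplied by the caller (assumption)."""
    return np.asarray(f1, dtype=float) * second_update_factor(power_ratio_2)


# A 512-point DFT of a real signal yields 512 // 2 + 1 = 257 bins.
n_fft = 512
n_bins = n_fft // 2 + 1

power_ratio_2 = np.full(n_bins, 100.0)  # every bin at the SDB cap
f1 = np.ones(n_bins)                    # placeholder, NOT the first decision algorithm
f = filter_update_factor(f1, power_ratio_2)
```

Because f2 decreases as PowerRatio_2 grows, bins where the beamformer output dominates the blocking matrix output receive a small update factor, which matches the reading of f as a speech absence probability.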
210. The second target signal Y1, the third target signal Y3, and the filter update factor f are input to the target filter to update the filter coefficients of the target filter.
In this embodiment, a specific process of updating the filter coefficient of the target filter may be as follows:
E(k,L)=Y1(k,L)-W(k,L)*Y3(k,L);
P_Y3=Y3_smooth;
W(k,L)=W(k,L-1)+f*mu*E(k,L)*conj(Y3(k,L))/P_Y3.
Here W is the filter coefficient, k is the frequency bin index, L is the frame index, mu is the fixed step factor, E is the error signal, and P_Y3 is the smoothed power spectrum of the Y3 signal.
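One frame of the coefficient update above can be sketched as follows. The sketch makes several assumptions: there is one complex coefficient per frequency bin, the error is computed with the previous frame's coefficients W(k, L-1), P_Y3 is floored at 0.00001 as in formula (5), and the default `mu=0.1` is a hypothetical value, since the text gives none.

```python
import numpy as np


def update_filter(w_prev, y1, y3, y3_smooth, f, mu=0.1):
    """Return (W(k, L), E(k, L)) for one frame, all arrays indexed by bin k.

    w_prev:    W(k, L-1), complex filter coefficients from the previous frame
    y1, y3:    complex spectra of the second and third target signals
    y3_smooth: smoothed power of Y3, used as the normaliser P_Y3
    f:         filter update factor (scalar or per-bin array)
    mu:        fixed step factor (hypothetical default)
    """
    p_y3 = np.maximum(np.asarray(y3_smooth, dtype=float), 1e-5)
    e = y1 - w_prev * y3  # error signal, using previous coefficients (assumption)
    w_new = w_prev + f * mu * e * np.conj(y3) / p_y3
    return w_new, e
```

With f = 0 the coefficients are left unchanged, and with f = 1 the update reduces to an ordinary normalised step of size mu, so f acts purely as a per-bin scaling of the step size.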
It will be appreciated that in FIG. 2, the first power ratio module is configured to generate a first power ratio
In summary, the filter updating method provided in the embodiment of the present application has low computational complexity. It uses the spatial information of the array to calculate the filter update factor f, which scales the update step size of the target filter: the update slows down when the probability that speech is present is high and speeds up when the probability that speech is absent is high, thereby reducing the convergence time. In addition, because f combines the first update factor f1 and the second update factor f2, it is more robust, which helps reduce speech distortion and increase noise suppression.
To facilitate implementation of the filter updating method provided by the embodiment of the present application, the embodiment of the present application also provides a filter updating apparatus. The terms have the same meanings as in the filter updating method described above; for implementation details, refer to the description of the method embodiments.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a filter updating apparatus according to an embodiment of the present application. The filter updating apparatus may comprise an acquisition unit 301, a generation unit 302, a calculation unit 303 and an updating unit 304, wherein:
an acquisition unit 301, configured to acquire a voice signal and convert the voice signal into a frequency domain signal, where the frequency domain signal includes a first target signal;
a generating unit 302, configured to input the frequency domain signals into a fixed beamformer and a blocking matrix module, respectively, and generate a second target signal and a third target signal;
a calculating unit 303, configured to calculate, based on the first target signal, the second target signal, and the third target signal, to obtain a filter update factor;
an updating unit 304, configured to input the second target signal, the third target signal, and the filter update factor into a target filter, so as to update a filter coefficient of the target filter.
For specific implementations of the above units, refer to the above embodiments of the filter updating method; they are not repeated here.
In summary, the filter updating apparatus provided in the embodiments of the present application acquires a voice signal through the microphone array of the acquisition unit 301 and converts it into a frequency domain signal that includes a first target signal; the generating unit 302 inputs the frequency domain signal into the fixed beamformer and the blocking matrix module, respectively, to generate a second target signal and a third target signal; the calculating unit 303 calculates a filter update factor based on the first target signal, the second target signal, and the third target signal; and the updating unit 304 inputs the second target signal, the third target signal, and the filter update factor into a target filter to update its filter coefficients. Because the filter update factor is input into the target filter, the update of the target filter slows down when the probability that speech is present is high and speeds up when the probability that speech is absent is high, which reduces the convergence time. In addition, the filter update factor combines the first update factor and the second update factor, so it is more robust, which helps reduce speech distortion and increase noise suppression.
The embodiment of the present application further provides an electronic device in which the filter updating apparatus of the embodiment of the present application may be integrated. Fig. 5 shows a schematic structural diagram of the electronic device according to the embodiment of the present application, specifically:
the electronic device may include Radio Frequency (RF) circuitry 601, memory 602 including one or more computer readable storage media, input unit 603, display unit 604, sensor 605, audio circuitry 606, wireless fidelity (Wireless Fidelity, wiFi) module 607, processor 608 including one or more processing cores, and power supply 609. It will be appreciated by those skilled in the art that the electronic device structure shown in fig. 5 is not limiting of the electronic device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
Wherein:
the RF circuit 601 may be used for receiving and transmitting signals during a message or a call, and in particular, after receiving downlink information of a base station, the downlink information is processed by one or more processors 608; in addition, data relating to uplink is transmitted to the base station. Typically, RF circuitry 601 includes, but is not limited to, an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (Subscriber Identity Module, SIM) card, a transceiver, a coupler, a low noise amplifier (Low Noise Amplifier, LNA), a duplexer, and the like. In addition, the RF circuitry 601 may also communicate with networks and other devices through wireless communications. The wireless communication may use any communication standard or protocol including, but not limited to, global system for mobile communications (Global System of Mobile communication, GSM), general packet radio service (General Packet Radio Service, GPRS), code division multiple access (Code Division Multiple Access, CDMA), wideband code division multiple access (Wideband Code Division Multiple Access, WCDMA), long term evolution (Long Term Evolution, LTE), email, short message service (Short Messaging Service, SMS), and the like.
The memory 602 may be used to store software programs and modules, and the processor 608 may execute various functional applications and information processing by executing the software programs and modules stored in the memory 602. The memory 602 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required for at least one function, and the like; the storage data area may store data created according to the use of the electronic device (such as audio data, phonebooks, etc.), and the like. In addition, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide access to the memory 602 by the processor 608 and the input unit 603.
The input unit 603 may be used to receive input numeric or character information and to generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control. In particular, in one particular embodiment, the input unit 603 may include a touch-sensitive surface, as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations thereon or thereabout by a user (e.g., operations thereon or thereabout by a user using any suitable object or accessory such as a finger, stylus, etc.), and actuate the corresponding connection means according to a predetermined program. Alternatively, the touch-sensitive surface may comprise two parts, a touch detection device and a touch controller. The touch detection device detects the touch azimuth of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch detection device and converts it into touch point coordinates, which are then sent to the processor 608, and can receive commands from the processor 608 and execute them. In addition, touch sensitive surfaces may be implemented in a variety of types, such as resistive, capacitive, infrared, and surface acoustic waves. The input unit 603 may comprise other input devices in addition to a touch sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, mouse, joystick, etc.
The display unit 604 may be used to display information entered by a user or provided to a user as well as various graphical user interfaces of the electronic device, which may be composed of graphics, text, icons, video, and any combination thereof. The display unit 604 may include a display panel, which may optionally be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay the display panel; upon detecting a touch operation on or near it, the touch-sensitive surface passes the operation to the processor 608 to determine the type of touch event, and the processor 608 then provides a corresponding visual output on the display panel based on the type of touch event. Although in fig. 5 the touch-sensitive surface and the display panel are implemented as two separate components for input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement the input and output functions.
The electronic device may also include at least one sensor 605, such as a light sensor, a motion sensor, and other sensors. In particular, the light sensor may include an ambient light sensor that may adjust the brightness of the display panel according to the brightness of ambient light, and a proximity sensor that may turn off the display panel and/or backlight when the electronic device is moved to the ear. As one of the motion sensors, the gravity acceleration sensor can detect the acceleration in all directions (generally three axes), and can detect the gravity and the direction when the mobile phone is stationary, and can be used for applications of recognizing the gesture of the mobile phone (such as horizontal and vertical screen switching, related games, magnetometer gesture calibration), vibration recognition related functions (such as pedometer and knocking), and the like; other sensors such as gyroscopes, barometers, hygrometers, thermometers, infrared sensors, etc. that may also be configured with the electronic device are not described in detail herein.
The audio circuit 606, a speaker, and a microphone may provide an audio interface between the user and the electronic device. The audio circuit 606 may convert received audio data into an electrical signal and transmit it to the speaker, which converts it into a sound signal for output; conversely, the microphone converts collected sound signals into electrical signals, which the audio circuit 606 receives and converts into audio data. The audio data are then output to the processor 608 for processing and sent via the RF circuit 601 to, for example, another electronic device, or output to the memory 602 for further processing. The audio circuit 606 may also include an earbud jack to allow a peripheral earphone to communicate with the electronic device.
WiFi belongs to a short-distance wireless transmission technology, and the electronic equipment can help a user to send and receive emails, browse webpages, access streaming media and the like through the WiFi module 607, so that wireless broadband Internet access is provided for the user. Although fig. 5 shows a WiFi module 607, it is understood that it does not belong to the necessary constitution of the electronic device, and can be omitted entirely as needed within the scope of not changing the essence of the invention.
The processor 608 is a control center of the electronic device that uses various interfaces and lines to connect the various parts of the overall handset, performing various functions of the electronic device and processing the data by running or executing software programs and/or modules stored in the memory 602, and invoking data stored in the memory 602, thereby performing overall monitoring of the handset. Optionally, the processor 608 may include one or more processing cores; preferably, the processor 608 may integrate an application processor that primarily handles operating systems, user interfaces, applications, etc., with a modem processor that primarily handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 608.
The electronic device also includes a power supply 609 (e.g., a battery) for powering the various components, which may be logically connected to the processor 608 via a power management system so as to perform functions such as managing charge, discharge, and power consumption via the power management system. The power supply 609 may also include one or more of any components, such as a direct current or alternating current power supply, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.
Although not shown, the electronic device may further include a camera, a bluetooth module, etc., which will not be described herein. In particular, in this embodiment, the processor 608 in the electronic device loads executable files corresponding to the processes of one or more application programs into the memory 602 according to the following instructions, and the processor 608 executes the application programs stored in the memory 602, so as to implement various functions, for example:
collecting a voice signal, and converting the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal;
the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal;
calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor;
inputting the second target signal, the third target signal and the filter update factor into the target filter to update the filter coefficients of the target filter.
In summary, the electronic device provided in the embodiment of the present application may collect a voice signal and convert the voice signal into a frequency domain signal, where the frequency domain signal includes a first target signal; the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal; calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor; the second target signal, the third target signal, and the filter update factor are input to the target filter to update the filter coefficients of the target filter. According to the scheme, the filter update factor is input into the target filter, so that the update speed of the target filter can be reduced when the existence probability of voice is high, and the update speed is increased when the non-existence probability of voice is high, and the convergence time is reduced. In addition, the filter updating factor combines the first updating factor and the second updating factor, so that the robustness is stronger, the voice distortion rate can be better reduced, and the noise suppression degree is increased.
In the foregoing embodiments, each embodiment is described with its own emphasis; for portions not described in detail in a given embodiment, refer to the foregoing detailed description of the filter updating method, which is not repeated here.
It should be noted that, for the filter updating method in the embodiment of the present application, it will be understood by those skilled in the art that all or part of the flow of implementing the filter updating method in the embodiment of the present application may be implemented by controlling related hardware by a computer program, where the computer program may be stored in a computer readable storage medium, such as a memory of a terminal, and executed by at least one processor in the terminal, and the execution may include, for example, the flow of the embodiment of the filter updating method.
For the filter updating device of the embodiment of the application, each functional module may be integrated in one processing chip, or each module may exist alone physically, or two or more modules may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented as software functional modules and sold or used as a stand-alone product.
To this end, embodiments of the present application provide a storage medium having stored therein a plurality of instructions capable of being loaded by a processor to perform steps in any of the filter updating methods provided by embodiments of the present application. The storage medium may be a magnetic disk, an optical disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The filter updating method, apparatus, storage medium, and electronic device provided by the present application have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above examples is only intended to help understand its core idea. Meanwhile, those skilled in the art may make changes to the specific implementations and the scope of application in accordance with the ideas of the present application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (10)

1. A method of updating a filter, comprising:
collecting a voice signal and converting the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal;
the frequency domain signals are respectively input into a fixed beam former and a blocking matrix module to generate a second target signal and a third target signal;
calculating based on the first target signal, the second target signal and the third target signal to obtain a filter update factor;
and inputting the second target signal, the third target signal and the filter updating factor into a target filter to update the filter coefficient of the target filter.
2. The method of updating a filter according to claim 1, wherein the calculating based on the first target signal, the second target signal, and the third target signal to obtain the filter update factor includes:
respectively calculating the power of the first target signal, the second target signal and the third target signal to obtain first signal power, second signal power and third signal power;
and calculating according to the first signal power, the second signal power and the third signal power to obtain a filter update factor.
3. The method of updating a filter according to claim 2, wherein the calculating according to the first signal power, the second signal power, and the third signal power to obtain the filter updating factor includes:
performing first-order smoothing processing on the first signal power, the second signal power and the third signal power respectively to generate first smoothing power, second smoothing power and third smoothing power;
and calculating according to the first smooth power, the second smooth power and the third smooth power to obtain a filter updating factor.
4. The method of updating a filter according to claim 3, wherein the calculating according to the first smoothing power, the second smoothing power and the third smoothing power to obtain the filter updating factor comprises:
comparing the second smoothed power to the first smoothed power to generate a first power ratio;
comparing the second smoothed power to the third smoothed power to generate a second power ratio;
and calculating according to the first power ratio and the second power ratio to obtain a filter updating factor.
5. The method of updating a filter according to claim 4, wherein the calculating according to the first power ratio and the second power ratio to obtain the filter update factor includes:
judging the first power ratio by adopting a first judgment algorithm to obtain a first updating factor;
judging the second power ratio by adopting a second judgment algorithm to obtain a second updating factor;
and multiplying the first updating factor by the second updating factor to obtain a filter updating factor.
6. The filter updating method according to any one of claims 1 to 5, wherein the collecting the voice signal includes: collecting voice signals through a plurality of microphones;
the first target signal is a frequency domain signal of a voice signal collected by a microphone nearest to the target voice signal.
7. The filter updating method according to any one of claims 1 to 5, wherein said converting the speech signal into a frequency domain signal comprises:
fourier transforming the speech signal to convert the speech signal into a frequency domain signal.
8. A filter updating apparatus, comprising:
an acquisition unit, configured to acquire a voice signal and convert the voice signal into a frequency domain signal, wherein the frequency domain signal comprises a first target signal;
the generation unit is used for inputting the frequency domain signals into the fixed beam former and the blocking matrix module respectively to generate a second target signal and a third target signal;
the calculating unit is used for calculating based on the first target signal, the second target signal and the third target signal to obtain a filter updating factor;
and the updating unit is used for inputting the second target signal, the third target signal and the filter updating factor into a target filter so as to update the filter coefficient of the target filter.
9. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the filter updating method of any of claims 1-7.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the filter updating method according to any of claims 1-7 when executing the computer program.
CN202311676608.1A 2023-12-07 2023-12-07 Filter updating method and device, storage medium and electronic equipment Pending CN117636891A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311676608.1A CN117636891A (en) 2023-12-07 2023-12-07 Filter updating method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN117636891A true CN117636891A (en) 2024-03-01

Family

ID=90028649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311676608.1A Pending CN117636891A (en) 2023-12-07 2023-12-07 Filter updating method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117636891A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination