CN118262737B - Method, system and storage medium for separating sound array voice signal from background noise - Google Patents
- Publication number: CN118262737B
- Application number: CN202410449206.6A
- Authority: CN (China)
- Legal status: Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
- G10L21/0308—Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
Abstract
The invention discloses a method, a system and a storage medium for separating an acoustic array voice signal from background noise, comprising the following steps: acquiring an acoustic array and calculating its steering vector; based on the positional correlation of the receiving acoustic array, obtaining a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array; performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array; and establishing an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, then establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal. The method effectively improves the purity of the received voice signal while also recovering the environmental information carried in the background noise.
Description
Technical Field
The invention belongs to the technical field of voice signal processing, and relates to a method, a system and a storage medium for separating an acoustic array voice signal from background noise.
Background
Background noise suppression is essential in speech array signal processing. First, suppressing background noise improves the clarity and quality of speech signals, especially in complex environments such as conference rooms or traffic scenes. Second, it significantly improves the accuracy of speech recognition systems by avoiding misrecognition caused by noise. In addition, for voice communication systems, suppressing background noise improves communication quality and enhances the user experience. In complex environments, noise suppression helps the system adapt to challenging conditions and ensures its robustness. For voice control and interactive applications, it makes the system more responsive and improves human-computer interaction. In short, background noise suppression in speech array signal processing optimizes speech quality and improves system accuracy and adaptability to meet practical demands in different environments. On the other hand, in some contexts the noise itself carries cues about changes in the surrounding environment, such as crowd noise or traffic noise. By extracting and analyzing the background noise, a system can better understand its surroundings and provide more intelligent services. Finally, given the diversity of background noise in practical applications, extracting it allows the system to adapt more flexibly to different environments. However, background noise extraction has received little research attention, and existing algorithms cannot extract background noise effectively.
Disclosure of Invention
The invention aims to provide a method, a system and a storage medium for separating an acoustic array voice signal from background noise, which can effectively improve the purity of a received voice signal and can also obtain environment information carried in the background noise.
The technical solution of the invention for achieving the above purpose is as follows:
a method of separating an acoustic array voice signal from background noise, comprising the following steps:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the positional correlation of the receiving acoustic array, obtaining a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array;
S04: establishing an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred embodiment, the steering vector of the acoustic array calculated in step S01 is:

β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T

wherein c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency.
The signal received by the n-th receiving element is r_n; in the presence of background noise, r_n is expressed as:

r_n = β(n)s_n + J_n

where s_n is the speech signal, J_n is the background noise, and β(n) is the n-th element of the steering vector β.
In a preferred technical solution, the spatial feature matrix in step S02 is G, whose elements are:

wherein Δd_ij is the distance between the i-th and j-th array elements, and β(i) is the i-th element of the steering vector β.
In a preferred technical solution, the feature decomposition of the spatial feature matrix in step S03 is:

G = ΩΛΦ

wherein Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the eigenvalues of G, and Φ is the right feature matrix.
The feature-domain projection observation yields the spatial feature-domain projection signal Y = ΦR = ΦβS + ΦJ,
wherein R is the total signal received by the array, S is the speech signal received by the array, and J is the background signal received by the array; y_n is the feature-domain projection signal of the n-th array element, r_n is the received signal of the n-th array element, s_n is the speech signal received by the n-th array element, and J_n is the background signal received by the n-th array element, n = 1, 2, …, N.
In a preferred technical solution, the angle dictionary of candidate arrival directions of the voice signal established in step S04 is:

Ψ = [β(φ_1), β(φ_2), …, β(φ_Q)]

wherein φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
In a preferred embodiment, the spatial feature-domain projection signal of the acoustic array received signal in step S04 is further expressed as:

Y = ΦR = ΦΨSΞ + ΦJ

wherein Ξ is the spatial-domain sparse feature matrix of the voice signal and has sparse characteristics.
In a preferred technical solution, the joint optimization function established in step S04 jointly penalizes the sparsity of Ξ, the low rank of the background term, and the inter-element correlation:

wherein ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyperparameters, and Π is the correlation characteristic coefficient of the received signals among the receiving array elements.
The invention also discloses a system for separating an acoustic array voice signal from background noise, comprising:
the acoustic array acquisition and calculation module, which acquires an acoustic array and calculates a steering vector of the acoustic array;
the spatial feature matrix calculation module, which obtains a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array based on the positional correlation of the receiving acoustic array;
the feature-domain projection observation calculation module, which performs feature decomposition on the spatial feature matrix to obtain a right feature matrix, takes the right feature matrix as the spatial feature observation matrix, and performs feature-domain projection observation on the voice signals received by the acoustic array;
the separation module, which establishes an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred technical solution, the steering vector of the acoustic array calculated by the acoustic array acquisition and calculation module is:

β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T

wherein c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency.
The signal received by the n-th receiving element is r_n; in the presence of background noise, r_n is expressed as:

r_n = β(n)s_n + J_n

where s_n is the speech signal, J_n is the background noise, and β(n) is the n-th element of the steering vector β.
The invention also discloses a computer storage medium on which a computer program is stored; when executed, the program implements the above method for separating an acoustic array voice signal from background noise.
Compared with the prior art, the invention has the remarkable advantages that:
By means of sparse feature decomposition of the voice signal and feature-domain projection observation, the method separates the undistorted voice signal from the background noise more effectively and is applicable to different environments. It effectively improves the purity of the received voice signal while also recovering the environmental information carried in the background noise. The background noise can be extracted effectively and compensated for when necessary, providing stable and efficient voice processing performance in a variety of complex scenes.
Drawings
FIG. 1 is a flow chart of a method of separating an acoustic array speech signal from background noise in accordance with the present invention;
fig. 2 is a schematic block diagram of a system for separating acoustic array speech signals from background noise in accordance with the present invention.
Detailed Description
The principle of the invention is as follows: the spatial feature matrix is subjected to feature decomposition to obtain a right feature matrix; the right feature matrix is used as the spatial feature observation matrix to perform feature-domain projection observation on the voice signals received by the acoustic array; and a joint optimization function is established to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal. This effectively improves the purity of the received voice signal while recovering the environmental information carried in the background noise; the background noise can be extracted effectively and compensated for when necessary, providing stable and efficient voice processing performance in a variety of complex scenes.
Example 1:
As shown in fig. 1, a method for separating an acoustic array voice signal from background noise includes the following steps:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the positional correlation of the receiving acoustic array, obtaining a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array;
S04: establishing an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred embodiment, the steering vector of the acoustic array is calculated in step S01 as:

β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T

wherein c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency.
The signal received by the n-th receiving element is r_n; in the presence of background noise, r_n is expressed as:

r_n = β(n)s_n + J_n

where s_n is the speech signal, J_n is the background noise, and β(n) is the n-th element of the steering vector β.
In a preferred embodiment, the spatial feature matrix in step S02 is G, whose elements are:

wherein Δd_ij is the distance between the i-th and j-th array elements, and β(i) is the i-th element of the steering vector β.
In a preferred embodiment, in step S03 the spatial feature matrix is decomposed as:

G = ΩΛΦ

wherein Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the eigenvalues of G, and Φ is the right feature matrix.
The feature-domain projection observation yields the spatial feature-domain projection signal Y = ΦR = ΦβS + ΦJ,
wherein R is the total signal received by the array, S is the speech signal received by the array, and J is the background signal received by the array; y_n is the feature-domain projection signal of the n-th array element, r_n is the received signal of the n-th array element, s_n is the speech signal received by the n-th array element, and J_n is the background signal received by the n-th array element, n = 1, 2, …, N.
In a preferred embodiment, the angle dictionary of candidate arrival directions of the voice signal established in step S04 is:

Ψ = [β(φ_1), β(φ_2), …, β(φ_Q)]

wherein φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
In a preferred embodiment, the spatial feature-domain projection signal of the acoustic array received signal in step S04 is further expressed as:

Y = ΦR = ΦΨSΞ + ΦJ

wherein Ξ is the spatial-domain sparse feature matrix of the voice signal and has sparse characteristics.
In a preferred embodiment, the joint optimization function established in step S04 jointly penalizes the sparsity of Ξ, the low rank of the background term, and the inter-element correlation:

wherein ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyperparameters, and Π is the correlation characteristic coefficient of the received signals among the receiving array elements.
In another embodiment, a computer storage medium has a computer program stored thereon, and the computer program when executed implements the above method for separating an acoustic array speech signal from background noise.
The method for separating the acoustic array voice signal from the background noise may be any of the methods described above; its details are not repeated here.
In yet another embodiment, as shown in fig. 2, a system for separating an acoustic array voice signal from background noise includes:
the acoustic array acquisition and calculation module 10, which acquires an acoustic array and calculates a steering vector of the acoustic array;
the spatial feature matrix calculation module 20, which obtains a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array based on the positional correlation of the receiving acoustic array;
the feature-domain projection observation calculation module 30, which performs feature decomposition on the spatial feature matrix to obtain a right feature matrix, takes the right feature matrix as the spatial feature observation matrix, and performs feature-domain projection observation on the voice signals received by the acoustic array;
the separation module 40, which establishes an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
Specifically, the workflow of the system for separating the acoustic array voice signal from background noise is described below by taking a preferred embodiment as an example:
Consider a receiving acoustic array with N receiving elements and element spacing d_0. The receiving steering vector of the array can be expressed as:

β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T (1)

where c is the speed of sound, θ is the signal direction angle, and f is the voice signal frequency.
The signal received by the n-th receiving element is r_n; in the presence of background noise, r_n may be expressed as:

r_n = β(n)s_n + J_n (2)

where s_n is the speech signal, J_n is the background noise, and β(i) is the i-th element of the receiving steering vector β.
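The steering vector and received-signal model above can be sketched numerically as follows. This is a minimal illustration, not part of the original disclosure: the element count, spacing, frequency, and arrival angle are illustrative assumptions, and the steering vector is assumed to take the standard uniform-linear-array form.

```python
import numpy as np

def steering_vector(theta_deg, n_elems=8, d0=0.04, f=1000.0, c=343.0):
    """Uniform-linear-array steering vector beta for arrival angle theta (degrees)."""
    theta = np.deg2rad(theta_deg)
    n = np.arange(n_elems)
    return np.exp(-1j * 2 * np.pi * f * n * d0 * np.sin(theta) / c)

# Received-signal model r_n = beta(n) * s_n + J_n for a single snapshot
rng = np.random.default_rng(0)
beta = steering_vector(30.0)                 # steering vector at 30 degrees
s = rng.standard_normal()                    # speech sample (placeholder)
J = 0.1 * rng.standard_normal(8)             # background noise at each element
r = beta * s + J                             # signal at the N receiving elements
```

Each entry of β has unit magnitude, so the model only shifts the phase of the speech component across the array while the noise term J_n adds independently at each element.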
Based on the positional correlation of the receiving acoustic array, this embodiment defines a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array, whose elements are:

wherein Δd_ij is the distance between the i-th and j-th array elements.
The spatial feature matrix indicating the positional relationship of each array element in the received sound array may be obtained by other methods, and is not limited herein.
Subsequently, the spatial feature matrix G is subjected to the feature decomposition G = ΩΛΦ, from which the right feature matrix Φ is obtained. The invention takes the right feature matrix Φ as the spatial feature observation matrix and performs feature-domain projection observation on the voice signals received by the acoustic array, i.e.,

Y = ΦR = ΦβS + ΦJ (5)

wherein R is the total signal received by the array, S is the speech signal received by the array, and J is the background signal received by the array.
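The feature decomposition G = ΩΛΦ and the projection Y = ΦR can be sketched as below. The exact element formula of the spatial feature matrix is not reproduced in this text, so the G built here from signed inter-element distances is only a hypothetical Hermitian stand-in with illustrative parameter values.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
pos = np.arange(N) * 0.04                      # element positions (m), illustrative
dd = pos[:, None] - pos[None, :]               # signed inter-element distances
# Hypothetical Hermitian stand-in for the spatial feature matrix G
G = np.exp(-1j * 2 * np.pi * 1000.0 * dd / 343.0)

# Feature decomposition G = Omega Lambda Phi; for a Hermitian G the right
# feature matrix is the conjugate transpose of the left one.
lam, Omega = np.linalg.eigh(G)
Phi = Omega.conj().T                           # right feature matrix (observation matrix)

# Feature-domain projection observation Y = Phi R on one received snapshot
R = rng.standard_normal(N) + 1j * rng.standard_normal(N)
Y = Phi @ R
```

Because Φ here is unitary, the projection is invertible, so no information in the received snapshot R is lost by observing it in the feature domain.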
The correlation characteristic coefficient Π of the received signals among the receiving array elements is:

wherein σ is a similarity constraint parameter and S_i, S_j are the speech signals received by the i-th and j-th array elements. The correlation characteristic coefficient characterizes the correlation of the pure speech signals received by different receiving array elements: the smaller the value of Π, the higher the correlation.
Finally, the angle dictionary of candidate arrival directions of the voice signal is defined:

Ψ = [β(φ_1), β(φ_2), …, β(φ_Q)]

wherein φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
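A common construction, assumed here since the original dictionary formula is not reproduced, stacks one steering vector per search angle to form Ψ; the grid and array parameters are illustrative.

```python
import numpy as np

def steering_vector(theta_deg, n_elems=8, d0=0.04, f=1000.0, c=343.0):
    """Uniform-linear-array steering vector (assumed form; parameters illustrative)."""
    theta = np.deg2rad(theta_deg)
    n = np.arange(n_elems)
    return np.exp(-1j * 2 * np.pi * f * n * d0 * np.sin(theta) / c)

# Angle dictionary Psi: column q is the steering vector at search angle phi_q
search_angles = np.linspace(-90.0, 90.0, 181)   # phi_1 ... phi_Q with Q = 181
Psi = np.stack([steering_vector(a) for a in search_angles], axis=1)  # shape N x Q
```

With such a dictionary, a source arriving from one of the grid angles excites essentially a single column, which is what makes the angular representation of the voice signal sparse.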
The spatial feature-domain projection signal of the acoustic array received signal represented by equation (5) may be further represented as:

Y = ΦR = ΦΨSΞ + ΦJ (9)

wherein Ξ is the spatial-domain sparse feature matrix of the voice signal and has sparse characteristics.
Therefore, the invention formulates the separation of the acoustic array voice signal from the background noise as the following joint optimization problem:

wherein ‖Ξ‖_0 is the zero norm of Ξ and represents the sparse characteristic, ‖ΦJ‖_* is the nuclear norm and represents the low-rank characteristic of the background noise, and η and μ are adjustable hyperparameters.
The above problem can be solved with the ADMM (alternating direction method of multipliers) algorithm, or with other known algorithms. After solving, the spatial-domain sparse feature matrix Ξ of the voice signal and the background clutter signal J are obtained, realizing the joint separation of the voice signal and the background noise.
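The joint optimization couples a sparse angular representation with a low-rank background term and a correlation penalty, and a full ADMM solver is beyond a short sketch. As a simplified, hypothetical illustration of the sparse-plus-low-rank idea only, the following alternates the proximal operators of the l1 norm (a convex surrogate of the zero norm) and the nuclear norm to split an observation matrix into a low-rank background part and a sparse part; the weights lam and tau and the toy data are illustrative assumptions, not the patent's solver.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def svt(x, t):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(soft(s, t)) @ vt

def sparse_lowrank_split(Y, lam=0.1, tau=1.0, iters=200):
    """Alternate proximal steps so that Y is split as L (low-rank) + S (sparse)."""
    L = np.zeros_like(Y)
    S = np.zeros_like(Y)
    for _ in range(iters):
        L = svt(Y - S, tau)      # update low-rank background estimate
        S = soft(Y - L, lam)     # update sparse component estimate
    return L, S

# Toy data: rank-1 "background" plus a few sparse spikes
rng = np.random.default_rng(2)
bg = np.outer(rng.standard_normal(8), rng.standard_normal(64))
spikes = np.zeros((8, 64))
spikes[rng.integers(0, 8, 10), rng.integers(0, 64, 10)] = 5.0
Y = bg + spikes
L, S = sparse_lowrank_split(Y)
```

After the final soft-thresholding step, every entry of the residual Y - L - S is at most lam in magnitude, so the two parts jointly account for the observation up to the sparsity threshold.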
The method can effectively improve the purity of the received voice signal and can also obtain the environment information carried in the background noise.
The foregoing examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited thereto; any change, modification, substitution, combination, or simplification that does not depart from the spirit and principles of the present invention shall be regarded as an equivalent replacement and falls within the protection scope of the present invention.
Claims (7)
1. A method for separating an acoustic array voice signal from background noise, comprising the following steps:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the positional correlation of the receiving acoustic array, obtaining a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array; the feature decomposition of the spatial feature matrix is:
G = ΩΛΦ
wherein Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the eigenvalues of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
wherein R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the n-th array element, r_n is the received signal of the n-th array element, s_n is the speech signal received by the n-th array element, and J_n is the background signal received by the n-th array element, n = 1, 2, …, N;
S04: establishing an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal; the spatial feature-domain projection signal of the acoustic array received signal is further expressed as:
Y = ΦR = ΦΨSΞ + ΦJ
wherein Ξ is the spatial-domain sparse feature matrix of the voice signal and has sparse characteristics;
the established joint optimization function is:
wherein ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyperparameters, Π is the correlation characteristic coefficient of the received signals among the receiving array elements, and Ψ is the angle dictionary.
2. The method of claim 1, wherein the steering vector of the acoustic array calculated in step S01 is:
β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T
wherein c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the n-th receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the n-th element of the steering vector β.
3. The method of claim 2, wherein the spatial feature matrix in step S02 is G, whose elements are:
wherein Δd_ij is the distance between the i-th and j-th array elements, and β(i) is the i-th element of the steering vector β.
4. The method for separating an acoustic array voice signal from background noise according to claim 1, wherein the angle dictionary established in step S04 is:
Ψ = [β(φ_1), β(φ_2), …, β(φ_Q)]
wherein φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
5. A system for separating an acoustic array voice signal from background noise, comprising:
the acoustic array acquisition and calculation module, which acquires an acoustic array and calculates a steering vector of the acoustic array;
the spatial feature matrix calculation module, which obtains a spatial feature matrix representing the positional relationship of the array elements in the receiving acoustic array based on the positional correlation of the receiving acoustic array;
the feature-domain projection observation calculation module, which performs feature decomposition on the spatial feature matrix to obtain a right feature matrix, takes the right feature matrix as the spatial feature observation matrix, and performs feature-domain projection observation on the voice signals received by the acoustic array; the feature decomposition of the spatial feature matrix is:
G = ΩΛΦ
wherein Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the eigenvalues of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
wherein R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the n-th array element, r_n is the received signal of the n-th array element, s_n is the speech signal received by the n-th array element, and J_n is the background signal received by the n-th array element, n = 1, 2, …, N;
the separation module, which establishes an angle dictionary of candidate arrival directions of the voice signal to represent the spatial feature-domain projection of the array's received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal; the spatial feature-domain projection signal of the acoustic array received signal is further expressed as:
Y = ΦR = ΦΨSΞ + ΦJ
wherein Ξ is the spatial-domain sparse feature matrix of the voice signal and has sparse characteristics;
the established joint optimization function is:
wherein ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyperparameters, Π is the correlation characteristic coefficient of the received signals among the receiving array elements, and Ψ is the angle dictionary.
6. The system for separating an acoustic array voice signal from background noise according to claim 5, wherein the steering vector of the acoustic array calculated by the acoustic array acquisition and calculation module is:
β = [1, e^{-j2πfd_0 sinθ/c}, …, e^{-j2πf(N-1)d_0 sinθ/c}]^T
wherein c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the n-th receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the n-th element of the steering vector β.
7. A computer storage medium having stored thereon a computer program, which when executed implements the method of separating an acoustic array speech signal from background noise of any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN202410449206.6A | 2024-04-15 | 2024-04-15 | Method, system and storage medium for separating sound array voice signal from background noise
Publications (2)
Publication Number | Publication Date
---|---
CN118262737A | 2024-06-28
CN118262737B | 2024-10-29
Family
- Family ID: 91606738
- CN202410449206.6A (filed 2024-04-15), granted as CN118262737B, status Active, authority CN
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN102915742A | 2012-10-30 | 2013-02-06 | | Single-channel monitor-free voice and noise separating method based on low-rank and sparse matrix decomposition
CN111724806A | 2020-06-05 | 2020-09-29 | | Double-visual-angle single-channel voice separation method based on deep neural network
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US10957337B2 | 2018-04-11 | 2021-03-23 | Microsoft Technology Licensing, LLC | Multi-microphone speech separation
CN114879133A | 2022-04-27 | 2022-08-09 | | Sparse angle estimation method under multipath and Gaussian color noise environment
CN115171716B | 2022-06-14 | 2024-04-19 | | Continuous voice separation method and system based on spatial feature clustering and electronic equipment
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant