
CN118262737B - Method, system and storage medium for separating sound array voice signal from background noise - Google Patents

Method, system and storage medium for separating sound array voice signal from background noise Download PDF

Info

Publication number
CN118262737B
CN118262737B (Application CN202410449206.6A)
Authority
CN
China
Prior art keywords
array
signal
matrix
received
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410449206.6A
Other languages
Chinese (zh)
Other versions
CN118262737A (en)
Inventor
韩瑜
鲍彧
黄克迪
郭宁宇
顾恺然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Institute of Technology
Original Assignee
Changzhou Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Institute of Technology filed Critical Changzhou Institute of Technology
Priority to CN202410449206.6A priority Critical patent/CN118262737B/en
Publication of CN118262737A publication Critical patent/CN118262737A/en
Application granted granted Critical
Publication of CN118262737B publication Critical patent/CN118262737B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272: Voice signal separating
    • G10L21/028: Voice signal separating using properties of sound source
    • G10L21/0308: Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a method, a system and a storage medium for separating an acoustic array voice signal from background noise, comprising the following steps: acquiring an acoustic array and calculating its steering vector; based on the positional correlation of the receiving acoustic array, obtaining a spatial feature matrix representing the positional relation of the array elements in the receiving acoustic array; performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array; and establishing an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal. The method can effectively improve the purity of the received voice signal and can also recover the environmental information carried in the background noise.

Description

Method, system and storage medium for separating sound array voice signal from background noise
Technical Field
The invention belongs to the technical field of processing of voice signals, and relates to a method, a system and a storage medium for separating an acoustic array voice signal from background noise.
Background
Background noise suppression is essential in speech array signal processing. First, suppressing background noise improves the clarity and quality of speech signals, especially in complex environments such as conference rooms or traffic scenes. Second, it significantly improves the accuracy of speech recognition systems and avoids misrecognition caused by noise. For voice communication systems, suppressing background noise improves communication quality and enhances the user experience. In complex environments, it helps the system cope with challenging conditions and ensures the robustness of the speech system. For voice control and interactive applications, noise suppression makes the system more responsive and improves the human-computer interaction experience. In general, background noise suppression in speech array signal processing optimizes speech signal quality and improves system accuracy and adaptability to meet practical demands in different environments. On the other hand, in some contexts the noise itself carries cues about changes in the surrounding environment, such as crowd noise or traffic noise. By extracting and analyzing the background noise, a system can better understand its surroundings and provide more intelligent services. Finally, given the diversity of background noise in practical applications, extracting it allows the system to adapt more flexibly to different environments. However, existing background noise extraction algorithms are little studied and cannot extract background noise effectively.
Disclosure of Invention
The invention aims to provide a method, a system and a storage medium for separating an acoustic array voice signal from background noise, which can effectively improve the purity of a received voice signal and can also obtain environment information carried in the background noise.
The technical solution for realizing the purpose of the invention is as follows:
a method of separating a sound array speech signal from background noise, comprising the steps of:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the position correlation of the receiving sound array, obtaining a space feature matrix representing the position relation of each array element in the receiving sound array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature domain projection observation on the voice signals received by the acoustic array;
S04: establishing an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred embodiment, the steering vector β of the acoustic array calculated in step S01 is:
where c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the nth receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the steering vector β.
In a preferred technical scheme, the spatial feature matrix in step S02 is:
where the elements in the matrix are:
where Δd_ij is the distance between the ith and jth array elements, and β(i) is the ith element of the steering vector β.
In a preferred technical scheme, the feature decomposition of the spatial feature matrix in step S03 is:
G = ΩΛΦ
where Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the feature values of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial-feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
where R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the nth array element, r_n is the received signal of the nth array element, s_n is the speech signal received by the nth array element, J_n is the background signal received by the nth array element, and n = 1, 2, …, N.
In a preferred technical solution, the angle dictionary Ψ of voice-signal arrival directions established in step S04 is:
where φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
In a preferred embodiment, the spatial-feature-domain projection signal of the acoustic array received signal in step S04 is further expressed as:
Y = ΦR = ΦΨSΞ + ΦJ
where Ξ is the spatial-domain sparse feature matrix of the voice signal and has the sparse property.
In a preferred technical solution, the joint optimization function established in step S04 is:
where ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyper-parameters, and Π is the correlation characteristic coefficient of the received signals among the receiving array elements.
The invention also discloses a system for separating the voice signal of the acoustic array from the background noise, which comprises:
the acoustic array acquisition and calculation module acquires an acoustic array and calculates a steering vector of the acoustic array;
The space feature matrix calculation module is used for obtaining a space feature matrix representing the position relation of each array element in the receiving sound array based on the position correlation of the receiving sound array;
The characteristic domain projection observation calculation module is used for carrying out characteristic decomposition on the space characteristic matrix to obtain a right characteristic matrix, taking the right characteristic matrix as a space characteristic observation matrix and carrying out characteristic domain projection observation on the voice signals received by the acoustic array;
The separation module establishes an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred technical solution, the steering vector β of the acoustic array calculated in the acoustic array acquisition and calculation module is:
where c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the nth receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the steering vector β.
The invention also discloses a computer storage medium, on which a computer program is stored, which when executed realizes the above-mentioned method for separating the acoustic array voice signal from the background noise.
Compared with the prior art, the invention has the remarkable advantages that:
The method can more effectively separate the undistorted voice signal from the background noise by utilizing the voice signal sparse feature decomposition and feature domain projection observation method, and is applicable to different environments. The purity of the received voice signal can be effectively improved, and meanwhile, the environment information carried in the background noise can be obtained. The background noise can be effectively extracted and correspondingly adjusted when necessary, so that stable and efficient voice processing performance can be provided in various complex scenes.
Drawings
FIG. 1 is a flow chart of a method of separating an acoustic array speech signal from background noise in accordance with the present invention;
fig. 2 is a schematic block diagram of a system for separating acoustic array speech signals from background noise in accordance with the present invention.
Detailed Description
The principle of the invention is as follows: the space feature matrix is subjected to feature decomposition to obtain a right feature matrix, the right feature matrix is used as a space feature observation matrix, feature domain projection observation is performed on voice signals received by the acoustic array, and a joint optimization function is established to obtain a space domain sparse feature matrix and a background clutter signal of the voice signals, so that the purity of the received voice signals can be effectively improved, and meanwhile, the environment information carried in the background noise can be obtained. The background noise can be effectively extracted and correspondingly adjusted when necessary, so that stable and efficient voice processing performance can be provided in various complex scenes.
Example 1:
As shown in fig. 1, a method for separating a sound array speech signal from background noise includes the following steps:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the position correlation of the receiving sound array, obtaining a space feature matrix representing the position relation of each array element in the receiving sound array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature domain projection observation on the voice signals received by the acoustic array;
S04: establishing an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
In a preferred embodiment, the steering vector β of the acoustic array is calculated in step S01 as:
where c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the nth receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the steering vector β.
In a preferred embodiment, the spatial feature matrix in step S02 is:
where the elements in the matrix are:
where Δd_ij is the distance between the ith and jth array elements, and β(i) is the ith element of the steering vector β.
In a preferred embodiment, in step S03 the feature decomposition of the spatial feature matrix is:
G = ΩΛΦ
where Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the feature values of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial-feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
where R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the nth array element, r_n is the received signal of the nth array element, s_n is the speech signal received by the nth array element, J_n is the background signal received by the nth array element, and n = 1, 2, …, N.
In a preferred embodiment, the angle dictionary Ψ of voice-signal arrival directions established in step S04 is:
where φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
In a preferred embodiment, the spatial-feature-domain projection signal of the acoustic array received signal in step S04 is further expressed as:
Y = ΦR = ΦΨSΞ + ΦJ
where Ξ is the spatial-domain sparse feature matrix of the voice signal and has the sparse property.
In a preferred embodiment, the joint optimization function established in step S04 is:
where ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyper-parameters, and Π is the correlation characteristic coefficient of the received signals among the receiving array elements.
In another embodiment, a computer storage medium has a computer program stored thereon, and the computer program when executed implements the above method for separating an acoustic array speech signal from background noise.
The method for separating the acoustic array voice signal from the background noise can be any one of the above methods for separating the acoustic array voice signal from the background noise, and detailed description thereof is omitted herein.
In yet another embodiment, as shown in fig. 2, a system for separating an acoustic array speech signal from background noise, includes:
The acoustic array acquisition and calculation module 10 acquires an acoustic array and calculates a steering vector of the acoustic array;
The space feature matrix calculation module 20 obtains a space feature matrix representing the position relation of each array element in the receiving sound array based on the position correlation of the receiving sound array;
The feature domain projection observation calculation module 30 performs feature decomposition on the spatial feature matrix to obtain a right feature matrix, takes the right feature matrix as a spatial feature observation matrix, and performs feature domain projection observation on the voice signals received by the acoustic array;
The separation module 40 establishes an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal.
Specifically, the workflow of the system for separating the acoustic array speech signal from background noise is described below by way of a preferred embodiment:
Consider a receiving acoustic array with N receiving elements at spacing d_0. Its receiving steering vector can be expressed as:
β = [1, e^(-j2πf·d_0·sinθ/c), …, e^(-j2πf·(N-1)·d_0·sinθ/c)]^T (1)
where c is the speed of sound, θ is the signal direction angle, and f is the speech signal frequency.
The signal received by the nth receiving element is r_n; in the presence of background noise, r_n may be expressed as:
r_n = β(n)s_n + J_n (2)
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the receiving steering vector β.
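The array model above can be sketched numerically. The snippet below is a minimal illustration that assumes the standard uniform-linear-array steering vector β(n) = e^(-j2πf·n·d_0·sinθ/c); the patent's own equation image is not reproduced in this text, so that exact form, and all parameter values, are assumptions for illustration only:

```python
import numpy as np

def steering_vector(n_elems, d0, theta_deg, f, c=343.0):
    """Assumed standard steering vector of an N-element uniform linear array."""
    n = np.arange(n_elems)
    theta = np.deg2rad(theta_deg)
    return np.exp(-1j * 2 * np.pi * f * n * d0 * np.sin(theta) / c)

# Synthetic received signal r_n = beta(n) * s_n + J_n over T snapshots
N, T = 8, 256
rng = np.random.default_rng(0)
beta = steering_vector(N, d0=0.05, theta_deg=20.0, f=1000.0)
s = rng.standard_normal(T)              # placeholder speech snapshots
J = 0.1 * rng.standard_normal((N, T))   # placeholder background noise
R = beta[:, None] * s[None, :] + J      # N x T matrix of element signals
```

Each row of `R` is one element's received signal; the whole matrix plays the role of the array observation processed in the following steps.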
This embodiment defines, based on the positional correlation of the receiving acoustic array, a spatial feature matrix G representing the positional relation of the array elements in the receiving acoustic array:
where the elements in the matrix are:
where Δd_ij is the distance between the ith and jth array elements.
The spatial feature matrix indicating the positional relation of the array elements in the receiving acoustic array may also be obtained by other methods, which are not limited here.
Subsequently, the spatial feature matrix G is subjected to feature decomposition G = ΩΛΦ and its right feature matrix Φ is obtained. The invention takes the right feature matrix Φ as the spatial feature observation matrix and performs feature-domain projection observation on the voice signals received by the acoustic array, i.e.
Y = ΦR = ΦβS + ΦJ (5)
where R, S and J stack the per-element received signals r_n, speech signals s_n and background signals J_n, and Y stacks the feature-domain projection signals y_n, n = 1, 2, …, N.
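The decomposition-and-projection step can be sketched as follows. The patent's exact expression for the spatial feature matrix (built from the spacings Δd_ij and the steering vector) is not reproduced in this text, so a Hermitian placeholder G is used; the point of the sketch is the factorization G = ΩΛΦ and the projection Y = ΦR:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 8, 64
R = rng.standard_normal((N, T)) + 1j * rng.standard_normal((N, T))

# Placeholder Hermitian spatial feature matrix (exact patent form assumed away)
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
G = A @ A.conj().T / N

# Feature decomposition G = Omega Lambda Phi: for Hermitian G, eigh gives
# G = V diag(w) V^H, so Omega = V, Lambda = diag(w), right matrix Phi = V^H.
w, V = np.linalg.eigh(G)
Omega, Lam, Phi = V, np.diag(w), V.conj().T

# Feature-domain projection observation Y = Phi R
Y = Phi @ R
```

For a Hermitian G the right feature matrix is unitary, so the projection is invertible and no information in R is lost.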
The correlation characteristic coefficient Π of the received signals among the receiving array elements is:
where σ is a similarity constraint parameter and S_i, S_j are the speech signals received by the ith and jth array elements. The correlation characteristic coefficient characterizes the correlation of the pure speech signals received by different receiving array elements: the smaller the value of Π, the higher the correlation.
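Since the exact expression for Π is not reproduced in this text, the sketch below is a hypothetical stand-in consistent with the description: an average pairwise dissimilarity of the per-element speech signals, scaled by the similarity constraint parameter σ, that is smaller when the signals are more correlated. The function name and formula are illustrative assumptions, not the patent's definition:

```python
import numpy as np

def correlation_coefficient(S, sigma=1.0):
    """Hypothetical Pi: mean pairwise dissimilarity of per-element speech
    signals; smaller values indicate higher inter-element correlation."""
    N = S.shape[0]
    total, pairs = 0.0, 0
    for i in range(N):
        for j in range(i + 1, N):
            num = np.abs(np.vdot(S[i], S[j]))
            den = np.linalg.norm(S[i]) * np.linalg.norm(S[j]) + 1e-12
            total += (1.0 - num / den) / sigma   # 0 when rows are collinear
            pairs += 1
    return total / pairs
```

Identical signals at every element give Π ≈ 0 (maximal correlation), while independent signals give a strictly positive value, matching the "smaller Π, higher correlation" convention in the text.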
Finally, define the angle dictionary Ψ of voice-signal arrival directions:
where φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
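Building the angle dictionary can be sketched as stacking one steering vector per search arrival angle. The steering-vector form and all parameter values are assumptions (the patent's equation images are not reproduced here):

```python
import numpy as np

def steering_vector(n_elems, d0, theta_deg, f, c=343.0):
    # Assumed standard ULA steering vector (exact patent form not reproduced)
    n = np.arange(n_elems)
    return np.exp(-1j * 2 * np.pi * f * n * d0 *
                  np.sin(np.deg2rad(theta_deg)) / c)

# Angle dictionary Psi: one column per search arrival angle phi_1..phi_Q
N, Q = 8, 181
search_angles = np.linspace(-90.0, 90.0, Q)
Psi = np.stack([steering_vector(N, 0.05, a, 1000.0)
                for a in search_angles], axis=1)   # N x Q dictionary
```

A signal arriving from one of the grid angles is then (approximately) a single column of `Psi`, which is what makes the coefficient matrix Ξ sparse.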
The spatial-feature-domain projection signal of the acoustic array received signal represented by equation (5) may be further represented as:
Y = ΦR = ΦΨSΞ + ΦJ (9)
where Ξ, the spatial-domain sparse feature matrix of the voice signal, has the sparse property.
Therefore, the invention formulates the separation of the acoustic array voice signal from the background noise as the following joint optimization problem:
where ‖Ξ‖_0 is the zero norm of Ξ and represents its sparse characteristic, ‖ΦJ‖_* is the nuclear norm and represents the low-rank characteristic of the projected background signal ΦJ, and η and μ are adjustable hyper-parameters.
The above expression can be solved with the ADMM (alternating direction method of multipliers) algorithm, or with other known algorithms. After solving, the spatial-domain sparse feature matrix Ξ of the voice signal and the background clutter signal J are obtained, realizing the joint separation of the voice signal and the background noise.
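The solution step above can be sketched with a generic alternating proximal scheme of the kind used in robust PCA and commonly solved via ADMM. This is an illustrative sketch, not the patent's exact iteration: the zero norm is relaxed to its usual ℓ1 surrogate handled by element-wise soft-thresholding, the nuclear-norm term is handled by singular value thresholding, and `lam`/`mu` stand in for the adjustable hyper-parameters:

```python
import numpy as np

def soft_threshold(X, tau):
    """Proximal operator of tau*||.||_1 (element-wise shrinkage)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: proximal operator of tau*||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def separate_sparse_lowrank(Y, lam=0.1, mu=1.0, iters=100):
    """Alternate proximal updates splitting Y into sparse + low-rank parts."""
    S = np.zeros_like(Y)   # sparse component (speech-feature side)
    L = np.zeros_like(Y)   # low-rank component (background side)
    for _ in range(iters):
        L = svt(Y - S, 1.0 / mu)             # low-rank update
        S = soft_threshold(Y - L, lam / mu)  # sparse update
    return S, L
```

The sparse output plays the role of the projected speech term and the low-rank output the role of the projected background clutter; recovering Ξ itself would additionally require inverting the dictionary representation ΦΨS.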
The method can effectively improve the purity of the received voice signal and can also obtain the environment information carried in the background noise.
The foregoing examples are preferred embodiments of the present invention, but the embodiments of the present invention are not limited to the foregoing examples, and any other changes, modifications, substitutions, combinations, and simplifications that do not depart from the spirit and principles of the present invention should be made therein and are intended to be equivalent substitutes within the scope of the present invention.

Claims (7)

1. A method for separating a speech signal from background noise in an acoustic array, comprising the steps of:
S01: acquiring an acoustic array and calculating a steering vector of the acoustic array;
S02: based on the position correlation of the receiving sound array, obtaining a space feature matrix representing the position relation of each array element in the receiving sound array;
S03: performing feature decomposition on the spatial feature matrix to obtain a right feature matrix, taking the right feature matrix as a spatial feature observation matrix, and performing feature-domain projection observation on the voice signals received by the acoustic array; the feature decomposition of the spatial feature matrix is:
G = ΩΛΦ
where Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the feature values of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial-feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
where R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the nth array element, r_n is the received signal of the nth array element, s_n is the speech signal received by the nth array element, J_n is the background signal received by the nth array element, and n = 1, 2, …, N;
S04: establishing an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishing a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal; the spatial-feature-domain projection signal of the acoustic array received signal is further represented as:
Y = ΦR = ΦΨSΞ + ΦJ
where Ξ is the spatial-domain sparse feature matrix of the voice signal and has the sparse property;
the established joint optimization function is:
where ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyper-parameters, Π is the correlation characteristic coefficient of the received signals among the receiving array elements, and Ψ is the angle dictionary.
2. The method of claim 1, wherein the steering vector of the acoustic array calculated in step S01 is:
where c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the nth receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the steering vector β.
3. The method of claim 2, wherein the spatial feature matrix in step S02 is:
where the elements in the matrix are:
where Δd_ij is the distance between the ith and jth array elements, and β(i) is the ith element of the steering vector β.
4. The method for separating acoustic array speech signals from background noise according to claim 1, wherein the angle dictionary established in step S04 is:
where φ_1, φ_2, …, φ_Q is the set of search arrival angles and Q is the number of search arrival angles.
5. A system for separating a sound array speech signal from background noise, comprising:
the acoustic array acquisition and calculation module acquires an acoustic array and calculates a steering vector of the acoustic array;
The space feature matrix calculation module is used for obtaining a space feature matrix representing the position relation of each array element in the receiving sound array based on the position correlation of the receiving sound array;
the feature-domain projection observation calculation module performs feature decomposition on the spatial feature matrix to obtain a right feature matrix, takes the right feature matrix as a spatial feature observation matrix, and performs feature-domain projection observation on the voice signals received by the acoustic array; the feature decomposition of the spatial feature matrix is:
G = ΩΛΦ
where Ω is the left feature matrix, Λ is a diagonal matrix whose diagonal elements are the feature values of G, and Φ is the right feature matrix;
the feature-domain projection observation yields the spatial-feature-domain projection signal Y = ΦR = ΦβS + ΦJ;
where R is the total signal received by the array, S is the speech signal received by the array, J is the background signal received by the array, y_n is the feature-domain projection signal of the nth array element, r_n is the received signal of the nth array element, s_n is the speech signal received by the nth array element, J_n is the background signal received by the nth array element, and n = 1, 2, …, N;
the separation module establishes an angle dictionary of voice-signal arrival directions to represent the spatial-feature-domain projection signals of the acoustic array received signals, and establishes a joint optimization function to obtain the spatial-domain sparse feature matrix of the voice signal and the background clutter signal; the spatial-feature-domain projection signal of the acoustic array received signal is further represented as:
Y = ΦR = ΦΨSΞ + ΦJ
where Ξ is the spatial-domain sparse feature matrix of the voice signal and has the sparse property;
the established joint optimization function is:
where ‖Ξ‖_0 is the zero norm of Ξ, ‖ΦJ‖_* is the nuclear norm, η and μ are adjustable hyper-parameters, Π is the correlation characteristic coefficient of the received signals among the receiving array elements, and Ψ is the angle dictionary.
6. The system for separating a sound array speech signal from background noise according to claim 5, wherein the steering vector of the acoustic array calculated by the acoustic array acquisition and calculation module is:
where c is the sound velocity, θ is the signal direction angle, N is the number of receiving array elements, d_0 is the array element spacing, and f is the voice signal frequency;
the signal received by the nth receiving element is r_n, and in the presence of background noise, r_n is expressed as:
r_n = β(n)s_n + J_n
where s_n is the speech signal, J_n is the background noise, and β(n) is the nth element of the steering vector β.
7. A computer storage medium having stored thereon a computer program, which when executed implements the method of separating an acoustic array speech signal from background noise of any of claims 1-4.
CN202410449206.6A 2024-04-15 2024-04-15 Method, system and storage medium for separating sound array voice signal from background noise Active CN118262737B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410449206.6A CN118262737B (en) 2024-04-15 2024-04-15 Method, system and storage medium for separating sound array voice signal from background noise

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410449206.6A CN118262737B (en) 2024-04-15 2024-04-15 Method, system and storage medium for separating sound array voice signal from background noise

Publications (2)

Publication Number Publication Date
CN118262737A CN118262737A (en) 2024-06-28
CN118262737B true CN118262737B (en) 2024-10-29

Family

ID=91606738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410449206.6A Active CN118262737B (en) 2024-04-15 2024-04-15 Method, system and storage medium for separating sound array voice signal from background noise

Country Status (1)

Country Link
CN (1) CN118262737B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915742A (en) * 2012-10-30 2013-02-06 中国人民解放军理工大学 Single-channel monitor-free voice and noise separating method based on low-rank and sparse matrix decomposition
CN111724806A (en) * 2020-06-05 2020-09-29 太原理工大学 Double-visual-angle single-channel voice separation method based on deep neural network

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10957337B2 (en) * 2018-04-11 2021-03-23 Microsoft Technology Licensing, Llc Multi-microphone speech separation
CN114879133A (en) * 2022-04-27 2022-08-09 西安电子科技大学 Sparse angle estimation method under multipath and Gaussian color noise environment
CN115171716B (en) * 2022-06-14 2024-04-19 武汉大学 Continuous voice separation method and system based on spatial feature clustering and electronic equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102915742A (en) * 2012-10-30 2013-02-06 中国人民解放军理工大学 Single-channel monitor-free voice and noise separating method based on low-rank and sparse matrix decomposition
CN111724806A (en) * 2020-06-05 2020-09-29 太原理工大学 Double-visual-angle single-channel voice separation method based on deep neural network

Also Published As

Publication number Publication date
CN118262737A (en) 2024-06-28

Similar Documents

Publication Publication Date Title
CN112088402B (en) Federated neural network for speaker recognition
US8583428B2 (en) Sound source separation using spatial filtering and regularization phases
CN107534725B (en) Voice signal processing method and device
AU2022200439B2 (en) Multi-modal speech separation method and system
CN109599124A (en) A kind of audio data processing method, device and storage medium
CN111462733B (en) Multi-modal speech recognition model training method, device, equipment and storage medium
US11817112B2 (en) Method, device, computer readable storage medium and electronic apparatus for speech signal processing
CN109410956B (en) Object identification method, device, equipment and storage medium of audio data
US20220115002A1 (en) Speech recognition method, speech recognition device, and electronic equipment
Wang et al. Deep learning assisted time-frequency processing for speech enhancement on drones
EP4207195A1 (en) Speech separation method, electronic device, chip and computer-readable storage medium
CN110111808A (en) Acoustic signal processing method and Related product
CN110188179B (en) Voice directional recognition interaction method, device, equipment and medium
CN112466327B (en) Voice processing method and device and electronic equipment
CN109308909B (en) Signal separation method and device, electronic equipment and storage medium
CN112786072A (en) Ship classification and identification method based on propeller radiation noise
KR20230134613A (en) Multi channel voice activity detection
CN116385546A (en) Multi-mode feature fusion method for simultaneously segmenting and detecting grabbing pose
CN118262737B (en) Method, system and storage medium for separating sound array voice signal from background noise
US20240371387A1 (en) Area sound pickup method and system of small microphone array device
CN115171716B (en) Continuous voice separation method and system based on spatial feature clustering and electronic equipment
CN114664288A (en) Voice recognition method, device, equipment and storage medium
CN115662394A (en) Voice extraction method, device, storage medium and electronic device
CN117198311A (en) Voice control method and device based on voice noise reduction
CN112489678B (en) Scene recognition method and device based on channel characteristics

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant