Bub et al., 1995

Knowing who to listen to in speech recognition: Visually guided beamforming

Document ID: 2428160260266937212
Author: Bub U; Hunke M; Waibel A
Publication year: 1995
Publication venue: 1995 International Conference on Acoustics, Speech, and Signal Processing

Snippet

With speech recognition systems steadily improving in performance, freedom from head-sets and push-buttons to activate the recognizer is one of the most important issues to achieve user acceptance. Microphone arrays and beamforming can deliver signals that suppress …

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
- G10L2021/02166—Microphone arrays; Beamforming
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities; audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
- H04M3/569—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities; audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants; using the instant speaker's algorithm
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R3/00—Circuits for transducers, loudspeakers or microphones
- H04R3/005—Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10K—SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
- G10K11/00—Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
- G10K11/18—Methods or devices for transmitting, conducting, or directing sound
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification

Similar Documents

Publication	Publication Date	Title
Donley et al.	2021	Easycom: An augmented reality dataset to support algorithms for easy communication in noisy environments
DiBiase	2000	A high-accuracy, low-latency technique for talker localization in reverberant environments using microphone arrays
JP6464449B2 (en)	2019-02-06	Sound source separation apparatus and sound source separation method
KR101444100B1 (en)	2014-09-26	Noise cancelling method and apparatus from the mixed sound
JP2022544138A (en)	2022-10-17	Systems and methods for assisting selective listening
US20200184991A1 (en)	2020-06-11	Sound class identification using a neural network
EP2320676A1 (en)	2011-05-11	Method, communication device and communication system for controlling sound focusing
Taherian et al.	2022	Multi-channel talker-independent speaker separation through location-based training
Khan et al.	2013	Video-aided model-based source separation in real reverberant rooms
CN111078185A (en)	2020-04-28	Method and equipment for recording sound
JP2022062875A (en)	2022-04-21	Audio signal processing method and audio signal processing apparatus
Tesch et al.	2023	Spatially selective deep non-linear filters for speaker extraction
Pertilä	2013	Online blind speech separation using multiple acoustic speaker tracking and time–frequency masking
KR101976937B1 (en)	2019-05-10	Apparatus for automatic conference notetaking using mems microphone array
Rabinkin	1998	Optimum sensor placement for microphone arrays
Nakadai et al.	2002	Exploiting auditory fovea in humanoid-human interaction
Ihara et al.	2007	Multichannel speech separation and localization by frequency assignment
KR102412148B1 (en)	2022-06-22	Beamforming method and beamforming system using neural network
JP2022062876A (en)	2022-04-21	Audio signal processing method and audio signal processing apparatus
Đurković	2012	Localization, tracking, and separation of sound sources for cognitive robots
Flanagan et al.	1997	Sound capture with three-dimensional selectivity
Wilson et al.	2002	Audiovisual arrays for untethered spoken interfaces
CN113785357A (en)	2021-12-10	Open active noise cancellation system
Brückmann et al.	2006	Integration of a sound source detection into a probabilistic-based multimodal approach for person detection and tracking