Bub et al., 1995 - Google Patents

Knowing who to listen to in speech recognition: Visually guided beamforming

Document ID
2428160260266937212
Author
Bub U
Hunke M
Waibel A
Publication year
1995
Publication venue
1995 International Conference on Acoustics, Speech, and Signal Processing

Snippet

With speech recognition systems steadily improving in performance, freedom from head-sets and push-buttons to activate the recognizer is one of the most important issues to achieve user acceptance. Microphone arrays and beamforming can deliver signals that suppress …

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • H04M3/569 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants using the instant speaker's algorithm
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/18 Methods or devices for transmitting, conducting, or directing sound
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification

Similar Documents

Publication | Title
Bub et al. | Knowing who to listen to in speech recognition: Visually guided beamforming
Donley et al. | Easycom: An augmented reality dataset to support algorithms for easy communication in noisy environments
DiBiase | A high-accuracy, low-latency technique for talker localization in reverberant environments using microphone arrays
JP6464449B2 (en) | Sound source separation apparatus and sound source separation method
KR101444100B1 (en) | Noise cancelling method and apparatus from the mixed sound
JP2022544138A (en) | Systems and methods for assisting selective listening
US20200184991A1 (en) | Sound class identification using a neural network
EP2320676A1 (en) | Method, communication device and communication system for controlling sound focusing
Taherian et al. | Multi-channel talker-independent speaker separation through location-based training
Khan et al. | Video-aided model-based source separation in real reverberant rooms
CN111078185A (en) | Method and equipment for recording sound
JP2022062875A (en) | Audio signal processing method and audio signal processing apparatus
Tesch et al. | Spatially selective deep non-linear filters for speaker extraction
Pertilä | Online blind speech separation using multiple acoustic speaker tracking and time–frequency masking
KR101976937B1 (en) | Apparatus for automatic conference notetaking using mems microphone array
Rabinkin | Optimum sensor placement for microphone arrays
Nakadai et al. | Exploiting auditory fovea in humanoid-human interaction
Ihara et al. | Multichannel speech separation and localization by frequency assignment
KR102412148B1 (en) | Beamforming method and beamforming system using neural network
JP2022062876A (en) | Audio signal processing method and audio signal processing apparatus
Đurković | Localization, tracking, and separation of sound sources for cognitive robots
Flanagan et al. | Sound capture with three-dimensional selectivity
Wilson et al. | Audiovisual arrays for untethered spoken interfaces
CN113785357A (en) | Open active noise cancellation system
Brückmann et al. | Integration of a sound source detection into a probabilistic-based multimodal approach for person detection and tracking