Mask Optimisation for Neural Network Monaural Source Separation

Document ID: 6444672445922813662
Author: Cant R; Langensiepen C; Metcalf W
Publication year: 2017
Publication venue: 2017 UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim)

Snippet

An ideal binary mask is a means by which multiple sound sources within a single audio file can be separated. Previous work has shown a deep neural network can be trained to approximate the ideal mask, but at a substantial computational cost. We present a method to …
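
As a rough illustration of the ideal binary mask idea mentioned in the snippet, the sketch below marks each time-frequency cell with 1 where the target source is stronger than the interference and 0 elsewhere, then applies that mask to the mixture. It is not the authors' method: the function names `ideal_binary_mask` and `apply_mask`, the `threshold_db` parameter, the toy magnitude values, and the simplification that mixture magnitudes are the sum of the source magnitudes are all assumptions made for this example.

```python
# Illustrative sketch only; names and values are hypothetical, not from the paper.

def ideal_binary_mask(target_mag, interference_mag, threshold_db=0.0):
    """Return a 0/1 grid: 1 where the target exceeds the interference
    by more than threshold_db in a time-frequency cell, else 0."""
    ratio = 10.0 ** (threshold_db / 20.0)  # dB threshold as an amplitude ratio
    return [
        [1 if t > ratio * i else 0 for t, i in zip(t_row, i_row)]
        for t_row, i_row in zip(target_mag, interference_mag)
    ]


def apply_mask(mixture_mag, mask):
    """Keep mixture cells where the mask is 1 and zero out the rest."""
    return [
        [m * k for m, k in zip(m_row, k_row)]
        for m_row, k_row in zip(mixture_mag, mask)
    ]


# Toy magnitude grids: rows are time frames, columns are frequency bins.
target = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
interference = [[0.1, 0.7, 0.3], [0.2, 0.2, 0.6]]

# Assumption: mixture magnitude is the sum of source magnitudes
# (a real mixture adds complex spectra, so phase also matters).
mixture = [
    [t + i for t, i in zip(t_row, i_row)]
    for t_row, i_row in zip(target, interference)
]

mask = ideal_binary_mask(target, interference)
separated = apply_mask(mixture, mask)
print(mask)
```

Computing this mask needs the isolated sources, so it is only available when the ground truth is known; that is why the snippet describes a network being trained to approximate it.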

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/02—Means for controlling the tone frequencies, e.g. attack, decay; Means for producing special musical effects, e.g. vibrato, glissando
- G10H1/06—Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/025—Envelope processing of music signals in, e.g. time domain, transform domain or cepstrum domain
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
- G10H7/08—Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform
- G10H7/10—Instruments in which the tones are synthesised from a data store, e.g. computer organs by calculating functions or polynomial approximations to evaluate amplitudes at successive sample points of a tone waveform using coefficients or parameters stored in a memory, e.g. Fourier coefficients
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition

Similar Documents

Publication	Publication Date	Title
Cano et al.	2018	Musical source separation: An introduction
Shi et al.	2006	On the importance of phase in human speech recognition
Li et al.	2006	Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
DE102012103553A1 (en)	2013-01-17	AUDIO SYSTEM AND METHOD FOR USING ADAPTIVE INTELLIGENCE TO DISTINCT THE INFORMATION CONTENT OF AUDIOSIGNALS IN CONSUMER AUDIO AND TO CONTROL A SIGNAL PROCESSING FUNCTION
Nagathil et al.	2015	Spectral complexity reduction of music signals for mitigating effects of cochlear hearing loss
Buyens et al.	2015	A stereo music preprocessing scheme for cochlear implant users
Chiu et al.	2020	Mixing-specific data augmentation techniques for improved blind violin/piano source separation
Tomic et al.	2008	Beyond the beat: modeling metric structure in music and performance
Özer et al.	2022	Source Separation of Piano Concertos with Test-Time Adaptation.
Gauer et al.	2022	A versatile deep-neural-network-based music preprocessing and remixing scheme for cochlear implant listeners
Burred et al.	2005	On the use of auditory representations for sparsity-based sound source separation
Woodruff et al.	2006	Using pitch, amplitude modulation, and spatial cues for separation of harmonic instruments from stereo music recordings
Cant et al.	2017	Mask Optimisation for Neural Network Monaural Source Separation
Marolt	2000	Transcription of polyphonic piano music with neural networks
Alghamdi et al.	2020	Real time blind audio source separation based on machine learning algorithms
Pardo et al.	2018	Applying source separation to music
Jensen et al.	2001	Hybrid perception
Marolt et al.	2001	SONIC: A system for transcription of piano music
Tsumoto et al.	2016	The effect of harmonic overtones in relation to “sharpness” for perception of brightness of distorted guitar timbre
Chen et al.	2015	Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition.
Seipel et al.	2018	Multi-track crosstalk reduction using spectral subtraction
Delgado Castro et al.	2019	Semi-Automatic Mono-to-Stereo Upmixing via Separation of Note Events
Drake et al.	2009	A computational auditory scene analysis-enhanced beamforming approach for sound source separation
Ranjan et al.	2021	Segregation of Speech and Music Signals for Aiding the Hearing Impaired
Pedersen et al.	2006	BLUES from music: Blind underdetermined extraction of sources from music