Cant et al., 2017 - Google Patents

Mask Optimisation for Neural Network Monaural Source Separation

Document ID
6444672445922813662
Author
Cant R
Langensiepen C
Metcalf W
Publication year
2017
Publication venue
2017 UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim)

Snippet

An ideal binary mask is a means by which multiple sound sources within a single audio file can be separated. Previous work has shown a deep neural network can be trained to approximate the ideal mask, but at a substantial computational cost. We present a method to …
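As a rough illustration of the ideal binary mask idea mentioned in the snippet, the sketch below marks each time-frequency cell 1 where a target source is louder than an interferer and 0 elsewhere, then applies that mask to the mixture. It is not the authors' method: the function names, the 0 dB threshold, and the toy magnitude values are all assumptions made for illustration, and the neural network approximation described in the paper is not shown.

```python
import math

def ideal_binary_mask(target_mag, interferer_mag, threshold_db=0.0):
    """Return 1 where the target exceeds the interferer by threshold_db, else 0.

    Inputs are magnitude spectrograms as nested lists
    (rows = frequency bins, columns = time frames).
    The 0 dB default is an assumed, commonly used threshold.
    """
    eps = 1e-12  # avoids log10(0) and division by zero
    mask = []
    for t_row, i_row in zip(target_mag, interferer_mag):
        row = []
        for t, i in zip(t_row, i_row):
            local_snr_db = 20.0 * math.log10((t + eps) / (i + eps))
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask

def apply_mask(mixture_mag, mask):
    """Keep mixture cells where the mask is 1 and zero out the rest."""
    return [[m * b for m, b in zip(m_row, b_row)]
            for m_row, b_row in zip(mixture_mag, mask)]

# Hypothetical 2x3 magnitude spectrograms, invented for this example.
target = [[4.0, 0.5, 3.0], [0.1, 2.0, 0.2]]
interferer = [[1.0, 2.0, 0.5], [1.5, 0.3, 0.2]]
# Magnitudes are simply added here, which ignores phase.
mixture = [[t + i for t, i in zip(t_row, i_row)]
           for t_row, i_row in zip(target, interferer)]

mask = ideal_binary_mask(target, interferer)
estimate = apply_mask(mixture, mask)
```

The mask is called "ideal" because computing it requires the separate sources, which are only available when the mixture is constructed artificially; in practice a model has to estimate the mask from the mixture alone.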

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2210/00 Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
    • G10H2210/031 Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/0008 Associated control or indicating means
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H1/00 Details of electrophonic musical instruments
    • G10H1/02 Means for controlling the tone frequencies, e.g. attack, decay; Means for producing special musical effects, e.g. vibrato, glissando
    • G10H1/06 Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
    • G10L19/018 Audio watermarking, i.e. embedding inaudible data in the audio signal
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/131 Mathematical functions for musical analysis, processing, synthesis or composition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H2250/025 Envelope processing of music signals in, e.g. time domain, transform domain or cepstrum domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H2240/00 Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS
    • G10H7/00 Instruments in which the tones are synthesised from a data store, e.g. computer organs

Similar Documents

Publication / Title
Cano et al. Musical source separation: An introduction
Shi et al. On the importance of phase in human speech recognition
Li et al. Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech
Cuesta et al. Analysis of intonation in unison choir singing
DE102012103553A1 (en) AUDIO SYSTEM AND METHOD FOR USING ADAPTIVE INTELLIGENCE TO DISTINCT THE INFORMATION CONTENT OF AUDIOSIGNALS IN CONSUMER AUDIO AND TO CONTROL A SIGNAL PROCESSING FUNCTION
Nagathil et al. Spectral complexity reduction of music signals for mitigating effects of cochlear hearing loss
Buyens et al. A stereo music preprocessing scheme for cochlear implant users
Taenzer et al. Investigating CNN-based Instrument Family Recognition for Western Classical Music Recordings.
Chiu et al. Mixing-specific data augmentation techniques for improved blind violin/piano source separation
Özer et al. Source Separation of Piano Concertos with Test-Time Adaptation.
Gauer et al. A versatile deep-neural-network-based music preprocessing and remixing scheme for cochlear implant listeners
Woodruff et al. Using pitch, amplitude modulation, and spatial cues for separation of harmonic instruments from stereo music recordings
Burred et al. On the use of auditory representations for sparsity-based sound source separation
Cant et al. Mask Optimisation for Neural Network Monaural Source Separation
Marolt Transcription of polyphonic piano music with neural networks
Alghamdi et al. Real time blind audio source separation based on machine learning algorithms
Pardo et al. Applying source separation to music
Marolt et al. SONIC: A system for transcription of piano music
Tsumoto et al. The effect of harmonic overtones in relation to “sharpness” for perception of brightness of distorted guitar timbre
Chen et al. Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition.
Marolt Adaptive oscillator networks for partial tracking and piano music transcription
Drake et al. A computational auditory scene analysis-enhanced beamforming approach for sound source separation
Ranjan et al. Segregation of Speech and Music Signals for Aiding the Hearing Impaired
Pedersen et al. BLUES from music: Blind underdetermined extraction of sources from music
Fenton et al. Hybrid Multiresolution Analysis of “Punch” in Musical Signals