Cant et al., 2017 - Google Patents
Mask Optimisation for Neural Network Monaural Source SeparationCant et al., 2017
View PDF- Document ID
- 6444672445922813662
- Author
- Cant R
- Langensiepen C
- Metcalf W
- Publication year
- Publication venue
- 2017 UKSim-AMSS 19th International Conference on Computer Modelling & Simulation (UKSim)
External Links
Snippet
An ideal binary mask is a means by which multiple sound sources within a single audio file can be separated. Previous work has shown a deep neural network can be trained to approximate the ideal mask, but at a substantial computational cost. We present a method to …
- 230000001537 neural 0 title abstract description 24
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2210/00—Aspects or methods of musical processing having intrinsic musical character, i.e. involving musical theory or musical parameters or relying on musical knowledge, as applied in electrophonic musical tools or instruments
- G10H2210/031—Musical analysis, i.e. isolation, extraction or identification of musical elements or musical parameters from a raw acoustic signal or from an encoded audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis using predictive techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/93—Discriminating between voiced and unvoiced parts of speech signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding, i.e. using interchannel correlation to reduce redundancies, e.g. joint-stereo, intensity-coding, matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H1/00—Details of electrophonic musical instruments
- G10H1/02—Means for controlling the tone frequencies, e.g. attack, decay; Means for producing special musical effects, e.g. vibrato, glissando
- G10H1/06—Circuits for establishing the harmonic content of tones, or other arrangements for changing the tone colour
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/131—Mathematical functions for musical analysis, processing, synthesis or composition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/025—Envelope processing of music signals in, e.g. time domain, transform domain or cepstrum domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS
- G10H7/00—Instruments in which the tones are synthesised from a data store, e.g. computer organs
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Cano et al. | Musical source separation: An introduction | |
Shi et al. | On the importance of phase in human speech recognition | |
Li et al. | Monaural speech separation based on computational auditory scene analysis and objective quality assessment of speech | |
Cuesta et al. | Analysis of intonation in unison choir singing | |
DE102012103553A1 (en) | AUDIO SYSTEM AND METHOD FOR USING ADAPTIVE INTELLIGENCE TO DISTINCT THE INFORMATION CONTENT OF AUDIOSIGNALS IN CONSUMER AUDIO AND TO CONTROL A SIGNAL PROCESSING FUNCTION | |
Nagathil et al. | Spectral complexity reduction of music signals for mitigating effects of cochlear hearing loss | |
Buyens et al. | A stereo music preprocessing scheme for cochlear implant users | |
Taenzer et al. | Investigating CNN-based Instrument Family Recognition for Western Classical Music Recordings. | |
Chiu et al. | Mixing-specific data augmentation techniques for improved blind violin/piano source separation | |
Özer et al. | Source Separation of Piano Concertos with Test-Time Adaptation. | |
Gauer et al. | A versatile deep-neural-network-based music preprocessing and remixing scheme for cochlear implant listeners | |
Woodruff et al. | Using pitch, amplitude modulation, and spatial cues for separation of harmonic instruments from stereo music recordings | |
Burred et al. | On the use of auditory representations for sparsity-based sound source separation | |
Cant et al. | Mask Optimisation for Neural Network Monaural Source Separation | |
Marolt | Transcription of polyphonic piano music with neural networks | |
Alghamdi et al. | Real time blind audio source separation based on machine learning algorithms | |
Pardo et al. | Applying source separation to music | |
Marolt et al. | SONIC: A system for transcription of piano music | |
Tsumoto et al. | The effect of harmonic overtones in relation to “sharpness” for perception of brightness of distorted guitar timbre | |
Chen et al. | Modified Perceptual Linear Prediction Liftered Cepstrum (MPLPLC) Model for Pop Cover Song Recognition. | |
Marolt | Adaptive oscillator networks for partial tracking and piano music transcription | |
Drake et al. | A computational auditory scene analysis-enhanced beamforming approach for sound source separation | |
Ranjan et al. | Segregation of Speech and Music Signals for Aiding the Hearing Impaired | |
Pedersen et al. | BLUES from music: Blind underdetermined extraction of sources from music | |
Fenton et al. | Hybrid Multiresolution Analysis of “Punch” in Musical Signals |