Priebe et al., 2024 - Google Patents

Efficient speech detection in environmental audio using acoustic recognition and knowledge distillation

Priebe et al., 2024

Document ID: 576334816165842655
Author: Priebe D; Ghani B; Stowell D
Publication year: 2024
Publication venue: Sensors

External Links

Cited by

Snippet

The ongoing biodiversity crisis, driven by factors such as land-use change and global warming, emphasizes the need for effective ecological monitoring methods. Acoustic monitoring of biodiversity has emerged as an important monitoring tool. Detecting human …

Continue reading at www.mdpi.com (PDF) (other versions)

238000013140 knowledge distillation 0 title abstract description 21

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6261—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation partitioning the feature space
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
- G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
- G06N3/08—Learning methods
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computer systems based on specific mathematical models
- G06N7/005—Probabilistic networks
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/18—Digital computers in general; Data processing equipment in general in which a programme is changed according to experience gained by the computer itself during a complete run; Learning machines
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signal analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signal, using source filter models or psychoacoustic analysis

Similar Documents

Publication	Publication Date	Title
Priyadarshani et al.	2018	Automated birdsong recognition in complex acoustic environments: a review
Ulloa et al.	2021	scikit‐maad: An open‐source and modular toolbox for quantitative soundscape analysis in Python
Xie et al.	2023	A review of automatic recognition technology for bird vocalizations in the deep learning era
Patton et al.	2016	AutoMOS: Learning a non-intrusive assessor of naturalness-of-speech
Usman et al.	2020	Review of automatic detection and classification techniques for cetacean vocalization
Kvsn et al.	2020	Bioacoustics data analysis–A taxonomy, survey and open challenges
Tian et al.	2023	Joint learning model for underwater acoustic target recognition
Merchan et al.	2020	Bioacoustic classification of antillean manatee vocalization spectrograms using deep convolutional neural networks
Barroso et al.	2023	Applications of machine learning to identify and characterize the sounds produced by fish
Wang et al.	2022	Automated call detection for acoustic surveys with structured calls of varying length
Singh et al.	2023	A survey on preprocessing and classification techniques for acoustic scene
Yadav et al.	2024	Developing a Model for Bird Vocalization Recognition and Population Estimation in Forest Ecosystems
Priebe et al.	2024	Efficient speech detection in environmental audio using acoustic recognition and knowledge distillation
Mewada et al.	2023	Gaussian-filtered high-frequency-feature trained optimized BiLSTM network for spoofed-speech classification
Riad et al.	2021	Learning spectro-temporal representations of complex sounds with parameterized neural networks
Wang et al.	2023	A hierarchical birdsong feature extraction architecture combining static and dynamic modeling
Diep et al.	2024	Crossmixed convolutional neural network for digital speech recognition
Xie et al.	2023	Cross-corpus open set bird species recognition by vocalization
van Osta et al.	2023	An active learning framework and assessment of inter-annotator agreement facilitate automated recogniser development for vocalisations of a rare species, the southern black-throated finch (Poephila cincta cincta)
Schneider et al.	2024	Acoustic estimation of the manatee population and classification of call categories using artificial intelligence
Lovenia et al.	2022	What did I just hear? Detecting pornographic sounds in adult videos using neural networks
Kareem et al.	2023	Multi-Label Bird Species Classification Using Sequential Aggregation Strategy from Audio Recordings
Ijaz et al.	2024	Reshaping Bioacoustics Event Detection: Leveraging Few-Shot Learning (FSL) with Transductive Inference and Data Augmentation
Koli	2021	Classification of speaker’s age, gender and nationality using transfer learning
Mohmmad et al.	2024	A parametric survey on polyphonic sound event detection and localization