Xiao et al., 2022 - Google Patents
AMResNet: An automatic recognition model of bird sounds in real environment
- Document ID
- 14489605373673130916
- Author
- Xiao H
- Liu D
- Chen K
- Zhu M
- Publication year
- 2022
- Publication venue
- Applied Acoustics
Snippet
Birds are biological indicators reflecting environmental quality and its changes, so ecologists devote a great deal of attention to monitoring their population trends. Automated acoustic recognition is regarded as an important technology to support bird monitoring and …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6267—Classification techniques
- G06K9/6268—Classification techniques relating to the classification paradigm, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/62—Methods or arrangements for recognition using electronic means
- G06K9/6217—Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06K9/6232—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
- G06K9/6247—Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods based on an approximation criterion, e.g. principal component analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/18—Speech classification or search using natural language modelling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06K—RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K9/00—Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
- G06K9/36—Image preprocessing, i.e. processing the image information without deciding about the identity of the image
- G06K9/46—Extraction of features or characteristics of the image
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30781—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F17/30784—Information retrieval; Database structures therefor; File system structures therefor of video data using features automatically derived from the video content, e.g. descriptors, fingerprints, signatures, genre
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computer systems based on biological models
- G06N3/02—Computer systems based on biological models using neural network models
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00-G10L21/00 characterised by the type of extracted parameters
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N99/00—Subject matter not provided for in other groups of this subclass
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Stöter et al. | Countnet: Estimating the number of concurrent speakers using supervised learning | |
CN105023573B (en) | Speech syllable/vowel/phone boundary detection using auditory attention cues | |
Hidayat et al. | Convolutional neural networks for scops owl sound classification | |
Xiao et al. | AMResNet: An automatic recognition model of bird sounds in real environment | |
Mehyadin et al. | Birds sound classification based on machine learning algorithms | |
CN112750442B (en) | Crested ibis population ecosystem monitoring system with wavelet transform and method thereof | |
Schröter et al. | Segmentation, classification, and visualization of orca calls using deep learning | |
Albornoz et al. | Automatic classification of Furnariidae species from the Paranaense Littoral region using speech-related features and machine learning | |
CN112735442B (en) | Wetland ecology monitoring system with audio separation voiceprint recognition function and audio separation method thereof | |
CN113936667A (en) | Bird song recognition model training method, recognition method and storage medium | |
Xie et al. | KD-CLDNN: Lightweight automatic recognition model based on bird vocalization | |
Imran et al. | An analysis of audio classification techniques using deep learning architectures | |
Gunawan et al. | Repurposing transfer learning strategy of computer vision for owl sound classification | |
Hu et al. | A lightweight multi-sensory field-based dual-feature fusion residual network for bird song recognition | |
Wang et al. | A hierarchical birdsong feature extraction architecture combining static and dynamic modeling | |
Riad et al. | Learning spectro-temporal representations of complex sounds with parameterized neural networks | |
Noumida et al. | Stacked Res2Net-CBAM with Grouped Channel Attention for Multi-Label Bird Species Classification | |
Hu et al. | An features extraction and recognition method for underwater acoustic target based on ATCNN | |
Chaves et al. | Katydids acoustic classification on verification approach based on MFCC and HMM | |
Joelianto et al. | Convolutional neural network-based real-time mosquito genus identification using wingbeat frequency: A binary and multiclass classification approach | |
Priebe et al. | Efficient speech detection in environmental audio using acoustic recognition and knowledge distillation | |
Marck et al. | Identification, analysis and characterization of base units of bird vocal communication: The white spectacled bulbul (Pycnonotus xanthopygos) as a case study | |
Bai et al. | CIAIC-BAD system for DCASE2018 challenge task 3 | |
Kareem et al. | Multi-Label Bird Species Classification Using Sequential Aggregation Strategy from Audio Recordings | |
Xie et al. | MDF-Net: A multi-view dual-attention fusion network for efficient bird sound classification |