
CN112634937A - Sound classification method without digital feature extraction calculation - Google Patents


Info

Publication number
CN112634937A
Authority
CN
China
Prior art keywords
analog
digital
energy
sound
sound classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011392004.0A
Other languages
Chinese (zh)
Inventor
陈盛
马文亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ai Li Chi Technology Beijing Co ltd
Original Assignee
Ai Li Chi Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ai Li Chi Technology Beijing Co ltd
Priority to CN202011392004.0A
Publication of CN112634937A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being power information
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use, for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a sound classification method that requires no digital feature extraction calculation, comprising the following steps: step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector; step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band; step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter; step four, match the digital energy features of the frequency bands against model data to obtain the sound type. The method replaces digital signal processing with analog signal processing to obtain the sub-band energy feature vector required for simple sound classification, and is applicable to simple sound classification tasks.

Description

Sound classification method without digital feature extraction calculation
Technical Field
The invention belongs to the field of speech technology, and particularly relates to a sound classification method.
Background
Sound classification is a method for distinguishing classes of sound. It differs from speech recognition: a sound classification task, such as distinguishing a baby's cry from ordinary environmental sound, generally places lower demands on the feature vector than speech recognition does. Speech recognition usually requires extracting formant features from speech, so advanced features such as Mel-frequency cepstral coefficients or perceptual linear prediction must be used. These features are extracted from digital audio signals, and the extraction involves a large number of digital operations, including fast Fourier transforms, trigonometric functions, exponentials, and logarithms, with one frame of data typically processed every 10 milliseconds. For a sound classification task, by contrast, the energy feature vector of a set of frequency bands is enough to meet the training and inference requirements. For example, when distinguishing a baby's cry from normal environmental sound, the cry has pronounced energy in specific frequency bands and differs markedly from environmental noise in its spectrum, so the energies extracted from several frequency bands can serve as the feature vector. In existing sound classification methods, feature extraction is generally performed in the digital domain: the analog signal collected by the microphone is first converted into a digital signal by analog-to-digital (A/D) conversion, and the sound features are then extracted from the digital signal. Once extracted, the features are compared with a sound model to determine the type of the sound; the basic flow is shown in fig. 1.
This style of sound classification requires computation in the digital domain, which in turn requires high-performance analog-to-digital conversion and processing capability, raising hardware cost. There are two reasons:
First, a high-precision analog-to-digital converter is needed. Because the feature extraction algorithm places high demands on the sampling rate and precision of the digital audio, an analog-to-digital converter with a sampling rate above 8000 Hz and 16-bit sampling precision is generally required to convert the analog audio signal into a digital audio signal. The converter built into an ordinary single-chip microcomputer cannot meet this requirement, so an external professional audio analog-to-digital converter must be added, raising the hardware cost of the system.
Second, the sound feature extraction algorithm must run on a digital processor, so the system must provide substantial computing power and memory, again raising hardware cost. The feature extraction algorithm operates on the digital audio signal in the digital domain: for example, the signal is converted from the time domain to the frequency domain by a discrete Fourier transform (usually implemented as a fast Fourier transform), the digital signal is then processed further in the frequency domain, and a feature vector is finally obtained.
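For concreteness, the digital-domain pipeline described above, which the invention sets out to avoid, can be sketched numerically. The band edges, the frame length, and the use of NumPy's FFT are illustrative assumptions rather than details taken from the patent:

```python
import numpy as np

def digital_band_energies(frame, sample_rate=8000,
                          bands=((100, 500), (500, 1500), (1500, 3000))):
    """Conventional digital feature extraction: FFT the frame, then sum
    spectral power inside each band. Band edges are illustrative."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return [float(power[(freqs >= lo) & (freqs < hi)].sum()) for lo, hi in bands]

# One 10 ms frame at 8000 Hz is 80 samples; a 1 kHz tone puts nearly
# all of its energy in the middle band.
t = np.arange(80) / 8000.0
frame = np.sin(2 * np.pi * 1000.0 * t)
e = digital_band_energies(frame)
```

Even this small example runs an FFT on every 10 ms frame; the patent's argument is that a microcontroller-class device should avoid that cost entirely.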
Disclosure of Invention
The invention addresses the above problems of the prior art: its aim is to provide a sound classification method that requires no digital feature extraction calculation and reduces the processing and computing capability demanded of the hardware.
The technical scheme provided by the invention is as follows. A sound classification method without digital feature extraction calculation comprises the following steps:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type.
Further, in step two, the analog recording signal of each frequency band is integrated by an analog integrator to obtain the analog energy feature of that band.
The method replaces digital signal processing with analog signal processing to obtain the sub-band energy feature vector required for simple sound classification, and is applicable to simple sound classification tasks. In addition, the method needs no high-precision analog-to-digital converter, saves a large amount of digital computation, lowers the computing-capability requirement on the processor, and thereby reduces cost.
Drawings
Fig. 1 is a process flow diagram of a sound classification method in the prior art.
Fig. 2 is a process flow diagram of a sound classification method provided by the present invention.
Detailed Description
The following description is presented to disclose the invention and to enable any person skilled in the art to practice it. The preferred embodiments below are given by way of example only; other obvious variations will occur to those skilled in the art.
As shown in fig. 2, the sound classification method provided by the embodiment of the invention processes the analog signal collected by the microphone directly: an analog filter bank splits the microphone signal into several frequency bands, and an analog integrator then integrates the signal in each band, yielding one analog value per band. These values are the "analog features" of the sound. The analog features are converted into the digital domain by analog-to-digital conversion, and the resulting digital feature signals are compared with a sound model to compute the type corresponding to the sound.
The specific process of extracting the feature vector in this scheme, shown in fig. 2, is as follows:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band. Specifically: integrate the analog recording signal of each frequency band with an independent analog integrator to obtain the analog energy feature of that band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through a low-precision analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type, completing the classification. The specific model-matching algorithm is prior art and is not described further here.
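To see the data flow of the four steps end to end, the pipeline can be modeled numerically. This is purely an illustrative sketch: FFT masking stands in for the analog filter bank, the band edges and the 10-bit ADC are assumed values, and a nearest-template matcher is only one possible choice for the prior-art matching step:

```python
import numpy as np

BANDS = ((100, 500), (500, 1500), (1500, 3000))  # illustrative band edges, Hz

def band_split(signal, fs):
    """Step one (modeled): split the signal into N bands. FFT masking
    stands in here for the analog filter bank."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    return [np.fft.irfft(np.where((freqs >= lo) & (freqs < hi), spec, 0),
                         len(signal)) for lo, hi in BANDS]

def integrate_energy(band_signals):
    """Step two (modeled): the per-band analog integrator accumulates energy."""
    return np.array([float(np.sum(b ** 2)) for b in band_signals])

def quantize(energies, bits=10):
    """Step three: a low-precision ADC turns each analog energy into a code."""
    full_scale = energies.max() if energies.max() > 0 else 1.0
    return np.round(energies / full_scale * (2 ** bits - 1)).astype(int)

def classify(features, templates):
    """Step four: match against model data; nearest template is one option."""
    return min(templates, key=lambda name: np.linalg.norm(features - templates[name]))

# Hypothetical one-second test signals: a cry-like tone concentrated in the
# top band versus low-frequency ambient rumble in the bottom band.
fs = 8000
t = np.arange(fs) / fs
cry_like = np.sin(2 * np.pi * 2000.0 * t)
ambient = np.sin(2 * np.pi * 200.0 * t)
templates = {"cry": np.array([0, 0, 1023]), "ambient": np.array([1023, 0, 0])}
feats = quantize(integrate_energy(band_split(cry_like, fs)))
label = classify(feats, templates)
```

In the real system only the equivalents of `quantize` and `classify` run on the processor; steps one and two are analog circuitry, which is exactly where the digital computation is saved.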
The above method suits simple sound classification tasks, such as distinguishing a baby's cry from normal environmental sound. For such a task, the energy feature vector of a set of frequency bands meets the training and inference requirements of sound classification: the cry has pronounced energy in specific frequency bands and differs markedly from environmental noise in its spectrum, so the energies extracted from several frequency bands serve as the feature vector. The scheme replaces digital signal processing with analog signal processing to obtain this sub-band energy feature vector. The concrete advantages are:
First, the high-precision analog-to-digital converter used to acquire a digital audio signal is eliminated. An analog-to-digital converter for acquiring digital audio generally needs 16-bit precision, whereas here several low-precision analog-to-digital converters convert the analog feature signals into digital feature signals, and 10 to 12 bits of precision suffices. The converters built into most single-chip microcomputers are low-precision, so acquiring a digital audio signal would require an external high-precision converter; since this scheme never acquires the digital audio signal directly, the high-precision converter can be omitted.
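The precision argument can be made concrete with the textbook rule of thumb that an ideal n-bit quantizer provides roughly 6.02 dB of dynamic range per bit (an approximation from quantization theory, not a figure stated in the patent):

```python
def dynamic_range_db(bits):
    """Approximate dynamic range of an ideal n-bit quantizer (6.02 dB/bit)."""
    return 6.02 * bits

# 16 bits (raw digital audio) versus the 10-12 bits that suffice for the
# slowly varying band-energy values used in this scheme.
for bits in (10, 12, 16):
    print(f"{bits}-bit ADC: about {dynamic_range_db(bits):.1f} dB")
```

Because the integrated band energies vary slowly compared with the raw waveform, they tolerate the coarser quantization of a microcontroller's built-in converter.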
Second, the digital-domain feature extraction algorithm, which requires heavy computation, is replaced by feature extraction in the analog domain, saving a large amount of digital calculation. In a real hardware system, the scheme extracts the features needed for simple sound classification without any digital feature extraction calculation, reducing the computing capability the classification algorithm demands of the processor and thereby reducing cost.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention. Such changes and modifications are intended to be within the scope of the claimed invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (2)

1. A sound classification method without digital feature extraction calculation, characterised by comprising the following steps:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type.
2. The sound classification method without digital feature extraction calculation of claim 1, characterised in that, in step two, an analog integrator integrates the analog recording signal of each frequency band to obtain the analog energy feature of that band.
CN202011392004.0A (filed 2020-12-02, priority 2020-12-02): Sound classification method without digital feature extraction calculation. Published as CN112634937A. Status: Pending.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011392004.0A CN112634937A (en) 2020-12-02 2020-12-02 Sound classification method without digital feature extraction calculation


Publications (1)

Publication Number Publication Date
CN112634937A 2021-04-09

Family

ID=75307900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011392004.0A CN112634937A (Pending) 2020-12-02 2020-12-02

Country Status (1)

Country Link
CN: CN112634937A

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113702982A (en) * 2021-08-26 2021-11-26 廊坊市新思维科技有限公司 Ultrasonic data imaging algorithm
CN115985331A (en) * 2023-02-27 2023-04-18 百鸟数据科技(北京)有限责任公司 Audio automatic analysis method for field observation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611596A (en) * 2015-10-22 2017-05-03 德克萨斯仪器股份有限公司 Time-based frequency tuning of analog-to-information feature extraction
CN106683687A (en) * 2016-12-30 2017-05-17 杭州华为数字技术有限公司 Abnormal voice classifying method and device
US20170263268A1 (en) * 2016-03-10 2017-09-14 Brandon David Rumberg Analog voice activity detection
CN110610696A (en) * 2018-06-14 2019-12-24 清华大学 MFCC feature extraction method and device based on mixed signal domain
CN111667838A (en) * 2020-06-22 2020-09-15 清华大学 Low-power-consumption analog domain feature vector extraction method for voiceprint recognition


Similar Documents

Publication Publication Date Title
US10665222B2 (en) Method and system of temporal-domain feature extraction for automatic speech recognition
EP0077558B1 (en) Method and apparatus for speech recognition and reproduction
CN103117059B (en) Voice signal characteristics extracting method based on tensor decomposition
CN106653056B (en) Fundamental frequency extraction model and training method based on LSTM recurrent neural network
KR100930060B1 (en) Recording medium on which a signal detecting method, apparatus and program for executing the method are recorded
CN108461081B (en) Voice control method, device, equipment and storage medium
WO2001031633A2 (en) Speech recognition
CN105719657A (en) Human voice extracting method and device based on microphone
CN112634937A (en) Sound classification method without digital feature extraction calculation
US4922539A (en) Method of encoding speech signals involving the extraction of speech formant candidates in real time
CN110827808A (en) Speech recognition method, speech recognition device, electronic equipment and computer-readable storage medium
CN111667834A (en) Hearing-aid device and hearing-aid method
WO2017000772A1 (en) Front-end audio processing system
US11749295B2 (en) Pitch emphasis apparatus, method and program for the same
EP1239458A2 (en) Voice recognition system, standard pattern preparation system and corresponding methods
KR100930061B1 (en) Signal detection method and apparatus
CN110136741B (en) Single-channel speech enhancement method based on multi-scale context
Agcaer et al. Optimization of amplitude modulation features for low-resource acoustic scene classification
CN107919136B (en) Digital voice sampling frequency estimation method based on Gaussian mixture model
CN118155632A (en) Voiceprint feature extraction algorithm based on dynamic segmentation of context-dependent spectral coefficients
CN110767238A (en) Blacklist identification method, apparatus, device and storage medium based on address information
CN117935826B (en) Audio up-sampling method, device, equipment and storage medium
CN118098255A (en) Voice enhancement method based on neural network detection and related device thereof
CN106448655A (en) Speech identification method
CN112435655A (en) Data acquisition and model training method and device for isolated word speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20210409)