
CN112634937A - Sound classification method without digital feature extraction calculation - Google Patents


Info

Publication number
CN112634937A
Authority
CN
China
Prior art keywords
analog
digital
energy
sound
sound classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011392004.0A
Other languages
Chinese (zh)
Inventor
陈盛
马文亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ai Li Chi Technology Beijing Co ltd
Original Assignee
Ai Li Chi Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ai Li Chi Technology Beijing Co ltd
Priority to CN202011392004.0A
Publication of CN112634937A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being power information
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use, for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a sound classification method that requires no digital feature extraction calculation, comprising the following steps: step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector; step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band; step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter; step four, match the digital energy features of the frequency bands against model data to obtain the sound type. The method replaces digital signal processing with analog signal processing to obtain the sub-band energy feature vector required for simple sound classification, and is applicable to simple sound classification tasks.

Description

Sound classification method without digital feature extraction calculation
Technical Field
The invention belongs to the field of speech technology, and particularly relates to a sound classification method.
Background
Sound classification is a method for distinguishing classes of sound. It differs from speech recognition: a sound classification task, such as distinguishing a baby's cry from ordinary environmental sound, generally places lower demands on the feature vector than speech recognition does. Speech recognition usually requires extracting formant features from speech, so advanced features such as Mel-frequency cepstral coefficients or perceptual linear prediction must be used. These features are extracted from digital audio signals, and the extraction involves a large number of digital operations, including fast Fourier transforms, trigonometric functions, exponentials, and logarithms, with one frame of data typically processed every 10 milliseconds. For a sound classification task, by contrast, the energy feature vector of a set of frequency bands is enough to meet the training and inference requirements. For example, when distinguishing a baby's cry from normal environmental sound, the cry has pronounced energy in specific frequency bands and differs markedly from environmental noise in its spectrum, so the energies extracted from several frequency bands can serve as the feature vector. In existing sound classification methods, feature extraction is generally performed in the digital domain: the analog signal collected by the microphone is first converted into a digital signal by analog-to-digital (A/D) conversion, and the sound features are then extracted from the digital signal. Once extracted, the features are compared with a sound model to determine the type of the sound; the basic flow is shown in fig. 1.
This style of sound classification requires computation in the digital domain, which in turn requires high-performance analog-to-digital conversion and processing capability, raising hardware cost. There are two reasons:
First, a high-precision analog-to-digital converter is needed. Because the feature extraction algorithm places high demands on the sampling rate and precision of the digital audio, an analog-to-digital converter with a sampling rate above 8000 Hz and 16-bit sampling precision is generally required to convert the analog audio signal into a digital audio signal. The converter built into an ordinary single-chip microcomputer cannot meet this requirement, so an external professional audio analog-to-digital converter must be added, raising the hardware cost of the system.
Second, the sound feature extraction algorithm must run on a digital processor, so the system must provide substantial computing power and memory, again raising hardware cost. The feature extraction algorithm operates on the digital audio signal in the digital domain: for example, the signal is converted from the time domain to the frequency domain by a discrete Fourier transform (usually implemented as a fast Fourier transform), the digital signal is then processed further in the frequency domain, and a feature vector is finally obtained.
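For concreteness, the digital-domain pipeline described above, which the invention sets out to avoid, can be sketched numerically. The band edges, the frame length, and the use of NumPy's FFT are illustrative assumptions rather than details taken from the patent:

```python
import numpy as np

def digital_band_energies(frame, sample_rate=8000,
                          bands=((100, 500), (500, 1500), (1500, 3000))):
    """Conventional digital feature extraction: FFT the frame, then sum
    spectral power inside each band. Band edges are illustrative."""
    power = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    return [float(power[(freqs >= lo) & (freqs < hi)].sum()) for lo, hi in bands]

# One 10 ms frame at 8000 Hz is 80 samples; a 1 kHz tone puts nearly
# all of its energy in the middle band.
t = np.arange(80) / 8000.0
frame = np.sin(2 * np.pi * 1000.0 * t)
e = digital_band_energies(frame)
```

Even this small example runs an FFT on every 10 ms frame; the patent's argument is that a microcontroller-class device should avoid that cost entirely.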
Disclosure of Invention
The invention addresses the above problems of the prior art: its aim is to provide a sound classification method that requires no digital feature extraction calculation and reduces the processing and computing capability demanded of the hardware.
The technical scheme provided by the invention is as follows. A sound classification method without digital feature extraction calculation comprises the following steps:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type.
Further, in step two, the analog recording signal of each frequency band is integrated by an analog integrator to obtain the analog energy feature of that band.
The method replaces digital signal processing with analog signal processing to obtain the sub-band energy feature vector required for simple sound classification, and is applicable to simple sound classification tasks. In addition, the method needs no high-precision analog-to-digital converter, saves a large amount of digital computation, lowers the computing-capability requirement on the processor, and thereby reduces cost.
Drawings
Fig. 1 is a process flow diagram of a sound classification method in the prior art.
Fig. 2 is a process flow diagram of a sound classification method provided by the present invention.
Detailed Description
The following description is presented to disclose the invention and to enable any person skilled in the art to practice it. The preferred embodiments below are given by way of example only; other obvious variations will occur to those skilled in the art.
As shown in fig. 2, the sound classification method provided by the embodiment of the invention processes the analog signal collected by the microphone directly: an analog filter bank splits the microphone signal into several frequency bands, and an analog integrator then integrates the signal in each band, yielding one analog value per band. These values are the "analog features" of the sound. The analog features are converted into the digital domain by analog-to-digital conversion, and the resulting digital feature signals are compared with a sound model to compute the type corresponding to the sound.
The specific process of extracting the feature vector in this scheme, shown in fig. 2, is as follows:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band. Specifically: integrate the analog recording signal of each frequency band with an independent analog integrator to obtain the analog energy feature of that band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through a low-precision analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type, completing the classification. The specific model-matching algorithm is prior art and is not described further here.
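To see the data flow of the four steps end to end, the pipeline can be modeled numerically. This is purely an illustrative sketch: FFT masking stands in for the analog filter bank, the band edges and the 10-bit ADC are assumed values, and a nearest-template matcher is only one possible choice for the prior-art matching step:

```python
import numpy as np

BANDS = ((100, 500), (500, 1500), (1500, 3000))  # illustrative band edges, Hz

def band_split(signal, fs):
    """Step one (modeled): split the signal into N bands. FFT masking
    stands in here for the analog filter bank."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    return [np.fft.irfft(np.where((freqs >= lo) & (freqs < hi), spec, 0),
                         len(signal)) for lo, hi in BANDS]

def integrate_energy(band_signals):
    """Step two (modeled): the per-band analog integrator accumulates energy."""
    return np.array([float(np.sum(b ** 2)) for b in band_signals])

def quantize(energies, bits=10):
    """Step three: a low-precision ADC turns each analog energy into a code."""
    full_scale = energies.max() if energies.max() > 0 else 1.0
    return np.round(energies / full_scale * (2 ** bits - 1)).astype(int)

def classify(features, templates):
    """Step four: match against model data; nearest template is one option."""
    return min(templates, key=lambda name: np.linalg.norm(features - templates[name]))

# Hypothetical one-second test signals: a cry-like tone concentrated in the
# top band versus low-frequency ambient rumble in the bottom band.
fs = 8000
t = np.arange(fs) / fs
cry_like = np.sin(2 * np.pi * 2000.0 * t)
ambient = np.sin(2 * np.pi * 200.0 * t)
templates = {"cry": np.array([0, 0, 1023]), "ambient": np.array([1023, 0, 0])}
feats = quantize(integrate_energy(band_split(cry_like, fs)))
label = classify(feats, templates)
```

In the real system only the equivalents of `quantize` and `classify` run on the processor; steps one and two are analog circuitry, which is exactly where the digital computation is saved.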
The above method suits simple sound classification tasks, such as distinguishing a baby's cry from normal environmental sound. For such a task, the energy feature vector of a set of frequency bands meets the training and inference requirements of sound classification: the cry has pronounced energy in specific frequency bands and differs markedly from environmental noise in its spectrum, so the energies extracted from several frequency bands serve as the feature vector. The scheme replaces digital signal processing with analog signal processing to obtain this sub-band energy feature vector. The concrete advantages are:
First, the high-precision analog-to-digital converter used to acquire a digital audio signal is eliminated. An analog-to-digital converter for acquiring digital audio generally needs 16-bit precision, whereas here several low-precision analog-to-digital converters convert the analog feature signals into digital feature signals, and 10 to 12 bits of precision suffices. The converters built into most single-chip microcomputers are low-precision, so acquiring a digital audio signal would require an external high-precision converter; since this scheme never acquires the digital audio signal directly, the high-precision converter can be omitted.
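The precision argument can be made concrete with the textbook rule of thumb that an ideal n-bit quantizer provides roughly 6.02 dB of dynamic range per bit (an approximation from quantization theory, not a figure stated in the patent):

```python
def dynamic_range_db(bits):
    """Approximate dynamic range of an ideal n-bit quantizer (6.02 dB/bit)."""
    return 6.02 * bits

# 16 bits (raw digital audio) versus the 10-12 bits that suffice for the
# slowly varying band-energy values used in this scheme.
for bits in (10, 12, 16):
    print(f"{bits}-bit ADC: about {dynamic_range_db(bits):.1f} dB")
```

Because the integrated band energies vary slowly compared with the raw waveform, they tolerate the coarser quantization of a microcontroller's built-in converter.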
Second, the digital-domain feature extraction algorithm, which requires heavy computation, is replaced by feature extraction in the analog domain, saving a large amount of digital calculation. In a real hardware system, the scheme extracts the features needed for simple sound classification without any digital feature extraction calculation, reducing the computing capability the classification algorithm demands of the processor and thereby reducing cost.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are given by way of illustration of the principles of the present invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention. Such changes and modifications are intended to be within the scope of the claimed invention. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (2)

1. A sound classification method without digital feature extraction calculation, characterised by comprising the following steps:
step one, acquire an analog recording signal through a recording system, then obtain analog signals in N frequency bands through an analog filter bank built from analog circuits, where N is the number of frequency bands required by the feature vector;
step two, process the analog signals of the N frequency bands generated in step one to compute the energy of each band;
step three, convert the analog energy feature of each frequency band into a digital energy feature through an analog-to-digital converter;
step four, match the digital energy features of the frequency bands against model data to obtain the sound type.
2. The sound classification method without digital feature extraction calculation of claim 1, characterised in that, in step two, an analog integrator integrates the analog recording signal of each frequency band to obtain the analog energy feature of that band.
CN202011392004.0A (filed 2020-12-02, priority 2020-12-02): Sound classification method without digital feature extraction calculation. Published as CN112634937A. Status: Pending.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011392004.0A CN112634937A (en) 2020-12-02 2020-12-02 Sound classification method without digital feature extraction calculation


Publications (1)

Publication Number Publication Date
CN112634937A 2021-04-09

Family

ID=75307900

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011392004.0A CN112634937A (Pending) 2020-12-02 2020-12-02

Country Status (1)

Country Link
CN: CN112634937A

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113702982A (en) * 2021-08-26 2021-11-26 廊坊市新思维科技有限公司 Ultrasonic data imaging algorithm
CN115985331A (en) * 2023-02-27 2023-04-18 百鸟数据科技(北京)有限责任公司 Audio automatic analysis method for field observation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106611596A (en) * 2015-10-22 2017-05-03 德克萨斯仪器股份有限公司 Time-based frequency tuning of analog-to-information feature extraction
CN106683687A (en) * 2016-12-30 2017-05-17 杭州华为数字技术有限公司 Abnormal voice classifying method and device
US20170263268A1 (en) * 2016-03-10 2017-09-14 Brandon David Rumberg Analog voice activity detection
CN110610696A (en) * 2018-06-14 2019-12-24 清华大学 MFCC feature extraction method and device based on mixed signal domain
CN111667838A (en) * 2020-06-22 2020-09-15 清华大学 Low-power-consumption analog domain feature vector extraction method for voiceprint recognition


Similar Documents

Publication Publication Date Title
US10665222B2 (en) Method and system of temporal-domain feature extraction for automatic speech recognition
EP0077558B1 (en) Method and apparatus for speech recognition and reproduction
CN103117059B (en) Voice signal characteristics extracting method based on tensor decomposition
CN106653056B (en) Fundamental frequency extraction model and training method based on LSTM recurrent neural network
KR100930060B1 (en) Recording medium on which a signal detecting method, apparatus and program for executing the method are recorded
CN108461081B (en) Voice control method, device, equipment and storage medium
WO2001031633A2 (en) Speech recognition
CN105719657A (en) Human voice extracting method and device based on microphone
CN112634937A (en) Sound classification method without digital feature extraction calculation
US4922539A (en) Method of encoding speech signals involving the extraction of speech formant candidates in real time
CN110827808A (en) Speech recognition method, speech recognition device, electronic equipment and computer-readable storage medium
CN111667834A (en) Hearing-aid device and hearing-aid method
WO2017000772A1 (en) Front-end audio processing system
US11749295B2 (en) Pitch emphasis apparatus, method and program for the same
EP1239458A2 (en) Voice recognition system, standard pattern preparation system and corresponding methods
KR100930061B1 (en) Signal detection method and apparatus
CN110136741B (en) Single-channel speech enhancement method based on multi-scale context
Agcaer et al. Optimization of amplitude modulation features for low-resource acoustic scene classification
CN107919136B (en) Digital voice sampling frequency estimation method based on Gaussian mixture model
CN118155632A (en) Voiceprint feature extraction algorithm based on dynamic segmentation of context-dependent spectral coefficients
CN110767238A (en) Blacklist identification method, apparatus, device and storage medium based on address information
CN117935826B (en) Audio up-sampling method, device, equipment and storage medium
CN118098255A (en) Voice enhancement method based on neural network detection and related device thereof
CN106448655A (en) Speech identification method
CN112435655A (en) Data acquisition and model training method and device for isolated word speech recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20210409)