
CN112908343B - Acquisition method and system for bird species number based on cepstrum spectrogram - Google Patents


Info

Publication number
CN112908343B
Authority
CN
China
Prior art keywords
cepstrum
spectrogram
signal
bird
bird song
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911133278.5A
Other languages
Chinese (zh)
Other versions
CN112908343A (en
Inventor
王静宇
张纯
许枫
蒋立军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Acoustics CAS
Original Assignee
Institute of Acoustics CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Acoustics CAS filed Critical Institute of Acoustics CAS
Priority to CN201911133278.5A priority Critical patent/CN112908343B/en
Publication of CN112908343A publication Critical patent/CN112908343A/en
Application granted granted Critical
Publication of CN112908343B publication Critical patent/CN112908343B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/26 Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Abstract

The invention belongs to the technical field of bird species number analysis, and particularly relates to a cepstrum spectrogram-based method for acquiring the number of bird species, which comprises the following steps: collecting a bird song signal of a monitoring point in real time to serve as bird song data; performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram; acquiring different cepstrum spectrogram shapes by using the acquired cepstrum spectrogram according to different cepstrum characteristics of the bird song of different species; and acquiring the number of the bird species according to the acquired different cepstrum spectrogram shapes.

Description

Acquisition method and system for bird species number based on cepstrum spectrogram
Technical Field
The invention belongs to the technical field of bird species number analysis, and particularly relates to a cepstrum spectrogram-based method and system for acquiring the number of bird species.
Background
The investigation of bird species diversity is an important research topic in biology. In recent years, acoustic monitoring of the activity patterns of organisms in an ecosystem has emerged as a new technical approach: an acoustic sensor array is deployed in the field, and bird species are monitored by processing the recorded bird song signals, which is a non-invasive and efficient research method.
When analyzing bird species diversity in a monitoring area, the conventional approach first identifies the bird species present at the monitoring point and then analyzes species diversity from the identification result. This requires building a feature database of bird songs for the monitoring area and a complex bird song recognition algorithm; moreover, field recordings contain complex environmental noise and often several birds singing at the same time, both of which make species recognition difficult.
In addition, conventional methods for acquiring the number of bird species, such as the point count method and the line transect method, require bird information to be collected manually in the field, so survey efficiency is low and survey errors are large.
Disclosure of Invention
The invention aims to overcome the shortcomings of existing methods for acquiring the number of bird species, and provides a cepstrum spectrogram-based method for acquiring the number of bird species.
To achieve this purpose, the invention provides a cepstrum spectrogram-based method for acquiring the number of bird species: cepstrum analysis is performed on bird song data collected in the field to generate a cepstrum spectrogram, and the number of bird species is then obtained by analyzing the image shape features of that spectrogram. The method addresses the technical problem that bird song recognition, and hence bird species diversity analysis, is difficult in a complex field environment. The method comprises the following steps:
collecting a bird song signal of a monitoring point in real time to serve as bird song data;
performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram;
obtaining different cepstrum spectrogram shape characteristics by using the obtained cepstrum spectrogram according to different cepstrum characteristics of the bird song of different species;
and acquiring the number of the bird species according to the shape characteristics of the acquired different cepstrum spectrograms.
Performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram specifically comprises:
performing cepstrum analysis on the acquired bird song data of the monitoring point, and extracting two characteristic quantities, namely the pitch frequency (fundamental tone frequency) and the formants, of the voice signals corresponding to the songs of different birds.
According to a short-time analysis principle, the amplitudes of two characteristic quantities of the fundamental tone frequency and the formant obtained in the cepstrum analysis of each frame of the bird song signal are respectively subjected to gray level mapping to respectively generate a fundamental tone frequency spectrogram and a formant spectrogram, and the fundamental tone frequency spectrogram and the formant spectrogram are integrated to obtain the cepstrum spectrogram.
The obtaining of the cepstrum spectrogram specifically comprises:
obtaining, from each frame of the bird song signal x(n), the spectrum signal X(k) of that frame:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} \qquad (1)$$
wherein N is the length, in sample points, of each frame of the voice signal x(n);
n is any integer between 0 and N-1 and represents the time index of the bird song signal x(n);
k is any integer between 0 and N-1 and represents the frequency index of the spectrum signal X(k);
j is the imaginary unit;
e is the base of the natural logarithm.
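As a minimal illustration of Eq. (1), the following Python sketch evaluates the sum directly for one frame. It is not the patent's implementation (the embodiment runs in Matlab); the function name `dft` and the 16-point cosine test frame are assumptions chosen only so that the result is easy to check by hand.

```python
import cmath
import math

def dft(frame):
    """Direct evaluation of Eq. (1): X(k) = sum_n x(n) * exp(-j*2*pi*n*k/N)."""
    N = len(frame)
    return [
        sum(frame[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
        for k in range(N)
    ]

# Hypothetical test frame: a cosine that completes exactly 2 cycles in N = 16 points.
N = 16
frame = [math.cos(2 * math.pi * 2 * n / N) for n in range(N)]

spectrum = dft(frame)
magnitudes = [abs(X) for X in spectrum]

# Search the non-negative frequency indices 0..N/2 for the strongest component.
peak_bin = max(range(N // 2 + 1), key=lambda k: magnitudes[k])
print(peak_bin, round(magnitudes[peak_bin], 6))  # the energy sits at k = 2 with |X(2)| = N/2
```

A direct sum costs on the order of N^2 operations per frame; a practical implementation would normally call an FFT routine, which computes the same X(k).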
Based on the spectrum signal X(k) of the bird song signal x(n), a pitch frequency characteristic quantity and a formant characteristic quantity of the voice signal corresponding to the frame of the bird song signal are obtained respectively; the specific steps are as follows:
the formant characteristic quantity is the amplitude of the logarithmic spectrum of the bird song signal x(n), expressed as:
$$F(k) = \log_{10} |X(k)| \qquad (2)$$
wherein F(k) is the formant characteristic quantity of the frame of the bird song signal;
the pitch frequency characteristic quantity is expressed by the time corresponding to the amplitude peak in the high-time (high-quefrency) part of the cepstrum c(n) of the bird song signal x(n), and the cepstrum c(n) of the bird song signal is expressed as:
$$c(n) = \frac{1}{N} \sum_{k=0}^{N-1} \log_{10} |X(k)|\, e^{j 2\pi nk/N} \qquad (3)$$
wherein c(n) is the cepstrum sequence of the frame of the bird song signal, that is, the time sequence obtained by the inverse Fourier transform of the log-amplitude spectrum of the frame.
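The log-amplitude spectrum and the cepstrum can be sketched in Python as follows. This is an illustrative sketch rather than the patented implementation: the helper names, the `floor` guard against log10(0), the search range for the peak, and the synthetic test frame (a unit impulse plus a half-amplitude copy delayed by 10 samples, whose cepstrum is known to peak at that delay) are all assumptions.

```python
import cmath
import math

def dft(x):
    """Forward DFT: X(k) = sum_n x(n) * exp(-j*2*pi*n*k/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

def log_magnitude(X, floor=1e-12):
    """Formant feature F(k) = log10|X(k)|; `floor` is an assumed guard against log10(0)."""
    return [math.log10(max(abs(v), floor)) for v in X]

def cepstrum(frame):
    """Inverse DFT of the log-amplitude spectrum, scaled by 1/N (real part kept)."""
    F = log_magnitude(dft(frame))
    N = len(F)
    return [sum(F[k] * cmath.exp(2j * cmath.pi * n * k / N) for k in range(N)).real / N
            for n in range(N)]

def peak_index(c, min_lag):
    """Index of the largest cepstrum value in the high-time part, min_lag..N/2."""
    return max(range(min_lag, len(c) // 2 + 1), key=lambda n: c[n])

# Hypothetical test frame: an impulse plus a half-amplitude echo 10 samples later.
N, delay = 64, 10
frame = [0.0] * N
frame[0] = 1.0
frame[delay] = 0.5

c = cepstrum(frame)
lag = peak_index(c, min_lag=2)

fs = 44100                   # sampling frequency in Hz, as in the embodiment
pitch_period_s = lag / fs    # time corresponding to the cepstral peak
print(lag, pitch_period_s)   # the peak falls at the 10-sample delay
```

Dividing the peak index by the sampling frequency converts it to a time in seconds, which is the quantity the text uses to express the pitch frequency feature.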
According to a short-time analysis principle, generating a two-dimensional image represented by gray levels by carrying out gray mapping on amplitude values corresponding to high-time parts of a cepstrum sequence c (n) obtained in the cepstrum analysis of each frame of the bird song signal to obtain a fundamental tone frequency spectrogram;
carrying out gray mapping on an amplitude value corresponding to the characteristic quantity of the formant obtained in the cepstrum analysis of each frame of the bird song signal to generate a two-dimensional image represented by gray levels, and obtaining a formant spectrogram;
and integrating the fundamental tone frequency spectrogram and the formant spectrogram to obtain a cepstrum spectrogram.
Wherein the pitch frequency spectrogram reflects the relationship between the occurrence time of the pitch frequency of the birdsong signal x (n) and the number of frames; the formant spectrogram reflects the relationship between the formant frequency and the number of frames of the bird song signal x (n).
As an improvement of the above technical solution, the method further comprises: and comparing the acquired shape characteristics of the different cepstrum spectrograms with the shape characteristics of the cepstrum spectrograms of the birds in the existing database, and determining the number of the species of the birds and the type of the species of the birds at the same time.
The invention also provides a cepstrum spectrogram-based acquisition system for the number of bird species, which is realized based on the method and comprises the following steps: the device comprises an acquisition module, an analysis module, a spectrogram shape feature acquisition module and a species number acquisition module;
the acquisition module is used for acquiring a birdsong signal of a monitoring point in real time as birdsong data;
the analysis module is used for performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram;
the spectrogram shape feature acquisition module is used for acquiring different cepstrum spectrogram shape features according to different cepstrum characteristics of the bird song of different species by using the acquired cepstrum spectrogram;
the species number acquisition module is used for acquiring the number of the bird species according to the acquired different cepstrum spectrogram shapes.
As an improvement of the above technical solution, the system further includes: and the bird species type determining module is used for comparing the acquired different cepstrum spectrogram shapes with cepstrum spectrogram shapes of birds in the existing database through the existing database, and determining the species types of the birds while determining the number of the bird species.
Compared with the prior art, the invention has the beneficial effects that:
1) The number of bird species is obtained by processing bird song signals, and bird species diversity is analyzed on that basis. Whereas manual surveys consume a large amount of manpower and material resources, the method saves cost and time; it is also not limited by the activity of the birds and does not disturb them, making it a non-invasive and efficient research method;
2) The number of bird species is obtained from the pattern features of the cepstrum spectrogram; compared with complex bird song pattern recognition algorithms, the algorithm is simple, the result is intuitive, and the method is easy to implement.
Drawings
FIG. 1 is a brief schematic flow chart of the cepstrum spectrogram-based method for acquiring the number of bird species according to the present invention;
FIG. 2 is a schematic diagram of the time-domain waveform of the bird song signal in an embodiment of the method;
FIG. 3a is a schematic diagram of the formant spectrogram of the cepstrum spectrogram in an embodiment of the method;
FIG. 3b is a schematic diagram of the fundamental tone frequency spectrogram of the cepstrum spectrogram in an embodiment of the method;
FIG. 4a is a schematic diagram of the cepstrum of a voiced signal in an embodiment of the method;
FIG. 4b is a schematic diagram of the cepstrum of an unvoiced signal in an embodiment of the method;
FIG. 5 is a detailed schematic flow chart of the method.
Detailed Description
The invention will now be further described with reference to the accompanying drawings.
As shown in FIG. 1, the present invention provides a cepstrum spectrogram-based method for acquiring the number of bird species. The method uses cepstrum analysis, an important speech signal analysis technique. The formant and pitch frequency features extracted by cepstrum analysis are determined by the vocal tract structure unique to the sounding individual, so they can be used to distinguish the songs of different bird species.
As shown in fig. 1 and 5, the method includes:
collecting a bird song signal of a monitoring point in real time to serve as bird song data;
performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram, which specifically comprises:
performing cepstrum analysis on the acquired bird song data of the monitoring point, and extracting two characteristic quantities, namely the pitch frequency (fundamental tone frequency) and the formants, of the voice signals corresponding to the songs of different birds.
According to a short-time analysis principle, the amplitudes of two characteristic quantities of the fundamental tone frequency and the formant obtained in the cepstrum analysis of each frame of the bird song signal are respectively subjected to gray level mapping to respectively generate a fundamental tone frequency spectrogram and a formant spectrogram, and the fundamental tone frequency spectrogram and the formant spectrogram are integrated to obtain the cepstrum spectrogram.
The obtaining of the cepstrum spectrogram specifically comprises:
obtaining, from each frame of the bird song signal x(n), the spectrum signal X(k) of that frame:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} \qquad (1)$$
wherein N is the length, in sample points, of each frame of the voice signal x(n);
n is any integer between 0 and N-1 and represents the time index of the bird song signal x(n);
k is any integer between 0 and N-1 and represents the frequency index of the spectrum signal X(k);
j is the imaginary unit;
e is the base of the natural logarithm.
Based on the spectrum signal X(k) of the bird song signal x(n), a pitch frequency characteristic quantity and a formant characteristic quantity of the voice signal corresponding to the frame of the bird song signal are obtained respectively; the specific steps are as follows:
the formant characteristic quantity is the amplitude of the logarithmic spectrum of the bird song signal x(n), expressed as:
$$F(k) = \log_{10} |X(k)| \qquad (2)$$
wherein F(k) is the formant characteristic quantity of the frame of the bird song signal;
the pitch frequency characteristic quantity is expressed by the time corresponding to the amplitude peak in the high-time (high-quefrency) part of the cepstrum c(n) of the bird song signal x(n), and the cepstrum c(n) of the bird song signal is expressed as:
$$c(n) = \frac{1}{N} \sum_{k=0}^{N-1} \log_{10} |X(k)|\, e^{j 2\pi nk/N} \qquad (3)$$
wherein c(n) is the cepstrum sequence of the frame of the bird song signal, that is, the time sequence obtained by the inverse Fourier transform of the log-amplitude spectrum of the frame.
According to a short-time analysis principle, generating a two-dimensional image represented by gray levels by mapping amplitude values corresponding to high-time parts of a cepstrum sequence c (n) obtained in the cepstrum analysis of each frame of the bird song signal through gray levels to obtain a fundamental tone frequency spectrogram;
carrying out gray mapping on an amplitude value corresponding to the characteristic quantity of the formant obtained in the cepstrum analysis of each frame of the bird song signal to generate a two-dimensional image represented by gray levels, and obtaining a formant spectrogram;
and integrating the fundamental tone frequency spectrogram and the formant spectrogram to obtain a cepstrum spectrogram.
Wherein, the fundamental tone frequency spectrogram reflects the relationship between the occurrence time and the frame number of the fundamental tone frequency of the bird song signal x (n); the formant spectrum reflects the relationship between the formant frequency and the number of frames of the bird song signal x (n).
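The gray mapping step described above can be sketched as follows. The text does not specify the mapping, so a linear mapping to 8-bit gray levels (0 to 255) is assumed here; `gray_map` and the small example matrix are hypothetical. Each row stands for one frame and each column for one feature value (F(k) for the formant spectrogram, or the high-time part of c(n) for the fundamental tone frequency spectrogram).

```python
def gray_map(rows):
    """Linearly map a frame-by-feature amplitude matrix to integer gray levels 0..255.

    Assumption: the smallest amplitude maps to 0 and the largest to 255.
    """
    lo = min(min(r) for r in rows)
    hi = max(max(r) for r in rows)
    span = hi - lo
    if span == 0:
        # A constant matrix carries no contrast; render it as uniform black.
        return [[0 for _ in r] for r in rows]
    return [[round(255 * (v - lo) / span) for v in r] for r in rows]

# Hypothetical amplitudes for 2 frames x 3 feature bins.
amplitudes = [
    [0.0, 0.2, 1.0],
    [0.4, 0.6, 0.8],
]

image = gray_map(amplitudes)
for row in image:
    print(row)
```

Stacking one such row per frame yields the two-dimensional gray-level image in which stripe shapes can be compared across frame ranges.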
Acquiring different cepstrum spectrogram shape characteristics by using the acquired cepstrum spectrogram according to different cepstrum characteristics of the bird song of different species;
and acquiring the number of the bird species according to the shape characteristics of the acquired different cepstrum spectrograms.
The method further comprises the following steps: and comparing the acquired shape characteristics of the different cepstrum spectrograms with the shape characteristics of the cepstrum spectrograms of specific birds in the existing database, and determining the number of bird species and the species type of the birds at the same time.
The invention also provides a cepstrum spectrogram-based acquisition system for the number of bird species, which is realized based on the method and comprises the following steps: the device comprises an acquisition module, an analysis module, a spectrogram shape characteristic acquisition module and a species number acquisition module;
the acquisition module is used for acquiring a birdsong signal of a monitoring point in real time as birdsong data;
the analysis module is used for performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram;
the spectrogram shape feature acquisition module is used for acquiring different cepstrum shape features according to different cepstrum features of bird sings of different species by using the acquired cepstrum;
the species number obtaining module is used for obtaining the number of the bird species according to the obtained different cepstrum spectrogram shapes.
The system further comprises: and the bird species type determining module is used for comparing the acquired different cepstral spectrogram shapes with the cepstral spectrogram shapes of birds in the existing database through the existing database, and determining the number of bird species and the species type of the birds at the same time.
Example 1.
In this example, the experimental data are a segment of field bird song recordings acquired by researchers at a monitoring point in a nature reserve of a five-river island in the boat mountain; the segment was labeled by the data acquisition personnel and contains the songs of three kinds of birds in total.
The acquisition equipment was a Marantz PMD-671 digital solid-state recorder with an external Sennheiser MKH416-P48 highly directional microphone; the sound signal was sampled with 16-bit precision at a sampling frequency of 44.1 kHz.
The method of the invention is utilized to realize the acquisition and analysis of the number of bird species, and comprises the following steps:
Cepstrum spectrogram analysis was performed on the bird song data acquired in real time, with the algorithm run in the Matlab software environment on a computer. A Hamming window was used for the calculation; the short-time window length is 1024 sampling points (23.2 ms), and adjacent windows overlap by 512 points. FIG. 2 shows the time-domain waveform of the bird song signal, which contains the songs of the 3 kinds of birds recorded by the experimenters in the field.
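The framing parameters of this example can be checked with a short Python sketch. The window length, overlap, sampling frequency and Hamming window come from the text; the function names and the one-second placeholder signal are assumptions, and the symmetric Hamming formula is the common textbook definition rather than something the text states.

```python
import math

FS = 44100            # sampling frequency in Hz (from the example)
WIN = 1024            # short-time window length in samples (from the example)
OVERLAP = 512         # overlapped points between adjacent windows (from the example)
HOP = WIN - OVERLAP   # frame advance in samples

def hamming(N):
    """Symmetric Hamming window: w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def split_frames(signal, win=WIN, hop=HOP):
    """Cut the signal into overlapping, Hamming-weighted frames (trailing remainder dropped)."""
    w = hamming(win)
    return [
        [signal[start + n] * w[n] for n in range(win)]
        for start in range(0, len(signal) - win + 1, hop)
    ]

window_ms = 1000 * WIN / FS   # about 23.2 ms, matching the figure quoted in the text
hop_ms = 1000 * HOP / FS      # about 11.6 ms between successive frames

# Placeholder input: one second of silence, used only to count frames.
signal = [0.0] * FS
frames = split_frames(signal)
print(round(window_ms, 1), round(hop_ms, 1), len(frames))
```

With a 512-sample hop, one second of audio at 44.1 kHz yields 85 full frames, so a frame index in the spectrograms advances by roughly 11.6 ms.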
Cepstrum analysis is performed on the time-domain bird song signal of FIG. 2: the two characteristic quantities of the bird song, the pitch frequency and the formants, are calculated, a fundamental tone frequency spectrogram and a formant spectrogram are generated from them, and the two are integrated to obtain the cepstrum spectrogram shown in FIG. 3.
The formant spectrogram is shown in FIG. 3a; the abscissa is frequency in hertz (Hz), and the ordinate is the frame number, representing the time sequence of the bird song signal. The fundamental tone frequency spectrogram is shown in FIG. 3b; the abscissa is time in seconds (s), and the ordinate is the frame number, representing the time sequence of the bird song signal. Observation shows that FIG. 3a contains formant spectrogram stripes of three different shapes, located at frames 1-60, 80-100 and 100-120 respectively; FIG. 3b contains fundamental tone frequency spectrogram stripes of two different shapes at frames 80-100 and 100-120, and no obvious stripe at frames 1-60.
This phenomenon can be explained by the difference between the cepstrum characteristics of unvoiced and voiced speech signals: the high-time (high-quefrency) part of the cepstrum waveform of a voiced speech segment has an obvious peak whose time of occurrence equals the pitch period of the input speech signal, whereas no such peak appears in the cepstrum waveform of an unvoiced speech segment. FIG. 4 illustrates this:
in FIG. 4a, the cepstrum waveform of the voiced signal has a sharp peak at 1.8 ms; in FIG. 4b, no peak appears in the cepstrum waveform of the unvoiced signal.
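Because the peak time equals the pitch period, the pitch frequency follows as its reciprocal. The sketch below works this out for the 1.8 ms peak of FIG. 4a; the 44.1 kHz rate used to convert the peak time into a sample index is taken from the recording setup of this example and is only assumed to apply to the signal in FIG. 4.

```python
peak_time_s = 1.8e-3   # time of the cepstral peak in FIG. 4a (from the text)
fs = 44100             # assumed sampling frequency in Hz for the FIG. 4 signal

# The pitch period equals the peak time, so the pitch frequency is its reciprocal.
pitch_hz = 1.0 / peak_time_s

# Nearest cepstrum sample index at which such a peak would appear.
peak_lag_samples = round(peak_time_s * fs)

print(round(pitch_hz, 1), peak_lag_samples)  # about 555.6 Hz, near sample 79
```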
In FIG. 3b, there is no significant fundamental tone frequency spectrogram stripe at frames 1-60 because that segment of bird song is an unvoiced signal, whereas the two segments at frames 80-100 and 100-120 are voiced signals and therefore show significant fundamental tone frequency spectrogram stripes.
From the graphical features of the cepstrum spectrogram in FIGS. 3a and 3b, namely three types of formant spectrogram stripes in FIG. 3a and three types of fundamental tone frequency spectrogram stripes in FIG. 3b (the absence of a stripe also counts as a feature), it can be determined that the segment of bird song data contains the songs of 3 kinds of birds. This agrees with the information labeled by the data acquisition personnel, which verifies that the algorithm of the invention can obtain the number of bird species in bird song data; comparison with an existing database can further determine the species of the 3 birds.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and are not limiting. Although the present invention has been described in detail with reference to the embodiments, those skilled in the art should understand that modifications or equivalent substitutions of the technical solutions of the present invention that do not depart from their spirit and scope shall all be covered by the scope of the claims of the present invention.

Claims (5)

1. A cepstrum spectrogram-based method for acquiring the number of avian species is characterized by comprising the following steps of:
collecting a bird song signal of a monitoring point in real time to serve as bird song data;
performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram;
the obtaining of the cepstrum spectrogram specifically comprises:
obtaining, from each frame of the bird song signal x(n), the spectrum signal X(k) of that frame:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} \qquad (1)$$
wherein N is the length, in sample points, of each frame of the voice signal x(n);
n is any integer between 0 and N-1 and represents the time index of the bird song signal x(n);
k is any integer between 0 and N-1 and represents the frequency index of the spectrum signal X(k);
j is the imaginary unit;
e is the base of the natural logarithm;
respectively obtaining, based on the spectrum signal X(k) of the bird song signal x(n), a pitch frequency characteristic quantity and a formant characteristic quantity of the voice signal corresponding to the frame of the bird song signal; the specific steps are as follows:
the formant characteristic quantity is the amplitude of the logarithmic spectrum of the bird song signal x(n), expressed as:
$$F(k) = \log_{10} |X(k)| \qquad (2)$$
wherein F(k) is the formant characteristic quantity of the frame of the bird song signal;
the pitch frequency characteristic quantity is expressed by the time corresponding to the amplitude peak in the high-time (high-quefrency) part of the cepstrum c(n) of the bird song signal x(n), and the cepstrum c(n) of the bird song signal is expressed as:
$$c(n) = \frac{1}{N} \sum_{k=0}^{N-1} \log_{10} |X(k)|\, e^{j 2\pi nk/N} \qquad (3)$$
wherein c(n) is the cepstrum sequence of the frame of the bird song signal, namely the time sequence obtained by the inverse Fourier transform of the log-amplitude spectrum of the frame;
according to a short-time analysis principle, generating a two-dimensional image represented by gray levels by mapping amplitude values corresponding to high-time parts of a cepstrum sequence c (n) obtained in the cepstrum analysis of each frame of the bird song signal through gray levels to obtain a fundamental tone frequency spectrogram;
carrying out gray mapping on an amplitude value corresponding to the characteristic quantity of the formant obtained in the cepstrum analysis of each frame of the bird song signal to generate a two-dimensional image represented by gray levels, and obtaining a formant spectrogram;
integrating the fundamental tone frequency spectrogram and the formant spectrogram to obtain a cepstrum spectrogram;
wherein the pitch frequency spectrogram reflects the relationship between the occurrence time of the pitch frequency of the birdsong signal x (n) and the number of frames; the formant spectrogram reflects the relationship between the formant frequency and the number of frames of the bird song signal x (n);
acquiring different cepstrum spectrogram shape characteristics by using the acquired cepstrum spectrogram according to different cepstrum characteristics of the bird song of different species;
and acquiring the number of the bird species according to the shape characteristics of the acquired different cepstrum spectrograms.
2. The method of claim 1, wherein the cepstrum analysis of the acquired bird song data is performed to obtain a cepstrum spectrogram; the method specifically comprises the following steps:
performing cepstrum analysis on the acquired bird song data of the monitoring points, and extracting two characteristic quantities, namely the pitch frequency (fundamental tone frequency) and the formants, of the voice signals corresponding to the songs of different birds;
according to the short-time analysis principle, the amplitudes of two characteristic quantities of the fundamental tone frequency and the formant obtained in the cepstrum analysis of each frame of the bird song signal are respectively subjected to gray level mapping to respectively generate a fundamental tone frequency spectrogram and a formant spectrogram, and the fundamental tone frequency spectrogram and the formant spectrogram are integrated to obtain the cepstrum spectrogram.
3. The method of claim 1, further comprising: and comparing the shape characteristics of the acquired different cepstrum spectrograms with the shape characteristics of the cepstrum spectrograms of the birds in the existing database, and determining the species types of the birds while determining the number of the species of the birds.
4. A cepstrum spectrogram-based acquisition system for avian species numbers, the system comprising: the device comprises an acquisition module, an analysis module, a spectrogram shape characteristic acquisition module and a species number acquisition module;
the acquisition module is used for acquiring a birdsong signal of a monitoring point in real time as birdsong data;
the analysis module is used for performing cepstrum analysis on the acquired bird song data to obtain a cepstrum spectrogram;
the obtaining of the cepstrum spectrogram specifically comprises:
obtaining, from each frame of the bird song signal x(n), the spectrum signal X(k) of that frame:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} \qquad (1)$$
wherein N is the length, in sample points, of each frame of the voice signal x(n);
n is any integer between 0 and N-1 and represents the time index of the bird song signal x(n);
k is any integer between 0 and N-1 and represents the frequency index of the spectrum signal X(k);
j is the imaginary unit;
e is the base of the natural logarithm;
respectively obtaining, based on the spectrum signal X(k) of the bird song signal x(n), a pitch frequency characteristic quantity and a formant characteristic quantity of the voice signal corresponding to the frame of the bird song signal; the specific steps are as follows:
the formant characteristic quantity is the amplitude of the logarithmic spectrum of the bird song signal x(n), expressed as:
$$F(k) = \log_{10} |X(k)| \qquad (2)$$
wherein F(k) is the formant characteristic quantity of the frame of the bird song signal;
the pitch frequency characteristic quantity is expressed by the time corresponding to the amplitude peak in the high-time (high-quefrency) part of the cepstrum c(n) of the bird song signal x(n), and the cepstrum c(n) of the bird song signal is expressed as:
$$c(n) = \frac{1}{N} \sum_{k=0}^{N-1} \log_{10} |X(k)|\, e^{j 2\pi nk/N} \qquad (3)$$
wherein c(n) is the cepstrum sequence of the frame of the bird song signal, namely the time sequence obtained by the inverse Fourier transform of the log-amplitude spectrum of the frame;
according to the short-time analysis principle, mapping the amplitudes of the high-time part of the cepstrum sequence c(n) obtained in the cepstrum analysis of each frame of the bird song signal to gray levels, thereby generating a two-dimensional gray-level image, namely the fundamental tone frequency spectrogram;
mapping the amplitudes of the formant characteristic quantity obtained in the cepstrum analysis of each frame of the bird song signal to gray levels, thereby generating a two-dimensional gray-level image, namely the formant spectrogram;
integrating the fundamental tone frequency spectrogram and the formant spectrogram to obtain the cepstrum spectrogram;
wherein the fundamental tone frequency spectrogram reflects the relationship between the time at which the fundamental tone frequency of the bird song signal x(n) occurs and the frame number; and the formant spectrogram reflects the relationship between the formant frequency of the bird song signal x(n) and the frame number;
the spectrogram shape feature acquisition module is used for obtaining, from the acquired cepstrum spectrogram, different cepstrum spectrogram shape features according to the different cepstrum characteristics of the songs of different bird species;
the species number acquisition module is used for obtaining the number of bird species according to the different cepstrum spectrogram shapes obtained.
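The cepstrum analysis recited in claim 4 can be illustrated by the following minimal Python sketch, which is not the patented implementation. It assumes a direct (non-FFT) discrete Fourier transform, non-overlapping or hopped rectangular frames, a linear mapping to 256 gray levels, and "integration" of the two spectrograms by placing their rows side by side. The names `floor`, `min_quefrency`, `hop`, `to_gray` and `cepstrum_spectrogram` are hypothetical and do not appear in the claims.

```python
import cmath
import math


def dft(frame):
    """Spectrum X(k), k = 0..N-1, of one frame x(n) (direct O(N^2) DFT)."""
    N = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def log_magnitude(spectrum, floor=1e-12):
    """Formant quantity F(k) = log10|X(k)|; `floor` (assumed) avoids log(0)."""
    return [math.log10(max(abs(X), floor)) for X in spectrum]


def cepstrum(log_mag):
    """Cepstrum c(n): inverse DFT of the log-magnitude spectrum (real part)."""
    N = len(log_mag)
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * n / N)
                for k in range(N)).real / N
            for n in range(N)]


def pitch_peak_index(ceps, min_quefrency):
    """Index n of the largest peak in the high-time part [min_quefrency, N//2]."""
    return max(range(min_quefrency, len(ceps) // 2 + 1), key=lambda n: ceps[n])


def to_gray(rows):
    """Linearly map amplitudes to gray levels 0..255 (one row per frame)."""
    lo = min(min(r) for r in rows)
    hi = max(max(r) for r in rows)
    span = (hi - lo) or 1.0
    return [[round(255 * (v - lo) / span) for v in r] for r in rows]


def cepstrum_spectrogram(signal, frame_len, hop, min_quefrency):
    """Gray image with one row per frame: pitch part followed by formant part."""
    half = frame_len // 2
    pitch_rows, formant_rows = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        F = log_magnitude(dft(signal[start:start + frame_len]))
        c = cepstrum(F)
        pitch_rows.append(c[min_quefrency:half + 1])   # high-time part of c(n)
        formant_rows.append(F[:half + 1])              # non-redundant half of F(k)
    # Assumed integration step: concatenate the two gray images row by row.
    return [p + f for p, f in zip(to_gray(pitch_rows), to_gray(formant_rows))]


# Example: three identical frames, each a pulse plus an echo 10 samples later.
frame = [0.0] * 64
frame[0], frame[10] = 1.0, 0.5
image = cepstrum_spectrogram(frame * 3, frame_len=64, hop=64, min_quefrency=2)
```

In this example the echo delay of 10 samples plays the role of a pitch period, so the cepstral peak of each frame falls at n = 10, which is the brightest column of the pitch part of `image`.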
5. The system of claim 4, further comprising: a bird species type determination module, used for comparing the acquired different cepstrum spectrogram shapes with the cepstrum spectrogram shapes of birds in an existing database, so that the species types of the birds are determined together with the number of bird species.
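The claims do not specify how spectrogram shapes are compared, so the following Python sketch is only one possible reading. It assumes that each cepstrum spectrogram shape has already been reduced to a numeric feature vector, that similarity is measured by Euclidean distance, and that shapes closer than a chosen threshold belong to the same species. The functions `group_shapes` and `identify`, the `threshold` parameter, and the database entries are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two shape feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_shapes(features, threshold):
    """Greedy grouping: a vector founds a new group unless it lies within
    `threshold` of an existing group representative. The number of
    representatives returned is the estimated number of species."""
    representatives = []
    for f in features:
        if not any(distance(f, r) <= threshold for r in representatives):
            representatives.append(f)
    return representatives


def identify(representatives, database, threshold):
    """Name of the nearest database entry for each representative,
    or None when no entry lies within `threshold`."""
    names = []
    for r in representatives:
        name, ref = min(database.items(), key=lambda item: distance(r, item[1]))
        names.append(name if distance(r, ref) <= threshold else None)
    return names


# Example with made-up two-dimensional shape features.
observed = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9)]
reps = group_shapes(observed, threshold=1.0)
species_count = len(reps)
database = {"species_a": (0.0, 0.1), "species_b": (5.0, 5.0)}
species_names = identify(reps, database, threshold=1.0)
```

Counting the representatives corresponds to the species number acquisition module of claim 4, and the nearest-entry lookup corresponds to the database comparison of claim 5. A practical system would need a shape descriptor and a threshold validated on labelled recordings.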
CN201911133278.5A 2019-11-19 2019-11-19 Acquisition method and system for bird species number based on cepstrum spectrogram Active CN112908343B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911133278.5A CN112908343B (en) 2019-11-19 2019-11-19 Acquisition method and system for bird species number based on cepstrum spectrogram

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911133278.5A CN112908343B (en) 2019-11-19 2019-11-19 Acquisition method and system for bird species number based on cepstrum spectrogram

Publications (2)

Publication Number Publication Date
CN112908343A CN112908343A (en) 2021-06-04
CN112908343B true CN112908343B (en) 2022-10-04

Family

ID=76103171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911133278.5A Active CN112908343B (en) 2019-11-19 2019-11-19 Acquisition method and system for bird species number based on cepstrum spectrogram

Country Status (1)

Country Link
CN (1) CN112908343B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117727332B (en) * 2024-02-18 2024-04-26 百鸟数据科技(北京)有限责任公司 Ecological population assessment method based on language spectrum feature analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8831942B1 (en) * 2010-03-19 2014-09-09 Narus, Inc. System and method for pitch based gender identification with suspicious speaker detection
CN104658538A (en) * 2013-11-18 2015-05-27 中国计量学院 Mobile bird recognition method based on birdsong
CN109409308A (en) * 2018-11-05 2019-03-01 中国科学院声学研究所 A method of the birds species identification based on birdvocalization
CN109462482A (en) * 2018-11-09 2019-03-12 深圳壹账通智能科技有限公司 Method for recognizing sound-groove, device, electronic equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
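Application of Cepstrum in Pitch and Formant Extraction of Speech (倒谱在语音的基音和共振峰提取中的应用); Wang Xiaoya (王晓亚); Radio Engineering (《无线电工程》); 2004-01-30 (No. 01); Sections 3-4 *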

Also Published As

Publication number Publication date
CN112908343A (en) 2021-06-04

Similar Documents

Publication Publication Date Title
Alim et al. Some commonly used speech feature extraction algorithms
CN106935248B (en) Voice similarity detection method and device
US6862558B2 (en) Empirical mode decomposition for analyzing acoustical signals
Kumar et al. Design of an automatic speaker recognition system using MFCC, vector quantization and LBG algorithm
CN109034046B (en) Method for automatically identifying foreign matters in electric energy meter based on acoustic detection
CN110880329B (en) Audio identification method and equipment and storage medium
CN105825852A (en) Oral English reading test scoring method
CN108896878A (en) A kind of detection method for local discharge based on ultrasound
CN104050965A (en) English phonetic pronunciation quality evaluation system with emotion recognition function and method thereof
CN103617799A (en) Method for detecting English statement pronunciation quality suitable for mobile device
CN104616663A (en) Music separation method of MFCC (Mel Frequency Cepstrum Coefficient)-multi-repetition model in combination with HPSS (Harmonic/Percussive Sound Separation)
CN108305639B (en) Speech emotion recognition method, computer-readable storage medium and terminal
CN104123934A (en) Speech composition recognition method and system
CN103366735B (en) The mapping method of speech data and device
CN103366759A (en) Speech data evaluation method and speech data evaluation device
CN110299141A (en) The acoustic feature extracting method of recording replay attack detection in a kind of Application on Voiceprint Recognition
CN108682432B (en) Speech emotion recognition device
CN112331220A (en) Bird real-time identification method based on deep learning
CN112397074A (en) Voiceprint recognition method based on MFCC (Mel frequency cepstrum coefficient) and vector element learning
Venter et al. Automatic detection of African elephant (Loxodonta africana) infrasonic vocalisations from recordings
JP2004240214A (en) Acoustic signal discriminating method, acoustic signal discriminating device, and acoustic signal discriminating program
CN116230018A (en) Synthetic voice quality evaluation method for voice synthesis system
CN112908343B (en) Acquisition method and system for bird species number based on cepstrum spectrogram
Kawahara et al. Higher order waveform symmetry measure and its application to periodicity detectors for speech and singing with fine temporal resolution
CN112201226B (en) Sound production mode judging method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant