CN109817223A - Phoneme marking method and device based on audio fingerprints - Google Patents
Phoneme marking method and device based on audio fingerprints
- Publication number
- CN109817223A
- Authority
- CN
- China
- Prior art keywords
- phoneme
- voice
- audio
- marked
- frequency fingerprint
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to the technical field of voiceprint identification, and specifically discloses a phoneme marking method and device based on audio fingerprints. The method comprises the following steps: extracting an audio fingerprint of a voice to be marked, and acquiring voice spectrum pole information of that audio fingerprint; comparing the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number; and judging whether the pronunciation of any retrieval phoneme among the top N retrieval phonemes is consistent with the pronunciation of the phoneme to be marked and, if so, determining the N retrieval phonemes as the marking phonemes of the voice to be marked. Because only the spectrum poles are selected for comparison, the method and device reduce the comparison time and thereby achieve fast marking.
Description
Technical field
The present invention relates to the technical field of voiceprint identification, and in particular to a phoneme marking method based on audio fingerprints.
Background art
Audio fingerprinting extracts data features from a sound and compares the content to be identified against an established audio fingerprint database. The identification process is not affected by the storage format, encoding method, bit rate, or compression technique of the audio itself. Audio fingerprint matching is highly precise and does not depend on elements of the file such as its metadata (meta-information), watermarks, or file hash value.
A phoneme is the smallest unit of speech. Phonemes are analyzed according to the articulatory actions within a syllable, with one action constituting one phoneme. Phonemes fall into two broad classes: vowels and consonants.
Voiceprint identification, also known as voice identity identification or speaker identification/recognition, refers to the scientific judgment, made through comparison and analysis, of the identity of the speaker whose voice is recorded in audio-visual material. In practical public security and judicial work, examiners usually need to examine case-related speech (such as recordings of extortion or threatening phone calls, or recordings of conversations between the parties in an economic dispute), analyze the identity of the speaker, and judge whether the case-related speech (the questioned sample) and the speech of a specific person (the reference sample) come from the same person. They then issue a scientific written conclusion, the voice identity expert opinion, which provides clues and direction for the investigation of the case and evidence for court proceedings.
Existing voice identity identification requires decomposing the audio-visual material into basic phonemes/syllables, and then comparing identical phonemes/syllables from different sources to judge whether the speakers are the same person.
Voiceprint identification falls mainly into two classes: speaker identification and speaker verification. The former judges which of several people spoke a given speech segment, and is a "one-of-many" selection problem; the latter confirms whether a given speech segment was spoken by a specified person, and is a "one-to-one" discrimination problem. For example, identification may be needed to narrow the scope of a criminal investigation, whereas verification is needed for bank transactions.
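The difference between the two problem types can be illustrated with a minimal sketch. The similarity scores, speaker names, and acceptance threshold below are hypothetical values chosen for illustration; the source does not specify how scores are computed or thresholded.

```python
from typing import Dict


def identify(scores: Dict[str, float]) -> str:
    """Speaker identification: choose the best-scoring candidate among several."""
    return max(scores, key=scores.get)


def verify(score: float, threshold: float) -> bool:
    """Speaker verification: accept or reject one claimed identity."""
    return score >= threshold


# Hypothetical similarity scores between a questioned recording and enrolled speakers.
scores = {"speaker_a": 0.41, "speaker_b": 0.87, "speaker_c": 0.52}

best = identify(scores)                                  # "one-of-many" selection
accepted = verify(scores["speaker_b"], threshold=0.80)   # "one-to-one" decision
```

Identification always returns one of the enrolled candidates, while verification can reject the claimed speaker outright; this is why the two are treated as separate problems.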
Whether for identification or verification, the speaker's voiceprint must first be modeled. Modeling requires extracting the phonemes of the target speaker from the audio-visual material. At present, extraction is mainly done either manually or purely by machine. Manual identification is highly accurate, but it requires considerable manpower, is time-consuming, and is inefficient.
Summary of the invention
An object of the present invention is to provide a phoneme marking method and device based on audio fingerprints that select only the spectrum poles for comparison, and thereby achieve fast marking by reducing the comparison time.
To achieve the above object, the present invention provides a phoneme marking method based on audio fingerprints, comprising the following steps:
extracting the audio fingerprint of the voice to be marked, and acquiring the voice spectrum pole information of that audio fingerprint;
comparing the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number;
judging whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked;
if so, determining the N retrieval phonemes as the marking phonemes of the voice to be marked.
Preferably, before extracting the audio fingerprint of the voice to be marked and acquiring the voice spectrum pole information of that audio fingerprint, the method further comprises:
preprocessing a voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked.
Preferably, extracting the audio fingerprint of the voice to be marked and acquiring the voice spectrum pole information of that audio fingerprint specifically comprises:
extracting the audio fingerprint of the voice to be marked, and acquiring the extreme value of each pole of the voice spectrum of that audio fingerprint, in the chronological order in which the poles occur.
Preferably, the step of judging whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked and, if so, determining the N retrieval phonemes as the marking phonemes of the voice to be marked, further comprises:
if not, setting N = N + 1, and then returning to the step of comparing the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values.
In another aspect, the present invention provides a phoneme marking device based on audio fingerprints, comprising:
a pole acquiring unit, configured to extract the audio fingerprint of the voice to be marked and acquire the voice spectrum pole information of that audio fingerprint;
a comparing unit, configured to compare the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number;
a judging unit, configured to judge whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked;
a phoneme marking unit, configured to determine the N retrieval phonemes as the marking phonemes of the voice to be marked if the judging result of the judging unit is yes.
Preferably, the device further comprises:
a preprocessing unit, configured to preprocess a voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked.
Preferably, the pole acquiring unit is specifically configured to:
extract the audio fingerprint of the voice to be marked, and acquire the extreme value of each pole of the voice spectrum of that audio fingerprint, in the chronological order in which the poles occur.
Preferably, the device further comprises:
a return execution unit, configured to set N = N + 1 if the judging result of the judging unit is no, and then return to the step of comparing the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values.
The beneficial effects of the present invention are as follows: a phoneme marking method and device based on audio fingerprints are provided in which audio fingerprints are compared using only the spectrum poles, which reduces the comparison time, avoids the influence of noise in the audio, and achieves fast marking.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the phoneme marking method based on audio fingerprints provided in embodiment one;
Fig. 2 is a structural block diagram of the phoneme marking device based on audio fingerprints provided in embodiment two.
Detailed description of the embodiments
To make the objects, features, and advantages of the present invention more obvious and easier to understand, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The embodiments described below are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the protection scope of the present invention.
Embodiment one
This embodiment provides a phoneme marking method based on audio fingerprints, which is suitable for phoneme marking application scenarios in the field of speech recognition and can improve marking efficiency. The method is executed by a phoneme marking device based on audio fingerprints, which is implemented by software and/or hardware.
Fig. 1 is a flow diagram of the phoneme marking method based on audio fingerprints provided in embodiment one.
Referring to Fig. 1, the phoneme marking method based on audio fingerprints includes the following steps:
S10: Preprocess the voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked.
Specifically, pre-emphasis boosts the high-frequency part of the speech in order to remove the influence of lip radiation and increase the high-frequency resolution of the speech. After the pre-emphasis digital filtering, the next step is windowing and framing. A voice signal is short-term stationary (within 10 to 30 ms it can be regarded as approximately unchanged), so it can be divided into short segments for processing; this is framing. Framing is implemented by weighting the signal with a movable window of finite length. The number of frames per second is generally about 33 to 100, depending on the situation. Framing generally uses overlapping segmentation; the overlapping part of one frame and the next is called the frame shift, and the ratio of frame shift to frame length is generally 0 to 0.5.
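The preprocessing of step S10 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the patented implementation: the pre-emphasis coefficient 0.97, the 25 ms frame length, and all function names are assumptions that do not appear in the source. Following the wording above, the overlap between consecutive frames is passed as a ratio of the frame length, and the step between frame starts is the frame length minus the overlap.

```python
import math
from typing import List


def pre_emphasize(signal: List[float], alpha: float = 0.97) -> List[float]:
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def hamming(length: int) -> List[float]:
    """Hamming window w[n] = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (length - 1)) for n in range(length)]


def frame_signal(signal: List[float], frame_len: int, overlap: int) -> List[List[float]]:
    """Split the signal into overlapping frames and weight each with a Hamming window."""
    if not 0 <= overlap < frame_len:
        raise ValueError("overlap must satisfy 0 <= overlap < frame_len")
    step = frame_len - overlap  # distance between the starts of consecutive frames
    window = hamming(frame_len)
    return [
        [signal[start + n] * window[n] for n in range(frame_len)]
        for start in range(0, len(signal) - frame_len + 1, step)
    ]


def preprocess(signal: List[float], sample_rate: int,
               frame_ms: float = 25.0, overlap_ratio: float = 0.5) -> List[List[float]]:
    """Pre-emphasis, framing, and Hamming windowing, in that order."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    overlap = int(frame_len * overlap_ratio)
    return frame_signal(pre_emphasize(signal), frame_len, overlap)
```

With one second of audio sampled at 8000 Hz and the default parameters, each frame has 200 samples and the function returns 79 frames, which falls inside the 33 to 100 frames per second range mentioned above.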
S20: Extract the audio fingerprint of the voice to be marked, and acquire the voice spectrum pole information of that audio fingerprint.
Generally, the abscissa of a voice spectrogram is time and the ordinate is frequency. The voice spectrogram of the voice to be marked contains a number of poles, and the extreme values of the poles, taken in the chronological order in which they occur, constitute the pole information. Specifically, step S20 is: extract the audio fingerprint of the voice to be marked, and acquire the extreme value of each pole of the voice spectrum of that audio fingerprint, in chronological order.
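One possible reading of step S20 is sketched below. The source does not define how a pole is located, so this sketch assumes that a pole is a local maximum of the magnitude spectrum of a frame, and it keeps only the strongest few per frame; the function names and the `max_poles` parameter are hypothetical. A direct discrete Fourier transform is used so that the example needs only the standard library; a real system would use an FFT.

```python
import cmath
import math
from typing import List, Tuple

Pole = Tuple[int, int, float]  # (frame index, frequency bin, magnitude)


def magnitude_spectrum(frame: List[float]) -> List[float]:
    """Magnitude of the DFT for bins 0 .. len(frame) // 2 (direct O(n^2) evaluation)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def frame_poles(spectrum: List[float], max_poles: int) -> List[Tuple[int, float]]:
    """Local maxima of one spectrum, limited to the max_poles strongest, ordered by bin."""
    peaks = [
        (k, spectrum[k])
        for k in range(1, len(spectrum) - 1)
        if spectrum[k] > spectrum[k - 1] and spectrum[k] > spectrum[k + 1]
    ]
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:max_poles]
    return sorted(strongest)


def pole_information(frames: List[List[float]], max_poles: int = 3) -> List[Pole]:
    """Extreme values of the poles, listed in the chronological order of their frames."""
    poles: List[Pole] = []
    for index, frame in enumerate(frames):
        for k, magnitude in frame_poles(magnitude_spectrum(frame), max_poles):
            poles.append((index, k, magnitude))
    return poles
```

Because only a handful of values per frame are kept, the fingerprint that is later compared against the database is much smaller than the full spectrogram, which is the source of the reduced comparison time claimed in the text.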
S30: Compare the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number.
Specifically, comparing the pole information of the voice to be marked with all audio fingerprints in the phoneme database achieves targeted extraction of the comparison information, without the extracted content having to be determined manually, and thereby achieves fast marking. Further, N is a preset value that can be defined and selected according to requirements such as retrieval precision and target retrieval time.
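The retrieval of step S30 can be sketched as a top-N selection over the database. The source does not define the matching value, so the Jaccard overlap between sets of (frame index, frequency bin) pairs used here is purely an assumption, as are the names `match_value` and `top_n` and the toy database.

```python
import heapq
from typing import Dict, FrozenSet, List, Tuple

PoleSet = FrozenSet[Tuple[int, int]]  # set of (frame index, frequency bin) pairs


def match_value(query: PoleSet, reference: PoleSet) -> float:
    """Assumed matching value: shared poles divided by all distinct poles (Jaccard)."""
    union = query | reference
    return len(query & reference) / len(union) if union else 0.0


def top_n(query: PoleSet, database: Dict[str, PoleSet], n: int) -> List[Tuple[str, float]]:
    """Return the n database phonemes with the highest matching values, best first."""
    scored = ((label, match_value(query, reference)) for label, reference in database.items())
    return heapq.nlargest(n, scored, key=lambda item: item[1])


# Toy phoneme database with hypothetical fingerprints.
database = {
    "a": frozenset({(0, 4), (1, 7)}),
    "o": frozenset({(0, 4), (1, 9)}),
    "i": frozenset({(0, 2), (1, 12)}),
}
query = frozenset({(0, 4), (1, 7)})
candidates = top_n(query, database, n=2)
```

Here `candidates` lists "a" first with a matching value of 1.0, followed by "o". A larger N trades retrieval time for a better chance that the correct phoneme is among the candidates, which matches the guidance above on choosing N.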
S40: Judge whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked. If so, determine the N retrieval phonemes as the marking phonemes of the voice to be marked; if not, set N = N + 1 and return to step S30.
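The loop formed by steps S30 and S40 can be sketched as follows. Two simplifications are assumptions of this sketch and not part of the source: the database ranking is computed once, so that "returning to S30 with N + 1" becomes taking a longer prefix of the ranked list, and the loop stops with an empty result once the whole list has been tried, since the source does not state a termination condition. The `is_consistent` callback is a hypothetical stand-in for the pronunciation judgment, which may be made by an examiner or by another component.

```python
from typing import Callable, List, Sequence


def mark_phoneme(ranked: Sequence[str],
                 is_consistent: Callable[[str], bool],
                 n: int = 1) -> List[str]:
    """Grow N until one of the top-N retrieval phonemes is judged consistent.

    ranked: database phonemes sorted by descending matching value.
    Returns the top-N retrieval phonemes, or [] if no candidate is ever consistent.
    """
    if n < 1:
        raise ValueError("N must be a natural number (N >= 1)")
    while n <= len(ranked):
        candidates = list(ranked[:n])                    # S30: top-N retrieval phonemes
        if any(is_consistent(c) for c in candidates):    # S40: pronunciation judgment
            return candidates
        n += 1                                           # S40 "no" branch: N = N + 1
    return []


ranked = ["o", "u", "a", "i"]
marked = mark_phoneme(ranked, lambda phoneme: phoneme == "a")
```

In this example the judgment fails for N = 1 and N = 2 and succeeds for N = 3, so `marked` holds the first three entries of `ranked`.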
The phoneme marking method based on audio fingerprints provided in this embodiment compares audio fingerprints using only the poles of the voice spectrum, which both reduces the comparison time and avoids the influence of noise in the audio, thereby achieving fast marking.
Embodiment two
The phoneme marking device based on audio fingerprints provided in this embodiment can be used to execute the phoneme marking method based on audio fingerprints provided in the embodiments of the present invention, and has the corresponding functions and beneficial effects.
Fig. 2 is a structural block diagram of the phoneme marking device based on audio fingerprints provided in embodiment two.
Referring to Fig. 2, a phoneme marking device based on audio fingerprints comprises:
a preprocessing unit 1, configured to preprocess a voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked;
a pole acquiring unit 2, configured to extract the audio fingerprint of the voice to be marked and acquire the voice spectrum pole information of that audio fingerprint; the pole acquiring unit 2 is specifically configured to extract the audio fingerprint of the voice to be marked and acquire the extreme value of each pole of the voice spectrum of that audio fingerprint, in the chronological order in which the poles occur;
a comparing unit 3, configured to compare the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number;
a judging unit 4, configured to judge whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked;
a phoneme marking unit 5, configured to determine the N retrieval phonemes as the marking phonemes of the voice to be marked if the judging result of the judging unit 4 is yes;
a return execution unit 6, configured to set N = N + 1 if the judging result of the judging unit 4 is no, and then return to the step of comparing the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values.
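When the units are realized as software functional units, the device can be sketched as a single object that wires them together. The class name, field names, signatures, and the `max_n` bound are all hypothetical; each unit is injected as a plain callable so that it can be replaced independently, and the bound exists only to guarantee termination in this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class PhonemeMarkingDevice:
    preprocess: Callable[[Sequence[float]], Any]   # preprocessing unit 1
    acquire_poles: Callable[[Any], Any]            # pole acquiring unit 2
    compare: Callable[[Any, int], List[str]]       # comparing unit 3: top-N phonemes
    judge: Callable[[List[str]], bool]             # judging unit 4
    max_n: int                                     # assumed upper bound on N

    def mark(self, signal: Sequence[float], n: int = 1) -> List[str]:
        poles = self.acquire_poles(self.preprocess(signal))
        while n <= self.max_n:
            candidates = self.compare(poles, n)
            if self.judge(candidates):
                return candidates                  # phoneme marking unit 5
            n += 1                                 # return execution unit 6
        return []


# Stub units, used only to show how the device is assembled.
device = PhonemeMarkingDevice(
    preprocess=lambda signal: list(signal),
    acquire_poles=lambda frames: tuple(frames),
    compare=lambda poles, n: ["o", "a", "i"][:n],
    judge=lambda candidates: "a" in candidates,
    max_n=3,
)
result = device.mark([0.1, 0.2, 0.3])
```

With these stubs the judgment fails for N = 1 and succeeds for N = 2, so `result` contains the two phonemes "o" and "a".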
In the embodiments provided in this application, it should be understood that the disclosed systems, units, devices, and methods may be implemented in other ways. For example, the embodiments described above are merely illustrative: the division into units or modules is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units, modules, and components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated in one processing unit, may each exist physically alone, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes a number of instructions that cause a terminal device (which may be a mobile phone, a notebook computer, or another electronic device) to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A phoneme marking method based on audio fingerprints, characterized by comprising the following steps:
extracting the audio fingerprint of the voice to be marked, and acquiring the voice spectrum pole information of that audio fingerprint;
comparing the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number;
judging whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked; if so, determining the N retrieval phonemes as the marking phonemes of the voice to be marked.
2. The phoneme marking method based on audio fingerprints according to claim 1, characterized in that, before extracting the audio fingerprint of the voice to be marked and acquiring the voice spectrum pole information of that audio fingerprint, the method further comprises:
preprocessing a voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked.
3. The phoneme marking method based on audio fingerprints according to claim 1, characterized in that extracting the audio fingerprint of the voice to be marked and acquiring the voice spectrum pole information of that audio fingerprint specifically comprises:
extracting the audio fingerprint of the voice to be marked, and acquiring the extreme value of each pole of the voice spectrum of that audio fingerprint, in the chronological order in which the poles occur.
4. The phoneme marking method based on audio fingerprints according to claim 1, characterized in that the step of judging whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked and, if so, determining the N retrieval phonemes as the marking phonemes of the voice to be marked, further comprises:
if not, setting N = N + 1, and then returning to the step of comparing the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values.
5. A phoneme marking device based on audio fingerprints, characterized by comprising:
a pole acquiring unit, configured to extract the audio fingerprint of the voice to be marked and acquire the voice spectrum pole information of that audio fingerprint;
a comparing unit, configured to compare the pole information with all audio fingerprints in a phoneme database to obtain the N retrieval phonemes with the highest matching values, where N is a natural number;
a judging unit, configured to judge whether, among the top N retrieval phonemes, there is a retrieval phoneme whose pronunciation is consistent with the pronunciation of the phoneme to be marked;
a phoneme marking unit, configured to determine the N retrieval phonemes as the marking phonemes of the voice to be marked if the judging result of the judging unit is yes.
6. The phoneme marking device based on audio fingerprints according to claim 5, characterized by further comprising:
a preprocessing unit, configured to preprocess a voice signal by pre-emphasis, framing, and Hamming windowing to obtain the voice to be marked.
7. The phoneme marking device based on audio fingerprints according to claim 5, characterized in that the pole acquiring unit is specifically configured to:
extract the audio fingerprint of the voice to be marked, and acquire the extreme value of each pole of the voice spectrum of that audio fingerprint, in the chronological order in which the poles occur.
8. The phoneme marking device based on audio fingerprints according to claim 5, characterized by further comprising:
a return execution unit, configured to set N = N + 1 if the judging result of the judging unit is no, and then return to the step of comparing the pole information with all audio fingerprints in the phoneme database to obtain the N retrieval phonemes with the highest matching values.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910086808.9A CN109817223A (en) | 2019-01-29 | 2019-01-29 | Phoneme marking method and device based on audio fingerprints |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910086808.9A CN109817223A (en) | 2019-01-29 | 2019-01-29 | Phoneme marking method and device based on audio fingerprints |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109817223A true CN109817223A (en) | 2019-05-28 |
Family
ID=66605748
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910086808.9A Pending CN109817223A (en) | 2019-01-29 | 2019-01-29 | Phoneme marking method and device based on audio fingerprints |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109817223A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110853676A (en) * | 2019-11-18 | 2020-02-28 | 广州国音智能科技有限公司 | Audio comparison method, device and equipment |
CN111640454A (en) * | 2020-05-13 | 2020-09-08 | 广州国音智能科技有限公司 | Spectrogram matching method, device and equipment and computer readable storage medium |
CN111782867A (en) * | 2020-05-20 | 2020-10-16 | 厦门快商通科技股份有限公司 | Voiceprint retrieval method, system, mobile terminal and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20170036404A (en) * | 2015-09-24 | 2017-04-03 | 국민대학교산학협력단 | Error analyzing method in a reading test and terminal device and system employing the same method |
CN106802960A (en) * | 2017-01-19 | 2017-06-06 | 湖南大学 | A kind of burst audio search method based on audio-frequency fingerprint |
CN107680601A (en) * | 2017-10-18 | 2018-02-09 | 深圳势必可赢科技有限公司 | A kind of identity homogeneity method of inspection retrieved based on sound spectrograph and phoneme and device |
CN108766417A (en) * | 2018-05-29 | 2018-11-06 | 广州国音科技有限公司 | A kind of the identity homogeneity method of inspection and device based on phoneme automatically retrieval |
Non-Patent Citations (1)
Title |
---|
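胡伟 (Hu Wei): "音频指纹技术及其在广播音乐版权中的应用" (Audio fingerprinting technology and its application in broadcast music copyright), 《电子科技大学》 (University of Electronic Science and Technology of China) * |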
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110853676A (en) * | 2019-11-18 | 2020-02-28 | 广州国音智能科技有限公司 | Audio comparison method, device and equipment |
CN111640454A (en) * | 2020-05-13 | 2020-09-08 | 广州国音智能科技有限公司 | Spectrogram matching method, device and equipment and computer readable storage medium |
CN111640454B (en) * | 2020-05-13 | 2023-08-11 | 广州国音智能科技有限公司 | Spectrogram matching method, device, equipment and computer readable storage medium |
CN111782867A (en) * | 2020-05-20 | 2020-10-16 | 厦门快商通科技股份有限公司 | Voiceprint retrieval method, system, mobile terminal and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110781916B (en) | Fraud detection method, apparatus, computer device and storage medium for video data | |
CN107274916B (en) | Method and device for operating audio/video file based on voiceprint information | |
CN107731233B (en) | Voiceprint recognition method based on RNN | |
CN112289323B (en) | Voice data processing method and device, computer equipment and storage medium | |
CN110473566A (en) | Audio separation method, device, electronic equipment and computer readable storage medium | |
CN111145786A (en) | Speech emotion recognition method and device, server and computer readable storage medium | |
CN109714608B (en) | Video data processing method, video data processing device, computer equipment and storage medium | |
CN108305616A (en) | A kind of audio scene recognition method and device based on long feature extraction in short-term | |
CN111243619B (en) | Training method and device for speech signal segmentation model and computer equipment | |
CN112466287B (en) | Voice segmentation method, device and computer readable storage medium | |
CN112712809B (en) | Voice detection method and device, electronic equipment and storage medium | |
CN110136726A (en) | A kind of estimation method, device, system and the storage medium of voice gender | |
CN110797032A (en) | Voiceprint database establishing method and voiceprint identification method | |
CN108735222A (en) | A kind of vocal print identification method and system based on Application on Voiceprint Recognition | |
CN109817223A (en) | Phoneme marking method and device based on audio fingerprints | |
CN108665901B (en) | Phoneme/syllable extraction method and device | |
CN109872714A (en) | A kind of method, electronic equipment and storage medium improving accuracy of speech recognition | |
CN113409774A (en) | Voice recognition method and device and electronic equipment | |
CN113744742B (en) | Role identification method, device and system under dialogue scene | |
Birla | A robust unsupervised pattern discovery and clustering of speech signals | |
CN110782902A (en) | Audio data determination method, apparatus, device and medium | |
CN114694637A (en) | Hybrid speech recognition method, device, electronic device and storage medium | |
CN112599114B (en) | Voice recognition method and device | |
CN111985231B (en) | Unsupervised role recognition method and device, electronic equipment and storage medium | |
Shah et al. | A robust approach for speaker identification using dialect information |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190528 |