CN105719662B - Dysarthria detection method and system - Google Patents
Dysarthria detection method and system
- Publication number: CN105719662B
- Application number: CN201610264854.XA
- Authority
- CN
- China
- Prior art keywords
- pronunciation
- motion track
- track information
- voice
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/66—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition
Abstract
The present invention relates to a dysarthria detection method and system. The method includes the following steps: reading the data generated by an electromagnetic articulograph to obtain audio data and its synchronized motion-coordinate data; extracting, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation; performing a feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in a reference speech library to obtain a likelihood probability value, the reference speech library being a speech database containing the normal pronunciation of each word; and detecting dysarthria in the user according to the likelihood probability value. Because the provided method and system exploit each word's pronunciation in the data together with its corresponding sub-trajectory, the accuracy of the detection result is improved.
Description
Technical field
The present invention relates to the field of speech processing technology, and in particular to a dysarthria detection method and system.
Background technique
At present, research on dysarthria detection technology is at an early stage of development. In hospital rehabilitation departments today, patients with dysarthria are diagnosed mainly through the clinician's diagnostic experience and subjective auditory perception, which is time-consuming and laborious and neither objective nor stable enough. Moreover, diagnosis with radiographic imaging or magnetic-resonance medical equipment can harm the patient's body and incurs expensive instrument costs. Existing assessment approaches for dysarthria mainly include graphical methods, phonetic-transcription methods, standardized test batteries, instrumented examination, and the like. These detection schemes are chiefly concerned with speech-intelligibility assessment, diadochokinetic-rate assessment, nasal-airflow detection, and the like, so the accuracy of the detection result is easily compromised.
Summary of the invention
In view of the problem that traditional schemes easily compromise the accuracy of dysarthria detection, it is necessary to provide a dysarthria detection method and system.
A dysarthria detection method includes the following steps:
reading the voice data generated by an electromagnetic articulograph, and obtaining from the voice data audio data and its corresponding motion-trajectory information; wherein the sensors of the electromagnetic articulograph are mounted at the user's articulation positions, and the voice data are the data the electromagnetic articulograph acquires at the sensed positions while the user pronounces set words;
extracting, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation;
performing a feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in a reference speech library to obtain a likelihood probability value; wherein the reference speech library is a speech database containing the normal pronunciation of each word;
detecting dysarthria in the user according to the likelihood probability value.
A dysarthria detection system includes:
a reading module, configured to read the voice data generated by the electromagnetic articulograph and obtain from the voice data the audio data and its corresponding motion-trajectory information; wherein the sensors of the electromagnetic articulograph are mounted at the user's articulation positions, and the voice data are the data the electromagnetic articulograph acquires at the sensed positions while the user pronounces set words;
an extraction module, configured to extract, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation;
an obtaining module, configured to perform the feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain the likelihood probability value; wherein the reference speech library is a speech database containing the normal pronunciation of each word;
a detection module, configured to detect dysarthria in the user according to the likelihood probability value.
With the above dysarthria detection method and system, the voice data generated by the electromagnetic articulograph are read, and the sub-trajectory corresponding to each word's pronunciation is extracted from the motion-trajectory information; a feature operation is performed between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain a likelihood probability value, thereby detecting dysarthria in the user. Because the scheme exploits each word's pronunciation in the data together with its corresponding sub-trajectory, the accuracy of the detection result is improved.
Brief description of the drawings
Fig. 1 is a flowchart of the dysarthria detection method of one embodiment;
Fig. 2 is a schematic diagram of sensor placement in one embodiment;
Fig. 3 is a schematic probability distribution of one embodiment;
Fig. 4 is a structural diagram of the dysarthria detection system of one embodiment.
Detailed description of the embodiments
The specific embodiments of the dysarthria detection method and system of the invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, which shows a flowchart of the dysarthria detection method of one embodiment, the method includes the following steps:
S10: read the voice data generated by the electromagnetic articulograph, and obtain from the voice data the audio data and its corresponding motion-trajectory information; the sensors of the electromagnetic articulograph are mounted at the user's articulation positions, and the voice data are the data the electromagnetic articulograph acquires at the sensed positions while the user pronounces the set words.
The speech-research software installed with the electromagnetic articulograph is a non-line-of-sight motion-capture system. Through it, the electromagnetic articulograph records two synchronized files: the audio data in wav format and the synchronized motion-trajectory information in tsv format. The sensors of the electromagnetic articulograph can be mounted at the user's articulation positions, including the positions of the user's vocal organs, and acquire data while the user pronounces the set words. The set words can be one or more of the words whose normal pronunciations are contained in the reference speech library.
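As a rough illustration of loading the synchronized trajectory file, the sketch below parses a tab-separated export. The column layout assumed here (timestamp in seconds first, then sensor coordinates) is an assumption for illustration, not the instrument's documented format.

```python
import csv
import io

def parse_ema_tsv(text):
    """Parse a trajectory export: one row per sample, first column a
    timestamp in seconds, remaining columns sensor coordinates.
    The column layout is assumed for illustration only."""
    samples = []
    for rec in csv.reader(io.StringIO(text), delimiter="\t"):
        if not rec or rec[0].startswith("#"):  # skip blank/comment rows
            continue
        samples.append((float(rec[0]), [float(v) for v in rec[1:]]))
    return samples

# Two fabricated samples, 4 ms apart.
samples = parse_ema_tsv("0.000\t1.0\t2.0\n0.004\t1.1\t2.1\n")
```

In practice the audio half of the pair would be read with an ordinary wav reader, and the two streams would share the same time axis.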
In one embodiment, the reference sensor of the electromagnetic articulograph can be attached at the user's glabella (between the eyebrows), and the six micro-sensors of the electromagnetic articulograph are then attached in turn to the back of the user's tongue surface, the front of the tongue surface, the tongue tip, the lower gums, the upper lip, and the lower lip.
The attachment order of the sensors may be as follows: the reference sensor is attached at the glabella first, followed by the oral sensors. Concretely, the six micro-sensors are attached in turn to the back of the tongue surface, the front of the tongue surface, the tongue tip, the lower gums, the upper lip, and the lower lip; the attachment positions of the six micro-sensors can be as shown in Fig. 2. When attaching a sensor, edible medical quick-drying oral adhesive is first applied to the bonding surface of each intraoral micro-sensor; the mouth must first be cleaned and the tongue surface dried with gauze so that the micro-sensor can be glued with the quick-drying adhesive to the corresponding intraoral position. Note that when the three micro-sensors are attached on the tongue surface, the spacing between them should be about 10 mm (millimeters). Because the bonding force of the quick-drying adhesive is relatively weak, the intraoral micro-sensors then need to be further secured with oral compound adhesive. Since a micro-sensor's lead wire is very thin, easily torn, and of a certain length, the wires must also be secured after the micro-sensors are attached. So that the subject can adapt to speaking with micro-sensors in the mouth, before data acquisition the user may first practice speaking with the micro-sensors in place; the formal acquisition of voice data begins once the user feels adapted.
S20: extract, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation.
In this step, the audio data and the coordinate data of the motion-trajectory information are synchronized signals; the speech signal can therefore be aligned paragraph by paragraph to obtain the start and end time of each word's pronunciation, and the corresponding coordinate data can then be segmented at the same synchronized times.
In one embodiment, the step of extracting from the motion-trajectory information, according to the audio data, the sub-trajectory corresponding to each word's pronunciation may include:
segmenting the audio data to obtain the start and end time, within the audio data, of each word pronounced in the voice data;
synchronizing the audio data with the motion-trajectory information to obtain the sub-trajectory corresponding to each word's pronunciation.
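The two steps above amount to slicing the synchronized trajectory with the per-word start/end times found in the audio. A minimal sketch, assuming the (time, coordinates) sample representation used above:

```python
def slice_trajectory(samples, start, end):
    """Return the sub-trajectory whose timestamps fall within
    [start, end) — the word's start and end times obtained by
    segmenting the synchronized audio."""
    return [(t, c) for t, c in samples if start <= t < end]

# Fabricated 4-sample trajectory; extract the word spanning 0.01–0.03 s.
traj = [(0.00, [1.0]), (0.01, [1.1]), (0.02, [1.2]), (0.03, [1.3])]
word_traj = slice_trajectory(traj, 0.01, 0.03)
```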
This embodiment can segment the audio data using Mel-frequency cepstral coefficients (Mel Frequency Cepstrum Coefficient, MFCC) as the speech-segmentation feature, align the speech signal with the DTW (Dynamic Time Warping) alignment algorithm, and compare speech signals for similarity with a Gaussian mixture model (GMM), thereby achieving automatic segmentation and alignment of the speech. While the segmentation and alignment of the speech signal are completed with this speech-recognition technique, the recognizer also produces a likelihood score for the speech signal; this score can serve as a measure of the audio data's intelligibility and can be used to judge whether the audio data need manual segmentation and alignment.
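As a hedged sketch of the alignment step alone, the classic DTW recurrence over two 1-D feature sequences is shown below (per-frame scalar features for brevity; the system described here would align MFCC vectors):

```python
def dtw_distance(a, b):
    """Dynamic-time-warping distance between two 1-D feature
    sequences, using absolute difference as the local cost and the
    standard (match / insert / delete) recurrence."""
    INF = float("inf")
    n, m = len(a), len(b)
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]
```

A distance of zero means one sequence is a time-warped copy of the other, which is exactly the invariance wanted when the same word is uttered at different speeds.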
As one embodiment, after the sub-trajectory corresponding to each word's pronunciation is obtained, the method may further include:
obtaining the likelihood score corresponding to each word pronounced in the voice data;
when the likelihood score is below a preset likelihood threshold, obtaining the sub-trajectory corresponding to each pronunciation with a manual audio-annotation tool.
The likelihood threshold can be set according to the normal-pronunciation features of the words in the segmented audio data. A likelihood score below the preset threshold may indicate that the user's voice data acquired by the electromagnetic articulograph are too unclear for automatic paragraph alignment by speech-recognition technology; such audio data must be aligned manually. The manual audio-annotation tool Praat can further be used to obtain the sub-trajectory corresponding to each pronunciation.
As one embodiment, the basic steps of annotating speech with Praat may include:
Create a new annotation object: select the audio data to annotate in the object list, click "To TextGrid..." under "Annotate-", enter and confirm the names of the tiers to annotate in the new window, then select the TextGrid object and click the "Edit" button to enter the editing page.
Save the annotation file: because Praat does not save automatically, the annotation object must be saved promptly to avoid losing the annotated content.
Extract a required tier from the annotation object: with the TextGrid object selected, click the "Extract tier..." button, enter and confirm the required tier number in the new window, select the newly created object and click the "Into TextGrid" button, then select the newly generated object and click "Edit" to inspect the extracted tier data.
Extract a required fragment from the annotation object: with the TextGrid object selected, click the "Extract part..." button, enter the start and end time of the required fragment in the new window, check "Preserve times" and confirm, then select the newly created object and click "Edit" to inspect the extracted data.
Query the data in a TextGrid object: this is done with the sub-options of the Query- menu.
Obtain the data in the annotation files: a script can extract all the TextGrid files and save the extracted data to text files; the resulting data can then be imported into an Excel table for further analysis and processing.
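The scripted batch extraction can be approximated in a few lines. The sketch below pulls (xmin, xmax, label) triples out of a long-format TextGrid with a regular expression, ignoring tier structure for brevity — enough to recover the per-word start and end times used for segmentation. It is a minimal stand-in, not Praat's own scripting language.

```python
import re

# An interval in a long-format TextGrid is an xmin/xmax/text triple;
# tier headers lack the trailing `text = "..."` line and so don't match.
_INTERVAL = re.compile(
    r'xmin = ([\d.]+)\s*\n\s*xmax = ([\d.]+)\s*\n\s*text = "([^"]*)"')

def read_intervals(textgrid_text):
    """Extract (xmin, xmax, label) triples from long-format TextGrid text."""
    return [(float(a), float(b), label)
            for a, b, label in _INTERVAL.findall(textgrid_text)]

# A fabricated two-interval fragment of a word tier.
tg = ('intervals [1]:\n    xmin = 0\n    xmax = 0.5\n    text = "ba"\n'
      'intervals [2]:\n    xmin = 0.5\n    xmax = 1.1\n    text = "pa"\n')
words = read_intervals(tg)
```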
S30: perform a feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain a likelihood probability value; the reference speech library is a speech database containing the normal pronunciation of each word.
This step can use a MATLAB GUI to display the sub-trajectory and the corresponding reference trajectory dynamically, intuitively showing the motion of the oral-organ points while the user pronounces, and then perform the corresponding feature operation to obtain the likelihood probability value of the user's word pronunciation; the lower this value, the more severe the user's dysarthria. The reference speech library stores in advance the normal pronunciations of all the words used to test whether the user's articulation is normal, and can be used to detect whether the user's pronunciation of the corresponding words is normal.
In one embodiment, the step of performing the feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain the likelihood probability value may include:
obtaining the coordinate-point sequences of the sub-trajectory and of its corresponding reference trajectory, yielding a speech coordinate sequence and a reference coordinate sequence; wherein the coordinate points of the speech coordinate sequence and of the reference coordinate sequence correspond one to one;
normalizing the speech coordinate sequence and the reference coordinate sequence separately;
fitting the normalized speech coordinate sequence to a speech coordinate curve and obtaining the speech fitting coefficients of each order of the speech coordinate curve;
fitting the normalized reference coordinate sequence to a reference coordinate curve and obtaining the reference fitting coefficients of each order of the reference coordinate curve;
obtaining the likelihood probability value from the speech fitting coefficients and the reference fitting coefficients.
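The normalization and curve-fitting steps above can be sketched in pure Python. The [0, 1] amplitude rescaling is one reasonable reading of the normalization described here, and `polyfit` solves the least-squares normal equations by Gaussian elimination as a stand-in for a library routine such as numpy.polyfit:

```python
def normalize(seq):
    """Rescale a coordinate sequence to the [0, 1] amplitude range
    (an assumed form of the normalization described in the text)."""
    lo, hi = min(seq), max(seq)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in seq]

def polyfit(xs, ys, deg):
    """Least-squares polynomial fit via the normal equations; returns
    coefficients c[0] + c[1]*x + ... + c[deg]*x**deg."""
    n = deg + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):                       # elimination w/ pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n                            # back-substitution
    for i in range(n - 1, -1, -1):
        coef[i] = (b[i] - sum(A[i][j] * coef[j]
                              for j in range(i + 1, n))) / A[i][i]
    return coef

# Fit y = 1 + 2x exactly from four points.
coeffs = polyfit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], 1)
```

The fitted coefficients of each order are exactly the "speech fitting coefficients" / "reference fitting coefficients" the feature operation compares.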
In this embodiment, the coordinate-point sequence of the sub-trajectory is obtained using the paragraph-alignment result of the speech, and the time and amplitude of the trajectory corresponding to the sub-trajectory are normalized. The normalized speech coordinate curve and reference coordinate curve can each be written as a polynomial: the speech fitting coefficients of each order are the coefficients of each order of the polynomial corresponding to the speech coordinate curve, and the reference fitting coefficients of each order are the coefficients of each order of the polynomial corresponding to the reference coordinate curve.
As one embodiment, the step of obtaining the likelihood probability value from the speech fitting coefficients and the reference fitting coefficients may include:
building a multivariate Gaussian probability-density model from the reference fitting coefficients;
substituting the speech fitting coefficients into the multivariate Gaussian probability-density model to obtain the likelihood probability value of the speech fitting coefficients.
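A minimal numerical sketch of these two steps, simplifying the multivariate model to a diagonal-covariance (independent-dimension) Gaussian for clarity — the full model would use a covariance matrix:

```python
import math

def fit_diag_gaussian(rows):
    """Per-dimension mean and variance estimated from rows of
    reference fitting coefficients (one row per reference speaker)."""
    n, dims = len(rows), len(rows[0])
    mean = [sum(r[d] for r in rows) / n for d in range(dims)]
    var = [sum((r[d] - mean[d]) ** 2 for r in rows) / n or 1e-9
           for d in range(dims)]
    return mean, var

def gaussian_likelihood(x, mean, var):
    """Density of coefficient vector x under the diagonal Gaussian —
    the likelihood probability value of the speech fitting coefficients."""
    p = 1.0
    for xi, mu, v in zip(x, mean, var):
        p *= math.exp(-(xi - mu) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return p

# One-dimensional toy reference set; evaluate at its mean.
mean, var = fit_diag_gaussian([[0.0], [2.0]])
p = gaussian_likelihood([1.0], mean, var)
```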
S40: detect dysarthria in the user according to the likelihood probability value.
In one embodiment, the process of detecting dysarthria in the user according to the likelihood probability value may include:
judging the magnitude relation between the likelihood probability value and a preset articulation threshold;
if the likelihood probability value is below the preset articulation threshold, determining that the user has dysarthria.
The preset articulation threshold can be set according to the features of the specific word; when the likelihood probability value falls below the articulation threshold, the user is judged to have dysarthria. For example, a multivariate Gaussian probability-density model is built from the reference coefficients, and the probabilities obtained by substituting the user's speech coefficients into the model are as shown in Fig. 3, where the abscissa is the sensor index and the ordinate the probability value; if the probability obtained by substituting a speech coefficient of the user into the multivariate Gaussian probability-density model is below the articulation threshold, the user is judged to have dysarthria.
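The final decision then reduces to a per-word threshold comparison; a trivial sketch (the word labels and threshold values below are invented for illustration):

```python
def detect_dysarthria(word_likelihoods, thresholds):
    """Compare each word's likelihood probability value against its
    word-specific articulation threshold; True marks a word whose
    pronunciation is judged dysarthric."""
    return {w: p < thresholds[w] for w, p in word_likelihoods.items()}

flags = detect_dysarthria({"ba": 0.02, "pa": 0.40},
                          {"ba": 0.10, "pa": 0.10})
```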
With the dysarthria detection method provided by the invention, the audio data generated by the electromagnetic articulograph and its synchronized motion-trajectory information are read, and the sub-trajectory corresponding to each word's pronunciation is extracted from the motion-trajectory information; a feature operation is performed between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain a likelihood probability value, thereby detecting dysarthria in the user. Because the scheme exploits each word's pronunciation in the data together with its corresponding sub-trajectory, the accuracy of the detection result is improved.
Referring to Fig. 4, which shows the structure of the dysarthria detection system of one embodiment, the system includes:
a reading module 10, configured to read the voice data generated by the electromagnetic articulograph and obtain from the voice data the audio data and its corresponding motion-trajectory information; wherein the sensors of the electromagnetic articulograph are mounted at the user's articulation positions, and the voice data are the data the electromagnetic articulograph acquires at the sensed positions while the user pronounces the set words;
an extraction module 20, configured to extract, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation;
an obtaining module 30, configured to perform the feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain the likelihood probability value; wherein the reference speech library is a speech database containing the normal pronunciation of each word;
a detection module 40, configured to detect dysarthria in the user according to the likelihood probability value.
In one embodiment, the extraction module can further be configured to:
segment the audio data to obtain the start and end time, within the audio data, of each word pronounced in the voice data;
synchronize the audio data with the motion-trajectory information to obtain the sub-trajectory corresponding to each word's pronunciation.
As one embodiment, the extraction module can further be configured to:
obtain the likelihood score corresponding to each word pronounced in the voice data;
when the likelihood score is below the preset likelihood threshold, obtain the sub-trajectory corresponding to each pronunciation with a manual audio-annotation tool.
The dysarthria detection system provided by the invention corresponds one to one with the dysarthria detection method provided by the invention; the technical features described in the embodiments of the dysarthria detection method, and their advantages, apply equally to the embodiments of the dysarthria detection system, which is hereby declared.
The technical features of the embodiments described above can be combined arbitrarily. For brevity, not every possible combination of these technical features has been described; nevertheless, as long as a combination involves no contradiction, it should be considered within the scope of this specification.
The embodiments described above express only several implementations of the invention, and although their description is specific and detailed, they are not therefore to be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the invention, and these all belong to the scope of protection of the invention. Therefore, the scope of protection of this patent shall be subject to the appended claims.
Claims (10)
1. A dysarthria detection method, characterized by comprising the following steps:
reading the voice data generated by an electromagnetic articulograph, and obtaining from the voice data audio data and its corresponding motion-trajectory information; wherein the sensors of the electromagnetic articulograph are mounted at the user's articulation positions, the voice data are the data the electromagnetic articulograph acquires at the sensed positions while the user pronounces set words, and the set words are one or more of the words whose normal pronunciations are contained in a reference speech library;
extracting, from the motion-trajectory information according to the audio data, the sub-trajectory corresponding to each word's pronunciation;
performing a feature operation between the sub-trajectory and the reference trajectory of the corresponding word's pronunciation in the reference speech library to obtain a likelihood probability value; wherein the reference speech library is a speech database containing the normal pronunciation of each word;
detecting dysarthria in the user according to the likelihood probability value.
2. The dysarthria detection method according to claim 1, characterized in that the reference sensor of the electromagnetic articulograph is attached at the user's glabella, and the six micro-sensors of the electromagnetic articulograph are attached in turn to the back of the user's tongue surface, the front of the tongue surface, the tongue tip, the lower gums, the upper lip, and the lower lip.
3. The dysarthria detection method according to claim 1, characterized in that the step of extracting from the motion-trajectory information, according to the audio data, the sub-trajectory corresponding to each word's pronunciation comprises:
segmenting the audio data to obtain the start and end time, within the audio data, of each word pronounced in the voice data;
synchronizing the audio data with the motion-trajectory information to obtain the sub-trajectory corresponding to each word's pronunciation.
4. The dysarthria detection method according to claim 3, characterized in that, after the sub-trajectory corresponding to each word's pronunciation is obtained, the method further comprises:
obtaining the likelihood score corresponding to each word pronounced in the voice data;
when the likelihood score is below a preset likelihood threshold, obtaining the sub-trajectory corresponding to each pronunciation with a manual audio-annotation tool.
5. The dysarthria detection method according to claim 1, wherein the step of performing a feature operation on the sub-motion-track information and the reference motion track information corresponding to each word's pronunciation in the reference voice library, to obtain a likelihood probability value, comprises:
obtaining the coordinate point sequences of the sub-motion-track information and of its corresponding reference motion track information, yielding a voice coordinate sequence and a reference coordinate sequence, wherein the coordinate points in the voice coordinate sequence and the reference coordinate sequence correspond one to one;
normalizing the voice coordinate sequence and the reference coordinate sequence respectively;
fitting the normalized voice coordinate sequence to a voice coordinate curve, and obtaining the voice fitting coefficient of each order of the voice coordinate curve;
fitting the normalized reference coordinate sequence to a reference coordinate curve, and obtaining the reference fitting coefficient of each order of the reference coordinate curve;
obtaining the likelihood probability value according to the voice fitting coefficients and the reference fitting coefficients.
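A minimal sketch of the normalization and fitting steps, assuming min-max normalization and a polynomial curve family; the claim does not fix the curve type or order, so `numpy.polyfit` and order 4 are illustrative choices:

```python
import numpy as np

def fit_track(coords, order=4):
    """Normalize a 1-D coordinate sequence to [0, 1] and fit a polynomial,
    returning the fitting coefficient of each order (order + 1 values)."""
    coords = np.asarray(coords, dtype=float)
    lo, hi = coords.min(), coords.max()
    norm = (coords - lo) / (hi - lo) if hi > lo else np.zeros_like(coords)
    # Fit against a normalized time axis so sequences of different
    # lengths yield comparable coefficients.
    t = np.linspace(0.0, 1.0, len(norm))
    return np.polyfit(t, norm, order)

# Voice and reference sequences have one-to-one corresponding points,
# so they are assumed to be resampled to the same length before fitting.
voice = [1.0, 1.4, 2.1, 2.6, 2.4, 1.9, 1.2, 1.0]
reference = [1.0, 1.5, 2.2, 2.7, 2.5, 1.8, 1.1, 1.0]
voice_coef = fit_track(voice)
ref_coef = fit_track(reference)
print(voice_coef.shape)  # (5,): an order-4 fit yields 5 coefficients
```

Each coordinate dimension of each sensor would be fitted separately, and the resulting coefficients concatenated into one feature vector per word.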
6. The dysarthria detection method according to claim 5, wherein the step of obtaining the likelihood probability value according to the voice fitting coefficients and the reference fitting coefficients comprises:
establishing a multivariate Gaussian probability density distribution model according to the reference fitting coefficients;
substituting the voice fitting coefficients into the multivariate Gaussian probability density distribution model to obtain the likelihood probability value of the voice fitting coefficients.
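The model of claim 6 can be sketched as below, assuming the reference fitting coefficients come as one vector per normal-speech recording; the diagonal regularization term is an added assumption to keep the covariance invertible with few samples:

```python
import numpy as np

def gaussian_likelihood(ref_coef_rows, voice_coef):
    """Fit a multivariate Gaussian to rows of reference fitting
    coefficients and evaluate its density at the voice coefficients."""
    ref = np.asarray(ref_coef_rows, dtype=float)
    x = np.asarray(voice_coef, dtype=float)
    mu = ref.mean(axis=0)
    # Small diagonal load keeps the covariance invertible with few samples.
    cov = np.cov(ref, rowvar=False) + 1e-6 * np.eye(ref.shape[1])
    diff = x - mu
    k = len(x)
    norm = np.sqrt((2 * np.pi) ** k * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff) / norm)

ref_rows = [[0.9, 0.1, 0.5], [1.1, -0.1, 0.4], [1.0, 0.0, 0.6]]
near = gaussian_likelihood(ref_rows, [1.0, 0.0, 0.5])   # at the reference mean
far = gaussian_likelihood(ref_rows, [3.0, 2.0, -1.0])   # far from normal speech
```

Coefficients near the normal-speech distribution receive a high density value, so `near` exceeds `far`; the detection step then thresholds this value.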
7. The dysarthria detection method according to claim 1, wherein the process of performing dysarthria detection on the user according to the likelihood probability value comprises:
judging the magnitude relation between the likelihood probability value and a preset articulation threshold;
if the likelihood probability value is less than the preset articulation threshold, determining that the user has dysarthria.
8. A dysarthria detection system, comprising:
a read module for reading the voice data generated by an electromagnetic articulograph and obtaining, from the voice data, audio data and its corresponding motion track information; wherein the sensors of the electromagnetic articulograph are mounted at the articulation positions of a user, the voice data is the data sensed by the electromagnetic articulograph at those positions while the user pronounces the set words, and the set words are one or more words whose normal pronunciations are included in a reference voice library;
an extraction module for extracting, from the motion track information according to the audio data, the sub-motion-track information corresponding to each word's pronunciation;
an obtaining module for performing a feature operation on the sub-motion-track information and the reference motion track information corresponding to each word's pronunciation in the reference voice library to obtain a likelihood probability value; wherein the reference voice library is a speech database containing the normal pronunciation of each word;
a detection module for performing dysarthria detection on the user according to the likelihood probability value.
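The four claimed modules can be wired together as in this sketch; the distance-based score stands in for the curve-fitting feature operation of claims 5 and 6, and the class name, threshold value, and data shapes are illustrative assumptions rather than details from the patent:

```python
class DysarthriaDetector:
    """Sketch of the claimed system: per-word sub-tracks are scored
    against a reference library and the worst score is thresholded."""

    def __init__(self, reference_library, threshold):
        self.reference_library = reference_library   # word -> reference track
        self.threshold = threshold                   # preset articulation threshold

    def score(self, word, sub_track):
        # Stand-in for the feature operation: mean squared distance to the
        # reference track, mapped so that higher means closer to normal.
        ref = self.reference_library[word]
        mse = sum((a - b) ** 2 for a, b in zip(sub_track, ref)) / len(ref)
        return 1.0 / (1.0 + mse)

    def detect(self, sub_tracks):
        # Claim 7: report dysarthria if any word's score falls below threshold.
        worst = min(self.score(w, t) for w, t in sub_tracks.items())
        return worst < self.threshold

library = {"ba": [0.0, 0.5, 1.0, 0.5, 0.0]}
detector = DysarthriaDetector(library, threshold=0.8)
print(detector.detect({"ba": [0.0, 0.5, 1.0, 0.5, 0.0]}))  # False: matches reference
print(detector.detect({"ba": [0.9, 0.1, 0.2, 0.9, 0.8]}))  # True: deviates strongly
```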
9. The dysarthria detection system according to claim 8, wherein the extraction module is further configured to:
segment the audio data to obtain the start and end times, within the audio data, of each word pronounced in the voice data; and
synchronize the audio data with the motion track information to obtain the sub-motion-track information corresponding to each word's pronunciation.
10. The dysarthria detection system according to claim 9, wherein the extraction module is further configured to:
obtain the likelihood score corresponding to each word pronounced in the voice data; and
when the likelihood score is below a preset likelihood threshold, obtain the sub-motion-track information corresponding to each pronunciation using a manual audio-annotation tool.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610264854.XA CN105719662B (en) | 2016-04-25 | 2016-04-25 | Dysarthrosis detection method and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105719662A CN105719662A (en) | 2016-06-29 |
CN105719662B true CN105719662B (en) | 2019-10-25 |
Family
ID=56161689
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610264854.XA Active CN105719662B (en) | 2016-04-25 | 2016-04-25 | Dysarthrosis detection method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105719662B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107452370A (en) * | 2017-07-18 | 2017-12-08 | 太原理工大学 | A kind of application method of the judgment means of Chinese vowel followed by a nasal consonant dysphonia patient |
CN109360645B (en) * | 2018-08-01 | 2021-06-11 | 太原理工大学 | Statistical classification method for dysarthria pronunciation and movement abnormal distribution |
CN111276130A (en) * | 2020-01-21 | 2020-06-12 | 河南优德医疗设备股份有限公司 | MFCC cepstrum coefficient calculation method for computer language knowledge education system |
CN113496696A (en) * | 2020-04-03 | 2021-10-12 | 中国科学院深圳先进技术研究院 | Speech function automatic evaluation system and method based on voice recognition |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2399931A (en) * | 2003-03-28 | 2004-09-29 | Barnsley Distr General Hospita | Assistive technology |
CN103337247A (en) * | 2013-06-17 | 2013-10-02 | 天津大学 | Data annotation analysis system for electromagnetic pronunciation recorder |
CN103337055A (en) * | 2013-06-24 | 2013-10-02 | 暨南大学 | Deblurring method for text image based on gradient fitting |
CN103383845A (en) * | 2013-07-08 | 2013-11-06 | 上海昭鸣投资管理有限责任公司 | Multi-dimensional dysarthria measuring system and method based on real-time vocal tract shape correction |
CN103405217A (en) * | 2013-07-08 | 2013-11-27 | 上海昭鸣投资管理有限责任公司 | System and method for multi-dimensional measurement of dysarthria based on real-time articulation modeling technology |
CN103705218A (en) * | 2013-12-20 | 2014-04-09 | 中国科学院深圳先进技术研究院 | Dysarthria identifying method, system and device |
CN104123934A (en) * | 2014-07-23 | 2014-10-29 | 泰亿格电子(上海)有限公司 | Speech composition recognition method and system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8275624B2 (en) * | 2008-10-16 | 2012-09-25 | Thomas David Kehoe | Electronic speech aid and method for use thereof to treat hypokinetic dysarthria |
US20140163070A1 (en) * | 2012-05-17 | 2014-06-12 | Bruce Roseman | Treatment for cerebral palsy impaired speech in children |
US9911358B2 (en) * | 2013-05-20 | 2018-03-06 | Georgia Tech Research Corporation | Wireless real-time tongue tracking for speech impairment diagnosis, speech therapy with audiovisual biofeedback, and silent speech interfaces |
Also Published As
Publication number | Publication date |
---|---|
CN105719662A (en) | 2016-06-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Klein et al. | A multidimensional investigation of children's/r/productions: Perceptual, ultrasound, and acoustic measures | |
Ramanarayanan et al. | An investigation of articulatory setting using real-time magnetic resonance imaging | |
Wiget et al. | How stable are acoustic metrics of contrastive speech rhythm? | |
JP4725948B2 (en) | System and method for synchronizing text display and audio playback | |
Gobl et al. | 11 voice source variation and its communicative functions | |
CN105719662B (en) | Dysarthrosis detection method and system | |
Lawson et al. | The role of gesture delay in coda/r/weakening: An articulatory, auditory and acoustic study | |
CN106782603B (en) | Intelligent voice evaluation method and system | |
CN104252872B (en) | Lyric generating method and intelligent terminal | |
Bombien et al. | Articulatory overlap as a function of voicing in French and German consonant clusters | |
Carignan | Using ultrasound and nasalance to separate oral and nasal contributions to formant frequencies of nasalized vowels | |
Gallagher | Vowel height allophony and dorsal place contrasts in Cochabamba Quechua | |
Vojtech et al. | Refining algorithmic estimation of relative fundamental frequency: Accounting for sample characteristics and fundamental frequency estimation method | |
Stewart et al. | Earbuds: A method for analyzing nasality in the field | |
Paroni et al. | Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography | |
Kochetov | Research methods in articulatory phonetics II: Studying other gestures and recent trends | |
CN107625527B (en) | Lie detection method and device | |
Hussain et al. | An acoustic and articulatory study of laryngeal and place contrasts of Kalasha (Indo-Aryan, Dardic) | |
JP2013088552A (en) | Pronunciation training device | |
CN109166629A (en) | The method and system of aphasia evaluation and rehabilitation auxiliary | |
Cai et al. | The DKU-JNU-EMA electromagnetic articulography database on Mandarin and Chinese dialects with tandem feature based acoustic-to-articulatory inversion | |
Gilbert et al. | Restoring speech following total removal of the larynx by a learned transformation from sensor data to acoustics | |
Yeung et al. | Subglottal resonances of American English speaking children | |
CN107591163B (en) | Pronunciation detection method and device and voice category learning method and system | |
Maddieson | Articulatory Phonology and Sukuma" Aspirated Nasals" |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||