
JP3876703B2 - Speaker learning apparatus and method for speech recognition - Google Patents

Speaker learning apparatus and method for speech recognition

Info

Publication number
JP3876703B2
JP3876703B2 (application JP2001378341A)
Authority
JP
Japan
Prior art keywords
speaker
learning
recognition
registration
utterance content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
JP2001378341A
Other languages
Japanese (ja)
Other versions
JP2003177779A5 (en)
JP2003177779A (en)
Inventor
由実 脇田
研治 水谷
伸一 芳澤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Corp
Panasonic Holdings Corp
Original Assignee
Panasonic Corp
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Corp, Matsushita Electric Industrial Co Ltd
Priority to JP2001378341A
Publication of JP2003177779A
Publication of JP2003177779A5
Application granted
Publication of JP3876703B2
Anticipated expiration
Legal status: Expired - Fee Related

Description

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a speaker learning apparatus and method in speech recognition.
[0002]
[Prior art]
The conventional speaker learning methods will be described below. A conventional speaker-independent speech recognition system constructs and uses a standard acoustic model intended to handle as many unspecified speakers as possible. In practice, however, speakers' utterance characteristics vary widely, and it is difficult to train an acoustic model that guarantees high performance for every speaker who uses the system. Conventionally, therefore, for speakers who are recognized poorly, speaker adaptation is performed: the acoustic model parameters are re-learned using the speaker's own utterances, and an acoustic model adapted to the speaker is reconstructed so that performance is guaranteed for all speakers. This speaker adaptation requires a large amount of learning speech to capture the speaker's characteristics, which burdens the speaker, so various schemes have been devised to keep the number of utterances to a minimum (for example, Japanese Patent No. 2037877). As another learning method, there is also a speaker registration method in which the acoustic model sequence corresponding to the recognition result of a misrecognized word is added to the pronunciation dictionary as the correct sequence, so that what was recognized as the incorrect sequence can thereafter be recognized as the correct word (Japanese Patent Laid-Open No. 8-171396).
[0003]
[Problems to be solved by the invention]
The conventional speaker adaptation method can, in principle, reliably improve recognition performance if there is enough learning data. However, when the number of utterances is limited in consideration of the speaker's learning burden, as is done in almost all practical systems, there is a problem that the recognition rate may conversely decrease for some utterances that are not present in the learning data. The conventional speaker registration method, on the other hand, reliably improves the recognition rate for the utterances that have been learned, but for a speaker who is difficult to recognize across many utterance contents, every utterance that is hard to recognize must be produced during learning, so the learning is burdensome.
[0004]
An object of the present invention is to solve the problems of conventional speaker adaptive learning and speaker registration learning, and to provide a speaker learning method that reliably improves the recognition rate after learning with an amount of learning utterances that does not burden the speaker.
[0005]
[Means for Solving the Problems]
In order to solve the above-described problems, the present invention provides speaker adaptive learning means for re-learning acoustic model parameters using the speaker's learning speech to create an acoustic model adapted to the speaker; speaker registration learning means for adding, to the pronunciation dictionary, an acoustic model sequence of phonemes or syllables corresponding to the recognition result of a misrecognized word as the correct phoneme or syllable sequence; and means for determining whether ease of recognition depends on the utterance content; and it selects between the speaker adaptive learning means and the speaker registration learning means according to each speaker's ease of recognition and the strength of the dependence on utterance content.
[0006]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, speaker learning according to an embodiment of the present invention will be described with reference to the drawings.
[0007]
FIG. 1 is a block diagram of speaker learning according to an embodiment of the present invention.
[0008]
The speaker learning function is set up to be selected when a speaker feels the need to improve recognition performance for his or her own voice. First, the system prompts the user to utter a specific word, and the speaker's utterance of that word is input. This utterance content is the minimum needed to judge, for each speaker, how well the standard speech prepared in advance suits that speaker. For Japanese recognition, for example, a word containing all five vowels, such as "maiku tesuto" (microphone test), is suitable. If the system performs word recognition, several words may be selected from the target vocabulary so that all five vowels are covered, as in the sketch below.
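As an illustration of this word-selection step, the following Python sketch greedily picks prompt words from a vocabulary until all five Japanese vowels are covered; the greedy strategy, the romanized readings, and the function name are assumptions for illustration and are not specified by the patent.

```python
VOWELS = set("aiueo")

def select_prompt_words(vocabulary):
    """Greedily pick words until all five vowels are covered.

    `vocabulary` maps each word to its romanized reading, e.g.
    {"menu": "menyuu", "maiku tesuto": "maikutesuto"}.
    """
    remaining = set(VOWELS)
    prompts = []
    while remaining:
        # Prefer the word that covers the most still-missing vowels.
        best = max(vocabulary, key=lambda w: len(set(vocabulary[w]) & remaining))
        gained = set(vocabulary[best]) & remaining
        if not gained:
            break  # vocabulary cannot cover the remaining vowels
        prompts.append(best)
        remaining -= gained
    return prompts

if __name__ == "__main__":
    vocab = {"menu": "menyuu", "maiku tesuto": "maikutesuto", "kaisatsu": "kaisatsu"}
    print(select_prompt_words(vocab))  # ['maiku tesuto'] covers a, i, u, e, o
```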
[0009]
Ordinary recognition processing is performed on this utterance in speech recognition process 1, and a recognition result and a recognition confidence score are computed in recognition score calculation process 2. For the recognition result, the phoneme or syllable sequence of the result is compared with the correct phoneme sequence, differing parts are marked as errors and matching parts as correct, and correctness is recorded for each phoneme of the correct sequence. The confidence score is, for example, an acoustic distance score between the correct phoneme or syllable sequence and the uttered result, computed for each phoneme or syllable; when a weighted cepstrum distance is used as the distance measure, the confidence of each phoneme may be calculated by Equation 1.
[0010]
[Expression 1]
Figure 0003876703
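As an illustrative sketch of the per-phoneme bookkeeping described in paragraph [0009] (Equation 1 itself appears only as an image above, so the weighted-cepstrum-distance formula is not reproduced here), the following Python fragment aligns the recognized sequence with the correct sequence and records correct/incorrect for each phoneme of the correct sequence; the use of a simple edit-distance alignment is an assumption, not the patent's specified method.

```python
from difflib import SequenceMatcher

def mark_phoneme_errors(correct, recognized):
    """Return a list of (phoneme, is_correct) for each phoneme of the correct sequence.

    `correct` and `recognized` are lists of phoneme symbols, e.g.
    ["m", "e", "ny", "u", "u"] and ["d", "e", "ny", "u", "u"].
    """
    marks = [(p, False) for p in correct]  # assume wrong until matched
    matcher = SequenceMatcher(a=correct, b=recognized, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            marks[i] = (correct[i], True)
    return marks

# Example: the "menu" case from paragraph [0013].
print(mark_phoneme_errors(["m", "e", "ny", "u", "u"], ["d", "e", "ny", "u", "u"]))
# [('m', False), ('e', True), ('ny', True), ('u', True), ('u', True)]
```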
[0011]
In learning method determination process 3, the proportion of phonemes or syllables whose confidence score is at or below the threshold, or which are misrecognized even though the score is above the threshold (called adaptation candidate phonemes or syllables), relative to all phonemes or syllables contained in the utterances, is calculated. When this proportion is large, it is estimated that the speaker's utterance characteristics do not suit the standard speech regardless of the utterance content, and it is considered necessary to learn all of the standard speech so that it suits the speaker. When this proportion is small, the misrecognition depends on the utterance content; the speaker's utterance characteristics and the standard speech are compatible, and learning is considered necessary only for particular utterances. Therefore, when this proportion is at or above a fixed value, speaker adaptive learning is selected, and when it is below that value, speaker registration learning is selected.
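The decision itself can be written compactly. The sketch below takes the per-phoneme confidence scores and correctness marks from the previous step; the threshold values and helper names are assumptions for illustration, not values given by the patent.

```python
def choose_learning_method(phoneme_scores, phoneme_correct,
                           score_threshold=0.5, ratio_threshold=0.3):
    """Select 'adaptation' or 'registration' learning.

    A phoneme is an adaptation candidate if its confidence score is at or
    below `score_threshold`, or if it was misrecognized even with a higher
    score.  When the candidates make up a large share of all phonemes, the
    mismatch is taken to be speaker-wide and speaker adaptation is chosen;
    otherwise it is content-specific and speaker registration is chosen.
    """
    candidates = [
        i for i, (score, ok) in enumerate(zip(phoneme_scores, phoneme_correct))
        if score <= score_threshold or not ok
    ]
    ratio = len(candidates) / len(phoneme_scores)
    method = "adaptation" if ratio >= ratio_threshold else "registration"
    return method, candidates

scores = [0.9, 0.8, 0.2, 0.9, 0.9, 0.8, 0.9]
ok = [True, True, False, True, True, True, True]
print(choose_learning_method(scores, ok))  # ('registration', [2])
```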
[0012]
When speaker adaptive learning is selected, speaker adaptation process 4 prompts the user for the minimum additional utterances needed for adaptation. As the speaker adaptation method, for example, when the VFS method described in JP-A-5-53599 is used, the standard acoustic model is matched against the learning input speech parameters, a fuzzy class function is obtained from the relationship between the corresponding parameters, and, using the obtained function as a weight, the parameters of the standard acoustic model are updated so that the standard speech approaches the learning input speech.
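The VFS method itself is defined in JP-A-5-53599 and is not reproduced here. As a rough, assumed illustration of the general idea of weighted parameter updating toward the learning speech, the following sketch shifts the Gaussian mean vectors of an acoustic model toward the speaker's aligned feature frames, with a per-state weight standing in for the fuzzy class function; the data structures and names are hypothetical.

```python
import numpy as np

def adapt_means(model_means, aligned_frames, weights):
    """Shift each state's mean toward the speaker's aligned feature frames.

    model_means    : dict state -> mean vector (np.ndarray)
    aligned_frames : dict state -> list of feature frames from the learning speech
    weights        : dict state -> interpolation weight in [0, 1] (stand-in for
                     the fuzzy class function of the VFS method)
    """
    adapted = {}
    for state, mean in model_means.items():
        frames = aligned_frames.get(state)
        if not frames:
            adapted[state] = mean  # no speaker data for this state; keep as is
            continue
        speaker_mean = np.mean(frames, axis=0)
        w = weights.get(state, 0.5)
        adapted[state] = (1.0 - w) * mean + w * speaker_mean
    return adapted
```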
[0013]
When speaker registration learning is selected, speaker registration process 5 prompts the speaker to utter only the words that contain the adaptation candidate phonemes or syllables computed in the learning method determination process, and the phoneme or syllable recognition result sequence for the utterance is added in pronunciation dictionary 7 to the phoneme sequence of the word containing the phoneme sequence corresponding to the adaptation candidates. For example, suppose the word "menu" is misrecognized; the speaker is prompted to utter only this word, and the recognition result is "denyu". When a phoneme model is used as the acoustic model, the correct phoneme model sequence for "menu" is /m e ny u u/ and the recognition result phoneme sequence is /d e ny u u/. For this speaker, the phoneme /m/ at the beginning of a word and followed by /e/ tends to be mistaken for /d/. Therefore, among the recognition target words, a phoneme sequence is added to the pronunciation dictionary so that a word-initial /m/ followed by /e/ is recognized as /m/ even if it is misrecognized as /d/. In this example, /d e ny u u/ is added to the entry that was originally "menu /m e ny u u/", changing it to "menu /m e ny u u/ or /d e ny u u/". As a result, even if this speaker's "menu" is recognized as /d e ny u u/, "menu" can still be recognized.
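A minimal sketch of this dictionary update follows, using a plain dict of word to set of pronunciation variants; the data structure and function names are assumptions for illustration, not the patent's implementation.

```python
# Pronunciation dictionary: word -> set of acceptable phoneme sequences.
pronunciation_dict = {"menu": {("m", "e", "ny", "u", "u")}}

def register_variant(dictionary, word, recognized_sequence):
    """Add the speaker-specific recognition result as an extra pronunciation."""
    dictionary.setdefault(word, set()).add(tuple(recognized_sequence))

def lookup(dictionary, recognized_sequence):
    """Return the word whose registered pronunciations contain the sequence, if any."""
    seq = tuple(recognized_sequence)
    for word, variants in dictionary.items():
        if seq in variants:
            return word
    return None

# The speaker's "menu" is recognized as /d e ny u u/; register it as a variant.
register_variant(pronunciation_dict, "menu", ["d", "e", "ny", "u", "u"])
print(lookup(pronunciation_dict, ["d", "e", "ny", "u", "u"]))  # menu
```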
[0014]
As described above, it is estimated whether the speaker's utterances are misrecognized regardless of the utterance content; speaker adaptive learning is performed when the misrecognition does not depend on the utterance content, and speaker registration learning is performed when it does. This solves the problem of conventional speaker adaptive learning, in which the recognition rate decreases despite many learning utterances made for adaptation, by performing speaker registration learning instead of speaker adaptive learning. It also solves the problem of conventional speaker registration learning, in which learning is impossible without uttering many words, by performing speaker adaptive learning instead of speaker registration learning.
[0015]
As described in detail above, the speaker learning method of the embodiment according to the present invention selects whether to perform speaker adaptive learning or speaker registration learning according to each speaker's ease of recognition and the strength of its dependence on utterance content, and prompts the speaker to carry out the selected learning. The problem of conventional speaker adaptive learning, in which the recognition rate decreases despite many learning utterances made for adaptation, is solved by automatically selecting speaker registration learning instead of speaker adaptive learning. The problem of conventional speaker registration learning, in which learning is impossible without uttering many words, is solved by automatically selecting speaker adaptive learning instead of speaker registration learning. Therefore, a speaker learning method is provided that can reliably improve the recognition rate with an amount of learning that does not burden the speaker.
[0017]
As described in detail above, the speaker learning method of the embodiment according to the present invention determines whether ease of recognition depends on the utterance content; speaker registration learning is performed when it is determined to depend on the content, and speaker adaptive learning is performed when it is determined not to depend on it. The problem of conventional speaker adaptive learning, in which the recognition rate decreases despite many learning utterances made for adaptation, is solved by automatically selecting speaker registration learning instead of speaker adaptive learning. The problem of conventional speaker registration learning, in which learning is impossible without uttering many words, is solved by automatically selecting speaker adaptive learning instead of speaker registration learning. Therefore, a speaker learning method is provided that can reliably improve the recognition rate with an amount of learning that does not burden the speaker.
[0019]
As described in detail above, the present invention selects whether to perform speaker adaptive learning or speaker registration learning according to each speaker's ease of recognition and the strength of its dependence on utterance content, and prompts the speaker to carry out the selected learning. The problem of conventional speaker adaptive learning, in which the recognition rate decreases despite many learning utterances made for adaptation, can thereby be solved by automatically selecting speaker registration learning instead of speaker adaptive learning.
[Brief Description of the Drawings]
FIG. 1 is a block diagram of a speaker learning method according to an embodiment of the present invention.
[Explanation of Reference Numerals]
1 Speech recognition
2 Recognition score calculation
3 Learning method determination
4 Speaker adaptation
5 Speaker registration
6 Acoustic model
7 Pronunciation dictionary
8 Recognition score buffer

Claims (4)

1. A speaker learning apparatus comprising: speaker adaptive learning means for re-learning acoustic model parameters using a speaker's learning speech to create an acoustic model adapted to the speaker; speaker registration learning means for adding, to a pronunciation dictionary, an acoustic model sequence of phonemes or syllables corresponding to the recognition result of a misrecognized word as a correct phoneme or syllable sequence; means for determining whether ease of recognition depends on the utterance content; and means for selecting between the speaker adaptive learning means and the speaker registration learning means according to each speaker's ease of recognition and the strength of its dependence on utterance content, and prompting the speaker to carry out the selected learning.

2. The speaker learning apparatus according to claim 1, wherein the speaker registration learning means is used when ease of recognition is determined to depend on the utterance content, and the speaker adaptive learning means is used when it is determined not to depend on the utterance content.

3. The speaker learning apparatus according to claim 1, wherein the means for determining whether ease of recognition depends on the utterance content makes the determination based on the proportion, among the phonemes or syllables contained in all utterances, of phonemes or syllables whose recognition score is at or below a predetermined threshold or which are misrecognized even though the score is above the predetermined threshold.

4. A speaker learning method comprising: a speaker adaptive learning step of re-learning acoustic model parameters using a speaker's learning speech to create an acoustic model adapted to the speaker; a speaker registration learning step of adding, to a pronunciation dictionary, an acoustic model sequence of phonemes or syllables corresponding to the recognition result of a misrecognized word as a correct phoneme or syllable sequence; a step of determining whether ease of recognition depends on the utterance content; and a step of selecting between the speaker adaptive learning means and the speaker registration learning means according to each speaker's ease of recognition and the strength of its dependence on utterance content, and prompting the speaker to carry out the selected learning.
JP2001378341A 2001-12-12 2001-12-12 Speaker learning apparatus and method for speech recognition Expired - Fee Related JP3876703B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2001378341A JP3876703B2 (en) 2001-12-12 2001-12-12 Speaker learning apparatus and method for speech recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2001378341A JP3876703B2 (en) 2001-12-12 2001-12-12 Speaker learning apparatus and method for speech recognition

Publications (3)

Publication Number Publication Date
JP2003177779A JP2003177779A (en) 2003-06-27
JP2003177779A5 JP2003177779A5 (en) 2005-07-14
JP3876703B2 true JP3876703B2 (en) 2007-02-07

Family

ID=19186094

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2001378341A Expired - Fee Related JP3876703B2 (en) 2001-12-12 2001-12-12 Speaker learning apparatus and method for speech recognition

Country Status (1)

Country Link
JP (1) JP3876703B2 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8200495B2 (en) 2005-02-04 2012-06-12 Vocollect, Inc. Methods and systems for considering information about an expected response when performing speech recognition
US7827032B2 (en) 2005-02-04 2010-11-02 Vocollect, Inc. Methods and systems for adapting a model for a speech recognition system
US7949533B2 (en) 2005-02-04 2011-05-24 Vococollect, Inc. Methods and systems for assessing and improving the performance of a speech recognition system
US7865362B2 (en) 2005-02-04 2011-01-04 Vocollect, Inc. Method and system for considering information about an expected response when performing speech recognition
WO2007118029A2 (en) * 2006-04-03 2007-10-18 Vocollect, Inc. Methods and systems for assessing and improving the performance of a speech recognition system
JP5326892B2 (en) 2008-12-26 2013-10-30 富士通株式会社 Information processing apparatus, program, and method for generating acoustic model
US8914290B2 (en) 2011-05-20 2014-12-16 Vocollect, Inc. Systems and methods for dynamically improving user intelligibility of synthesized speech in a work environment
JP5651567B2 (en) * 2011-10-11 2015-01-14 日本電信電話株式会社 Acoustic model adaptation apparatus, acoustic model adaptation method, and program
US9978395B2 (en) 2013-03-15 2018-05-22 Vocollect, Inc. Method and system for mitigating delay in receiving audio stream during production of sound from audio stream
US10714121B2 (en) 2016-07-27 2020-07-14 Vocollect, Inc. Distinguishing user speech from background speech in speech-dense environments

Also Published As

Publication number Publication date
JP2003177779A (en) 2003-06-27

Similar Documents

Publication Publication Date Title
EP1557822B1 (en) Automatic speech recognition adaptation using user corrections
US7013276B2 (en) Method of assessing degree of acoustic confusability, and system therefor
JP3685972B2 (en) Speech recognition apparatus and speech model adaptation method
US6029124A (en) Sequential, nonparametric speech recognition and speaker identification
US8886532B2 (en) Leveraging interaction context to improve recognition confidence scores
EP1701338B1 (en) Speech recognition method
JP6654611B2 (en) Growth type dialogue device
JP2000214882A (en) Voice recognition and voice study device capable of speedily checking respect to voice of children or foreign speaker that are hard to cope with
JP2000181482A (en) Voice recognition device and noninstruction and/or on- line adapting method for automatic voice recognition device
JP2003022087A (en) Voice recognition method
JPH0968994A (en) Word voice recognition method by pattern matching and device executing its method
JP3876703B2 (en) Speaker learning apparatus and method for speech recognition
WO2006093092A1 (en) Conversation system and conversation software
JP4293340B2 (en) Dialogue understanding device
JP2996019B2 (en) Voice recognition device
JP3633254B2 (en) Voice recognition system and recording medium recording the program
EP1067512B1 (en) Method for determining a confidence measure for speech recognition
JP4749990B2 (en) Voice recognition device
JP2000214879A (en) Adaptation method for voice recognition device
JP4604424B2 (en) Speech recognition apparatus and method, and program
JP2001175276A (en) Speech recognizing device and recording medium
US8688452B2 (en) Automatic generation of distractors for special-purpose speech recognition grammars
JP4297349B2 (en) Speech recognition system
JP3029654B2 (en) Voice recognition device
JPH11338492A (en) Speaker recognition unit

Legal Events

Date Code Title Description
A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20041116

A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20041116

RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7421

Effective date: 20050704

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20060801

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20060828

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20061010

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20061023

R151 Written notification of patent or utility model registration

Ref document number: 3876703

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R151

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091110

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20101110

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20111110

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20121110

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20121110

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20131110

Year of fee payment: 7

S111 Request for change of ownership or part of ownership

Free format text: JAPANESE INTERMEDIATE CODE: R313113

S533 Written request for registration of change of name

Free format text: JAPANESE INTERMEDIATE CODE: R313533

R350 Written notification of registration of transfer

Free format text: JAPANESE INTERMEDIATE CODE: R350

LAPS Cancellation because of no payment of annual fees