JPH1152997A

JPH1152997A - Speech recorder, speech recording system, and speech recording method

Info

Publication number: JPH1152997A
Application number: JP9212764A
Authority: JP
Inventors: Osahisa Okamoto; 長久岡本; Koji Aizawa; 浩二相沢
Original assignee: Hitachi Engineering and Services Co Ltd
Current assignee: Hitachi Engineering and Services Co Ltd
Priority date: 1997-08-07
Filing date: 1997-08-07
Publication date: 1999-02-26

Abstract

PROBLEM TO BE SOLVED: To make it possible to record a summary of a conversation as soon as it ends, or to immediately distribute a copy of the record to attendees as a conference summary, by recording the words related to a trigger word extracted by an extraction means together with a few words preceding and/or following the trigger word. SOLUTION: Speech from telephone equipment 42 is taken into a speech recognition part 13 through a speech input unit 12. A recorder 14, implemented as a computer device, is connected to the speech recognition part 13. The speech recognition part 13 has a speech dictionary part 15 that stores, as trigger words, the word groups to be recorded from the telephone conversation. A speech recorder 11 extracts the words to be recorded, namely the trigger words, from the telephone conversation based on triggers set beforehand. When a trigger word is extracted, it is transmitted to the recorder 14 as a recognition result together with a few words preceding and following it. The recorder 14 records the recognition result, namely the trigger word and the few words preceding and following it.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a voice recording device and system that take voice as input and record it electronically.

[0002]

[Prior Art] Japanese Unexamined Patent Publication No. 4-118800 (title of the invention: Data collection device for plant patrol) describes a device that collects data during daily patrols of the equipment, apparatus, and processes in a plant. The device comprises a voice input device that inputs the data as voice, a voice recognition device that recognizes the input voice, an alarm device that determines whether the recognized data is abnormal and issues an alarm when it is, and a storage device that stores the recognized data.

[0003] Japanese Unexamined Patent Publication No. 59-47657 (title of the invention: Voice interactive test apparatus) describes a device that has a plurality of voice input/output units and voice recognition units and can carry out separate voice input/output processing for several people in parallel. Specifically, the publication describes a voice interactive test apparatus that has a central processing unit and can test arbitrary items on an arbitrary device under test by alternating spoken test-operation instructions (output) with spoken test-data input. The apparatus is characterized in that a plurality of voice interactive input/output processing terminal units, each with its own voice recognition function and each able to run a voice interactive test program independently, share the central processing unit, so that multiple test operations on multiple devices under test can proceed in parallel.

[0004]

[Problem to Be Solved by the Invention] Conventionally, there has been no voice recording device that produces, at the moment a conversation ends, a conversation record summarizing the gist of the conversation for the participants.

[0005] Conventionally, meeting minutes have been prepared by handwriting or by entering them into and printing them from a personal computer, and items requiring confirmation have been exchanged by facsimile (FAX) or the like.

[0006] An object of the present invention is to provide a voice recording device, a voice recording system, and a voice recording method that take a series of voice inputs or a conversation as input and can, as soon as it ends, record its gist in summarized form or immediately distribute the record to the participants as a meeting summary.

[0007]

[Means for Solving the Problem] The present invention solves the above problem by providing a voice recording device that digitizes and records voice, comprising: a voice input device that inputs voice; a voice recognition unit that recognizes the input voice; a storage device that stores a group of words designated as triggers (trigger words); extraction means that extracts, from the voice recognized by the voice recognition unit, the trigger words stored in the storage device; and a recording device that records the words related to a trigger word extracted by the extraction means and also, on the basis of the trigger word, a few words before and/or after it.

[0008] The present invention further provides a voice recording device that digitizes and records voice, comprising: a voice input device that inputs voice; a voice recognition unit that recognizes the input voice; a voice storage device that stores a predetermined number of words of the recognized voice; a voice dictionary that holds a group of words designated as triggers (trigger words); means that extracts, from the recognized voice, the trigger words stored in the voice dictionary; and a recording device that records the words related to a trigger word extracted by the extraction means and also, on the basis of the trigger word, a few words before and/or after it.

[0009] The present invention further provides a voice recording system for digitizing and recording voice, comprising: a plurality of voice input devices that input conversational speech having multiple directions and sources, such as speech exchanged over a communication line or speech during a meeting; a plurality of voice recognition units that recognize the input speech; storage means that each store the same group of words designated as triggers (trigger words); extraction means that each extract the stored trigger words from the conversational speech having multiple directions and sources recognized by the voice recognition units; and recording devices that each record the words related to a trigger word extracted by the extraction means together with, on the basis of the trigger word, a few words before and/or after it.

[0010] When the extraction means extracts a trigger word, it is desirable to record that trigger word together with a few words before and/or after it.

[0011] The present invention further provides a voice recording method for digitizing and recording voice, comprising: inputting speech having multiple directions and sources; recognizing the input speech; storing, as triggers, groups of words corresponding to dates, participants, and meeting names, groups of words classified as belonging to specialist fields and seldom used in everyday conversation, and designated groups of words (trigger words); extracting the stored trigger words from the recognized conversational speech having multiple directions and sources; recording a plurality of words related to the extracted trigger words in the order in which they were extracted; and outputting the recorded words in order.

[0012]

[Embodiment of the Invention] An embodiment of the present invention is described below with reference to the drawings. FIG. 1 shows a bidirectional voice recognition system. The system has two voice recording devices 11 and 21, which are connected by a communication line 31. The figure shows callers 41 and 51 using telephone devices 42 and 52, respectively, while the voice recording devices 11 and 21 record the speech.

[0013] The voice recording device 11 is described first.

[0014] Voice from the telephone device 42 is taken into the voice recognition unit 13 via the voice input devices 12 and 22. A recording device 14, implemented as a computer device, is connected to the voice recognition unit 13. The voice recognition unit 13 has a voice dictionary unit 15 that stores, as trigger words, the groups of words to be recorded from the conversation during a call.

[0015] Based on triggers set in advance, the voice recording device 11 extracts the words to be recorded, namely the trigger words, from the callers' conversation. When it extracts a trigger word, it transmits that word to the recording device 14 as a recognition result, together with a few words before and after it. The recording device 14 records the recognition result, namely the trigger word and the few words before and after it. In this example the trigger word is recorded with the few words both before and after it, but in some cases recording the trigger word with only the few words before it, or only the few words after it, is sufficient.

[0016] The voice recording device 11 can thus be used on its own, but it is most effective when used as a bidirectional voice recognition system as shown in FIG. 1. In that case, the trigger word and the few words before and after it are recorded simultaneously in the recording devices 14 and 24. The same recognition result is recorded simultaneously in every one of the recording devices on the communication line 31. As a result, everyone who took part in the call ends up with the same call minutes at hand.

[0017] The bidirectional voice input system provides a recognition function for conference speech that has multiple directions and sources, such as speech exchanged over a communication line or speech during a meeting. The voice recognition unit connected to the communication line (or conversation space) on which the meeting takes place recognizes the conversation continuously. When it extracts a word that is a trigger, it recognizes that word and the words before and after it and records them as minutes. The user registers the words to be used as triggers in the voice dictionary of the recognition device in advance. In this way, only the key points that matter to that user are recorded, without omission. A trigger need not be a single fixed word: by designating a category such as "date", "company name", or "power plant name", for example, all speech relating to dates, company names, or power plant names can be recorded. In other words, a few words before and/or after a trigger word can also be recorded on the basis of that trigger word.

[0018] When the trigger words are ones that rarely occur in ordinary conversation, speakers can raise the efficiency of the system by saying them deliberately. For example, with a system in which triggers such as "accident", "fire", "smoke", and "oil leak" are registered, a speaker who talks as follows leaves the following minutes.

[0019] "An _accident_ has occurred at the XX Power Station. The accident is a _fire_ and the cause is an _oil leak_."

[Record]: "at the XX Power Station, accident, occurred, accident details, fire, cause, oil leak"

When handling speech from multiple sources, one can either install multiple recognition devices or have a single recognition device recognize the entire conversation. When multiple recognition devices are installed, the recognition record should be a single one, so all the recognition devices can operate cooperatively and be adjusted so that every recognition record is identical. An inference algorithm such as probability comparison or majority voting may be used for this.
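As a rough illustration of the majority-voting option mentioned above, the following sketch reconciles the outputs of several recognition devices into one record. It assumes, purely for illustration, that the devices' outputs are already aligned word by word; the function names `reconcile` and `reconcile_streams` are hypothetical and do not come from the patent.

```python
from collections import Counter


def reconcile(candidates):
    """Return the word most recognizers agree on (ties: first seen wins)."""
    if not candidates:
        raise ValueError("at least one candidate word is required")
    counts = Counter(candidates)
    best = max(counts.values())
    # Walk the candidates in their original order so ties are deterministic.
    for word in candidates:
        if counts[word] == best:
            return word


def reconcile_streams(streams):
    """Reconcile word-aligned outputs of several recognizers into one record.

    Assumption: every stream has one entry per spoken word, in the same order.
    """
    return [reconcile(list(words)) for words in zip(*streams)]


# Three hypothetical recognizers hearing the same two words.
streams = [
    ["accident", "fire"],
    ["accident", "wire"],
    ["incident", "fire"],
]
unified = reconcile_streams(streams)
```

A probability comparison would follow the same shape, with each candidate carrying a confidence score instead of a single vote.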

[0020] The dictionary used by the voice recognition unit is described next.

[0021] In a voice recognition system, the words to be recognized must be registered in a dictionary in advance. This dictionary is created and structured as follows.

[0022] 1. A dictionary that is as compact as possible is better for recognition rate and responsiveness → a huge dictionary increases misrecognition and degrades responsiveness.

[0023] 2. A small dictionary tends to lack vocabulary → a certain size is necessary.

This system therefore takes 1,000 words as the reference size for a dictionary that is not too large, and allows multiple dictionaries of up to this size (up to 100 of them) to be built into the system. The dictionaries can be switched freely and dynamically, even while the voice recognition device is in use, so that the best combination can be selected for each phase of the application. As a result, the system as a whole handles a large vocabulary of up to 100,000 words while selecting the most suitable dictionary for each phase, which secures fast response and a good recognition rate.
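The size limits and dynamic switching described above can be sketched as a small container. This is only an illustration under stated assumptions: the class `DictionarySet` and its method names are hypothetical, and only the limits of 1,000 words per dictionary and 100 dictionaries are taken from the text.

```python
MAX_WORDS_PER_DICTIONARY = 1000  # reference dictionary size from the text
MAX_DICTIONARIES = 100           # maximum number of built-in dictionaries


class DictionarySet:
    """Holds many small dictionaries; only one is active at a time."""

    def __init__(self):
        self._dictionaries = {}
        self._active = None

    def add(self, name, words):
        words = set(words)
        if len(words) > MAX_WORDS_PER_DICTIONARY:
            raise ValueError("dictionary exceeds the per-dictionary word limit")
        if name not in self._dictionaries and len(self._dictionaries) >= MAX_DICTIONARIES:
            raise ValueError("too many dictionaries")
        self._dictionaries[name] = words

    def switch(self, name):
        """Make another dictionary active; allowed at any time during use."""
        if name not in self._dictionaries:
            raise KeyError(name)
        self._active = name

    def recognizes(self, word):
        """Only the active dictionary is searched, which keeps matching small."""
        return self._active is not None and word in self._dictionaries[self._active]


dictionaries = DictionarySet()
dictionaries.add("prefectures", ["茨城県", "栃木県"])
dictionaries.add("incidents", ["事故", "火災"])
dictionaries.switch("prefectures")
```

With these limits the total vocabulary is at most 100 × 1,000 = 100,000 words, while any single lookup only ever consults one dictionary of at most 1,000 words.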

[0024] An example of a dictionary is shown below.

[0025] Example 1

[Text], [Reading]
茨城県, いばらきけん
栃木県, とちぎけん
群馬県, ぐんまけん
埼玉県, さいたまけん
千葉県, ちばけん
東京都, とうきょうと
・・・・・・

This dictionary file is a plain text file that anyone can easily edit with ordinary text-editing software. Whether the system should give a voice response can also be specified in the dictionary, in which case the entries look like the following, for example.

[0026] Example 2

[Text], [Reading], [Response]
茨城県, いばらきけん, 1
{UP}, カーソルうえ, 0
群馬県, ぐんまけん, 1
埼玉県, さいたまけん, 1
^{ESC}r点検.exe, てんけんのじっこう, 0
^{ESC}r完成.exe, かんせいけんさのじっこう, 0
・・・・・・

In this case, an entry whose [Response] is 1 is one for which the system repeats back the user's voice command, and the system does not repeat back entries whose [Response] is 0. With this function, the system can be made to give a voice response only for the entries that need it, which is particularly useful when, as above, operation commands for a personal computer are registered in the dictionary.
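A dictionary file in the style of Example 2 could be read with a few lines of code. The sketch below is an assumption-laden illustration, not the format definition: it assumes one entry per line, comma-separated fields (ASCII or full-width commas), no commas inside a field, and that a missing [Response] field means "do not repeat back". The names `Entry` and `parse_dictionary` are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    text: str      # what is handed to the application
    reading: str   # how the word is pronounced
    respond: bool  # True if the system should repeat the command back


def parse_dictionary(source):
    """Parse dictionary text with 2 or 3 comma-separated fields per line."""
    entries = []
    for line in source.splitlines():
        line = line.strip().replace("，", ",")  # accept full-width commas too
        if not line:
            continue
        fields = [field.strip() for field in line.split(",")]
        if len(fields) == 2:
            text, reading = fields
            flag = "0"  # assumption: no response unless explicitly requested
        elif len(fields) == 3:
            text, reading, flag = fields
        else:
            raise ValueError(f"unexpected number of fields: {line!r}")
        entries.append(Entry(text, reading, flag == "1"))
    return entries


sample = """茨城県, いばらきけん, 1
{UP}, カーソルうえ, 0
群馬県，ぐんまけん，1
"""
entries = parse_dictionary(sample)
```

Here `entries[0]` is repeated back by the system, while the cursor command `{UP}` is executed silently.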

[0027] In the implementation, each line of the dictionary file is indexed by its line number. The [Index] + [Reading] part is stored on the recognition device side and the [Index] + [Text] part on the personal computer side. When the recognition device transmits a recognition result, it therefore needs to send only the [Index] for the corresponding text to be passed to the application. This shortens communication time compared with sending the [Text] itself, and because the recognition device does not need to hold the [Text], its storage area can also be smaller. However, because the recognition result is reported by [Index] alone, the dictionary active on the personal computer side and the dictionary active on the recognition device side must always be kept synchronized. This is the job of the [dictionary switching function]; in our system, dictionaries can be switched from an external program or by voice command.
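The index-only exchange between the recognition device and the personal computer can be illustrated as follows. This is a minimal sketch under assumptions: the function names are hypothetical, line numbers are taken to start at 1, and recognition is reduced to an exact match on the reading.

```python
def split_dictionary(rows):
    """Split (text, reading) rows into the two per-side tables.

    The recognizer side keeps index -> reading; the PC side keeps index -> text.
    Indices are line numbers, assumed here to start at 1.
    """
    recognizer_side = {}
    pc_side = {}
    for index, (text, reading) in enumerate(rows, start=1):
        recognizer_side[index] = reading
        pc_side[index] = text
    return recognizer_side, pc_side


def recognize(recognizer_side, heard_reading):
    """Return the index to transmit, or None if the reading is not registered."""
    for index, reading in recognizer_side.items():
        if reading == heard_reading:
            return index
    return None


rows = [("茨城県", "いばらきけん"), ("栃木県", "とちぎけん")]
recognizer_side, pc_side = split_dictionary(rows)

index = recognize(recognizer_side, "とちぎけん")  # only this integer is sent
text = pc_side[index]                             # PC looks up the text locally
```

The lookup on the PC side is only correct if both sides were built from the same dictionary file, which is why the active dictionaries must stay synchronized.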

[0028] Because [dictionary switching] can only ever make one set of dictionaries active, command words that should be available everywhere, such as the application launch text in Example 2 above (てんけんのじっこう), have the inconvenience of needing to be registered in every dictionary that might be switched to. To improve on this, the system manages a control dictionary and general dictionaries separately. The user then simply puts words that should always be available into the control dictionary and words that should be switched in and out into a general dictionary, which makes the system easier to use.
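One way to picture the split between the control dictionary and the general dictionaries is the sketch below. The class `Vocabulary` and its methods are hypothetical names introduced for illustration; the patent does not specify how the two kinds of dictionary are combined internally.

```python
class Vocabulary:
    """Always-on control words plus one switchable general dictionary."""

    def __init__(self, control, general):
        self.control = set(control)  # words shared by every context
        self.general = {name: set(words) for name, words in general.items()}
        self.active = None           # name of the active general dictionary

    def switch(self, name):
        if name not in self.general:
            raise KeyError(name)
        self.active = name

    def active_words(self):
        """Words recognizable right now: control words plus the active set."""
        words = set(self.control)
        if self.active is not None:
            words |= self.general[self.active]
        return words


vocabulary = Vocabulary(
    control=["てんけんのじっこう"],
    general={
        "prefectures": ["いばらきけん", "とちぎけん"],
        "incidents": ["かさい"],
    },
)
vocabulary.switch("prefectures")
```

The launch command stays recognizable after every switch without being copied into each general dictionary.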

[0029] FIG. 2 is an example flowchart showing the trigger extraction operation of the voice recording devices 11 and 21 shown in FIG. 1, and FIG. 3 is an example flowchart of its record output processing.

[0030] The voice recognition device recognizes a word when speech corresponding to a word registered in the dictionary in advance is input, and performs a recording operation to the recording device when the recognized word is one designated as a trigger. Because not only the trigger word but also a few words before and after it are recorded, the device has a temporary storage area (buffer) with a first-in, first-out mechanism. If the number of words to record before a trigger is P, the buffer holds up to P words. The number of words to record after a trigger is Q, and the post-trigger word counter is C.

[0031] A voice signal entering from the voice input device (step 1; hereinafter ST1) is matched against the dictionary by the recognition process ST2, and if a matching word exists in the dictionary it is extracted and passed to ST3. ST3 determines whether the word is designated as a trigger (such a word is called a trigger word). If it is not a trigger word, ST4 first checks whether C is zero.

[0032] If C is not zero, the few words following the previously detected trigger word are still being recorded, so the word is sent directly to record output ST6. ST6 sends the word to the record output process ST12, ST7 decrements C, and processing returns to the start. If C is zero in ST4, no trigger operation is in progress, so the word is stored in the buffer, which holds P entries, in preparation for the next trigger word detection (ST5).

[0033] If ST3 determines that the word is a trigger word, ST8 checks whether C is zero. If C is zero, the buffer holds the few words that preceded the trigger, so ST9 first outputs the words in the buffer to ST12. If C is not zero, the buffer is empty and the new trigger is judged to have been detected while the previous trigger operation was still in progress, so processing moves to ST10, which sends the word to ST12. Finally, the counter C used to output the few words after trigger detection is set to Q, and processing returns to the start.

[0034] The record output process ST12 in FIG. 3 is a subroutine called from ST6, ST9, and ST10. It transmits the word passed by the caller to all recording devices on the communication line and then returns to the caller.
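The flow from ST3 to ST12 can be sketched in a few lines. This is an illustrative reading of the flowcharts, not the patented implementation: the class and method names are hypothetical, recognition (ST1, ST2) is assumed to have already produced words, the recording devices are modeled as plain lists, and it is assumed that ST9 is followed by ST10 so that the trigger word itself is recorded after the buffered words.

```python
from collections import deque


class TriggerRecorder:
    """Records each trigger word with up to P words before and Q words after."""

    def __init__(self, triggers, p, q, recorders):
        self.triggers = set(triggers)
        self.q = q
        self.buffer = deque(maxlen=p)  # FIFO holding the last P non-trigger words
        self.c = 0                     # post-trigger words still to be recorded
        self.recorders = recorders     # stand-ins for recording devices 14, 24

    def _record_output(self, word):
        # ST12: send the word to every recording device on the line.
        for recorder in self.recorders:
            recorder.append(word)

    def feed(self, word):
        if word in self.triggers:          # ST3: is it a trigger word?
            if self.c == 0:                # ST8: not inside a trigger operation
                while self.buffer:         # ST9: flush the pre-trigger words
                    self._record_output(self.buffer.popleft())
            self._record_output(word)      # ST10: record the trigger word
            self.c = self.q                # arm the post-trigger counter
        elif self.c != 0:                  # ST4: still recording after a trigger
            self._record_output(word)      # ST6
            self.c -= 1                    # ST7
        else:
            self.buffer.append(word)       # ST5: keep only the newest P words


device_14, device_24 = [], []
recorder = TriggerRecorder({"FIRE"}, p=2, q=2, recorders=[device_14, device_24])
for spoken in "a b c FIRE d e f g".split():
    recorder.feed(spoken)
```

With P = 2 and Q = 2, both lists end up holding `b, c, FIRE, d, e`: the two words before the trigger, the trigger itself, and the two words after it. The words `f` and `g` stay in the buffer in case another trigger follows.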

[0035] The embodiment above was described for one-to-one communication, using the telephone as an example, but the same applies to one-to-N communication such as ordinary meetings and video conferences.

[0036]

[Effects of the Invention] According to the present invention, by taking a series of voice inputs or a conversation as input, the gist can be recorded in summarized form as soon as the input ends, or the record can be distributed immediately to the participants as a meeting summary.

[0037] With the bidirectional voice recognition system of the present invention, the same conversation record can be left with every participant as soon as the conversation ends, which saves the effort of writing minutes by hand and of exchanging items for confirmation by facsimile (FAX) or the like. This is particularly effective in accident response or emergency response, when there is no time to spare even for writing a facsimile (FAX). In addition, because the record is in electronic form, it can easily be registered in a database or the like and kept semi-permanently, making it a more reliable reference in the future.

[Brief description of the drawings]

[FIG. 1] A block diagram of an embodiment of the present invention.

[FIG. 2] A flowchart for the embodiment of the present invention.

[FIG. 3] A flowchart for part of FIG. 2.

[Explanation of symbols]

11, 21: voice recording device; 12, 22: voice input device; 13, 23: voice recognition unit; 14, 24: recording device; 15, 25: voice dictionary; 31: communication line.

Claims

[Claims]

1. A voice recording device for digitizing and recording voice, comprising: a voice input device that inputs voice; a voice recognition unit that recognizes the input voice; a storage device that stores a group of words designated as triggers (trigger words); extraction means that extracts, from the voice recognized by the voice recognition unit, the trigger words stored in the storage device; and a recording device that records the words related to a trigger word extracted by the extraction means and also, on the basis of the trigger word, a few words before and/or after it.

2. A voice recording device for digitizing and recording voice, comprising: a voice input device that inputs voice; a voice recognition unit that recognizes the input voice; a voice storage device that stores a predetermined number of words of the recognized voice; a voice dictionary that holds a group of words designated as triggers (trigger words); means that extracts, from the recognized voice, the trigger words stored in the voice dictionary; and a recording device that records the words related to a trigger word extracted by the extraction means and also, on the basis of the trigger word, a few words before and/or after it.

3. A voice recording system for digitizing and recording voice, comprising: a plurality of voice input devices that input conversational speech having multiple directions and sources, such as speech exchanged over a communication line or speech during a meeting; a plurality of voice recognition units that recognize the input speech; storage means that each store the same group of words designated as triggers (trigger words); extraction means that each extract the stored trigger words from the conversational speech having multiple directions and sources recognized by the voice recognition units; and recording devices that each record the words related to a trigger word extracted by the extraction means together with, on the basis of the trigger word, a few words before and/or after it.

4. The voice recording system according to claim 3, wherein, when the extraction means extracts a trigger word, the trigger word is recorded together with a few words before and/or after it on the basis of the trigger word.

5. A voice recording method for digitizing and recording voice, comprising: inputting speech having multiple directions and sources; recognizing the input speech; storing, as triggers, groups of words corresponding to dates, participants, and meeting names, groups of words classified as belonging to specialist fields and seldom used in everyday conversation, and designated groups of words (trigger words); extracting the stored trigger words from the recognized conversational speech having multiple directions and sources; recording a plurality of words related to the extracted trigger words in the order in which they were extracted; and outputting the recorded words in order.