JP2011150705A

JP2011150705A - Video display device with built in voice recognition function

Info

Publication number: JP2011150705A
Application number: JP2011026929A
Authority: JP
Inventors: Yoichi Itagi; 洋一板木
Original assignee: NEC Display Solutions Ltd
Current assignee: Sharp NEC Display Solutions Ltd
Priority date: 2011-02-10
Filing date: 2011-02-10
Publication date: 2011-08-04
Anticipated expiration: 2021-04-04
Also published as: JP5114578B2

Abstract

PROBLEM TO BE SOLVED: To provide a video display device that is suitable for electronic presentations and is not affected by external factors such as changes in the surrounding noise environment or the speaker's manner of speaking.

SOLUTION: The video display device includes a voice display signal generation unit 105 that recognizes voice from a microphone and generates a voice display signal, and a video display signal generation unit 104 that processes video input and generates a video display signal. The two signals are combined and displayed on a screen. The character display memory 1053 and the video display memory 103 are controlled in synchronization by a CPU unit 108 and a memory control circuit 109.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention relates to a video display device, and more particularly to a video display device with a built-in voice recognition function for use in electronic presentations and the like.

Conventionally, video display devices typified by projectors have been used for electronic presentations and the like. In such a presentation, the video display device is connected to a computer or similar equipment, and an operator (speaker) gives an explanation while the electronic materials are displayed in sequence. For presentations to a large audience, the operator (speaker) may also use a loudspeaker system, depending on the situation.

FIG. 7 is a system configuration diagram of a conventional example using a video display device. As shown in FIG. 7, in a typical conventional electronic presentation system, a computer 3 and a video (VIDEO) device 4 are connected to a video display device 1a, called a projector, which displays images on a screen 2 placed at a predetermined distance from the viewers 7. At the same time, a microphone 6 is placed near the operator (speaker) 5, and a loudspeaker device 8 such as a speaker is used to convey the voice to the viewers 7.

Recent techniques are also known that use a voice recognition device to display spoken content on an information presentation device, or that use voice-data-to-character-code conversion means to output it on a printing means such as a printer. For example, as described in Patent Document 1, for use in an in-vehicle voice memo device or the like, speech uttered by a person is converted into a simple character string using voice recognition technology and displayed on a liquid crystal display or similar device serving as the information presentation device.

Furthermore, as described in Patent Document 2, a device has been proposed that combines the content of voice input from a microphone with the display of a karaoke device or lecture device. As described in Patent Document 3, a device is also known that uses voice input to switch among various multimedia device inputs when a remote lecture is given using a projector.

Patent Document 1: JP-A-9-330096
Patent Document 2: JP-A-10-282970
Patent Document 3: JP 2000-250392 A

In the video display system of FIG. 7 described above, the means for conveying the operator's (speaker's) speech is easily affected by external factors. Depending on changes in the surrounding noise environment and on how the operator (speaker) speaks (pronunciation, speed), the explanation can become difficult to hear. In addition, because such a system requires a lot of equipment, such as speakers, transporting and installing it is troublesome: the equipment needed for an electronic presentation often has to be carried to the venue and connected there, and this work is a burden. Furthermore, because the conventional means for conveying the operator's (speaker's) speech relies on hearing, it is difficult to give an electronic presentation to hearing-impaired people.

The voice memo device of Patent Document 1 has no projector function and merely displays voice-input content as text on a liquid crystal display or the like. This is useful for taking notes while traveling in a car, but it cannot display images on a screen for a large, stationary audience.

The voice information display device of Patent Document 2, used as a karaoke device or lecture device, likewise has no projector function. Moreover, it requires a remote control and a timer, and is cumbersome to operate.

The remote lecture device of Patent Document 3, on the other hand, does have a projector function, but in addition to the projector it requires an Internet connection and various media devices. These are inconvenient to carry, and the places where the device can be used are limited.

An object of the present invention is to solve the problems described above, namely to provide a video display device (projector) with a built-in voice recognition function that is suitable for electronic presentations and is not affected by external factors such as changes in the surrounding noise environment or the operator's (speaker's) manner of speaking (pronunciation, speed, etc.).

Another object of the present invention is to provide a video display device with a built-in voice recognition function that reduces the amount of equipment needed for an electronic presentation, lightens the burden of carrying it, and reduces installation work such as connecting cables.

A further object of the present invention is to provide a video display device with a built-in voice recognition function that makes electronic presentations possible for hearing-impaired people as well, thereby reaching a wider audience, and that enables more expressive electronic presentations.

The video display device with a built-in voice recognition function according to the present invention comprises: a voice display signal generation unit that receives voice from a microphone, recognizes the input voice and converts it into character data, stores the converted character data in a character display memory, and reads out the stored character data to generate a character display signal; an input video signal processing circuit that receives a plurality of video signals and can switch among them and output the selected signal as a digital video signal; a video display memory that stores the digital video signal output from the input video signal processing circuit; a video display signal generation unit that reads out the digital video signal stored in the video display memory and generates a video display signal; a display signal synthesis circuit that generates a display signal by combining the character display signal supplied from the voice display signal generation unit with the video display signal supplied from the video display signal generation unit; a CPU unit that controls each circuit based on a program; a memory control circuit that controls the video display memory and the voice display signal generation unit under the control of the CPU unit; and a display unit that displays the display signal generated by the display signal synthesis circuit. When a video switching instruction is received from the CPU unit, the device performs the video switching only after identifying that the character display memory no longer holds any stored content.

The voice display signal generation unit of this video display device can be configured to include: a voice input terminal connected to the microphone; a voice recognition circuit that recognizes the voice signal input to the voice input terminal and converts it into character code data one character at a time; a character string buffer circuit that stores the per-character code data as a character string; a character font ROM that stores character fonts; a character display memory that stores the character code data after conversion into character display data; and a character display signal generation circuit that reads the character display data from the character display memory and generates a character display signal. These circuits are connected by buses so that they are controlled by the CPU unit and the memory control circuit.

The CPU unit of the present invention is configured so that, when character code data is stored in the character string buffer circuit, it accesses the character font ROM, converts the character code data into character pattern data, and stores the result in the character display memory.

The memory control circuit of the present invention is configured to control the video display memory and the character display memory so that the video screen and the spoken text are synchronized.

The voice display signal generation unit of the present invention can also be provided with a plurality of voice input terminals so that the speech of each of a plurality of speakers is displayed as text independently on the display unit.

Furthermore, the voice display signal generation unit of the present invention can include a plurality of voice recognition circuits and a plurality of character string buffer circuits corresponding to the plurality of voice input terminals, enabling a dialogue-style display of several speakers' speech.

As described above, the video display device with a built-in voice recognition function of the present invention includes means for displaying the voice signal as text, as an auxiliary means of conveying the operator's (speaker's) voice. Even if the operator's (speaker's) voice becomes hard to hear during an electronic presentation, the content can still be conveyed reliably to the viewers. In other words, a certain level of communicability is ensured regardless of external factors such as changes in the surrounding noise environment or the operator's (speaker's) manner of speaking (pronunciation, speed, etc.).

In addition, because the video display device of the present invention provides means for displaying the voice signal as text as an alternative to a loudspeaker system, an electronic presentation can be given without a loudspeaker device such as a speaker. This reduces the equipment required and the burden of carrying it, and also reduces installation work such as connecting cables.

Because the video display device of the present invention includes means for converting the voice signal into visual information, electronic presentations can also be given to hearing-impaired people, and so can reach a wider audience.

Furthermore, by conveying the voice signal visually as well as audibly at the same time, the video display device of the present invention can add a visual effect to speech during an electronic presentation, enabling a more expressive presentation.

FIG. 1 shows a system configuration using a video display device and the front of the screen, illustrating a first embodiment of the present invention.
FIG. 2 is a circuit configuration diagram of the video display device shown in FIG. 1.
FIG. 3 illustrates examples of horizontal text display on the screen of FIG. 1.
FIG. 4 illustrates examples of vertical text display on the screen of FIG. 1.
FIG. 5 illustrates another example of horizontal text display on the screen of FIG. 1.
FIG. 6 shows a system configuration using a video display device and the front of the screen, illustrating a second embodiment of the present invention.
FIG. 7 is a system configuration diagram using a video display device, illustrating a conventional example.

The present invention is intended to improve the communicability and ease of installation of presentations given while operating digitized materials on a computer or the like (hereinafter, "electronic presentations"). In such a presentation, a computer is generally connected to a video display device, and the presenter operates the computer to display the digitized materials on the video display device in sequence while explaining them. For presentations to a large audience, a loudspeaker system that amplifies the operator's (speaker's) voice is also used, depending on the situation, together with a large-screen video display device. In the present invention in particular, a voice display signal generation unit that recognizes speech and displays it as text is built into the video display device, improving both communicability and ease of installation.

Embodiments of the present invention are described below with reference to the drawings. FIGS. 1(a) and 1(b) are, respectively, a system configuration diagram using a video display device and a front view of the screen, illustrating a first embodiment of the present invention. As shown in FIGS. 1(a) and 1(b), this embodiment is an electronic presentation system in which a video display device 1, to which a computer 3 and a video (VIDEO) device 4 are connected, displays images on a screen 2 that many viewers 7 can see, and in which a microphone 6 is connected to the video display device 1. When the operator (speaker) 5, positioned near the viewers 7, gives voice information such as the greeting "Good morning" to the video display device 1 through the microphone 6, the voice recognition function converts it into character data, and the text "Good morning." is displayed on part of the screen 2 (here, along the bottom), running from the lower right to the lower left of the screen 2 as shown in FIG. 1(b).

As a result, an electronic presentation requires less equipment, and the work of carrying and connecting it is reduced. The text display also serves as an auxiliary means of communication that ensures a certain level of communicability when the speech becomes hard to hear because of changes in the surrounding noise environment or the manner of speaking (pronunciation, speed, etc.) of the operator (speaker) 5, and it makes it easy to add a visual effect to the voice of the operator (speaker) 5.

FIG. 2 is a circuit configuration diagram of the video display device shown in FIG. 1. As shown in FIG. 2, this video display device (projector device) 1 is a projection-type video display device that enlarges and projects images onto the screen 2 to provide a large-screen display.

The projector device 1 includes: a plurality of video input terminals 101 to which video signals from a computer or the like are supplied; an input video signal processing circuit 102 that A/D-converts the video signals supplied from the video input terminals 101; a video display memory 103 that stores the video signal digitized by the input video signal processing circuit 102 as video display data to be output for display; a video display signal generation circuit 104 that sequentially reads the display data from the video display memory 103 and generates a video display signal for the image to be displayed; a voice display signal generation unit 105 (enclosed by the dotted frame) that converts a voice signal into a character display signal; a display signal synthesis circuit 106 that combines the video display signal supplied from the video display signal generation circuit 104 with the character display signal supplied from the voice display signal generation unit 105 to generate the final display signal; a display unit 107 that projects the display signal supplied from the display signal synthesis circuit 106; a CPU unit 108 that contains program data and controls all circuits in the projector device 1 via a control bus 110 and a data bus 111; and a memory control circuit 109 that controls the memories in the projector device 1. In particular, the memory control circuit 109, under the control of the CPU unit 108, controls the video display memory 103 and the voice display signal generation unit 105 and keeps them synchronized so that the whole of each spoken sentence is reliably displayed on its intended video screen. The display unit 107 consists of a display device, an optical lens, a light source lamp, and so on; the display device is typically a liquid crystal or DLP device. The image on the display device is enlarged, projected, and displayed on the screen 2.

The voice display signal generation unit 105 further includes: a voice input terminal 1050 to which the voice signal from the microphone 6 is supplied; a voice recognition circuit 1051 that recognizes the voice signal supplied through the voice input terminal 1050 and sequentially converts it into character codes; a character string buffer circuit 1052 that stores the character codes supplied from the voice recognition circuit 1051; a character display memory 1053 that stores the character code data of the character string buffer circuit 1052 as character display data to be output for display; a character display signal generation circuit 1054 that sequentially reads the stored data from the character display memory 1053 and generates the character display signal to be output; and a character font ROM 1055 that stores the character pattern data corresponding to the character code data held in the character string buffer circuit 1052. These circuits are connected by the control bus 110 and the data bus 111 described above. The character font ROM 1055, which stores the character fonts, and the character display memory 1053, which stores the character pattern data, are controlled by the same memory control circuit 109 that controls the video display memory 103, so that the video screen and the spoken text are synchronized. In addition, when the memory control circuit 109 receives a video switching instruction from the CPU unit 108, it performs the switch only after identifying that the character display memory 1053 no longer holds any stored content.
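The deferred switching rule can be illustrated with a minimal software sketch. The patent describes hardware circuits, so the class below is only an analogy: `MemoryController`, `store_text`, `request_switch`, and `read_char` are hypothetical names, and a `deque` stands in for the character display memory 1053.

```python
from collections import deque


class MemoryController:
    """Toy software model of memory control circuit 109 (all names hypothetical)."""

    def __init__(self, source):
        self.active_source = source   # video input currently being displayed
        self.pending_source = None    # switch request not yet applied
        self.char_memory = deque()    # stands in for character display memory 1053

    def store_text(self, text):
        """Queue recognized characters for display."""
        self.char_memory.extend(text)

    def request_switch(self, source):
        """Record a switch request from the CPU; apply it only if no text is pending."""
        self.pending_source = source
        self._apply_switch_if_idle()

    def read_char(self):
        """Read out one character for display, then re-check any pending switch."""
        ch = self.char_memory.popleft() if self.char_memory else None
        self._apply_switch_if_idle()
        return ch

    def _apply_switch_if_idle(self):
        # Switch the video source only once the character memory is empty.
        if self.pending_source is not None and not self.char_memory:
            self.active_source = self.pending_source
            self.pending_source = None
```

In this sketch a switch requested while text is still queued is simply held back until the last character has been read out, which mirrors the condition stated in the paragraph above.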

The projector device 1 normally enlarges and projects onto the screen 2 the video signal of an external video device, such as a computer, supplied from a video input terminal 101. The video signal is processed roughly as follows. First, the video signal of the external video device is supplied from the video input terminal 101 to the input video signal processing circuit 102, which A/D-converts it, so that the video signal is sequentially converted from analog to digital. The digitized video signal is then sequentially stored in the video display memory 103 as video display data. The contents of the video display memory 103 are thus continually updated with the digitized video display data of the signal arriving at the video input terminal 101. The video display signal generation circuit 104 sequentially reads the video display data stored in the video display memory 103, generates the video display signal to be output, and supplies it to the display signal synthesis circuit 106.

Next, the operation of the voice display signal generation unit 105 is described. When a voice signal from the microphone 6 or similar device used by the operator (speaker) 5 is supplied from the voice input terminal 1050 to the voice recognition circuit 1051, the voice recognition circuit 1051 performs character recognition on it and sequentially converts the voice signal into character code data, one character at a time. The resulting character codes are sequentially stored in the character string buffer circuit 1052. When character code data is stored in the character string buffer circuit 1052, the CPU unit 108 uses the character font ROM 1055 to convert the character code data into character pattern data. The converted character pattern data is then sequentially stored in the character display memory 1053 as character display data. The character display signal generation circuit 1054 sequentially reads the character display data stored in the character display memory 1053, generates from it the character display signal to be output, and outputs that signal to the display signal synthesis circuit 106. The CPU unit 108 also supplies the character display signal generation circuit 1054 with control data such as the read position of the character display data and the output position of the character display, so that the way the characters are displayed and their position can be changed.
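The path from character codes to character pattern data can be sketched as follows. This is an illustration only: the 3x3 glyphs, the `FONT_ROM` table, and the functions `recognize` and `render` are hypothetical stand-ins, and real speech recognition is replaced by iterating over an already known string.

```python
# Hypothetical 3x3 glyph bitmaps standing in for character font ROM 1055.
FONT_ROM = {
    "H": ("101", "111", "101"),
    "I": ("111", "010", "111"),
}
BLANK = ("000", "000", "000")  # used for codes with no glyph in the table


def recognize(utterance):
    """Stand-in for voice recognition circuit 1051: emit one character code at a time."""
    for ch in utterance:
        yield ord(ch)


def render(codes):
    """CPU step: look up each buffered code in the font table and append its
    pattern to the rows of a (toy) character display memory."""
    rows = ["", "", ""]
    for code in codes:
        glyph = FONT_ROM.get(chr(code), BLANK)
        for i, glyph_row in enumerate(glyph):
            rows[i] += glyph_row
    return rows


string_buffer = list(recognize("HI"))   # plays the role of string buffer 1052
char_display_memory = render(string_buffer)
```

Keeping the code buffer separate from the rendered patterns follows the two-stage structure in the text, where the buffer holds character codes and the display memory holds the bitmaps that are actually read out for display.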

Next, so that characters can be displayed on the video display screen, the display signal synthesis circuit 106 generates a display signal by combining the video display signal supplied from the video display signal generation circuit 104 with the character display signal supplied from the character display signal generation circuit 1054, and supplies it to the display unit 107. The display signal supplied to the display unit 107 is enlarged and projected onto the screen 2 as the display image of the projector device 1. As a result, on the screen 2 the voice signal input at the voice input terminal 1050 is displayed as text, character by character, over the video image of the signal input at the video input terminal 101.
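As a rough software analogy for the synthesis step, the sketch below overlays a character bitmap on a video frame. The patent does not specify how the circuit combines the two signals, so the rule used here (a set text pixel replaces the video pixel) is an assumption, and `compose` and its parameters are hypothetical names.

```python
def compose(video, text, top, left, text_pixel="#"):
    """Overlay a text bitmap on a video frame.

    video: list of equal-length strings, one character per pixel
    text:  list of strings of "0"/"1"; "1" marks a lit text pixel
    top, left: position of the text bitmap inside the frame
    """
    out = [list(row) for row in video]
    for r, text_row in enumerate(text):
        for c, bit in enumerate(text_row):
            y, x = top + r, left + c
            # Draw only lit pixels that fall inside the frame.
            if bit == "1" and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = text_pixel
    return ["".join(row) for row in out]


frame = compose(["....", "....", "...."], ["101"], top=2, left=1)
```

Changing `top` and `left` corresponds loosely to the CPU-supplied output position control data: the same bitmap can be placed along the bottom, the top, or a side of the frame.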

FIGS. 3(a) and 3(b) illustrate examples of horizontal text display on the screen of FIG. 1. In FIG. 3(a), the screen 2 shows the same video image throughout (not illustrated; the same applies below), with time progressing from left to right across the figure. This is because, to superimpose several spoken sentences on a single video image, the sentences must be displayed one after another as time passes. Here the spoken text is displayed horizontally along the bottom, but as shown in FIG. 3(b) it can also be displayed horizontally along the top.
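One way to picture text that moves across a fixed-width strip over time is a sliding window. The sketch below is an assumption-laden illustration: the function name `ticker_frames` is hypothetical, and it assumes the text enters at the right edge and moves left one character per frame, which the figures may or may not match exactly.

```python
def ticker_frames(text, width):
    """Return successive contents of a display strip `width` characters wide
    as `text` scrolls through it from right to left."""
    padded = " " * width + text + " " * width
    return [padded[i:i + width] for i in range(len(padded) - width + 1)]


frames = ticker_frames("AB", 3)
```

The step size per frame plays the role of the display speed, and the window start index plays the role of the read position, both of which the text says can be varied through control data.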

FIGS. 4(a) and 4(b) illustrate examples of vertical text display on the screen of FIG. 1. The screen 2 operates in the same way as in FIG. 3: FIG. 4(a) shows the spoken text displayed vertically on the right side, and FIG. 4(b) shows it displayed vertically on the left side.

FIG. 5 illustrates another example of horizontal text display on the screen of FIG. 1. As shown in FIG. 5, in this scheme the spoken text is not moved over time; instead, each spoken sentence is displayed all at once for a predetermined time and erased when that time has elapsed. When there are several spoken sentences, they can be displayed on the screen 2 over several lines.
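The display-then-erase scheme can be sketched with a small timer model. The class `TimedCaptions`, the `hold_seconds` parameter, and the explicit `now` timestamps are hypothetical; the patent only says that a sentence is shown for a predetermined time and then erased.

```python
class TimedCaptions:
    """Keep each sentence visible for a fixed time, then drop it (toy model)."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.lines = []  # list of (expiry_time, sentence)

    def add(self, sentence, now):
        """Show a whole sentence at once, starting at time `now`."""
        self.lines.append((now + self.hold_seconds, sentence))

    def visible(self, now):
        """Return the sentences still on screen at time `now`, oldest first."""
        self.lines = [(t, s) for t, s in self.lines if now < t]
        return [s for _, s in self.lines]


captions = TimedCaptions(hold_seconds=5.0)
captions.add("Good morning.", now=0.0)
captions.add("Welcome.", now=2.0)
```

Passing the time in explicitly keeps the sketch deterministic; a real device would presumably drive this from its own clock.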

The spoken-text displays of FIGS. 3 to 5 described above can be implemented under program control by the CPU unit 108 of the projector device 1. That is, each display style can be produced by changing the control data that the CPU unit 108 supplies to the character display signal generation circuit 1054, such as the read position of the character display data and the character display output position; the display position and display speed of the characters can be changed in the same way.

FIGS. 6(a) and 6(b) are, respectively, a system configuration diagram using a video display device according to a second embodiment of the present invention and a front view of its screen. In this embodiment, two microphones 61 and 62 are connected to the projector apparatus 1, and the speech of two operators (speakers) 51 and 52 is displayed on the screen 2. To this end, the projector apparatus 1 is provided with two of the voice input terminals 1050 of FIG. 2, as well as two voice recognition circuits 1051 and two character string buffer circuits 1052, and the display order and display method are programmed into the CPU unit 108. In other words, by preparing one processing path per input terminal, two voice signals can be displayed as text at the same time. As shown in FIG. 6(b), the speech of the first speaker 51 is displayed horizontally along the top of the screen 2 from left to right, and the speech of the second speaker 52 is displayed horizontally along the bottom from right to left, so that a dialogue-style display can also be realized. The text may also be displayed in any of the ways shown in FIGS. 3 to 5, or in a combination of them. The same applies when there are two or more operators (speakers).
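The idea of one processing path per input terminal can be sketched as below. This is a simplified model under stated assumptions: `CaptionPath` and `compose` are hypothetical names, recognized text is passed in as plain strings, and "right to left" is approximated by anchoring the text at the right edge rather than by animating it.

```python
from typing import Dict, List


class CaptionPath:
    """Hypothetical processing path for one speaker, with its own text buffer."""

    def __init__(self, row: str, direction: str) -> None:
        self.row = row              # "top" or "bottom" (assumed row labels)
        self.direction = direction  # "ltr" or "rtl"
        self.buffer = ""            # stand-in for one character string buffer

    def push(self, recognized_text: str) -> None:
        # Text recognized from this speaker's microphone goes only into this buffer.
        self.buffer += recognized_text

    def render(self, width: int) -> str:
        text = self.buffer[-width:]  # keep the newest characters that fit in the row
        if self.direction == "ltr":
            return text.ljust(width)
        return text.rjust(width)


def compose(paths: List[CaptionPath], width: int) -> Dict[str, str]:
    """Render every path independently and collect the rows for the screen."""
    return {p.row: p.render(width) for p in paths}
```

Because each path keeps its own buffer, adding a further speaker in this sketch only means adding another `CaptionPath`.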

The two embodiments above have been described using the projector apparatus 1, the most common video display device for electronic presentations, as an example, but the invention is not limited to it. Regardless of the display method, any video display device such as a TV or a monitor achieves the same effect if it incorporates the voice display signal generation unit 105.

In addition, changing the control data supplied from the CPU unit 108 to the character display signal generation circuit 1054, such as the read position of the character display data and the character display output position, changes the character display position and display speed. A variety of display methods therefore become possible, and a visual effect suited to the situation can be produced.

Furthermore, keywords and their corresponding operations, such as turning the power of the projector apparatus 1 on or off or switching the video input terminal 101, can be set in the CPU unit 108 in advance. When the CPU unit 108 finds character code data in the character string buffer circuit 1052 that matches a keyword, it performs the operation assigned to that keyword. The operator (speaker) can thus obtain the effect of an ordinary remote control simply by uttering a keyword, without using a remote control device.
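The keyword lookup described above can be sketched as a table that maps keywords to operations. Everything named here is an assumption made for illustration: the `Projector` class, the number of inputs, the keyword phrases, and the use of a plain substring search in place of a comparison of character code data.

```python
from typing import Callable, Dict, List


class Projector:
    """Hypothetical stand-in for the operations a keyword may trigger."""

    def __init__(self, num_inputs: int = 2) -> None:
        self.power_on = True
        self.num_inputs = num_inputs  # assumed number of video input terminals
        self.input_index = 0

    def power_off(self) -> None:
        self.power_on = False

    def next_input(self) -> None:
        # Cycle to the next video input terminal.
        self.input_index = (self.input_index + 1) % self.num_inputs


def scan_buffer(buffer: str, keywords: Dict[str, Callable[[], None]]) -> List[str]:
    """Run the operation of every registered keyword found in the buffer text."""
    matched = []
    for word, action in keywords.items():
        if word in buffer:
            action()
            matched.append(word)
    return matched
```

A table of this kind keeps the keyword set configurable in advance, in line with the statement that keywords and operations are set in the CPU unit 108 beforehand.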

Description of symbols
1 Video display device
2 Screen
3 Computer
4 Video (VIDEO) equipment
6, 61, 62 Microphone
101 Video input terminal
102 Input video signal processing circuit
103 Video display memory
104 Video display signal generation circuit
105 Voice display signal generation unit
106 Display signal synthesis circuit
107 Display unit
108 CPU unit
109 Memory control circuit
110 Control bus
111 Data bus
1050 Voice input terminal
1051 Voice recognition circuit
1052 Character string buffer circuit
1053 Character display memory
1054 Character display signal generation circuit
1055 Character font ROM

Claims

A video display device with a built-in voice recognition function, comprising:
a voice display signal generation unit that receives voice from a microphone, recognizes the input voice and converts it into character data, stores the converted character data in a character display memory, reads the stored character data, and generates a character display signal;
an input video signal processing circuit that can receive a plurality of video signals and switches among the input video signals to output a digital video signal;
a video display memory that stores the digital video signal output from the input video signal processing circuit;
a video display signal generation unit that reads the digital video signal stored in the video display memory and generates a video display signal;
a display signal synthesis circuit that generates a display signal by synthesizing the character display signal supplied from the voice display signal generation unit and the video display signal supplied from the video display signal generation unit;
a CPU unit that controls each circuit based on a program;
a memory control circuit that controls the video display memory and the voice display signal generation unit under the control of the CPU unit; and
a display unit that displays the display signal generated by the display signal synthesis circuit,
wherein, when a video switching instruction is received from the CPU unit, the video is switched after it is determined that the stored contents of the character display memory have been cleared.

The video display device with a built-in voice recognition function according to claim 1, wherein the voice display signal generation unit comprises: a voice input terminal connected to the microphone; a voice recognition circuit that recognizes a voice signal input to the voice input terminal and converts it into character code data for each character; a character string buffer circuit that stores the character code data for each character as a character string; a character font ROM that stores character fonts; a character display memory that stores the character code data after conversion into character display data; and a character display signal generation circuit that reads the character display data from the character display memory and generates a character display signal, the circuits being connected by a bus and controlled from the CPU unit and the memory control circuit.

The video display device with a built-in voice recognition function according to claim 1 or 2, wherein, when character code data is stored in the character string buffer circuit, the CPU unit accesses the character font ROM, converts the character code data into character pattern data, and stores the character pattern data in the character display memory.

The video display device with a built-in voice recognition function according to claim 1, wherein the memory control circuit controls the video display memory and the character display memory so that a video screen and a voice sentence can be synchronized.

The video display device with a built-in voice recognition function according to claim 1 or 2, wherein the voice display signal generation unit includes a plurality of voice input terminals and displays the voices of a plurality of speakers independently on the display unit.

The video display device with a built-in voice recognition function, wherein the voice display signal generation unit includes a plurality of voice recognition circuits and a plurality of character string buffer circuits corresponding to the plurality of voice input terminals, and performs interactive display by the plurality of speakers.