JP2003223179A - Intelligent loudspeaker

Info

Publication number: JP2003223179A
Application number: JP2002022149A
Authority: JP
Inventors: Michiaki Kuno; 道明久野
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2002-01-30
Filing date: 2002-01-30
Publication date: 2003-08-08

Abstract

PROBLEM TO BE SOLVED: To provide an intelligent loudspeaker enabling easy control of voice output based on voice information given from an external controller.

SOLUTION: The intelligent loudspeaker is provided with a receiver section for receiving voice information including at least prescribed character data and phonological data, a voice synthesizing database, a voice data synthesizing section for generating synthetic voices corresponding to the voice information utilizing the voice synthesizing database, and a voice output section for outputting, based on the voice information, the generated synthetic voices.

COPYRIGHT: (C)2003,JPO

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to an intelligent speaker, and more particularly to an intelligent speaker that generates and outputs synthesized speech based on received voice information.

[0002]

[Prior Art] Speech synthesizers have conventionally been provided that store speech information such as phonemes and prosody in advance, convert input character data into speech, and output it from a speaker. In addition, Japanese Unexamined Patent Publication No. 2000-231396 describes a dialogue data creation apparatus that, in order to output synthesized speech with natural intonation and pauses like the lines of a drama, extracts phoneme data and prosody data from input speech and outputs natural synthesized speech using the extracted data together with pre-stored unit speech data and unit speech parameters of multiple speakers.

[0003] Further, Japanese Unexamined Patent Publication No. H8-249343 describes a speech information acquisition device that stores input real-time speech as digital speech data, extracts from that data the portions matching pre-registered phrasing patterns peculiar to news, converts them into text data by speech recognition, and uses the text data to search an external database efficiently.

[0004] As shown in FIG. 3, there is also an in-house broadcasting system in which a broadcasting amplifier 51 outputs the same pre-recorded synthesized speech to a large number of speakers 52 to 63.

[0005] In the system of FIG. 3, the audio output from the broadcasting amplifier 51 is an analog signal, and each speaker converts the analog signal it receives into sound. FIG. 3 shows a system in which the speakers are divided into three groups; in this case, the broadcasting amplifier 51 can output three different synthesized voices, one to each group.

[0006]

[Problems to Be Solved by the Invention] However, the devices described in Japanese Unexamined Patent Publications No. 2000-231396 and No. H8-249343 take live human speech as input and use information obtained from that speech to create synthesized speech or to retrieve information. When outputting synthesized speech to a large number of speakers, they can output natural speech or output synthesized speech found by a search, but they cannot change the synthesized speech output by each individual speaker, nor control the output time, the number of repetitions, or the selection of speakers.

[0007] Moreover, the conventional system of FIG. 3 presupposes that synthesized speech output to a group is emitted from every speaker in that group; whether or not to output could not be controlled on the speaker side.

[0008] The present invention has been made in view of the above circumstances. Its object is to provide an intelligent speaker that is itself equipped with functions such as speech synthesis and speech output control, so that the output of synthesized speech can be controlled easily on the basis of signals given by an external control device.

[0009]

[Means for Solving the Problems] The present invention provides an intelligent speaker comprising: a receiving unit that receives voice information including at least predetermined character data and phoneme data; a speech synthesis database; a voice data synthesizing unit that generates synthesized speech corresponding to the voice information using the speech synthesis database; and a voice output unit that outputs the generated synthesized speech based on the voice information.

[0010] Because synthesized speech is output using the received voice information, an external device can, by supplying the desired voice information, have speech with the desired content output at a desired time, number of repetitions, speed, and so on. The voice data synthesizing unit is the part that, according to predetermined rules, synthesizes so-called digital data such as given sentences, words, characters, and symbols into information in a form that can be converted into analog speech.

[0011] The speech synthesis database stores phoneme information for each word, character, and symbol, together with basic information on phonology, prosody, language, gender, and the like; speech synthesis is performed by combining the phoneme information with the basic information. The voice information is created mainly by an external terminal device that has a housing separate from the intelligent speaker and is placed at a different location, and is transmitted from that external terminal device to the intelligent speaker over a predetermined communication line.

[0012] The communication line is not particularly limited, as long as it can carry the voice information as digital data. It may be wired, such as a LAN line (for example 10BASE-T) or a directly connected dedicated line, or it may be wireless. The receiving unit therefore only needs a receiving function compatible with the communication line and communication protocol in use.

[0013] In the narrow sense, the voice output unit means a speaker proper, which converts the digital data corresponding to the voice data into an analog signal and emits it as sound. In the broad sense, it means the part that performs overall output control: controlling the receiving unit and the voice data synthesizing unit, interpreting the received voice information, and driving the speaker to emit sound. This overall output control can be implemented with a microcomputer comprising a CPU, ROM, RAM, I/O controller, timer, and the like.

[0014] The processing of each functional unit is carried out by the CPU operating according to a program stored in the ROM, RAM, or the like. The speech synthesis database and the data needed while the CPU is operating are stored on a hard disk, RAM, MO, DVD-RAM, or the like. Fixed data such as the speech synthesis database may also be read from a CD-ROM, CD-R, DVD-ROM, IC memory card, or the like.

[0015] The voice information of the present invention may further include output control data consisting of any of the number of output repetitions, a designated output time, the volume, and the reading speed, and the voice output unit may output speech in accordance with this control data. For example, when control data including a volume is received, speech is output at the corresponding loudness. If voice information including a designated output time and the like is sent to the intelligent speaker in advance, the intelligent speaker can control the speech output by itself; the external device that supplies the control data then no longer needs to monitor and control the intelligent speaker continuously, which reduces the load on the external device.

[0016] The intelligent speaker may further comprise a number setting unit that sets an identification number identifying the speaker itself. The voice information may further include a unique identification number specifying an intelligent speaker, and the voice output unit may output speech using the received voice information when the identification number set by the number setting unit matches the identification number contained in the voice information. In this way, even when multiple intelligent speakers are connected to a single communication line, each speaker can decide for itself whether to output speech. Different speech can therefore be output from each of the intelligent speakers connected to the single communication line.

[0017] When the voice information contains punctuation marks, the voice output unit may refrain from voicing them and instead insert a predetermined pause at the point where each punctuation mark would be output. The text is then spoken in phrases delimited by punctuation, giving output at a natural pace that is easy to follow. Furthermore, even when a long text is received divided across several pieces of voice information, placing punctuation marks at suitable positions allows the merged pieces to be output in a natural rhythm with appropriate pauses.
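
The punctuation handling described in paragraph [0017] can be sketched as follows. This is only an illustration: the set of pause marks, the pause durations, and the function name `split_with_pauses` are assumptions and do not appear in the patent.

```python
import re
from typing import List, Tuple

# Punctuation treated as pause markers (an assumed set, not from the patent).
PAUSE_MARKS = "、。,."
# Assumed pause lengths in seconds; the patent only says "a predetermined pause".
PAUSE_SECONDS = {"、": 0.3, ",": 0.3, "。": 0.6, ".": 0.6}


def split_with_pauses(text: str) -> List[Tuple[str, float]]:
    """Split text at punctuation and return (segment, pause_after) pairs.

    The punctuation mark itself is dropped from the spoken segment and a
    pause is attached in its place.
    """
    pattern = "([" + re.escape(PAUSE_MARKS) + "])"
    # With a capturing group, re.split yields: text, mark, text, mark, ..., text
    parts = re.split(pattern, text)
    result = []
    for i in range(0, len(parts), 2):
        segment = parts[i].strip()
        mark = parts[i + 1] if i + 1 < len(parts) else ""
        if segment:
            result.append((segment, PAUSE_SECONDS.get(mark, 0.0)))
    return result
```

Because the function works on plain text, a long message that arrived in several pieces can simply be concatenated before splitting, for example `split_with_pauses(part1 + part2)`. A real implementation would need to avoid splitting inside numbers such as "3.5", which this sketch does not handle.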

[0018] The intelligent speaker may further comprise a display unit, and when the received voice information contains a language designation, characters, or symbols that cannot be output as speech, the character data contained in the voice information may be shown on the display unit. Thus, even when synthesized speech cannot be produced, for example because a language not stored in the speech synthesis database has been designated or because the voice information contains unpronounceable symbols, the content can easily be confirmed visually by displaying it as text instead.
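
The fallback from speech to display in paragraph [0018] amounts to a simple routing decision. The sketch below assumes a toy set of supported languages and speakable characters; `SUPPORTED_LANGUAGES`, `SPEAKABLE`, and `route_message` are hypothetical names standing in for whatever the speech synthesis database actually covers.

```python
from typing import Tuple

# Assumed contents of the speech synthesis database (illustrative only).
SUPPORTED_LANGUAGES = {"ja", "en"}
SPEAKABLE = set("あいうえおabcdefghijklmnopqrstuvwxyz 、。,.")


def route_message(text: str, language: str) -> Tuple[str, str]:
    """Return ("speak", text) if the text can be synthesized, else ("display", text)."""
    if language not in SUPPORTED_LANGUAGES:
        # Language designation not covered by the database: show the text instead.
        return ("display", text)
    if any(ch.lower() not in SPEAKABLE for ch in text):
        # At least one character or symbol cannot be pronounced.
        return ("display", text)
    return ("speak", text)
```

For instance, `route_message("あいう", "ja")` selects speech, while a message containing an arrow symbol or a message tagged with an unsupported language is sent to the display.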

[0019] The present invention also provides an in-house broadcasting system in which a plurality of intelligent speakers as described above and a control terminal that generates the voice information are connected via a communication line, wherein the control terminal transmits voice information that includes the unique identification numbers of the intelligent speakers that are to output speech, and each intelligent speaker accepts and outputs only voice information that contains its own identification number. This makes it possible to control individually whether each of multiple intelligent speakers outputs speech. In addition, if control data relating to speech output is included in the voice information, the intelligent speakers can perform output control automatically, so the control terminal does not need to monitor and control each speaker continuously, and its load is reduced.

[0020] Here, the control terminal may, taking into account the relative distances between the intelligent speakers, set a designated output time for each intelligent speaker so that no echo occurs, and set for each intelligent speaker the minimum volume needed to compensate for attenuation of loudness while minimizing interaction between speakers. The control terminal then transmits voice information containing the designated output time and volume set for each intelligent speaker, and each intelligent speaker outputs the voice data contained in the received voice information at the designated output time contained in it. This prevents the sound from seeming to echo because of differences in the arrival time of speech from the individual intelligent speakers or because of unnecessarily loud output.
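
The patent does not say how the control terminal derives the designated output times from the speaker distances. One common approach in distributed sound systems, shown here purely as an assumption, is to delay each speaker by the time sound needs to travel from a reference speaker to it, so that a listener hears the nearby and distant speakers roughly in step. The function name and the single-reference-speaker geometry are hypothetical.

```python
from typing import Dict

# Approximate speed of sound in air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def output_offsets(distances_m: Dict[str, float]) -> Dict[str, float]:
    """Map each speaker ID to a start offset in seconds.

    distances_m gives each speaker's distance in metres from a reference
    speaker (distance 0), which starts playing at offset 0.
    """
    return {sid: d / SPEED_OF_SOUND_M_S for sid, d in distances_m.items()}
```

A controller could add each offset to a common base time and place the result in the broadcast time field of the message sent to that speaker. Volume selection, which the paragraph also mentions, is left out of this sketch.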

[0021] The present invention further provides a home network system in which a plurality of home control terminals and one or more intelligent speakers as described above are connected via a communication line, wherein each home control terminal transmits voice information containing message data specific to that control terminal and the unique identification number of the intelligent speaker that is to output it, and the intelligent speaker accepts only voice information containing its own identification number and outputs the message data in it as speech. If the home control terminals are built into various electrical appliances (for example, air conditioners, televisions, and rice cookers) and the intelligent speaker is built into a home controller or the like, message data sent from the appliances can be spoken by the intelligent speaker of the home controller, so that the status of the appliances can be heard in one place.

[0022]

[Embodiments of the Invention] The present invention is described in detail below on the basis of the embodiments shown in the drawings. The invention is not limited to these embodiments. FIG. 1 is a block diagram showing the configuration of one embodiment of the intelligent speaker of the present invention.

[0023] In FIG. 1, reference numeral 11 denotes a voice data creation unit included in a control device installed outside the intelligent speaker; it creates digitally coded voice data with the structure shown in FIG. 2, described later. The externally installed control device corresponds to the broadcasting controller 70 shown in FIG. 4, typified by a personal computer. Reference numeral 1 in FIG. 1 denotes the external input terminal of the intelligent speaker, which receives the voice data sent from the voice data creation unit 11.

[0024] The intelligent speaker and the voice data creation unit 11 are normally connected by a single communication line. However, a device with a wireless communication function may be connected at the external input terminal 1 so that data communication between the intelligent speaker and the voice data creation unit 11 is carried out wirelessly.

[0025] When multiple intelligent speakers are connected to one voice data creation unit 11, each intelligent speaker may be connected by its own separate wire, or the speakers may be connected using a LAN line such as 10BASE-T, a wireless LAN, or the like. In that case, a device with a LAN connection function is connected at the external input terminal 1 of the intelligent speaker.

[0026] The voice data receiving unit 2 in FIG. 1 acquires the voice data received via the external input terminal 1, taking the predetermined data format shown in FIG. 2 as one unit; the received voice data is stored in the buffer memory 3. The buffer memory 3 may be a semiconductor memory such as RAM or a disk medium such as a hard disk.

[0027] The ID number setting unit 4 sets the identification number (ID number) unique to this intelligent speaker. It may be a DIP switch or rotary switch, or the ID number may be read from a ROM or an IC card. Alternatively, setting data may be received via the external input terminal 1 and the ID number contained in it stored in the buffer memory 3. The clock drive unit 6 is a built-in clock, used to output the current time, to measure set time intervals, and so on.

[0028] The voice data synthesizing unit 7 generates synthesized speech from the given voice information using the speech synthesis database 8. The speech synthesis database 8 stores in advance digital speech waveform information corresponding to voice data such as characters and words. The speaker 9 converts the speech waveform data synthesized by the voice data synthesizing unit 7 into sound waves and outputs them.

[0029] The display unit 10 is not an essential component from the standpoint of broadcasting by voice, but it visually displays setting information needed when using the intelligent speaker, as well as data that cannot be or need not be converted to speech. A display device such as a CRT, LCD, or EL display can be used. The display unit 10 can also be used to confirm in text the content of the speech broadcast from the speaker.

[0030] The execution control unit 5 operates each of the above functional units and can be implemented by a microcomputer comprising a CPU, ROM, RAM, I/O controller, and the like. The execution control unit 5 performs, for example, the following functions:
(1) storing data received by the voice data receiving unit 2 in the buffer memory 3;
(2) reading the set ID number from the ID number setting unit 4;
(3) controlling the clock drive unit 6 (setting the time, etc.);
(4) analyzing the received voice data;
(5) creating the data to be given to the voice data synthesizing unit 7;
(6) controlling the display unit 10 and creating the data to be shown on it.

[0031] In this intelligent speaker, the execution control unit 5 constantly monitors, via the voice data receiving unit 2, whether voice data has been received, and stores received voice data in the buffer memory 3 if it contains the speaker's own ID number. If the stored voice data specifies a broadcast time, the execution control unit 5 checks the current time obtained from the clock drive unit 6; when the specified broadcast time arrives, it passes data derived from the information in the received voice data to the voice data synthesizing unit 7, has it create synthesized speech, and outputs the speech from the speaker 9. This is an outline of the operation of the execution control unit 5 of the intelligent speaker of the present invention.
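
The outline in paragraph [0031] (accept only messages addressed to this speaker, buffer them, and play them once the broadcast time arrives) can be modelled roughly as below. The class, its method names, and the dictionary keys are all hypothetical, and playing a message immediately when no broadcast time is given is an assumption rather than something stated in this paragraph.

```python
from datetime import datetime
from typing import List, Optional


class SpeakerController:
    """Toy model of execution control unit 5 (illustrative names only)."""

    def __init__(self, own_id: str):
        self.own_id = own_id
        self.buffer: List[dict] = []   # stands in for buffer memory 3
        self.spoken: List[str] = []    # stands in for output from speaker 9

    def on_receive(self, packet: dict) -> bool:
        """Store the packet only if it is addressed to this speaker."""
        if self.own_id not in packet["dest_ids"]:
            return False
        self.buffer.append(packet)
        return True

    def tick(self, now: datetime) -> None:
        """Play every buffered packet whose broadcast time has arrived."""
        pending = []
        for packet in self.buffer:
            when: Optional[datetime] = packet.get("broadcast_time")
            if when is None or now >= when:
                # Speech synthesis and audio output would happen here.
                self.spoken.append(packet["text"])
            else:
                pending.append(packet)
        self.buffer = pending
```

In use, `tick` would be called periodically with the time from the built-in clock, which corresponds to the execution control unit checking the clock drive unit 6.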

[0032] FIG. 2 illustrates the structure of one embodiment of the voice data received by the voice data receiving unit 2. One unit of voice data begins with a header code 20 and ends with an end code 29. The header code 20 and the end code 29 are predetermined special numerical values marking the start and end of the voice data, respectively. The destination ID count 21 gives the total number of speakers that are to broadcast, and each destination ID number 22 gives the ID number of one such speaker. The destination ID number 22 is repeated as many times as indicated by the destination ID count 21; that is, there may be not just one destination ID number but several.

[0033] The source ID number 23 is an identification number specifying the device that sent the voice data, such as the broadcasting controller 70 described later. The total byte count 24 gives the total number of bytes from the control data byte count 25 through the voice data 28.

[0034] The control data byte count 25 is the total number of bytes of the control data 26. The control data 26 is information for controlling the output of the synthesized speech, and the voice data is output according to it. For example, the control data 26 may include a "repetition code 30" indicating the number of repetitions, a "broadcast time code 31" specifying the broadcast time, a "volume designation code 32" specifying the volume, a "reading speed designation code 33" specifying the reading speed, and a "music designation code 34" specifying a music data type (such as MIDI) and the music data itself.

[0035] The voice data byte count 27 is the total number of bytes of the voice data 28. The voice data 28 consists of a sequence (35 to 42) of pairs, each pair comprising voice data A and prosody data I. Besides voice data such as that of FIG. 2, the data received may also be data for controlling the intelligent speaker. For example, control data 26 signifying time synchronization can be defined; sending it to all intelligent speakers simultaneously sets the clocks of all of them.
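
The field order of FIG. 2 described in paragraphs [0032] to [0035] can be illustrated with a small encoder and decoder. The patent gives the order of the fields but not their sizes or values, so everything concrete here is an assumption: one-byte header and end codes with the values 0x02 and 0x03, a one-byte destination count, 16-bit big-endian integers for IDs and byte counts, and control and voice data treated as opaque byte strings.

```python
import struct
from typing import List

# Assumed marker values; the patent only says they are "predetermined".
HEADER, END = 0x02, 0x03


def encode(dest_ids: List[int], source_id: int, control: bytes, voice: bytes) -> bytes:
    """Serialize one message in the field order of FIG. 2 (assumed field widths)."""
    # Fields 25 to 28: control byte count, control data, voice byte count, voice data.
    body = struct.pack(">H", len(control)) + control + struct.pack(">H", len(voice)) + voice
    out = struct.pack(">BB", HEADER, len(dest_ids))          # header 20, ID count 21
    out += b"".join(struct.pack(">H", d) for d in dest_ids)  # destination IDs 22
    out += struct.pack(">HH", source_id, len(body))          # source ID 23, total bytes 24
    return out + body + struct.pack(">B", END)               # end code 29


def decode(data: bytes) -> dict:
    """Parse a message produced by encode(); raise ValueError on bad framing."""
    if data[0] != HEADER or data[-1] != END:
        raise ValueError("missing header or end code")
    n = data[1]
    pos = 2
    dest_ids = list(struct.unpack_from(">%dH" % n, data, pos))
    pos += 2 * n
    source_id, total = struct.unpack_from(">HH", data, pos)
    pos += 4
    if total != len(data) - pos - 1:
        raise ValueError("total byte count mismatch")
    (clen,) = struct.unpack_from(">H", data, pos)
    control = data[pos + 2 : pos + 2 + clen]
    pos += 2 + clen
    (vlen,) = struct.unpack_from(">H", data, pos)
    voice = data[pos + 2 : pos + 2 + vlen]
    return {"dest_ids": dest_ids, "source_id": source_id, "control": control, "voice": voice}
```

Putting the destination list before the payload lets a receiver decide whether a message is addressed to it before it looks at the control or voice data, which fits the ID check described for the execution control unit.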

[0036] Next, FIG. 4 shows the configuration of an in-house broadcasting system that is one embodiment of the present invention. This in-house broadcasting system can be used in schools, offices, factories, stations, stores, indoor and outdoor stadiums, meeting halls, exhibitions, concerts, and the like.

[0037] The in-house broadcasting system of FIG. 4 comprises one broadcasting controller 70 and a plurality of intelligent speakers IS (71 to 82). The broadcasting controller 70 transmits voice data to each intelligent speaker. An information terminal such as a personal computer can be used as the broadcasting controller 70. The broadcasting controller 70 creates voice data with the structure shown in FIG. 2.

[0038] The voice data 28 of FIG. 2 can be entered as character data from a keyboard attached to the broadcasting controller. Alternatively, a broadcast script printed on paper may be read with a scanner and converted to character data by character recognition, or speech input from a microphone may be converted to character data by speech recognition. The broadcasting controller 70 also creates prosody data for the entered voice data, adds control data 26 as shown in FIG. 2, and designates the destination intelligent speakers, thereby creating the voice data shown in FIG. 2.

[0039] The broadcasting controller 70 and each intelligent speaker IS (71 to 82) may be connected by separate wires, but they may also be connected by a LAN line such as 10BASE-T, for example one per floor or per predetermined installation area of the intelligent speakers. Communication between the broadcasting controller 70 and each intelligent speaker may also be carried out wirelessly using infrared, light, ultrasound, radio waves, or the like.

[0040] In FIG. 4, the ID number of the broadcasting controller 70 is "0-0", and the ID numbers of the intelligent speakers are "IS1-1" through "IS3-4".

[0041] FIG. 5 illustrates an example of the voice data used in this embodiment of the in-house broadcasting system. Here the destination ID count 84 is 3, and it is followed by three destination ID numbers (84, 85, 86). These numbers indicate that the voice data is to be sent to the three intelligent speakers IS1-1, IS2-2, and IS1-3.

【００４２】[0042]

In FIG. 5, the voice data 28 consists of three character data items, "あ" (a), "い" (i), and "う" (u) (92, 94, 96), and their prosody data (93, 95, 97) are all "0". Prosody data of 0 indicates no change from the default prosody data, that is, pronunciation with the standard prosody prepared in advance. The three intelligent speakers IS1-1, IS2-2, and IS1-3 that receive the voice data shown in FIG. 5 therefore output the voice corresponding to the character string "あいう" (a-i-u) with the standard prosody.
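The layout of FIG. 5 can be pictured as a small record type. The following Python sketch is illustrative only: the names `VoiceUnit`, `VoiceData`, and `destination_ids` are assumptions introduced here, not names from the specification, and only the values mirror the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceUnit:
    character: str    # one character data item, e.g. "あ"
    prosody: int = 0  # 0 means "no change from the standard prosody"


@dataclass
class VoiceData:
    destination_ids: List[str]  # the destination ID count is len(destination_ids)
    units: List[VoiceUnit]

    def text(self) -> str:
        """Concatenate the character data into the string to be spoken."""
        return "".join(u.character for u in self.units)

    def uses_standard_prosody(self) -> bool:
        """True when every prosody value is 0, as in the FIG. 5 example."""
        return all(u.prosody == 0 for u in self.units)


# The FIG. 5 example: three destinations, three characters, prosody 0 throughout.
fig5 = VoiceData(
    destination_ids=["IS1-1", "IS2-2", "IS1-3"],
    units=[VoiceUnit("あ"), VoiceUnit("い"), VoiceUnit("う")],
)
```

Deriving the destination ID count from the list length, rather than storing it separately, is a simplification of this sketch; the actual data format carries the count as its own field.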

【００４３】[0043]

Next, FIG. 6 shows a flowchart of the processing performed by the execution control unit 5 of the intelligent speaker, from receiving voice data to outputting voice. It is assumed that the intelligent speaker is preset to one of two voice output modes, "real-time mode" or "punctuation mode". This setting can be made, for example, using the reading mode data 43 included in the control data. "Real-time mode" means a mode in which all the received voice data is output as voice at once. "Punctuation mode" means a mode in which periods (句点) and commas (読点) in the received voice data are detected and voice is output one character string at a time, up to each period or comma, with a pause inserted between strings.

【００４４】[0044]

First, in step 100, voice data is received and stored in the reception buffer of the buffer memory 3. The destination ID number in the received data is checked against the speaker's own configured ID number (step 101). If they do not match, the reception buffer of the buffer memory 3 is cleared (step 102) and processing returns to step 100.

【００４５】[0045]

If they match, the broadcast data is addressed to this speaker, so processing proceeds to step 103. In step 103, it is checked whether the received data contains broadcast time data 31.

【００４６】[0046]

If a broadcast time is specified, processing proceeds to step 112, where the time data is temporarily stored and the broadcast is made at the specified time. If no broadcast time is specified, processing proceeds to step 104, where it is checked whether repetition data 30 is present. If a repetition count is specified, processing proceeds to step 113, where the repetition count data is temporarily stored and the broadcast is made the specified number of times. If no repetition count is specified, processing proceeds to step 105, where it is checked whether the speaker is currently set to punctuation mode or real-time mode.
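The branching of steps 101 to 105 can be summarized as a single decision function. This is a minimal Python sketch under stated assumptions: the function name `next_step`, its parameters, and the mode strings "realtime" and "punctuation" are hypothetical; only the step numbers come from the description of FIG. 6.

```python
from typing import Optional, Sequence


def next_step(own_id: str,
              destination_ids: Sequence[str],
              broadcast_time: Optional[str],
              repeat_count: Optional[int],
              mode: str) -> int:
    """Return the step number that the flow of FIG. 6 branches to."""
    if own_id not in destination_ids:
        return 102  # not addressed to this speaker: clear the reception buffer
    if broadcast_time is not None:
        return 112  # store the time data and broadcast at that time
    if repeat_count is not None:
        return 113  # store the repetition count and broadcast that many times
    if mode == "realtime":
        return 110  # move all voice data to the output buffer
    return 106      # punctuation mode: read one voice data item
```

Note that the broadcast time is tested before the repetition count, so data carrying both fields branches to step 112 in this sketch, matching the order in which the text describes the checks.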

【００４７】[0047]

If real-time mode is set, processing proceeds to step 110, where all the voice data stored in the buffer memory 3 is moved to the output buffer and passed to the voice data synthesizing unit 7. Then, in step 111, the voice data synthesizing unit 7 generates the synthesized voice, which is output from the speaker 9.

【００４８】[0048]

If punctuation mode is found in step 105, processing proceeds to step 106, where a single voice data item is read from the reception buffer of the buffer memory 3. Here, a single item means one voice data unit (one character data item within the voice data 28 of FIG. 2) when voice data in the format shown in FIG. 2 consists of multiple units.

【００４９】[0049]

In step 114, if there is no data to read, processing returns to step 100; if there is voice data, processing proceeds to step 109. In step 109, the voice data just read is appended to the output buffer in the buffer memory 3. If the output buffer already contains voice data, the new data is stored after it.

【００５０】[0050]

Next, step 107 checks whether the voice data just appended is a period, and step 108 checks whether it is a comma. If it is neither, processing returns to step 106 and repeats for the next voice data item.

【００５１】[0051]

If it is a period or a comma, processing proceeds to step 111, where the voice data stored in the output buffer is passed to the voice data synthesizing unit 7 and the synthesized voice is output. During synthesis, punctuation marks are not output as voice; instead, they are used to insert a pause of a predetermined length. Inserting pauses in this way makes the broadcast easier to understand.
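The punctuation-mode loop of steps 106 to 111 can be sketched as follows. This is an illustrative Python sketch, not the implementation of the specification: the function name `punctuation_mode`, the 0.5-second pause, and the representation of voice data as a plain string are all assumptions (the text only says "a predetermined length").

```python
from typing import List, Tuple

KUTEN = "。"   # Japanese period
TOUTEN = "、"  # Japanese comma


def punctuation_mode(received: str,
                     pause_seconds: float = 0.5) -> Tuple[List[Tuple[str, float]], str]:
    """Split received characters into (spoken string, pause) chunks.

    Returns the chunks handed to the synthesizer and whatever remains in the
    output buffer because no period or comma has arrived yet.
    """
    chunks: List[Tuple[str, float]] = []
    output_buffer: List[str] = []
    for unit in received:                  # step 106: read one item
        output_buffer.append(unit)         # step 109: append to the output buffer
        if unit in (KUTEN, TOUTEN):        # steps 107 and 108
            spoken = "".join(output_buffer[:-1])  # the mark itself is not voiced
            chunks.append((spoken, pause_seconds))  # step 111, followed by a pause
            output_buffer.clear()
    return chunks, "".join(output_buffer)
```

For the input "あい、うえ。お", the sketch yields two chunks, "あい" and "うえ", each followed by a pause, and leaves "お" in the output buffer until further data arrives.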

【００５２】[0052]

FIG. 7 shows a flowchart of the processing by which the intelligent speaker of the present invention determines whether voice data can be synthesized. First, step 120 checks whether the control data 26 of the received voice data contains "language designation data" and whether the designated language is among the languages stored in the voice synthesis database 8. If the designated language is in the voice synthesis database 8, processing proceeds to step 121 and the synthesized voice is output. If it is not, processing proceeds to step 122, where no voice is output and the data in the output buffer is instead displayed on the display unit 10 as character data.
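The decision of FIG. 7 amounts to a membership test against the languages held in the voice synthesis database. The Python sketch below is a hedged illustration: the function name `choose_output`, the return strings, and the language codes are hypothetical, and the case where no language designation data is present is deliberately left out because the text does not describe it.

```python
from typing import Collection


def choose_output(designated_language: str,
                  database_languages: Collection[str]) -> str:
    """Decide between voice output (step 121) and text display (step 122)."""
    if designated_language in database_languages:
        return "speech"   # step 121: synthesize and output the voice
    return "display"      # step 122: show the character data on the display unit


# Hypothetical database contents for illustration.
available = {"ja", "en"}
```

With these assumed contents, a designation of "ja" leads to speech, while a designation of "fr" falls back to displaying the text.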

【００５３】[0053]

FIG. 8 shows an embodiment of a private broadcasting system in which the same voice data is output from each intelligent speaker with a predetermined time offset that takes the speakers' installation positions into account. In FIG. 8, with the intelligent speaker IS2-1 (134) at the center, IS1-1 (131), IS2-2 (135), and IS3-1 (137) are installed 60 m away; IS1-2 (132), IS2-3 (136), and IS3-2 (138) are installed 120 m away; and IS1-3 (133) and IS3-3 (139) are installed 180 m away.

【００５４】[0054]

Assuming the speed of sound is 300 m/s, the voice output from IS2-1 reaches the 60 m position after 0.2 seconds, the 120 m position after 0.4 seconds, and the 180 m position after 0.6 seconds. If all the intelligent speakers output the same voice data simultaneously, then at the position of IS2-2, for example, the same voice data output from IS2-1 is heard 0.2 seconds after IS2-2 outputs it. This sounds like an echo and makes the broadcast hard to understand.
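The delays quoted above follow directly from dividing distance by the assumed speed of sound. A short Python sketch of that arithmetic, in which the constant and function names are illustrative and the 300 m/s figure is the value assumed in the text:

```python
SPEED_OF_SOUND_M_PER_S = 300.0  # value assumed in the description


def arrival_delay(distance_m: float) -> float:
    """Seconds for sound to travel distance_m at the assumed speed."""
    return distance_m / SPEED_OF_SOUND_M_PER_S


# Delays for the three installation distances of FIG. 8, rounded to milliseconds.
delays = {d: round(arrival_delay(d), 3) for d in (60, 120, 180)}
```

The resulting delays are 0.2, 0.4, and 0.6 seconds for 60 m, 120 m, and 180 m respectively.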

【００５５】[0055]

To mitigate this echo, the output timing of each intelligent speaker is adjusted in consideration of the arrival delay of the sound and the attenuation of its volume described above.

【００５６】[0056]

FIG. 9 shows an example of four broadcast data items that take the arrival delay of the sound into account. The voice to be broadcast is the same in all four: "あいう" (a-i-u). Broadcast data 1 is given to IS2-1; broadcast data 2 is given to IS1-1, IS2-2, and IS3-1; broadcast data 3 is given to IS1-2, IS2-3, and IS3-2; and broadcast data 4 is given to IS1-3 and IS3-3.

【００５７】[0057]

These four broadcast data items differ in the time and volume specified as the time data 31 and the volume data 32. The time data of broadcast data 1 is 18:15:30 on August 8, 2001, and its volume data is 100. The time data of broadcast data 2 is set 0.2 seconds later, and its volume data is set 24 lower. In other words, it is specified that IS1-1, IS2-2, and IS3-1, which are installed 60 m from IS2-1, output broadcast data 2 ("あいう") at volume 76, 0.2 seconds after IS2-1 outputs broadcast data 1 ("あいう").

【００５８】[0058]

Similarly, broadcast data 3, sent to IS1-2, IS2-3, and IS3-2 installed 120 m from IS2-1, is set with time data 0.4 seconds later than broadcast data 1 and volume data 72. Broadcast data 4, sent to IS1-3 and IS3-3 installed 180 m away, is set with time data 0.6 seconds later than broadcast data 1 and volume data 71.
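The four broadcast data items of FIG. 9 can be tabulated and turned into absolute output times. The Python sketch below is illustrative: the names `FIG9`, `BASE_TIME`, and `build_schedule` and the dictionary layout are assumptions, while the destinations, delays, and volume values are copied from the description rather than derived from an attenuation formula.

```python
from datetime import datetime, timedelta

# Output time of broadcast data 1, as given in the description.
BASE_TIME = datetime(2001, 8, 8, 18, 15, 30)

# (broadcast data number, destinations, delay in seconds, volume data)
FIG9 = [
    (1, ["IS2-1"], 0.0, 100),
    (2, ["IS1-1", "IS2-2", "IS3-1"], 0.2, 76),
    (3, ["IS1-2", "IS2-3", "IS3-2"], 0.4, 72),
    (4, ["IS1-3", "IS3-3"], 0.6, 71),
]


def build_schedule():
    """Attach an absolute output time to each broadcast data item."""
    return [
        {
            "data": number,
            "destinations": destinations,
            "time": BASE_TIME + timedelta(seconds=delay),
            "volume": volume,
        }
        for number, destinations, delay, volume in FIG9
    ]


schedule = build_schedule()
```

Because the controller sends these items ahead of time, each speaker only needs its own clock to start output at the stored time; no further signalling from the controller is required at broadcast time.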

【００５９】[0059]

If these four kinds of broadcast data are transmitted in advance from the broadcasting controller 130 to the intelligent speakers, then at the position of each intelligent speaker the voice output from that speaker and the voice output from IS2-1 are heard almost simultaneously, and the combined volume of the speakers is heard as volume 100. This prevents unnatural echo without sacrificing volume, making the broadcast easier to hear.

【００６０】[0060]

The volume can also be adjusted by giving each intelligent speaker, at the same time, a "volume designation code" carrying predetermined volume data as control data, which makes the broadcast still easier to hear. As an example, FIGS. 12 and 13 show how the volume changes in the direction of sound propagation and in the opposite direction. FIG. 12 shows the change in volume of IS2-1, IS2-2, and IS2-3 in the propagation direction of the broadcast. The phase of each intelligent speaker's sound is adjusted through the broadcast time, and its volume is adjusted, taking sound attenuation into account, so that the volume at each speaker position is always 100. The sounds of the intelligent speakers therefore combine in phase, as shown by the thick line, which helps make the broadcast easier to understand. FIG. 13 shows, for IS2-1, IS2-2, and IS2-3, the change in volume of sound traveling opposite to the propagation direction of the broadcast. In the reverse direction, the sounds of the intelligent speakers are out of phase with one another. For example, for a listener at IS2-1, the sound output from IS2-2 is heard 0.4 seconds later and the sound from IS2-3 is heard 0.8 seconds later. Because the sounds heard at the IS2-1 position are out of phase, and because the volume data sets their volume lower than in an ordinary private broadcast, each speaker's sound stands alone, as shown by the separate thick lines in FIG. 13, so the influence of echo is smaller than in an ordinary private broadcast.
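The 0.4-second and 0.8-second figures for the reverse direction are the sum of each speaker's output delay and the travel time back to IS2-1. A minimal Python sketch of that sum, in which the names are illustrative and the distances, output delays, and 300 m/s speed are the values given in the description:

```python
SPEED_OF_SOUND_M_PER_S = 300.0  # value assumed in the description

# speaker: (distance from IS2-1 in metres, output delay in seconds)
SPEAKERS = {
    "IS2-2": (60.0, 0.2),
    "IS2-3": (120.0, 0.4),
}


def heard_at_origin(distance_m: float, output_delay_s: float) -> float:
    """Delay, relative to IS2-1's own output, at which a listener at IS2-1
    hears a speaker that started late and whose sound must travel back."""
    return output_delay_s + distance_m / SPEED_OF_SOUND_M_PER_S


reverse_delays = {
    name: round(heard_at_origin(dist, delay), 3)
    for name, (dist, delay) in SPEAKERS.items()
}
```

In the forward direction the output delay cancels the travel time, whereas in the reverse direction the two add, which is why the sounds arrive separated rather than superimposed.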

【００６１】[0061]

FIG. 10 is an explanatory diagram of an embodiment in which the intelligent speaker of the present invention is used in a home network system. In FIG. 10, the home network system consists of electric appliances, namely an electric rice cooker 180, an air conditioner 181, a refrigerator 182, and a television 183, together with a home controller 185, all connected by a predetermined data transmission path 193.

【００６２】[0062]

The data transmission path 193 may be wired, such as a LAN or a power line, or each device may be provided with a wireless communication unit for wireless communication. The intelligent speaker is provided inside the home controller 185. Unlike the broadcasting system of FIG. 1, this embodiment has multiple controllers and a single intelligent speaker, and the voice data transmitted from each controller (180 to 183) is output from the one speaker. In addition to the voice data of FIG. 2 described above, the data transmission path 193 may also be used to carry control signals (start, stop, setting changes, and so on) from the home controller 185 to control the electric appliances (180 to 183).

【００６３】[0063]

For example, when voice data is transmitted from the electric rice cooker 180 of FIG. 10 to the home controller 185, voice data containing a predetermined message (such as "The rice will be ready in 10 minutes") is sent with source ID number 0-1 and destination ID number 1-1, and the message is output from the speaker of the home controller 185.

【００６４】[0064]

Other examples include a message from the air conditioner such as "The room temperature is now the same as the outside temperature. We recommend turning off the air conditioner." and a message from the refrigerator such as "The refrigerator door has been left open for 10 minutes."

【００６５】[0065]

In the embodiment of FIG. 10, each message is sent from the electric appliance to the home controller 185 as needed, so the home controller does not have to store all the messages for the connected appliances, which simplifies its configuration. Likewise, when a new electric appliance is connected, its messages do not need to be newly registered in the home controller.

【００６６】[0066]

Unlike FIG. 10, FIG. 11 is a configuration diagram of a home network system in which multiple home controllers (188, 189) and a standalone intelligent speaker 190 are additionally connected. The home controllers may be installed, for example, one in each room.

【００６７】[0067]

In the embodiment of FIG. 11, if the electric appliances (186, 187) send a message to all the home controllers 188 and 189 and to the intelligent speaker 190, the user can hear the message anywhere in the home. Alternatively, if the home controller uses a control signal to set the message destinations in an electric appliance in advance, the appliance can send its messages selectively, only to the configured home controllers.
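The two delivery patterns of FIG. 11, sending to every device or only to destinations configured in advance, can be sketched as follows. Everything in this Python sketch is hypothetical: the class `Appliance`, its methods, the frame layout, and the ID strings are assumptions chosen for illustration, since the description gives no ID numbers for FIG. 11.

```python
from typing import Dict, Iterable, List


class Appliance:
    """An electric appliance that sends voice data messages over the network."""

    def __init__(self, source_id: str, all_destinations: Iterable[str]) -> None:
        self.source_id = source_id
        # By default, address every home controller and intelligent speaker.
        self.destinations: List[str] = list(all_destinations)

    def set_destinations(self, destinations: Iterable[str]) -> None:
        """Models the control signal by which a home controller presets
        where this appliance should send its messages."""
        self.destinations = list(destinations)

    def send(self, message: str) -> List[Dict[str, str]]:
        """Build one frame per destination currently configured."""
        return [
            {"source": self.source_id, "destination": d, "message": message}
            for d in self.destinations
        ]


# Hypothetical IDs for two home controllers and one standalone speaker.
rice_cooker = Appliance("0-1", ["1-1", "1-2", "2-1"])
frames_to_all = rice_cooker.send("The rice will be ready in 10 minutes")
rice_cooker.set_destinations(["1-2"])
frames_selective = rice_cooker.send("The rice will be ready in 10 minutes")
```

Keeping the destination list inside the appliance mirrors the description, in which the home controller configures the appliance once and the appliance then addresses its messages itself.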

【００６８】[0068]

[Effect of the Invention] According to the present invention, synthesized voice is output using the received voice information, so by supplying the desired voice information from an external device, voice with the desired content can be output at the desired timing, speed, and so on. In particular, when control data such as a designated output time is transmitted to the intelligent speaker in advance, the intelligent speaker itself controls voice output automatically. The external device that supplies the control data therefore does not need to monitor and control the intelligent speaker continuously, which reduces the load on the external device.

[Brief description of drawings]

FIG. 1 is a configuration block diagram of an embodiment of the intelligent speaker of the present invention.

FIG. 2 is a configuration diagram of an embodiment of the voice data of the present invention.

FIG. 3 is a configuration block diagram of a conventional private broadcasting system.

FIG. 4 is a configuration block diagram of an embodiment of a private broadcasting system using the intelligent speaker of the present invention.

FIG. 5 is an explanatory diagram of a specific configuration of the voice data used in the private broadcasting system of the present invention.

FIG. 6 is a flowchart of the overall processing of the intelligent speaker of the present invention.

FIG. 7 is a flowchart of the processing of the present invention for determining whether to output voice or display text.

FIG. 8 is an explanatory diagram of another embodiment of the private broadcasting system of the present invention.

FIG. 9 is an explanatory diagram of the broadcast data of another embodiment of the private broadcasting system of the present invention.

FIG. 10 is a configuration block diagram of an embodiment of a home network system using the intelligent speaker of the present invention.

FIG. 11 is a configuration block diagram of an embodiment of a home network system using the intelligent speaker of the present invention.

FIG. 12 is a graph of the change in volume in the direction of sound propagation.

FIG. 13 is a graph of the change in volume of sound traveling in the direction opposite to the direction of sound propagation.

[Explanation of symbols]

1 External input terminal
2 Voice data receiving unit
3 Buffer memory
4 ID number setting unit
5 Execution control unit
6 Clock drive unit
7 Voice data synthesizing unit
8 Voice synthesis database
9 Speaker
10 Display unit
70 Broadcasting controller

Front page continuation (51) Int.Cl.7 Identification code FI Theme code (reference) H04R 3/12 G10L 3/00 Q S

Claims

[Claims]

1. An intelligent speaker comprising: a receiving unit that receives voice information including at least predetermined character data and phoneme data; a voice synthesis database; a voice data synthesizing unit that generates a synthesized voice corresponding to the voice information using the voice synthesis database; and a voice output unit that outputs the generated synthesized voice based on the voice information.

2. The intelligent speaker according to claim 1, further comprising a number setting unit that sets an identification number identifying the speaker itself, wherein the voice information further includes a unique identification number identifying the intelligent speaker that is to output voice, and the voice output unit outputs voice using the received voice information when the identification number set by the number setting unit matches the identification number included in the voice information.

3. The intelligent speaker according to claim 2, wherein the voice information further includes output control data comprising any one of an output repetition count, a designated output time, a volume, and a reading speed, and the voice output unit outputs voice based on the control data.

4. The intelligent speaker according to claim 3, wherein, when the voice information includes a punctuation mark, the voice output unit does not output the punctuation mark as voice and instead inserts a predetermined pause at the output timing of the punctuation mark.

5. The intelligent speaker according to claim 1, further comprising a display unit, wherein, when the received voice information includes a language designation, character, or symbol that cannot be output as voice, the character data included in the voice information is displayed on the display unit.

6. A private broadcasting system in which a plurality of the intelligent speakers according to claim 2 are connected via a communication line to a control terminal that generates the voice information, wherein the control terminal transmits the voice information with the identification number of the intelligent speaker that is to output voice included in it, and each intelligent speaker receives only the voice information that includes its own identification number and outputs the voice.

7. The private broadcasting system according to claim 6, wherein the control terminal sets a designated output time for each intelligent speaker, taking into account the relative distances between the plurality of intelligent speakers, so that no echo occurs, and transmits voice information including the designated output time set for each intelligent speaker, and each intelligent speaker outputs the voice data included in the received voice information at the designated output time included in that voice information.

8. A home network system in which a plurality of home control terminals and one or more of the intelligent speakers according to claim 2 are connected via a communication line, wherein each home control terminal transmits voice information including message data specific to that control terminal and the identification number of the intelligent speaker that is to output the voice, and the intelligent speaker receives only the voice information that includes its own identification number and outputs the message data included in the received voice information as voice.