JP2002108601A

JP2002108601A - Information processing system, device and method

Info

Publication number: JP2002108601A
Application number: JP2000302764A
Authority: JP
Inventors: Hideo Kuboyama; 英生久保山
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2000-10-02
Filing date: 2000-10-02
Publication date: 2002-04-12

Abstract

PROBLEM TO BE SOLVED: To readily acquire information regarding desired news. SOLUTION: A news viewing computer that receives news articles distributed from a news article distribution computer synthesizes speech from the text information included in the received transmission data, outputs the resulting synthetic voice, and displays speaker images (902, 903) that imitate the speakers of the synthetic voice. Furthermore, the text strings (905, 906) spoken by the synthetic voice are displayed in the color associated with each speaker image.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to information processing technology and, more particularly, to technology for processing information about news such as world events.

[0002]

2. Description of the Related Art

Various methods have been proposed for conveying flow information, such as news articles that change from moment to moment, to users. Among them, news programs on television and radio are the oldest and most widespread way of providing information.

[0003] In these programs, newscasters convey information to users by reading news manuscripts aloud. Because the information is conveyed by voice, users can gather it while cleaning or driving, for example, so the program does not necessarily have to monopolize their attention. Television additionally uses video to provide information more effectively.

[0004] Meanwhile, with the development of computers and communication technologies such as the Internet, new ways of providing information have been proposed, such as home pages that post the latest news and services that deliver news by e-mail. These methods have advantages that television and radio lack: they are on-demand, in that information is available whenever it is wanted, and interactive, in that the user does not merely receive information one-sidedly but can specify what is wanted, such as a news genre. Because they can also handle still images and moving images, they can provide information more effectively by appealing to the eye.

PROBLEMS TO BE SOLVED BY THE INVENTION

However, news programs on television and radio have fixed broadcast times, and the order of the news items is determined by the broadcasting station. They therefore lack the on-demand property of providing information whenever it is wanted and the interactivity that would let viewers specify the information they want, for example by news genre.

[0005] On the other hand, news provided through news home pages or e-mail news services presents a high barrier to people who are not comfortable operating personal computers. In addition, because the information is provided only as text, the user must constantly pay attention to the screen and "read" in order to receive it. This lacks the convenience of receiving information while, for example, cleaning or driving.

[0006] Accordingly, an object of the present invention is to provide an information processing system, an information processing apparatus, and methods therefor that make it easy to obtain information about desired news.

[0007]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, an information processing system according to the present invention comprises a transmitting device that transmits transmission data including text information, and a receiving device that is communicably connected to the transmitting device and receives the transmission data. The receiving device comprises: voice output means for performing speech synthesis based on the text information included in the received transmission data and outputting the resulting synthesized voice; first display means for displaying a speaker image imitating the speaker of the synthesized voice; and second display means for displaying the text string to be uttered by the synthesized voice in a text display form associated with each speaker image.

[0008] To achieve the above object, an information processing apparatus according to the present invention processes presentation data including text information and comprises: voice output means for performing speech synthesis based on the text information included in the presentation data and outputting the resulting synthesized voice; first display means for displaying a speaker image imitating the speaker of the synthesized voice; and second display means for displaying the text string to be uttered by the synthesized voice in a text display form associated with each speaker image.

[0009] The present invention also provides an information processing method executed by the above information processing system or apparatus, and a storage medium storing a control program for causing a computer to execute that information processing method.

[0010]

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

An embodiment of the present invention will be described below in detail with reference to the accompanying drawings.

[0011] The embodiment described below uses character animation and speech synthesis so that a virtual caster, modeled on a human newscaster, conveys the content of news articles to the user by voice in the manner of a television program. The corresponding article content can also be displayed as character strings, so the content reaches the user through both voice and text. In this configuration, news articles are received from, for example, a news article provider via a network such as the Internet, organized by genre, and conveyed to the user in a predetermined genre order. Furthermore, according to this embodiment, the user can specify a preferred genre by voice input at any time, which makes on-demand, interactive information provision possible.

[0012] FIG. 1 is a block diagram showing the schematic configuration of an information processing system according to one embodiment of the present invention. In FIG. 1, reference numeral 101 denotes a news article distribution computer, which distributes news articles supplied by a news article provider via a network 103. Reference numeral 102 denotes a news viewing computer, which outputs the news articles distributed via the network 103 using character animation and speech synthesis. That is, the news viewing computer 102 runs a program that conveys the distributed news articles to the user in the style of a television program, using virtual casters and character string display. Reference numeral 103 denotes the Internet, which carries data communication between the news article distribution computer 101 and the news viewing computer 102.

[0013] FIG. 2 is a block diagram showing the functional configuration of the news article distribution computer. The news article distribution computer 101 has a news article holding unit 201 that holds the news information to be provided to users, a news article updating unit 202 that updates the news information held in the news article holding unit 201 to the latest version, and a communication unit 203 that transmits the news information held in the news article holding unit 201 to the news viewing computer 102 via the communication line 103.

[0014] A person who provides news information inputs it into the news article distribution computer 101. The input news information is held in the news article holding unit 201 and distributed to the news viewing computer 102. The news viewing computer 102 can receive this news information at any time by accessing the news article distribution computer 101.

[0015] FIG. 3 is a block diagram of the functions realized by the news viewing computer 102. Reference numeral 301 denotes a news article organizing unit, which organizes the news articles received from the news article distribution computer 101, for example by holding them by genre. Reference numeral 302 denotes a behavioral description language conversion unit, which converts news articles into a behavioral description language. Reference numeral 303 denotes a behavioral description language execution unit, which, in accordance with the behavioral description language created by the conversion unit 302, animates the virtual casters, has them read the news articles aloud by speech synthesis, and displays captions and the like on the screen.

[0016] Reference numeral 304 denotes an information provision process control unit, which manages the entire process of providing information to the user from start to finish. If the user speaks while the behavioral description language is being executed, the information provision process control unit 304 interrupts the behavioral description language execution unit 303 and performs speech recognition on the input. In this way, the information provision process control unit 304 manages which news genre is to be conveyed; for example, when the user specifies a news genre by voice, it switches to that genre. Reference numeral 305 denotes a communication unit, which implements communication between the news article distribution computer 101 and the news article organizing unit 301.
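The genre switching performed by the information provision process control unit 304 can be illustrated with a minimal Python sketch. Everything named here (`PresentationController`, `on_voice_input`, the genre list) is hypothetical and not taken from the patent, and speech recognition is assumed to have already turned the user's utterance into a string.

```python
class PresentationController:
    """Tracks which genre is being presented (illustrative stand-in for unit 304)."""

    def __init__(self, genres):
        self.genres = [g.lower() for g in genres]  # presentation order
        self.index = 0                             # genre currently presented

    def current(self):
        """Return the current genre, or None when every genre has been presented."""
        if self.index < len(self.genres):
            return self.genres[self.index]
        return None

    def advance(self):
        """Move on to the next genre in the predetermined order."""
        self.index += 1

    def on_voice_input(self, recognized):
        """Handle a recognized utterance received during playback.

        Returns True if playback should restart from the requested genre,
        False if the utterance did not name a known genre.
        """
        word = recognized.strip().lower()
        if word in self.genres:
            self.index = self.genres.index(word)
            return True
        return False
```

In this sketch the caller is expected to stop the running script before calling `on_voice_input` and to resume from `current()` afterwards; how execution is actually suspended is not specified in the text above.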

[0017] FIG. 12 is a block diagram showing the configuration of the news viewing computer 102 for realizing the functions shown in FIG. 4. In FIG. 12, reference numeral 1201 denotes a CPU, which performs the various kinds of control of the news viewing computer 102. Reference numeral 1202 denotes a ROM, which stores control programs executed by the CPU 1201 and various data. Reference numeral 1203 denotes a RAM, which functions as the main memory of the CPU 1201, stores the various control programs read from the external storage device 1208, and provides a work area for the CPU 1201. Reference numeral 1204 denotes an input unit, which has a keyboard or a mouse and is used to input various data.

[0018] Reference numeral 1205 denotes a display unit, which performs various kinds of display on a CRT, a liquid crystal display, or the like. Reference numeral 1206 denotes a voice output unit, which in this embodiment is used to output synthesized voice based on text data. Reference numeral 1207 denotes a network interface, which connects the network 103 and the news viewing computer 102. Reference numeral 1208 denotes an external storage device, which comprises, for example, a hard disk drive. The external storage device 1208 holds a virtual caster definition file 601, a genre definition file 701, a character file group 1210, and a control program 1220.

[0019] The virtual caster definition file 601 consists of data defining the correspondence between each virtual caster and its animation data and waveform data for speech synthesis (described later with reference to FIG. 6). The genre definition file 701 consists of data defining the correspondence between genres and virtual casters (described later with reference to FIG. 7). The character file group 1210 contains a plurality of character files 1211. Each character file 1211 contains animation data 1213 for displaying the character's animation and a waveform dictionary 1212 for speech synthesis. The control program 1220 is a group of program codes that causes the CPU 1201 to carry out the control procedure shown in the flowchart of FIG. 4.

[0020] FIG. 4 is a flowchart showing the processing procedure of this embodiment. First, the news article organizing unit 301 of the news viewing computer 102 communicates with the news article distribution computer 101 via the communication unit 305 (network interface 1207) and the network 103, downloads news articles, and organizes them by genre as shown in FIG. 5 (step S401).

[0021] To organize the downloaded news articles into the form shown in FIG. 5, the correspondence between news articles and genres may be specified manually, or the news article data may be analyzed so that the correspondence is established automatically. When the news article organizing unit 301 establishes the correspondence automatically, it may proceed, for example, as follows.

[0022] (1) As shown in FIG. 13, the article data 1301 transmitted from the news article distribution computer 101 to the news viewing computer 102 has a headline 1302, article content 1303, and an attribute 1304. The news viewing computer 102 sorts articles into genres based on the attribute 1304 of each received item of article data 1301 (1310) and classifies the headlines and article content (body text) accordingly, as shown in FIG. 5.

(2) Alternatively, a keyword search is performed on at least one of the headline 1302 and the article content 1303 in the article data 1301 to determine the genre of the article (1311), and the headline and article content (body text) are classified as shown in FIG. 5.

When method (2) is used, the attribute 1304 of the article data 1301 is unnecessary. Methods (1) and (2) may of course be used in combination. In this embodiment, the result of classifying news articles by genre is held as a genre classification table 501 as shown in FIG. 5, but the way of holding the classification result is not limited to this.
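As a rough illustration of combining methods (1) and (2), the following Python sketch uses the attribute when one is supplied and falls back to a keyword search otherwise. The keyword lists, the `Article` type, the `"other"` fallback genre, and the dictionary used in place of the genre classification table 501 are all assumptions made for the example; the patent does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists; the patent does not say which keywords are used.
GENRE_KEYWORDS = {
    "politics": ["prime minister", "tax", "election"],
    "sports": ["baseball", "match", "tournament"],
}


@dataclass
class Article:
    headline: str                    # corresponds to headline 1302
    body: str                        # corresponds to article content 1303
    attribute: Optional[str] = None  # corresponds to attribute 1304, if supplied


def classify(article, default="other"):
    """Return a genre name: method (1) if an attribute exists, else method (2)."""
    if article.attribute:
        return article.attribute
    text = f"{article.headline} {article.body}".lower()
    for genre, keywords in GENRE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return genre
    return default


def build_genre_table(articles):
    """Group (headline, body) pairs by genre, preserving arrival order."""
    table = {}
    for article in articles:
        table.setdefault(classify(article), []).append(
            (article.headline, article.body)
        )
    return table
```

A plain substring match is the simplest possible keyword search; a real system would likely need tokenization suited to the article language.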

[0023] In the subsequent processing, information is presented in the order of the genre numbers shown in FIG. 5. Needless to say, this order may be made configurable by the user.

[0024] At this stage, the information provision process control unit 304 also determines the information provision configuration. The information provision configuration specifies which virtual caster speaks each genre and in what form the character string showing the spoken content is displayed. As information for determining this configuration, the virtual casters, background, article genres, and so on are set as shown in FIGS. 6 and 7.

[0025] FIG. 6 shows an example of the contents of the virtual caster definition file 601. The virtual caster definition file 601 associates the name of each virtual caster with the animation data and the speech synthesis waveform dictionary to be used. A tag <> marks the definition of each virtual caster, and name defines the caster's name. color is the character color used when the caster's spoken content is displayed on the screen; a different color is assigned to each virtual caster. file specifies the character file 1211, which defines the waveform dictionary used to synthesize the caster's voice, the image data for animation, and so on. The details of the waveform dictionary and of the animation data can be realized with existing technology and are not described here.

[0026] FIG. 7 shows an example of the contents of the genre definition file 701, which defines each news genre. The genre definition file registers the correspondence between news genres and virtual casters. A tag <> defines a news genre, and name defines the genre's name. caster specifies the virtual caster that presents the news of that genre.
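The two definition files amount to a two-step lookup: genre to caster names, then caster name to color and character file. The Python sketch below models them as in-memory tables rather than parsing the tagged file format, which is not reproduced here. The field names follow the text (name, color, file, caster); the file names, the `subCaster` color, and the `sports` entry are placeholders invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CasterDef:
    name: str   # caster name
    color: str  # caption color for this caster
    file: str   # character file (animation data and waveform dictionary)


# Placeholder entries; only the caster names and "white" appear in the text.
CASTERS = {
    c.name: c
    for c in [
        CasterDef("mainCaster", "white", "main_caster.chr"),
        CasterDef("subCaster", "yellow", "sub_caster.chr"),
    ]
}

# Genre name -> caster names, in the spirit of the genre definition file.
GENRES = {
    "politics": ["mainCaster", "subCaster"],
    "sports": ["subCaster"],
}


def casters_for(genre):
    """Resolve a genre to the caster definitions that present it."""
    return [CASTERS[name] for name in GENRES[genre]]


def caption_color(caster_name):
    """Look up the caption color assigned to a caster."""
    return CASTERS[caster_name].color


def validate():
    """Check that caster colors are unique and every genre names known casters."""
    colors = [c.color for c in CASTERS.values()]
    if len(set(colors)) != len(colors):
        raise ValueError("each virtual caster needs a distinct caption color")
    for genre, names in GENRES.items():
        for name in names:
            if name not in CASTERS:
                raise ValueError(f"genre {genre!r} refers to unknown caster {name!r}")
```

Calling `validate()` once after loading catches a genre that refers to an undefined caster before any script is generated.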

[0027] The virtual caster definition file 601 and the genre definition file 701 may be created by the news article provider and distributed together with the news articles, or may be held in advance on the news viewing computer 102 according to the user's preferences. In this embodiment, the data shown in FIGS. 6 and 7 is assumed to be held in advance in the external storage device 1208 of the news viewing computer 102. The contents of each definition may of course be made manually editable.

[0028] When the above initial settings are complete, the behavioral description language conversion unit 302 generates, through steps S402 to S408, the behavioral description language for presenting news to the user. Specifically, the behavioral description language conversion unit 302 refers to the genre classification table 501 shown in FIG. 5, the virtual caster definition file 601 shown in FIG. 6, and the genre definition file 701 shown in FIG. 7, and converts the articles into a behavioral description language such as that shown in FIG. 8.

[0029] First, the news genre number J and the article number I are both initialized to 1 (step S402). Next, in step S403, a command for displaying the virtual caster who will read the articles of genre J is written (801 in FIG. 8). In step S404, for the I-th article of genre J, a headline display description, a voice output description, and a description for displaying the character string (caption) showing the voice output content are written, as shown at 802 in FIG. 8. The headline and the voice output content correspond to the headline 1302 and the article content 1303 in the article data 1301 shown in FIG. 13, and can be easily identified in data written in HTML or the like.
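How the headline and body might be picked out of HTML can be sketched with Python's standard `html.parser` module. The markup convention assumed here (headline in `<h1>`, body text in `<p>` elements) is purely an assumption for illustration; the patent does not describe the article markup.

```python
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects the <h1> text as the headline and <p> text as the body."""

    def __init__(self):
        super().__init__()
        self.headline = ""
        self.paragraphs = []
        self._current = None  # "h1", "p", or None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        # Inline tags such as <b> do not change _current, so their text is kept.
        if self._current == "h1":
            self.headline += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def extract(html):
    """Return (headline, body) extracted from an HTML article."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    body = " ".join(p.strip() for p in parser.paragraphs if p.strip())
    return parser.headline.strip(), body
```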

[0030] For example, J = 1 is the "politics" genre. According to the genre definition file 701, the virtual casters who present the news in scenes of this genre are "mainCaster, subCaster", so commands are written that make these two virtual casters appear at the specified positions (position1, position2): "Caster->Show(mainCaster,position1)" and "Caster->Show(subCaster,position2)". Next, a command is written that displays the headline string of the first news article (I = 1) in the foreground: "FrontText->Display(Opposition parties object to the Prime Minister's tax cut policy statement)". A predetermined color is assigned to headline text; in this example the headline is displayed in red. It is preferable to assign the headline a color that is not assigned to any virtual caster, because this makes it easy to distinguish the headline from the text being read aloud.

[0031] Next, a command that makes a virtual caster read the article content aloud is written ("Caster->Speak(Prime Minister XXX's "real tax cut" ...,mainCaster)"), followed by a command that displays the caption on the screen in the color specified for that virtual caster ("SpokenText->Display(Prime Minister XXX's "real tax cut" ...,white)"). Here the behavioral description language conversion unit 302 uses "mainCaster" to look up the display color indicated by that caster's "color" entry in the virtual caster definition file 601 of FIG. 6 and writes it into the command. When several virtual casters are defined for a genre, as in the politics genre, the caster who reads may be changed in turn, sentence by sentence.
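The per-article commands, including the option of alternating casters sentence by sentence, might be generated along the following lines. This is a sketch only: the command strings mirror the examples quoted in the text, while the function names, the naive sentence splitter, and the `(name, color)` pair representation are assumptions.

```python
import re
from itertools import cycle


def split_sentences(body):
    """Naive sentence splitter for illustration (splits after . ! ? or 。)."""
    parts = re.split(r"(?<=[.!?。])\s*", body)
    return [p.strip() for p in parts if p.strip()]


def article_commands(headline, body, casters):
    """Build the command lines for one article.

    casters is a list of (name, color) pairs in speaking order; with more
    than one caster, the speaker changes after every sentence.
    Note: text containing a comma followed by a caster name would be
    ambiguous in this simple textual format.
    """
    commands = [f"FrontText->Display({headline})"]
    for sentence, (name, color) in zip(split_sentences(body), cycle(casters)):
        commands.append(f"Caster->Speak({sentence},{name})")
        commands.append(f"SpokenText->Display({sentence},{color})")
    return commands
```

Passing a single-element `casters` list yields the basic case in which one caster reads the whole article.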

[0032] Once the behavioral description for one article has been completed, it is checked whether that article is the last article of genre J (step S405). If it is not, I is incremented while J is left unchanged (step S407), and processing returns to step S404 to convert the next news article of the genre into the behavioral description language. If step S405 determines that the article is the last article of genre J, it is checked whether genre J is the last genre to be read (step S406). If it is not, J is incremented by one and I is reset to 1 (step S408) in order to process the next genre, and processing returns to step S403.
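The genre and article counters of steps S402 to S408 correspond to two nested loops. The Python sketch below expresses them that way, with comments mapping back to the step numbers. The input shapes (an ordered list of genres with their articles, and a genre-to-casters mapping of `(name, color)` pairs) are assumptions, and for brevity the first caster of each genre reads the whole article.

```python
def convert(genre_table, genre_casters):
    """Generate the full script for all genres and articles.

    genre_table:   list of (genre, [(headline, body), ...]) in presentation order
    genre_casters: dict mapping genre -> list of (caster_name, color)
    """
    script = []
    for genre, articles in genre_table:              # J loop (S402, S406, S408)
        casters = genre_casters[genre]
        for pos, (name, _color) in enumerate(casters, start=1):
            # S403: make each caster of this genre appear at its position
            script.append(f"Caster->Show({name},position{pos})")
        for headline, body in articles:              # I loop (S405, S407)
            name, color = casters[0]
            # S404: headline, speech, and caption descriptions for one article
            script.append(f"FrontText->Display({headline})")
            script.append(f"Caster->Speak({body},{name})")
            script.append(f"SpokenText->Display({body},{color})")
    return script
```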

[0033] If step S406 determines that the genre is the last one, processing proceeds to step S411 and onward, where the behavioral description language execution unit 303 performs character animation display, text display, and synthesized voice output in accordance with the behavioral description language generated above.

[0034] In step S411, the caster name specified in the behavioral description language is used to look up the virtual caster definition file 601 and obtain the corresponding character file 1211. In step S412, the animated character is displayed based on the animation data 1213 contained in the obtained character file 1211. Next, in step S413, the text string written with the SpokenText-> command described above is displayed in the specified color. Then, in step S414, the text string written with the Caster->Speak command is synthesized into speech using the waveform dictionary 1212 contained in the character file 1211 obtained in step S411, and the speech is output.

[0035] In step S415, it is determined whether all the data converted into the behavioral description language has been processed. If data remains, processing returns to step S411; if not, the processing ends. In the procedure above, execution of the behavioral description language starts only after all the data organized as in FIG. 5 has been converted, but execution may instead start without waiting for the conversion to finish.
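The execution phase (steps S411 to S415) can be pictured as a loop that parses each command line and dispatches it to a handler. In the Python sketch below the handlers merely emit a log line in place of real animation, caption rendering, and speech synthesis; the regular expression, the function names, and the log format are assumptions, and only the command spellings come from the text.

```python
import re

# Matches lines such as "Caster->Speak(some text,mainCaster)".
COMMAND = re.compile(r"^(\w+)->(\w+)\s*\((.*)\)$")


def parse(line):
    """Split a command line into (target, action, raw argument string)."""
    match = COMMAND.match(line.strip())
    if match is None:
        raise ValueError(f"not a valid command: {line!r}")
    return match.groups()


def run(script, out=print):
    """Execute every command in order; the loop ends when none remain (S415)."""
    for line in script:
        target, action, args = parse(line)
        if (target, action) == ("Caster", "Show"):          # S411, S412
            name, position = [a.strip() for a in args.split(",")]
            out(f"[animate] {name} at {position}")
        elif (target, action) == ("FrontText", "Display"):
            out(f"[headline] {args}")
        elif (target, action) == ("SpokenText", "Display"):  # S413
            # The last comma separates the text from the color.
            text, color = [a.strip() for a in args.rsplit(",", 1)]
            out(f"[caption:{color}] {text}")
        elif (target, action) == ("Caster", "Speak"):        # S414
            # The last comma separates the text from the caster name.
            text, name = [a.strip() for a in args.rsplit(",", 1)]
            out(f"[speak:{name}] {text}")
        else:
            raise ValueError(f"unknown command: {target}->{action}")
```

Because `run` accepts any iterable, passing it a generator that yields commands as they are converted would give the variant in which execution starts before conversion has finished.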

[0036] FIG. 9 shows an example of the screen presented to the user when information is provided. Reference numeral 901 denotes the screen on which the virtual casters act and the captions of the news article are presented to the user. Reference numerals 902 and 903 denote virtual casters reading the news article aloud. Reference numeral 904 denotes the headline of the news article. Reference numeral 905 denotes the caption of the content spoken by caster 902, and reference numeral 906 denotes the caption of the content spoken by caster 903.

[0037] In FIG. 9, a different caption color is defined for each speaking virtual caster by "color" in FIG. 6. All content spoken by virtual caster 902 is displayed in the same color as 905, and all content spoken by virtual caster 903 is displayed in the same color as 906, so each virtual caster's text appears in a different color. A display color is also specified in advance for the article headline 904, which is displayed in a character color different from that of any caster's captions.

[0038] As described above, according to this embodiment, the distributed news articles are read aloud by speech synthesis, so the user no longer needs to concentrate on the screen at all times to read displayed text and can gather information easily.

[0039] In addition to synthesized voice output, the article headline and the content being read are displayed as captions, so the content can be conveyed correctly even when the system is used by a hearing-impaired person or when the surroundings are noisy and the voice is hard to hear. Furthermore, according to this embodiment, the headline and each caster's captions are displayed in different character colors, so among the various article text displayed on the screen the user can easily tell which is the headline, which is content being read by a virtual caster, and which on-screen virtual caster is reading it.

[0040] In the above embodiment, a character color is defined for the headline and for each virtual caster, and the headline and the utterance content are displayed in those colors; however, the invention is not limited to this. The point is that the user should be able to tell whether displayed text is a headline or the utterance content of a particular virtual caster, and to that end a different display form may be used for the headline and for each virtual caster.

[0041] For example, FIG. 10 shows an example in which the utterance content of each virtual caster is displayed near that virtual caster, making clear which caster is speaking. To realize such a display, the behavioral description language conversion unit 302 generates a behavioral description language as shown in FIG. 11, and the behavioral description language execution unit 303 executes it.

[0042] As shown in FIG. 11, the description that displays the utterance content additionally specifies the display position of the virtual caster making the utterance. For example, a description such as 'SpokenText->Display(Prime Minister XXX's "effective tax cut" ..., white, position1)' expresses that the subtitle is displayed at a predetermined position relative to the display position "position1" of "mainCaster" (1101 in FIG. 11). Similarly, with the description 1102 in FIG. 11, the subtitle is displayed at a predetermined position relative to the display position "position2" of "subCaster". The relative position of the subtitle may be a predetermined value or may be specified in the behavioral description. In this case, the character color of the utterance content need not differ from one virtual caster to another. With the above descriptions (1101, 1102), as shown in FIG. 10, the subtitle 1002 of the utterance content is displayed near the animation 1001 of mainCaster, and the subtitle 1004 of the utterance content is displayed near the animation 1003 of subCaster.
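
The relative placement described above can be sketched as adding an offset to the caster's display position. This Python sketch is a minimal illustration under stated assumptions: the position names `position1` and `position2` come from the description, while the pixel coordinates, the default offset, and the function name `subtitle_position` are hypothetical.

```python
from typing import Dict, Tuple

Point = Tuple[int, int]

# Hypothetical screen coordinates for the named caster display positions.
CASTER_POSITIONS: Dict[str, Point] = {
    "position1": (100, 300),  # assumed position of mainCaster
    "position2": (500, 300),  # assumed position of subCaster
}

# Assumed default offset of a subtitle from its caster (pixels, y grows downward).
DEFAULT_OFFSET: Point = (0, 120)


def subtitle_position(position_name: str, offset: Point = DEFAULT_OFFSET) -> Point:
    """Place a subtitle at a fixed offset from the caster's display position."""
    x, y = CASTER_POSITIONS[position_name]
    dx, dy = offset
    return (x + dx, y + dy)
```

Passing `offset` explicitly corresponds to specifying the relative position in the description itself, and omitting it corresponds to using a predetermined value.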

[0043] Examples using character color and display position as display forms that differ for the headline and for each virtual caster have been described above, but various other modifications are conceivable. For example, the character size or typeface may be changed for each virtual caster or headline, or the background or the ruled lines of the subtitle portion may be changed.
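
The variations listed above can be pictured as fields of a single display-form record, with the only requirement being that no two sources share the same form. The Python sketch below is hypothetical throughout: the class `DisplayForm`, its field names and defaults, and the sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DisplayForm:
    """Assumed set of attributes that can distinguish one subtitle source."""
    color: str = "white"
    size: int = 16
    font: str = "sans-serif"
    background: str = "none"
    border: str = "none"


def all_distinguishable(forms: Dict[str, DisplayForm]) -> bool:
    """True when no two sources (headline or casters) share an identical form."""
    return len(set(forms.values())) == len(forms)


# Sample assignment: every source keeps the same color but differs elsewhere.
forms = {
    "headline": DisplayForm(size=24, border="double"),
    "mainCaster": DisplayForm(font="serif"),
    "subCaster": DisplayForm(background="gray"),
}
```

In this sample all three sources use the same character color yet remain distinguishable, which mirrors the point that color is only one of several usable display forms.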

[0044] In the above embodiment, the virtual caster definition is written as in FIG. 6, the news genre definition as in FIG. 7, and the behavioral description language as in FIG. 8; however, the invention is not limited to these, and any description format that serves the purpose of the above embodiment may be used.

[0045] In the above embodiment, news articles were used as an example of the distributed data, but the information presentation method of the present embodiment can also be applied to other data such as various advertisements.

[0046] In the above embodiment, the data communication is performed over the Internet, but the invention is not limited to this; any communication means may be used, for example a dedicated line.

[0047] In the above embodiment, the program is held in the external storage device 1208 and loaded into the RAM 1203 for use, but the invention is not limited to this; it may be realized using any storage medium such as a ROM, or by a circuit that performs the same operation.

[0048] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. It goes without saying that the invention is also achieved by supplying a recording medium that stores software program code realizing the functions of the above embodiment to a system or apparatus, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the recording medium. In this case, the program code read from the recording medium itself realizes the functions of the above embodiment, and the recording medium storing the program code constitutes the present invention.

[0049] As the recording medium for supplying the program code, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, or ROM can be used.

[0050] It also goes without saying that the invention covers not only the case where the functions of the above embodiment are realized by the computer executing the program code it has read, but also the case where an OS or the like running on the computer performs part or all of the actual processing based on the instructions of the program code and the functions of the above embodiment are realized by that processing.

[0051] Furthermore, it goes without saying that the invention also covers the case where the program code read from the recording medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above embodiment are realized by that processing.

[0052]

[Effects of the Invention] As described above, according to the present invention, information on desired news can be obtained easily.

[Brief description of the drawings]

FIG. 1 is a configuration diagram illustrating the configuration of a system according to the embodiment.

FIG. 2 is a block diagram illustrating functions realized by the news article distribution computer 101 in the embodiment.

FIG. 3 is a block diagram illustrating functions realized by the news viewing computer 102 in the embodiment.

FIG. 4 is a flowchart illustrating the procedure of the information presentation process according to the embodiment.

FIG. 5 is a diagram showing how news articles are organized by genre in the embodiment.

FIG. 6 is a diagram illustrating an example of a file defining virtual casters in the embodiment.

FIG. 7 is a diagram illustrating an example of a file defining each news genre in the embodiment.

FIG. 8 is a diagram illustrating an example of generated behavioral description language in the embodiment.

FIG. 9 is a diagram illustrating an example of a screen display produced by the information providing process in the embodiment.

FIG. 10 is a diagram illustrating an example of a screen display produced by the information providing process in another embodiment.

FIG. 11 is a diagram illustrating an example of generated behavioral description language in another embodiment.

FIG. 12 is a block diagram illustrating the configuration of the news viewing computer according to the embodiment.

FIG. 13 is a diagram illustrating the genre classification of news articles according to the embodiment.

Continued from the front page  (51) Int. Cl.⁷: G10L 21/06  FI: G10L 3/00 S

Claims

1. An information processing system comprising a transmitting device that transmits transmission data including text information and a receiving device that is communicably connected to the transmitting device and receives the transmission data, the system comprising: speech synthesis means for performing speech synthesis based on the text information included in the received transmission data and outputting the obtained synthesized voice; first display means for displaying a speaker image imitating a speaker of the synthesized voice; and second display means for displaying a text string uttered by the synthesized voice in a text display form associated with each of the speaker images.

2. The information processing system according to claim 1, further comprising first holding means for holding display correspondence information indicating a correspondence between each of a plurality of speaker images and a display form of a text string, wherein the first display means displays a speaker image selected from the plurality of speaker images, and the second display means obtains the display form corresponding to the selected speaker image from the display correspondence information and displays the text string in that display form.

3. The information processing system according to claim 2, wherein the display correspondence information is transmitted from the transmitting device to the receiving device.

4. The information processing system according to claim 1, wherein the receiving device further comprises: second holding means for holding genre correspondence information indicating a correspondence between the plurality of speaker images and genres; and selecting means for identifying a genre of the text information included in the transmission data and selecting a speaker image corresponding to the identified genre based on the genre correspondence information, and wherein the first display means displays the speaker image selected by the selecting means.

5. The information processing system according to claim 1, wherein the genre correspondence information is transmitted from the transmitting device to the receiving device.

6. The information processing system, wherein the text display form associated with each speaker image in the second display means is at least one of a character color, a size, and a font.

7. The information processing system according to claim 1, wherein the second display means displays the text string in a predetermined positional relationship with the display position of the speaker image.

8. The information processing system according to claim 1, wherein the text information included in the transmission data includes a heading text and an utterance content text, the system further comprising third display means for displaying the heading text so as to be distinguishable from the text displayed by the second display means.

9. The information processing system according to claim 1, wherein the speaker image is an animation imitating a speaker.

10. An information processing apparatus for processing presentation data including text information, comprising: speech synthesis means for performing speech synthesis based on the text information included in the presentation data and outputting the obtained synthesized voice; first display means for displaying a speaker image imitating a speaker of the synthesized voice; and second display means for displaying a text string uttered by the synthesized voice in a text display form associated with each of the speaker images.

11. The information processing apparatus according to claim 10, further comprising first holding means for holding display correspondence information indicating a correspondence between each of a plurality of speaker images and a display form of a text string, wherein the first display means displays a speaker image selected from the plurality of speaker images, and the second display means obtains the display form corresponding to the selected speaker image from the display correspondence information and displays the text string in that display form.

12. The information processing apparatus according to claim 10, further comprising: second holding means for holding genre correspondence information indicating a correspondence between the plurality of speaker images and genres; and selection means for identifying a genre of the text information included in the presentation data and selecting a speaker image corresponding to the identified genre based on the genre correspondence information, wherein the first display means displays the speaker image selected by the selection means.

13. The information processing apparatus according to claim 10, wherein the text display form associated with each speaker image in the second display means is at least one of a character color, a size, and a font.

14. The information processing apparatus according to claim 10, wherein the second display means displays the text string in a predetermined positional relationship with the display position of the speaker image.

15. The information processing apparatus according to claim 10, wherein the text information included in the transmission data includes a heading text and an utterance content text, the apparatus further comprising third display means for displaying the heading text so as to be distinguishable from the text displayed by the second display means.

16. The information processing apparatus according to claim 10, wherein the speaker image is an animation imitating a speaker.

17. An information processing method for controlling an information processing system comprising a transmitting device that transmits transmission data including text information and a receiving device that is communicably connected to the transmitting device and receives the transmission data, the method comprising: a voice output step of performing speech synthesis based on the text information included in the received transmission data and outputting the obtained synthesized voice; a first display step of displaying a speaker image imitating a speaker of the synthesized voice; and a second display step of displaying a text string uttered by the synthesized voice in a text display form associated with each of the speaker images.

18. An information processing method for processing presentation data including text information, comprising: a voice output step of performing speech synthesis based on text information included in received transmission data and outputting the obtained synthesized voice; a first display step of displaying a speaker image imitating a speaker of the synthesized voice; and a second display step of displaying a text string uttered by the synthesized voice in a text display form associated with each of the speaker images.

19. A storage medium for storing a control program for causing a computer to implement the information processing method according to claim 17.