JP2012098753A

JP2012098753A - Audio display output control device, image display control device, audio display output control process program and image display control process program

Info

Publication number: JP2012098753A
Application number: JP2012014690A
Authority: JP
Inventors: Yoshiyuki Murata; 嘉行村田; Takashi Koshiro; 孝湖城
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2012-01-27
Filing date: 2012-01-27
Publication date: 2012-05-24

Abstract

PROBLEM TO BE SOLVED: To correctly express the timing of an accent in an image display synchronized with an audio output, in an audio display output control device that outputs data such as sound, text, and images in synchronization.

SOLUTION: In synchronization with the pronunciation audio output of a searched headword "low", the headword "low" and an identification display HL of its pronunciation symbols are displayed successively in a window W1, while in a window W2, based on a set character image 12d (No3), the pronouncing mouth shape images 12e (No36→No9→No8) corresponding to the respective pronunciation symbols are sequentially switched and combined into the mouth image area. When the identification display HL for the accented character "o" and the mouth shape image 12e (No9) are switched in synchronization with the pronunciation audio output, the image 12d (No3) to be combined is changed to an accent-corresponding face image 12d (No3') that expresses a stronger pronunciation through, for example, sweat on the head or a shaking mouth.

Description

The present invention relates to an audio display output control device, an image display control device, an audio display output control processing program, and an image display control processing program for outputting data such as audio, text, and images in synchronization.

Conventionally, some language learning devices, for example, output the speech of a language and display the corresponding mouth shape.

In such a language learning device, the voice information and mouth shape image data of a native speaker are recorded in advance in a sample data memory using a microphone and a camera. The learner's voice information and mouth shape image data are then recorded with the same microphone and camera, and the waveforms of the voice information of the learner and of the native speaker recorded in advance in the sample data memory are compared, together with the corresponding mouth shape image data, and displayed in chart form.

The aim is to clearly analyze and display the differences in pronunciation between the native speaker and the learner (see, for example, Patent Document 1).

Patent Document 1: JP 2001-318592 A

With such a conventional language learning device, the learner can hear a native speaker's model pronunciation and see the corresponding mouth shape image. The accent of each language, however, is conveyed mainly by the emphasized pronunciation of the accented part. No clear difference appears in the mouth shape image itself, so the timing of the accent in the language being learned is hard to grasp.

The present invention has been made in view of the above problem, and its object is to provide an audio display output control device, an image display control device, an audio display output control processing program, and an image display control processing program that can clearly show the timing of an accent in an image display synchronized with an audio output.

In the audio display output control device according to claim 1 of the present invention, audio data output means outputs audio data, text synchronous display control means displays text in synchronization with the output of the audio data, image display control means displays an image including at least a mouth portion, and mouth image display control means displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the audio data in synchronization with the audio data being output. Accent detection means then detects whether the audio data or the text has an accent, and image change display control means changes the mouth shape image displayed by the image display control means when an accent is detected.

Accordingly, not only can text and an image be displayed in synchronization with the output of audio data, with a mouth shape image corresponding to the audio data shown in the mouth portion of the image, but the displayed mouth shape image can also be changed when an accent is detected in the audio data or the text, so that the timing of the accent is clearly expressed.
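The behaviour of claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the image identifiers (No3, No3', No36, No9, No8) are taken from the abstract's "low" example, while the `Frame` type, the `build_frames` function, the symbol spellings "l", "o", "u", and the use of an explicit `accent_index` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    symbol: str       # pronunciation symbol highlighted at this step
    mouth_image: str  # mouth shape image combined into the mouth area
    face_image: str   # base face image the mouth image is combined with


def build_frames(symbols, mouth_images, accent_index,
                 face="No3", accent_face="No3'"):
    """Return one display frame per pronunciation symbol.

    The face image is swapped for the accent variant only on the
    frame whose index equals accent_index (an assumed input here;
    the device would obtain it from its accent detection means).
    """
    frames = []
    for i, (sym, mouth) in enumerate(zip(symbols, mouth_images)):
        base = accent_face if i == accent_index else face
        frames.append(Frame(sym, mouth, base))
    return frames


# Hypothetical symbol spellings for the headword "low".
frames = build_frames(["l", "o", "u"], ["No36", "No9", "No8"], accent_index=1)
```

In a real device each frame would be shown at the moment the matching portion of the audio is played; the sketch only models which images are paired at each step.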

The audio display output control device according to claim 2 of the present invention is the device according to claim 1 in which, further, dictionary search means searches for dictionary data corresponding to an input headword, and dictionary data display control means displays the dictionary data corresponding to the headword found by the search. The audio data is the pronunciation audio data of the headword found by the dictionary search means, and the text is the text of that headword. The output of the headword pronunciation audio data by the audio data output means, the display of the headword text synchronized with that audio data by the text synchronous display control means, and the display of the image by the image display control means are all performed while the dictionary data display control means is displaying the dictionary data corresponding to the searched headword.

Accordingly, when the dictionary data corresponding to an input headword is searched for and displayed, the pronunciation audio data of the headword can be output, the headword text and the image can be displayed in synchronization with it, and the mouth shape image can be displayed synchronously. Moreover, the change in the displayed image upon accent detection clearly expresses the timing of the headword's accent.

In the audio display output control device according to claim 3 of the present invention, word storage means stores a plurality of words, each in association with its correctly accented pronunciation symbols and its incorrectly accented pronunciation symbols. Audio data output means outputs the pronunciation audio data of a stored word with either the correct accent or the incorrect accent, and text synchronous display control means displays the text of the word in synchronization with the pronunciation audio data being output. Image display control means displays an image including at least a mouth portion, in a display form that differs depending on whether the audio data output means outputs the correctly accented or the incorrectly accented pronunciation audio data. Mouth image display control means further displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the pronunciation audio data in synchronization with that data as it is output by the audio data output means. As the text synchronous display control means displays the word text synchronously, accent detection means detects the accent of the word from the accented pronunciation symbols stored for that word by the word storage means, and image change display control means changes the image displayed by the image display control means in response to the accent detection.

Accordingly, not only can both correctly accented and incorrectly accented pronunciation audio data be output for the words stored by the word storage means, but the word text can be displayed in synchronization with that audio data, and a mouth shape image corresponding to the audio data can be displayed in the mouth portion of the displayed image. Because the displayed image also changes when the word accent is detected, the correct and incorrect accents of a word can be learned easily and with clear timing.

The audio display output control device according to claim 4 of the present invention is the device according to claim 3 in which, further, correct/incorrect accent display control means displays a stored word side by side with the correctly accented pronunciation symbols and the incorrectly accented pronunciation symbols associated with it, and correct/incorrect accent selection means selects either the correctly accented or the incorrectly accented pronunciation symbols of the displayed word. The audio data output means then outputs the correctly accented or the incorrectly accented pronunciation audio data of the word according to the selection made by the correct/incorrect accent selection means.

Accordingly, either the correctly accented or the incorrectly accented pronunciation symbols can be selected for a word stored by the word storage means and the corresponding pronunciation audio data output. The word text can be displayed in synchronization with that audio data, a mouth shape image corresponding to the audio data can be displayed in the mouth portion of the displayed image, and the displayed image can be changed when the word accent is detected, so the correct and incorrect accents of a word can be learned even more easily and with clear timing.
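The correct/incorrect accent selection of claims 3 and 4 can be illustrated with a small lookup. The sketch below is hypothetical throughout: the entry "hotel", the `'` marker placed just before the accented symbol, the function names, and the style labels "normal" and "tinted" are assumptions made for illustration, since the text here does not specify how accented pronunciation symbols are encoded or how the display forms differ.

```python
ACCENT_MARK = "'"  # hypothetical marker placed just before the accented symbol

# Hypothetical word storage: each word maps to correctly and
# incorrectly accented pronunciation symbols.
WORDS = {
    "hotel": {"correct": "hot'el", "wrong": "h'otel"},
}


def detect_accent(phonetic):
    """Return (symbols without the marker, index of the accented symbol)."""
    idx = phonetic.index(ACCENT_MARK)  # raises ValueError if no accent is marked
    return phonetic.replace(ACCENT_MARK, ""), idx


def prepare_playback(word, choice):
    """Build playback settings for the selected accent ("correct" or "wrong")."""
    symbols, accent_idx = detect_accent(WORDS[word][choice])
    # The face image uses a different display form for the incorrect accent.
    style = "normal" if choice == "correct" else "tinted"
    return {"symbols": symbols, "accent_index": accent_idx, "face_style": style}
```

Selecting "wrong" yields the same symbols with a different accent position and a different face display form, which is the contrast the learner is meant to notice.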

In the audio display output control device according to claim 5 of the present invention, storage means stores a plurality of headwords, each in association with pronunciation audio data for at least two regions, and region designation means designates one of the two or more regions for which pronunciation audio data of a stored headword is held. Audio data output means then outputs the pronunciation audio data of the headword for the designated region, text synchronous display control means displays the text of the headword in synchronization with the pronunciation audio data being output for the designated region, image display control means displays an image including at least a mouth portion in a display form that differs according to the designated region, and mouth image display control means displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the pronunciation audio data in synchronization with that data as it is output. As the headword text is displayed synchronously, accent detection means detects the accent of the headword, and image change display control means changes the image displayed by the image display control means in response to the accent detection.

Accordingly, for a single headword that has different regional pronunciations, the pronunciation audio data of a designated region can be output, and the headword text and the mouth shape image in the mouth portion of the displayed image can be displayed in synchronization with that output. An image in a display form that differs according to the designated region can also be displayed, and that image can be changed upon accent detection, so the pronunciation and accent timing of the designated region can be learned easily and clearly.
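A minimal sketch of the region designation in claim 5 follows. The region codes "US" and "UK", the headword "laugh", the file names, and the face form labels are all hypothetical placeholders; the text above states only that at least two regions are stored per headword and that the face image differs by region.

```python
# Hypothetical storage: headword -> region -> pronunciation audio data.
PRONUNCIATIONS = {
    "laugh": {"US": "laugh_us.wav", "UK": "laugh_uk.wav"},
}

# Hypothetical mapping from region to the display form of the face image.
FACE_FORM = {"US": "face_us", "UK": "face_uk"}


def select_region(headword, region):
    """Return (audio data, face display form) for the designated region."""
    regions = PRONUNCIATIONS[headword]
    if region not in regions:
        raise KeyError(f"no pronunciation of {headword!r} for region {region!r}")
    return regions[region], FACE_FORM[region]
```

Keeping the face form in a separate table reflects the claim's wording that the display form depends on the designated region rather than on the individual headword.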

The image display control device according to claim 6 of the present invention changes a face image having a mouth or facial expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including word headwords. First storage means stores a plurality of pairs associating pronunciation target data with pronunciation symbols, including pronunciation symbols bearing accent marks, and second storage means stores a plurality of sets associating pronunciation symbols, including those bearing accent marks, with their audio and face images. As the series of pronunciation target data is displayed in pronunciation order, first control means reads the pronunciation symbols corresponding to that data from the first storage means, reads the audio and face images corresponding to the read pronunciation symbols from the second storage means, outputs the read audio externally, and displays the read face images. When audio is output externally under this first control, second control means determines whether the read pronunciation symbols include a pronunciation symbol bearing an accent mark. If so, it reads the audio and face image corresponding to that accented pronunciation symbol from the second storage means, outputs the read audio externally, and displays the read face image.

Accordingly, as pronunciation target data such as a word headword is displayed in pronunciation order, the audio corresponding to its pronunciation symbols can be output and the corresponding face images displayed, and in the accented part the audio and face image corresponding to the accented pronunciation symbol can be output and displayed. The learner can thus easily and clearly learn the pronunciation of a word and the facial expression that accompanies it, as well as the pronunciation of the accented part and the facial expression that accompanies that part.

The image display control device according to claim 7 of the present invention is the device according to claim 6 in which the pronunciation symbols stored in the second storage means consist of pronunciation symbols bearing an accent mark and pronunciation symbols without an accent mark, and the audio and face image stored in association with a pronunciation symbol bearing an accent mark differ from the audio and face image stored in association with the pronunciation symbol without an accent mark.

Accordingly, the learner can more clearly learn the difference between the pronunciation and accompanying facial expression in the unaccented parts of pronunciation target data such as a word headword, and the pronunciation and accompanying facial expression in the accented part.
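The two storage means of claims 6 and 7 amount to a two-step lookup, which the following Python sketch illustrates. Everything concrete in it is an assumption: the `*` suffix used to mark an accented pronunciation symbol, the symbol spellings for "low", and the audio and face image names are hypothetical and chosen only to show that accented and unaccented symbols map to different audio and face images.

```python
# First storage (hypothetical contents): pronunciation target data ->
# pronunciation symbols; a trailing "*" marks a symbol bearing an accent mark.
FIRST_STORE = {"low": ["l", "o*", "u"]}

# Second storage (hypothetical contents): pronunciation symbol ->
# (audio, face image). The accented "o*" has its own audio and face image.
SECOND_STORE = {
    "l": ("l.wav", "face_l"),
    "o": ("o.wav", "face_o"),
    "o*": ("o_accent.wav", "face_o_accent"),
    "u": ("u.wav", "face_u"),
}


def playback_plan(target):
    """Return (symbol, audio, face image, is_accented) in pronunciation order."""
    plan = []
    for sym in FIRST_STORE[target]:
        audio, face = SECOND_STORE[sym]   # lookup keyed by the symbol itself
        plan.append((sym, audio, face, sym.endswith("*")))
    return plan
```

Because the accented symbol is a distinct key, the accent check and the choice of the accent-specific audio and face image fall out of the same lookup.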

Further, the image display control device according to claim 8 of the present invention changes a face image having a mouth or facial expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including word headwords. Storage means stores a plurality of sets associating pronunciation target data with its audio and face images. Detection means detects, within the signal waveform of the stored audio, the peak portion of the waveform corresponding to the accented part of the pronunciation target data. Display control means reads the face image corresponding to the audio of the detected accented part from the storage means and displays it in a display form different from that of the face images corresponding to the audio of the other, non-accented parts of the waveform.

Accordingly, as pronunciation target data such as a word headword is displayed in pronunciation order, the face image corresponding to its pronunciation can be displayed, and in the accented part detected from the peak portion of the audio signal waveform the face image can be displayed in a different display form, so the facial expression that accompanies pronunciation of the accented part can be learned more clearly.
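Claim 8 does not say how the peak portion of the waveform is found. One simple possibility, offered purely as an assumption, is to compare the peak absolute amplitude of the audio segment belonging to each pronunciation symbol and treat the loudest segment as the accented part. The function name, the sample values, and the segment boundaries below are hypothetical.

```python
def accent_segment(samples, boundaries):
    """Return the index of the segment with the largest peak amplitude.

    samples:    sequence of audio sample values
    boundaries: list of (start, end) sample index ranges, one per
                pronunciation symbol; each range is assumed non-empty
    """
    peaks = [max(abs(s) for s in samples[start:end])
             for start, end in boundaries]
    # The segment whose peak is largest is taken as the accented part.
    return max(range(len(peaks)), key=peaks.__getitem__)


# Hypothetical three-symbol word; the middle segment is the loudest.
samples = [0.1, -0.2, 0.1, 0.6, -0.9, 0.5, 0.2, -0.1, 0.1]
boundaries = [(0, 3), (3, 6), (6, 9)]
accented = accent_segment(samples, boundaries)
```

A practical detector would more likely work on a smoothed energy envelope than on raw sample peaks, but the comparison across per-symbol segments would be the same.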

Further, the image display control device according to claim 9 of the present invention is the device according to claim 8 in which the display control means includes text display control means that displays the part of the pronunciation target data corresponding to the accented part detected by the detection means in a display form different from that of the parts of the pronunciation target data corresponding to the other, non-accented parts of the waveform.

Accordingly, in addition to displaying the face image corresponding to the pronunciation of the pronunciation target data, the accented part of the pronunciation target data can be displayed in a display form different from that of the rest of the data, so the accented part of the data and the facial expression that accompanies its pronunciation can be learned more clearly.

As described above, according to the audio display output control device (audio display output control processing program) of claim 1 (claim 10) of the present invention, audio data output means outputs audio data, text synchronous display control means displays text in synchronization with the output of the audio data, image display control means displays an image including at least a mouth portion, and mouth image display control means displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the audio data in synchronization with the audio data being output. Accent detection means then detects the accent of the audio data or the text, and image change display control means changes the image displayed by the image display control means in response to the accent detection. As a result, not only can text and an image be displayed in synchronization with the output of audio data, with a mouth shape image corresponding to the audio data shown in the mouth portion of the image, but the displayed image can also be changed when an accent is detected in the audio data or the text, so that the timing of the accent is clearly expressed.

According to the audio display output control device of claim 2 of the present invention, in the device according to claim 1, dictionary search means further searches for dictionary data corresponding to an input headword, and dictionary data display control means displays the dictionary data corresponding to the headword found by the search. The audio data is the pronunciation audio data of the headword found by the dictionary search means, and the text is the text of that headword. The output of the headword pronunciation audio data by the audio data output means, the display of the headword text synchronized with that audio data by the text synchronous display control means, and the display of the image by the image display control means are all performed while the dictionary data display control means is displaying the dictionary data corresponding to the searched headword. As a result, when the dictionary data corresponding to an input headword is searched for and displayed, the pronunciation audio data of the headword can be output, the headword text and the image can be displayed in synchronization with it, and the mouth shape image can be displayed synchronously. Moreover, the change in the displayed image upon accent detection clearly expresses the timing of the headword's accent.

According to the audio display output control device (audio display output control processing program) of claim 3 (claim 11) of the present invention, word storage means stores a plurality of words, each in association with its correctly accented pronunciation symbols and its incorrectly accented pronunciation symbols. Audio data output means outputs the pronunciation audio data of a stored word with either the correct accent or the incorrect accent, and text synchronous display control means displays the text of the word in synchronization with the pronunciation audio data being output. Image display control means displays an image including at least a mouth portion, in a display form that differs depending on whether the audio data output means outputs the correctly accented or the incorrectly accented pronunciation audio data. Mouth image display control means further displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the pronunciation audio data in synchronization with that data as it is output by the audio data output means. As the text synchronous display control means displays the word text synchronously, accent detection means detects the accent of the word from the accented pronunciation symbols stored for that word by the word storage means, and image change display control means changes the image displayed by the image display control means in response to the accent detection. As a result, not only can both correctly accented and incorrectly accented pronunciation audio data be output for the words stored by the word storage means, but the word text can be displayed in synchronization with that audio data, and a mouth shape image corresponding to the audio data can be displayed in the mouth portion of the displayed image. Because the displayed image also changes when the word accent is detected, the correct and incorrect accents of a word can be learned easily and with clear timing.

According to the audio display output control device of claim 4 of the present invention, in the device according to claim 3, correct/incorrect accent display control means further displays a stored word side by side with the correctly accented pronunciation symbols and the incorrectly accented pronunciation symbols associated with it, and correct/incorrect accent selection means selects either the correctly accented or the incorrectly accented pronunciation symbols of the displayed word. The audio data output means then outputs the correctly accented or the incorrectly accented pronunciation audio data of the word according to the selection made by the correct/incorrect accent selection means. As a result, either the correctly accented or the incorrectly accented pronunciation symbols can be selected for a word stored by the word storage means and the corresponding pronunciation audio data output. The word text can be displayed in synchronization with that audio data, a mouth shape image corresponding to the audio data can be displayed in the mouth portion of the displayed image, and the displayed image can be changed when the word accent is detected, so the correct and incorrect accents of a word can be learned even more easily and with clear timing.

According to the audio display output control device of claim 5 of the present invention, storage means stores a plurality of headwords, each in association with pronunciation audio data for at least two regions, and region designation means designates one of the two or more regions for which pronunciation audio data of a stored headword is held. Audio data output means then outputs the pronunciation audio data of the headword for the designated region, text synchronous display control means displays the text of the headword in synchronization with the pronunciation audio data being output for the designated region, image display control means displays an image including at least a mouth portion in a display form that differs according to the designated region, and mouth image display control means displays, for the mouth portion included in the displayed image, a mouth shape image corresponding to the pronunciation audio data in synchronization with that data as it is output. As the headword text is displayed synchronously, accent detection means detects the accent of the headword, and image change display control means changes the image displayed by the image display control means in response to the accent detection. As a result, for a single headword that has different regional pronunciations, the pronunciation audio data of a designated region can be output, and the headword text and the mouth shape image in the mouth portion of the displayed image can be displayed in synchronization with that output. An image in a display form that differs according to the designated region can also be displayed, and that image can be changed upon accent detection, so the pronunciation and accent timing of the designated region can be learned easily and clearly.

The image display control device (image display control processing program) of claim 6 (claim 12) of the present invention changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a headword. The first storage means stores a plurality of sets associating the pronunciation target data with phonetic symbols, including phonetic symbols with accent marks, and the second storage means stores a plurality of sets associating phonetic symbols, including phonetic symbols with accent marks, with their voices and face images. As the series of pronunciation target data is displayed in pronunciation order, the first control means reads the phonetic symbols corresponding to the pronunciation target data from the first storage means, reads the voice and face image corresponding to those phonetic symbols from the second storage means, outputs the read voice externally, and displays the read face image. When the voice is output externally under the first control, the second control means determines whether the read phonetic symbols include a phonetic symbol with an accent mark; if one is included, it reads the voice and face image corresponding to that accented phonetic symbol from the second storage means, outputs the read voice externally, and displays the read face image. As pronunciation target data such as a headword is displayed in pronunciation order, the voice output and face image display corresponding to its phonetic symbols can thus be performed, and at the accented portion the voice output and face image display corresponding to the accented phonetic symbol can be performed. The pronunciation of a word and the accompanying facial expression, as well as the pronunciation of its accented portion and the facial expression accompanying that portion, can therefore be learned easily and clearly.
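The two-store lookup described for claim 6 can be sketched as follows. This is a minimal illustration, not the patented implementation: the symbol spellings, the trailing apostrophe used as an accent mark, and all identifiers (`FIRST_STORE`, `SECOND_STORE`, `render`) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendering:
    voice: str       # placeholder for an audio clip identifier
    face_image: str  # placeholder for a face image identifier


# First store: pronunciation target -> phonetic symbols in pronunciation order.
# A trailing "'" marks an accented symbol (an assumption for this sketch).
FIRST_STORE = {"low": ["l", "o'", "u"]}

# Second store: phonetic symbol -> voice and face image.
# Accented and unaccented symbols have separate entries.
SECOND_STORE = {
    "l": Rendering("voice_l", "face_l"),
    "o": Rendering("voice_o", "face_o"),
    "o'": Rendering("voice_o_accent", "face_o_accent"),
    "u": Rendering("voice_u", "face_u"),
}


def is_accented(symbol: str) -> bool:
    """Second-control check: does the symbol carry an accent mark?"""
    return symbol.endswith("'")


def render(target: str) -> list[tuple[str, bool, Rendering]]:
    """Return (symbol, accented?, voice and face image) for each symbol in order."""
    steps = []
    for symbol in FIRST_STORE[target]:
        steps.append((symbol, is_accented(symbol), SECOND_STORE[symbol]))
    return steps
```

Calling `render("low")` yields three steps, and only the middle one is flagged as accented and resolved to the accent-specific voice and face image.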

According to the image display control device of claim 7 of the present invention, in the device of claim 6, the phonetic symbols stored in the second storage means consist of phonetic symbols with accent marks and phonetic symbols without accent marks, and the voice and face image stored in association with a phonetic symbol with an accent mark differ from those stored in association with a phonetic symbol without one. The difference between the pronunciation and accompanying facial expression in an unaccented portion of pronunciation target data such as a headword, and the pronunciation and accompanying facial expression in an accented portion, can thus be learned more clearly.

The image display control device of claim 8 of the present invention likewise changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a headword. The storage means stores a plurality of sets associating the pronunciation target data with its voice and face images; the detection means detects, in the signal waveform of the stored voice, the peak portion corresponding to the accent; and the display control means reads the face image corresponding to the voice of the detected accent portion from the storage means and displays it in a display form different from that of the face images corresponding to the voice of the other portions of the waveform. As pronunciation target data such as a headword is displayed in pronunciation order, a face image corresponding to its pronunciation can thus be displayed, and at the accent portion detected from the peak of the voice signal waveform a face image in a different display form can be displayed, so that the facial expression accompanying pronunciation of the accent portion can be learned more clearly.
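The source says only that the peak portion of the voice signal waveform is detected as the accent. One simple way to do that is sketched below, under assumptions that are not in the source: the waveform is a list of samples, it is cut into fixed-length frames, and a frame counts as accented when its peak amplitude reaches a chosen fraction (`ratio`) of the loudest frame's peak.

```python
def accent_frame_indices(samples, frame_len, ratio=0.8):
    """Return indices of frames whose peak amplitude is at least
    `ratio` times the peak amplitude of the loudest frame.

    `frame_len` and `ratio` are illustrative parameters, not values
    taken from the patent.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    # Peak absolute amplitude of each consecutive frame.
    peaks = [
        max(abs(s) for s in samples[i:i + frame_len])
        for i in range(0, len(samples), frame_len)
    ]
    if not peaks:
        return []
    top = max(peaks)
    if top == 0:
        return []  # silent input: no accent to report
    return [i for i, p in enumerate(peaks) if p >= ratio * top]
```

A display controller could then show the accent-form face image while the frames returned here are being played and the normal face image otherwise.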

According to the image display control device of claim 9 of the present invention, in the device of claim 8, the display control means includes text display control means that displays the portion of the pronunciation target data corresponding to the accent portion detected by the detection means in a display form different from that of the portions corresponding to the other parts of the signal waveform. In addition to displaying the face image corresponding to the pronunciation of the pronunciation target data, the accent portion of the pronunciation target data can thus be displayed differently from the rest of it, so that the accent portion and the facial expression accompanying its utterance can be learned more clearly.

The present invention can therefore provide an audio display output control device, an image display control device, an audio display output control processing program, and an image display control processing program that make it possible to show the timing of an accent clearly in an image displayed in synchronization with audio output.

FIG. 1 is a block diagram showing the configuration of the electronic circuit of a portable device 10 according to an embodiment of the audio display output control device (image display control device) of the present invention.

FIG. 2 shows the link data for synchronized playback of one headword, "low", in the dictionary database 12b stored in the memory 12 of the portable device 10; (A) is a table of each file No. and its storage address, (B) shows the text data "low" stored under the text file No., and (C) shows the text characters, phonetic symbols, and mouth shape numbers stored under the text mouth synchronization file No.

FIG. 3 shows the character image data 12d that is stored in the memory 12 of the portable device 10 and used selectively, by user setting, for the synchronous display of pronunciation mouth shape images in a dictionary headword search.

FIG. 4 shows the voice-specific mouth image data 12e that is stored in the memory 12 of the portable device 10 and composited into the mouth image area (X1, Y1, X2, Y2) of a character image (12d: No1 to No3) for the synchronous display of pronunciation mouth shape images in a dictionary headword search.

FIG. 5 shows the time code file 12f23 (12i) of file No23, associated with the headword "low", in the dictionary time code file 12f stored in the memory 12 of the portable device 10.

FIG. 6 shows the command codes of the commands described in the dictionary time code file 12fn (see FIG. 5) of the portable device 10, together with the instruction content analyzed and processed on the basis of their parameter data.

FIG. 7 is a flowchart showing the main process according to the dictionary processing program 12a of the portable device 10.

FIG. 8 is a flowchart showing the headword synchronized playback process that accompanies the main process of the portable device 10.

FIG. 9 is a flowchart showing the text-corresponding mouth display process executed by interrupt in response to the highlight display of each headword character during the headword synchronized playback process of the portable device 10.

FIG. 10 shows the setting display state of the character image for synchronized playback in the character setting process within the main process of the portable device 10.

FIG. 11 shows the search headword display screen G2 in the headword search process within the main process of the portable device 10.

FIG. 12 shows the display states of the headword character display window W1 and the pronunciation mouth shape display window W2 opened on the search headword display screen G2 during synchronized playback in the headword search process of the portable device 10, with character image No3 set; (A) shows the setting display state of windows W1 and W2 on the screen G2, (B) shows how window W1 and a window W2 without accent support change in synchronization with the pronunciation audio output, and (C) shows how window W1 and a window W2 with accent support change in synchronization with the pronunciation audio output.

FIG. 13 shows the display states of the headword character display window W1 and the pronunciation mouth shape display window W2 opened on the search headword display screen G2 during synchronized playback in the headword search process of the portable device 10, with character image No1 set; (A) shows the setting display state of windows W1 and W2 on the screen G2, and (B) shows how windows W1 and W2 change in synchronization with the pronunciation audio output.

FIG. 14 shows the search headword display screen G2 in the headword search process within the main process of the portable device 10 when an English-Japanese dictionary containing both US and UK pronunciations is used.

FIG. 15 shows the display states of the headword character display window W1 and the pronunciation mouth shape display window W2 opened on the search headword display screen G2 during synchronized playback in the headword search process of the portable device 10 when US pronunciation [US] is designated; (A) shows the setting display state of windows W1 and W2 on the screen G2, and (B) shows how windows W1 and W2 change in synchronization with the US pronunciation audio output.

FIG. 16 shows the display states of the headword character display window W1 and the pronunciation mouth shape display window W2 opened on the search headword display screen G2 during synchronized playback in the headword search process of the portable device 10 when UK pronunciation [UK] is designated; (A) shows the setting display state of windows W1 and W2 on the screen G2, and (B) shows how windows W1 and W2 change in synchronization with the UK pronunciation audio output.

FIG. 17 shows the operation display states when an incorrect answer is selected in the accent test process of the portable device 10; (A) shows the accent test question display screen G3, (B) shows the setting display state of windows W1 and W2 on the headword display screen G2 for the question, and (C) shows how windows W1 and W2 change in synchronization with the output of the incorrectly accented pronunciation.

FIG. 18 shows the operation display states when the correct answer is selected in the accent test process of the portable device 10; (A) shows the accent test question display screen G3, (B) shows the setting display state of windows W1 and W2 on the headword display screen G2 for the question, and (C) shows how windows W1 and W2 change in synchronization with the output of the correctly accented pronunciation.

FIG. 19 is a flowchart showing the headword synchronized playback process of the second embodiment of the portable device 10.

FIG. 20 is a flowchart showing the headword synchronized playback process of the third embodiment of the portable device 10.

Embodiments of the present invention will be described below with reference to the drawings.

(First embodiment)

FIG. 1 is a block diagram showing the configuration of the electronic circuit of a portable device 10 according to an embodiment of the audio display output control device (image display control device) of the present invention.

The portable device (PDA: personal digital assistant) 10 is a computer that reads a program recorded on any of various recording media, or a program received by communication, and whose operation is controlled by that program. Its electronic circuit includes a CPU (central processing unit) 11.

The CPU 11 controls the operation of each part of the circuit according to a PDA control program that is either stored in advance in the FLASH memory 12A in the memory 12, read into the memory 12 from an external recording medium 13 such as a ROM card via the recording medium reading unit 14, or read into the memory 12 from another computer terminal (30) on a communication network N such as the Internet via the transmission control unit 15. The PDA control program stored in the memory 12 is started in response to an input signal produced by user operation of the input unit 17a, consisting of switches and keys, or of the coordinate input device 17b, consisting of a mouse or tablet; a communication signal from another computer terminal (30) on the communication network N received by the transmission control unit 15; or a communication signal from an external communication device (PC: personal computer) 20 received via the communication unit 16 over a short-range wireless connection such as Bluetooth (registered trademark) or a wired connection.

In addition to the memory 12, the recording medium reading unit 14, the transmission control unit 15, the communication unit 16, the input unit 17a, and the coordinate input device 17b, the CPU 11 is connected to a display unit 18 consisting of an LCD, an audio input unit 19a that has a microphone and receives audio, a stereo audio output unit 19b that has left and right channel speakers L and R and outputs audio, and so on.

The CPU 11 also has a built-in timer for measuring processing time.

The memory 12 of the portable device 10 comprises a FLASH memory (EEP-ROM) 12A and a RAM 12B.

The FLASH memory (EEP-ROM) 12A stores various PDA control programs: a system program that governs the overall operation of the portable device 10; a network communication program for data communication with each computer terminal (30) on the communication network N via the transmission control unit 15; an external device communication program for data communication with the external communication device (PC) 20 via the communication unit 16; a schedule management program; an address management program; and a dictionary processing program 12a. The dictionary processing program 12a performs dictionary headword searches, synchronized playback of the various data corresponding to a found headword (voice, text, and face images, including mouth shape composite images), setting of the type of face image (character), and headword accent tests.

The FLASH memory (EEP-ROM) 12A further stores a dictionary database 12b (see FIG. 2), dictionary audio data 12c, character image data 12d (see FIG. 3), voice-specific mouth (shape) image data 12e (see FIG. 4), and a dictionary time code file 12f (see FIGS. 5 and 6).

The dictionary database 12b stores the data of various dictionaries, such as English-Japanese, Japanese-English, and Japanese-language dictionaries. As shown in FIG. 2, for every headword in a dictionary it also stores the following items linked to one another: the headword No.; the No. and storage address of the time code file used for simple synchronized playback of voice, text, and images; the No. and storage address of the HTML file used to set up the image playback window; the No. and storage address of the text file; the No. and storage address of the text mouth synchronization file, which associates each character of the text with a phonetic symbol and a mouth shape number; the No. and storage address of the sound file holding the audio data; and the data number and storage address of the dictionary content.
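The per-headword link record described above can be modeled as a small data structure. The sketch below is illustrative only: the class and field names are assumptions, and every number and address in the example is hypothetical except time code file No. 23, which the description associates with the headword "low".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRef:
    number: int   # file No. (or data number for dictionary content)
    address: int  # storage address


@dataclass(frozen=True)
class HeadwordLinks:
    headword_no: int
    time_code: FileRef        # drives synchronized playback
    html: FileRef             # sets up the image playback window
    text: FileRef             # headword text
    text_mouth_sync: FileRef  # character / phonetic symbol / mouth number rows
    sound: FileRef            # pronunciation audio data
    dictionary_data: FileRef  # meaning and other dictionary content


# Example record for "low". Only time code file No. 23 comes from the
# description; all other numbers and addresses are placeholders.
low_links = HeadwordLinks(
    headword_no=1,
    time_code=FileRef(23, 0x1000),
    html=FileRef(1, 0x2000),
    text=FileRef(1, 0x3000),
    text_mouth_sync=FileRef(1, 0x4000),
    sound=FileRef(1, 0x5000),
    dictionary_data=FileRef(1, 0x6000),
)
```

Keeping one reference type for all six linked items mirrors the "No. plus storage address" pattern that the table in FIG. 2(A) is described as using.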

In each embodiment, because the proper phonetic symbols are difficult to enter in the text of the specification, similar characters are substituted there; the proper phonetic symbols are shown in the drawings.

FIG. 2 shows the link data for synchronized playback of one headword, "low", in the dictionary database 12b stored in the memory 12 of the portable device 10; (A) is a table of each file No. and its storage address, (B) shows the text data "low" stored under the text file No., and (C) shows the text characters, phonetic symbols, and mouth shape numbers stored under the text mouth synchronization file No.

The dictionary audio data 12c stores the audio data for pronouncing each headword in the dictionary database 12b, in association with its sound file No. and address.

FIG. 3 shows the character image data 12d that is stored in the memory 12 of the portable device 10 and used selectively, by user setting, for the synchronous display of pronunciation mouth shape images in a dictionary headword search.

In this embodiment, three character images (face images) No1 to No3 are provided as the character image data 12d. Each of the character images No1, No2, and No3 is stored together with mouth image area data (X1, Y1, X2, Y2), which specifies the rectangular area into which a mouth shape image is composited as the coordinates of two diagonally opposite corners.
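Compositing a mouth shape image into the rectangle given by the mouth image area data can be sketched as below. The representation is an assumption made for illustration: images are lists of pixel rows, the corners (X1, Y1) and (X2, Y2) are treated as inclusive, and the mouth image is assumed to have exactly the size of the area.

```python
def composite_mouth(face, mouth, area):
    """Return a copy of `face` with `mouth` pasted into `area`.

    face, mouth: images as lists of rows of pixel values.
    area: (x1, y1, x2, y2), top-left and bottom-right corners, inclusive
          (the inclusive convention is an assumption of this sketch).
    """
    x1, y1, x2, y2 = area
    width, height = x2 - x1 + 1, y2 - y1 + 1
    if y2 >= len(face) or x2 >= len(face[0]) or x1 < 0 or y1 < 0:
        raise ValueError("mouth image area lies outside the face image")
    if len(mouth) != height or any(len(row) != width for row in mouth):
        raise ValueError("mouth image must match the mouth image area")
    out = [row[:] for row in face]  # copy so the stored character image is untouched
    for dy, mouth_row in enumerate(mouth):
        out[y1 + dy][x1:x2 + 1] = mouth_row
    return out
```

Because the function returns a new image, the same stored character image can be reused for each successive mouth shape during playback.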

For each of the three character images (face images) No1 to No3, accent face images No1' to No3' are also stored for expressing emphasis in pronunciation at the timing of the accent of a headword found by dictionary search (see FIG. 12(C)(2) and FIG. 13(B)(2)). Also stored are the character images No1US to No3US used when American English pronunciation is designated (see FIG. 15), the character images No1UK to No3UK used when British English pronunciation is designated (see FIG. 16), and their accent face images No1US' to No3US' (see FIG. 15(B)(2)) and No1UK' to No3UK' (see FIG. 16(B)(2)).

FIG. 4 shows the voice-specific mouth image data 12e that is stored in the memory 12 of the portable device 10 and composited into the mouth image area (X1, Y1, X2, Y2) of a character image (12d: No1 to No3) for the synchronous display of pronunciation mouth shape images in a dictionary headword search.

The voice-specific mouth (shape) image data 12e stores mouth shape images 12e1, 12e2, ..., one for each phonetic symbol needed to pronounce all the headwords stored in the dictionary database 12b, each in association with its mouth number No. n.

The dictionary time code file 12f stored in the memory 12 of the portable device 10 is a command file (see FIG. 5) for synchronized playback of the voice, text, and face images (including mouth shape composite images) corresponding to a headword found by dictionary search. A file is prepared not for every headword individually but for each group of headwords that share the same number of characters, number of phonetic symbols, and pronunciation timing, and it is compressed and encrypted with a predetermined algorithm.

FIG. 5 shows the time code file 12f23 (12i) of file No23, associated with the headword "low", in the dictionary time code file 12f stored in the memory 12 of the portable device 10.

A time code file 12fn contains an array of time codes for the command processing that plays back the various data (voice, text, images) in synchronization at a fixed reference processing unit time (for example, 25 ms) set in advance as header information H. Each time code consists of a command code, which designates an instruction, and parameter data, which is a reference number or specified value that associates the command with its data content (text file, sound file, image file, and so on).

For example, when the preset reference processing unit time is 25 ms, the playback time of the time code file 12f23 for the headword "low" shown in FIG. 5 is 1 second, the result of a playback process consisting of 40 time code steps.
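The arithmetic behind this figure is simply the step count multiplied by the reference processing unit time. The helper below restates it; the function name is illustrative, and the only values taken from the description are 40 steps and 25 ms.

```python
def playback_time_ms(steps: int, unit_ms: int = 25) -> int:
    """Total playback time of a time code file, in milliseconds:
    one time code step is processed per reference processing unit time."""
    if steps < 0 or unit_ms <= 0:
        raise ValueError("steps must be >= 0 and unit_ms must be > 0")
    return steps * unit_ms


# 40 steps at 25 ms per step give 1000 ms, i.e. 1 second.
total_ms = playback_time_ms(40)
```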

FIG. 6 shows the command codes of the commands described in the dictionary time code file 12fn (see FIG. 5) of the portable device 10, together with the instruction content analyzed and processed on the basis of their parameter data.

The commands used in a time code file 12fn comprise standard commands and extended commands. The standard commands are: LT (load the i-th text); VD (display the i-th text phrase); BL (reset the character counter and designate the i-th phrase block); HN (no highlight; increment the character counter); HL (highlight up to the i-th character; count characters); LS (scroll one line; increment the character counter); DH (display the i-th HTML file); DI (display the i-th image file); PS (play the i-th sound file); CS (clear all files); PP (pause for i seconds of basic time); FN (end processing); and NP (invalid).
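The standard command set lends itself to a lookup table plus a step loop. The sketch below is an assumption-laden illustration: the command meanings paraphrase the list above, but the `trace` function, the `(code, parameter)` tuple layout, and the example sequence are hypothetical and do not reproduce the contents of FIG. 5.

```python
# Standard command codes and their meanings, paraphrased from the description.
STANDARD_COMMANDS = {
    "LT": "load the i-th text",
    "VD": "display the i-th text phrase",
    "BL": "reset the character counter and designate the i-th phrase block",
    "HN": "no highlight; increment the character counter",
    "HL": "highlight up to the i-th character; count characters",
    "LS": "scroll one line; increment the character counter",
    "DH": "display the i-th HTML file",
    "DI": "display the i-th image file",
    "PS": "play the i-th sound file",
    "CS": "clear all files",
    "PP": "pause for i seconds of basic time",
    "FN": "end processing",
    "NP": "invalid (no operation)",
}


def trace(time_codes, unit_ms=25):
    """Walk (code, parameter) pairs one per unit time and return
    (elapsed_ms, code, parameter) entries, stopping after FN."""
    log = []
    for step, (code, param) in enumerate(time_codes):
        if code not in STANDARD_COMMANDS:
            raise ValueError(f"unknown command code: {code}")
        log.append((step * unit_ms, code, param))
        if code == "FN":
            break  # FN ends processing; later time codes are ignored
    return log


# Hypothetical sequence for illustration only.
example = [("CS", 0), ("DH", 0), ("PS", 0), ("NP", 0), ("FN", 0), ("NP", 0)]
```

Running `trace(example)` produces five entries spaced 25 ms apart, with the FN entry at 100 ms and the trailing NP never reached.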

The RAM 12B in the memory 12 provides: a search headword memory 12g, which stores a headword read out according to its headword number in the course of a search of the dictionary database 12b; a headword dictionary data memory 12h, which stores dictionary data such as the meaning of the found headword, read from the dictionary database 12b according to its dictionary data number; and a playback time code file memory 12i, which stores the time code file 12fn (see FIG. 5) for synchronized playback of the voice, text, and images corresponding to the found headword, read from the dictionary time code file 12f according to the time code file No. in the dictionary database 12b and then decompressed and decrypted.

The RAM 12B in the memory 12 further provides: a synchronization HTML file memory 12j, which stores the HTML file that sets up the text and image synchronized-reproduction windows W1 and W2 (see FIGS. 12 and 13) on the headword search screen G2, read out of the dictionary database 12b according to its HTML file No; a synchronization text file memory 12k, which stores the text data of the retrieved headword, read out of the dictionary database 12b according to its text file No; a synchronization sound file memory 12m, which stores the pronunciation sound data of the retrieved headword, read out of the dictionary sound data 12c according to the sound file No in the dictionary database 12b; a synchronization image file memory 12n, which stores the character image that the user has set for displaying pronunciation images of the retrieved headword, read out of the character image data 12d (see FIG. 3); a mouth image area memory 12p, which stores the mouth image area data (X1, Y1; X2, Y2) indicating where the mouth-shape image is composited into the character image held in the synchronization image file memory 12n; and an image expansion buffer 12q, in which the character image and mouth-shape image to be reproduced in synchronization with the sound and text are expanded, composited, and stored according to the time code file 12fn for the retrieved headword held in the time code file memory 12i.

For example, suppose that the headword retrieved by running the dictionary processing program 12a stored in the FLASH memory 12A of the portable device (PDA) 10 is "low", and that the time code file 12fn read out of the dictionary time code file 12f and stored in the reproduction time code file memory 12i is the time code file 12f23 shown in FIG. 5. When the third command code "DI" and its parameter data "00" are read in the course of command processing at each set processing unit time, the command "DI" is an instruction to display the i-th image file, so the character image 12dn stored in the synchronization image file memory 12n, linked from the parameter data i = 00, is read out and displayed.

Likewise, when the fourth command code "PS" and its parameter data "00" are read, the command "PS" is an instruction to play the i-th sound file, so the sound data 12cn stored in the synchronization sound file memory 12m, linked from the parameter data i = 00, is read out and output.

When the sixth command code "VD" and its parameter data "00" are read, the command "VD" is an instruction to display the i-th text phrase, so the 0th phrase of the text is displayed according to the parameter data i = 00 (in this case, the text "low" of the retrieved headword stored in the synchronization text file memory 12k).

Further, when the ninth command code "NP" and its parameter data "00" are read, the command "NP" is a no-operation instruction, so the current file output state is maintained.
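The four cases just described (DI, PS, VD, and NP) can be sketched as a small dispatcher. This is a minimal sketch under stated assumptions: `OutputState`, `execute`, and the `memories` dictionary are hypothetical stand-ins for the display and sound output and for the synchronization memories 12n, 12m, and 12k; the remaining commands are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class OutputState:
    """What is currently shown or played (hypothetical model of the outputs)."""
    image: Optional[str] = None
    sound: Optional[str] = None
    text: Optional[str] = None


def execute(state: OutputState, code: str, param: int,
            memories: Dict[str, List[str]]) -> OutputState:
    """Apply one command; the parameter i selects the linked file."""
    if code == "DI":      # display the i-th image file
        state.image = memories["image"][param]
    elif code == "PS":    # play the i-th sound file
        state.sound = memories["sound"][param]
    elif code == "VD":    # display the i-th text phrase
        state.text = memories["text"][param]
    elif code == "NP":    # no operation: the output state is left unchanged
        pass
    else:
        raise ValueError(f"command {code!r} is not covered by this sketch")
    return state
```

For the headword "low", executing DI, PS, and VD with parameter 0 fills all three outputs, and a following NP leaves them as they are.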

The detailed operation of synchronized reproduction of the pronunciation sound, text, and images (mouth-shape images) for the retrieved headword, based on the time code file 12f23 (12i) with the contents shown in FIG. 5, is described later.

Next, the various operations of the portable device 10 configured as above will be described.

FIG. 7 is a flowchart showing the main process according to the dictionary processing program 12a of the portable device 10.

FIG. 8 is a flowchart showing the headword synchronized reproduction process performed within the main process of the portable device 10.

FIG. 9 is a flowchart showing the text-corresponding mouth display process, which is executed as an interrupt each time a headword character is highlighted during the headword synchronized reproduction process of the portable device 10.

FIG. 10 shows the setting display state of the character image for synchronized reproduction during the character setting process in the main process of the portable device 10.

When the device is switched to the character image setting mode by operating the "Setting" key 17a1 and the cursor key 17a2 of the input unit 17a (steps S1 → S2), the character image data stored in the FLASH memory 12A, for example the three types 12d1 (No1), 12d2 (No2), and 12d3 (No3) (see FIG. 3), are read out and displayed on the display unit 18 as a character image list selection screen G1, as shown in FIG. 10 (step S3).

On the character image list selection screen G1, the selection frame X is moved by operating the cursor key 17a3 to select the character image the user wants (for example, 12d3 (No3)). When the selection is confirmed with the "Translation/Decision (voice)" key 17a4 and detected (step S4), the selected character image 12dn is read out and transferred to and stored in the synchronization image file memory 12n in the RAM 12B (step S5). The mouth image area data (X1, Y1; X2, Y2) indicating where the mouth-shape image is composited into the selected character image 12dn is also read out and transferred to and stored in the mouth image area memory 12p in the RAM 12B (step S6).
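Steps S5 and S6 amount to copying two items for the selected character into RAM. The sketch below illustrates this; the `CHARACTER_IMAGES` table, the file names, and the coordinate values are invented placeholders, and `Ram` models only the two memories 12n and 12p.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the character image data 12d: image name plus
# mouth image area (X1, Y1, X2, Y2). All values are illustrative only.
CHARACTER_IMAGES = {
    1: {"image": "character_no1.img", "mouth_area": (40, 60, 72, 80)},
    2: {"image": "character_no2.img", "mouth_area": (42, 62, 70, 78)},
    3: {"image": "character_no3.img", "mouth_area": (44, 58, 76, 82)},
}


@dataclass
class Ram:
    image_file_12n: Optional[str] = None
    mouth_area_12p: Optional[Tuple[int, int, int, int]] = None


def set_character(ram: Ram, number: int) -> Ram:
    """Store the selected character image and its mouth area in RAM."""
    entry = CHARACTER_IMAGES[number]
    ram.image_file_12n = entry["image"]       # step S5
    ram.mouth_area_12p = entry["mouth_area"]  # step S6
    return ram
```

Keeping the mouth area alongside the image means the later compositing step does not need to know which character was chosen.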

In this way, the character image into which the mouth-shape images are composited, and which is displayed in synchronization with the pronunciation sound of a headword during a headword search, is selected and set.

FIG. 11 shows the search headword display screen G2 produced by the headword search process in the main process of the portable device 10.

To search for a headword in, for example, the English-Japanese dictionary data stored in the dictionary database 12b, the user sets the English-Japanese dictionary search mode by operating the "English-Japanese" key 17a5 of the input unit 17a and then enters the headword to be searched for (for example, "low") (steps S7 → S8). The headwords that match the input, or that begin with the input characters, are then retrieved from the English-Japanese dictionary data and displayed on the display unit 18 as a list of retrieved headwords (not shown) (step S9).
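The matching rule of step S9 (exact matches plus headwords that begin with the input) is a prefix search. A minimal sketch follows, assuming the dictionary headwords are available as a plain list; `search_headwords` and the sample word list are illustrative and not part of the described device.

```python
from typing import List


def search_headwords(query: str, headwords: List[str]) -> List[str]:
    """Return headwords equal to the query or beginning with it, in list order."""
    # str.startswith also covers the exact match, since a word begins with itself.
    return [word for word in headwords if word.startswith(query)]


candidates = search_headwords("low", ["law", "low", "lower", "lowland", "slow"])
# candidates == ["low", "lower", "lowland"]
```

A linear scan is enough for illustration; a sorted dictionary would more likely use a binary search to find the first candidate.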

On this list screen, when the headword that matches the user's input (here, "low") is selected with the cursor key and the "Translation/Decision (voice)" key 17a4 is operated (step S10), the selected headword "low" is stored in the search headword memory 12g in the RAM 12B, the dictionary data for "low" (pronunciation, part of speech, meaning, and so on) is read out and stored in the headword-corresponding dictionary data memory 12h in the RAM 12B, and the result is displayed on the display unit 18 as the search headword display screen G2, as shown in FIG. 11 (step S11).

When the "Translation/Decision (voice)" key 17a4 is operated at this point (step S12) in order to output the pronunciation sound of the displayed headword "low" and, at the same time, display its characters, phonetic symbols, and pronunciation mouth-shape images in synchronization, processing moves to the synchronized reproduction process of FIG. 8 (step SA).

FIG. 12 shows the headword character display window W1 and the pronunciation mouth-shape display window W2 that are displayed on the search headword display screen G2, with character image No3 set, during the synchronized reproduction process within the headword search process of the portable device 10. FIG. 12(A) shows the windows W1 and W2 as set up on the search headword display screen G2; FIG. 12(B) shows how the window W1 and a window W2 without accent support change in synchronization with the pronunciation sound output; and FIG. 12(C) shows how the window W1 and a window W2 with accent support change in synchronization with the pronunciation sound output.

When the synchronized reproduction process of FIG. 8 (step SA) is started by operating the "Translation/Decision (voice)" key 17a4 while the search headword display screen G2 is displayed, initialization such as clearing each work area in the RAM 12B is performed (step A1). Then, based on the synchronized-reproduction link data (see FIG. 2) stored in the dictionary database 12b for the current headword "low", the HTML file that sets up the text and image synchronized-reproduction windows W1 and W2 (see FIG. 12) on the headword search screen G2 is read out according to its HTML file No3 and written to the synchronization HTML file memory 12j. The text data "low (with phonetic symbols)" of the headword is read out according to its text file No4222 and written to the synchronization text file memory 12k, and the pronunciation sound data of the headword is read out according to its sound file No4222 and written to the synchronization sound file memory 12m (step A2).

The character image that the user has set for displaying pronunciation images of the headword (here, 12d3 (No3)) has already been read out of the character image data 12d (see FIG. 3) and written to the synchronization image file memory 12n in step S5 of the character setting process. Likewise, the mouth image area data (X1, Y1; X2, Y2), which defines the area of the character image 12d3 (No3) into which pronunciation mouth-shape images are composited, has already been written to the mouth image area memory 12p in step S6 of the character setting process.

Next, from among the encrypted time code files 12fn for synchronized reproduction of sound, text, and images, which are stored in the FLASH memory 12A as the dictionary time code file 12f for the various headwords, the time code file 12f23 (see FIG. 5) for the current headword "low" is decrypted and read out according to the time code file No23 described in the synchronized-reproduction link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).

Once the files for synchronized reproduction of the pronunciation sound, text, and pronunciation mouth-shape images for the headword "low" have been loaded into the RAM 12B, and the time code file 12f23 for reproducing them in synchronization has been transferred to the RAM 12B, the processing unit time (for example, 25 ms) that the CPU 11 uses for the time code file (CAS file) 12f23 (see FIG. 5) held in the time code file memory 12i is set from the header information H of that time code file (step A4).

Read pointers are then set to the beginning of the time code file 12f23 held in the time code file memory 12i and to the beginning of each of the files written to the synchronization file memories 12j, 12k, 12m, and 12n (step A5), and a timer that times the reproduction processing of the synchronization files is started (step A6).

Once the processing timer has been started in step A6, the command code and its parameter data at the read pointer position in the time code file 12f23 (see FIG. 5), initially the position set in step A5, are read out at every processing unit time (25 ms) set in step A4 for this time code file (step A7).

It is then determined whether the command code read from the time code file 12f23 (see FIG. 5) is "FN" (step A8). If it is, the synchronized reproduction process is stopped at that point (steps A8 → A9).

If the command code read from the time code file 12f23 (see FIG. 5) is not "FN", the processing corresponding to that command code (see FIG. 6) is executed (step A10).

When the timer indicates that the next processing unit time (25 ms) has been reached, the read pointer for the time code file 12f23 (see FIG. 5) held in the RAM 12B is advanced to the next position (steps A11 → A12), and processing repeats from the reading of the command code and parameter data at the read pointer position in step A7 (steps A12 → A7 to A10).
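The loop of steps A5 to A12 can be sketched as follows. This is a simplified model, not the device's implementation: `run_time_code` is a hypothetical name, commands are assumed to be `(code, parameter)` pairs, and the timer is simulated by adding the processing unit time instead of actually waiting.

```python
from typing import Callable, List, Tuple

CommandRecord = Tuple[str, int]


def run_time_code(commands: List[CommandRecord], unit_ms: int,
                  execute: Callable[[str, int], None]
                  ) -> List[Tuple[int, str, int]]:
    """Process one command per processing unit time until FN is read."""
    log = []
    elapsed_ms = 0   # simulated timer, started in step A6
    pointer = 0      # read pointer set to the head of the file in step A5
    while pointer < len(commands):
        code, param = commands[pointer]        # step A7: read command and parameter
        if code == "FN":                       # step A8: end command?
            break                              # step A9: stop synchronized reproduction
        execute(code, param)                   # step A10: perform the command
        log.append((elapsed_ms, code, param))
        elapsed_ms += unit_ms                  # step A11: next unit time reached
        pointer += 1                           # step A12: advance the read pointer
    return log
```

Because every record occupies exactly one unit time, NP records act as timing padding between the commands that change the output.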

The synchronized reproduction of the pronunciation sound, text, and pronunciation mouth-shape image files based on the time code file 12f23 for the headword "low" shown in FIG. 5 will now be described in detail.

In the time code file 12f23, command processing is executed at every (reference) processing unit time (for example, 25 ms) described in advance in its header H. First, when the first command code "CS" (clear all files) and its parameter data "00" are read from the time code file 12f23 (see FIG. 5), an instruction to clear the output of all files is issued, and the output of the text, sound, and image files is cleared (step A10).

When the second command code "DH" (display the i-th HTML file) and its parameter data "00" are read, the headword text and image frame data of the HTML data are read out of the synchronization HTML file memory 12j in the RAM 12B according to the parameter data (i = 0), and the text and image synchronized-reproduction windows W1 and W2 are set up on the headword search screen G2 on the display unit 18, as shown in FIG. 12(A) (step A10).

When the third command code "DI" (display the i-th image file) and its parameter data "00" are read, the character image 12d (here, No3) that was set and stored in the character setting process (steps S2 to S6) is read out of the synchronization image file memory 12n in the RAM 12B according to the parameter data (i = 0) and displayed in the image synchronized-reproduction window W2 set up by the HTML file on the headword search screen G2, as shown in FIG. 12(A) (step A10).

When the fourth command code "PS" (play the i-th sound file) and its parameter data "00" are read, the pronunciation sound data for the headword "low" that was stored in step A2 is read out of the synchronization sound file memory 12m in the RAM 12B according to the parameter data (i = 0), and sound output from the stereo sound output unit 19b begins (step A10).

When the fifth command code "LT" (load the i-th text) and its parameter data "00" are read, the one-phrase text data "l", "o", "w" (including phonetic symbols) for the headword "low", stored in the synchronization text file memory 12k in the RAM 12B in step A2, is designated according to the parameter data (i = 0) (step A10).

When the sixth command code "VD" (display the i-th text phrase) and its parameter data "00" are read, the one-phrase text data "l", "o", "w" (including phonetic symbols) designated by the fifth command code "LT" is read out according to the parameter data (i = 0) and displayed in the text synchronized-reproduction window W1 on the headword search screen G2, as shown in FIG. 12(A) (step A10).

When the seventh command code "BL" (reset the character counter and designate the i-th phrase block) and its parameter data "00" are read, the character counter for the headword "low" displayed in the text synchronized-reproduction window W1 is reset (step A10).

When the eighth command code "HL" (highlight up to the i-th character; count characters) and its parameter data "01" are read, the highlight (identification) display HL, by color change, reverse video, underlining, or the like, is applied according to the parameter data (i = 1) up to the first character "l" of the headword "low" (including phonetic symbols) displayed in the text synchronized-reproduction window W1 and its corresponding phonetic symbol, as shown in FIG. 12(A), and the character counter is counted up to the second character and its corresponding phonetic symbol (step A10).
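The effect of "HL" on the text and the character counter can be sketched as below. The function `apply_hl` is hypothetical; it treats the parameter i as a 1-based character count and ignores the phonetic symbols, which the device highlights together with the characters.

```python
from typing import Tuple


def apply_hl(text: str, i: int) -> Tuple[str, str, int]:
    """Split the text at the i-th character and advance the character counter.

    Returns (highlighted part, remaining part, next counter value).
    """
    highlighted = text[:i]   # characters 1..i receive the highlight display
    remaining = text[i:]     # characters after the i-th stay unhighlighted
    next_counter = i + 1     # the counter now points at the following character
    return highlighted, remaining, next_counter
```

With the headword "low" and i = 1, the highlighted part is "l", the remainder is "ow", and the counter moves to the second character.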

Whenever a character of the headword "low" and its corresponding phonetic symbol are highlighted under the time code file 12f23, the text-corresponding mouth display process of FIG. 9 is executed as an interrupt.

Specifically, when the currently highlighted character "l" of the headword "low" is detected (step B1), the pronunciation mouth-shape image for "l" is read out of the per-sound mouth image data 12e (see FIG. 4) as pronunciation mouth-shape image 12e2 (No36), according to the mouth number "36" that the text-mouth synchronization file (see FIG. 2(C)) in the dictionary database 12b associates with the text "l" (step B2). This pronunciation mouth-shape image 12e2 (No36) is then composited into the mouth image composition area of the character image 12d (No3) displayed in the image synchronized-reproduction window W2 on the headword search screen G2, according to the mouth image area (X1, Y1; X2, Y2) stored in the mouth image area memory 12p in the RAM 12B, and displayed as shown in FIG. 12(A) (FIG. 12(B)(1)) (step B3).
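The compositing of step B3 can be illustrated with images modeled as lists of pixel rows. This sketch assumes that (X1, Y1) and (X2, Y2) are the inclusive top-left and bottom-right corners of the mouth image area and that the mouth-shape image has exactly that size; neither point is stated in the description, and `composite_mouth` is a hypothetical name.

```python
from typing import List, Tuple

Image = List[List[int]]


def composite_mouth(face: Image, mouth: Image,
                    area: Tuple[int, int, int, int]) -> Image:
    """Return a copy of the face with the mouth image pasted into the area."""
    x1, y1, x2, y2 = area
    width, height = x2 - x1 + 1, y2 - y1 + 1
    if len(mouth) != height or any(len(row) != width for row in mouth):
        raise ValueError("mouth image does not fit the mouth image area")
    result = [row[:] for row in face]           # leave the stored face untouched
    for dy in range(height):
        result[y1 + dy][x1:x2 + 1] = mouth[dy]  # overwrite one row of the area
    return result
```

Working on a copy mirrors the role of the image expansion buffer 12q: the character image in memory 12n stays intact, so each new mouth shape can be pasted onto a clean face.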

Next, it is determined whether the phonetic symbol for the currently highlighted text "l", as given in the text-mouth synchronization file (see FIG. 2(C)), carries an accent mark (step B4). The phonetic symbol [l] of the highlighted text "l" has no accent mark, so the character image 12d (No3) continues to be displayed as its normal face image (steps B4 → B5).

If it is determined that there is an accent mark, the character image 12d (No3) is changed to the accent face image No3′, which expresses emphasized pronunciation (see FIG. 12(C)(2)), and displayed (steps B4 → B6).
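Steps B1 to B6 reduce to a table lookup followed by an accent check. In the sketch below, `TEXT_MOUTH_SYNC` is a hypothetical stand-in for the text-mouth synchronization file of FIG. 2(C). The mouth numbers 36, 9, and 8 and the accent on "o" follow the description; the absence of an accent on "w" is an assumption.

```python
from typing import Dict, Tuple

# character -> (mouth number, phonetic symbol carries an accent mark)
TEXT_MOUTH_SYNC: Dict[str, Tuple[int, bool]] = {
    "l": (36, False),
    "o": (9, True),
    "w": (8, False),   # assumed: no accent mark on the third character
}


def mouth_display(char: str, face: str = "No3") -> Tuple[int, str]:
    """Return the mouth image number and the face image to show for a character."""
    mouth_no, accented = TEXT_MOUTH_SYNC[char]      # steps B1, B2 and B4
    shown_face = face + "'" if accented else face   # step B6 if accented, else B5
    return mouth_no, shown_face


sequence = [mouth_display(c) for c in "low"]
# sequence == [(36, "No3"), (9, "No3'"), (8, "No3")]
```

The accent face is chosen per character, so the face returns to the normal image as soon as an unaccented character is highlighted.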

The time code file 12f23 is created in advance so that the output timing of the pronunciation sound data for the headword "low", whose output from the stereo sound output unit 19b was started by the fourth command code "PS", corresponds to the timing at which each character of "low" is highlighted under the processing unit time (25 ms) of the time code file. Consequently, when the first character "l" is highlighted and its pronunciation mouth-shape image 12e (No36) is composited and displayed, the pronunciation sound reading out the corresponding phonetic symbol is output in synchronization.
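Because one command is processed per processing unit time, the position of a command in the file fixes when it takes effect. The sketch below works this out for the three "HL" commands of the example (the 8th, 12th, and 35th commands), assuming that the first command is processed at 0 ms; the description does not state that offset, and `command_time_ms` is a hypothetical helper.

```python
UNIT_MS = 25  # processing unit time from the header H (example value)


def command_time_ms(position: int, unit_ms: int = UNIT_MS) -> int:
    """Time at which the command at a 1-based position is processed."""
    if position < 1:
        raise ValueError("command positions are counted from 1")
    return (position - 1) * unit_ms


hl_times = [command_time_ms(p) for p in (8, 12, 35)]
# hl_times == [175, 275, 850]
```

Under this assumption the highlight for "l" occurs 100 ms after the sound starts at the fourth command (75 ms), which is how the file's author aligns each character with the recording.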

Thus the highlighting of the first character "l" of the headword "low", the compositing of its pronunciation mouth-shape image 12e3 (No36) into the set character image 12d (No3), and the output of its pronunciation sound are performed in synchronization.

When the ninth command code "NP" is read, the synchronized display of the character image and text data for the current headword "low" and the synchronized output of the pronunciation sound data are maintained.

Thereafter, according to the twelfth command code "HL" and the thirty-fifth command code "HL", the text data "low" of the headword and its phonetic symbols are highlighted in turn in the text synchronized-reproduction window W1, first the second character "o" and phonetic symbol [o], then the third character "w" and phonetic symbol [u], as shown in FIG. 12(C)(2) and FIG. 12(C)(3) (step A10). At the same time, in the image synchronized-reproduction window W2, the text-corresponding mouth display process of FIG. 9 reads out of the per-sound mouth image data 12e, according to the text-mouth synchronization file (see FIG. 2(C)), the pronunciation mouth-shape image 12e (No9) for mouth number 9 and the pronunciation mouth-shape image 12e (No8) for mouth number 8, and composites them in turn into the mouth image area (X1, Y1; X2, Y2) of the set character image 12d (No3) for synchronized display (steps B1 to B3).

In addition, the pronunciation sound data for the headword "low" that is being output from the stereo sound output unit 19b in response to the fourth command code "PS" is output so that the sound reading out the highlighted portion of the text "low" and its phonetic symbols is synchronized with the display.

During the switching composite display (steps B1 to B5) of the pronunciation mouth-shape images 12e (No36) → 12e (No9) → 12e (No8) on the set character image 12d (No3), performed by the text-corresponding mouth display process in synchronization with the highlight (identification) display HL of each character "l", "o", "w" of the headword "low", the phonetic symbol of the highlighted text "o" is determined to carry an accent mark when the pronunciation mouth-shape image 12e (No9) is composited and displayed along with the highlighting of the second character "o" and its phonetic symbol, as shown in FIG. 12(B)(2). The character image 12d (No3) at that time is therefore changed to the accent face image No3′, which expresses emphasized pronunciation, and displayed as shown in FIG. 12(C)(2) (steps B4 → B6).

In other words, when the highlight (identification) display HL and the pronunciation mouth-shape image 12e (No9) are switched in synchronization with the pronunciation sound output for the accented character "o" of the headword "Low" shown in FIG. 12, the normal set character (face) image 12d (No3) shown in FIG. 12(B)(2), into which the mouth-shape image 12e (No9) is composited, is changed to the accent-corresponding face image 12d (No3′) shown in FIG. 12(C)(2), which expresses strong pronunciation by, for example, sweat on the head and wrinkles around the mouth. The user can therefore not only learn easily, through their synchronized reproduction, the pronunciation sound of the headword "Low" and its timing, the correspondence between the characters "L", "o", "w" and their phonetic symbols, and the pronunciation mouth-shape images 12e (No36 → No9 → No8), but can also learn realistically when to emphasize the utterance according to the accent.

FIG. 13 shows the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2, which are displayed as windows on the search headword display screen G2 during the synchronized reproduction process in the headword search process of the portable device 10 when the character image No1 is set. FIG. 13(A) shows the set display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the search headword display screen G2, and FIG. 13(B) shows how the two windows change in synchronization with the output of the pronunciation voice.

That is, suppose that the anime-style character image 12d (No1) has been selected and set from among the three types of character image data 12d (No1), 12d (No2), 12d (No3) stored in advance [see FIG. 3] in the character setting process of steps S1 to S6 in FIG. 7, and that the headword search process and synchronized reproduction process for the search-target headword "low", together with the text-corresponding mouth display process in FIG. 9, are performed in the same manner as in steps S7 to SA. In that case, as shown in FIGS. 13(A) and 13(B), the search headword "low" and its phonetic symbols are sequentially given the highlight (identification) display HL in the headword character display window W1 on the search headword display screen G2, in synchronization with the pronunciation voice output. Along with this, in the pronunciation mouth-shape display window W2, the pronunciation mouth-shape images 12e (No36 → No9 → No8) are sequentially switched, composited, and displayed in synchronization with the pronunciation voice output and the highlight display HL of the text (including phonetic symbols), using the anime-style character image 12d (No1) set in the character setting process (steps S1 to S6) as the basic face image.

Then, as shown in FIG. 13(B)(2), when the pronunciation mouth-shape image 12e (No9) is composited and displayed along with the highlight (identification) display HL of the second character "o" of the search headword "low" and its phonetic symbol, the phonetic symbol of the highlighted text "o" is determined to have an accent mark. The anime-style character image 12d (No1) at this time is therefore changed to the accent face image No1′ for emphasized pronunciation and displayed (steps B4 → B6).

That is, even when the anime-style character image 12d (No1) shown in FIG. 13 is selected and set, in the switching composite display of the highlight (identification) display HL and the pronunciation mouth-shape image 12e (No9) synchronized with the output of the pronunciation voice for the accented character "o" of the search headword "Low", the normal anime-style character (face) image 12d (No1), which is the compositing destination of the mouth-shape image 12e (No9), is changed to the accent-corresponding face image 12d (No1′), which expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can therefore not only easily learn, through their synchronized reproduction, the pronunciation voice of the search headword "Low", its utterance timing, the correspondence between each character "L", "o", "w" and its phonetic symbol, and each pronunciation mouth-shape image 12e (No36 → No9 → No8), but can also realistically learn the timing at which the utterance is stressed according to the accent.

The synchronized reproduction of text, pronunciation voice, and pronunciation mouth-shape images accompanying the headword search described with reference to FIGS. 11 to 13 assumed that the English-Japanese dictionary data stored in advance as the dictionary database 12b covers the pronunciation of only one country (American). However, as described next with reference to FIGS. 14 to 16, when the English-Japanese dictionary data stored in advance as the dictionary database 12b covers the pronunciations of two countries, American and British, the synchronized reproduction of text, pronunciation voice, and pronunciation mouth-shape images accompanying the headword search may be performed with the pronunciation form of either country designated.

FIG. 14 shows the search headword display screen G2 when an English-Japanese dictionary containing the pronunciation forms of two countries, American and British, is used in the headword search process within the main process of the portable device 10.

To perform a headword search based on the dictionary data of an English-Japanese dictionary stored in the dictionary database 12b that contains, for example, the pronunciation forms of two countries (American and British), the user sets the English-Japanese dictionary search mode by operating the "English-Japanese" key 17a5 of the input unit 17a and then inputs the headword to be searched for (for example, "laugh") (steps S7 → S8). The headword that matches the input, as well as a plurality of headwords that begin with the matching characters, are then retrieved and read out from the dictionary data of the English-Japanese dictionary and displayed on the display unit 18 as a list of search headwords (not shown) (step S9).
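
The list retrieval of step S9 can be pictured as an exact match followed by prefix matches. The sketch below is illustrative only: the function name `search_headwords`, the sample word list, and the ordering of the results are assumptions, not details taken from the specification.

```python
def search_headwords(entries, query):
    """Return the exact match first, then headwords that begin with the query."""
    q = query.lower()
    exact = [w for w in entries if w.lower() == q]
    # Ordering of the prefix matches is an assumption (alphabetical here).
    prefixed = sorted(
        w for w in entries if w.lower().startswith(q) and w.lower() != q
    )
    return exact + prefixed

# Hypothetical sample of dictionary headwords.
SAMPLE = ["laugh", "laughter", "launch", "low", "laughable"]
results = search_headwords(SAMPLE, "laugh")
```

With this sample, the list for the input "laugh" contains "laugh" itself followed by the headwords that start with "laugh".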

On this search headword list screen, when the headword that matches the search-target headword input by the user (in this case "laugh") is selected with the cursor key and the "Translation/Enter (voice)" key 17a4 is operated (step S10), the selected and detected headword "laugh" is stored in the headword memory 12g in the RAM 12B. At the same time, the dictionary data corresponding to this headword "laugh", such as the American and British pronunciations, parts of speech, and meanings, is read out and stored in the headword-corresponding dictionary data memory 12h in the RAM 12B and, as shown in FIG. 14, displayed on the display unit 18 as the search headword display screen G2 (step S11).

Here, in order to selectively output the pronunciation voice of either the American pronunciation [læf] or the British pronunciation [lɑːf] of the retrieved and displayed headword "laugh" and, at the same time, to display the corresponding headword characters, phonetic symbols, and pronunciation mouth-shape images in synchronization, the user designates either the American or the British dialect identifier, [US] or [UK], shown in the dictionary data on the search headword display screen G2 (step S11a). When the "Translation/Enter (voice)" key 17a4 is then operated (step S12), the process proceeds to the synchronized reproduction process in FIG. 8 (step SA).

FIG. 15 shows the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2, which are displayed as windows on the search headword display screen G2 when the American pronunciation [US] is designated for the synchronized reproduction process in the headword search process of the portable device 10. FIG. 15(A) shows the set display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the search headword display screen G2, and FIG. 15(B) shows how the two windows change in synchronization with the output of the American pronunciation voice.

That is, when either the American or the British dialect identifier, [US] or [UK], shown in the dictionary data on the search headword display screen G2 is designated and the process proceeds to the synchronized reproduction process in FIG. 8, step A2 of that process proceeds as follows. If, for example, the American dialect identifier [US] is designated, the American-English character image 12d (No1US) corresponding to the anime-style character image 12d (No1) set in advance in the character setting process (steps S2 to S6) is read out and transferred to the synchronization image file memory 12n in the RAM 12B. Along with this, based on the synchronized reproduction link data (see FIG. 2) for the current search headword "laugh" stored in the dictionary database 12b, the HTML file for setting the text and image synchronized reproduction windows W1 and W2 (see FIG. 15) on the headword search screen G2 is read out according to its HTML file No and written into the synchronization HTML file memory 12j. The text data of the search headword, "laugh (with American dialect phonetic symbols)", is read out according to its text file No and written into the synchronization text file memory 12k. The American dialect pronunciation voice data of the search headword is read out according to its sound file No and written into the synchronization sound file memory 12m (step A2).
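
How step A2 fills the synchronization memories according to the designated dialect can be sketched as a simple lookup. Everything here is a simplification: the `LINK_DATA` table, its file numbers, and the function `load_sync_memories` are hypothetical, and the memories 12j, 12k, 12m, and 12n are modelled as dictionary keys rather than RAM areas.

```python
# Hypothetical link data: file numbers per (headword, dialect) pair.
LINK_DATA = {
    ("laugh", "US"): {"html": 1, "text": 11, "sound": 21},
    ("laugh", "UK"): {"html": 1, "text": 12, "sound": 22},
}

def load_sync_memories(headword, dialect, base_face="No1"):
    """Model of step A2: pick image, HTML, text, and sound for one dialect."""
    link = LINK_DATA[(headword, dialect)]
    return {
        # Dialect-specific character image, e.g. 12d(No1US) or 12d(No1UK).
        "12n_image": f"12d({base_face}{dialect})",
        "12j_html": link["html"],
        "12k_text": link["text"],
        "12m_sound": link["sound"],
    }

us = load_sync_memories("laugh", "US")
uk = load_sync_memories("laugh", "UK")
```

In this model both dialects share the window layout but differ in character image, text (phonetic symbols), and sound, which mirrors the difference between the [US] and [UK] cases.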

Then, from among the encrypted time code files 12fn for synchronized reproduction of voice, text, and images, which correspond to the various headwords and are stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12fn (see FIG. 5) corresponding to the current search headword "laugh" is decrypted and read out according to the time code file No described in the synchronized reproduction link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).

Then, as in the case of the search headword "low" already described, the synchronized reproduction of the pronunciation voice, headword characters, and pronunciation mouth-shape images according to the time code file 12fn corresponding to the search headword "laugh" is started by the reproduction process for each command code in steps A7 to A12 and the text-corresponding mouth display process in FIG. 9. The text synchronized reproduction window W1 on the search headword display screen G2 then displays the search headword "laugh" together with its American dialect phonetic symbols, and the image synchronized reproduction window W2 displays, as the target image for mouth-shape image compositing, the American-English character image 12d (No1US), which is the set anime-style character drawn, for example, holding a US flag F.

As a result, in synchronization with the American dialect pronunciation voice output of the search headword "laugh", as shown in FIG. 15(B)(1) to (3), the text synchronized reproduction window W1 sequentially applies the highlight (identification) display HL to the search headword "laugh" and its phonetic symbols, starting from the first character. At the same time, in the image synchronized reproduction window W2, the pronunciation mouth-shape images 12e (No n1 → No n2 → No n3) corresponding to the mouth numbers of the respective phonetic symbols are read out from the voice-specific mouth image data 12e and are sequentially switched, composited into the mouth image area (X1, Y1; X2, Y2) of the American-English character image 12d (No1US) used as the base, and displayed.
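
Compositing a mouth-shape image into the mouth image area (X1, Y1; X2, Y2) of the base character image can be sketched as a rectangular overwrite. This is a toy model under stated assumptions: images are lists of character rows, the corner (X2, Y2) is treated as exclusive, and the function name `composite_mouth` is hypothetical.

```python
def composite_mouth(face, mouth, area):
    """Overwrite the mouth area of a face image with a mouth-shape image."""
    x1, y1, x2, y2 = area  # (x2, y2) is assumed to be exclusive
    if len(mouth) != y2 - y1 or any(len(row) != x2 - x1 for row in mouth):
        raise ValueError("mouth image does not fit the mouth area")
    out = [row[:] for row in face]  # copy so the base image stays unchanged
    for dy, mouth_row in enumerate(mouth):
        out[y1 + dy][x1:x2] = mouth_row
    return out

# 4x4 base face and a 2x2 mouth image, placed at (1, 2; 3, 4).
FACE = [list("....") for _ in range(4)]
MOUTH = [list("MM"), list("MM")]
frame = composite_mouth(FACE, MOUTH, (1, 2, 3, 4))
```

Because the base image is copied for every frame, switching from one mouth-shape image to the next only requires calling the function again with the same face and a different mouth image.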

In this case as well, according to the same text-corresponding mouth display process, in the switching composite display of the highlight (identification) display HL and the pronunciation mouth-shape image 12e (No n2) synchronized with the output of the pronunciation voice for the accented characters "au" of the search headword "Laugh", the American-English character (face) image 12d (No1US), which is the compositing destination of the mouth-shape image 12e (No n2), is changed to the accent-corresponding face image 12d (No1US′), which expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can therefore not only easily learn, through their synchronized reproduction, the American dialect pronunciation voice of the search headword "Laugh", its utterance timing, the correspondence between the characters "L", "au", "gh" and their phonetic symbols, and each pronunciation mouth-shape image 12e (No n1 → No n2 → No n3), but can also realistically learn the timing at which the utterance is stressed according to the American dialect accent.

FIG. 16 shows the display states of the headword character display window W1 and the pronunciation mouth-shape display window W2, which are displayed as windows on the search headword display screen G2 when the British pronunciation [UK] is designated for the synchronized reproduction process in the headword search process of the portable device 10. FIG. 16(A) shows the set display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the search headword display screen G2, and FIG. 16(B) shows how the two windows change in synchronization with the output of the British pronunciation voice.

That is, when, of the American and British dialect identifiers [US] and [UK] shown in the dictionary data on the search headword display screen G2 of FIG. 14, the British dialect identifier [UK], for example, is designated (step S11a) and the process proceeds to the synchronized reproduction process in FIG. 8 (step SA), step A2 of that process proceeds as follows. The British-English character image 12d (No1UK) corresponding to the anime-style character image 12d (No1) set in advance in the character setting process (steps S2 to S6) is read out and transferred to the synchronization image file memory 12n in the RAM 12B. Along with this, based on the synchronized reproduction link data (see FIG. 2) for the current search headword "laugh" stored in the dictionary database 12b, the HTML file for setting the text and image synchronized reproduction windows W1 and W2 (see FIG. 16) on the headword search screen G2 is read out according to its HTML file No and written into the synchronization HTML file memory 12j. The text data of the search headword, "laugh (with British dialect phonetic symbols)", is read out according to its text file No and written into the synchronization text file memory 12k. The British dialect pronunciation voice data of the search headword is read out according to its sound file No and written into the synchronization sound file memory 12m (step A2).

Then, as in the case of the search headword "low" already described, the synchronized reproduction of the pronunciation voice, headword characters, and pronunciation mouth-shape images according to the time code file 12fn corresponding to the search headword "laugh" is started by the reproduction process for each command code in steps A7 to A12 and the text-corresponding mouth display process in FIG. 9. The text synchronized reproduction window W1 on the search headword display screen G2 then displays the search headword "laugh" together with its British dialect phonetic symbols, and the image synchronized reproduction window W2 displays, as the target image for mouth-shape image compositing, the British-English character image 12d (No1UK), which is the set anime-style character drawn, for example, wearing a British hat M1 and holding a walking stick M2.

As a result, in synchronization with the British dialect pronunciation voice output of the search headword "laugh", as shown in FIG. 16(B)(1) to (3), the text synchronized reproduction window W1 sequentially applies the highlight (identification) display HL to the search headword "laugh" and its phonetic symbols, starting from the first character. At the same time, in the image synchronized reproduction window W2, the pronunciation mouth-shape images 12e (No n1 → No n2 → No n3) corresponding to the mouth numbers of the respective phonetic symbols are read out from the voice-specific mouth image data 12e and are sequentially switched, composited into the mouth image area (X1, Y1; X2, Y2) of the British-English character image 12d (No1UK) used as the base, and displayed.

In this case as well, according to the same text-corresponding mouth display process, in the switching composite display of the highlight (identification) display HL and the pronunciation mouth-shape image 12e (No n2) synchronized with the output of the pronunciation voice for the accented characters "au" of the search headword "Laugh", the British-English character (face) image 12d (No1UK), which is the compositing destination of the mouth-shape image 12e (No n2), is changed to the accent-corresponding face image 12d (No1UK′), which expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can therefore not only easily learn, through their synchronized reproduction, the British dialect pronunciation voice of the search headword "Laugh", its utterance timing, the correspondence between the characters "L", "au", "gh" and their phonetic symbols, and each pronunciation mouth-shape image 12e (No n1 → No n2 → No n3), but can also realistically learn the timing at which the utterance is stressed according to the British dialect accent.

Next, the accent test process is described. This process is performed as part of the main process of the portable device 10 configured as above and tests, for example, whether the user can identify the correct accent of an English word.

FIG. 17 shows the operation and display states when an incorrect answer is selected in the accent test process of the portable device 10. FIG. 17(A) shows the accent test question display screen G3, FIG. 17(B) shows the set display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the headword display screen G2 for the question word, and FIG. 17(C) shows how the two windows change in synchronization with the output of the pronunciation voice with the incorrect accent.

FIG. 18 shows the operation and display states when the correct answer is selected in the accent test process of the portable device 10. FIG. 18(A) shows the accent test question display screen G3, FIG. 18(B) shows the set display state of the headword character display window W1 and the pronunciation mouth-shape display window W2 on the headword display screen G2 for the question word, and FIG. 18(C) shows how the two windows change in synchronization with the output of the pronunciation voice with the correct accent.

That is, when the "Accent Test" key 17a6 of the input unit 17a is operated and the accent test mode is set (step S13), a headword is randomly selected from the dictionary data stored in advance in the dictionary database 12b (step S14). As shown in FIG. 17(A), the accent test question display screen G3 is then displayed on the display unit 18; for the randomly selected word "low", it presents as selection items Et and Ef the phonetic symbols with the correct accent, placed on the "o" part, and the phonetic symbols with an incorrect accent, placed on the "u" part (step S15).
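
Steps S14 and S15 can be sketched as picking a word at random and building one correctly and one incorrectly accented phonetic string. The data layout, the use of an apostrophe as the accent mark, and the names `ENTRIES`, `mark`, and `make_question` are assumptions made for this illustration; the specification does not state how the incorrect option is generated.

```python
import random

# Hypothetical entries: phonetic symbols, index of the accented symbol,
# and indices of symbols that could carry an accent in a wrong option.
ENTRIES = {
    "low": {"symbols": ["l", "o", "u"], "accent": 1, "candidates": [1, 2]},
}

def mark(symbols, accent_index):
    """Render phonetic symbols with "'" placed before the accented symbol."""
    return "".join(
        ("'" + s) if i == accent_index else s for i, s in enumerate(symbols)
    )

def make_question(rng):
    """Model of steps S14-S15: random word plus options Et (true), Ef (false)."""
    word = rng.choice(sorted(ENTRIES))
    entry = ENTRIES[word]
    wrong = rng.choice([i for i in entry["candidates"] if i != entry["accent"]])
    return {
        "word": word,
        "Et": mark(entry["symbols"], entry["accent"]),
        "Ef": mark(entry["symbols"], wrong),
    }

question = make_question(random.Random(0))
```

For the single sample entry, the correct option places the accent on "o" and the incorrect option places it on "u", matching the example of FIG. 17(A).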

On this accent test question display screen G3, when the selection frame X is moved by operating the cursor key 17a2 and, for example, the selection item Ef with the incorrectly accented phonetic symbols is selected and detected (step S16), the character image and its related image that were selected and set in advance in the character setting process (steps S2 to S6) as the compositing destination of the pronunciation mouth-shape images (in this case, the anime-style character image 12d (No1) and its accent-corresponding image (No1′)) are changed from the normal color, for example yellow, to the blue character images (No1BL) and (No1BL′) (steps S17 → S18).

Along with this, the pronunciation voice data read out from the dictionary voice data 12c for the question word "low" is corrected to voice data that matches the incorrectly accented phonetic symbols selected by the user (step S19).

The question word "low" is then stored in the headword memory 12g in the RAM 12B, and the dictionary data corresponding to this headword "low", such as pronunciation, part of speech, and meaning, is read out, stored in the headword-corresponding dictionary data memory 12h in the RAM 12B and, as shown in FIG. 17(B), displayed on the display unit 18 as the search headword display screen G2 for the question word (step S20).

Here, when the "Translation/Enter (voice)" key 17a4 is operated (step S21) in order to output the pronunciation voice of the question word "low" with the accent selected by the user and, at the same time, to display the characters, phonetic symbols, and pronunciation mouth-shape images of the word in synchronization, the process proceeds to the synchronized reproduction process in FIG. 8 (step SA).

In step A2 of the synchronized reproduction process, the anime-style character image 12d (No1BL), which was changed to blue in response to the user's selection of the incorrect accent, is read out and transferred to the synchronization image file memory 12n in the RAM 12B. Along with this, based on the synchronized reproduction link data (see FIG. 2) for the current question word "low" stored in the dictionary database 12b, the HTML file for setting the text and image synchronized reproduction windows W1 and W2 (see FIG. 17(B)) on the search headword display screen G2 is read out according to its HTML file No and written into the synchronization HTML file memory 12j. The text data of the question word, "low (with incorrect phonetic symbols)", is read out and written into the synchronization text file memory 12k. The pronunciation voice data corrected to match the incorrect accent of the question word is read out and written into the synchronization sound file memory 12m (step A2).

Then, from among the encrypted time code files 12fn for synchronized reproduction of voice, text, and images, which correspond to the various headwords and are stored as the dictionary time code file 12f in the FLASH memory 12A, the time code file 12fn (see FIG. 5) corresponding to the current question word "low" is decrypted and read out according to the time code file No described in the synchronized reproduction link data (see FIG. 2), and is transferred to and stored in the time code file memory 12i in the RAM 12B (step A3).

Then, as in the case of the search headword "low" already described, the synchronized reproduction of the incorrectly accented pronunciation voice, the headword characters, and the pronunciation mouth-shape images according to the time code file 12fn corresponding to the question word "low" is started by the reproduction process for each command code in steps A7 to A12 and the text-corresponding mouth display process in FIG. 9. As shown in FIG. 17(B), the text synchronized reproduction window W1 (Ef) on the search headword display screen G2 then displays the question word "low" together with the incorrectly accented phonetic symbols selected by the user, and the image synchronized reproduction window W2 displays, as the target image for mouth-shape image compositing, the anime-style character image 12d (No1BL) changed to blue by the user's selection of the incorrect accent.

As a result, in synchronization with the output of the incorrectly accented pronunciation voice for the question word "low", as shown in FIG. 17(C)(1) to (3), the text synchronized reproduction window W1 (Ef) sequentially applies the highlight (identification) display HL to the question word "low" and its incorrect phonetic symbols, starting from the first character. At the same time, in the image synchronized reproduction window W2, the pronunciation mouth-shape images 12e (No36 → No9 → No8) corresponding to the mouth numbers of the respective phonetic symbols are read out from the voice-specific mouth image data 12e and are sequentially switched, composited into the mouth image area (X1, Y1; X2, Y2) of the anime-style character image 12d (No1BL), which was changed to blue by the selection of the incorrect accent and is used as the base, and displayed.

In this case as well, according to the same text-corresponding mouth display processing, when the highlight (identification) display HL and the mouth-shape image 12e (No8) are switched and combined in synchronization with the pronunciation audio output for the incorrect accent character "u" of the headword "Low", the blue anime-style character (face) image 12d (No1BL) into which the mouth-shape image 12e (No8) is combined is replaced by the accent-corresponding blue face image 12d (No1BL'), which expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can therefore clearly learn that the pronunciation audio of the question word "Low", its stress timing, and the corresponding pronunciation mouth-shape images 12e (No36 → No9 → No8) are those of an incorrect accent.

On the other hand, as shown in FIG. 18(A), when the selection frame X is moved on the accent test question display screen G3 by operating the cursor key 17a2 and, for example, the selection item Et with the correctly accented phonetic symbols is detected as selected (step S16), the processing proceeds to the synchronized playback processing of FIG. 8 (step S17 → SA) without performing the blue color change of the character image 12d (No1) (step S18) or the correction of the pronunciation audio according to the incorrect accent (step S19).

Then, in the same manner as the synchronized playback of the pronunciation audio, text, and pronunciation mouth-shape images for the search headword "low" with the anime-style character image 12d (No1) set, described above with reference to FIG. 13, the text synchronized playback window W1 (Et) on the search headword display screen G2 displays, as shown in FIG. 18(B), the question word "low" together with the correctly accented phonetic symbols selected by the user, and the image synchronized playback window W2 displays the anime-style character image 12d (No1) in its normal, preset color as the target image for mouth-shape image synthesis.

As a result, in synchronization with the output of the correctly accented pronunciation audio for the question word "low", as shown in FIGS. 18(C)(1) to (3), the text synchronized playback window W1 (Et) sequentially applies the highlight (identification) display HL, starting from the first character, to the question word "low" and its correct phonetic symbols. Meanwhile, in the image synchronized playback window W2, with the anime-style character image 12d (No1) in its normal, preset color as the base, the pronunciation mouth-shape images 12e (No36 → No9 → No8) corresponding to the mouth number of each phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, combined into the mouth image area (X1, Y1; X2, Y2), and displayed.

In this case as well, according to the same text-corresponding mouth display processing, when the highlight (identification) display HL and the mouth-shape image 12e (No9) are switched and combined in synchronization with the pronunciation audio output for the correct accent character "o" of the headword "Low", the anime-style character (face) image 12d (No1) into which the mouth-shape image 12e (No9) is combined is replaced by the accent-corresponding face image 12d (No1'), which expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can therefore clearly learn the correctly accented pronunciation audio of the question word "Low", its correct stress timing, and the corresponding pronunciation mouth-shape images 12e (No36 → No9 → No8).

Therefore, with the function of the mobile device 10 of the first embodiment configured as above for synchronized playback of pronunciation audio, text, and pronunciation mouth-shape images accompanying a headword search, the user enters the headword "low" to be searched, the dictionary data corresponding to that search headword is retrieved and displayed as the search headword display screen G2, and the user then operates the "Translation/Decision (Voice)" key 17a4. In accordance with the time code file 12f23 of the search headword "low", and in synchronization with the pronunciation audio output from the stereo sound output unit 19b, the text synchronized playback window W1 sequentially applies the highlight (identification) display HL to the search headword "low" and its phonetic symbols. Meanwhile, in the image synchronized playback window W2, with the preset character image 12d (No3) as the base, the pronunciation mouth-shape images 12e (No36 → No9 → No8) corresponding to the mouth number of each phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, combined into the mouth image area (X1, Y1; X2, Y2), and displayed.

Moreover, when the highlight (identification) display HL and the pronunciation mouth-shape image 12e (No9) are switched and combined in synchronization with the pronunciation audio output for the accent character "o" of the search headword "Low", the character (face) image 12d (No3) into which the mouth-shape image 12e (No9) is combined is replaced by the accent-corresponding face image 12d (No3'), which expresses strong pronunciation by, for example, sweat on the head or shaking of the mouth. Through this synchronized playback, the user can not only easily learn the pronunciation audio of the search headword "Low", its timing, the correspondence between each character "L", "o", "w" and its phonetic symbol, and each pronunciation mouth-shape image 12e (No36 → No9 → No8), but can also learn realistically the timing at which the voice is stressed according to the accent.
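The first-embodiment behavior summarized above can be pictured as a list of display frames, one per phonetic symbol. The following is only a minimal sketch under stated assumptions: the `Frame` type, the `build_frames` function, and the string labels standing in for image data are hypothetical and do not appear in the specification; only the image numbers (12d No3, 12e No36/No9/No8) are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    symbol: str       # phonetic symbol highlighted (HL) in window W1
    face_image: str   # base face image shown in window W2
    mouth_image: str  # mouth-shape image combined into the mouth area


def build_frames(symbols, mouth_numbers, accent_index, character_no=3):
    """Build one display frame per phonetic symbol, in pronunciation order."""
    if len(symbols) != len(mouth_numbers):
        raise ValueError("each phonetic symbol needs exactly one mouth number")
    frames = []
    for i, (symbol, mouth_no) in enumerate(zip(symbols, mouth_numbers)):
        # Only the accented symbol uses the accent-corresponding face (primed label).
        prime = "'" if i == accent_index else ""
        frames.append(
            Frame(symbol, f"12d(No{character_no}{prime})", f"12e(No{mouth_no})")
        )
    return frames


# Hypothetical input for the headword "low": symbols l, o, u with the accent on "o".
frames = build_frames(["l", "o", "u"], [36, 9, 8], accent_index=1)
for frame in frames:
    print(frame.symbol, frame.face_image, frame.mouth_image)
```

In this sketch the base face changes only for the frame at `accent_index`, which mirrors the temporary switch from 12d (No3) to 12d (No3') while the accented symbol is being pronounced.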

Furthermore, with the same synchronized playback function of the mobile device 10 of the first embodiment, when a headword search is performed on a dictionary database 12b that contains phonetic symbols for both American and British pronunciation, for example, and the user designates the American pronunciation [US] or the British pronunciation [UK] and operates the "Translation/Decision (Voice)" key 17a4 as shown in FIG. 15 or FIG. 16, the text synchronized playback window W1 sequentially applies the highlight (identification) display HL to the search headword "laugh" and its American or British phonetic symbols in synchronization with the designated pronunciation audio. Meanwhile, in the image synchronized playback window W2, the preset character image 12d (No1) is displayed as the base in its American-pronunciation form (No1US) or British-pronunciation form (No1UK), and the pronunciation mouth-shape images 12e (No. n1 → No. n2 → No. n3) corresponding to the mouth number of each American or British phonetic symbol are read out from the voice-specific mouth image data 12e, sequentially switched, combined into the mouth image area (X1, Y1; X2, Y2), and displayed. The user can therefore learn the American pronunciation of the search headword, with its phonetic symbols and mouth shapes, clearly distinguished from the British pronunciation, with its phonetic symbols and mouth shapes.

Also, with the same synchronized playback function of the mobile device 10 of the first embodiment, each headword recorded in the dictionary database 12b has incorrectly accented phonetic symbols in addition to its correctly accented phonetic symbols. As shown in FIGS. 17 and 18, when the "Accent Test" key 17a6 is operated, a randomly selected headword "low" is displayed on the accent test question display screen G3 together with its correctly accented and incorrectly accented phonetic symbols. When the correctly accented phonetic symbols are selected, the pronunciation mouth-shape images 12e (No36 → No9 → No8) are switched and combined on the normally set character image 12d (No1) in synchronization with the correct pronunciation audio output. When the incorrectly accented phonetic symbols are selected, the pronunciation mouth-shape images 12e (No36 → No9 → No8) are switched and combined on the character image 12d (No1BL) changed to blue, in synchronization with the incorrect pronunciation audio output. In either case, during synchronized playback of the accented part, the character image 12d (No1) or (No1BL) serving as the base for mouth-shape image synthesis is replaced by the accent-corresponding character image 12d (No1') or (No1BL'). The user can therefore clearly learn the correctly accented and incorrectly accented pronunciations of various words through synchronized playback of the corresponding audio, text, and images.
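The accent test branching described above (normal base image for a correct choice, blue base image for an incorrect one, and an accent-corresponding face at the chosen accent position in both cases) might be sketched as follows. This is an illustration under assumptions, not the patented implementation: `accent_test_playback` and its parameters are hypothetical names, and the images are represented only by their reference labels.

```python
def accent_test_playback(symbols, mouth_numbers, chosen_accent, correct_accent,
                         character_no=1):
    """Return (is_correct, plan), where plan lists (symbol, face, mouth) per symbol.

    chosen_accent / correct_accent are indexes into `symbols`.
    """
    is_correct = chosen_accent == correct_accent
    # An incorrect choice switches the base character image to its blue variant (BL).
    color = "" if is_correct else "BL"
    plan = []
    for i, (symbol, mouth_no) in enumerate(zip(symbols, mouth_numbers)):
        # The accent-corresponding (primed) face is shown at the chosen accent position.
        prime = "'" if i == chosen_accent else ""
        face = f"12d(No{character_no}{color}{prime})"
        plan.append((symbol, face, f"12e(No{mouth_no})"))
    return is_correct, plan


# The user picks the item that stresses "u" although the correct accent is on "o".
wrong_ok, wrong_plan = accent_test_playback(["l", "o", "u"], [36, 9, 8],
                                            chosen_accent=2, correct_accent=1)
right_ok, right_plan = accent_test_playback(["l", "o", "u"], [36, 9, 8],
                                            chosen_accent=1, correct_accent=1)
```

Keeping the color variant and the accent variant as two independent label parts reflects that the text treats them as separate changes: the color depends on the whole answer, the accent face only on the symbol being pronounced.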

In the first embodiment, the synchronized playback of the pronunciation audio, text (with phonetic symbols), and pronunciation mouth-shape images for a search headword is performed in two parts: the text characters are sequentially highlighted (identification display) in synchronization with the pronunciation audio output by the synchronized playback processing according to the time code file 12f, and the pronunciation mouth-shape image for the phonetic symbol of each highlighted character is switched and combined by the text-corresponding mouth display processing, which runs as an interrupt each time a character is highlighted. Alternatively, as described in the following second embodiment, various phonetic symbols, including phonetic symbols with accent marks, may be stored in advance in multiple sets, each associated with its pronunciation audio data and a pronunciation face image. Then, as the characters of the headword to be played back are highlighted in order from the first character, the pronunciation audio data associated with the phonetic symbol of each highlighted character is output and the associated face image data is displayed.

(Second Embodiment)
FIG. 19 is a flowchart showing the headword synchronized playback processing of the mobile device 10 according to the second embodiment.

That is, in the mobile device 10 of the second embodiment, multiple sets are stored in advance in the memory 12, each consisting of a phonetic symbol (the phonetic symbols include those with accent marks), its pronunciation audio data, and a pronunciation face image whose mouth portion and facial expression take a different form according to that pronunciation audio data.

Then, for example, with an English-Japanese dictionary stored in advance as the dictionary database 12b, an arbitrary headword "low" is entered and searched for and is displayed as the search headword display screen G2 as shown in FIG. 11. When the "Translation/Decision (Voice)" key 17a4 is operated in this state to play back the pronunciation audio and the pronunciation face image in synchronization, the synchronized playback processing of the second embodiment shown in FIG. 19 starts.

When the synchronized playback processing of the second embodiment starts, as shown in FIG. 12 or FIG. 13, the text synchronized playback window W1 is first opened on the search headword display screen G2, and each character of the search headword "low" and its phonetic symbol are given the highlight identification display HL in pronunciation order, starting from the first character (step C1). The phonetic symbol of the highlighted headword character is then read out (step C2), and it is determined whether the symbol has an accent mark (step C3).

Here, as shown in FIG. 12(B)(1) or FIG. 13(B)(1), when the phonetic symbol of the character "l" in the headword "low" currently highlighted (HL) has no accent mark, the unaccented pronunciation audio data corresponding to that phonetic symbol, stored in advance in the memory 12, is read out and output from the stereo sound output unit 19b (step C3 → C4), and the unaccented pronunciation face image associated with it is read out and displayed in the image synchronized playback window W2 (step C5).

Then, the next character "o" of the search headword "low" currently being output is read out (step C6 → C7), the processing returns to step C1, and, as shown in FIG. 12(B)(2) or FIG. 13(B)(2), that character is given the highlight identification display HL together with its phonetic symbol (step C1).

When the phonetic symbol of the character "o" in the headword "low" currently highlighted (HL) is determined to have an accent mark (steps C2, C3), the accented pronunciation audio data corresponding to that phonetic symbol, stored in advance in the memory 12, is read out and output from the stereo sound output unit 19b (step C3 → C8). In addition, as shown in FIG. 12(C)(2) or FIG. 13(B)(2), the associated pronunciation face image, which expresses the accent by, for example, sweat on the head or shaking of the body, is read out and displayed in the image synchronized playback window W2 (step C9).
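The loop of steps C1 to C9 amounts to a table lookup keyed by phonetic symbol and accent mark. The sketch below assumes a hypothetical in-memory table `ASSETS` with made-up file names, and it records events in a list instead of driving real audio and display hardware; none of these names come from the specification.

```python
# Hypothetical pre-stored sets: (phonetic symbol, has accent mark) -> (audio, face image).
ASSETS = {
    ("l", False): ("l.pcm", "face_l.png"),
    ("o", False): ("o.pcm", "face_o.png"),
    ("o", True): ("o_accent.pcm", "face_o_accent.png"),
    ("u", False): ("u.pcm", "face_u.png"),
}


def play_headword(characters, symbols, accented, assets):
    """Return the ordered output events for one headword.

    characters: headword characters, symbols: their phonetic symbols,
    accented: one bool per symbol telling whether it carries an accent mark.
    """
    events = []
    for char, symbol, has_accent in zip(characters, symbols, accented):
        events.append(("highlight", char, symbol))   # step C1
        audio, face = assets[(symbol, has_accent)]   # steps C2 and C3
        events.append(("audio", audio))              # step C4 or C8
        events.append(("face", face))                # step C5 or C9
    return events


events = play_headword("low", ["l", "o", "u"], [False, True, False], ASSETS)
for event in events:
    print(event)
```

Because the accented and unaccented variants are separate table entries, the accent decision in step C3 reduces to choosing the lookup key, and no image processing is needed at playback time.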

Therefore, with the mobile device 10 of the second embodiment as well, when the pronunciation audio is output and the pronunciation face image is displayed as the accent character "o" of the search headword "Low" is given the highlight (identification) display HL, the pronunciation face image is displayed, on the basis of the accented phonetic symbol, as an accent-corresponding face image that expresses strong pronunciation by, for example, sweat on the head or shaking of the body. The user can thus not only easily learn each character "L", "o", "w" of the search headword "Low", the pronunciation audio, and each pronunciation face image from their corresponding outputs, but can also learn realistically which part is stressed according to the accent.

In the second embodiment, among the sets of phonetic symbols (including those with accent marks), pronunciation audio data, and pronunciation face images stored in advance in the memory 12, the pronunciation audio associated with an accented phonetic symbol is set to be louder than the pronunciation audio associated with an unaccented phonetic symbol, and the mouth of the pronunciation face image associated with an accented phonetic symbol is set to open wider than the mouth of the pronunciation face image associated with an unaccented phonetic symbol. Furthermore, the facial expression of the pronunciation face image associated with an accented phonetic symbol is set to be more emphasized than that of the pronunciation face image associated with an unaccented phonetic symbol.

In the second embodiment, phonetic symbols (including those with accent marks), their pronunciation audio data, and pronunciation face images with differing mouth portions and facial expressions are stored in advance; each character of the search headword is highlighted in pronunciation order, the pronunciation audio associated with its phonetic symbol is read out and output, and the pronunciation face image associated with the same phonetic symbol is read out and displayed. Alternatively, as described in the following third embodiment, the pronunciation audio and a pronunciation face image of each headword in the dictionary database 12b may be stored in advance in combination with that headword. The pronunciation audio and the pronunciation face image are then read out and output as the characters of the search headword are displayed, the accented part is identified by detecting the peak level of the pronunciation audio signal, and the mouth and facial expression of the pronunciation face image are changed to a different display form.

(Third Embodiment)
FIG. 20 is a flowchart showing the headword synchronized playback processing of the mobile device 10 according to the third embodiment.

That is, in the mobile device 10 of the third embodiment, the pronunciation audio and a pronunciation face image of each headword are stored in advance in combination with that headword in each set of dictionary data in the dictionary database 12b.

Then, for example, with an English-Japanese dictionary stored in advance as the dictionary database 12b, an arbitrary headword "low" is entered and searched for and is displayed as the search headword display screen G2 as shown in FIG. 11. When the "Translation/Decision (Voice)" key 17a4 is operated in this state to play back the pronunciation audio and the pronunciation face image in synchronization, the synchronized playback processing of the third embodiment shown in FIG. 20 starts.

When the synchronized playback processing of the third embodiment starts, as shown in FIG. 12 or FIG. 13, the text synchronized playback window W1 is first opened on the search headword display screen G2, and each character of the search headword "low" is given the highlight identification display HL in pronunciation order, starting from the first character (step D1). The portion of the pronunciation audio data corresponding to the highlighted headword character is then read out (step D2) and output from the stereo sound output unit 19b (step D3).

Here, it is determined whether the signal (waveform) level of the portion of the pronunciation audio data corresponding to, for example, the character "l" in the headword "low" currently highlighted (HL) is at or above a fixed audio signal level, which indicates an accented part (step D4). When the level is determined to be below the fixed audio signal level, that is, the portion is not an accented part, the pronunciation face image stored in association with the search headword is read out and displayed as it is in the image synchronized playback window W2 (step D5).

Then, the next character "o" of the search headword "low" currently being output is read out (step D6 → D7), the processing returns to step D1, and that character is given the highlight identification display HL (step D1).

Then, the portion of the pronunciation audio data corresponding to the headword character "o" now highlighted (HL) is read out (step D2) and output from the stereo sound output unit 19b (step D3), and it is determined whether the signal (waveform) level of that portion is at or above the fixed audio signal level indicating an accented part (step D4).

When the level is determined to be at or above the fixed audio signal level, that is, the portion is an accented part, the pronunciation face image stored in association with the search headword is read out, changed to a face image with a wider mouth opening and a stronger facial expression (for example, FIG. 12(B)(2) → FIG. 12(C)(2)), and displayed in the image synchronized playback window W2 (step D4 → D8).
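The level test of step D4 can be illustrated with a simple peak comparison per character segment. This is a minimal sketch under assumptions: the audio is taken to be already split into one list of signed integer samples per headword character, and the threshold value and the names `peak_level` and `choose_faces` are hypothetical. The specification states only that a level at or above a fixed value is treated as an accented part.

```python
def peak_level(samples):
    """Peak absolute amplitude of one audio segment (0 for an empty segment)."""
    return max((abs(s) for s in samples), default=0)


def choose_faces(segments, threshold):
    """For each per-character audio segment, pick the face form to display.

    Returns "accent" when the segment's peak level is at or above the threshold
    (step D4 -> D8) and "normal" otherwise (step D4 -> D5).
    """
    return ["accent" if peak_level(seg) >= threshold else "normal"
            for seg in segments]


# Made-up sample values for the three segments of "low"; only the middle one is loud.
segments = [[100, -200, 150], [900, -1200, 800], [300, -250, 100]]
faces = choose_faces(segments, threshold=1000)
print(faces)
```

A real device would likely smooth the signal or compare against the word's overall level rather than use a single fixed constant, but that refinement goes beyond what the text describes.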

When the audio signal waveform level of the pronunciation audio is determined to be at or above the fixed value and the portion is therefore determined to be an accented part, the corresponding highlighted character of the search headword may additionally be displayed in a form indicating that it is the accented character, for example by changing or adding a display color or by changing the character font.

Therefore, with the mobile device 10 of the third embodiment as well, when the pronunciation audio is output and the pronunciation face image is displayed as the accent character "o" of the search headword "Low" is given the highlight (identification) display HL, the pronunciation face image is changed, on the basis of the pronunciation audio signal level at that time being at or above the fixed value, to an accent-corresponding face image with, for example, a wider mouth opening and a stronger facial expression. The user can thus not only easily learn each character "L", "o", "w" of the search headword "Low", its pronunciation audio, and the pronunciation face image from their corresponding outputs, but can also learn realistically which part is stressed according to the accent.

The description of the synchronized playback of the characters (text), pronunciation audio, and pronunciation face images (including pronunciation mouth-shape images) of a search headword in each of the above embodiments covers the case where the headword has one accent. When a search headword has two accents, a primary accent and a secondary accent, the accent-corresponding pronunciation face image (including the pronunciation mouth-shape image) displayed for each accented part may take a different form for the primary accent and for the secondary accent, for example by varying how wide the mouth opens or how strong the facial expression is.
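One way to picture the primary/secondary distinction is a small lookup from accent type to display parameters. The sketch below is purely illustrative: the `Accent` enum, the `FACE_STYLE` table, and the numeric mouth-opening scales are assumptions chosen for the example; the specification says only that the two accents may differ in mouth opening or strength of expression.

```python
from enum import Enum


class Accent(Enum):
    NONE = 0
    PRIMARY = 1
    SECONDARY = 2


# Hypothetical display parameters: (mouth opening scale, expression strength).
FACE_STYLE = {
    Accent.NONE: (1.0, "neutral"),
    Accent.SECONDARY: (1.2, "emphasized"),
    Accent.PRIMARY: (1.5, "strongly emphasized"),
}


def face_style(accent):
    """Return the display parameters for the given accent type."""
    return FACE_STYLE[accent]


# A hypothetical word with a secondary accent on its first part and a primary on its third.
styles = [face_style(a) for a in
          (Accent.SECONDARY, Accent.NONE, Accent.PRIMARY, Accent.NONE)]
```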

Each of the processing methods of the mobile device 10 described in the above embodiments can be stored and distributed, as a program executable by a computer, on an external recording medium 13 such as a memory card (ROM card, RAM card, DATA CARD, etc.), a magnetic disk (floppy disk, hard disk, etc.), an optical disk (CD-ROM, DVD, etc.), or a semiconductor memory. These methods are the main processing according to the dictionary processing program 12a in the first embodiment shown in the flowchart of FIG. 7, the headword synchronized playback processing accompanying the main processing shown in the flowchart of FIG. 8, the text-corresponding mouth display processing shown in the flowchart of FIG. 9, which runs as an interrupt as each headword character is highlighted during the headword synchronized playback processing, the headword synchronized playback processing of the second embodiment shown in the flowchart of FIG. 19, and the headword synchronized playback processing of the third embodiment shown in the flowchart of FIG. 20. Various computer terminals having a function for communicating with the communication network (Internet) N can read the program stored on the external recording medium 13 into the memory 12 with the recording medium reading unit 14 and, with their operation controlled by the program thus read, can realize the synchronized playback of the characters (text), pronunciation audio, and pronunciation face images (including pronunciation mouth-shape images) corresponding to a search headword described in each embodiment and execute the same processing by the methods described above.

The data of the program for realizing each of the above methods can also be transmitted over the communication network (Internet) N in the form of program code. The program data can be acquired from a computer terminal connected to the communication network (Internet) N to realize the synchronized playback of the characters (text), pronunciation audio, and pronunciation face images (including pronunciation mouth-shape images) corresponding to a search headword described above.

The present invention is not limited to the embodiments described above and can be modified in various ways at the implementation stage without departing from its gist. The embodiments also include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent elements. For example, even if some constituent elements are deleted from all the constituent elements shown in an embodiment, or some constituent elements are combined, the resulting configuration can be extracted as an invention, provided that the problem described in the section on the problem to be solved by the invention can still be solved and the effects described in the section on the effects of the invention can still be obtained.

DESCRIPTION OF SYMBOLS
10 ... Portable device
11 ... CPU
12 ... Memory
12A ... FLASH memory
12B ... RAM
12a ... Dictionary processing program
12b ... Dictionary database
12c ... Dictionary voice data
12d ... Character image data
12d (No. n) ... Set character image
12d (No. n') ... Accent-corresponding face image
12d (No. nUS) ... Set character image for American English
12d (No. nUS') ... Accent-corresponding face image for American English
12d (No. nUK) ... Set character image for British English
12d (No. nUK') ... Accent-corresponding face image for British English
12d (No. nBL) ... Set character image changed to blue
12d (No. nBL') ... Accent-corresponding blue face image
12e ... Voice-specific mouth image data
12f ... Dictionary time code file
12g ... Headword data memory
12h ... Headword-corresponding dictionary data memory
12i ... Time code file No. 23
12j ... Synchronization HTML file memory
12k ... Synchronization text file memory
12m ... Synchronization sound file memory
12n ... Synchronization image file memory
12p ... Mouth image area memory
12q ... Image expansion buffer
13 ... External recording medium
14 ... Recording medium reading unit
15 ... Transmission control unit
16 ... Communication unit
17a ... Input unit
17b ... Coordinate input device
18 ... Display unit
19a ... Voice input unit
19b ... Stereo voice output unit
20 ... Communication device (home PC)
30 ... Web server
N ... Communication network (Internet)
X ... Selection frame
H ... Header information of the time code table
G1 ... Character image list selection screen
G2 ... Headword search screen
G3 ... Accent test question display screen
W1 ... Headword character display window (window for synchronized text playback)
W2 ... Pronunciation mouth-shape display window (window for synchronized image playback)
HL ... Highlight (identification) display
Et ... Correct accent selection item
Ef ... Erroneous accent selection item

The present invention has been made in view of the problems described above. Its object is to provide a voice display output control device and a voice display output control processing program with which, even for the same headword, the user can hear the headword pronounced with a different pronunciation voice for each region name and, in synchronization with the output of that pronunciation voice, can see the mouth-shape image change to match it. Language learning can thus proceed while the pronunciation of the headword and the way of opening the mouth for each region are confirmed with both the ears and the eyes.

The voice display output control device according to claim 1 of the present invention comprises: storage means that stores a plurality of headwords in association with pronunciation voice data for at least two region names assigned to each headword; region designation means for designating a region name; voice output means that, when one region name is designated by the region designation means and one headword is designated from among the plurality of headwords, reads from the storage means the pronunciation voice data of the designated region name from among the at least two region names assigned to the designated headword, and outputs the pronunciation voice corresponding to the read pronunciation voice data; headword display control means that displays the headword corresponding to the pronunciation voice output by the voice output means; image display control means that displays a face image including at least a mouth portion; and mouth image display control means that, in synchronization with the pronunciation voice output by the voice output means, changes the mouth-shape image of the mouth portion of the face image displayed under the control of the image display control means to a mouth-shape image corresponding to that pronunciation voice and displays it.

With this configuration, even for the same headword, a different region name can be designated and the headword confirmed through the voice output for that region name. In addition, the mouth-shape image is changed to one corresponding to the pronunciation voice and displayed in synchronization with its output, so the learner can study while relating the regional pronunciation voice to the way of opening the mouth and confirming both.
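The region-dependent lookup described for claim 1 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiments: the `Pronunciation` record, the `STORE` layout, the file names, and the function names are all hypothetical, and the mouth image numbers for the second region are invented sample values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pronunciation:
    audio_id: str        # hypothetical key of the stored pronunciation voice data
    mouth_images: tuple  # hypothetical mouth-shape image numbers, in playback order


# Storage means: headword -> region name -> pronunciation data.
# All sample values are assumptions made for this sketch.
STORE = {
    "low": {
        "US": Pronunciation("low_us.wav", (36, 9, 8)),
        "UK": Pronunciation("low_uk.wav", (36, 11, 8)),
    },
}


def select_pronunciation(headword, region):
    """Return the pronunciation data for the designated headword and region."""
    by_region = STORE.get(headword)
    if by_region is None:
        raise KeyError(f"unknown headword: {headword!r}")
    if region not in by_region:
        raise KeyError(f"no {region!r} pronunciation stored for {headword!r}")
    return by_region[region]


def playback_plan(headword, region):
    """List (audio, headword text, mouth image) steps to show in sync with the audio."""
    p = select_pronunciation(headword, region)
    return [(p.audio_id, headword, image_id) for image_id in p.mouth_images]
```

Calling `playback_plan("low", "UK")` would select the audio and mouth-shape sequence stored for that region, so the same headword yields a different voice and a different mouth sequence per designated region.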

The image display control device according to claim 3 of the present invention changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a word headword. First storage means stores a plurality of sets in which the pronunciation target data is associated with phonetic symbols, including accented phonetic symbols. Second storage means stores a plurality of sets in which phonetic symbols, including accented phonetic symbols, are associated with their voices and face images. As the series of pronunciation target data is displayed in pronunciation order, first control means reads the phonetic symbols corresponding to the pronunciation target data from the first storage means, reads the voice and face image corresponding to the read phonetic symbols from the second storage means, outputs the read voice to the outside, and performs control so that the read face image is displayed. When the voice is output to the outside under the control of the first control means, second control means determines whether the read phonetic symbols include an accented phonetic symbol. When it determines that an accent symbol is included, it reads the voice and face image corresponding to the accented phonetic symbol from the second storage means, outputs the read voice to the outside, and performs control so that the read face image is displayed.

In the image display control device according to claim 4 of the present invention, which is the image display control device according to claim 3, the phonetic symbols stored in the second storage means consist of phonetic symbols with an accent symbol and phonetic symbols without an accent symbol. The voice and face image stored in association with a phonetic symbol with an accent symbol differ from the voice and face image stored in association with the phonetic symbol without the accent symbol.
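The two-stage lookup of claims 3 and 4 can be illustrated with a short sketch. It assumes a hypothetical text encoding in which an accented phonetic symbol carries a leading marker character; the marker, the symbol strings, and the file and image names are invented for illustration and are not taken from the embodiments.

```python
ACCENT = "ˈ"  # hypothetical marker prefixed to an accented phonetic symbol

# First storage means: pronunciation target data -> phonetic symbols (invented sample).
FIRST_STORE = {"low": ["l", ACCENT + "o", "u"]}

# Second storage means: phonetic symbol -> (voice, face image).
# Accented and unaccented symbols are separate entries holding different data.
SECOND_STORE = {
    "l": ("l.wav", "face_l"),
    "o": ("o.wav", "face_o"),
    ACCENT + "o": ("o_accent.wav", "face_o_accent"),
    "u": ("u.wav", "face_u"),
}


def render(word):
    """Return (symbol, voice, face, accented) tuples in pronunciation order."""
    steps = []
    for symbol in FIRST_STORE[word]:
        # Second control: decide whether this symbol carries an accent symbol.
        accented = symbol.startswith(ACCENT)
        # The accented symbol is its own key, so it selects its own voice and face.
        voice, face = SECOND_STORE[symbol]
        steps.append((symbol, voice, face, accented))
    return steps
```

Keeping the accented symbol as a separate key is one simple way to obtain the property stated for claim 4: the voice and face image for the accented symbol differ from those for the unaccented one without any special-case drawing code.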

The image display control device according to claim 5 of the present invention changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a word headword. Storage means stores a plurality of sets in which the pronunciation target data is associated with its voice and face image. Detection means detects, in the signal waveform of the stored voice, the peak portion of the waveform corresponding to the accent portion of the pronunciation target data. Display control means reads from the storage means the face image corresponding to the voice of the detected accent portion and performs control so that the read face image is displayed in a display form different from that of the face images corresponding to the voice of the other signal waveform portions.

In the image display control device according to claim 6 of the present invention, which is the image display control device according to claim 5, the display control means includes text display control means. The text display control means performs control so that the portion of the pronunciation target data corresponding to the accent portion detected by the detection means is displayed in a display form different from that of the portions of the pronunciation target data corresponding to the other signal waveform portions.
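The waveform-based accent detection of claims 5 and 6 can be sketched as below. The sketch makes two simplifying assumptions that the text does not state: the waveform is split into fixed-size frames, and exactly one frame corresponds to each displayed character. The function names and the "accent" and "normal" labels are hypothetical.

```python
def frame_peaks(samples, frame_size):
    """Peak absolute amplitude of each consecutive frame of the waveform."""
    return [
        max(abs(s) for s in samples[i:i + frame_size])
        for i in range(0, len(samples), frame_size)
    ]


def accent_frame(samples, frame_size):
    """Index of the frame holding the largest peak, taken here as the accent portion."""
    peaks = frame_peaks(samples, frame_size)
    return max(range(len(peaks)), key=peaks.__getitem__)


def display_forms(text, samples, frame_size):
    """Pair each character with 'accent' or 'normal', assuming one frame per character."""
    peaks = frame_peaks(samples, frame_size)
    if len(peaks) != len(text):
        raise ValueError("this sketch assumes exactly one frame per character")
    hit = accent_frame(samples, frame_size)
    return [(ch, "accent" if i == hit else "normal") for i, ch in enumerate(text)]
```

The same frame index could drive both the face image and the text display, so the face and the accented character switch to their distinct display forms together. A real device would need a proper alignment between audio time and characters rather than the one-frame-per-character assumption used here.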

According to the voice display output control device of claim 1 of the present invention, even for the same headword, a different region name can be designated and the headword confirmed through the voice output for that region name. In addition, the mouth-shape image is changed to one corresponding to the pronunciation voice and displayed in synchronization with its output, so the learner can study while relating the regional pronunciation voice to the way of opening the mouth and confirming both.

The image display control device (image display control processing program) according to claim 3 (claim 8) of the present invention changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a word headword. First storage means stores a plurality of sets in which the pronunciation target data is associated with phonetic symbols, including accented phonetic symbols. Second storage means stores a plurality of sets in which phonetic symbols, including accented phonetic symbols, are associated with their voices and face images. As the series of pronunciation target data is displayed in pronunciation order, first control means reads the phonetic symbols corresponding to the pronunciation target data from the first storage means, reads the voice and face image corresponding to the read phonetic symbols from the second storage means, outputs the read voice to the outside, and performs control so that the read face image is displayed. When the voice is output to the outside under the control of the first control means, second control means determines whether the read phonetic symbols include an accented phonetic symbol. When it determines that an accent symbol is included, it reads the voice and face image corresponding to the accented phonetic symbol from the second storage means, outputs the read voice to the outside, and performs control so that the read face image is displayed. As a result, as pronunciation target data such as a word headword is displayed in pronunciation order, the voice output and face image display corresponding to its phonetic symbols are performed, and at the accent portion the voice output and face image display corresponding to the accented phonetic symbol are performed. The learner can therefore easily and clearly learn the pronunciation voice of a word and the facial expression that accompanies it, as well as the pronunciation voice at the accent portion and the facial expression that accompanies the pronunciation of that portion.

According to the image display control device of claim 4 of the present invention, which is the image display control device according to claim 3, the phonetic symbols stored in the second storage means consist of phonetic symbols with an accent symbol and phonetic symbols without an accent symbol, and the voice and face image stored in association with a phonetic symbol with an accent symbol differ from the voice and face image stored in association with the phonetic symbol without the accent symbol. As a result, the learner can learn more clearly the difference between the pronunciation voice and accompanying facial expression at a portion of the pronunciation target data (such as a word headword) without an accent symbol and the pronunciation voice and accompanying facial expression at a portion with an accent symbol.

According to the image display control device of claim 5 of the present invention, which changes and controls a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a word headword, storage means stores a plurality of sets in which the pronunciation target data is associated with its voice and face image; detection means detects, in the signal waveform of the stored voice, the peak portion of the waveform corresponding to the accent portion; and display control means reads from the storage means the face image corresponding to the voice of the detected accent portion and performs control so that the read face image is displayed in a display form different from that of the face images corresponding to the voice of the other signal waveform portions. As a result, as pronunciation target data such as a word headword is displayed in pronunciation order, the face image corresponding to the pronunciation voice is displayed, and at the accent portion detected from the peak portion of the voice signal waveform the face image is displayed in a different display form. The learner can therefore learn more clearly the facial expression that accompanies pronunciation at the accent portion.

According to the image display control device of claim 6 of the present invention, which is the image display control device according to claim 5, the display control means includes text display control means that performs control so that the portion of the pronunciation target data corresponding to the accent portion detected by the detection means is displayed in a display form different from that of the portions of the pronunciation target data corresponding to the other signal waveform portions. As a result, in addition to displaying the face image corresponding to the pronunciation voice of the pronunciation target data, the accent portion of the pronunciation target data is displayed in a display form different from that of the rest of the pronunciation target data. The learner can therefore learn more clearly both the accent portion of the pronunciation target data and the facial expression that accompanies the utterance of its pronunciation voice.

Claims

Audio data output means for outputting audio data;
Text synchronous display control means for displaying text in synchronization with the voice data output by the voice data output means;
Image display control means for displaying an image including at least a mouth portion;
Mouth image display control for displaying a mouth image corresponding to the sound data in synchronization with the sound data output from the sound data output means for the mouth portion included in the image displayed by the image display control means Means,
Accent detection means for detecting the presence or absence of accents in the voice data or the text;
Image change display control means for changing the mouth-shaped image displayed by the image display control means in response to detection of the presence of accents by the accent detection means;
An audio display output control device comprising:

further,
Dictionary search means for searching dictionary data corresponding to the entered headword;
Dictionary data display control means for displaying dictionary data corresponding to the headword searched by the dictionary search means,
The speech data is pronunciation speech data of a headword searched by the dictionary search means, and the text is a text of a headword searched by the dictionary search means,
Output of the headword pronunciation voice data by the voice data output means, display of the headword text synchronized with the headword pronunciation voice data by the text synchronous display control means, and display of the image by the image display control means are performed while dictionary data corresponding to the search headword is displayed by the dictionary data display control means,
The voice display output control device according to claim 1.

A word storage means for storing a plurality of words and phonetic symbols with correct accents and phonetic symbols with error accents for each of the words,
Voice data output means for outputting correct accent pronunciation voice data or erroneous accent pronunciation voice data of the word stored by the word storage means;
Text synchronous display control means for displaying the text of the word in synchronism with pronunciation voice data of the word output by the voice data output means;
Image display control for displaying an image including at least a mouth portion in different display forms when the sound data of correct accent is output by the sound data output means and when sound data of error accent is output Means,
A mouth image that displays a mouth-shaped image corresponding to the pronunciation sound data in synchronism with the pronunciation sound data output by the sound data output means for the mouth portion included in the image displayed by the image display control means Display control means;
Accent detection means for detecting the accent of the word from the accented phonetic symbol stored by the word storage means in accordance with the synchronous display of the word text by the text synchronization display control means,
An image change display control means for changing an image displayed by the image display control means in response to detection of an accent by the accent detection means;
An audio display output control device comprising:

further,
Correct/incorrect accent display control means for displaying a word stored by the word storage means together with the correct accented phonetic symbol and the erroneous accented phonetic symbol associated with the word;
Correct/incorrect accent selection means for selecting either the correct accented phonetic symbol or the erroneous accented phonetic symbol of the word displayed by the correct/incorrect accent display control means,
The voice data output means outputs the correct accent pronunciation voice data or the erroneous accent pronunciation voice data of the corresponding word in response to the selection made by the correct/incorrect accent selection means.
The voice display output control device according to claim 3.

Storage means for storing a plurality of headwords and at least two or more local pronunciation sound data of each headword;
A region designating unit for designating any region among pronunciation speech data of two or more regions of the headword stored by the storage unit;
Voice data output means for outputting pronunciation voice data of the designated area of the corresponding headword in accordance with the area designation of the pronunciation voice data by the area designation means;
Text synchronized display control means for displaying the text of the headword in synchronization with the pronunciation voice data of the designated area of the headword output by the voice data output means;
Image display control means for displaying an image including at least a mouth portion in a different display form in accordance with a designated area of pronunciation sound data by the area designation means;
A mouth image that displays a mouth-shaped image corresponding to the pronunciation sound data in synchronism with the pronunciation sound data output by the sound data output means for the mouth portion included in the image displayed by the image display control means Display control means;
Accent detection means for detecting the accent of the headword accompanying the synchronous display of the headword text by the text synchronization display control means,
An image change display control means for changing an image displayed by the image display control means in response to detection of an accent by the accent detection means;
An audio display output control device comprising:

An image display control device for changing and controlling a face image having a mouth or an expression according to a display of a pronunciation order of a series of pronunciation target data including a word headword,
First storage means for storing a plurality of sets of the pronunciation target data and pronunciation symbols including accented pronunciation symbols in association with each other;
Second storage means for storing a plurality of sets of phonetic symbols including accented phonetic symbols and their voices and face images in association with each other;
Along with the display of the order of pronunciation of the series of pronunciation target data, the phonetic symbols corresponding to the pronunciation target data are read from the first storage means, and the voice and face image corresponding to the read phonetic symbols are read out from the first storage means. First control means for reading from the second storage means, outputting the read sound to the outside, and controlling to display the read face image;
When the sound is output to the outside by the control of the first control means, it is determined whether or not the read phonetic symbol includes a phonetic symbol with an accent symbol. When it is determined that there is a voice, the voice and the face image corresponding to the accented phonetic symbol are read from the second storage means, the read voice is output to the outside, and the read face is read Second control means for controlling to display an image;
An image display control device comprising:

In the image display control device according to claim 6,
The phonetic symbols including accented phonetic symbols stored in the second storage means consist of phonetic symbols with an accent symbol and phonetic symbols without an accent symbol, and the voice and face image stored in association with a phonetic symbol with the accent symbol differ from the voice and face image stored in association with a phonetic symbol without the accent symbol.

An image display control device for changing and controlling a face image having a mouth or an expression according to a display of a pronunciation order of a series of pronunciation target data including a word headword,
Storage means for storing a plurality of sets in association with the pronunciation target data and their voice and face images;
Detecting means for detecting a peak portion of a signal waveform corresponding to an accented portion of the sound generation target data among the sound signal waveforms stored in the storage means;
The face image corresponding to the voice of the accent part detected by the detection means is read from the storage means, and the read face image is displayed differently from the face image corresponding to the voice of the signal waveform part other than the accent part. Display control means for controlling to display in a form;
An image display control device comprising:

In the image display control device according to claim 8,
The display control unit is configured to display a portion of the pronunciation target data corresponding to the accent portion detected by the detection unit, in a display form different from the display of the portion of the pronunciation target data corresponding to the signal waveform portion other than the accent portion. An image display control device comprising text display control means for controlling the display so as to be displayed.

An audio display output control processing program for controlling a computer of an electronic device to synchronously reproduce audio data, text, and an image,
The computer,
Audio data output means for outputting audio data;
Text synchronization display control means for displaying text in synchronization with the voice data output by the voice data output means;
Image display control means for displaying an image including at least a mouth portion;
Mouth image display control for displaying a mouth image corresponding to the sound data in synchronization with the sound data output from the sound data output means for the mouth portion included in the image displayed by the image display control means means,
An accent detection means for detecting an accent of the voice data or the text;
Image change display control means for changing the image displayed by the image display control means in accordance with the detection of accents by the accent detection means;
A computer-readable audio display output control processing program that causes the computer to function as the above means.

An audio display output control processing program for controlling a computer of an electronic device to synchronously reproduce audio data, text, and an image,
The computer,
Word storage means for storing a plurality of words and phonetic symbols with correct accents and phonetic symbols with error accents associated with each of the words,
Voice data output means for outputting correct accent pronunciation voice data or error accent pronunciation voice data of the word stored by the word storage means;
Text synchronized display control means for displaying the text of the word in synchronization with the pronunciation voice data of the word output by the voice data output means;
Image display control for displaying an image including at least a mouth portion in different display forms when the sound data of correct accent is output by the sound data output means and when sound data of error accent is output means,
Mouth image that displays a mouth-shaped image corresponding to the pronunciation sound data in synchronization with the pronunciation sound data output by the sound data output means for the mouth portion included in the image displayed by the image display control means Display control means,
Accent detection means for detecting the accent of the word from the accented phonetic symbol of the corresponding word stored by the word storage means in accordance with the synchronous display of the word text by the text synchronous display control means,
Image change display control means for changing the image displayed by the image display control means in accordance with the detection of accents by the accent detection means;
A computer-readable audio display output control processing program that causes the computer to function as the above means.
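
As an illustration only, the interaction of the word storage, accent detection, and image change display control means recited above might be sketched as follows. The data layout, the image identifiers, and the names `WORD_STORE`, `MOUTH_IMAGES`, `BASE_FACE`, and `build_frames` are all assumptions made for this sketch and do not appear in the specification; audio output itself is omitted.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical word store: each word maps to a correct-accent and an
# error-accent phonetic sequence. Each entry is (symbol, is_accented).
WORD_STORE = {
    "low": {
        "correct": [("l", False), ("o", True), ("u", False)],
        "error": [("l", False), ("o", False), ("u", True)],
    }
}

# Hypothetical mouth-shape image identifier for each phonetic symbol.
MOUTH_IMAGES = {"l": "No36", "o": "No9", "u": "No8"}

# Hypothetical base face images: a different display form is used for
# correct-accent output and for error-accent output.
BASE_FACE = {"correct": "face_correct", "error": "face_error"}


@dataclass
class Frame:
    symbol: str       # phonetic symbol highlighted in the text display
    mouth_image: str  # mouth-shape image shown in the mouth portion
    face_image: str   # face image the mouth-shape image is combined with


def build_frames(word: str, accent_kind: str) -> List[Frame]:
    """Return one display frame per phonetic symbol, in output order."""
    base = BASE_FACE[accent_kind]
    frames = []
    for symbol, accented in WORD_STORE[word][accent_kind]:
        # Accent detection: the stored symbol carries the accent flag.
        # Image change: switch to an accent variant of the face image.
        face = base + "_accent" if accented else base
        frames.append(Frame(symbol, MOUTH_IMAGES[symbol], face))
    return frames
```

Calling `build_frames("low", "correct")` yields three frames in which only the frame for the accented symbol uses the accent variant of the face image; a real device would present each frame in time with the corresponding portion of the pronunciation voice data.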

An image display control processing program for controlling a computer of an electronic device to change a face image having a mouth or an expression in accordance with the display, in pronunciation order, of a series of pronunciation target data including a headword of a word,
The computer,
First storage means for storing a plurality of sets of the pronunciation target data and pronunciation symbols including accented pronunciation symbols in association with each other;
Second storage means for storing a plurality of sets of phonetic symbols including accented phonetic symbols and their voices and face images in association with each other;
First control means for, along with the display of the pronunciation order of the series of pronunciation target data, reading out the phonetic symbols corresponding to the pronunciation target data from the phonetic symbols stored in the first storage means, reading out the voice and the face image corresponding to the read phonetic symbols from the voices and face images stored in the second storage means, outputting the read voice to the outside, and displaying the read face image;
Second control means for determining, when the voice is output to the outside under the control of the first control means, whether or not the read phonetic symbols include a phonetic symbol with an accent symbol and, if so, reading out the voice and the face image corresponding to the accented phonetic symbol from the voices and face images stored in the second storage means, outputting the read voice to the outside, and displaying the read face image;
An image display control processing program that causes the computer to function as the above means.
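
One possible reading of the first and second storage means and the first and second control means is sketched below. The accent marker, the stored entries, and the names `FIRST_STORAGE`, `SECOND_STORAGE`, `first_control`, `second_control`, and `render` are hypothetical and chosen only for this sketch; voice and face image data are represented by placeholder strings.

```python
from typing import List, Optional, Tuple

ACCENT = "'"  # hypothetical marker appended to an accented phonetic symbol

# First storage: pronunciation target data -> phonetic symbols,
# including accented phonetic symbols.
FIRST_STORAGE = {"low": ["l", "o'", "u"]}

# Second storage: phonetic symbol (plain or accented) -> (voice, face image).
SECOND_STORAGE = {
    "l": ("voice_l", "face_l"),
    "o": ("voice_o", "face_o"),
    "o'": ("voice_o_accent", "face_o_accent"),
    "u": ("voice_u", "face_u"),
}


def first_control(symbol: str) -> Tuple[str, str]:
    """Read the voice and face image for the symbol, ignoring any accent."""
    return SECOND_STORAGE[symbol.replace(ACCENT, "")]


def second_control(symbol: str) -> Optional[Tuple[str, str]]:
    """If the symbol is accented, read the accented voice and face image."""
    if ACCENT in symbol:
        return SECOND_STORAGE[symbol]
    return None


def render(target: str) -> List[Tuple[str, str]]:
    """Return the (voice, face image) pairs in pronunciation order."""
    output = []
    for symbol in FIRST_STORAGE[target]:
        # The accented pair, when present, takes the place of the plain pair.
        output.append(second_control(symbol) or first_control(symbol))
    return output
```

In this sketch `render("low")` returns the plain voice and face image for "l" and "u" and the accented pair for "o'". Whether the accented pair replaces or follows the plain pair is a design choice the claim language leaves open; replacement is assumed here.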