JP4634461B2

JP4634461B2 - Document text-to-speech processing program and document browsing device

Info

Publication number: JP4634461B2
Application number: JP2007537485A
Authority: JP
Inventors: 義之長沢; 格長田; 雅秀山添; 和也佐藤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-09-27
Filing date: 2005-09-27
Publication date: 2011-02-16
Anticipated expiration: 2025-09-27
Also published as: WO2007036984A1; JPWO2007036984A1

Description

The present invention relates to an in-document text-to-speech processing program that causes a computer to execute in-document text-to-speech processing, in which a structured document written in a markup language such as HTML (HyperText Markup Language) is displayed for a user to browse and the text elements of the displayed document are converted by speech synthesis and output. It also relates to a document browsing apparatus provided with processing means for executing that in-document text-to-speech processing.

One data processing function of a computer is text-to-speech processing, in which text data is converted by speech synthesis and output. With this function, the user can grasp the content of a document by listening to audio output from a speaker instead of viewing the document on a display device or as a printout.

For example, some HTML document browsing programs that display web pages, which are information on the Internet, have a text-to-speech function that outputs audio data generated by speech synthesis from the text data portions of the web page shown on the display device.

However, because web pages written in HTML are created on the assumption that they will be shown on a display device and viewed by the user, they incorporate elements such as navigation menus for browsing operations and advertisement banners, as well as link information specified by anchor tags. Such navigation menus and advertisement banners may contain character strings, so if every character string in an HTML document is treated as a target of the reading process, assorted information is output together and the user is made to listen to information that is not needed.

To exclude such browsing-operation information from the target of the reading process, a method is known that uses the DOM (Document Object Model) tree information generated for web page display processing to identify the navigation menu portions of a web page, and performs the reading process with the text data of those portions excluded (see, for example, Patent Document 1).
Patent Document 1: JP 2004-171111 A

Some web pages on the Internet are organized as a series of pages that follow a predetermined format so that the user can browse them easily. For example, some news sites served by a news server use a common structure on every page, consisting of a header part, a main part, and a footer part. On each page, the header part shows link information to related pages for navigation, the footer part shows supplementary explanations of the page or service, copyright notices, and the like, and the main part shows the news or other content that the page actually exists to provide.

When the reading process is performed on multiple pages built in such a unified format, the header and footer parts show the same content on every page. If each page is simply read from the top as usual, the same content is read aloud again on every page, which is tiresome for the user.

Furthermore, when the text data being read aloud contains link information, there is no need to jump automatically to the link destination and read it again if the information at that destination has already been read aloud.

An object of the present invention is to provide a program that causes a computer to execute text-to-speech processing that reads out the information the user needs efficiently, by excluding from the target of the reading process any information that is displayed repeatedly across web pages and any information that has already been read aloud once.

Another object of the present invention is to provide a document browsing apparatus provided with processing means for executing the text-to-speech processing.

The present invention is a program that causes a computer to execute, as processing for reading aloud text in a structured document written in a markup language: 1) page configuration information storage processing, which stores in page configuration information storage means the page configuration information of a document whose text elements have been read aloud; 2) page configuration information acquisition processing, which acquires the page configuration information of the document currently to be displayed; 3) reading range setting processing, which compares the page configuration information of the document to be displayed with the page configuration information held in the page configuration information storage means, extracts from the former the text elements that do not match any text element of the stored page configuration information, and sets the extracted text elements as the reading target; and 4) reading processing, which converts the text elements set as the reading target in the document being displayed into audio data and outputs it.
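
The four processes above can be pictured with a minimal Python sketch. It rests on assumptions that are not in the source: page configuration information is reduced to a flat list of text strings, the class and method names are hypothetical, and `speak` is a stand-in callback for speech synthesis and audio output.

```python
from typing import Callable, Iterable


class ReadAloudSession:
    """Toy model of processes 1) to 4); names and data shapes are assumptions."""

    def __init__(self, speak: Callable[[str], None]) -> None:
        # Stand-in for the page configuration information storage means.
        self._stored_pages: list[list[str]] = []
        # Stand-in for speech synthesis plus audio output.
        self._speak = speak

    def select_targets(self, current: Iterable[str]) -> list[str]:
        # 3) Keep only text elements that match nothing in the stored pages.
        already_read = {text for page in self._stored_pages for text in page}
        return [text for text in current if text not in already_read]

    def read_page(self, current: list[str]) -> list[str]:
        # 2) `current` is the acquired page configuration information.
        targets = self.select_targets(current)
        for text in targets:
            self._speak(text)  # 4) read the selected elements aloud
        self._stored_pages.append(list(current))  # 1) store the read page
        return targets
```

With a headline page followed by an article page that shares its header and footer, only the article body is selected on the second call.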

The present invention operates as follows.

On a computer on which the present invention is installed, page configuration information is stored in the page configuration information storage means for each document whose text elements have been read aloud. This page configuration information describes the page structure used for HTML document display processing, obtained by analyzing each element and the hierarchical structure of the elements; the DOM tree information of an HTML document is one example. The page configuration information of the document to be displayed is then compared with the stored page configuration information of documents read aloud in the past, the text elements other than those duplicated between the two are extracted, and the extracted text elements are set as the reading target. Finally, the reading process converts the text elements set as the reading target in the currently displayed document into audio data and outputs it.

When reading aloud multiple pages that share the same layout, the computer can therefore read an HTML document while excluding the text elements displayed in common across those pages. The user is spared the annoyance of hearing the same content read repeatedly and can listen to the necessary information efficiently.

Furthermore, the present invention is a program that causes the computer to execute 5) automatic jump processing: when link information is detected in a text element during the reading process, the page configuration information of the link destination document set in the link information is acquired and compared with the page configuration information held in the page configuration information storage means, and if the page configuration information of the link destination completely matches page configuration information that has already been read aloud, the automatic jump for that link information is suppressed.

Alternatively, the present invention is a program that causes the computer to execute: 6) hash data management processing, which computes a hash function value from the page configuration information of a document using a predetermined hash function and stores in hash information storage means the hash function value computed from the page configuration information of each document that has been read aloud; and 7) automatic jump processing, which, when link information is detected in a text element during the reading process, obtains the hash function value of the page configuration information of the link destination document, compares it with the hash function values held in the hash information storage means, and suppresses the automatic jump for that link information if the value completely matches a stored one.

Alternatively, the present invention is a program that causes the computer to execute: 8) document temporary storage processing, which temporarily stores a displayed document in document temporary storage means; 9) reading flag management processing, which sets a reading flag on a document held in the document temporary storage means once that document has been read aloud; and 10) automatic jump processing, which, when link information is detected in a text element during the reading process, suppresses the automatic jump for that link information if the link destination document is held in the document temporary storage means and its reading flag is set.

In this way, even when link information is set on a text element being read aloud, the present invention determines whether the link destination document has already been a target of the reading process and, if it has, does not perform the automatic jump for that link information. The user is spared the annoyance of hearing already-read content repeated and can listen to the necessary information efficiently.

The present invention is also a document browsing apparatus with an in-document text-to-speech function, provided with processing means that respectively execute the processes realized by the in-document text-to-speech processing program.

According to the present invention, when the user browses web pages that make heavy use of a single layout, such as those provided by a news site, and has their content read aloud, the content displayed in common in the header and footer parts of each page can be excluded from the target of the reading process. The user is spared the annoyance of hearing the same information read repeatedly and can obtain only the necessary information efficiently and comfortably.

In addition, according to the present invention, even when the text being read aloud contains link information, the automatic jump can be suppressed if the link destination has already been read aloud. The same page is not read again, so the user can obtain the necessary information efficiently without wasting time.

FIG. 1 is a diagram showing a configuration example in the best embodiment of the present invention.
FIG. 2 is a diagram showing an example of a web page provided by a news site.
FIG. 3 is a diagram showing an example of a DOM tree management list.
FIG. 4 is a diagram showing an example of DOM tree information.
FIG. 5 is a diagram showing a processing flow of the reading process in the first embodiment.
FIG. 6 is a diagram showing a processing flow of the reading process in the first embodiment.
FIG. 7 is a diagram showing a processing flow of the reading process in the first embodiment.
FIG. 8 is a diagram showing a processing flow of the URL list acquisition process in step S111.
FIG. 9 is a diagram showing a processing flow of the automatic jump process in step S8.
FIG. 10 is a diagram showing an example of a hash management list.
FIG. 11 is a diagram showing a processing flow of the reading process in the second embodiment.
FIG. 12 is a diagram showing a processing flow of the automatic jump process in step S20.
FIG. 13 is a diagram showing an example of a cache management list.
FIG. 14 is a diagram showing a processing flow of the reading process in the third embodiment.
FIG. 15 is a diagram showing a processing flow of the automatic jump process in step S30.

Explanation of symbols

1 Document browsing apparatus
110 Browsing processing control unit
111 Communication processing unit
112 HTML analysis processing unit
113 Layout processing unit
114 Screen display processing unit
115 Temporary storage document management unit
116 Reading flag management unit
117 Document temporary storage unit
120 Reading processing control unit
121 DOM tree saving processing unit
122 DOM tree storage unit
123 Reading processing unit
124 Reading range determination unit
125 Automatic jump processing unit
126 Hash data management unit
141 Speech synthesis processing unit
143 Audio output processing unit
2 Display device
3 Speaker

FIG. 1 shows a configuration example in the best embodiment of the present invention. In this embodiment, a web page (HTML document) provided on the Internet is used as an example of a structured document written in a markup language.

The document browsing apparatus 1 is a computer for browsing documents that has a function for reading aloud the text in a document. In this example, the in-document text-to-speech processing program of the present invention is installed in memory in the document browsing apparatus 1 and is executed when a predetermined trigger occurs.

The document browsing apparatus 1 includes a display device 2, a speaker 3, a browsing processing control unit 110, a communication processing unit 111, an HTML analysis processing unit 112, a layout processing unit 113, a screen display processing unit 114, a temporary storage document management unit 115, a reading flag management unit 116, a document temporary storage unit 117, a reading processing control unit 120, a DOM tree saving processing unit 121, a DOM tree storage unit 122, a reading processing unit 123, a reading range determination unit 124, an automatic jump processing unit 125, a hash data management unit 126, a speech synthesis processing unit 141, and an audio output processing unit 143.

The browsing processing control unit 110 is processing means that controls the series of processes involved in browsing, in which a web page is acquired and displayed on the display device 2. When the user requests the reading process, the browsing processing control unit 110 also notifies the reading processing control unit 120 of the request.

The communication processing unit 111 is processing means that acquires a web page from a website server in accordance with HTTP.

The HTML analysis processing unit 112 is processing means that analyzes the HTML tags of a web page and generates page configuration information converted into a format called a DOM tree.
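
As a rough illustration of this analysis step, the following sketch uses Python's standard `html.parser` module to collect text elements together with the tag path that leads to them. It does not build a full DOM tree, and the class name, the `(tag path, text)` representation, and the short list of void tags are assumptions made for the example.

```python
from html.parser import HTMLParser


class TextElementCollector(HTMLParser):
    """Collects (tag path, text) pairs in document order."""

    SKIPPED = {"script", "style"}  # content that is never read aloud
    VOID = {"br", "img", "hr", "meta", "link", "input"}  # tags without a close

    def __init__(self) -> None:
        super().__init__()
        self._path: list[str] = []
        self.text_elements: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self._path.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate unclosed inner tags.
        if tag in self._path:
            while self._path.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and not (self.SKIPPED & set(self._path)):
            self.text_elements.append(("/".join(self._path), text))


def parse_text_elements(html: str) -> list[tuple[str, str]]:
    collector = TextElementCollector()
    collector.feed(html)
    collector.close()
    return collector.text_elements
```

Keeping the tag path next to each text string preserves a little of the hierarchy, so two pages can later be compared on position in the structure as well as on wording.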

The layout processing unit 113 is processing means that lays out the elements of a web page on the basis of the page configuration information (hereinafter referred to as DOM tree information).

The screen display processing unit 114 is processing means that displays the individual elements of the laid-out web page on the display device 2.

The temporary storage document management unit 115 is processing means that temporarily stores, in the document temporary storage unit 117, the web page (HTML document) displayed on the display device 2 by the browsing processing control unit 110.

The reading flag management unit 116 is processing means that manages reading information, in which a reading flag is set for a web page stored in the document temporary storage unit 117 once that page has been read aloud by the reading processing unit 123.

The document temporary storage unit 117 is means that temporarily stores the web page displayed on the display device 2 by the browsing processing control unit 110.

The reading processing control unit 120 is processing means that, on receiving a processing request from the user, controls the series of processes involved in reading aloud, in which predetermined text elements of the web page displayed on the display device 2 are converted by speech synthesis and output.

The DOM tree saving processing unit 121 is processing means that stores, in the DOM tree storage unit 122, the DOM tree information of a web page that has been a target of the reading process.

The reading processing unit 123 is processing means that takes the text elements of the web page designated as the processing target by the reading range determination unit 124 in order from the beginning, and executes the reading process by means of the speech synthesis processing unit 141 and the audio output processing unit 143.

The reading range determination unit 124 is processing means that compares the DOM tree information of the web page to be displayed with the DOM tree information of already-read web pages stored in the DOM tree storage unit 122, extracts from the text elements of the former those that do not match any text element of the stored DOM tree information, and sets the extracted text elements as the processing target of the reading processing unit 123 (the reading target).

The automatic jump processing unit 125 is processing means that, when it detects link information in a text element set as the reading target, acquires the DOM tree information of the web page set as the link destination and compares it with the DOM tree information saved in the DOM tree storage unit 122. If the DOM tree information of the link destination completely matches DOM tree information in the DOM tree storage unit 122, the unit suppresses (disables) the automatic jump for that link information. If it does not completely match, the unit performs the automatic jump based on the link information and acquires the link destination web page (HTML document).
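
The decision described here can be sketched as a single function. This is only an illustration under stated assumptions: DOM tree information is reduced to a tuple of text strings, and `decide_jump`, `fetch_tree`, and `stored_trees` are hypothetical names for the decision, the retrieval of the link destination, and the contents of the DOM tree storage unit 122.

```python
from typing import Callable, Optional

# Assumption: a DOM tree is reduced to an ordered tuple of its text elements.
TextElements = tuple[str, ...]


def decide_jump(
    link_url: str,
    fetch_tree: Callable[[str], TextElements],
    stored_trees: list[TextElements],
) -> Optional[TextElements]:
    """Return the destination tree if the jump should happen, else None."""
    destination = fetch_tree(link_url)
    # A complete match with any already-read tree suppresses the jump.
    if destination in stored_trees:
        return None
    return destination
```

Note that the destination has to be retrieved before the comparison can be made, so this variant saves listening time rather than network traffic.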

In addition, when link information is detected in a text element being processed by the reading processing unit 123, the automatic jump processing unit 125 compares the hash function value (hash data) generated from the link destination web page set in the link information with the hash data generated from already-read web pages and saved by the hash data management unit 126. If the hash data of the link destination web page completely matches the hash data of a saved web page, the unit suppresses the automatic jump for that link information.

In addition, when link information is detected in a text element being processed by the reading processing unit 123, the automatic jump processing unit 125 checks whether the link destination web page set in the link information is cached in the document temporary storage unit 117. If, according to the reading information managed by the reading flag management unit 116, the reading flag of the cached link destination web page holds a value indicating that the page has been read aloud, the unit suppresses the automatic jump for that link information.
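
A minimal sketch of this cache-and-flag check follows. The class and field names are hypothetical, and a plain dictionary keyed by URL stands in for the document temporary storage unit 117 together with the reading information of the reading flag management unit 116.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    document: str             # cached HTML document
    read_aloud: bool = False  # reading flag


class DocumentCache:
    """Toy stand-in for the cached documents and their reading flags."""

    def __init__(self) -> None:
        self._entries: dict[str, CacheEntry] = {}

    def store(self, url: str, document: str) -> None:
        entry = self._entries.get(url)
        if entry is None:
            self._entries[url] = CacheEntry(document)
        else:
            entry.document = document  # refresh content, keep the flag

    def mark_read(self, url: str) -> None:
        entry = self._entries.get(url)
        if entry is not None:
            entry.read_aloud = True

    def should_jump(self, url: str) -> bool:
        # Suppress the jump only when the page is cached AND flagged as read.
        entry = self._entries.get(url)
        return not (entry is not None and entry.read_aloud)
```

Unlike a comparison of page contents, this check needs only the URL of the link destination, so no retrieval is required when the jump is suppressed.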

The hash data management unit 126 is processing means that creates hash data from an already-read web page using a predetermined hash function, and generates and manages a hash management list that records the correspondence between the storage location information (such as the URL) of the web page and its hash data.
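
The list that maps a page's storage location to its hash data can be pictured as follows. The source only says "a predetermined hash function", so the use of SHA-256 here is an assumption, as are the names `page_hash` and `UrlHashList` and the choice to hash a page's text elements joined by a separator character.

```python
import hashlib


def page_hash(text_elements: list[str]) -> str:
    # Assumption: SHA-256 over the text elements, joined with a unit separator
    # so that ["ab", "c"] and ["a", "bc"] hash differently.
    joined = "\x1f".join(text_elements)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class UrlHashList:
    """Maps the URL of each already-read page to its hash data."""

    def __init__(self) -> None:
        self._hash_by_url: dict[str, str] = {}

    def register(self, url: str, text_elements: list[str]) -> None:
        self._hash_by_url[url] = page_hash(text_elements)

    def already_read(self, text_elements: list[str]) -> bool:
        # A complete match with any stored hash means the page was read.
        return page_hash(text_elements) in self._hash_by_url.values()
```

Storing a fixed-length digest instead of the whole page structure keeps the comparison to a single string equality per stored page.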

The hash data management unit 126 also generates hash data from a link destination web page passed to it by the automatic jump processing unit 125, and returns the generated hash data to the automatic jump processing unit 125.

The speech synthesis processing unit 141 is processing means that converts the text element to be processed, obtained from the reading processing unit 123, into a phonetic character string and then converts the phonetic character string into waveform data. The audio output processing unit 143 is processing means that outputs the waveform data as sound from the speaker 3.

Next, a specific processing example of the present invention is described.

The document browsing apparatus 1 displays a web page provided by a news site, such as the one shown in FIG. 2, on the display device 2 for the user to browse and, at the user's request, outputs from the speaker 3 audio data converted from the text elements (text data) in the web page. It is assumed here that all web pages provided by the news site are created according to a specific layout so that the user can browse them easily.

FIG. 2(A) shows an example of web page A, whose content is a set of news headlines (hereinafter referred to as content A). FIG. 2(B) shows an example of web page B, whose content is the news body corresponding to one of the news headlines (hereinafter referred to as content B).

Web page A and web page B share a common layout consisting of three parts, a header part H, a footer part F, and a main part Main, and the same content is displayed in the header part H and the footer part F of both pages. Web page B is set as the link destination of one of the news headlines (for example, news headline HL1) displayed in the main part A_Main of web page A. The main part B_Main of web page B displays the news body NEWS1 corresponding to the linking news headline HL1.

[First Embodiment]
In the first embodiment, the document browsing apparatus 1 uses the DOM tree information of web pages to determine both the range of text elements to be read aloud and whether automatic jump processing is needed for link information.

In the document browsing apparatus 1 of this embodiment, the hash data management unit 126, the temporary storage document management unit 115, the reading flag management unit 116, and the document temporary storage unit 117 shown in the configuration example of FIG. 1 are not essential components.

In the document browsing apparatus 1, when the communication processing unit 111 downloads web page A from the news site, the HTML analysis processing unit 112 analyzes the HTML tags of the downloaded web page A and generates DOM tree information A. The layout processing unit 113 then lays out each element of web page A on the basis of DOM tree information A, and the screen display processing unit 114 displays the laid-out web page A on the display device 2.

The reading processing unit 123 acquires DOM tree information A of web page A through the reading processing control unit 120. Based on the DOM tree management list, the reading range determination unit 124 retrieves from the DOM tree storage unit 122, one at a time, the DOM tree information of web pages that belong to the same domain as web page A, compares DOM tree information A with each retrieved DOM tree information from both the beginning and the end, and determines whether the two contain the same text elements.
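
One way to read "compares from both the beginning and the end" is to trim the leading run and the trailing run of text elements that the two pages share, which corresponds to a common header and a common footer. The sketch below implements that interpretation on flat lists of text strings; the function name and the list representation are assumptions, and the source does not spell out the exact comparison procedure.

```python
def unread_range(current: list[str], stored: list[str]) -> list[str]:
    """Drop the leading and trailing elements that `current` shares with `stored`."""
    limit = min(len(current), len(stored))

    # Scan from the beginning: count matching header elements.
    head = 0
    while head < limit and current[head] == stored[head]:
        head += 1

    # Scan from the end: count matching footer elements, without
    # crossing into the part already counted as header.
    tail = 0
    while tail < limit - head and current[-1 - tail] == stored[-1 - tail]:
        tail += 1

    return current[head:len(current) - tail]
```

For an article page compared against a previously read headline page with the same header and footer, only the article body remains; for a page with nothing in common, the whole list is returned.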

ＤＯＭツリー管理リストは，ＤＯＭツリー記憶部１２２に退避格納される読み上げ処理済みのウェブページのＤＯＭツリー情報を管理するリストである。図３に，ＤＯＭツリー管理リストの例を示す。ＤＯＭツリー管理リストは，ウェブページの格納場所情報（ＵＲＬ），ＤＯＭツリー記憶部１２２のＤＯＭツリー情報へのポインタ情報であるＤＯＭツリーアドレスで構成される。 The DOM tree management list is a list for managing DOM tree information of web pages that have been read out and saved in the DOM tree storage unit 122. FIG. 3 shows an example of the DOM tree management list. The DOM tree management list includes web page storage location information (URL) and DOM tree addresses that are pointer information to DOM tree information in the DOM tree storage unit 122.
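The relationship between the DOM tree management list and the DOM tree storage unit 122 can be illustrated with a minimal Python sketch. The class name `DomTreeStore`, the use of a list index as the "DOM tree address", the flat list of text strings standing in for DOM tree information, and the example URL are all assumptions made for illustration; none of them appears in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class DomTreeStore:
    """Hypothetical stand-in for the DOM tree storage unit and its management list."""
    trees: list = field(default_factory=list)            # saved DOM tree information
    management_list: list = field(default_factory=list)  # rows of (URL, DOM tree address)

    def save(self, url, text_elements):
        # The "DOM tree address" is modeled as an index into self.trees.
        address = len(self.trees)
        self.trees.append(list(text_elements))
        self.management_list.append((url, address))
        return address

    def lookup(self, url):
        # Follow the pointer recorded for the given URL, if there is one.
        for saved_url, address in self.management_list:
            if saved_url == url:
                return self.trees[address]
        return None


store = DomTreeStore()
store.save("http://news.example.com/index.html", ["Title", "Latest News", "Politics"])
```

Keeping the list separate from the stored trees means the URLs can be scanned without touching the DOM tree information itself.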

Here, assume that the web page A is the first web page downloaded from the news site, that no text element in the DOM tree information of the already-read web pages saved in the DOM tree storage unit 122 matches it, and that none of the text elements of the web page A has been read aloud yet.

The reading range determination unit 124 determines that all text elements of the DOM tree information A fall within the range of the reading process, and sets everything from the first text element to the last text element of the DOM tree information A as the target of the reading process. For example, "title, latest news, politics, economy, society, overseas, news headline HL1..." of the web page A in FIG. 2(A) becomes the target of the reading process.

The reading processing unit 123 passes the text elements of the DOM tree information A of the web page A to the speech synthesis processing unit 141 in order. The speech synthesis processing unit 141 converts each text element to be read aloud into a phonetic character string and then converts the phonetic character string into waveform data, and the voice output processing unit 143 outputs the waveform data as sound from the speaker 3.

そして，ＤＯＭツリー退避処理部１２１は，読み上げ処理されたウェブページＡのＤＯＭツリー情報ＡをＤＯＭツリー記憶部１２２へ格納し，ＤＯＭツリー管理リストに，ウェブページＡのＵＲＬとＤＯＭツリー情報Ａへのポインタ情報（ＤＯＭツリーアドレス）を追加する。 Then, the DOM tree save processing unit 121 stores the DOM tree information A of the web page A that has been read out in the DOM tree storage unit 122, and stores the URL of the web page A and the DOM tree information A in the DOM tree management list. Pointer information (DOM tree address) is added.

Further, assume that link information whose link destination is the web page B is set on the text element "news headline HL1" of the web page A being read aloud.

自動ジャンプ処理部１２５は，リンク情報「ニュース見出しＨＬ１」のリンク先であるウェブページＢのＤＯＭツリー情報Ｂを読み上げ処理制御部１２０を介して取得する。そして，取得したＤＯＭツリー情報ＢとＤＯＭツリー記憶部１２２に格納されたＤＯＭツリー情報とを比較し，完全に一致するＤＯＭツリー情報があるかどうか判定する。 The automatic jump processing unit 125 acquires the DOM tree information B of the web page B that is the link destination of the link information “news headline HL1” via the reading processing control unit 120. Then, the acquired DOM tree information B is compared with the DOM tree information stored in the DOM tree storage unit 122, and it is determined whether there is a completely matching DOM tree information.

ここで，ウェブページＢは未だ読み上げ処理の対象となっていないので，ＤＯＭツリー記憶部１２２にＤＯＭツリー情報Ｂと完全に一致するものは格納されていない。 Here, since the web page B has not yet been subjected to the reading process, the DOM tree storage unit 122 does not store anything that completely matches the DOM tree information B.

自動ジャンプ処理部１２５は，完全に一致するＤＯＭツリー情報を検出していないので，リンク情報「ニュース見出しＨＬ１」において自動ジャンプ処理を実行する。この自動ジャンプ処理によって，図２（Ｂ）のウェブページＢが表示処理対象としてダウンロードされる。 Since the automatic jump processing unit 125 has not detected the completely matched DOM tree information, the automatic jump processing unit 125 executes automatic jump processing on the link information “news headline HL1”. By this automatic jump processing, the web page B in FIG. 2B is downloaded as a display processing target.

The reading range determination unit 124 retrieves, one at a time, the DOM tree information of the already-read web pages stored in the DOM tree storage unit 122, compares each with the DOM tree information B of the web page B, and determines whether there are matching text elements.

FIG. 4 shows examples of DOM tree information. FIG. 4(A) shows an example of the DOM tree information A of the web page A (content A), which is one piece of the DOM tree information stored in the DOM tree storage unit 122, and FIG. 4(B) shows an example of the DOM tree information B of the web page B (content B), which is the current display processing target.

The reading range determination unit 124 compares the text elements of the DOM tree information B with those of the DOM tree information A stored in the DOM tree storage unit 122, working in order from both the beginning and the end. In this comparison, it detects that several text elements at the beginning of the DOM tree information B, "latest news, politics, economy, society, overseas", and several text elements at the end, "supplementary explanation, Copyright (C)...", duplicate text elements of the DOM tree information A. Assume also that the text element "news text" of the DOM tree information B does not match any text element of the DOM tree information A or of any other DOM tree information.

読み上げ範囲判定部１２４は，ウェブページＢのＤＯＭツリー情報Ｂから，重複するテキスト要素以外のテキスト要素「ニュース本文」（例えば，ニュース本文ＮＥＷＳ１）を読み上げ処理対象とする。読み上げ処理部１２３は，ウェブページＡの場合と同様に，読み上げ処理対象のテキスト要素を音声合成処理部１４１へ渡して読み上げ処理を行う。 The reading range determination unit 124 sets, from the DOM tree information B of the web page B, a text element “news body” (for example, news body NEWS1) other than the overlapping text elements as a reading process target. As in the case of the web page A, the reading processing unit 123 passes the text element to be read out to the speech synthesis processing unit 141 and performs the reading processing.

これにより，ウェブページＢの読み上げ処理において，既にウェブページＡの読み上げ処理において読み上げられた部分は読み上げ処理されず，ユーザは同じ内容が読み上げられるという状況を回避することができる。 Thereby, in the reading process of the web page B, the part that has already been read out in the reading process of the web page A is not read out, and the user can avoid the situation that the same content is read out.
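The comparison from the beginning and from the end described above can be sketched in Python as follows. This is a simplified illustration rather than the flow of the drawings: DOM tree information is reduced to a flat list of text strings, text elements match only on exact string equality, and the function names and sample page contents are hypothetical.

```python
def common_prefix_len(a, b):
    """Count how many leading items of a and b are equal."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def reading_range(current, saved_trees):
    """Return the text elements of `current` that no saved page already covers
    at its beginning or end."""
    start, end = 0, len(current)              # half-open range [start, end)
    for saved in saved_trees:
        head = common_prefix_len(current, saved)
        tail = common_prefix_len(current[::-1], saved[::-1])
        if head == len(current):
            return []                         # every element duplicates a saved page
        start = max(start, head)              # skip the duplicated leading elements
        end = min(end, len(current) - tail)   # skip the duplicated trailing elements
    return current[start:end] if start < end else []


page_a = ["latest news", "politics", "economy", "society", "overseas",
          "news headline HL1", "supplementary explanation", "Copyright (C)"]
page_b = ["latest news", "politics", "economy", "society", "overseas",
          "news text NEWS1", "supplementary explanation", "Copyright (C)"]
unread = reading_range(page_b, [page_a])
```

With these sample pages, only the news text of page B remains in the range, which corresponds to the behavior described for the web page B.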

その後，ウェブページＡのテキスト要素から別のリンク情報を検出した場合には，自動ジャンプ処理部１２５は，前記処理と同様に，そのリンク情報のリンク先のＤＯＭツリー情報によって自動ジャンプ処理を行うか否かを判定する。 Thereafter, when another link information is detected from the text element of the web page A, the automatic jump processing unit 125 performs the automatic jump processing based on the DOM tree information of the link destination of the link information as in the above processing. Determine whether or not.

When the reading range determination unit 124 acquires the DOM tree information C of another web page C that becomes the next display processing target, it determines the range of text elements to be read aloud in the same manner as described above.

FIGS. 5 to 7 show the processing flow of the reading process in the first embodiment.

図５の処理フローにおいて，読み上げ処理部１２３は，読み上げ処理を実行するか否かの状態を示す読み上げ処理フラグ（以下，処理フラグとする）の初期値をＯＦＦにする（ステップＳ１０）。読み上げ範囲判定部１２４は，読み上げ範囲判定処理を実行する（ステップＳ１１）。 In the processing flow of FIG. 5, the reading processing unit 123 turns off the initial value of the reading processing flag (hereinafter referred to as processing flag) indicating whether or not to execute the reading processing (step S10). The reading range determination unit 124 executes a reading range determination process (step S11).

FIGS. 6 and 7 show the processing flow of the reading range determination process.

For the web page X that is the current display processing target, the reading range determination unit 124 stores the element number of the first text element of its DOM tree information X in NODE_S and WORK_S, and stores the element number of the last text element of the same DOM tree information X in NODE_E and WORK_E (step S110). Here, NODE_S is a storage area that holds the element number of the text element at which the reading process starts, and NODE_E is a storage area that holds the element number of the text element at which the reading process ends.

次に，ＤＯＭツリー記憶部１２２から同一ドメインに属するウェブページのＤＯＭツリー情報を取得するために，ＤＯＭツリー管理リストからのＵＲＬリスト取得処理を実行する（ステップＳ１１１）。ＵＲＬリスト取得処理の詳細は後述する。 Next, in order to acquire the DOM tree information of the web page belonging to the same domain from the DOM tree storage unit 122, a URL list acquisition process from the DOM tree management list is executed (step S111). Details of the URL list acquisition process will be described later.

そして，ＵＲＬリスト取得処理によって抽出したＵＲＬリストに判定処理が未処理のＵＲＬが残っている間は（ステップＳ１１２のＹＥＳ），ステップＳ１１３以降の処理を行い，未処理で残されたＵＲＬがなくなれば（ステップＳ１１２のＮＯ），読み上げ範囲判定処理を終了する。 Then, while URLs that have not been subjected to the determination process remain in the URL list extracted by the URL list acquisition process (YES in step S112), the processes in and after step S113 are performed, and if there is no URL left unprocessed. (NO in step S112), the reading range determination process is terminated.

まず，ＵＲＬリストに判定処理が未処理のＵＲＬが残っている場合は（ステップＳ１１２のＹＥＳ），現在の表示処理対象のＤＯＭツリー情報Ｘの先頭から順にテキスト要素を取り出す（ステップＳ１１３）。さらに，ＤＯＭツリー記憶部１２２に退避させていた同一ドメイン下のＤＯＭツリー情報から抽出された一つのＤＯＭツリー情報Ｙの先頭から順にテキスト要素を取り出す（ステップＳ１１４）。ステップＳ１１３で取り出したテキスト要素とステップＳ１１４で取り出したテキスト要素とが同じテキスト要素であれば（ステップＳ１１５のＹＥＳ），ＷＯＲＫ＿Ｓに，ＤＯＭツリー情報Ｘの現在取り出しているテキスト要素の要素番号を格納する（ステップＳ１１６）。 First, when a URL that has not been subjected to determination processing remains in the URL list (YES in step S112), text elements are extracted in order from the top of the DOM tree information X that is the current display processing target (step S113). Further, text elements are extracted in order from the top of one DOM tree information Y extracted from the DOM tree information under the same domain saved in the DOM tree storage unit 122 (step S114). If the text element extracted in step S113 and the text element extracted in step S114 are the same text element (YES in step S115), the element number of the text element currently extracted in DOM tree information X is stored in WORK_S. (Step S116).

また，取り出したテキスト要素が同じでなければ（ステップＳ１１５のＮＯ），現在の表示対象のＤＯＭツリー情報Ｘの最後から順にテキスト要素を取り出す（ステップＳ１１７）。さらに，退避させていた同一ドメイン下のＤＯＭツリー情報Ｙの最後から順にテキスト要素を取り出す（ステップＳ１１８）。ステップＳ１１７で取り出したテキスト要素とステップＳ１１８で取り出したテキスト要素とが同じテキスト要素であれば（ステップＳ１１９のＹＥＳ），ＷＯＲＫ＿Ｅに，ＤＯＭツリー情報Ｘの現在取り出しているテキスト要素の要素番号を格納する（ステップＳ１２０）。 If the extracted text elements are not the same (NO in step S115), the text elements are extracted in order from the end of the currently displayed DOM tree information X (step S117). Further, the text elements are extracted in order from the end of the saved DOM tree information Y under the same domain (step S118). If the text element extracted in step S117 and the text element extracted in step S118 are the same text element (YES in step S119), the element number of the text element currently extracted in DOM tree information X is stored in WORK_E. (Step S120).

そして，取り出したテキスト要素が同じでない場合には（ステップＳ１１９のＮＯ），「ＷＯＲＫ＿Ｓに，現在のＤＯＭツリー情報Ｘの最後のテキスト要素の要素番号が格納され，かつ，ＷＯＲＫ＿Ｅに，現在のＤＯＭツリー情報Ｘの先頭のテキスト要素の要素番号が格納されている」ときは（図７：ステップＳ１２１のＹＥＳ），すべてのテキスト要素が重複していることになるので，ＷＯＲＫ＿Ｓ，ＷＯＲＫ＿Ｅの値をクリア，すなわち，値なしの状態にする（ステップＳ１２２）。 If the extracted text elements are not the same (NO in step S119), “the element number of the last text element of the current DOM tree information X is stored in WORK_S, and the current DOM tree is stored in WORK_E. When the element number of the first text element of information X is stored "(FIG. 7: YES in step S121), all text elements are duplicated, so the values of WORK_S and WORK_E are cleared. That is, a state without a value is set (step S122).

また，「ＷＯＲＫ＿Ｓに，現在のＤＯＭツリー情報Ｘの最後のテキスト要素の要素番号が格納され，かつ，ＷＯＲＫ＿Ｅに，現在のＤＯＭツリー情報Ｘの先頭のテキスト要素の要素番号が格納されている」のでなければ（ステップＳ１２１のＮＯ），ＷＯＲＫ＿Ｓの要素番号からＷＯＲＫ＿Ｅの要素番号までの範囲とＮＯＤＥ＿Ｓの要素番号からＮＯＤＥ＿Ｅの要素番号までの範囲とを比較（ステップＳ１２３）する。 Also, “the element number of the last text element of the current DOM tree information X is stored in WORK_S, and the element number of the first text element of the current DOM tree information X is stored in WORK_E”. If not (NO in step S121), the range from the element number of WORK_S to the element number of WORK_E is compared with the range from the element number of NODE_S to the element number of NODE_E (step S123).

If the range from the element number in WORK_S to the element number in WORK_E falls within the range from the element number in NODE_S to the element number in NODE_E (YES in step S123), the element number in WORK_S plus 1 is stored in NODE_S, and the element number in WORK_E minus 1 is stored in NODE_E (step S124).

そして，読み上げ範囲判定処理が終わると，読み上げ処理部１２３は，読み上げ範囲判定処理において設定された現在のＤＯＭツリー情報Ｘから順にテキスト要素を取り出し（図５：ステップＳ１２），処理フラグ＝ＯＮであれば（ステップＳ１３のＹＥＳ），取り出したテキスト要素のテキストを読み上げる（ステップＳ１４）。一方，処理フラグ＝ＯＮでなければ（ステップＳ１３のＮＯ），取り出したテキスト要素が，読み上げ処理開始のテキスト要素であるか否かを判定する（ステップＳ１５）。すなわち，取り出したテキスト要素の要素番号がＮＯＤＥ＿Ｓの番号であれば，取り出したテキスト要素が読み上げ処理開始のテキスト要素であると判定して（ステップＳ１５のＹＥＳ），処理フラグをＯＮに設定して（ステップＳ１６），そのテキスト要素の読み上げ処理を実行する（ステップＳ１４）。また，取り出したテキスト要素の要素番号がＮＯＤＥ＿Ｓの要素番号でなければ，取り出したテキスト要素は読み上げ処理開始のテキスト要素ではないと判定して（ステップＳ１５のＮＯ），ステップＳ１２の処理へ戻る。 When the reading range determination process is completed, the reading processing unit 123 sequentially extracts text elements from the current DOM tree information X set in the reading range determination process (FIG. 5: Step S12), and the processing flag = ON. If so (YES in step S13), the text of the extracted text element is read out (step S14). On the other hand, if the processing flag is not ON (NO in step S13), it is determined whether or not the extracted text element is a text element for starting the reading process (step S15). That is, if the element number of the extracted text element is NODE_S, it is determined that the extracted text element is a text element for starting the reading process (YES in step S15), and the processing flag is set to ON ( In step S16), the text element is read out (step S14). If the element number of the extracted text element is not the element number of NODE_S, it is determined that the extracted text element is not a text element for starting the reading process (NO in step S15), and the process returns to step S12.

その後，読み上げ処理しているテキスト要素からリンク情報を検出した場合に（ステップＳ１７のＹＥＳ），自動ジャンプ処理部１２５は，自動ジャンプ処理を実行する（ステップＳ１８）。自動ジャンプ処理は後述する。 Thereafter, when link information is detected from the text element being read out (YES in step S17), the automatic jump processing unit 125 executes automatic jump processing (step S18). The automatic jump process will be described later.

If no link information is detected in the text element being read aloud (NO in step S17), it is determined whether that text element is the one at which the reading process ends (step S19). That is, if the element number of the extracted text element equals the element number in NODE_E, the extracted text element is determined to be the text element at which the reading process ends (YES in step S19), and the reading process is terminated.

また，取り出したテキスト要素の要素番号がＮＯＤＥ＿Ｅの要素番号でなければ，取り出したテキスト要素は読み上げ処理終了のテキスト要素ではないと判定して（ステップＳ１９のＮＯ），ステップＳ１２の処理へ戻る。 If the element number of the extracted text element is not the element number of NODE_E, it is determined that the extracted text element is not a text element for which the reading process has been completed (NO in step S19), and the process returns to step S12.
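The loop of steps S10 to S19 can be sketched in Python as follows. The sketch assumes that each text element is a tuple of element number, text, and an optional link URL, and that speech output and automatic jump processing are supplied as callbacks; these representations and the names `read_aloud`, `speak`, and `on_link` are hypothetical. It also assumes that the end-of-range check is made after automatic jump processing returns, which the description above does not state explicitly.

```python
def read_aloud(elements, node_s, node_e, speak, on_link=None):
    """Speak the text elements numbered node_s through node_e, in document order.

    `elements` is a list of (element_number, text, link_url_or_None) tuples.
    """
    flag_on = False                          # S10: the processing flag starts OFF
    for number, text, link in elements:      # S12: take the next text element
        if not flag_on:                      # S13 NO
            if number != node_s:             # S15 NO: start element not reached yet
                continue
            flag_on = True                   # S16: start element reached, flag ON
        speak(text)                          # S14: read the element aloud
        if link is not None and on_link is not None:
            on_link(link)                    # S17 YES -> S18: automatic jump processing
        if number == node_e:                 # S19 YES: end element reached
            break


spoken, jumps = [], []
elements = [
    (0, "latest news", None),
    (1, "news headline HL1", "http://news.example.com/b.html"),
    (2, "Copyright (C)", None),
]
read_aloud(elements, 1, 1, spoken.append, jumps.append)
```

Here only the element numbered 1 lies in the range, so it alone is spoken and its link is the only one handed to the jump callback.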

図８に，図６のステップＳ１１１のＵＲＬリスト取得処理の処理フローを示す。 FIG. 8 shows a process flow of the URL list acquisition process in step S111 of FIG.

The reading range determination unit 124 stores the URL of the web page that is the current display processing target in URL_A (step S1110), and stores the substring of URL_A from "http://" to the next "/" in DOMAIN_A (step S1111).

そして，ＤＯＭ管理リストに比較処理が未処理のＤＯＭツリー情報が残っている間（ステップＳ１１１２のＹＥＳ），ステップＳ１１１３〜Ｓ１１１６のループ処理を行う（ステップＳ１１１７）。 Then, while DOM tree information that has not been subjected to comparison processing remains in the DOM management list (YES in step S1112), the loop processing of steps S1113 to S1116 is performed (step S1117).

まず，ＤＯＭツリー記憶部１２２に退避させたＤＯＭツリー情報のＵＲＬをＤＯＭツリー管理リストから順に取得して，ＵＲＬ＿Ｂに格納する（ステップＳ１１１３）。そして，ＵＲＬ＿Ｂの”ｈｔｔｐ：／／”から次の”／”までの部分文字列をＤＯＭＡＩＮ＿Ｂに格納する（ステップＳ１１１４）。ＤＯＭＡＩＮ＿ＡとＤＯＭＡＩＮ＿Ｂとを比較し（ステップＳ１１１５），ＤＯＭＡＩＮ＿ＡとＤＯＭＡＩＮ＿Ｂとが完全一致した場合には（ステップＳ１１１５のＹＥＳ），ＵＲＬ＿Ｂに格納しているＵＲＬをＵＲＬリストに格納する（ステップＳ１１１６）。 First, URLs of DOM tree information saved in the DOM tree storage unit 122 are sequentially acquired from the DOM tree management list and stored in URL_B (step S1113). Then, the partial character string from “http: //” to the next “/” of URL_B is stored in DOMAIN_B (step S1114). DOMAIN_A and DOMAIN_B are compared (step S1115), and if DOMAIN_A and DOMAIN_B completely match (YES in step S1115), the URL stored in URL_B is stored in the URL list (step S1116).

前記のループ処理によってＤＯＭ管理リストに比較処理を行うＤＯＭツリー情報がなくなれば（ステップＳ１１１７），処理を終了する。 If there is no DOM tree information to be compared in the DOM management list by the loop process (step S1117), the process ends.
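The URL list acquisition process of FIG. 8 can be sketched in Python as follows. The function names and example URLs are hypothetical, and URLs that do not begin with "http://" are simply treated as having no domain, since the description only covers that prefix.

```python
def extract_domain(url):
    """Return the substring between "http://" and the next "/" (S1111, S1114)."""
    prefix = "http://"
    if not url.startswith(prefix):
        return None                           # outside what the description covers
    rest = url[len(prefix):]
    return rest.split("/", 1)[0]


def same_domain_urls(current_url, managed_urls):
    """Collect the managed URLs whose domain exactly matches the current page's."""
    domain_a = extract_domain(current_url)    # DOMAIN_A
    url_list = []
    for url_b in managed_urls:                # loop of S1112 to S1117
        domain_b = extract_domain(url_b)      # DOMAIN_B
        if domain_a is not None and domain_a == domain_b:   # S1115: exact match
            url_list.append(url_b)            # S1116
    return url_list


managed = ["http://news.example.com/a.html", "http://other.example.org/x.html"]
url_list = same_domain_urls("http://news.example.com/b.html", managed)
```

Restricting the comparison to pages of the same domain keeps the number of DOM tree comparisons in the reading range determination process small.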

図９に，図５のステップＳ１８の自動ジャンプ処理の処理フローを示す。 FIG. 9 shows a process flow of the automatic jump process in step S18 of FIG.

The automatic jump processing unit 125 saves the URL and DOM tree information of the web page that is the current display processing target in a temporary storage area (step S180). When the browsing processing control unit 110 has acquired the web page set as the link destination (step S181), the automatic jump processing unit 125 acquires the DOM tree information of that linked web page through the reading processing control unit 120 (step S182). It then performs the reading range determination process (step S183), which is the same as the reading range determination process of step S11 (FIG. 5).

Next, it is determined whether the reading range determination process was able to extract, from the DOM tree information of the linked web page, text elements to serve as the range of the reading process (step S184). If such text elements were extracted (YES in step S184), the browsing processing control unit 110 displays the linked web page (step S185), and the reading processing unit 123 reads aloud the text elements determined to be the reading range (step S186). This reading process is the same as the reading process shown in FIG. 5.

一方，リンク先のウェブページのＤＯＭツリー情報と同一のＤＯＭツリー情報がＤＯＭツリー記憶部１２２に格納されていて，リンク先のウェブページのＤＯＭツリー情報から読み上げ処理の範囲とするテキスト要素が抽出できなかった場合には（ステップＳ１８４のＮＯ），処理を終了する。 On the other hand, the same DOM tree information as the DOM tree information of the linked web page is stored in the DOM tree storage unit 122, and the text element as the range of the reading process can be extracted from the DOM tree information of the linked web page. If not (NO in step S184), the process is terminated.
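The decision made in steps S181 to S186 can be sketched in Python as follows. Page acquisition, range determination, display, and speech output are passed in as callbacks so that the sketch stays self-contained; all of the names are hypothetical, and the saving of the current page in step S180 is omitted.

```python
def automatic_jump(link_url, fetch_text_elements, determine_range, display, speak):
    """Follow a link only if the linked page still has text elements to read.

    Returns True when the jump was carried out and False when it was skipped.
    """
    linked = fetch_text_elements(link_url)    # S181-S182: DOM tree of the linked page
    unread = determine_range(linked)          # S183: reading range determination
    if not unread:                            # S184 NO: nothing new to read
        return False
    display(link_url)                         # S185: display the linked page
    for text in unread:                       # S186: read the determined range aloud
        speak(text)
    return True
```

A call such as `automatic_jump(url, fetch, lambda els: [e for e in els if e not in already_read], show, speak)` would skip any linked page whose text elements have all been read before.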

[Second Embodiment]
The document browsing apparatus 1 in the second embodiment identifies the range of text elements to be read aloud using the DOM tree information of the web page, and determines whether automatic jump processing is needed for link information using hash data generated from the DOM tree information of the web page.

本例のドキュメント閲覧装置１において，図1の構成例に示す一時保管ドキュメント管理部１１５，読み上げフラグ管理部１１６，およびドキュメント一時記憶部１１７は，必須の構成要素ではない。 In the document browsing apparatus 1 of this example, the temporarily stored document management unit 115, the reading flag management unit 116, and the document temporary storage unit 117 shown in the configuration example of FIG. 1 are not essential components.

ドキュメント閲覧装置１の読み上げ範囲判定部１２４は，第1の実施例と同様の処理によって，読み上げ処理の範囲を特定する。そして，ＤＯＭツリー退避処理部１２１は，読み上げ処理部１２３によって読み上げ処理されたウェブページのＤＯＭツリー情報をハッシュデータ管理部１２６に渡す。 The reading range determination unit 124 of the document browsing apparatus 1 specifies the range of the reading processing by the same processing as in the first embodiment. The DOM tree save processing unit 121 then passes the DOM tree information of the web page read-out by the read-out processing unit 123 to the hash data management unit 126.

ハッシュデータ管理部１２６は，ＤＯＭツリー情報から所定のハッシュ関数を用いてハッシュデータ（ハッシュ文字列）を生成し，ハッシュ管理リストによって管理する。 The hash data management unit 126 generates hash data (hash character string) from the DOM tree information using a predetermined hash function, and manages the hash data (hash character string) using the hash management list.

The hash management list is a list that manages the correspondence between web pages and the hash data generated from their DOM tree information. FIG. 10 shows an example of the hash management list. The hash management list stores the storage location information (URL) of each web page together with the hash data (hash function value) generated from the DOM tree information of that web page.

When link information is detected in the text being read aloud, the automatic jump processing unit 125 acquires the DOM tree information of the web page set as the link destination, passes it to the hash data management unit 126, and obtains the hash data generated from the DOM tree information of the linked web page. It then compares the hash data of the linked web page with the hash data in the hash management list, and if hash data that completely matches the hash data of the link destination is found in the hash management list, it disables automatic jump processing for this link information.

By using hash data in this way to determine whether the linked web page has already been subjected to the reading process, the necessity of automatic jump processing can be determined at higher speed.
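The hash-based check can be sketched in Python as follows. The specification only refers to a predetermined hash function, so the use of SHA-256 from the standard `hashlib` module is an assumption, as are the flat list of text strings standing in for DOM tree information and the names `dom_hash`, `register_read_page`, and `jump_needed`.

```python
import hashlib


def dom_hash(text_elements):
    """Hash a page's text elements (SHA-256 is an assumed choice of hash function)."""
    h = hashlib.sha256()
    for text in text_elements:
        data = text.encode("utf-8")
        # Length-prefix each element so that element boundaries affect the digest.
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.hexdigest()


hash_management_list = []     # rows of (URL, hash data)


def register_read_page(url, text_elements):
    """Record the hash data of a page that has been read aloud."""
    hash_management_list.append((url, dom_hash(text_elements)))


def jump_needed(linked_text_elements):
    """Return False when the linked page's hash matches an already-read page."""
    hash_a = dom_hash(linked_text_elements)                 # HASH_A
    return all(hash_a != hash_b for _, hash_b in hash_management_list)
```

Comparing one fixed-length string per stored page replaces an element-by-element comparison of DOM tree information, which is where the speed-up described above would come from.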

図１１に，第２の実施例における読み上げ処理の処理フローを示す。 FIG. 11 shows a processing flow of the reading process in the second embodiment.

In the processing flow shown in FIG. 11, a processing step that bears the same number as a processing step in the reading processing flow of the first embodiment shown in FIG. 5 performs the same processing as that step in FIG. 5.

図１１の処理フローにおいて，自動ジャンプ処理部１２５は，図５に示すステップＳ１８の処理の代りに，別の自動ジャンプ処理を行う（ステップＳ２０）。 In the processing flow of FIG. 11, the automatic jump processing unit 125 performs another automatic jump processing instead of the processing of step S18 shown in FIG. 5 (step S20).

After the processing of step S19, the hash data management unit 126 generates hash data from the DOM tree information that has been read aloud, and stores the URL of the corresponding web page together with the hash data generated from its DOM tree information in the hash management list (step S21).

図１２に，図１１のステップＳ２０の自動ジャンプ処理の処理フローを示す。 FIG. 12 shows a process flow of the automatic jump process in step S20 of FIG.

自動ジャンプ処理部１２５は，現在の表示処理対象のウェブページのＵＲＬとＤＯＭツリー情報を一時的記憶域に退避させて（ステップＳ２００），リンク先に設定されたウェブページのＤＯＭツリー情報を取得し（ステップＳ２０１），取得したＤＯＭツリー情報からハッシュデータを生成し（ステップＳ２０２），生成したハッシュデータをＨＡＳＨ＿Ａに格納する（ステップＳ２０３）。 The automatic jump processing unit 125 saves the URL and DOM tree information of the current display target web page in the temporary storage area (step S200), and acquires the DOM tree information of the web page set as the link destination. (Step S201), hash data is generated from the acquired DOM tree information (Step S202), and the generated hash data is stored in HASH_A (Step S203).

そして，ハッシュ管理リストに比較処理を行うハッシュデータが残っている間（ステップＳ２０４のＹＥＳ），ステップＳ２０５〜Ｓ２０６のループ処理を行う（ステップＳ２０７）。 Then, while the hash data to be compared remains in the hash management list (YES in step S204), the loop processing in steps S205 to S206 is performed (step S207).

First, hash data is taken from the hash management list in order and stored in HASH_B (step S205). The hash data (hash character strings) stored in HASH_A and HASH_B are compared (step S206). If HASH_A and HASH_B do not completely match and no hash data remains in the hash management list (NO in step S206, step S207), the browsing processing control unit 110 displays the linked web page (step S208), and the text of that web page is read aloud (step S209). This reading process is the same as the reading process shown in FIG. 11.

また，ＨＡＳＨ＿ＡとＨＡＳＨ＿Ｂとが完全に一致した場合には（ステップＳ２０６のＹＥＳ），リンク先のウェブページが既に読み上げ処理されているので，処理を終了する。 If HASH_A and HASH_B are completely matched (YES in step S206), the linked web page has already been read out, and the process is terminated.

[Third Embodiment]
The document browsing apparatus 1 in the third embodiment uses the DOM tree information of the web page to determine the range of text elements to be read aloud and whether automatic jump processing is needed for link information.

本例のドキュメント閲覧装置１では，図1の構成例に示すハッシュデータ管理部１２６は，必須の構成要素ではない。 In the document browsing apparatus 1 of this example, the hash data management unit 126 shown in the configuration example of FIG. 1 is not an essential component.

When the communication processing unit 111 acquires a web page to be displayed during document display processing by the browsing processing unit 110, the temporary storage document management unit 115 of the document browsing apparatus 1 temporarily stores that web page in the document temporary storage unit 117.

そして，読み上げフラグ管理部１１６は，ドキュメント一時記憶部１１７に格納されたウェブページのテキストが読み上げ処理の対象となっている場合に，そのウェブページの読み上げフラグに読み上げ処理済みを示す値を設定し，ドキュメント一時記憶部１１７にキャッシュされたウェブページのＵＲＬおよび読み上げフラグをキャッシュ管理リストを用いて管理する。 Then, when the text of the web page stored in the document temporary storage unit 117 is the target of the reading process, the reading flag management unit 116 sets a value indicating that the reading process has been completed to the reading flag of the web page. The URL and the reading flag of the web page cached in the document temporary storage unit 117 are managed using the cache management list.

図１３に，キャッシュ管理リストの例を示す。キャッシュ管理リストは，ウェブページの格納場所情報（ＵＲＬ），そのウェブページのドキュメント一時記憶部１１７におけるキャッシュファイル名，読み上げフラグなどを管理するリストである。 FIG. 13 shows an example of the cache management list. The cache management list is a list for managing web page storage location information (URL), a cache file name in the document temporary storage unit 117 of the web page, a reading flag, and the like.

読み上げフラグは，読み上げ処理されたウェブページにＯＮが，読み上げ処理されていないウェブページにＯＦＦが設定される。 The reading flag is set to ON for a web page that has been read out, and OFF for a web page that has not been read out.

そして，読み上げ処理の範囲からリンク情報を検出した場合には，自動ジャンプ処理部１２５は，リンク情報のリンク先として設定されたＵＲＬを取得し，キャッシュ管理リストを参照して，リンク先のＵＲＬと完全に一致するＵＲＬであって，かつ，読み上げフラグがＯＮであるウェブページを検索できたときに，このリンク情報での自動ジャンプ処理を無効化する。 When link information is detected from the range of the reading process, the automatic jump processing unit 125 acquires the URL set as the link destination of the link information, refers to the cache management list, and determines the link destination URL. When it is possible to search for a web page whose URL matches completely and whose reading flag is ON, the automatic jump processing with this link information is invalidated.

By using, in this way, the reading flag assigned to web pages temporarily stored for display processing to determine whether the linked web page has already been subjected to the reading process, the necessity of automatic jump processing can be determined at higher speed.
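The check against the cache management list can be sketched in Python as follows. A dictionary keyed by URL stands in for the list of FIG. 13, and the field names, function names, and cache file names are hypothetical. The rule implemented is the one stated above: automatic jump processing is disabled only when a completely matching URL is found and its reading flag is ON.

```python
cache_management_list = {}    # URL -> {"cache_file": str, "read": bool}


def cache_page(url, cache_file):
    """Register a page stored in the temporary storage, with the reading flag OFF."""
    cache_management_list.setdefault(url, {"cache_file": cache_file, "read": False})


def mark_read(url, cache_file):
    """Set the reading flag to ON once the page has been read aloud."""
    entry = cache_management_list.setdefault(
        url, {"cache_file": cache_file, "read": False})
    entry["read"] = True


def jump_disabled(link_url):
    """True when an exactly matching URL is cached and its reading flag is ON."""
    entry = cache_management_list.get(link_url)
    return entry is not None and entry["read"]
```

Because the decision needs only the link destination URL, it can be made before the linked web page is downloaded or its DOM tree information is generated.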

図１４に，第３の実施例における読み上げ処理の処理フローを示す。 FIG. 14 shows a processing flow of the reading process in the third embodiment.

In the processing flow shown in FIG. 14, a processing step that bears the same number as a processing step in the reading processing flow of the first embodiment shown in FIG. 5 performs the same processing as that step in FIG. 5.

図１４の処理フローにおいて，自動ジャンプ処理部１２５は，図５に示すステップＳ１８の処理の代りに，別の自動ジャンプ処理を行う（ステップＳ３０）。 In the processing flow of FIG. 14, the automatic jump processing unit 125 performs another automatic jump processing instead of the processing of step S18 shown in FIG. 5 (step S30).

また，ステップＳ１９の処理後に，読み上げフラグ管理部１１６は，表示処理されたウェブページが読み上げ処理された場合に，ドキュメント一時記憶部１１７にキャッシングされるウェブページのＵＲＬをキャッシュ管理リストに追加し，読み上げフラグにＯＮを設定する（ステップＳ３１）。 Further, after the processing in step S19, the reading flag management unit 116 adds the URL of the web page cached in the document temporary storage unit 117 to the cache management list when the displayed web page is read out. The reading flag is set to ON (step S31).

図１５に，図１４のステップＳ３０の自動ジャンプ処理の処理フローを示す。 FIG. 15 shows a process flow of the automatic jump process in step S30 of FIG.

The automatic jump processing unit 125 saves the URL and DOM tree information of the web page that is the current display processing target in a temporary storage area (step S300), acquires the URL of the web page set as the link destination from the cache management list (step S301), and determines whether the link destination URL is in the cache management list (step S302). If the cache management list contains no URL that completely matches the link destination URL (NO in step S302), it further determines whether the reading flag of the corresponding URL in the cache management list is ON (step S303). If the reading flag of the link destination URL is not ON (NO in step S303), it acquires the DOM tree information of the linked web page (step S304), the browsing processing control unit 110 displays the linked web page (step S305), and the text of that web page is read aloud (step S306). This reading process is the same as the reading process shown in FIG. 14. If the cache management list contains a URL that completely matches the link destination URL (YES in step S302), or if the reading flag of the corresponding URL in the cache management list is ON (YES in step S303), the processing ends.

The present invention has been described above with reference to its embodiments, but it can naturally be modified in various ways within the scope of its gist. The present invention can also be implemented as a processing program that is read and executed by a computer. A program that implements the present invention can be stored on a suitable computer-readable recording medium such as a portable medium memory, a semiconductor memory, or a hard disk, and is provided either recorded on such a recording medium or by transmission and reception over various communication networks via a communication interface.

Claims

An in-document text-to-speech processing program that causes a computer to execute, as text-to-speech processing for a structured document described in a markup language:
a page configuration information storage process of storing, in page configuration information storage means, page configuration information of a document whose text elements have been read out;
a page configuration information acquisition process of acquiring page configuration information of the document currently being displayed;
a process of comparing the page configuration information of the document being displayed with the page configuration information stored in the page configuration information storage means, identifying, from the page configuration information of the document being displayed, the text range whose text elements do not match the text elements of the stored page configuration information, and setting the text elements in that range as the target of the read-out process; and
a read-out process of converting the text elements set as the read-out target in the document being displayed into voice data and outputting the voice data.
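
As an illustration only, the comparison step in this claim can be sketched as a set difference over text elements. The `TextElement` type, its `path` and `text` fields, and the rule that two elements match when both fields are equal are assumptions made for this sketch; the claim itself does not specify how page configuration information is represented or matched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    # Hypothetical representation: where the element sits in the page
    # structure and the text it carries.
    path: str
    text: str


def select_unread(current_page, stored_pages):
    """Return the elements of current_page that match no stored element.

    stored_pages stands in for the page configuration information
    storage means: a list of pages that were already read out.
    """
    already_read = {element for page in stored_pages for element in page}
    return [element for element in current_page if element not in already_read]


# A shared menu item appears on both pages; only the new article differs.
menu = TextElement("body/ul/li[1]", "Home")
stored = [[menu, TextElement("body/p[1]", "Article A")]]
current = [menu, TextElement("body/p[1]", "Article B")]

to_read = select_unread(current, stored)
print([element.text for element in to_read])
```

In this sketch the repeated menu text is skipped and only the new article would be passed on to speech synthesis.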

The in-document text-to-speech processing program according to claim 1, further causing the computer to execute:
an automatic jump process of, when link information is detected in a text element during the read-out process, acquiring the page configuration information of the link-destination document set in the link information, comparing the link-destination page configuration information with the page configuration information stored in the page configuration information storage means, and suppressing the automatic jump based on the link information when the link-destination page configuration information completely matches page configuration information of a document that has been read out.

The in-document text-to-speech processing program according to claim 1, further causing the computer to execute:
a hash data management process of calculating a hash function value from the page configuration information of a document using a predetermined hash function, and storing, in hash information storage means, the hash function value calculated from the page configuration information of a document that has been read out; and
an automatic jump process of, when link information is detected in a text element during the read-out process, acquiring the hash function value of the page configuration information of the link-destination document set in the link information, comparing it with the hash function values stored in the hash information storage means, and suppressing the automatic jump based on the link information when the hash function value of the link-destination document completely matches a stored hash function value.
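
A minimal sketch of the hash-based check follows, under stated assumptions: the claim only requires "a predetermined hash function", so SHA-256 is chosen here purely for illustration, the page configuration information is assumed to be available as a string, and the `HashStore` class and its method names are hypothetical.

```python
import hashlib


def page_hash(page_config: str) -> str:
    # SHA-256 is an assumption; the claim does not name the hash function.
    return hashlib.sha256(page_config.encode("utf-8")).hexdigest()


class HashStore:
    """Hypothetical stand-in for the hash information storage means."""

    def __init__(self):
        self._hashes = set()

    def record_read(self, page_config: str) -> None:
        # Store the hash of a page whose text has been read out.
        self._hashes.add(page_hash(page_config))

    def should_suppress_jump(self, link_page_config: str) -> bool:
        # Suppress the jump only when the hashes match completely.
        return page_hash(link_page_config) in self._hashes


store = HashStore()
store.record_read("<html><body><p>Article A</p></body></html>")

same = store.should_suppress_jump("<html><body><p>Article A</p></body></html>")
different = store.should_suppress_jump("<html><body><p>Article B</p></body></html>")
print(same, different)
```

Storing only the hash values keeps the stored record small compared with keeping the full page configuration information, at the cost of being able to detect complete matches only.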

The in-document text-to-speech processing program according to claim 1, further causing the computer to execute:
a document temporary storage process of temporarily storing a displayed document in document temporary storage means;
a reading flag management process of setting a reading flag for a document stored in the document temporary storage means when that document has been read out; and
an automatic jump process of, when link information is detected in a text element during the read-out process, suppressing the automatic jump based on the link information when the link-destination document set in the link information is stored in the document temporary storage means and the reading flag is set for that stored document.
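
The flag-based check in this claim might look like the sketch below. It is not the patented implementation: the `DocumentCache` and `CachedDocument` names, the use of the URL as the lookup key, and the example URLs are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class CachedDocument:
    dom: str                 # cached document content (simplified to a string)
    read_flag: bool = False  # set once the document has been read out


class DocumentCache:
    """Hypothetical stand-in for the document temporary storage means."""

    def __init__(self):
        self._entries = {}

    def store(self, url: str, dom: str) -> None:
        # Cache a displayed document; an existing reading flag is kept.
        entry = self._entries.get(url)
        if entry is None:
            self._entries[url] = CachedDocument(dom)
        else:
            entry.dom = dom

    def mark_read(self, url: str) -> None:
        # Reading flag management: flag a cached document as read out.
        if url in self._entries:
            self._entries[url].read_flag = True

    def suppress_jump(self, link_url: str) -> bool:
        # Suppress only when the target is cached AND already read out.
        entry = self._entries.get(link_url)
        return entry is not None and entry.read_flag


cache = DocumentCache()
cache.store("http://example.com/a", "<p>Article A</p>")
cache.store("http://example.com/b", "<p>Article B</p>")
cache.mark_read("http://example.com/a")

decisions = [
    cache.suppress_jump("http://example.com/a"),  # cached and read
    cache.suppress_jump("http://example.com/b"),  # cached, not yet read
    cache.suppress_jump("http://example.com/c"),  # never cached
]
print(decisions)
```

A page that is cached but not yet read out is still followed, so a document that was merely displayed does not block a later automatic jump to it.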

A document browsing apparatus that displays a structured document described in a markup language, comprising:
page configuration information storage means for storing page configuration information of a document whose text elements have been read out;
page configuration information storage processing means for storing, in the page configuration information storage means, the page configuration information of a document that has been displayed and read out;
page configuration information acquisition means for acquiring page configuration information of the document currently being displayed;
means for comparing the page configuration information of the document being displayed with the page configuration information stored in the page configuration information storage means, identifying, from the page configuration information of the document being displayed, the text range whose text elements do not match the text elements of the stored page configuration information, and setting the text elements in that range as the read-out target; and
read-out processing means for converting the text elements set as the read-out target in the document being displayed into voice data and outputting the voice data.