JP4227294B2

JP4227294B2 - Image processing apparatus and method, and storage medium

Info

Publication number: JP4227294B2
Application number: JP2000277245A
Authority: JP
Inventors: 康弘日野
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2000-09-12
Filing date: 2000-09-12
Publication date: 2009-02-18
Anticipated expiration: 2020-09-12
Also published as: JP2002091946A

Description

【０００１】
【発明の属する技術分野】
本発明は、構造化記述言語で記述された文書データを物理ページにレイアウトする画像処理装置及びその方法、及び画像処理システムに関する。
【０００２】
【従来の技術】
パーソナルコンピュータ及びインターネットの急速な普及により、業種を問わず文書の電子化が促進されつつある。一般に、電子化文書のデータ形式は文書を編集するアプリケーションに依存するため、電子化された文書を閲覧するためには該当文書のデータ形式をサポートするアプリケーションを用意する必要があった。
【０００３】
そこで、HTML(Hyper Text Markup Language)やXML(eXtensible Markup Language)といった、特定のアプリケーションに依存しない構造化記述言語によって記述された文書が普及しつつある。一般に構造化記述言語は、ブラウザによってディスプレイ上に表示することを前提に設計されており、表示する画面の幅や高さを任意に変えたり、画面をスクロールさせたりすることが可能であるため、「ページ」という概念は必要ない。
【０００４】
しかしながら、構造化記述言語をページ記述言語として利用する場合には、ページへの割り付けを行う必要がある。そこで、近年ページレイアウト可能な構造化記述言語が開発されつつあり、出版用の組版ルール等を用いることによって、ページ概念の無い構造化記述言語をページレイアウト可能な構造化記述言語に変換することも可能となった。このように、ページ概念の無い構造化記述言語をページレイアウト可能な構造化記述言語に変換する工程は、「フォーマッティング」と称される。
【０００５】
図９に、ページ概念のない構造化記述言語で記述された文書のブラウザによる表示例を示す。一般的なブラウザによれば同図に示すように、表示画面の大きさを変えたり、スクロールバーを使って画面を縦横スクロールさせることができる。
【０００６】
これに対し、図１０にフォーマッティング後の構造化記述言語で記述された文書の表示例を示す。同図によれば、フォーマッティング後の文書はページ単位に表示または印刷することができる。
【０００７】
【発明が解決しようとする課題】
しかしながら、上記従来のフォーマッティングにおいては、構造化記述言語によって記述された文書データの物理ページへの割りつけは、固定倍率によるものであった。従って、ページの内容によっては見えにくくなってしまったり、ユーザ好みのレイアウトが得られるとは限らないといった問題が発生していた。
【０００８】
本発明は上述した問題を解決するために成されたものであり、構造化記述言語によって記述された文書データを物理ページへレイアウトする際に、自由度の高いレイアウトを実現する画像処理装置及びその方法、及び画像処理システムを提供することを目的とする。
【０００９】
【課題を解決するための手段】
上記目的を達成するための一手段として、本発明の画像処理装置は以下の構成を備える。
【００１０】
構造化記述言語で記述された文書データを物理ページにレイアウトする画像処理装置であって、
ページの定義されていない構造化記述言語で記述され、特定データとして文字データと図形データとが含まれる文書データを、物理ページにレイアウトする画像処理装置であって、
前記文書データに含まれる複数の特定データのうち、前記図形データに関して、その幅を示すタグを解析することで、最大の幅を有する図形データを抽出するとともに、該抽出された図形データが、物理ページの印字可能領域の幅に収まるようにレイアウトする場合の変倍率を、物理ページにおける印字可能領域の幅を示す情報と、印刷する際の解像度を示す情報と、前記最大の幅を有する図形データの幅を示す情報とを用いて決定する決定手段と、
前記文書データに含まれる複数の特定データのうち、前記文字データについては、その文字サイズを示すタグを順次抽出し、前記決定手段が決定した変倍率をかけあわせることで、該文字サイズを変更したうえで印刷処理のための画像処理を実行し、前記図形データについては、その幅と高さを示すタグを順次抽出し、前記決定手段が決定した変倍率をかけあわせることで、該幅と高さを変更したうえで印刷処理のための画像処理を実行する画像処理手段と、を有することを特徴とする。
【００１１】
上記画像処理装置において、例えば、前記構造化記述言語はＨＴＭＬであることを特徴とする。
【００１２】
上記画像処理装置において、例えば、前記構造化記述言語はＸＭＬであることを特徴とする。
【００１３】
上記画像処理装置は、更に、前記画像処理手段によって画像処理された文書データを描画する描画手段を有することを特徴とする。
【００１４】
上記画像処理装置は、更に、前記描画手段によって描画されたデータを記録媒体上に可視像化する画像形成手段を有することを特徴とする。
【００１５】
上記目的を達成するための一手段として、本発明の画像処理方法は以下の構成を備える。即ち、
ページの定義されていない構造化記述言語で記述され、特定データとして文字データと図形データとが含まれる文書データを、物理ページにレイアウトする画像処理方法であって、
前記文書データに含まれる複数の特定データのうち、前記図形データに関して、その幅を示すタグを解析することで、最大の幅を有する図形データを抽出するとともに、該抽出された図形データが、物理ページの印字可能領域の幅に収まるようにレイアウトする場合の変倍率を、物理ページにおける印字可能領域の幅を示す情報と、印刷する際の解像度を示す情報と、前記最大の幅を有する図形データの幅を示す情報とを用いて決定する決定ステップと、
前記文書データに含まれる複数の特定データのうち、前記文字データについては、その文字サイズを示すタグを順次抽出し、前記決定ステップが決定した変倍率をかけあわせることで、該文字サイズを変更したうえで印刷処理のための画像処理を実行し、前記図形データについては、その幅と高さを示すタグを順次抽出し、前記決定手段が決定した変倍率をかけあわせることで、該幅と高さを変更したうえで印刷処理のための画像処理を実行する画像処理ステップと、を有することを特徴とする。
【００１６】
上記画像処理方法において、例えば、前記構造化記述言語はＨＴＭＬであることを特徴とする。
【００１７】
上記画像処理方法において、例えば、前記構造化記述言語はＸＭＬであることを特徴とする。
【００１８】
上記画像処理方法は、更に、前記画像処理ステップによって画像処理された文書データを描画する描画ステップを有することを特徴とする。
【００１９】
上記画像処理方法は、更に、前記描画ステップによって描画されたデータを記録媒体上に可視像化する画像形成ステップを有することを特徴とする。
【００２０】
また、上記目的を達成するための一手段として、本発明の記憶媒体は、上記画像処理方法をコンピュータによって実現させるための制御プログラムを格納したことを特徴とする。
【００２１】
【発明の実施の形態】
以下、本発明に係る一実施形態について、図面を参照して詳細に説明する。
【００２２】
＜第１実施形態＞
まず、本実施形態を適用するに好適なレーザービームプリンタ（以下、単に「プリンタ」と記述する）の構成について、図１を参照して説明する。尚、本実施形態が適用されるプリンタはレーザービーム方式に限られるものではなく、他のプリント方式のプリンタでも良いことは言うまでもない。
【００２３】
図１は、本実施形態が適用されるプリンタの内部構造を示す断面図であり、不図示のデータ源からの文字パターンの登録や定型書式（フォームデータ）などの登録が行える。同図において、１０００はプリンタ本体であり、外部に接続されているホストコンピュータから供給される文字情報（文字コード）やフォーム情報あるいはマクロ命令などを入力して記憶するとともに、それらの情報に従って対応する文字パターンやフォームパターンなどを作成し、記録媒体である記録紙上に像を形成する。１０１２は操作のためのスイッチおよびＬＥＤ表示器などが配されている操作パネル、１００１はプリンタ１０００全体の制御およびホストコンピュータから供給される文字情報などを解析するプリンタ制御ユニットである。この制御ユニット１００１は、主に文字情報を対応する文字パターンのビデオ信号に変換してレーザドライバ１００２に出力する。レーザドライバ１００２は半導体レーザ１００３を駆動するための回路であり、入力されたビデオ信号に応じて半導体レーザ１００３から発射されるレーザ光１００４をオンオフ切り替えする。
【００２４】
レーザ１００４は回転多面鏡１００５で左右方向に振られ静電ドラム１００６上を走査する。これにより、静電ドラム１００６上には文字パターンの静電潜像が形成される。この潜像は、静電ドラム１００６周囲の現像ユニット１００７により現像された後、記録紙に転送される。この記録紙にはカットシートを用い、カットシート記録紙はプリンタ１０００に装着した用紙カセット１００８に収納され、給紙ローラ１００９および搬送ローラ１０１０と１０１１とにより装置内に取り込まれて、静電ドラム１００６に供給される。
【００２５】
図２は、本実施形態におけるプリンタ制御システムの構成を説明するブロック図である。ここでは、図１に示したプリンタ１０００を制御する例について説明する。尚、本発明の機能が実行されるのであれば、単体の機器であっても、複数の機器からなるシステムであっても、ＬＡＮ等のネットワークを介して処理が行われるシステムであっても、本発明を適用できることは言うまでもない。
【００２６】
図２において、３０００はホストコンピュータであり、ＲＯＭ３内のプログラム用ＲＯＭに記憶された文書処理プログラム等に基づいて文書データの取り出しやデータ変換を実行するＣＰＵ１を備え、システムバス４に接続される各デバイスをＣＰＵ１が総括的に制御する。
【００２７】
ＲＯＭ３内のプログラム用ＲＯＭには、ＣＰＵ１の制御プログラム等が記憶され、同じくフォント用ＲＯＭには上記データ変換処理の際に使用するフォントデ−タ等が記憶され、同じくデ−タ用ＲＯＭは上記データ変換処理等を行う際に使用する各種デ−タが記憶される。
【００２８】
２はＲＡＭであり、ＣＰＵ１の主メモリやワークエリア等として機能する。５はキーボードコントローラ（ＫＢＣ）であり、キーボード９や不図示のポインティングデバイスからのキー入力を制御する。６はＣＲＴコントローラ（ＣＲＴＣ）であり、ＣＲＴディスプレイ（ＣＲＴ）１０の表示を制御する。尚、本実施形態においてキーボード９やＣＲＴディスプレイ１０は必ずしも必要ではないが、ホストコンピュータのメンテナンスや動作状況の確認のために、通常装備されている。
【００２９】
７はメモリコントローラ（ＭＣ）であり、ブートプログラム，種々のアプリケーション，フォントデータ，ユーザファイル，編集ファイル等を記憶するハードディスク（ＨＤ）、フロッピーディスク（ＦＤ）等の外部メモリ１１とのアクセスを制御する。８はネットワークコントローラ（ＮＴＣ）であり、所定の双方向性インタフェース（インタフェース）２１を介してプリンタ１０００に接続され、プリンタ１０００との通信制御処理を実行する。
【００３０】
尚、ＣＰＵ１は、例えばメモリコントローラ７を制御することによって外部メモリ１１に記憶された文書データを取り出したり、ネットワークコントローラ８を制御することによって文書データを外部へ転送することができる。
【００３１】
プリンタ１０００において、１２はプリンタＣＰＵであり、ＲＯＭ１３内のプログラム用ＲＯＭに記憶された制御プログラム或いは外部メモリ１４に記憶された制御プログラム等に基づいて、システムバス１５に接続される各種のデバイスとのアクセスを総括的に制御し、印刷部インタフェース１６を介して接続される印刷部（プリンタエンジン）１７に、出力情報としての画像信号を出力する。
【００３２】
ＲＯＭ１３内のプログラムＲＯＭには、後述するフローチャートで示されるようなＣＰＵ１２の制御プログラムを記憶しても良い。同じくフォント用ＲＯＭには上記出力情報を生成する際に使用するフォントデ−タ等を記憶し、同じくデ−タ用ＲＯＭは、プリンタ１０００がハードディスク等の外部メモリ１４を備えない場合には、ホストコンピュ−タ３０００上で利用される情報等を記憶している。
【００３３】
ＣＰＵ１２は、入力部１８を介してホストコンピュータ３０００との通信処理を行うことによって、プリンタ１０００内の情報等をホストコンピュータ３０００に通知することが可能である。１９はＣＰＵ１２の主メモリ、ワークエリア等として機能するＲＡＭであり、図示しない増設ポートに接続されるオプションＲＡＭによってメモリ容量を拡張することができる。尚、ＲＡＭ１９は出力情報展開領域、環境データ格納領域、ＮＶＲＡＭ等に用いられる。
【００３４】
上述したハードディスク（ＨＤ）、ＩＣカ−ド等の外部メモリ１４は、メモリコントローラ（ＭＣ）２０によりアクセスが制御される。外部メモリ１４はオプションとして接続され、文書データ、フォントデ−タ、フォ−ムデ−タ等を記憶する。１８は操作パネルであり、プリンタ１０００に対する各種操作のためのスイッチおよびＬＥＤ表示器等が配されている。
【００３５】
尚、本実施形態における外部メモリ１４は１個に限らず、少なくとも１個以上備え、内蔵フォントに加えてオプションフォントカード，言語系の異なるプリンタ制御言語を解釈するプログラムを格納した外部メモリを複数接続しても良い。更に不図示のＮＶＲＡＭを有し、操作パネル１０１２からのプリンタモード設定情報を記憶するようにしても良い。
【００３６】
図３は、本実施形態におけるプリンタ１０００の機能構成を示すブロック図である。同図によればプリンタ１０００は、大きく分けてフォーマッタ制御部１１００、プリンタインタフェース１２００、出力制御部１３００、プリンタエンジン部１４００によって構成されている。
【００３７】
プリンタインターフェイス１２００は、フォーマッタ制御部１１００と外部とにおけるデータの入出力を制御するための手段である。
【００３８】
フォーマッタ制御部１１００は、プロトコル制御部１１０１、文書データ解析部１１０２、データ描画部１１０３、ページメモリ１１０４、物理レイアウト処理部１１０５により構成されている。一般的にフォーマッタ制御部１１００内のこれらの機能構成は、ＣＰＵやＲＯＭ，ＲＡＭ等を備えたコンピュータシステムによって実現される。
【００３９】
プロトコル制御部１１０１は、ネットワークプロトコルを解析・送信することによって外部との通信を行なう手段であり、例えばプロトコルとして HTTP(Hyper Text Transfer Protocol)を使用した場合、URL(Uniform Resource Locator)によって指示された文書の取得や、Webサーバへの情報の送信を行なう。
【００４０】
文書データ解析部１１０２は、構造化記述言語で記述された文書データを解析し、より処理しやすい形式の中間コードに変換する。文書データ解析部１１０２において生成された中間コードは、データ描画部１１０３に渡されてビットマップデータに展開された後、ページメモリ１１０４に逐次描画される。
【００４１】
物理レイアウト処理部１１０５においては、構造化記述言語によって記述された文書データの物理ページへの割り付け、即ちフォーマッティング処理を行なう。具体的には、文書データ解析部１１０２において解析された文書データがHTMLのようなページの概念がない文書データであった場合に、物理ページへの割り付けが行われたデータを作成する。
【００４２】
出力制御部１３００は、ページメモリ１１０４の内容をビデオ信号に変換し、プリンタエンジン部１４００への画像転送を行なう。プリンタエンジン部１４００は、受け取ったビデオ信号を記録紙上に永久可視画像形成するための印刷機構部である。
【００４３】
次に、上述した構成からなる本実施形態の印刷システムにおける印刷制御手順を、図４乃至図６に示すフローチャートを参照して説明する。
【００４４】
図４は、プリンタ１０００の動作開始から終了までのメイン処理を示すフローチャートである。まずステップＳ４０１で、プリンタインターフェイス１２００を介してネットワークからのデータを受け取る。次にステップＳ４０２でプロトコルの解析を行なった後、ステップＳ４０３においてフォーマッティング処理、即ち物理ページへの割り付けを行ない、ステップＳ４０４に進んで描画処理を行なう。その後、ステップＳ４０５で文書データが終了したか否かを判断し、終了であれば印刷動作を終了する。一方、文書データ終了でなければ、ステップＳ４０１に戻って上記処理を繰り返す。
【００４５】
図５は、図４のステップＳ４０３に示したフォーマッティング処理、即ち物理レイアウト処理部１１０５における物理ページへのレイアウト処理の詳細を示すフローチャートである。
【００４６】
まず、ステップＳ５０１で文書データを先頭から検索し、文字データを検出した時点でその文字サイズを算出し、ステップＳ５０２で該算出した文字サイズを保存する。そして、ステップＳ５０３で全ての文書データの検索が終了したか否かを判定し、終了するまでステップＳ５０１〜Ｓ５０２の処理を繰り返すことにより、文書中に使用されている文字サイズのリストが完成する。
【００４７】
ステップＳ５０３で全ての文書データの検索が終了した場合にはステップＳ５０４に進み、保存された文字サイズのリストから、最小の文字サイズminを検出する。そしてステップＳ５０５において、検出された最小文字サイズminを８ポイントと見立てた場合の拡大率を算出する。この拡大率が物理レイアウトの基準となるため、ステップＳ５０６ではこの拡大率を用いて、文書の先頭から順次、物理ページへの割り付けを行なう。
【００４８】
そしてステップＳ５０７において、全ての文書データに対して物理ページへの割り付けが終了したか否かを判断し、終了するまでステップＳ５０６の割り付け処理を繰り返す。
【００４９】
ここで、上記図５に示すフォーマッティング処理の具体例を示す。ステップＳ５０１〜Ｓ５０３によって、文書中に使用されている文字サイズとして例えば、６，８，１０ポイントがリストされた場合、ステップＳ５０４では最小文字サイズminとして６ポイントが検出される。するとステップＳ５０５において、これを８ポイントとするための拡大率が８／６倍として算出される。従って、リストされた他の８，１０ポイントの文字サイズはそれぞれ、８×（８／６），１０×（８／６）ポイントで物理ページに割り付けられる。尚、文字のみでなくテーブルやボーダーライン等、他の図形（オブジェクト）についてもこの拡大率（上記例では８／６）を掛け合わせることによって、物理ページへ割り付けられる。
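The worked example above (sizes of 6, 8, and 10 points, with the smallest size mapped to 8 points) can be sketched in Python as follows. This only illustrates the arithmetic of steps S504 to S506. The names `enlargement_rate` and `scale_sizes`, the use of exact fractions, and the choice of a rate of 1 for a document without text are assumptions made for this sketch, not part of the described apparatus.

```python
from fractions import Fraction

TARGET_MIN_PT = 8  # the embodiment treats the smallest size as 8 points


def enlargement_rate(font_sizes_pt, target_pt=TARGET_MIN_PT):
    """Return target_pt / min(font_sizes_pt) as an exact fraction (S504, S505)."""
    if not font_sizes_pt:
        return Fraction(1)  # assumption: no text in the document, so no scaling
    return Fraction(target_pt) / Fraction(min(font_sizes_pt))


def scale_sizes(font_sizes_pt, rate):
    """Multiply every listed size by the rate, as in the allocation of S506."""
    return [float(size * rate) for size in font_sizes_pt]


sizes = [6, 8, 10]                 # list built by S501 to S503
rate = enlargement_rate(sizes)     # 8/6
scaled = scale_sizes(sizes, rate)  # 6 pt becomes 8 pt; 8 and 10 pt grow by 8/6
```

The same rate would also be applied to tables, border lines, and other objects, as the paragraph above notes.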
【００５０】
図６は、図４のステップＳ４０４に示した描画処理の詳細を示すフローチャートである。尚、この描画処理は即ちデータ描画部１１０３における描画処理であり、実際に印刷を行う処理である。
【００５１】
まずステップＳ６０１において、文書データ解析部１１０２でページ終了タグが検出されたか否かを判別する。ページ終了タグが検出されなければステップＳ６０２に進み、次に解析したタグが文字印字または図形描画等、ページメモリ１１０４への展開処理を必要とするタグであるか否かを判別する。否であればステップＳ６０５に進み、属性設定や印字位置制御等、解析したタグに従った処理を直ちに実行する。
【００５２】
一方、印字・描画等のタグであればステップＳ６０３に進み、ビットマップへの展開処理が容易に行える形式の中間コードを生成する。そしてステップＳ６０４において、データ描画部１１０３でこの中間コードを受けて、ページメモリ１１０４へのビットマップ展開処理を行う。展開処理終了後は図４のステップＳ４０２に戻り、文書データのプロトコル解析処理を繰り返す。
【００５３】
一方、ステップＳ６０１においてページ終了タグが検出された場合はステップＳ６０６に進み、出力制御部１３００においてページメモリ１１０４の内容をプリンタエンジン部１４００に対するビデオ信号に変換し、画像転送出力を行う。そしてステップＳ６０７においてプリンタエンジン部１４００では、受け取ったビデオ信号に基づいて記録紙上に永久可視画像を形成し、印刷を行う。そしてステップＳ６０８において印刷が行われた記録紙が排紙されると、１ページの印刷制御処理が終了する。
【００５４】
ここで、本実施形態におけるフォーマッティング処理結果の具体例を示す。図１１はHTMLによって記述された入力文書データ例であり、この文書データは、本実施形態のフォーマッティング処理が施されることによって、図１２に示すような、ページレイアウト可能な構成に変換される。また、図１２に示す文書データを実際に印刷した結果を、図１５に示す。
【００５５】
この例によれば、図１１に示される最小文字("table 1")のサイズは４ポイントであるから、拡大率は８／４＝２倍として算出される。従って、図１１において"＜H1＞"のタグで示されるタイトル文字("Sample")のサイズが１４ポイント、その他の文字("This document...web browser.")のサイズが規定値である６ポイントであるとすると、図１２によれば、タイトルの文字サイズが２８ポイント、その他の文字サイズが１２ポイント、そして最小文字サイズが８ポイントとして変換されていることが分かる。また、テーブルオブジェクトについてもこの拡大率（２倍）が適用されるため、テーブルが２分割でページ内に割りつけられていることが分かる。
【００５６】
以上説明した様に本実施形態によれば、構造化記述言語をフォーマッティングする際に、その物理レイアウトの基準値を最小文字サイズに基づいて設定することによって、ページ内への適切な割り付けが可能となる。
【００５７】
尚、本実施形態における拡大率の算出は、構造化記述言語内において検出された最小文字サイズminを８ポイントに拡大する場合について説明したが、本発明はこの例に限定されず、他のサイズに拡大するように拡大率を算出する場合でも、同様に適用される。
【００５８】
また、算出された拡大率に基づいて文字サイズを決定する際に、単に拡大率を乗じるだけでなく、更に適切なポイント数（システムにおいて通常使用されるポイント数）へ切り上げ等の最適化処理を施しても良い。
【００５９】
＜第２実施形態＞
以下、本発明に係る第２実施形態について説明する。
【００６０】
上述した第１実施形態においては、物理レイアウトの基準値として最小文字サイズを参照する例について説明した。第２実施形態では、該基準値をオブジェクトの最大幅に基づいて設定することを特徴とする。ここで、オブジェクトとは文字以外の描画図形を指し、HTML文書ではテーブルやイメージ等がこれにあたる。
【００６１】
図７は、第２実施形態におけるフォーマッティング処理を示すフローチャートである。尚、第２実施形態におけるシステム構成、及び印刷動作のメイン処理、描画処理については、上述した第１実施形態と同様であるため、説明を省略する。
【００６２】
まずステップＳ７０１において文書データを先頭から検索し、文字以外のオブジェクトを検索する。例えばHTMLであれば、"＜TABLE＞"や"＜IMG＞"というタグを検索すれば良い。
【００６３】
次にステップＳ７０２において、検出したオブジェクトの幅を算出する。例えばオブジェクトがテーブルであれば、該テーブルに指定された幅情報、もしくはテーブル内に記述される文字サイズに基づいて幅が算出できる。また、イメージであれば、データフォーマットのヘッダに記述される画像情報に基づいて幅が算出できる。
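Steps S701 and S702 can be approximated with Python's standard `html.parser` module, as in the sketch below. It makes a strong simplifying assumption: only an explicit numeric `width` attribute on a `TABLE` or `IMG` tag is read and is treated as a width in dots. Deriving a table width from the character sizes inside it, or an image width from the image file header, as the text describes, is not attempted. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ObjectWidthCollector(HTMLParser):
    """Collect widths declared on <table> and <img> tags (S701, S702)."""

    OBJECT_TAGS = {"table", "img"}

    def __init__(self):
        super().__init__()
        self.widths = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag not in self.OBJECT_TAGS:
            return
        width = dict(attrs).get("width")
        # Assumption: skip missing or non-numeric widths such as "50%".
        if width is not None and width.isdigit():
            self.widths.append(int(width))


def collect_object_widths(html_text):
    """Return the list of object widths found in document order."""
    collector = ObjectWidthCollector()
    collector.feed(html_text)
    collector.close()
    return collector.widths


sample = (
    '<h1>Sample</h1>'
    '<table width="5000"><tr><td>a</td></tr></table>'
    '<img src="x.png" width="2000"><img src="y.png">'
)
widths = collect_object_widths(sample)
```

The resulting list plays the role of the object width list that the loop of steps S701 to S703 builds up.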
【００６４】
そしてステップＳ７０３において、全ての文書データの検索が終了したか否かを判定し、終了するまでステップＳ７０１〜Ｓ７０２の処理を繰り返すことにより、文書中に使用されているオブジェクトの幅のリストが完成する。
【００６５】
ステップＳ７０３で全ての文書データの検索が終了した場合にはステップＳ７０４に進み、保存されたオブジェクト幅のリストから、最大のオブジェクト幅maxを検出する。そしてステップＳ７０５において、検出された最大オブジェクト幅maxをページ幅と比較する。ここでページ幅とは、記録紙内において物理的に印字可能となる主走査方向の長さを指す。
【００６６】
最大オブジェクト幅maxがページ幅以下であればステップＳ７０７に進み、縮小率を１に設定する。一方、最大オブジェクト幅maxがページ幅よりも大きければステップＳ７０６に進み、最大オブジェクト幅maxをページ幅とみなした場合の縮小率を算出する。このようにして求められた縮小率が物理レイアウトの基準となるため、ステップＳ７０８ではこの縮小率を用いて、文書の先頭から順次、物理ページへの割り付けを行なう。
【００６７】
そしてステップＳ７０９において、全ての文書データに対して物理ページへの割り付けが終了したか否かを判断し、終了するまでステップＳ７０８の割り付け処理を繰り返す。
【００６８】
尚、文字以外のオブジェクトを備えない、即ち全てが文字によって構成されている文書データについては、ステップＳ７０１においてオブジェクトが検出されない。このような場合、図７のフローチャートには特に示していないが、縮小率を１に設定して、物理ページへの割り付け処理を行なう。そのために例えば、ステップＳ７０１でオブジェクトが検出されなかった場合、オブジェクト最大幅maxを最小値（例えば０）に設定するステップを設ければ良い。
【００６９】
ここで、上記図７に示すフォーマッティング処理の具体例を示す。ステップＳ７０１〜Ｓ７０３によって、文書中に使用されているオブジェクト幅として例えば、２０００，５０００ドットがリストされた場合、ステップＳ７０４では最大オブジェクトサイズmaxとして５０００ドットが検出される。一方、記録紙がＡ４サイズ、プリンタ解像度が６００dpiであれば、該記録紙を縦に搬送する場合のページ幅は４７２０ドットである。するとステップＳ７０５において最大オブジェクトサイズmax（５０００）がページ幅（４７２０）よりも大きいため、ステップＳ７０６で縮小率が４７２０／５０００として算出される。尚、オブジェクトのみでなく文字についてもこの縮小率を掛け合わせることによって、物理ページへ割り付けられる。
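The decision in steps S704 to S707 and the numbers of the example above can be sketched as follows. The function name `reduction_rate` and the use of exact fractions are assumptions for illustration. The page width of 4720 dots is the value the text gives for A4 paper fed lengthwise at 600 dpi.

```python
from fractions import Fraction


def reduction_rate(object_widths_dots, page_width_dots):
    """Return the reduction rate that makes the widest object fit the page."""
    # S704: widest object; 0 when the document has no objects ([0068]).
    max_width = max(object_widths_dots, default=0)
    if max_width <= page_width_dots:             # S705: compare with page width
        return Fraction(1)                       # S707: already fits
    return Fraction(page_width_dots, max_width)  # S706: shrink to the page width


PAGE_WIDTH_DOTS = 4720  # A4, 600 dpi, value stated in the text

rate = reduction_rate([2000, 5000], PAGE_WIDTH_DOTS)  # 4720/5000
scaled_max = 5000 * rate                              # widest object after scaling
```

Because the rate never exceeds 1 in this embodiment, a document whose objects all fit is laid out unchanged. Text is multiplied by the same rate as the objects.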
【００７０】
ここで、第２実施形態におけるフォーマッティング処理結果の具体例を示す。上述した第１実施形態で図１１に示したHTMLの入力文書データについて、第２実施形態のフォーマッティング処理を施すことによって、図１３に示すような変換結果が得られる。図１３に示す文書データを実際に印刷した結果は、図１６に示すようになる。図１３及び図１６によれば、最大幅を有するオブジェクトであるテーブルがページ内に適切に割りつけられ、文字サイズが第１実施形態に示した図１５よりも小さくなっていることが分かる。
【００７１】
以上説明した様に第２実施形態によれば、構造化記述言語をフォーマッティングする際に、その物理レイアウトの基準値を最大オブジェクト幅に基づいて設定することによって、ページ内への適切な割り付けが可能となる。
【００７２】
＜第３実施形態＞
以下、本発明に係る第３実施形態について説明する。
【００７３】
第３実施形態においては、物理レイアウトの基準値を最も使用頻度の高い文字サイズに基づいて設定することを特徴とする。
【００７４】
図８は、第３実施形態におけるフォーマッティング処理を示すフローチャートである。尚、第３実施形態におけるシステム構成、及び印刷動作のメイン処理、描画処理については、上述した第１実施形態と同様であるため、説明を省略する。
【００７５】
まずステップＳ８０１において文書データを先頭から検索し、文字データを検出した時点でその文字サイズを算出し、ステップＳ８０２で文字サイズ毎の文字数をカウントして保存する。そして、ステップＳ８０３で全ての文書データの検索が終了したか否かを判定し、終了するまでステップＳ８０１〜Ｓ８０２の処理を繰り返すことにより、文書中に使用されている文字サイズ及びその文字数のリストが完成する。
【００７６】
ステップＳ８０３で全ての文書データの検索が終了した場合にはステップＳ８０４に進み、保存された文字サイズのリストから、最も文字数の多い、即ち最頻出の文字サイズfreqを検出する。そしてステップＳ８０５において、検出された最頻出文字サイズfreqを１０ポイントと見立てた場合の拡大率を算出する。この拡大率が物理レイアウトの基準となるため、ステップＳ８０６ではこの拡大率を用いて、文書の先頭から順次、物理ページへの割り付けを行なう。
【００７７】
そしてステップＳ８０７において、全ての文書データに対して物理ページへの割り付けが終了したか否かを判断し、終了するまでステップＳ８０６の割り付け処理を繰り返す。
【００７８】
ここで、上記図８に示すフォーマッティング処理の具体例を示す。ステップＳ８０１〜Ｓ８０３によって、例えば文書中に使用されている文字サイズとして８ポイントが１０文字、１０ポイントが４００文字、２０ポイントが８文字リストされた場合、ステップＳ８０４では最頻出文字サイズfreqとして１０ポイントが検出される。するとステップＳ８０５において、これを１０ポイントとするための拡大率が１０／１０倍（＝１倍）として算出される。従って、リストされた他の８，２０ポイントの文字サイズもそれぞれ、８，２０ポイントで物理ページに割り付けられる。尚、文字のみでなくテーブルやボーダーライン等、他の図形（オブジェクト）についてもこの拡大率（上記例では１倍）を掛け合わせることによって、物理ページへ割り付けられる。
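The example above (10 characters at 8 points, 400 at 10 points, 8 at 20 points) can be sketched with `collections.Counter`. The function names are hypothetical. How a tie between two equally frequent sizes should be broken is not stated in the text, so this sketch simply takes whatever `Counter.most_common` returns first.

```python
from collections import Counter
from fractions import Fraction

TARGET_PT = 10  # the embodiment treats the most frequent size as 10 points


def most_frequent_size(char_counts):
    """Return the size with the largest character count (S804)."""
    return char_counts.most_common(1)[0][0]


def rate_for_most_frequent(char_counts, target_pt=TARGET_PT):
    """Return target_pt / most frequent size as an exact fraction (S805)."""
    return Fraction(target_pt) / Fraction(most_frequent_size(char_counts))


# Character counts per size, as accumulated by S801 to S803.
counts = Counter({8: 10, 10: 400, 20: 8})
freq = most_frequent_size(counts)       # 10 points
rate = rate_for_most_frequent(counts)   # 10/10
scaled = {size: float(size * rate) for size in counts}
```

With a rate of 1, the 8 and 20 point sizes are laid out unchanged, which matches the result described in the text.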
【００７９】
ここで、第３実施形態におけるフォーマッティング処理結果の具体例を示す。上述した第１実施形態で図１１に示したHTMLの入力文書データについて、第３実施形態のフォーマッティング処理を施すことによって、図１４に示すような変換結果が得られる。図１４に示す文書データを実際に印刷した結果は、図１７に示すようになる。
【００８０】
この例によれば、図１１に示される最頻出文字サイズ("This document...web browser.")は６ポイントであるから、拡大率は１０／６倍として算出される。従って、図１１におけるタイトル文字("Sample")のサイズが１４ポイント、最小文字("table 1")のサイズが４ポイントであるから、図１４によれば、最頻出文字サイズが１０ポイント、タイトルの文字サイズが１４×１０／６＝２３．３３（２３）ポイント、最小文字サイズが４×１０／６＝６．６６（７）ポイントとして変換されていることが分かる。また、最大幅を有するテーブルオブジェクトについてもこの拡大率（１０／６倍）が適用され、かろうじてページ内に割りつけられていることが分かる。
【００８１】
以上説明した様に第３実施形態によれば、構造化記述言語をフォーマッティングする際に、その物理レイアウトの基準値を最頻出文字サイズに基づいて設定することによって、ページ内への適切な割り付けが可能となる。
【００８２】
尚、本実施形態における拡大率の算出も第１実施形態と同様に、検出された最頻出文字サイズfreqを１０ポイントに拡大する例に限定されず、他のサイズへの拡大を想定しても同様に適用される。また、文字サイズの変換の際にも、拡大率を乗じるのみならず、適切なポイント数への切り上げ等の最適化処理を施しても良い。
【００８３】
＜第４実施形態＞
以下、本発明に係る第４実施形態について説明する。
【００８４】
第４実施形態においては、上述した第１乃至第３実施形態に示したフォーマッティング処理のいずれかを、任意に選択可能とすることを特徴とする。即ち、第４実施形態におけるシステム構成、及び印刷動作のメイン処理、描画処理については、上述した第１実施形態と同様であるが、フォーマッティング処理における物理レイアウトの基準値を、最小文字サイズ、最大オブジェクト幅、又は最頻出文字サイズのいずれかに基づいて設定することを特徴とする。
【００８５】
具体的には、プリンタ１０００の操作パネル１０１２において、ユーザがフォーマッティングの基準値として上記３種類のいずれかを選択する、即ち、第１乃至第３実施形態に対応するレイアウト方法のいずれかを選択できるように構成すれば良い。
【００８６】
また、選択した基準値に基づくフォーマッティング結果を、ホストコンピュータに接続されたＣＲＴ１０においてページ単位にプレビューできるようにすれば、更に操作性が向上する。
【００８７】
以上説明した様に第４実施形態によれば、ユーザの所望するようなフォーマッティングが可能となる。例えば、図１１に示すHTMLで記述された文書データについて、図１５乃至図１７のいずれかの出力形態が選択可能となる。
【００８８】
＜他の実施形態＞
上述した第１及び第３実施形態においては、文字サイズをポイント値として算出する例について説明したが、例えば文字サイズがその他の表現（例えば大／中／小）によって指定されている文書であっても、予め各サイズ（大／中／小）の比率を保持しておくことにより、本発明を適用することが可能である。
【００８９】
また、上述した各実施形態においては、フォーマッティング対象となる文書データとしてHTMLによって記述された文書を例として説明したが、本発明はこの例に限定されるものではなく、構造化記述言語であれば、XMLやSGMLといった他の言語であっても良いことは言うまでもない。
【００９０】
なお、本発明は、複数の機器（例えばホストコンピュータ、インタフェイス機器、リーダ、プリンタなど）から構成されるシステムに適用しても、一つの機器からなる装置（例えば、複写機、ファクシミリ装置など）に適用してもよい。
【００９１】
また、本発明の目的は、前述した実施形態の機能を実現するソフトウェアのプログラムコードを記録した記憶媒体（または記録媒体）を、システムあるいは装置に供給し、そのシステムあるいは装置のコンピュータ（またはCPUやMPU）が記憶媒体に格納されたプログラムコードを読み出し実行することによっても、達成されることは言うまでもない。この場合、記憶媒体から読み出されたプログラムコード自体が前述した実施形態の機能を実現することになり、そのプログラムコードを記憶した記憶媒体は本発明を構成することになる。また、コンピュータが読み出したプログラムコードを実行することにより、前述した実施形態の機能が実現されるだけでなく、そのプログラムコードの指示に基づき、コンピュータ上で稼働しているオペレーティングシステム(OS)などが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【００９２】
さらに、記憶媒体から読み出されたプログラムコードが、コンピュータに挿入された機能拡張カードやコンピュータに接続された機能拡張ユニットに備わるメモリに書込まれた後、そのプログラムコードの指示に基づき、その機能拡張カードや機能拡張ユニットに備わるCPUなどが実際の処理の一部または全部を行い、その処理によって前述した実施形態の機能が実現される場合も含まれることは言うまでもない。
【００９３】
【発明の効果】
以上説明したように本発明によれば、構造化記述言語によって記述された文書データを物理ページへレイアウトする際に、自由度の高いレイアウトが実現できる。
【図面の簡単な説明】
【図１】本発明に係る一実施形態におけるプリンタの側断面図である。
【図２】本実施形態における印刷システムの構成を示すブロック図である。
【図３】本実施形態におけるプリンタの機能構成を示すブロック図である。
【図４】本実施形態における印刷処理を示すフローチャートである。
【図５】本実施形態におけるフォーマッティング処理を示すフローチャートである。
【図６】本実施形態における描画処理を示すフローチャートである。
【図７】第２実施形態におけるフォーマッティング処理を示すフローチャートである。
【図８】第３実施形態におけるフォーマッティング処理を示すフローチャートである。
【図９】ページ概念のない構造化記述言語による文書データの表示例を示す図である。
【図１０】図９に示す文書データのページレイアウト後の表示例を示す図である。
【図１１】ＨＴＭＬで記述された文書データ例を示す図である。
【図１２】本実施形態におけるフォーマッティング後の文書データ例を示す図である。
【図１３】第２実施形態におけるフォーマッティング後の文書データ例を示す図である。
【図１４】第３実施形態におけるフォーマッティング後の文書データ例を示す図である。
【図１５】本実施形態におけるフォーマッティング後の印刷例を示す図である。
【図１６】第２実施形態におけるフォーマッティング後の印刷例を示す図である。
【図１７】第３実施形態におけるフォーマッティング後の印刷例を示す図である。
【符号の説明】
１０００プリンタ
１０１２操作パネル
１１００フォーマッタ制御部
１１０１プロトコル制御部
１１０２文書データ解析部
１１０３データ描画部
１１０４ページメモリ
１１０５物理レイアウト処理部
１２００プリンタインタフェース
１３００出力制御部
１４００プリンタエンジン部

[0001]
[Technical field of the invention]
The present invention relates to an image processing apparatus and method for laying out document data described in a structured description language on a physical page, and an image processing system.
[0002]
[Prior art]
With the rapid spread of personal computers and the Internet, documents are being digitized in every type of business. Because the data format of an electronic document generally depends on the application used to edit it, viewing such a document has required an application that supports its data format.
[0003]
For this reason, documents described in structured description languages that do not depend on a specific application, such as HTML (Hyper Text Markup Language) and XML (eXtensible Markup Language), are becoming widespread. Structured description languages are generally designed on the assumption that a browser displays the document on a screen, where the width and height of the display area can be changed arbitrarily and the screen can be scrolled, so they have no need for the concept of a “page”.
[0004]
However, when a structured description language is used as a page description language, the content must be allocated to pages. Structured description languages capable of page layout have therefore been under development in recent years, and it has become possible to convert a structured description language without a page concept into one capable of page layout by applying typesetting rules for publishing and the like. This process of converting a structured description language without a page concept into a structured description language capable of page layout is called “formatting”.
[0005]
FIG. 9 shows a display example by a browser of a document described in a structured description language having no page concept. According to a general browser, as shown in the figure, the size of the display screen can be changed, and the screen can be scrolled vertically and horizontally using a scroll bar.
[0006]
In contrast, FIG. 10 shows a display example of a document described in a structured description language after formatting. According to the figure, the formatted document can be displayed or printed in page units.
[0007]
[Problems to be solved by the invention]
However, in the conventional formatting described above, document data described in a structured description language is allocated to physical pages at a fixed magnification. As a result, the output can be hard to read depending on the contents of the page, and the layout the user prefers is not always obtained.
[0008]
The present invention has been made to solve the above-described problems, and its object is to provide an image processing apparatus, a method therefor, and an image processing system that realize a layout with a high degree of freedom when laying out document data described in a structured description language on a physical page.
[0009]
[Means for Solving the Problems]
As a means for achieving the above object, an image processing apparatus of the present invention comprises the following arrangement.
[0010]
An image processing apparatus for laying out document data described in a structured description language on a physical page,
An image processing apparatus that lays out, on a physical page, document data that is described in a structured description language with no page definition and that includes character data and graphic data as specific data, the apparatus comprising:
determining means for analyzing, among the plurality of specific data included in the document data, the tags indicating the width of the graphic data so as to extract the graphic data having the maximum width, and for determining the scaling factor with which the extracted graphic data is laid out to fit within the width of the printable area of the physical page, using information indicating the width of the printable area of the physical page, information indicating the print resolution, and information indicating the width of the graphic data having the maximum width; and
image processing means for, among the plurality of specific data included in the document data, sequentially extracting the tags indicating the character size of the character data, changing each character size by multiplying it by the scaling factor determined by the determining means, and then executing image processing for print processing, and for sequentially extracting the tags indicating the width and height of the graphic data, changing the width and height by multiplying them by the scaling factor determined by the determining means, and then executing image processing for print processing.
[0011]
The image processing apparatus is characterized in that, for example, the structured description language is HTML.
[0012]
The image processing apparatus is characterized in that, for example, the structured description language is XML.
[0013]
The image processing apparatus is characterized by further including drawing means for drawing the document data image-processed by the image processing means.
[0014]
The image processing apparatus is characterized by further including image forming means for visualizing, on a recording medium, the data drawn by the drawing means.
[0015]
As a means for achieving the above object, an image processing method of the present invention comprises the following arrangement. That is,
An image processing method for laying out, on a physical page, document data that is described in a structured description language with no page definition and that includes character data and graphic data as specific data, the method comprising:
a determining step of analyzing, among the plurality of specific data included in the document data, the tags indicating the width of the graphic data so as to extract the graphic data having the maximum width, and of determining the scaling factor with which the extracted graphic data is laid out to fit within the width of the printable area of the physical page, using information indicating the width of the printable area of the physical page, information indicating the print resolution, and information indicating the width of the graphic data having the maximum width; and
an image processing step of, among the plurality of specific data included in the document data, sequentially extracting the tags indicating the character size of the character data, changing each character size by multiplying it by the scaling factor determined in the determining step, and then executing image processing for print processing, and of sequentially extracting the tags indicating the width and height of the graphic data, changing the width and height by multiplying them by the scaling factor determined in the determining step, and then executing image processing for print processing.
[0016]
The image processing method is characterized in that, for example, the structured description language is HTML.
[0017]
The image processing method is characterized in that, for example, the structured description language is XML.
[0018]
The image processing method is characterized by further including a drawing step of drawing the document data image-processed in the image processing step.
[0019]
The image processing method is characterized by further including an image forming step of visualizing, on a recording medium, the data drawn in the drawing step.
[0020]
In addition, as one means for achieving the above object, a storage medium of the present invention is characterized by storing a control program for causing a computer to carry out the above image processing method.
[0021]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment according to the present invention will be described in detail with reference to the drawings.
[0022]
<First Embodiment>
First, a configuration of a laser beam printer (hereinafter simply referred to as “printer”) suitable for application of the present embodiment will be described with reference to FIG. Needless to say, the printer to which the present embodiment is applied is not limited to the laser beam type, and may be a printer of another printing type.
[0023]
FIG. 1 is a cross-sectional view showing the internal structure of a printer to which the present embodiment is applied; the printer can register character patterns, fixed forms (form data), and the like from a data source (not shown). In the figure, reference numeral 1000 denotes the printer main body, which receives and stores character information (character codes), form information, macro commands, and the like supplied from an externally connected host computer, creates the corresponding character patterns, form patterns, and the like in accordance with that information, and forms an image on recording paper serving as the recording medium. Reference numeral 1012 denotes an operation panel on which switches for operation, an LED display, and the like are arranged. Reference numeral 1001 denotes a printer control unit that controls the entire printer 1000 and analyzes the character information and the like supplied from the host computer. The control unit 1001 mainly converts character information into a video signal of the corresponding character pattern and outputs it to the laser driver 1002. The laser driver 1002 is a circuit for driving the semiconductor laser 1003, and switches the laser light 1004 emitted from the semiconductor laser 1003 on and off in accordance with the input video signal.
[0024]
The laser light 1004 is deflected left and right by a rotating polygon mirror 1005 and scans the electrostatic drum 1006. As a result, an electrostatic latent image of the character pattern is formed on the electrostatic drum 1006. This latent image is developed by the developing unit 1007 around the electrostatic drum 1006 and then transferred to the recording paper. Cut sheets are used as the recording paper; they are stored in a paper cassette 1008 mounted on the printer 1000, taken into the apparatus by a paper feed roller 1009 and transport rollers 1010 and 1011, and supplied to the electrostatic drum 1006.
[0025]
FIG. 2 is a block diagram illustrating the configuration of the printer control system according to this embodiment. Here, an example of controlling the printer 1000 shown in FIG. 1 will be described. Needless to say, as long as the functions of the present invention are carried out, the present invention can be applied to a single device, to a system composed of a plurality of devices, or to a system in which processing is performed via a network such as a LAN.
[0026]
In FIG. 2, reference numeral 3000 denotes a host computer, which includes a CPU 1 that retrieves document data and performs data conversion based on a document processing program or the like stored in the program ROM of the ROM 3. The CPU 1 comprehensively controls each device connected to the system bus 4.
[0027]
The program ROM in the ROM 3 stores the control program of the CPU 1 and the like; the font ROM, also in the ROM 3, stores font data and the like used in the data conversion processing; and the data ROM, likewise in the ROM 3, stores various data used when performing the data conversion processing and the like.
[0028]
Reference numeral 2 denotes a RAM, which functions as a main memory or work area of the CPU 1. A keyboard controller (KBC) 5 controls key input from a keyboard 9 or a pointing device (not shown). Reference numeral 6 denotes a CRT controller (CRTC), which controls display on a CRT display (CRT) 10. In the present embodiment, the keyboard 9 and the CRT display 10 are not necessarily required, but are usually equipped for maintenance of the host computer and confirmation of the operation status.
[0029]
Reference numeral 7 denotes a memory controller (MC), which controls access to an external memory 11, such as a hard disk (HD) or floppy disk (FD), that stores a boot program, various applications, font data, user files, edit files, and the like. Reference numeral 8 denotes a network controller (NTC), which is connected to the printer 1000 via a predetermined bidirectional interface 21 and executes communication control processing with the printer 1000.
[0030]
For example, the CPU 1 can take out document data stored in the external memory 11 by controlling the memory controller 7, and transfer the document data to the outside by controlling the network controller 8.
[0031]
In the printer 1000, reference numeral 12 denotes a printer CPU. Based on a control program stored in the program ROM of the ROM 13, a control program stored in the external memory 14, or the like, it comprehensively controls access to the various devices connected to the system bus 15 and outputs an image signal as output information to a printing unit (printer engine) 17 connected via the printing unit interface 16.
[0032]
The program ROM in the ROM 13 may store a control program for the CPU 12 such as that shown in the flowcharts described later. The font ROM, also in the ROM 13, stores font data and the like used in generating the output information, and the data ROM, likewise in the ROM 13, stores information and the like used on the host computer 3000 when the printer 1000 is not equipped with an external memory 14 such as a hard disk.
[0033]
The CPU 12 can notify the host computer 3000 of information in the printer 1000 by performing communication processing with the host computer 3000 via the input unit 18. Reference numeral 19 denotes a RAM that functions as a main memory, a work area, and the like of the CPU 12, and the memory capacity can be expanded by an optional RAM connected to an expansion port (not shown). The RAM 19 is used as an output information expansion area, environment data storage area, NVRAM, and the like.
[0034]
Access to the external memory 14 such as the hard disk (HD) and IC card described above is controlled by a memory controller (MC) 20. The external memory 14 is connected as an option and stores document data, font data, form data, and the like. Reference numeral 18 denotes an operation panel on which switches for various operations on the printer 1000, an LED display, and the like are arranged.
[0035]
In this embodiment, the external memory 14 is not limited to one; at least one is provided, and a plurality of external memories may be connected, storing optional font cards in addition to the built-in fonts and programs that interpret printer control languages of different language systems. Further, an NVRAM (not shown) may be provided to store printer mode setting information from the operation panel 1012.
[0036]
FIG. 3 is a block diagram illustrating a functional configuration of the printer 1000 according to the present embodiment. According to the figure, the printer 1000 is roughly composed of a formatter control unit 1100, a printer interface 1200, an output control unit 1300, and a printer engine unit 1400.
[0037]
The printer interface 1200 is a means for controlling data input/output between the formatter control unit 1100 and the outside.
[0038]
The formatter control unit 1100 includes a protocol control unit 1101, a document data analysis unit 1102, a data drawing unit 1103, a page memory 1104, and a physical layout processing unit 1105. Generally, these functional configurations in the formatter control unit 1100 are realized by a computer system including a CPU, a ROM, a RAM, and the like.
[0039]
The protocol control unit 1101 is means for communicating with the outside by analyzing and transmitting a network protocol. For example, when HTTP (Hyper Text Transfer Protocol) is used as the protocol, it acquires the document designated by a URL (Uniform Resource Locator) and sends information to a Web server.
[0040]
The document data analysis unit 1102 analyzes document data described in a structured description language and converts it into an intermediate code in a format that is easier to process. The intermediate code generated by the document data analysis unit 1102 is transferred to the data drawing unit 1103, expanded into bitmap data, and then sequentially drawn in the page memory 1104.
[0041]
The physical layout processing unit 1105 allocates document data described in a structured description language to physical pages, that is, it performs the formatting process. Specifically, when the document data analyzed by the document data analysis unit 1102 has no page concept, as with HTML, the unit creates data allocated to physical pages.
[0042]
The output control unit 1300 converts the contents of the page memory 1104 into a video signal, and performs image transfer to the printer engine unit 1400. The printer engine unit 1400 is a printing mechanism unit that forms a received video signal on a recording sheet as a permanent visible image.
[0043]
Next, a printing control procedure in the printing system of the present embodiment having the above-described configuration will be described with reference to the flowcharts shown in FIGS.
[0044]
FIG. 4 is a flowchart illustrating main processing from the start to the end of the operation of the printer 1000. First, in step S401, data from the network is received via the printer interface 1200. Next, after analyzing the protocol in step S402, formatting processing, that is, allocation to the physical page is performed in step S403, and the flow proceeds to step S404 to perform drawing processing. Thereafter, in step S405, it is determined whether or not the document data has been completed. If the document data has been completed, the printing operation is terminated. On the other hand, if the document data is not finished, the process returns to step S401 and the above process is repeated.
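The main flow of FIG. 4 can be sketched as a simple loop. This is only an illustrative sketch, not the printer's actual firmware: the function name `run_print_job` and its callback parameters are hypothetical, and the end of the input sequence is assumed to stand in for the end-of-document check in step S405.

```python
from typing import Callable, Iterable


def run_print_job(
    chunks: Iterable[bytes],
    analyze_protocol: Callable[[bytes], str],
    format_to_pages: Callable[[str], str],
    draw: Callable[[str], None],
) -> int:
    """Run the S401-S405 loop and return the number of chunks processed.

    All four parameters are hypothetical stand-ins for the units in FIG. 3.
    """
    processed = 0
    for chunk in chunks:                       # S401: receive data (loop end = S405)
        document = analyze_protocol(chunk)     # S402: protocol analysis
        formatted = format_to_pages(document)  # S403: allocation to physical pages
        draw(formatted)                        # S404: drawing process
        processed += 1
    return processed
```

Passing the stages in as callables keeps the sketch independent of how each unit is actually implemented.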
[0045]
FIG. 5 is a flowchart showing details of the formatting process shown in step S403 of FIG. 4, that is, the layout process to the physical page in the physical layout processing unit 1105.
[0046]
First, document data is searched from the top in step S501, and when the character data is detected, the character size is calculated. In step S502, the calculated character size is stored. Then, in step S503, it is determined whether or not all the document data has been searched, and the processing of steps S501 to S502 is repeated until the search is completed, thereby completing a list of character sizes used in the document.
[0047]
If all the document data has been searched in step S503, the process proceeds to step S504, and the minimum character size min is detected from the stored character size list. In step S505, an enlargement ratio when the detected minimum character size min is regarded as 8 points is calculated. Since this enlargement ratio becomes a reference for the physical layout, in step S506, the enlargement ratio is used to perform allocation to physical pages sequentially from the top of the document.
[0048]
In step S507, it is determined whether or not allocation to physical pages has been completed for all document data, and the allocation processing in step S506 is repeated until the allocation is completed.
[0049]
Here, a specific example of the formatting process shown in FIG. 5 will be given. If, for example, 6, 8, and 10 points are listed as the character sizes used in the document in steps S501 to S503, 6 points is detected as the minimum character size min in step S504. Then, in step S505, the enlargement ratio for making this 8 points is calculated as 8/6 times. Therefore, the other listed character sizes, 8 and 10 points, are allocated to physical pages at 8 × (8/6) and 10 × (8/6) points, respectively. Note that not only characters but also other figures (objects) such as tables and border lines are allocated to physical pages after being multiplied by this enlargement ratio (8/6 in the above example).
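The calculation in steps S504 to S506 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, character sizes are assumed to be already collected as a list of point values, and a ratio of 1 is assumed when the list is empty (a case the text does not discuss).

```python
from typing import List

TARGET_MIN_POINTS = 8.0  # the embodiment regards the minimum size as 8 points


def enlargement_ratio(char_sizes: List[float], target: float = TARGET_MIN_POINTS) -> float:
    """S504-S505: detect the minimum size and return target / min."""
    if not char_sizes:
        return 1.0  # assumption: no character data means no scaling
    return target / min(char_sizes)


def scale_sizes(char_sizes: List[float], ratio: float) -> List[float]:
    """S506: apply the same ratio to every listed size."""
    return [size * ratio for size in char_sizes]
```

With the sizes 6, 8, and 10 points, `enlargement_ratio` gives 8/6, and `scale_sizes` maps the 6-point text to 8 points while enlarging the others in proportion.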
[0050]
FIG. 6 is a flowchart showing details of the drawing process shown in step S404 of FIG. This drawing process is a drawing process in the data drawing unit 1103 and is a process for actually printing.
[0051]
First, in step S601, the document data analysis unit 1102 determines whether a page end tag is detected. If the page end tag is not detected, the process proceeds to step S602, and it is determined whether or not the analyzed tag is a tag that requires development processing to the page memory 1104 such as character printing or graphic drawing. If NO, the process proceeds to step S605, and processing according to the analyzed tag, such as attribute setting and print position control, is immediately executed.
[0052]
On the other hand, if the tag requires printing or drawing, the process advances to step S603 to generate an intermediate code in a format that can be easily expanded into a bitmap. In step S604, the data drawing unit 1103 receives the intermediate code and expands it into a bitmap in the page memory 1104. After the expansion is completed, the process returns to step S402 in FIG. 4 to repeat the protocol analysis of the document data.
[0053]
On the other hand, if a page end tag is detected in step S601, the process proceeds to step S606, where the output control unit 1300 converts the contents of the page memory 1104 into a video signal and transfers the image to the printer engine unit 1400. In step S607, the printer engine unit 1400 forms a permanent visible image on the recording paper based on the received video signal, thereby performing printing. In step S608, the printed recording paper is discharged, and the print control process for one page ends.
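The branching of FIG. 6 can be sketched as a small dispatch loop. The tag representation used here (a `(kind, payload)` tuple), the kind names, and the list standing in for the page memory 1104 are all hypothetical; actual bitmap expansion and engine output are replaced by list operations.

```python
from typing import Any, Iterable, List, Tuple

PAGE_END = "page_end"                # hypothetical marker for the page end tag
DRAWING_KINDS = {"text", "graphic"}  # hypothetical kinds needing expansion to page memory

Tag = Tuple[str, Any]


def process_tags(tags: Iterable[Tag]) -> Tuple[List[List[Tag]], List[Tag]]:
    """Return (printed pages, immediately processed settings)."""
    page_memory: List[Tag] = []      # stands in for the page memory 1104
    settings: List[Tag] = []
    printed_pages: List[List[Tag]] = []
    for kind, payload in tags:
        if kind == PAGE_END:                         # S601: page end detected
            printed_pages.append(list(page_memory))  # S606-S608: output and discharge
            page_memory.clear()
        elif kind in DRAWING_KINDS:                  # S602: needs expansion
            intermediate = (kind, payload)           # S603: intermediate code
            page_memory.append(intermediate)         # S604: expand into page memory
        else:
            settings.append((kind, payload))         # S605: attribute setting, etc.
    return printed_pages, settings
```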
[0054]
Here, a specific example of the result of the formatting process in the present embodiment will be shown. FIG. 11 is an example of input document data described in HTML, and this document data is converted into a configuration capable of page layout, as shown in FIG. 12, by performing the formatting process of this embodiment. FIG. 15 shows the result of actually printing the document data shown in FIG. 12.
[0055]
According to this example, since the size of the minimum character ("table 1") shown in FIG. 11 is 4 points, the enlargement ratio is calculated as 8/4 = 2 times. Therefore, if the size of the title character ("Sample") indicated by the tag "<H1>" in FIG. 11 is 14 points and the size of the other characters ("This document ... web browser.") is the default value of 6 points, it can be seen from FIG. 12 that the title character size is converted to 28 points, the other character sizes to 12 points, and the minimum character size to 8 points. In addition, since this enlargement ratio (2 times) is also applied to the table object, it can be seen that the table is allocated within the page split in two.
[0056]
As described above, according to the present embodiment, when a document in a structured description language is formatted, setting the reference value of the physical layout based on the minimum character size makes appropriate allocation within a page possible.
[0057]
The calculation of the enlargement ratio in the present embodiment has been described for the case where the minimum character size min detected in the structured description language document is enlarged to 8 points. However, the present invention is not limited to this example, and applies equally when the enlargement ratio is calculated so as to enlarge the minimum character size to another size.
[0058]
Also, when determining the character size from the calculated enlargement ratio, in addition to multiplying by the enlargement ratio, optimization processing such as rounding up to an appropriate point size (a point size normally used in the system) may be applied.
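The rounding-up optimization could, for instance, be sketched as a lookup in a sorted list of point sizes. The list `SYSTEM_POINT_SIZES` below is purely hypothetical, as the text does not say which sizes the system normally uses; clamping to the largest entry is likewise an assumption.

```python
import bisect
from typing import Sequence

# Hypothetical list of point sizes normally used in the system (must be sorted).
SYSTEM_POINT_SIZES = [6, 7, 8, 9, 10, 10.5, 11, 12, 14, 16, 18, 20, 24, 28, 36]


def snap_up(size: float, available: Sequence[float] = SYSTEM_POINT_SIZES) -> float:
    """Round a scaled size up to the nearest available point size.

    Sizes larger than every entry are clamped to the largest entry (assumption).
    """
    index = bisect.bisect_left(available, size)
    return available[min(index, len(available) - 1)]
```

For example, an 8-point character scaled by 8/6 becomes about 10.67 points, which this sketch would round up to 11 points.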
[0059]
<Second Embodiment>
Hereinafter, a second embodiment according to the present invention will be described.
[0060]
In the first embodiment described above, the minimum character size is used as the physical layout reference value. The second embodiment is characterized in that the reference value is set based on the maximum width of an object. Here, an object refers to a drawn figure other than characters; in an HTML document, this is a table, an image, or the like.
[0061]
FIG. 7 is a flowchart showing the formatting process in the second embodiment. Note that the system configuration, the main process of the printing operation, and the drawing process in the second embodiment are the same as those in the first embodiment described above, and a description thereof will be omitted.
[0062]
First, in step S701, the document data is searched from the top for objects other than characters. For example, in the case of HTML, tags such as "<TABLE>" and "<IMG>" may be searched for.
[0063]
In step S702, the width of the detected object is calculated. For example, if the object is a table, the width can be calculated based on the width information specified in the table or the character size described in the table. In the case of an image, the width can be calculated based on image information described in the header of the data format.
[0064]
In step S703, it is determined whether or not all the document data has been searched, and the processing in steps S701 to S702 is repeated until the search is completed, thereby completing the list of the widths of the objects used in the document.
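The search in steps S701 to S703 can be sketched with Python's standard `html.parser` module. This is a simplified sketch, not the embodiment's method: it assumes every object carries a numeric `WIDTH` attribute and ignores the other sources mentioned above (character sizes inside a table, image file headers). The class and function names are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ObjectWidthCollector(HTMLParser):
    """Collect the WIDTH attribute of <TABLE> and <IMG> tags (S701-S702)."""

    OBJECT_TAGS = {"table", "img"}  # HTMLParser reports tag names in lower case

    def __init__(self) -> None:
        super().__init__()
        self.widths: List[int] = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.OBJECT_TAGS:
            return
        width = dict(attrs).get("width")
        # Assumption: only plain integer widths are usable; "50%" etc. are skipped.
        if width is not None and width.isdigit():
            self.widths.append(int(width))


def collect_object_widths(html: str) -> List[int]:
    """S703: scan the whole document and return the list of object widths."""
    collector = ObjectWidthCollector()
    collector.feed(html)
    collector.close()
    return collector.widths
```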
[0065]
If the search of all document data is completed in step S703, the process proceeds to step S704, and the maximum object width max is detected from the stored list of object widths. In step S705, the detected maximum object width max is compared with the page width. Here, the page width refers to the physically printable length in the main scanning direction on the recording paper.
[0066]
If the maximum object width max is equal to or smaller than the page width, the process advances to step S707 to set the reduction ratio to 1. On the other hand, if the maximum object width max is larger than the page width, the process advances to step S706 to calculate a reduction ratio when the maximum object width max is regarded as the page width. Since the reduction ratio obtained in this way becomes the reference for the physical layout, in step S708, allocation to physical pages is performed sequentially from the top of the document using this reduction ratio.
[0067]
In step S709, it is determined whether or not allocation to physical pages has been completed for all document data, and the allocation processing in step S708 is repeated until the allocation is completed.
[0068]
For document data that does not include objects other than characters, that is, document data consisting entirely of characters, no object is detected in step S701. In such a case, although not specifically shown in the flowchart of FIG. 7, the reduction ratio is set to 1 and the allocation to physical pages is performed. To this end, for example, a step may be provided that sets the maximum object width max to a minimum value (for example, 0) when no object is detected in step S701.
[0069]
Here, a specific example of the formatting process shown in FIG. 7 will be given. If, for example, 2000 and 5000 dots are listed as the object widths used in the document in steps S701 to S703, 5000 dots is detected as the maximum object width max in step S704. On the other hand, if the recording paper is A4 size and the printer resolution is 600 dpi, the page width when the recording paper is fed lengthwise is 4720 dots. Since the maximum object width max (5000) is larger than the page width (4720) in step S705, the reduction ratio is calculated as 4720/5000 in step S706. Note that not only objects but also characters are allocated to physical pages after being multiplied by this reduction ratio.
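Steps S704 to S707 reduce to a short comparison, sketched below. The function name is hypothetical; widths and the page width are assumed to be expressed in dots, and an empty list is treated as a maximum width of 0, following the suggestion for documents without objects.

```python
from typing import Sequence


def reduction_ratio(object_widths: Sequence[int], page_width: int) -> float:
    """Return the reduction ratio used as the physical layout reference."""
    # S704: maximum object width; 0 when no object was detected.
    max_width = max(object_widths, default=0)
    if max_width <= page_width:      # S705 -> S707: already fits, no reduction
        return 1.0
    return page_width / max_width    # S706: shrink so the widest object fits
```

With object widths of 2000 and 5000 dots and a page width of 4720 dots, the sketch yields 4720/5000; with no objects, or with every object within the page width, it yields 1.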
[0070]
Here, a specific example of the result of the formatting process in the second embodiment is shown. Applying the formatting process of the second embodiment to the HTML input document data shown in FIG. 11, used in the first embodiment described above, yields the conversion result shown in FIG. 13. The result of actually printing the document data shown in FIG. 13 is shown in FIG. 16. From FIGS. 13 and 16, it can be seen that the table, which is the object having the maximum width, is appropriately allocated within the page, and that the character size is smaller than in FIG. 15 of the first embodiment.
[0071]
As described above, according to the second embodiment, when a document in a structured description language is formatted, setting the reference value of the physical layout based on the maximum object width makes appropriate allocation within a page possible.
[0072]
<Third Embodiment>
The third embodiment according to the present invention will be described below.
[0073]
The third embodiment is characterized in that the physical layout reference value is set based on the most frequently used character size.
[0074]
FIG. 8 is a flowchart showing the formatting process in the third embodiment. Note that the system configuration, the main process of the printing operation, and the drawing process in the third embodiment are the same as those in the first embodiment described above, and a description thereof will be omitted.
[0075]
First, in step S801, the document data is searched from the top, and when character data is detected, the character size is calculated. In step S802, the number of characters for each character size is counted and stored. In step S803, it is determined whether or not all the document data has been searched, and the processing in steps S801 to S802 is repeated until the search is completed, thereby completing a list of the character sizes used in the document and the number of characters of each size.
[0076]
If the search of all document data is completed in step S803, the process proceeds to step S804, and the most frequently used character size freq is detected from the stored character size list. In step S805, the enlargement ratio when the detected most frequent character size freq is assumed to be 10 points is calculated. Since this enlargement ratio becomes a reference for the physical layout, in step S806, allocation to physical pages is performed sequentially from the top of the document using this enlargement ratio.
[0077]
In step S807, it is determined whether or not allocation to physical pages has been completed for all document data, and the allocation processing in step S806 is repeated until the allocation is completed.
[0078]
Here, a specific example of the formatting process shown in FIG. 8 will be given. If, for example, the character sizes used in the document are listed in steps S801 to S803 as 8 points for 10 characters, 10 points for 400 characters, and 20 points for 8 characters, 10 points is detected as the most frequent character size freq in step S804. Then, in step S805, the enlargement ratio for making this 10 points is calculated as 10/10 times (= 1 time). Therefore, the other listed character sizes, 8 and 20 points, are also allocated to physical pages at 8 and 20 points, respectively. Note that not only characters but also other figures (objects) such as tables and border lines are allocated to physical pages after being multiplied by this enlargement ratio (1 time in the above example).
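Steps S802 to S805 can be sketched with `collections.Counter`. The function name is hypothetical, a ratio of 1 is assumed for a document with no character data, and ties between equally frequent sizes are resolved by `Counter.most_common`, which the text does not specify. The sample counts in the usage note assume sizes of 8, 10, and 20 points with 10, 400, and 8 characters respectively, which is one reading of the example above.

```python
from collections import Counter

TARGET_FREQ_POINTS = 10.0  # the embodiment regards the most frequent size as 10 points


def ratio_from_most_frequent(size_counts: Counter, target: float = TARGET_FREQ_POINTS) -> float:
    """S804-S805: detect the most frequent size and return target / freq.

    size_counts maps a character size in points to its number of characters.
    """
    if not size_counts:
        return 1.0  # assumption: no character data means no scaling
    freq_size, _count = size_counts.most_common(1)[0]
    return target / freq_size
```

Calling `ratio_from_most_frequent(Counter({8: 10, 10: 400, 20: 8}))` picks 10 points as the most frequent size, so the ratio is 1 and all sizes are left unchanged.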
[0079]
Here, a specific example of the result of the formatting process in the third embodiment is shown. Applying the formatting process of the third embodiment to the HTML input document data shown in FIG. 11, used in the first embodiment, yields the conversion result shown in FIG. 14. The result of actually printing the document data shown in FIG. 14 is shown in FIG. 17.
[0080]
According to this example, since the most frequently used character size ("This document ... web browser.") shown in FIG. 11 is 6 points, the enlargement ratio is calculated as 10/6 times. Accordingly, since the size of the title character ("Sample") in FIG. 11 is 14 points and the size of the minimum character ("table 1") is 4 points, it can be seen from FIG. 14 that the title character size is converted to 14 × 10/6 = 23.33 (23) points and the minimum character size to 4 × 10/6 = 6.67 (7) points. It can also be seen that this enlargement ratio (10/6 times) is applied to the table object having the maximum width, which just barely fits within the page.
[0081]
As described above, according to the third embodiment, when a document in a structured description language is formatted, setting the reference value of the physical layout based on the most frequently used character size makes appropriate allocation within a page possible.
[0082]
As in the first embodiment, the calculation of the enlargement ratio in the present embodiment is not limited to the example in which the detected most frequent character size freq is enlarged to 10 points, and applies equally when enlargement to another size is assumed. Also, when converting the character size, in addition to multiplying by the enlargement ratio, optimization processing such as rounding up to an appropriate point size may be performed.
[0083]
<Fourth Embodiment>
The fourth embodiment according to the present invention will be described below.
[0084]
The fourth embodiment is characterized in that any one of the formatting processes shown in the first to third embodiments described above can be freely selected. That is, the system configuration and the main processing and drawing processing of the printing operation in the fourth embodiment are the same as those in the first embodiment described above, but the physical layout reference value in the formatting process is set based on one of the minimum character size, the maximum object width, and the most frequent character size.
[0085]
Specifically, the operation panel 1012 of the printer 1000 need only be configured so that the user can select any one of the above three types as the formatting reference value, that is, any of the layout methods corresponding to the first to third embodiments.
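The selection among the three reference values can be sketched as a dispatch on a user-chosen setting. The enum `LayoutBasis`, the function `scaling_factor`, and its parameters are hypothetical names; the 8-point and 10-point targets follow the first and third embodiments, and the fallbacks to a ratio of 1 for empty inputs are assumptions.

```python
from collections import Counter
from enum import Enum
from typing import Sequence


class LayoutBasis(Enum):
    """Hypothetical panel setting selecting the formatting reference value."""
    MIN_CHAR_SIZE = "first embodiment"
    MAX_OBJECT_WIDTH = "second embodiment"
    MOST_FREQUENT_CHAR_SIZE = "third embodiment"


def scaling_factor(
    basis: LayoutBasis,
    size_counts: Counter,          # character size (points) -> number of characters
    object_widths: Sequence[int],  # object widths in dots
    page_width: int,               # printable width in dots
) -> float:
    """Return the scaling factor for the selected layout method."""
    if basis is LayoutBasis.MIN_CHAR_SIZE:
        # Minimum character size is regarded as 8 points.
        return 8.0 / min(size_counts) if size_counts else 1.0
    if basis is LayoutBasis.MAX_OBJECT_WIDTH:
        # Widest object is shrunk to the page width only when it overflows.
        max_width = max(object_widths, default=0)
        return 1.0 if max_width <= page_width else page_width / max_width
    if basis is LayoutBasis.MOST_FREQUENT_CHAR_SIZE:
        # Most frequent character size is regarded as 10 points.
        if not size_counts:
            return 1.0
        return 10.0 / size_counts.most_common(1)[0][0]
    raise ValueError(f"unknown layout basis: {basis!r}")
```

Keeping the three calculations behind one function means the allocation step itself does not need to know which method the user selected.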
[0086]
Further, if the formatting result based on the selected reference value can be previewed page by page on the CRT 10 connected to the host computer, operability is further improved.
[0087]
As described above, according to the fourth embodiment, formatting as desired by the user can be performed. For example, one of the output forms shown in FIGS. 15 to 17 can be selected for the document data described in HTML shown in FIG.
[0088]
<Other embodiments>
In the first and third embodiments described above, examples in which the character size is calculated as a point value have been described. However, the present invention can also be applied to a document in which the character size is specified by other expressions (for example, large/medium/small) by holding the ratio of each size (large/medium/small) in advance.
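Holding the ratio of each size in advance could be sketched as a simple lookup table. The ratio values and the base size below are purely hypothetical, since the text gives no concrete figures; only the idea of a pre-stored ratio per keyword comes from the description.

```python
# Hypothetical pre-stored ratios for relative size keywords.
RELATIVE_SIZE_RATIOS = {"small": 0.8, "medium": 1.0, "large": 1.2}
BASE_POINTS = 10.0  # hypothetical point size assigned to "medium"


def keyword_to_points(keyword: str, base: float = BASE_POINTS) -> float:
    """Convert a relative size keyword to a point value using the stored ratio."""
    return base * RELATIVE_SIZE_RATIOS[keyword]
```

Once every keyword is mapped to a point value in this way, the minimum or most frequent size can be found as in the first and third embodiments.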
[0089]
Further, in each of the above-described embodiments, a document described in HTML has been used as an example of the document data to be formatted. However, the present invention is not limited to this example; as long as a structured description language is used, other languages such as XML and SGML may of course be used.
[0090]
Note that the present invention may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).
[0091]
Needless to say, the object of the present invention can also be achieved by supplying a storage medium (or recording medium) on which program code of software realizing the functions of the above-described embodiments is recorded to a system or apparatus, and having the computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium. In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention. Further, it goes without saying that the present invention covers not only the case where the functions of the above-described embodiments are realized by the computer executing the read program code, but also the case where an operating system (OS) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0092]
Furthermore, it goes without saying that the present invention also covers the case where, after the program code read from the storage medium is written into a memory provided in a function expansion card inserted into the computer or a function expansion unit connected to the computer, a CPU or the like provided in the function expansion card or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and the functions of the above-described embodiments are realized by that processing.
[0093]
[Effect of the invention]
As described above, according to the present invention, a layout with a high degree of freedom can be realized when document data described in a structured description language is laid out on a physical page.
[Brief description of the drawings]
FIG. 1 is a side sectional view of a printer according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating a configuration of a printing system according to the present embodiment.
FIG. 3 is a block diagram illustrating a functional configuration of a printer according to the present exemplary embodiment.
FIG. 4 is a flowchart illustrating print processing according to the present exemplary embodiment.
FIG. 5 is a flowchart showing formatting processing in the present embodiment.
FIG. 6 is a flowchart illustrating a drawing process in the present embodiment.
FIG. 7 is a flowchart showing formatting processing in the second embodiment.
FIG. 8 is a flowchart showing formatting processing in the third embodiment.
FIG. 9 is a diagram illustrating a display example of document data in a structured description language without a page concept.
FIG. 10 is a diagram showing a display example of the document data shown in FIG. 9 after page layout.
FIG. 11 is a diagram illustrating an example of document data described in HTML.
FIG. 12 is a diagram showing an example of document data after formatting in the present embodiment.
FIG. 13 is a diagram showing an example of document data after formatting in the second embodiment.
FIG. 14 is a diagram showing an example of document data after formatting in the third embodiment.
FIG. 15 is a diagram illustrating an example of printing after formatting in the present embodiment.
FIG. 16 is a diagram illustrating a print example after formatting in the second embodiment.
FIG. 17 is a diagram illustrating an example of printing after formatting in the third embodiment.
[Explanation of symbols]
1000 Printer
1012 Operation panel
1100 Formatter control unit
1101 Protocol control unit
1102 Document data analysis unit
1103 Data drawing unit
1104 Page memory
1105 Physical layout processing unit
1200 Printer interface
1300 Output control unit
1400 Printer engine unit

Claims

An image processing apparatus that lays out, on a physical page, document data that is described in a structured description language in which no page is defined and that includes character data and graphic data as specific data, the apparatus comprising:
determining means for extracting, from among the plurality of specific data included in the document data, the graphic data having the maximum width by analyzing tags indicating the widths of the graphic data, and for determining a scaling factor for laying out the extracted graphic data so that it fits within the width of the printable area of the physical page, using information indicating the width of the printable area of the physical page, information indicating the resolution at the time of printing, and information indicating the width of the graphic data having the maximum width; and
image processing means for, among the plurality of specific data included in the document data, executing image processing for print processing on the character data after changing the character size by sequentially extracting tags indicating the character size and multiplying by the scaling factor determined by the determining means, and executing image processing for print processing on the graphic data after changing the size by sequentially extracting tags indicating the width and height and multiplying the width and height by the scaling factor determined by the determining means.

The image processing apparatus according to claim 1, wherein the structured description language is HTML.

The image processing apparatus according to claim 1, wherein the structured description language is XML.

The image processing apparatus according to claim 1, further comprising drawing means for drawing the document data image-processed by the image processing means.

The image processing apparatus according to claim 4, further comprising image forming means for visualizing the data drawn by the drawing means on a recording medium.

An image processing method for laying out, on a physical page, document data that is described in a structured description language in which no page is defined and that includes character data and graphic data as specific data, the method comprising:
a determining step of extracting, from among the plurality of specific data included in the document data, the graphic data having the maximum width by analyzing tags indicating the widths of the graphic data, and of determining a scaling factor for laying out the extracted graphic data so that it fits within the width of the printable area of the physical page, using information indicating the width of the printable area of the physical page, information indicating the resolution at the time of printing, and information indicating the width of the graphic data having the maximum width; and
an image processing step of, among the plurality of specific data included in the document data, executing image processing for print processing on the character data after changing the character size by sequentially extracting tags indicating the character size and multiplying by the scaling factor determined in the determining step, and executing image processing for print processing on the graphic data after changing the size by sequentially extracting tags indicating the width and height and multiplying the width and height by the scaling factor determined by the determining means.

The image processing method according to claim 6, wherein the structured description language is HTML.

The image processing method according to claim 6, wherein the structured description language is XML.

The image processing method according to claim 6, further comprising a drawing step of drawing the document data image-processed by the image processing step.

The image processing method according to claim 9, further comprising an image forming step of visualizing the data drawn in the drawing step on a recording medium.

A storage medium storing a control program for realizing the image processing method according to claim 6 by a computer.