JP2021135938A - Information processing apparatus, program and information processing method

Info

Publication number: JP2021135938A
Application number: JP2020033880A
Authority: JP
Inventors: 雅規有富; Masaki Aritomi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2020-02-28
Filing date: 2020-02-28
Publication date: 2021-09-13

Abstract

To make it easy to confirm from which position in a document image the metadata to be assigned to a scanned document was extracted.

SOLUTION: An information processing apparatus according to the present invention displays, for a document selected from a plurality of documents displayed as a list, metadata extracted from the document image associated with the selected document; and displays, for a document pointed to by a mouse cursor among the plurality of documents displayed as the list, a first thumbnail indicating from which position within the corresponding document image the metadata to be assigned to the pointed document was acquired.

SELECTED DRAWING: Figure 6

Description

The present invention relates to an apparatus, a method, and a computer program that enable a user to work efficiently when extracting information from documents.

Conventionally, there is a technique for extracting character strings by performing optical character recognition (OCR) processing on a scanned document image. It is also common to perform form identification processing that identifies the form type (such as a purchase order or an invoice) of a scanned document image, and then to extract character strings by performing OCR processing on the character areas specified based on the identified form type. Character strings extracted by OCR processing are managed in association with the document image as its metadata or index, or are passed to another system (for example, an expense settlement system or an accounting system) as business data.

Patent Document 1 discloses storing which area of a form image the character string used as metadata was extracted from and, when a scanned image similar to that form is acquired, extracting a character string by performing OCR processing based on the stored position of the character area. It also discloses displaying the scanned image with the area of the extracted character string highlighted, so that the user can confirm from which area of the scanned image the character string was extracted.

Patent Document 1: Japanese Unexamined Patent Publication No. 2019-128715

The preview screen of Patent Document 1 is configured to preview and check one scanned image at a time. Therefore, when the technique of Patent Document 1 is applied to a case where metadata is assigned to each of a plurality of scanned images, the screen must be switched each time one scanned image is checked, which makes the confirmation work laborious for the user.

In order to solve the above problem, the information processing apparatus of the present invention is characterized by: displaying a plurality of documents as a list; displaying, for a document selected from the listed documents, metadata extracted from the document image of the selected document; and displaying, for a document pointed to by a mouse cursor among the listed documents, a first thumbnail indicating from which position in the document image corresponding to that document the metadata to be assigned to the pointed document was acquired.

According to the present invention, while a plurality of documents are displayed as a list, the user can easily confirm from which position in each document image the metadata to be assigned to each document was acquired.

System configuration example
Hardware configuration example of the information processing apparatus
Software configuration example
Diagram showing an outline of the operation of the screen UI
Thumbnail display example
Diagram showing an example of the screen UI when thumbnails are displayed
Diagram showing a thumbnail display example when the mouse cursor is moved
Other display forms of the list thumbnail and the individual thumbnail
Explanatory diagram of the mode that displays thumbnails in conjunction with mouse cursor movement
Flowchart showing processing in this system
Flowchart when another thumbnail display form is applied

FIG. 1 shows an example of the system configuration of the present embodiment. Reference numeral 101 denotes a network such as the Internet or an intranet. The scan document processing server 111 executes OCR processing and the like on scanned documents (scanned images). The client terminal 121 is a terminal with which the user checks and corrects the data extracted from a scanned document. Devices such as a personal computer, a laptop computer, a tablet computer, and a smartphone can be used as the client terminal 121. The business server 131 is an external system that receives the data extracted from scanned documents and performs various kinds of processing. A scanned document is generated by reading a document with a device that has a scanning function (a scanner or a multifunction peripheral, not shown). The device with the scanning function may be connected directly to the network so that it can transmit document images to the scan document processing server 111 and the like, or it may be connected to the client terminal 121 by a cable so that document images are transmitted via the client terminal.

FIG. 2 shows a hardware configuration example of an information processing apparatus that can be used as any of the scan document processing server 111, the client terminal 121, and the business server 131. The network interface 202 is an interface for connecting to a network 105 such as a LAN and communicating with other computers and network devices. The communication method may be either wired or wireless. The ROM 204 stores embedded programs and data. The RAM 205 is a temporary memory area that can be used as a work area. The secondary storage device 206 is an HDD or a flash memory, and stores programs for performing the processing described later as well as various data. The CPU 203 executes programs read from the ROM 204, the RAM 205, the secondary storage device 206, and the like. The user interface 201 is composed of a display, a keyboard, a mouse, buttons, a touch panel, and the like; it accepts operations from the user and displays information. The processing units are connected to one another via the input/output interface 207.

FIG. 3 is a configuration diagram of the software (programs) executed by each device in the present embodiment. The software installed on each device is executed by that device's CPU, and the software components can communicate with one another as indicated by the arrows.

The scan document processing application 311 is a program installed on the scan document processing server 111. In this embodiment, the scan document processing server 111 is described as operating as a Web application server by executing the scan document processing application 311, but the configuration is not limited to this. Reference numeral 312 denotes an API (Application Programming Interface) provided by the scan document processing application 311. Reference numeral 313 denotes a Web UI provided by the scan document processing application 311.

The data store 321 is a module for storing and managing the data used by the scan document processing application 311 and by the back-end application 331 described later. The data store 321 holds the various data described next. The scan document storage unit 322 stores the image of each scanned document as an image file such as JPEG or as a document file such as PDF (Portable Document Format). The scan document job queue 323 holds a queue that manages jobs waiting for the metadata input processing described later. The metadata management unit 324 manages, for each scanned document, the list of metadata items that must be added, the name of each metadata item, the format of its value (character string, number, etc.), and so on. The scan document processing result storage unit 325 stores OCR processing results and form determination results. It also stores the metadata associated with each scanned document together with its extraction area information, the edited metadata values, and the like.

The back-end application 331 is a program for executing background processing. It is in charge of the following kinds of processing, which can be executed sequentially in the background. The OCR processing unit 332 acquires a document image from the scan document storage unit 322 and executes OCR processing. The OCR processing extracts the start point coordinates, width, and height of each area recognized as a character string, together with the recognized OCR result character string. The form processing unit 333 determines the form type using the arrangement pattern of the areas identified by area analysis of the input image, the character string information of the OCR processing result, a two-dimensional code detected in the input image, and the like. The form type may be determined by any method, such as pattern recognition or machine learning. The external system communication unit 334 transmits scanned documents, their OCR results, and the like to the external business server 131. When there is no need to transmit the scanned documents and their processing results to an external system (for example, when the processing results are stored in the scan document processing server or in the client terminal), the external system communication unit 334 can be omitted.

The client application 351 is a program executed on the client terminal. In this embodiment, it is provided as a Web application of the scan document processing application 311. That is, it can be realized by displaying the Web UI 313 in a web browser on the client terminal and exchanging the necessary data via the API 312, but the configuration is not limited to this. For example, it may be an application that runs on a computer desktop or a mobile application that runs on a smartphone or the like, created so as to exchange the necessary data via the API 312.

The business application 361 is a program executed on the business server 131. The business data storage 362 is a module for storing the data used by the business application 361. The business application 361 receives the processing results (metadata and document images) from the scan document processing server and executes processing for various kinds of business, such as file management, document management, order management, and accounting. The type of business is not limited.

With reference to FIG. 4, an outline of the operation of the screen UI displayed by executing the client application 351 on the client terminal will be described. When the client application is a Web application, this is the screen displayed in the web browser.

The following description uses, as an example, a case where a plurality of character string data are extracted from a plurality of character areas contained in each of a plurality of scanned documents, and the extracted character string data are associated with each scanned document as metadata. The character area from which character string data is extracted (the extraction area) may be a character area at a position predetermined for each form format, or a character area located to the right of or below a predetermined keyword (item name) written in the scanned document. Each scanned document is assumed to contain a plurality of extraction areas.

The screen UI 401 of the application 351 includes a document list pane 411 in which a plurality of scanned documents are displayed in list form. In the document list pane 411, information for identifying the image of each page of a scanned document (for example, the scan date and time, the identifier of the scanner device that performed the reading, and information indicating which page the scanned image is) is displayed for each page image.

When the user selects one of the listed items (for example, 412) with a single click, the information that was extracted from the selected page image and is to be assigned as metadata of that page image is displayed in the proofreading input pane 421. At this time, the field of the selected item is surrounded by a thick line indicating the selected state. The form type of each page image has been determined by the form processing unit 333; the metadata item names to be assigned to the page image are specified based on the determined form type, and the information corresponding to each metadata item name is set based on the OCR result. In the example of the screen UI 401, the page image corresponding to the item 412 selected from the list is assumed to have been determined to have the form type "Order sheet". Three metadata item names predefined for that form type (Custom, Bizcode, Price) are specified, and the information corresponding to each item name is set based on the OCR result of the page image and displayed in the field to the right of the item name. The user can correct the information corresponding to the metadata item names displayed in the proofreading input pane 421.
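The lookup from form type to metadata item names described above can be illustrated with a minimal sketch. Only the form type "Order sheet" and the item names Custom, Bizcode, and Price come from the description; the table name, the function `build_metadata`, and the choice to leave unrecognized items empty are assumptions made for illustration.

```python
# Hypothetical table: form type -> metadata item names predefined for it.
# Only the "Order sheet" entry is taken from the description.
METADATA_ITEMS_BY_FORM_TYPE = {
    "Order sheet": ["Custom", "Bizcode", "Price"],
}


def build_metadata(form_type, ocr_values):
    """Return {item name: value} for the items defined for form_type.

    ocr_values maps item names to strings taken from the OCR result.
    Items without an OCR value are left empty so that the user can fill
    them in (an assumption, not stated in the description).
    """
    item_names = METADATA_ITEMS_BY_FORM_TYPE.get(form_type, [])
    return {name: ocr_values.get(name, "") for name in item_names}


# Example: only the Price field was recognized by OCR.
fields = build_metadata("Order sheet", {"Price": "1200"})
```

A form type with no predefined items yields an empty dictionary, which corresponds to a page for which no metadata fields would be shown.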

Further, when the user double-clicks one of the listed items, the page image 414 corresponding to the double-clicked item is displayed as a preview. In the present embodiment, when the user double-clicks the item 412 of a scanned document in the document list pane 411, the display changes to that of the screen UI 402. That is, within the document list pane 411, a preview display area 413 is inserted below the double-clicked item 412, and a preview of the page image 414 corresponding to that item is displayed in the area. The page image 414 displayed in the preview display area 413 can be scrolled, enlarged, or reduced so that any position in the preview image can be shown. The items that were listed below the item 412 in the screen UI 401 before the preview display area 413 appeared are moved below the preview display area 413 and listed there, as in the screen UI 402.

The orthogonal coordinate system of the page image 414 will be described with reference to 441 in FIG. 4. Reference numeral 441 schematically shows part of the upper end of the page image 414, and the upper left corner of the page image 414 is defined as the origin of the page image. The OCR processing unit 332 executes OCR processing on the page image and acquires the start point coordinates, width, and height of each of the extraction areas 442, 443, and 444 recognized as character strings. For example, the character area 444 is expressed as start point coordinates (1200, 700), width 720, and height 120.
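A record for one such area might look like the following minimal sketch. The class name `ExtractionRegion`, its field names, and the `contains` helper are hypothetical; only the values for area 444 and the top-left origin come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionRegion:
    """Rectangle in page-image coordinates (origin at the top-left corner)."""
    x: int       # start point, horizontal
    y: int       # start point, vertical
    width: int
    height: int

    def contains(self, px, py):
        """True if the point lies inside the rectangle.

        The right and bottom edges are treated as exclusive.
        """
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


# Character area 444 from the description.
region_444 = ExtractionRegion(x=1200, y=700, width=720, height=120)
```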

As in the screen UI 401, when one of the items listed in the document list pane 411 is selected, the metadata of the page image corresponding to the selected item is displayed in the proofreading input pane 421, so the user can easily correct the metadata. However, although the metadata can be corrected in the state of the screen UI 401, no preview of the page image is displayed, so the user cannot tell from which position in the page image the metadata was extracted. In the screen UI 402, a preview of the page image is displayed, so the user can identify from which position in the page image each metadata item was extracted, but the operation needed to display the preview (such as a double-click) takes extra effort. Furthermore, displaying the preview requires downloading the preview image of the page image from the scan document processing server, so the responsiveness of the screen display tends to suffer.

Therefore, the following describes an example in which, without displaying the preview image 414 of the page image, a thumbnail is displayed that simply indicates from which position in the page image each metadata item was extracted. FIG. 5(A) shows examples of thumbnails displayed by the application 351. The client application 351 receives from the scan document processing server the position information (start point coordinates, width, and height) of the character string areas used as metadata (the extraction areas) and the size information of the page image. The size information of the page image may be expressed as coordinates in the orthogonal coordinate system of the page image. Based on the received position information of the extraction areas and the size information of the page image, the application creates a thumbnail that highlights where in the page image the character string areas used as metadata are located. The thumbnail 501 represents the positions of all the character string areas extracted from the page image as indexes; when three metadata items are extracted as in the example of FIG. 4, three areas are highlighted as extraction positions. Highlighting is done by changing the color or density at the position corresponding to each area so that it is easy to see. As described later, the thumbnail 501 is displayed when the mouse cursor (also called a pointer) is over the item 412 displayed in the document list pane 411. The thumbnail 502 is displayed, as described later, when the mouse cursor is over the metadata 422 displayed in the proofreading input pane 421. When the mouse cursor is within the proofreading input pane 421, the area corresponding to the metadata item pointed to by the mouse cursor is shown on the thumbnail.
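Because only positions and the page size are received, the thumbnail can be drawn without the page image itself. The conversion from page coordinates to thumbnail coordinates can be sketched as follows. This is a minimal illustration, not the embodiment's actual implementation: the function name `scale_region`, the tuple layout, and the page and thumbnail sizes are assumptions; only the region (1200, 700), width 720, height 120 comes from the description of FIG. 4.

```python
def scale_region(region, page_size, thumb_size):
    """Map a rectangle from page-image coordinates to thumbnail pixels.

    region     -- (x, y, width, height) in page coordinates
    page_size  -- (page_width, page_height) in page coordinates
    thumb_size -- (thumb_width, thumb_height) in pixels
    """
    x, y, w, h = region
    page_w, page_h = page_size
    thumb_w, thumb_h = thumb_size
    return (
        round(x * thumb_w / page_w),
        round(y * thumb_h / page_h),
        # Keep at least one pixel so small areas stay visible.
        max(1, round(w * thumb_w / page_w)),
        max(1, round(h * thumb_h / page_h)),
    )


# Hypothetical sizes: a 2400 x 3600 page drawn as a 120 x 180 thumbnail.
rect = scale_region((1200, 700, 720, 120), (2400, 3600), (120, 180))
# rect == (60, 35, 36, 6)
```

The returned rectangle is what would be filled with a different color or density on the otherwise plain thumbnail.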

The principle of the processing for displaying a thumbnail on hover (mouseover) will be described with reference to FIG. 5(B). The area of a UI control 511 is defined in the screen UI. When the mouse cursor 512 enters the area of the UI control 511 from outside and a hover event then occurs, in which the mouse cursor 513 comes to rest within the area of the UI control 511, the thumbnail 514 is displayed for a fixed period of time.
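One way to detect such a hover event from a stream of cursor samples is sketched below. The class `HoverDetector`, the 0.5 second dwell threshold, and the sampling interface are all assumptions; the description only states that the event occurs when the cursor enters the control's area and stops there. Hiding the thumbnail again after the fixed display time is left out of the sketch.

```python
class HoverDetector:
    """Reports a hover when the cursor rests inside a rectangle."""

    def __init__(self, rect, dwell_seconds=0.5):
        self.rect = rect              # (x, y, width, height) of the UI control
        self.dwell = dwell_seconds    # hypothetical threshold
        self._last_pos = None
        self._still_since = None
        self._fired = False

    def _inside(self, x, y):
        rx, ry, rw, rh = self.rect
        return rx <= x < rx + rw and ry <= y < ry + rh

    def update(self, t, x, y):
        """Feed one cursor sample; return True once per hover event."""
        if not self._inside(x, y):
            # Leaving the control resets everything.
            self._last_pos = None
            self._still_since = None
            self._fired = False
            return False
        if (x, y) != self._last_pos:
            # Cursor entered or moved: restart the dwell timer.
            self._last_pos = (x, y)
            self._still_since = t
            self._fired = False
            return False
        if not self._fired and t - self._still_since >= self.dwell:
            self._fired = True
            return True
        return False
```

In a web browser the same effect would more likely be obtained from the platform's own mouse events than from manual sampling; the sketch only makes the enter-then-stop condition explicit.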

Specific examples of where the UI control 511 for hover display of thumbnails is defined will be described with reference to FIG. 5(C). The application 351 defines a UI control 511 for each field in the document list pane 411 that displays information about a page image. When a hover event occurs in which the mouse cursor 513 enters and rests in the area of the item 412 desired by the user, among the items displayed in the document list pane 411, the thumbnail 501 is displayed. The application 351 also defines a UI control 511 for the area 422 corresponding to each metadata item displayed in the proofreading input pane 421. When a hover event occurs in which the mouse cursor 513 enters and rests in the area 422 corresponding to a metadata item, the thumbnail 502 is displayed.
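Finding which control the cursor is over amounts to a simple hit test. The sketch below assumes a hypothetical layout in which each control is a rectangle paired with a target label; the rectangles, labels, and the function name `find_hover_target` are illustrative and do not appear in the description.

```python
def find_hover_target(controls, x, y):
    """Return the target of the first control containing (x, y), else None.

    controls -- list of ((x, y, width, height), target) pairs
    """
    for (rx, ry, rw, rh), target in controls:
        if rx <= x < rx + rw and ry <= y < ry + rh:
            return target
    return None


# Hypothetical layout: two rows in the document list on the left,
# two metadata fields in the input pane on the right.
CONTROLS = [
    ((0, 0, 300, 30), ("document", "page-1")),
    ((0, 30, 300, 30), ("document", "page-2")),
    ((320, 0, 200, 30), ("metadata", "Custom")),
    ((320, 30, 200, 30), ("metadata", "Bizcode")),
]

target = find_hover_target(CONTROLS, 350, 40)
```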

With reference to FIG. 6, an example of the screen UI when the application 351 displays a thumbnail in which the metadata extraction areas are drawn in simplified form will be described. The application 351 changes the extraction areas shown on the thumbnail according to the hover position of the mouse cursor. When the mouse cursor hovers over one of the page image items listed in the document list pane 411 of the screen UI 601, the application 351 displays a thumbnail 521 showing the positions of all the extraction areas used to extract metadata from the page image. In the following, a thumbnail showing the positions of all the character string areas (extraction areas) extracted as metadata, such as the thumbnail 521, is called a list thumbnail. When the mouse cursor hovers over one of the metadata items (422) displayed in the proofreading input pane 421 of the screen UI 602, the application 351 displays on hover a thumbnail 522 showing the position of the character string area used to extract the hovered metadata item. In the following, a thumbnail displayed when a metadata item is pointed to with the mouse, that is, one showing an individual extraction area such as the thumbnail 522, is called an individual thumbnail. If the user already knows from which position in the scanned document the metadata should be extracted, the user can judge whether the extraction position used by the system is correct simply by looking at the thumbnail 521 or 522.
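The difference between the two thumbnail kinds comes down to which extraction areas are drawn. A minimal sketch follows, with hypothetical names throughout: the pane labels, the function `regions_to_highlight`, the coordinates of the first two areas, and the assignment of item names to areas are all assumptions; only the three item names and the rectangle of area 444 come from the description.

```python
# Item name -> (x, y, width, height) in page coordinates.
EXTRACTION_REGIONS = {
    "Custom": (200, 300, 600, 100),    # hypothetical coordinates
    "Bizcode": (200, 500, 400, 100),   # hypothetical coordinates
    "Price": (1200, 700, 720, 120),    # rectangle of area 444; item assumed
}


def regions_to_highlight(pane, item_name=None, regions=EXTRACTION_REGIONS):
    """Return the rectangles to draw on the thumbnail.

    pane == "document_list": list thumbnail, every extraction area.
    pane == "metadata_pane": individual thumbnail, only the hovered item.
    """
    if pane == "document_list":
        return list(regions.values())
    if pane == "metadata_pane" and item_name in regions:
        return [regions[item_name]]
    return []


list_thumbnail = regions_to_highlight("document_list")
individual_thumbnail = regions_to_highlight("metadata_pane", "Price")
```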

With reference to FIG. 7, an example of the thumbnails displayed as the mouse cursor 513 is moved from the document list pane toward the proofreading input pane, as indicated by arrow 703, will be described. On the screen UI 701 of the application 351, hovering the mouse cursor 513 over the document list pane displays the list thumbnail 521; when the mouse cursor 513 is then moved to the proofreading input pane on the right, the display changes to the individual thumbnail 522 for the metadata item under the mouse cursor. When the mouse cursor 513 is moved up and down over the proofreading input pane, the individual thumbnail for the metadata item at the cursor position is displayed accordingly. With this configuration, the user can easily check, item by item, from which position in the page image each metadata item was extracted.

Other forms of displaying the list thumbnail and the individual thumbnail will be described with reference to FIG. 8.

Row 811 of FIG. 8 shows the thumbnail display described with reference to FIGS. 5 to 7. When the mouse cursor is over the document list pane 411, the list thumbnail 521 is displayed on hover, and when the mouse cursor is over the proofreading input pane 421, the individual thumbnail 522 is displayed on hover. In that example the two thumbnails have the same display size, but this is not a limitation. For example, as in row 812 of FIG. 8, the display size of the list thumbnail 822 may be made larger than that of the individual thumbnail.

Alternatively, as in row 813 of FIG. 8, the individual thumbnail 823 may display the extraction area corresponding to the metadata item currently pointed to by the mouse cursor in a different color or density, while displaying the areas corresponding to the other metadata items faintly.
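The display form of row 813 can be expressed as a per-area style assignment. In this sketch the style labels "emphasized" and "dimmed" and the function name `region_styles` are hypothetical; how the styles map to actual colors or densities is left open, as in the description.

```python
def region_styles(item_names, pointed_item):
    """Assign a drawing style to each extraction area on the thumbnail.

    The area of the pointed metadata item is emphasized; all others are
    kept on the thumbnail but drawn faintly.
    """
    return {
        name: "emphasized" if name == pointed_item else "dimmed"
        for name in item_names
    }


styles = region_styles(["Custom", "Bizcode", "Price"], "Bizcode")
```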

Also, as in row 814 of FIG. 8, when the user has double-clicked in the document list pane 411 and the preview of the page image (414 in FIG. 4) is being displayed, the list thumbnail 824 may be suppressed.

Also, as in row 815 of FIG. 8, when metadata could not be extracted, for example when a form in a new format is handled, the thumbnail display may indicate that the metadata could not be extracted. For example, where a list thumbnail would be displayed in the document list pane 411, a blank thumbnail 825 or a thumbnail 826 with a warning icon may be displayed, and no individual thumbnail may be displayed.
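The variations in rows 814 and 815 can be summarized as a small decision function. This is only a sketch: the function name `choose_list_thumbnail`, the returned labels, and the `use_warning_icon` flag are assumptions introduced for illustration.

```python
def choose_list_thumbnail(extraction_regions, preview_open,
                          use_warning_icon=False):
    """Decide what to show where a list thumbnail would appear.

    Returns None (show nothing), "blank", "warning", or "list".
    """
    if preview_open:
        # Row 814: the page preview is already visible.
        return None
    if not extraction_regions:
        # Row 815: no metadata could be extracted from this page.
        return "warning" if use_warning_icon else "blank"
    return "list"


kind = choose_list_thumbnail([], preview_open=False, use_warning_icon=True)
```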

Another mode for displaying the list thumbnail and the individual thumbnail will be described with reference to FIG. 9. In FIGS. 6 and 7, when the user selects a desired document with a single click in the document list pane 411, the selected document is displayed surrounded by a thick frame, and the metadata corresponding to the selected document is displayed in the proofreading input pane 421. FIG. 9 describes how thumbnails are displayed when the mouse cursor hovers over a document other than the one selected by the single click.

Suppose that, as in screen UI 901 of FIG. 9, the user moves the mouse cursor 513 in the document list pane 411 onto a document other than the currently selected one. When a mouse hover event then occurs on that other document, the application clears the selection in the document list pane and displays the list thumbnail 521 of the hovered document, as in screen UI 902 of FIG. 9. Once this mode has been entered by hovering the mouse cursor 513 over another document, moving the mouse cursor 513 onto yet another document displays the list thumbnail for the document now under the cursor. That is, when the mouse cursor 513 is moved up and down in the document list pane as indicated by arrow 930 in screen UI 903 of FIG. 9, the list thumbnail switches to that of the document currently pointed to, following the cursor. In addition, the application 351 displays, in the proofreading input pane 421, the metadata of the document currently pointed to, following the cursor movement in the document list pane. When the mouse cursor 513 is then moved into the proofreading input pane 421 and onto one of the displayed metadata items, the individual thumbnail 522 for that metadata item of the document is displayed. In this way, once the thumbnail display mode has been entered by hovering over a document other than the selected one, the list thumbnail and metadata of the corresponding document are displayed in step with the cursor's movement in the document list pane. This mode is cancelled when any document in the document list pane 411 is selected with a single click, as in screen UI 904, and the application returns to the original mode.

In the mode of FIG. 9, the list thumbnail 521 and the metadata are displayed in step with the movement of the mouse cursor 513, so the user can easily check the metadata extracted from each document simply by moving the mouse cursor up and down in the document list pane.
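The mode switching of FIG. 9 can be pictured as a small state holder. The following Python sketch is an assumption-laden illustration, not the embodiment's implementation: the class `ListPaneState`, its field names, and the use of integer document indices are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListPaneState:
    """Hypothetical model of the document list pane and what it causes to be shown."""
    selected: Optional[int] = None            # document focused by a single click
    browsing: bool = False                    # True while in the FIG. 9 hover mode
    shown_metadata_for: Optional[int] = None  # document whose metadata fills the proofreading pane
    list_thumbnail_for: Optional[int] = None  # document whose list thumbnail is shown

    def click(self, doc: int) -> None:
        # A single click selects the document and leaves the hover mode (screen UI 904).
        self.selected = doc
        self.browsing = False
        self.shown_metadata_for = doc

    def hover(self, doc: int) -> None:
        if self.browsing or doc != self.selected:
            # Hovering a non-selected document clears the selection and enters
            # (or stays in) the hover mode; the metadata follows the cursor.
            self.selected = None
            self.browsing = True
            self.shown_metadata_for = doc
        # In either mode the list thumbnail of the hovered document is shown.
        self.list_thumbnail_for = doc


state = ListPaneState()
state.click(0)   # select the first document
state.hover(2)   # hover another document: selection cleared, hover mode entered
state.hover(3)   # thumbnail and metadata follow the cursor
state.click(1)   # a click returns to the original mode
print(state)
```

Keeping the mode in one flag makes the exit condition explicit: only a click resets `browsing`, which matches the description that the mode is cancelled by a single click.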

The processing flow in the system of this embodiment will be described with reference to FIG. 10.

ステップＳ１００２において、スキャン文書処理サーバー１１１は、スキャン機能を有する装置またはクライアント端末から、複数の文書のスキャン画像を受信する。 In step S1002, the scan document processing server 111 receives scanned images of a plurality of documents from a device having a scanning function or a client terminal.

ステップＳ１００３において、スキャン文書処理サーバー１１１は、当該受信した複数のスキャン画像に対して、帳票種別の判別とＯＣＲ処理とを行って、複数のスキャン画像からメタデータを抽出する。 In step S1003, the scan document processing server 111 performs form type determination and OCR processing on the received plurality of scanned images, and extracts metadata from the plurality of scanned images.
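To make step S1003 concrete, the sketch below shows one way form type determination followed by per-region OCR could yield metadata together with the position it came from. It is a minimal Python sketch under stated assumptions: `LAYOUTS`, `extract_metadata`, and the `classify` and `ocr` callables are hypothetical stand-ins, and no real OCR library is called.

```python
from typing import Callable, Dict, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)

# Hypothetical layout table: form type -> metadata key -> region to read.
LAYOUTS: Dict[str, Dict[str, Box]] = {
    "invoice": {"vendor": (40, 30, 200, 24), "total": (320, 500, 120, 24)},
}


def extract_metadata(image, classify: Callable, ocr: Callable) -> dict:
    """Classify the form, then read each region registered for that form type."""
    form_type = classify(image)
    layout = LAYOUTS.get(form_type)
    if layout is None:
        # Unknown format: nothing can be extracted yet.
        return {"form_type": form_type, "items": {}}
    items = {key: {"value": ocr(image, box), "box": box} for key, box in layout.items()}
    return {"form_type": form_type, "items": items}


# Stand-in "image": a dict that already knows its type and the text in each box.
fake_image = {
    "type": "invoice",
    "text": {(40, 30, 200, 24): "ACME", (320, 500, 120, 24): "1,200"},
}
result = extract_metadata(
    fake_image,
    classify=lambda img: img["type"],
    ocr=lambda img, box: img["text"][box],
)
print(result["items"]["total"])
```

Storing the box next to each value is what later allows a thumbnail to show where in the document image a metadata item was taken from.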

ステップＳ１００４において、アプリケーション（クライアントアプリケーション）３５１は、スキャン文書処理サーバー１１１から、複数のスキャン画像に関する情報のリスト（文書一覧ペインに表示される情報）を取得して表示する。 In step S1004, the application (client application) 351 acquires and displays a list of information (information displayed in the document list pane) related to a plurality of scanned images from the scan document processing server 111.

ステップＳ１００５において、アプリケーション３５１は、文書一覧ペインに一覧表示される複数の文書のうち、最初の文書を選択状態にする（すなわち、フォーカスをあてた状態にして太線枠で囲んで表示する）。 In step S1005, the application 351 selects the first document among the plurality of documents listed in the document list pane (that is, puts the focus on the document and displays it surrounded by a thick line).

In step S1011, the application 351 displays the metadata extracted from the scanned image of the focused document in the proofreading input pane 421.

ステップＳ１０２０において、アプリケーション３５１は、当該選択状態の文書がダブルクリックされたかどうか判定し、ダブルクリックされたと判定した場合はステップＳ１０２１に進み、ダブルクリックされていないならばステップＳ１０２５へ進む。 In step S1020, the application 351 determines whether or not the document in the selected state is double-clicked, and if it is determined that the document is double-clicked, the process proceeds to step S1021. If the document is not double-clicked, the process proceeds to step S1025.

ステップＳ１０２１において、アプリケーション３５１は、当該ダブルクリックされた文書のスキャン画像のプレビューを、プレビュー表示領域４１３に表示する。 In step S1021, the application 351 displays a preview of the scanned image of the double-clicked document in the preview display area 413.

ステップＳ１０２５において、アプリケーション３５１は、文書一覧ペインに一覧表示されている複数の文書のうちの別の文書がクリックされたか判定し、クリックされたと判定した場合はステップＳ１０２６へ進む。一方、クリックされていないならばステップＳ１０３０へ進む。 In step S1025, the application 351 determines whether another document among the plurality of documents listed in the document list pane has been clicked, and if it determines that the document has been clicked, proceeds to step S1026. On the other hand, if it is not clicked, the process proceeds to step S1030.

ステップＳ１０２６において、アプリケーション３５１は、当該クリックされた文書を選択状態にする（すなわち、当該クリックされた文書にフォーカスをあてる）。 In step S1026, application 351 selects the clicked document (ie, focuses on the clicked document).

ステップＳ１０３０において、アプリケーション３５１は、フォーカスしている文書（選択状態の文書）上でマウスカーソルがホバーされたか（ホバー・イベント１が発生したか）を判定し、ホバー・イベント１が発生したと判定した場合は、ステップＳ１０３１へ進む。一方、ホバー・イベント１が発生していないならばステップＳ１０３５へ進む。 In step S1030, the application 351 determines whether the mouse cursor has been hovered (whether hover event 1 has occurred) on the focused document (selected document), and determines that hover event 1 has occurred. If so, the process proceeds to step S1031. On the other hand, if the hover event 1 has not occurred, the process proceeds to step S1035.

ステップＳ１０３１において、アプリケーション３５１は、当該マウスカーソルがホバーされた文書に関する一覧サムネイルを表示する。 In step S1031, application 351 displays a list thumbnail of the document hovered by the mouse cursor.

In step S1035, the application 351 determines whether the mouse cursor has hovered within the row area of the metadata displayed in the proofreading input pane 421 (whether hover event 2 has occurred). If hover event 2 has occurred, the process proceeds to step S1036; otherwise, it proceeds to step S1040.

ステップＳ１０３６において、アプリケーション３５１は、当該マウスカーソルがポイントしているメタデータに対応する抽出領域の位置を強調表示した個別サムネイルを表示する。 In step S1036, application 351 displays an individual thumbnail highlighting the position of the extraction area corresponding to the metadata pointed to by the mouse cursor.

ステップＳ１０４０において、アプリケーション３５１は、フォーカスしていない別文書（選択状態でない文書）上でマウスカーソルがホバーされたか（ホバー・イベント３が発生したか）を判定する。ホバー・イベント３が発生したと判定した場合はステップＳ１０４１へ進み、ホバー・イベント３が発生していないならばステップＳ１０５１へ進む。 In step S1040, the application 351 determines whether the mouse cursor has been hovered (whether the hover event 3 has occurred) on another document that is not in focus (a document that is not in the selected state). If it is determined that the hover event 3 has occurred, the process proceeds to step S1041, and if the hover event 3 has not occurred, the process proceeds to step S1051.

ステップＳ１０４１において、アプリケーション３５１は、選択状態になっていた文書のフォーカスを解除し、図９で説明したサムネイル表示モードにする。 In step S1041, the application 351 releases the focus of the selected document and sets it to the thumbnail display mode described with reference to FIG.

ステップＳ１０４２において、アプリケーション３５１は、マウスカーソルがポイントしている文書の一覧サムネイルを表示する。 In step S1042, application 351 displays a list thumbnail of the document pointed to by the mouse cursor.

In step S1043, the application 351 displays the metadata of the document pointed to by the mouse cursor in the proofreading input pane 421.

In step S1044, the application 351 determines whether the mouse cursor has moved to the position of another document. If so, the process returns to step S1042 and displays the list thumbnail of the document the mouse cursor points to after the move. Otherwise, the process proceeds to step S1045.

ステップＳ１０４５において、アプリケーション３５１は、当該文書がクリックされたか判定し、クリックされたと判定した場合はステップＳ１０２６へ進み、クリックされていないと判定した場合はステップＳ１０４６へ進む。 In step S1045, the application 351 determines whether the document has been clicked, proceeds to step S1026 if it determines that the document has been clicked, and proceeds to step S1046 if it determines that the document has not been clicked.

In step S1046, the application 351 determines whether the mouse cursor has moved into the proofreading input pane. If so, the process proceeds to step S1047; otherwise, it proceeds to step S1048.

ステップＳ１０４７において、アプリケーション３５１は、マウスカーソルがポイントしているメタデータに対応する抽出領域の位置を強調表示した個別サムネイルを表示する。 In step S1047, application 351 displays an individual thumbnail highlighting the position of the extraction area corresponding to the metadata pointed to by the mouse cursor.

ステップＳ１０４８において、アプリケーション３５１は、文書の表示・校正の完了指示がなされたか判定し、完了指示されたと判定した場合は処理を終了し、完了指示されていない場合はステップＳ１０４４に戻る。 In step S1048, the application 351 determines whether or not the document display / proofreading completion instruction has been given, ends the process if it determines that the completion instruction has been given, and returns to step S1044 if the completion instruction has not been given.

ステップＳ１０５１において、アプリケーション３５１は、文書の表示・校正の完了指示がなされたか判定し、完了指示されたと判定した場合は処理を終了し、完了指示されていない場合はステップＳ１０１１に戻る。 In step S1051, the application 351 determines whether or not the document display / proofreading completion instruction has been given, ends the process if it determines that the completion instruction has been given, and returns to step S1011 if the completion instruction has not been given.
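The order of the checks in steps S1020 to S1051 can be summarized as a dispatch on the incoming UI event. The Python sketch below only illustrates that ordering for the normal (non-hover) mode; the `Event` type, its `kind` strings, and the function `next_step` are hypothetical and are not part of the embodiment.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """Hypothetical UI event: what happened and, if relevant, on which document."""
    kind: str                     # "double_click", "click", "hover_doc", "hover_metadata", "done"
    target: Optional[int] = None  # document index, where applicable


def next_step(event: Event, focused: int) -> str:
    """Return the step the flow of FIG. 10 moves to, checking in flowchart order."""
    if event.kind == "double_click" and event.target == focused:
        return "S1021"   # S1020: show the preview
    if event.kind == "click" and event.target != focused:
        return "S1026"   # S1025: focus the clicked document
    if event.kind == "hover_doc" and event.target == focused:
        return "S1031"   # S1030: list thumbnail of the focused document
    if event.kind == "hover_metadata":
        return "S1036"   # S1035: individual thumbnail
    if event.kind == "hover_doc":
        return "S1041"   # S1040: hover on a non-focused document, enter the FIG. 9 mode
    if event.kind == "done":
        return "END"     # S1051: completion instructed
    return "S1011"       # nothing matched: redisplay metadata and check again


for ev in (Event("hover_doc", 0), Event("hover_doc", 3), Event("hover_metadata")):
    print(ev.kind, ev.target, "->", next_step(ev, focused=0))
```

Because the hover check for the focused document comes before the check for other documents, the same `hover_doc` event leads to different steps depending only on `focused`.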

FIG. 11 is a flowchart showing the processing flow when the display forms of rows 812, 814, and 815 of FIG. 8 are adopted for the list thumbnail and the individual thumbnail.

ステップＳ１１２０において、アプリケーション３５１は、文書一覧ペインに表示されている文書がダブルクリックされて、当該文書のプレビューが表示されているか判定する。プレビュー表示がなされているならステップＳ１１４０へ進み、プレビューが表示されていないならステップＳ１１３０へ進む。 In step S1120, the application 351 determines whether the document displayed in the document list pane is double-clicked to display a preview of the document. If the preview is displayed, the process proceeds to step S1140, and if the preview is not displayed, the process proceeds to step S1130.

In step S1130, the application 351 determines whether the mouse cursor has hovered over a document in the document list pane 411 (whether a hover event has occurred). If so, the process proceeds to step S1131; otherwise, it proceeds to step S1140.

In step S1131, the application 351 determines whether the hovered document is a form whose metadata has not been extracted and whose metadata extraction positions have not yet been learned. If the form is determined to be unlearned, the process proceeds to step S1136. If the metadata of the hovered document has already been extracted, the process proceeds to step S1132.

ステップＳ１１３２において、アプリケーション３５１は、当該ホバーされた文書の一覧サムネイルの表示サイズとして、個別サムネイルより大きいサイズを設定する。 In step S1132, the application 351 sets a size larger than the individual thumbnail as the display size of the list thumbnail of the hovered document.

ステップＳ１１３３において、アプリケーション３５１は、ステップＳ１１３２で設定された表示サイズで、当該ホバーされた文書の一覧サムネイルを表示する。 In step S1133, the application 351 displays a list thumbnail of the hovered document in the display size set in step S1132.

ステップＳ１１３６において、アプリケーション３５１は、警告アイコンを表示する。 In step S1136, application 351 displays a warning icon.

In step S1140, the application 351 determines whether the mouse cursor has hovered over the metadata displayed in the proofreading input pane 421. If so, the process proceeds to step S1141; otherwise, it proceeds to step S1160.

ステップＳ１１４１において、アプリケーション３５１は、当該文書のメタデータが未抽出であるか判定し、未抽出であると判定した場合はステップＳ１１６０に進む。一方、文書のメタデータが抽出済みである場合は、ステップＳ１１４２へ進む。 In step S1141, the application 351 determines whether the metadata of the document has not been extracted, and if it determines that the metadata has not been extracted, proceeds to step S1160. On the other hand, if the metadata of the document has been extracted, the process proceeds to step S1142.

ステップＳ１１４２において、アプリケーション３５１は、個別サムネイルの表示サイズとして、一覧サムネイルより小さいサイズを設定する。 In step S1142, the application 351 sets a size smaller than the list thumbnail as the display size of the individual thumbnail.

ステップＳ１１４３において、アプリケーション３５１は、ステップＳ１１４２で設定された表示サイズで、当該ホバーされたメタデータの抽出に用いた領域を強調表示した個別サムネイルを表示する。 In step S1143, application 351 displays an individual thumbnail highlighting the area used to extract the hovered metadata at the display size set in step S1142.

ステップＳ１１６０において、アプリケーション３５１は、文書の表示・校正の完了指示がなされたか判定し、完了指示されたと判定した場合は処理を終了し、完了指示されていない場合はステップＳ１１２０に戻る。 In step S1160, the application 351 determines whether or not the document display / proofreading completion instruction has been given, ends the process if it determines that the completion instruction has been given, and returns to step S1120 if the completion instruction has not been given.
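The branching of FIG. 11 can be condensed into a single decision function. This is a minimal Python sketch assuming a single `metadata_extracted` flag stands in for the "not extracted and not yet learned" test of step S1131; the function name, the string labels, and the pixel sizes are hypothetical (the description only requires the list thumbnail to be larger than the individual thumbnail).

```python
from typing import Optional, Tuple

# Hypothetical pixel sizes; only the relation "list > individual" comes from the description.
LIST_SIZE = (240, 320)
INDIVIDUAL_SIZE = (120, 160)


def thumbnail_for_hover(target: Optional[str], preview_open: bool,
                        metadata_extracted: bool) -> Tuple[Optional[str], Optional[Tuple[int, int]]]:
    """Decide what to show for one pass through FIG. 11.

    target is "document" (cursor over the document list pane),
    "metadata" (cursor over the proofreading input pane), or None.
    """
    if target == "document" and not preview_open:      # S1120, S1130
        if not metadata_extracted:                      # S1131
            return ("warning", None)                    # S1136
        return ("list", LIST_SIZE)                      # S1132, S1133
    if target == "metadata" and metadata_extracted:     # S1140, S1141
        return ("individual", INDIVIDUAL_SIZE)          # S1142, S1143
    return (None, None)                                 # nothing shown; on to S1160


print(thumbnail_for_hover("document", preview_open=False, metadata_extracted=True))
print(thumbnail_for_hover("document", preview_open=True, metadata_extracted=True))
print(thumbnail_for_hover("metadata", preview_open=False, metadata_extracted=False))
```

Note that an open preview suppresses only the list thumbnail; hovering over metadata still yields the individual thumbnail, as in the flowchart.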

As described above, according to the present embodiment, while a plurality of scanned documents are displayed in a list, a thumbnail is displayed that shows at a glance from which position in each document image the metadata to be assigned to that document image was extracted. As a result, the user can easily determine whether the metadata assigned to each document image was extracted from the correct position, and can also display a preview of the document image to check the details when necessary. For example, for a document in a format that the user scans frequently, it may be enough to check only the extraction positions of the metadata. Applying the present invention therefore reduces the user's effort up to the point where the document image is saved with its metadata.

<Other Examples>
In the above-described embodiment, the scan document processing server is configured to perform the analysis processing of the scanned images (form identification, OCR processing, and the like), but the configuration is not limited to this, and the client terminal may also be configured to perform the analysis processing of the scanned images. Further, the scan document processing server may be realized by a single computer, or may be configured to analyze the scanned images by using cloud computing.

Although preferred examples of the present invention have been described in detail above, the present invention is not limited to these specific examples, and various modifications and changes are possible within the scope of the gist of the present invention described in the claims.

Claims

An information processing apparatus that:
displays a list of a plurality of documents;
displays, for a document selected from the plurality of documents displayed in the list, metadata extracted from a document image associated with the selected document; and
displays, for a document pointed to by a mouse cursor among the plurality of documents displayed in the list, a first thumbnail indicating from which position in the document image corresponding to that document the metadata to be assigned to the pointed document was acquired.

The information processing apparatus according to claim 1, wherein, when the displayed metadata is pointed to with the mouse cursor, a second thumbnail is displayed indicating from which position in the document image corresponding to the document the pointed metadata was acquired.

The information processing apparatus according to claim 2, wherein the first thumbnail is larger than the second thumbnail.

The information processing apparatus according to any one of claims 1 to 3, wherein the first thumbnail indicates from which positions in the document image corresponding to the document all of the plurality of metadata items to be assigned to the pointed document were acquired.

The information processing apparatus according to any one of claims 1 to 4, wherein a warning is displayed for a document pointed to by the mouse cursor among the plurality of documents displayed in the list when the metadata to be assigned to the pointed document has not been acquired.

The information processing apparatus according to any one of claims 1 to 5, wherein, when any of the plurality of documents displayed in the list is double-clicked, a preview of the double-clicked document is displayed.

The information processing apparatus according to claim 1, wherein the first thumbnail is displayed when the mouse cursor is hovered over any of the plurality of documents displayed in the list.

The information processing apparatus according to claim 2, wherein the second thumbnail is displayed when the mouse cursor is hovered on the displayed metadata.

The information processing apparatus according to claim 1, wherein, when the mouse cursor is hovered over a document different from the selected document among the listed documents, the selected document is deselected and the first thumbnail of the document pointed to by the mouse cursor is displayed.

The information processing apparatus according to any one of claims 1 to 9, wherein the list of the plurality of documents is displayed in a first pane of a screen, and the metadata is displayed in a second pane of the screen.

A program for causing a computer to function as the information processing device according to any one of claims 1 to 10.

An information processing method comprising:
displaying a list of a plurality of documents;
displaying, for a document selected from the plurality of documents displayed in the list, metadata extracted from a document image associated with the selected document; and
displaying, for a document pointed to by a mouse cursor among the plurality of documents displayed in the list, a first thumbnail indicating from which position in the document image corresponding to that document the metadata to be assigned to the pointed document was acquired.