JP2024079918A

JP2024079918A - Image processing system, image processing method and image processing program

Info

Publication number: JP2024079918A
Application number: JP2022192608A
Authority: JP
Inventors: 輝彦松岡; Teruhiko Matsuoka; 大作今泉; Daisaku Imaizumi; 央光加藤木; Hisamitsu Katogi
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2022-12-01
Filing date: 2022-12-01
Publication date: 2024-06-13

Abstract

To provide an image processing system, an image processing method and an image processing program that appropriately extract an item value of an extraction object from image data. SOLUTION: In an image processing system, an extraction processing unit of an image processing apparatus extracts a plurality of item names and a plurality of item values corresponding to each of the plurality of item names from image data P1. An identification processing unit identifies a plurality of related item names associated with an object item name of a plurality of item names B1 through B3 extracted by the extraction processing unit, and identifies a plurality of related item values corresponding to each of the plurality of related item names, of the plurality of item values A2, A3 and A6 extracted by the extraction processing unit. An output processing unit outputs a calculated item value that is calculated on the basis of the plurality of related item values identified by the identification processing unit as an object item value. SELECTED DRAWING: Figure 4

Description

The present disclosure relates to an image processing system, an image processing method, and an image processing program that perform image processing, such as character recognition, on image data.

Techniques for extracting character strings from image data of documents such as forms are known. For example, one known technique extracts item values and item names from character strings obtained by character recognition of image data, determines the item value corresponding to each item name based at least on the positional relationship between the item value and the item name, and generates combinations of one item name and one or more item values such that a determined item value is not associated with multiple item names (see, for example, Patent Document 1).

Patent Document 1: JP 2021-99688 A

However, because the conventional technique generates combinations of item values and item names based on their positional relationship, it cannot correctly identify the combination of the item value and item name to be extracted when that positional relationship does not match a predetermined rule.

An object of the present disclosure is to provide an image processing system, an image processing method, and an image processing program capable of appropriately extracting the item value to be extracted from image data.

An image processing system according to one aspect of the present disclosure extracts, from image data, a target item value corresponding to a target item name to be extracted. The image processing system includes an extraction processing unit, an identification processing unit, and an output processing unit. The extraction processing unit extracts, from the image data, a plurality of item names and a plurality of item values corresponding to each of the plurality of item names. The identification processing unit identifies, among the plurality of item names extracted by the extraction processing unit, a plurality of related item names related to the target item name, and identifies, among the plurality of item values extracted by the extraction processing unit, a plurality of related item values corresponding to each of the plurality of related item names. The output processing unit outputs, as the target item value, a calculated item value calculated based on the plurality of related item values identified by the identification processing unit.

An image processing method according to another aspect of the present disclosure extracts, from image data, a target item value corresponding to a target item name to be extracted. In the image processing method, one or more processors execute: extracting, from the image data, a plurality of item names and a plurality of item values corresponding to each of the plurality of item names; identifying, among the plurality of item names, a plurality of related item names related to the target item name, and identifying, among the plurality of item values, a plurality of related item values corresponding to each of the plurality of related item names; and outputting, as the target item value, a calculated item value calculated based on the plurality of related item values.

An image processing program according to another aspect of the present disclosure extracts, from image data, a target item value corresponding to a target item name to be extracted. The image processing program causes one or more processors to execute: extracting, from the image data, a plurality of item names and a plurality of item values corresponding to each of the plurality of item names; identifying, among the plurality of item names, a plurality of related item names related to the target item name, and identifying, among the plurality of item values, a plurality of related item values corresponding to each of the plurality of related item names; and outputting, as the target item value, a calculated item value calculated based on the plurality of related item values.

According to the present disclosure, it is possible to provide an image processing system, an image processing method, and an image processing program capable of appropriately extracting the item value to be extracted from image data.

FIG. 1 is a functional block diagram showing the configuration of an image processing system according to an embodiment of the present disclosure.
FIG. 2 is a diagram showing an example of an estimate included in the forms according to the embodiment of the present disclosure.
FIG. 3 is a diagram showing an example of character recognition processing on image data according to the embodiment of the present disclosure.
FIG. 4 is a diagram showing an example of character recognition processing on image data according to the embodiment of the present disclosure.
FIG. 5 is a diagram showing an example of an extraction result page displayed on the operation terminal according to the embodiment of the present disclosure.
FIG. 6 is a diagram showing an example of an extraction result page displayed on the operation terminal according to the embodiment of the present disclosure.
FIG. 7 is a diagram showing an example of an extraction result page displayed on the operation terminal according to the embodiment of the present disclosure.
FIG. 8 is a flowchart for explaining an example of the procedure of the amount extraction process executed in the image processing system according to the embodiment of the present disclosure.
FIG. 9 is a flowchart for explaining an example of the procedure of the amount determination process executed in the image processing system according to the embodiment of the present disclosure.
FIG. 10 is a diagram showing an example of a method for extracting amount candidates executed in the image processing system according to the embodiment of the present disclosure.

Embodiments of the present disclosure will be described below with reference to the attached drawings. The following embodiments are examples that embody the present disclosure and are not intended to limit its technical scope.

[Image processing system 10]
FIG. 1 is a block diagram showing the configuration of an image processing system 10 according to an embodiment of the present disclosure. The image processing system 10 includes an image processing device 1 and an operation terminal 2. The image processing device 1 and the operation terminal 2 are connected to each other via a network N1 (e.g., the Internet or a LAN). The image processing system 10 may include a plurality of operation terminals 2.

In the image processing system 10, the image processing device 1 acquires image data of documents such as forms sent from the operation terminal 2, and extracts from the image data the target item value corresponding to the target item name to be extracted. For example, the operation terminal 2 scans paper forms such as estimates, order forms, invoices, and delivery notes, and sends the generated image data (e.g., PDF data) to the image processing device 1. The operation terminal 2 may also create a document file of the form based on a user's operation, for example with a document creation application, and send the document file to the image processing device 1 as image data (e.g., searchable PDF data (image + text data)). When the image processing device 1 receives the image data sent from the operation terminal 2, it executes the various processes described later on the image data and extracts the target item value corresponding to the target item name contained in the form. For example, the image processing device 1 extracts the estimated amount of an estimate (estimated amount including tax, estimated amount excluding tax, etc.), the order amount of an order form (order amount including tax, order amount excluding tax, etc.), the billing amount of an invoice (billing amount including tax, billing amount excluding tax, etc.), and the total amount of a delivery note (total amount including tax, total amount excluding tax, etc.). The image processing device 1 also registers the extracted target item value corresponding to the target item name in a predetermined database. For example, each time the image processing device 1 acquires image data of an estimate, it registers the details of the estimate (e.g., product name, issue date, company name) and the estimated amount in a database that manages estimates. This allows paper forms to be saved as electronic data.

The image processing system 10 is an example of the image processing system of the present disclosure. The image processing system of the present disclosure may also consist of the image processing device 1 alone.

[Image processing device 1]
As shown in FIG. 1, the image processing device 1 includes a control unit 11, a storage unit 12, an operation display unit 13, a communication unit 14, and the like. The image processing device 1 may be one or more cloud servers, or one or more physical servers.

The communication unit 14 is a communication interface that connects the image processing device 1 to the network N1 by wire or wirelessly and performs data communication with the operation terminal 2 via the network N1 in accordance with a predetermined communication protocol. The network N1 is, for example, the Internet or a LAN.

The operation display unit 13 is a user interface that includes a display unit, such as a liquid crystal display or an organic EL display, that displays various information, and an operation unit, such as a mouse, keyboard, or touch panel, that accepts operations.

The storage unit 12 is a non-volatile storage unit, such as a hard disk drive (HDD), a solid state drive (SSD), or a flash memory, that stores various information. The storage unit 12 stores control programs such as an amount extraction program (an example of the image processing program of the present disclosure) for causing the control unit 11 to execute the amount extraction process described later (see FIGS. 8 and 9). For example, the amount extraction program is recorded non-transitorily on a computer-readable recording medium such as a CD or DVD, read by a reading device (not shown) such as a CD drive or DVD drive provided in the image processing device 1, and stored in the storage unit 12. The amount extraction program may also be distributed from a cloud server and stored in the storage unit 12.

The storage unit 12 also stores image data (e.g., scan data) of documents such as forms acquired from the operation terminal 2.

FIG. 2 shows an estimate as an example of a form. As shown in FIG. 2, an estimate includes multiple items such as the document type ("estimate"), issue date, contact information for the estimator (address, telephone number, fax number, person in charge), estimated amount, product name, quantity, standard price, discount amount, subtotal, consumption tax, and total amount. The user scans the estimate at the operation terminal 2 and uploads the image data P1 to the image processing device 1. When the control unit 11 acquires the image data P1 of the estimate, it stores the data in the storage unit 12. In another embodiment, the control unit 11 may acquire a document file of an estimate created at the operation terminal 2 and store the document file in the storage unit 12.

The control unit 11 has control devices such as a CPU, a ROM, and a RAM. The CPU is a processor that executes various types of arithmetic processing. The ROM stores in advance control programs, such as a BIOS and an OS, for causing the CPU to execute various types of processing. The RAM stores various types of information and is used as temporary storage memory (a work area) for the processing executed by the CPU. The control unit 11 controls the image processing device 1 by having the CPU execute the control programs stored in advance in the ROM or the storage unit 12.

Specifically, as shown in FIG. 1, the control unit 11 includes various processing units such as an acquisition processing unit 111, an extraction processing unit 112, an identification processing unit 113, a determination processing unit 114, and an output processing unit 115. The control unit 11 functions as these processing units by executing processing in accordance with the amount extraction program. Some or all of the processing units included in the control unit 11 may be implemented as electronic circuits. The amount extraction program may also be a program that causes multiple processors to function as the processing units.

The acquisition processing unit 111 acquires the image data P1 of the form to be processed. Specifically, when a user scans an estimate (see FIG. 2) at the operation terminal 2 and performs an upload operation, the acquisition processing unit 111 acquires the image data P1 of the estimate and stores it in the storage unit 12.

The extraction processing unit 112 extracts predetermined item names and item values from the image data P1. Specifically, the extraction processing unit 112 performs preprocessing such as resolution conversion, skew correction, and background removal on the image data P1, and then performs document part recognition (object recognition). For example, in the preprocessed image data P1, the extraction processing unit 112 recognizes each part, such as a character string, table, illustration, or seal, as a rectangular object region. After the object recognition, the extraction processing unit 112 performs character recognition (OCR) on the whole of the image data P1 (the entire image).

The extraction processing unit 112 also uses the OCR results to merge the recognized characters into character strings (words). It then separates item names from item values within the character strings and associates item names with item values.

For example, in the image data P1 of the estimate shown in FIG. 3, the extraction processing unit 112 determines the document type. Specifically, in the region above the vertical center of the document, the extraction processing unit 112 extracts a character string that consists of relatively large characters (for example, a font size of 16 points or more) and that contains "見積書" (estimate), "注文書" (order form), "請求書" (invoice), or "納品書" (delivery note). Here, the extraction processing unit 112 extracts the character string T1, "お見積書" (estimate). In this way, the extraction processing unit 112 extracts document information from the image data P1.
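The document-type rule above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `classify_document`, the `(text, y_center, font_pt)` tuple layout, and the coordinate convention (y measured from the top of the page) are assumptions made for the example.

```python
# Document-type keywords named in the text:
# estimate, order form, invoice, delivery note.
DOC_TYPES = ("見積書", "注文書", "請求書", "納品書")


def classify_document(strings, page_height, min_font_pt=16.0):
    """Return (doc_type, matched_text) or None.

    strings: iterable of (text, y_center, font_pt) tuples (hypothetical layout),
    with y_center measured downward from the top of the page.
    """
    for text, y_center, font_pt in strings:
        in_upper_half = y_center < page_height / 2
        if in_upper_half and font_pt >= min_font_pt:
            for doc_type in DOC_TYPES:
                if doc_type in text:
                    return doc_type, text
    return None


ocr_strings = [
    ("株式会社サンプル", 60, 10.5),   # too small
    ("お見積書", 90, 24),             # large, upper half, contains a keyword
    ("見積書番号", 700, 24),          # lower half
]
print(classify_document(ocr_strings, page_height=1000))
```

A substring test is used so that honorific forms such as "お見積書" still match the keyword "見積書".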

The extraction processing unit 112 also extracts a date from the image data P1. Specifically, in the region above the vertical center of the document, the extraction processing unit 112 extracts a character string that sits immediately to the right of the character string "発行日" (issue date) and in which the following appear in this order: an optional Japanese era name, a year (an era year starting from "元" (first year), or a Western year 19xx to 20xx) followed by "年", a month (1 to 12) followed by "月", and a day (1 to 31) followed by "日". The numbers may instead be separated by symbols such as "/", "-", or "." in place of "年", "月", and "日". Here, the extraction processing unit 112 extracts the character string D1, "令和４年１１月１日" (November 1, 2022). In this way, the extraction processing unit 112 extracts a date (transaction date, issue date) from the image data P1.
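The date pattern described above maps naturally onto a regular expression. The sketch below covers only the textual pattern, not the positional conditions (upper half of the page, right of "発行日"). The era list, the function name `find_date`, and the restriction of symbol-separated dates to four-digit years are assumptions for illustration.

```python
import re
import unicodedata

ERA = "令和|平成|昭和"  # assumed era names; the text only says "era name or nothing"
MONTH = r"(?:1[0-2]|0?[1-9])"
DAY = r"(?:3[01]|[12]\d|0?[1-9])"

# e.g. 令和4年11月1日, 令和元年5月1日, 2022年11月1日
KANJI_DATE = re.compile(rf"(?:{ERA})?(?:元|\d{{1,4}})年{MONTH}月{DAY}日")
# e.g. 2022/11/01, 2022-11-01, 2022.11.01
SYMBOL_DATE = re.compile(rf"\d{{4}}[/\-.]{MONTH}[/\-.]{DAY}")


def find_date(text):
    """Return the first date-like substring, or None."""
    # NFKC folds full-width digits (１１) into ASCII digits (11).
    normalized = unicodedata.normalize("NFKC", text)
    for pattern in (KANJI_DATE, SYMBOL_DATE):
        match = pattern.search(normalized)
        if match:
            return match.group(0)
    return None


print(find_date("発行日 令和４年１１月１日"))
```

Normalizing before matching keeps the patterns short, since OCR output for Japanese forms often mixes full-width and half-width digits.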

The extraction processing unit 112 also extracts amount information from the image data P1. Specifically, in the central area Ar1 (see FIG. 4), which excludes the top and bottom of the document, the extraction processing unit 112 extracts numeric character strings as amount candidates according to a predetermined extraction rule. For example, the extraction processing unit 112 excludes character strings that contain non-numeric symbols (such as "-", "/", and "〒") and extracts character strings consisting only of digits as amount candidates. The extraction processing unit 112 may also extract, as amount candidates, character strings consisting of digits together with characters that indicate an amount, such as "¥", ",", "円" (yen), "JPY", and "金". In the example shown in FIG. 4, the extraction processing unit 112 excludes the postal code character string E1, the telephone number character string E2, and the fax number character string E3, and extracts as amount candidates the character string A1 "¥5,489,000", the character string A2 "¥5,000,000", the character string A3 "¥10,000", the character string A4 "¥4,990,000", the character string A5 "¥4,990,000", the character string A6 "¥499,000", and the character string A7 "¥5,489,000". In this way, the extraction processing unit 112 extracts amount candidates from the image data P1.
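The extraction rule for amount candidates can be approximated with a whitelist pattern: a string qualifies only if it consists of digits plus the amount markers listed in the text. This is a sketch under assumptions; the names `parse_amount` and `extract_candidates` and the exact regular expression are illustrative, and the restriction to the central area Ar1 is not modelled.

```python
import re
import unicodedata

# Optional prefix (金, ¥, JPY), digits with optional thousands separators,
# optional suffix (円, JPY). Anything else (e.g. "-", "/", "〒") fails to match.
AMOUNT = re.compile(
    r"^(?:金|\u00a5|JPY)?\s*(\d{1,3}(?:,\d{3})+|\d+)\s*(?:円|JPY)?$"
)


def parse_amount(token):
    """Return the amount as an int, or None if the token is not an amount."""
    # NFKC folds full-width forms: ￥ -> ¥, ， -> ",", ５ -> 5.
    normalized = unicodedata.normalize("NFKC", token).strip()
    match = AMOUNT.match(normalized)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def extract_candidates(tokens):
    """Keep only the tokens that parse as amounts."""
    parsed = (parse_amount(t) for t in tokens)
    return [amount for amount in parsed if amount is not None]


tokens = ["〒123-4567", "06-1234-5678", "￥５，０００，０００", "￥１０，０００"]
print(extract_candidates(tokens))
```

Postal codes and telephone numbers are rejected because the hyphen and "〒" are not in the whitelist, which mirrors the exclusion of E1 to E3 in the example.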

In this way, the extraction processing unit 112 extracts, from the image data P1, multiple item names and multiple item values (e.g., amount candidates) corresponding to each of the item names.

The identification processing unit 113 identifies, among the multiple items (character strings) extracted by the extraction processing unit 112, the item names (keywords) needed to calculate the target item value. For example, when the estimated amount of an estimate (estimated amount including tax, estimated amount excluding tax) is set as the amount to be extracted (target item value), the identification processing unit 113 selects "standard price", "discount amount", and "consumption tax" as the keywords, that is, the item names needed to calculate the estimated amount. The identification processing unit 113 then searches the items extracted from the image data P1 by the extraction processing unit 112 for character strings that match the keywords. In the example shown in FIG. 4, the identification processing unit 113 identifies the character string B1 "standard price", the character string B2 "discount amount", and the character string B3 "consumption tax" in the image data P1.

Having identified a keyword (related item name), the identification processing unit 113 identifies the item value (amount candidate) corresponding to that keyword based on its position relative to the keyword. Specifically, the identification processing unit 113 identifies an amount candidate located to the right of or below the keyword as the amount candidate corresponding to the keyword. In the example shown in FIG. 4, the character string A2 "¥5,000,000" is located below the character string B1 "standard price", so the identification processing unit 113 identifies "¥5,000,000" as the amount candidate for "standard price". Likewise, the character string A3 "¥10,000" is located below the character string B2 "discount amount", so "¥10,000" is identified as the amount candidate for "discount amount". The character string A6 "¥499,000" is located to the right of the character string B3 "consumption tax", so "¥499,000" is identified as the amount candidate for "consumption tax".
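The "right of or below the keyword" rule can be sketched with bounding boxes. The `Box` type, the coordinates, the overlap test, and the choice of the nearest qualifying candidate are all assumptions introduced for this example; the text itself only states that the candidate lies to the right of or below the keyword.

```python
from dataclasses import dataclass


@dataclass
class Box:
    text: str
    x: float  # left edge
    y: float  # top edge (y grows downward)
    w: float
    h: float


def _overlaps(a0, a1, b0, b1):
    """True if the intervals [a0, a1] and [b0, b1] share a positive length."""
    return min(a1, b1) - max(a0, b0) > 0


def value_for(keyword, candidates):
    """Return the nearest candidate to the right of or below the keyword."""
    best, best_dist = None, float("inf")
    for c in candidates:
        right = c.x >= keyword.x + keyword.w and _overlaps(
            keyword.y, keyword.y + keyword.h, c.y, c.y + c.h)
        below = c.y >= keyword.y + keyword.h and _overlaps(
            keyword.x, keyword.x + keyword.w, c.x, c.x + c.w)
        if not (right or below):
            continue
        dist = (c.x - keyword.x) if right else (c.y - keyword.y)
        if dist < best_dist:
            best, best_dist = c, dist
    return best


# Hypothetical layout loosely following FIG. 4.
b1 = Box("標準価格", 300, 400, 80, 20)
b3 = Box("消費税", 400, 520, 60, 20)
a2 = Box("¥5,000,000", 300, 430, 80, 20)
a3 = Box("¥10,000", 400, 430, 80, 20)
a6 = Box("¥499,000", 500, 520, 80, 20)

print(value_for(b1, [a2, a3, a6]).text)
print(value_for(b3, [a2, a3, a6]).text)
```

Requiring overlap on the perpendicular axis keeps a value in a neighbouring column or row from being picked up by mistake.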

The identification processing unit 113 also identifies candidates for the target item value among the multiple items (character strings) extracted by the extraction processing unit 112. Here, the identification processing unit 113 identifies the character string A7 "¥5,489,000", the estimated amount (total amount), as a candidate for the estimated amount.

In this way, the identification processing unit 113 identifies, among the multiple item names extracted by the extraction processing unit 112, multiple related item names related to the target item name (the estimated amount in the above example), and identifies, among the multiple item values (e.g., amount candidates) extracted by the extraction processing unit 112, multiple related item values corresponding to each of the related item names (standard price, discount amount, and consumption tax in the above example).

The determination processing unit 114 determines the amount to be extracted. For example, the determination processing unit 114 determines the estimated amount (estimated amount including tax and estimated amount excluding tax) based on the identified amount candidates. Specifically, the determination processing unit 114 calculates the estimated amount including tax and the estimated amount excluding tax (calculated item values) based on the amount candidates (character strings A2, A3, and A6) corresponding to the standard price, the discount amount, and the consumption tax, respectively. For example, the determination processing unit 114 subtracts the second largest amount from the largest amount among the amount candidates identified by the identification processing unit 113 and, if the difference does not match the consumption tax amount, subtracts the next (third) largest amount from the largest amount to calculate a new difference. If the calculated difference matches the consumption tax amount, the determination processing unit 114 determines the largest amount to be the estimated amount including tax and the next (third) largest amount to be the estimated amount excluding tax. In the example shown in FIG. 2, the difference between the largest amount "¥5,489,000" and the second largest amount "¥5,000,000" does not match the consumption tax amount "¥499,000", whereas the difference between the largest amount "¥5,489,000" and the next (third) largest amount "¥4,990,000" does. The determination processing unit 114 therefore determines the largest amount "¥5,489,000" to be the estimated amount including tax, excludes the second largest amount "¥5,000,000", and determines the next largest amount "¥4,990,000" to be the estimated amount excluding tax. The determination method used by the determination processing unit 114 is described in detail later.
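The difference check described above can be sketched as follows, using the amounts from the example. The function names are hypothetical. Two points go beyond the text and are assumptions: duplicate candidates are collapsed before ranking, and `calc_from_related` uses the relation "standard price minus discount, plus tax", which is not stated explicitly here but is consistent with the example figures.

```python
def decide_amounts(candidates, tax):
    """Return (tax_inclusive, tax_exclusive), or None if no pair matches.

    Walks down the candidates in descending order and looks for one whose
    difference from the largest candidate equals the consumption tax.
    """
    amounts = sorted(set(candidates), reverse=True)  # assumption: drop duplicates
    if not amounts:
        return None
    largest = amounts[0]
    for other in amounts[1:]:
        if largest - other == tax:
            return largest, other
    return None


def calc_from_related(standard_price, discount, tax):
    """Assumed relation between the related item values and the totals."""
    tax_exclusive = standard_price - discount
    return tax_exclusive + tax, tax_exclusive


# Amount candidates A1 to A7 from the example.
candidates = [5489000, 5000000, 10000, 4990000, 4990000, 499000, 5489000]
print(decide_amounts(candidates, tax=499000))
print(calc_from_related(5000000, 10000, 499000))
```

In the example both routes agree on ¥5,489,000 including tax and ¥4,990,000 excluding tax, so one can serve as a cross-check on the other.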

The output processing unit 115 outputs, as the target item value, the calculated item value calculated based on the multiple related item values identified by the identification processing unit 113. For example, as shown in FIG. 5, the output processing unit 115 displays an extraction result page P2 on the operation terminal 2. On the extraction result page P2, the output processing unit 115 displays the calculated item values determined by the determination processing unit 114, here the estimated amount including tax ("¥5,489,000") and the estimated amount excluding tax ("¥4,990,000").

このように、出力処理部１１５は、特定処理部１１３により特定される複数の関連項目値に基づいて算出される算出項目値を、対象項目値（上記の例では見積金額）として出力する。 In this way, the output processing unit 115 outputs the calculated item value calculated based on the multiple related item values identified by the identification processing unit 113 as the target item value (the estimated amount in the above example).

また、出力処理部１１５は、抽出結果ページＰ２において、画像データＰ１を表示させるとともに、画像データＰ１において抽出した見積金額（税込見積金額、税抜見積金額）に対応する文字列を識別可能に表示させる。例えば図５に示すように、出力処理部１１５は、「￥４，９９０，０００」（文字列Ａ５）と「￥５，４８９，０００」（文字列Ａ７）とに矩形枠を重ねて表示させる。 The output processing unit 115 also displays the image data P1 on the extraction result page P2, and identifiably displays character strings corresponding to the estimated amount (estimated amount including tax, estimated amount excluding tax) extracted from the image data P1. For example, as shown in FIG. 5, the output processing unit 115 displays rectangular frames superimposed on "￥4,990,000" (character string A5) and "￥5,489,000" (character string A7).

また、出力処理部１１５は、抽出結果ページＰ２において、画像データＰ１を表示させるとともに、前記見積金額を算出するために用いた標準価格、値引き額、消費税（関連項目値）に対応する文字列を矩形枠などにより識別可能に表示させてもよい。 In addition, the output processing unit 115 may display image data P1 on the extraction result page P2, and may also display character strings corresponding to the standard price, discount amount, and consumption tax (related item values) used to calculate the estimated amount in an identifiable manner using a rectangular frame or the like.

また、出力処理部１１５は、抽出結果ページＰ２に「ＯＫ」ボタンＫ１を表示させる。ユーザーは、抽出結果ページＰ２において、画像データＰ１から算出された見積金額（税込見積金額、税抜見積金額）が正しいか否かを確認し、正しいと判断すると「ＯＫ」ボタンＫ１を押下する。ユーザーが「ＯＫ」ボタンＫ１を押下すると、出力処理部１１５は、判定（算出）した見積金額（税込見積金額、税抜見積金額）と、見積内容（例えば、商品名、見積書発行日、会社名など）とをデータベースに登録する。 The output processing unit 115 also displays an "OK" button K1 on the extraction result page P2. On the extraction result page P2, the user checks whether the estimated amount (estimated amount including tax, estimated amount excluding tax) calculated from the image data P1 is correct, and presses the "OK" button K1 if they determine that it is correct. When the user presses the "OK" button K1, the output processing unit 115 registers the determined (calculated) estimated amount (estimated amount including tax, estimated amount excluding tax) and the estimate details (e.g., product name, date of issue of the estimate, company name, etc.) in the database.

また、ユーザーは、算出された見積金額（税込見積金額、税抜見積金額）が誤っていると判断すると、正しい見積金額に修正して「ＯＫ」ボタンＫ１を押下する。 If the user determines that the calculated estimated amount (tax-inclusive estimated amount, tax-exclusive estimated amount) is incorrect, the user corrects the estimated amount and presses the "OK" button K1.

また、ユーザーが抽出結果ページＰ２の「文字列候補をすべて表示」のチェックボックスにチェックを入れると、図６に示すように、出力処理部１１５は、特定処理部１１３が特定した全ての文字列を識別可能に表示させる。これにより、ユーザーは、特定された各文字列を確認することができる。また、ユーザーは、図６に示す抽出結果ページＰ２において、文字列の特定箇所（文字認識結果）を修正することができる。例えば、ユーザーは、矩形領域の大きさ及び位置を変更して、文字列として認識されなかった箇所、文字認識エラーが生じた箇所などを修正することができる。図７には、矩形領域が修正された状態（文字列Ｃ１～Ｃ７）を示している。 Furthermore, when the user checks the "Show all character string candidates" checkbox on the extraction result page P2, the output processing unit 115 causes all character strings identified by the identification processing unit 113 to be displayed in an identifiable manner, as shown in FIG. 6. This allows the user to check each of the identified character strings. The user can also correct specific parts of the character strings (character recognition results) on the extraction result page P2 shown in FIG. 6. For example, the user can change the size and position of the rectangular area to correct parts that were not recognized as character strings or parts where character recognition errors occurred. FIG. 7 shows the rectangular area after it has been corrected (character strings C1 to C7).

ここで、制御部１１は、矩形領域（文字列矩形領域）が修正された場合に、修正内容を学習する機能を備えてもよい。例えば、制御部１１は、抽出結果ページＰ２においてユーザーが矩形領域の大きさ及び位置を修正したり、新たな矩形領域を追加したりした場合に、「学習」ボタンＫ２を選択可能に表示させる。なお、ユーザーが矩形領域を修正する前の段階では、「学習」ボタンＫ２の選択はできない状態となっている（図５及び図６参照）。ユーザーが抽出結果ページＰ２の「学習」ボタンＫ２を押下すると、制御部１１は修正内容を学習して学習モデルを修正（再学習）する。これにより、画像データＰ１に対する文字列認識（オブジェクト認識）の精度を向上させることができる。 Here, the control unit 11 may have a function of learning the correction content when the rectangular area (character string rectangular area) is corrected. For example, the control unit 11 displays the "Learn" button K2 as selectable when the user corrects the size and position of a rectangular area or adds a new rectangular area on the extraction result page P2. Note that the "Learn" button K2 cannot be selected before the user corrects the rectangular area (see Figures 5 and 6). When the user presses the "Learn" button K2 on the extraction result page P2, the control unit 11 learns the correction content and corrects (relearns) the learning model. This makes it possible to improve the accuracy of character string recognition (object recognition) for the image data P1.

以上のように、画像処理装置１は、画像データＰ１から金額らしい文字列を抽出し、抽出した文字列の中から、対象項目名（上記の例では「見積金額」）に対応する関連項目値（上記の例では標準価格、値引き額、消費税）を特定し、関連項目値に基づいて対象項目値（上記の例では税込見積金額、税抜見積金額）を出力する。 As described above, the image processing device 1 extracts character strings that resemble amounts from the image data P1, and from the extracted character strings, identifies related item values (in the above example, standard price, discount amount, and consumption tax) that correspond to the target item name (in the above example, "estimated amount"), and outputs the target item values (in the above example, estimated amount including tax and estimated amount excluding tax) based on the related item values.

［操作端末２］ [Operation Terminal 2]
図１に示すように、操作端末２は、制御部２１、記憶部２２、操作表示部２３、及び通信部２４などを備える。操作端末２は、例えば、画像形成機能（例えばスキャン機能）を備えた画像形成装置、複合機などである。なお、操作端末２は、例えばパーソナルコンピュータ、スマートフォン、タブレット端末などであってもよい。 As shown in FIG. 1, the operation terminal 2 includes a control unit 21, a storage unit 22, an operation display unit 23, and a communication unit 24. The operation terminal 2 is, for example, an image forming device or a multifunction peripheral equipped with an image forming function (e.g., a scanning function). Note that the operation terminal 2 may also be, for example, a personal computer, a smartphone, a tablet terminal, or the like.

通信部２４は、操作端末２を有線又は無線でネットワークＮ１に接続し、ネットワークＮ１を介して画像処理装置１などの外部機器との間で所定の通信プロトコルに従ったデータ通信を実行するための通信インターフェースである。 The communication unit 24 is a communication interface that connects the operation terminal 2 to the network N1 by wire or wirelessly and performs data communication with an external device such as the image processing device 1 via the network N1 in accordance with a predetermined communication protocol.

操作表示部２３は、各種のウェブページなどの情報を表示する液晶ディスプレイ又は有機ＥＬディスプレイのような表示部と、操作を受け付けるマウス、キーボード、又はタッチパネルのような操作部とを備えるユーザーインターフェースである。 The operation display unit 23 is a user interface that includes a display unit such as a liquid crystal display or an organic EL display that displays information such as various web pages, and an operation unit such as a mouse, keyboard, or touch panel that accepts operations.

記憶部２２は、各種の情報を記憶するＨＤＤ、ＳＳＤ、又はフラッシュメモリーなどの不揮発性の記憶部である。記憶部２２には、制御部２１に各種処理を実行する制御プログラムが記憶されている。例えば、前記制御プログラムは、ＣＤ又はＤＶＤなどのコンピュータ読取可能な記録媒体に非一時的に記録され、操作端末２が備えるＣＤドライブ又はＤＶＤドライブなどの読取装置（不図示）で読み取られて記憶部２２に記憶される。なお、前記制御プログラムは、クラウドサーバー又は画像処理装置１から配信されて記憶部２２に記憶されてもよい。 The storage unit 22 is a non-volatile storage unit such as an HDD, SSD, or flash memory that stores various types of information. The storage unit 22 stores a control program that causes the control unit 21 to execute various processes. For example, the control program is non-temporarily recorded on a computer-readable recording medium such as a CD or DVD, and is read by a reading device (not shown) such as a CD drive or DVD drive provided in the operation terminal 2 and stored in the storage unit 22. The control program may be distributed from a cloud server or the image processing device 1 and stored in the storage unit 22.

制御部２１は、ＣＰＵ、ＲＯＭ、及びＲＡＭなどの制御機器を有する。前記ＣＰＵは、各種の演算処理を実行するプロセッサーである。前記ＲＯＭは、前記ＣＰＵに各種の処理を実行させるためのＢＩＯＳ及びＯＳなどの制御プログラムが予め記憶された不揮発性の記憶部である。前記ＲＡＭは、各種の情報を記憶する揮発性又は不揮発性の記憶部であり、前記ＣＰＵが実行する各種の処理の一時記憶メモリー（作業領域）として使用される。そして、制御部２１は、前記ＲＯＭ又は記憶部２２に予め記憶された各種の制御プログラムを前記ＣＰＵで実行することにより操作端末２を制御する。なお、制御部２１に含まれる一部又は全部の処理部は電子回路で構成されていてもよい。 The control unit 21 has control devices such as a CPU, a ROM, and a RAM. The CPU is a processor that executes various arithmetic processes. The ROM is a non-volatile storage unit in which control programs such as a BIOS and an OS for causing the CPU to execute various processes are stored in advance. The RAM is a volatile or non-volatile storage unit that stores various information, and is used as a temporary storage memory (work area) for various processes executed by the CPU. The control unit 21 controls the operation terminal 2 by executing the various control programs stored in advance in the ROM or the storage unit 22 with the CPU. Note that some or all of the processing units included in the control unit 21 may be composed of electronic circuits.

具体的には、制御部２１は、記憶部２２に記憶されている前記制御プログラムに従って各種の処理を実行する。例えば、制御部２１は、見積書、注文書、請求書、納品書などの紙媒体の帳票をスキャンして生成した画像データ（ＰＤＦデータなど）を画像処理装置１に送信する。また、制御部２１は、文書作成アプリケーションなどによりユーザーの操作に基づいて前記帳票の文書ファイルを作成し、当該文書ファイルを画像データ（例えばサーチャブルＰＤＦデータ（画像＋テキストデータ）など）として画像処理装置１に送信する。 Specifically, the control unit 21 executes various processes according to the control program stored in the storage unit 22. For example, the control unit 21 scans paper forms such as quotations, purchase orders, invoices, and delivery notes, and transmits the generated image data (e.g., PDF data) to the image processing device 1. The control unit 21 also creates a document file of the form based on a user's operation using a document creation application or the like, and transmits the document file to the image processing device 1 as image data (e.g., searchable PDF data (image + text data)).

他の実施形態として、制御部２１は、画像処理装置１からネットワークＮ１を介して提供されるウェブページを操作表示部２３に表示させ、操作表示部２３に対する操作を画像処理装置１に入力するブラウザ処理を実行してもよい。この場合、制御部２１は、前記ウェブページにおいて、前記画像データのアップロード処理を実行する。 In another embodiment, the control unit 21 may execute a browser process in which a web page provided from the image processing device 1 via the network N1 is displayed on the operation display unit 23 and an operation on the operation display unit 23 is input to the image processing device 1. In this case, the control unit 21 executes an upload process of the image data on the web page.

また、制御部２１は、画像処理装置１による金額抽出処理の結果を表示させる。具体的には、制御部２１は、図５～図７に示す抽出結果ページＰ２を操作表示部２３に表示させる。また、制御部２１は、抽出結果ページＰ２において、ユーザーの各種操作を受け付ける。例えば、制御部２１は、「ＯＫ」ボタンＫ１を押下する操作、抽出結果（上記の例では見積金額）を修正する操作、文字列認識結果（矩形領域）を修正する操作、「学習」ボタンＫ２を押下する操作などを受け付ける。制御部２１は、ユーザーの前記操作に応じた情報を画像処理装置１に送信する。 The control unit 21 also displays the results of the amount extraction process performed by the image processing device 1. Specifically, the control unit 21 displays the extraction result page P2 shown in Figures 5 to 7 on the operation display unit 23. The control unit 21 also accepts various operations by the user on the extraction result page P2. For example, the control unit 21 accepts an operation to press the "OK" button K1, an operation to correct the extraction result (the estimated amount in the above example), an operation to correct the character string recognition result (rectangular area), an operation to press the "Learn" button K2, and the like. The control unit 21 transmits information corresponding to the user's operations to the image processing device 1.

以上のように、画像処理装置１は、操作端末２から取得する画像データＰ１に基づいて前記金額抽出処理を実行し、前記金額抽出処理の結果を操作端末２に表示させる。画像処理装置１は、複数の操作端末２のそれぞれから画像データＰ１を取得することが可能であり、画像データＰ１を取得するごとに前記金額抽出処理を実行する。 As described above, the image processing device 1 executes the amount extraction process based on the image data P1 acquired from the operation terminal 2, and displays the result of the amount extraction process on the operation terminal 2. The image processing device 1 is capable of acquiring image data P1 from each of the multiple operation terminals 2, and executes the amount extraction process each time image data P1 is acquired.

［金額抽出処理］ [Amount extraction process]
以下、図８及び図９を参照しつつ、画像処理システム１０において実行される金額抽出処理の手順の一例について説明する。 An example of the procedure of the amount extraction process executed in the image processing system 10 will be described below with reference to FIGS. 8 and 9.

なお、本開示は、前記金額抽出処理に含まれる一又は複数のステップを実行する金額抽出方法（本開示の画像処理方法）として捉えることができる。また、ここで説明する前記金額抽出処理に含まれる一又は複数のステップが適宜省略されてもよい。また、前記金額抽出処理における各ステップは、同様の作用効果を生じる範囲で実行順序が異なってもよい。さらに、ここでは画像処理装置１の制御部１１が前記金額抽出処理における各ステップを実行する場合を例に挙げて説明するが、他の実施形態では、一又は複数のプロセッサーが前記金額抽出処理における各ステップを分散して実行してもよい。また、制御部１１は、複数の操作端末２のそれぞれから画像データＰ１を取得すると、画像データＰ１ごとに前記金額抽出処理を並行して実行することが可能である。 The present disclosure can be understood as an amount extraction method (image processing method of the present disclosure) that executes one or more steps included in the amount extraction process. One or more steps included in the amount extraction process described here may be omitted as appropriate. The steps in the amount extraction process may be executed in a different order as long as the same effect is achieved. Furthermore, although an example is described here in which the control unit 11 of the image processing device 1 executes each step in the amount extraction process, in other embodiments, one or more processors may execute each step in the amount extraction process in a distributed manner. Furthermore, when the control unit 11 acquires image data P1 from each of the multiple operation terminals 2, it is possible for the control unit 11 to execute the amount extraction process in parallel for each image data P1.

先ず、ステップＳ１において、制御部１１は、操作端末２から処理対象の帳票の画像データＰ１を取得したか否かを判定する。制御部１１は、操作端末２から画像データＰ１を取得すると（Ｓ１：Ｙｅｓ）、処理をステップＳ２に移行させる。制御部１１は、操作端末２から画像データＰ１を取得するまで待機する（Ｓ１：Ｎｏ）。ここでは、制御部１１は、操作端末２から見積書（図２参照）の画像データＰ１を取得するものとする。 First, in step S1, the control unit 11 determines whether or not image data P1 of the document to be processed has been acquired from the operation terminal 2. When the control unit 11 acquires the image data P1 from the operation terminal 2 (S1: Yes), the control unit 11 transitions the process to step S2. The control unit 11 waits until it acquires the image data P1 from the operation terminal 2 (S1: No). Here, it is assumed that the control unit 11 acquires image data P1 of a quotation (see FIG. 2) from the operation terminal 2.

ステップＳ２において、制御部１１は、オブジェクト認識処理（文字列領域認識処理）と文字認識処理とを実行する。具体的には、制御部１１は、画像データＰ１に対して、解像度変換処理、スキュー補正処理、下地除去処理などの前処理を実行し、続いて文書パーツの認識処理（オブジェクト認識処理）を実行する。例えば、制御部１１は、前記前処理を実行した画像データＰ１において、文字列、表、イラスト、捺印などの各パーツをオブジェクトの矩形領域として認識する。また、制御部１１は、前記オブジェクト認識処理を実行すると、画像データＰ１の全体において文字認識処理（ＯＣＲ処理）を実行する。また、制御部１１は、ＯＣＲ処理の結果を用いて、文字データを文字列として統合（単語化）する。 In step S2, the control unit 11 executes object recognition processing (character string area recognition processing) and character recognition processing. Specifically, the control unit 11 executes preprocessing such as resolution conversion processing, skew correction processing, and background removal processing on the image data P1, and then executes document part recognition processing (object recognition processing). For example, the control unit 11 recognizes each part such as character strings, tables, illustrations, and seals in the image data P1 that has been preprocessed as a rectangular area of an object. After executing the object recognition processing, the control unit 11 also executes character recognition processing (OCR processing) on the entire image data P1. The control unit 11 also uses the results of the OCR processing to integrate the character data into character strings (wordization).

次にステップＳ３において、制御部１１は、金額決定処理を実行する。図９には、前記金額決定処理の具体例を示している。 Next, in step S3, the control unit 11 executes an amount determination process. Figure 9 shows a specific example of the amount determination process.

前記金額決定処理では、先ずステップＳ３１において、制御部１１は、金額らしい数字の文字列（金額候補）を抽出する。ここで、金額候補の抽出方法の一例について図１０を用いて説明する。図１０には、画像データに含まれる文字列「合計￥１，２８０」を例示している。 In the amount determination process, first, in step S31, the control unit 11 extracts a string of numbers that resembles an amount (amount candidate). Here, an example of a method for extracting amount candidates will be described with reference to FIG. 10. FIG. 10 shows an example of the string "Total: ￥1,280" contained in the image data.

制御部１１は、前記オブジェクト認識処理により、文字列（「合計￥１，２８０」）の矩形領域（図１０の点線枠）の左上座標（ＯＢＪ＿Ｘ１［Ｎ］，ＯＢＪ＿Ｙ１［Ｎ］）と、右下座標（ＯＢＪ＿Ｘ２［Ｎ］，ＯＢＪ＿Ｙ２［Ｎ］）とを特定する。 The control unit 11 uses the object recognition process to identify the top left coordinates (OBJ_X1[N], OBJ_Y1[N]) and bottom right coordinates (OBJ_X2[N], OBJ_Y2[N]) of the rectangular area (dotted frame in FIG. 10) of the character string ("Total ￥1,280").

また、制御部１１は、前記ＯＣＲ処理の結果から、文字（「￥」）及び文字（「１，２８０」）の矩形領域（図１０の実線枠）の左上座標（ＯＣＲ＿Ｘ１［Ｍ］，ＯＣＲ＿Ｙ１［Ｍ］）と、右下座標（ＯＣＲ＿Ｘ２［Ｍ］，ＯＣＲ＿Ｙ２［Ｍ］）とを特定する。 The control unit 11 also identifies the upper left coordinates (OCR_X1[M], OCR_Y1[M]) and the lower right coordinates (OCR_X2[M], OCR_Y2[M]) of the rectangular area (solid line frame in FIG. 10) of the character ("￥") and the character ("1,280") from the results of the OCR processing.

制御部１１は、「ｎ：０～Ｎ－１，ｍ：０～Ｍ－１」とし、Ｎ×Ｍの組み合わせで以下の条件を判定する。
・Ｘ１＝ＯＢＪ＿Ｘ１［ｎ］＞ＯＣＲ＿Ｘ１［ｍ］？ＯＢＪ＿Ｘ１［ｎ］：ＯＣＲ＿Ｘ１［ｍ］（大きい方の座標値を取得）
・Ｙ１＝ＯＢＪ＿Ｙ１［ｎ］＞ＯＣＲ＿Ｙ１［ｍ］？ＯＢＪ＿Ｙ１［ｎ］：ＯＣＲ＿Ｙ１［ｍ］（大きい方の座標値を取得）
・Ｘ２＝ＯＢＪ＿Ｘ２［ｎ］＜ＯＣＲ＿Ｘ２［ｍ］？ＯＢＪ＿Ｘ２［ｎ］：ＯＣＲ＿Ｘ２［ｍ］（小さい方の座標値を取得）
・Ｙ２＝ＯＢＪ＿Ｙ２［ｎ］＜ＯＣＲ＿Ｙ２［ｍ］？ＯＢＪ＿Ｙ２［ｎ］：ＯＣＲ＿Ｙ２［ｍ］（小さい方の座標値を取得）
・Ｘ１＜Ｘ２かつＹ１＜Ｙ２であれば矩形の重なりありと判定
The control unit 11 evaluates the following conditions for each of the N×M combinations, where n ranges from 0 to N-1 and m ranges from 0 to M-1.
・X1 = OBJ_X1[n] > OCR_X1[m] ? OBJ_X1[n] : OCR_X1[m] (takes the larger coordinate value)
・Y1 = OBJ_Y1[n] > OCR_Y1[m] ? OBJ_Y1[n] : OCR_Y1[m] (takes the larger coordinate value)
・X2 = OBJ_X2[n] < OCR_X2[m] ? OBJ_X2[n] : OCR_X2[m] (takes the smaller coordinate value)
・Y2 = OBJ_Y2[n] < OCR_Y2[m] ? OBJ_Y2[n] : OCR_Y2[m] (takes the smaller coordinate value)
・If X1 < X2 and Y1 < Y2, the rectangles are determined to overlap.

また、制御部１１は、以下の式により矩形領域の重なり率Ａを算出する。
Ａ＝｛（Ｘ２－Ｘ１）×（Ｙ２－Ｙ１）｝／｛（ＯＣＲ＿Ｘ２［ｍ］－ＯＣＲ＿Ｘ１［ｍ］）×（ＯＣＲ＿Ｙ２［ｍ］－ＯＣＲ＿Ｙ１［ｍ］）｝ In addition, the control unit 11 calculates the overlap rate A of the rectangular areas using the following formula.
A = {(X2-X1) x (Y2-Y1)} / {(OCR_X2[m]-OCR_X1[m]) x (OCR_Y2[m]-OCR_Y1[m])}

制御部１１は、Ａ≧Ｔｈ１（例えばＴｈ１＝０．８）の場合に、前記オブジェクト認識処理による矩形領域とＯＣＲ結果による矩形領域とが同一領域であると判定する。 When A ≧ Th1 (for example, Th1 = 0.8), the control unit 11 determines that the rectangular area obtained by the object recognition process and the rectangular area obtained by the OCR result are the same area.
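The overlap test and the overlap rate A can be written compactly as below. This is a minimal sketch: the `Rect` type, the function names, and the sample coordinates are assumptions made for illustration. Only the formula and the example threshold Th1 = 0.8 come from the description.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom


def overlap_ratio(obj: Rect, ocr: Rect) -> float:
    """Area of the intersection divided by the area of the OCR rectangle."""
    x1 = max(obj.x1, ocr.x1)  # larger of the two left edges
    y1 = max(obj.y1, ocr.y1)  # larger of the two top edges
    x2 = min(obj.x2, ocr.x2)  # smaller of the two right edges
    y2 = min(obj.y2, ocr.y2)  # smaller of the two bottom edges
    if not (x1 < x2 and y1 < y2):
        return 0.0  # the rectangles do not overlap
    ocr_area = (ocr.x2 - ocr.x1) * (ocr.y2 - ocr.y1)
    return (x2 - x1) * (y2 - y1) / ocr_area


TH1 = 0.8  # example threshold given in the description


def is_same_region(obj: Rect, ocr: Rect, th1: float = TH1) -> bool:
    return overlap_ratio(obj, ocr) >= th1


# Illustrative coordinates only; they are not taken from FIG. 10.
object_box = Rect(10, 10, 110, 30)
inside_box = Rect(60, 12, 108, 28)    # fully inside the object rectangle
partial_box = Rect(100, 12, 140, 28)  # only a quarter lies inside
```

Because the denominator is the area of the OCR rectangle alone, an OCR box that lies entirely inside a larger object box yields A = 1, which suits the case where one recognized character string region contains several OCR boxes.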

制御部１１は、同一領域となった矩形領域のＯＣＲ結果に数字（０～９）の文字が含まれているか否かを判定し、その後、「先頭に「￥」がある」、「途中に「－（ハイフン）」がない」、「金額によくある書式の文字が存在する」、「郵便番号、電話番号、ＦＡＸ番号、書類番号、商品の型番などを表す文字が存在しない」、などの各条件で金額らしさを判定し、金額らしいと判定すると、金額候補文字列として抽出する。上記の例では、制御部１１は、「￥１，２８０」を金額候補文字列として抽出する。これにより、前記オブジェクト認識処理で文字列と認識された領域に絞り込むことで、ＯＣＲ処理において文字以外の場所の誤認識を除外することが可能となる。 The control unit 11 determines whether the OCR result for a rectangular area judged to be the same area contains numeric characters (0 to 9). It then evaluates how likely the string is to be an amount using conditions such as "a '￥' appears at the beginning," "no '-' (hyphen) appears in the middle," "characters in a format commonly used for amounts are present," and "no characters indicating a postal code, telephone number, fax number, document number, product model number, or the like are present." A string judged likely to be an amount is extracted as an amount candidate character string. In the above example, the control unit 11 extracts "￥1,280" as an amount candidate character string. Restricting the OCR results in this way to the areas recognized as character strings by the object recognition process makes it possible to exclude OCR misrecognitions in non-text areas.
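A few of the amount-likeness conditions can be approximated with a regular expression, as sketched below. The description does not say how the conditions are combined or weighted, so the pattern, the function name `looks_like_amount`, and the decision to accept either a leading yen sign or comma grouping are all assumptions.

```python
import re
import unicodedata

# Assumed format: a yen sign followed by digits, or comma-grouped digits
# with an optional leading yen sign (e.g. "¥5000", "¥1,280", "1,280").
_AMOUNT_FORMAT = re.compile(r"^(¥\d+|¥?\d{1,3}(,\d{3})+)$")


def looks_like_amount(text: str) -> bool:
    """Rough amount-likeness check for one recognized string."""
    # NFKC folds full-width digits, commas, hyphens and the yen sign
    # into their half-width forms.
    s = unicodedata.normalize("NFKC", text).replace(" ", "")
    if not any(ch.isdigit() for ch in s):
        return False
    if "-" in s[1:-1]:
        # A hyphen in the middle suggests a phone number or postal code.
        return False
    return bool(_AMOUNT_FORMAT.match(s))
```

For instance, `looks_like_amount("￥１，２８０")` accepts the full-width string from FIG. 10, while a hyphenated string such as a telephone number is rejected.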

例えば、制御部１１は、ＯＣＲ処理の結果から「標準価格」、「値引き額」、「消費税」、「口座番号（請求書の場合のみ）」、「各種コード」（客コード、店コードなど）に該当する文字列を探索し、その位置情報を抽出する。そして、制御部１１は、前記文字列の位置情報から、右方向又は下方向に存在する数字の文字列（金額、口座番号、各種コードなど）を探索し、一番近い位置の数字を金額候補文字列として抽出する。また、制御部１１は、口座番号と各種コードについては、金額候補文字列の中から除外する。図２に示す見積書において、キーワードとして「標準価格」、「値引き額」、「消費税」が設定されている場合、制御部１１は、「標準価格」の文字列の下に存在する「￥５，０００，０００」と、「値引き額」の文字列の下に存在する「￥１０，０００」と、「消費税」の文字列の右に存在する「￥４９９，０００」とを抽出する。 For example, the control unit 11 searches for character strings corresponding to "standard price", "discount amount", "consumption tax", "account number (only for invoices)", and "various codes" (customer code, store code, etc.) from the results of the OCR processing, and extracts their position information. The control unit 11 then searches for numeric character strings (amount, account number, various codes, etc.) that exist to the right or below the position information of the character string, and extracts the nearest number as a candidate character string for the amount. The control unit 11 also excludes the account number and various codes from the candidate character strings for the amount. In the quotation shown in FIG. 2, if "standard price", "discount amount", and "consumption tax" are set as keywords, the control unit 11 extracts "￥5,000,000" that exists under the character string for "standard price", "￥10,000" that exists under the character string for "discount amount", and "￥499,000" that exists to the right of the character string for "consumption tax".
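The search for the nearest number to the right of or below a keyword can be sketched as follows. The description does not define "right" or "below" geometrically. This sketch assumes that "right" means sharing a vertical extent with the keyword and starting at or after its right edge, and that "below" means sharing a horizontal extent and starting at or after its bottom edge. The `Word` type, the function name, and the coordinates are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Word:
    text: str
    x1: float
    y1: float
    x2: float
    y2: float


def nearest_right_or_below(label: Word, numbers: List[Word]) -> Optional[Word]:
    """Return the numeric string closest to the label, to its right or below it."""
    best, best_dist = None, float("inf")
    for w in numbers:
        same_row = w.y1 < label.y2 and label.y1 < w.y2
        same_col = w.x1 < label.x2 and label.x1 < w.x2
        if same_row and w.x1 >= label.x2:
            dist = w.x1 - label.x2  # horizontal gap to the right
        elif same_col and w.y1 >= label.y2:
            dist = w.y1 - label.y2  # vertical gap below
        else:
            continue  # neither to the right nor below
        if dist < best_dist:
            best, best_dist = w, dist
    return best


# Illustrative layout; the coordinates are not taken from FIG. 2.
tax_label = Word("消費税", 300, 400, 360, 420)
discount_label = Word("値引き額", 200, 100, 260, 120)
numbers = [
    Word("¥499,000", 400, 400, 480, 420),    # right of the tax label
    Word("¥5,489,000", 400, 440, 480, 460),  # a different row and column
    Word("¥10,000", 200, 130, 260, 150),     # below the discount label
]
```

Strings identified as account numbers or codes would be filtered out of the resulting candidates in a separate step, as the paragraph above describes.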

このように、制御部１１は、前記オブジェクト認識処理による文字列の位置情報からＯＣＲ結果を用いて数字を含む文字列を抽出し、数字の文字列が金額らしい数字の文字列（金額候補）を判定する。 In this way, the control unit 11 uses the OCR results to extract character strings containing numbers from the character string position information obtained by the object recognition process, and determines whether the character string is likely to represent an amount (amount candidate).

次にステップＳ３２において、制御部１１は、対象項目値を算出するために必要な項目名（キーワード）を設定する。例えば、見積書における見積金額（税込見積金額、税抜見積金額）が抽出対象の金額（対象項目値）である場合に、制御部１１は、「消費税」をキーワードに設定する。 Next, in step S32, the control unit 11 sets the item name (keyword) required to calculate the target item value. For example, if the estimated amount in the quotation (estimated amount including tax, estimated amount excluding tax) is the amount to be extracted (target item value), the control unit 11 sets "consumption tax" as the keyword.

次にステップＳ３３において、制御部１１は、前記キーワード（「消費税」）に対応する金額候補を抽出する。例えば、制御部１１は、画像データＰ１の複数の金額候補において、「税」を含む文字列を探索し、探索した当該文字列の右側又は下側に金額候補があるかどうかを判定する。制御部１１は、前記文字列の右側又は下側に金額候補がある場合、「消費税」に対応する金額候補として抽出する。ここでは、制御部１１は、「税」の文字列の右側に存在する「￥４９９，０００」を金額候補として抽出する。 Next, in step S33, the control unit 11 extracts an amount candidate corresponding to the keyword ("consumption tax"). For example, the control unit 11 searches for a character string containing "tax" among the multiple amount candidates in the image data P1, and determines whether there is an amount candidate to the right of or below the character string found. If there is an amount candidate to the right of or below the character string, the control unit 11 extracts it as the amount candidate corresponding to "consumption tax". Here, the control unit 11 extracts "￥499,000", which is to the right of the character string "tax", as the amount candidate.

次にステップＳ３４において、制御部１１は、対象項目値を算出するために必要な項目名（キーワード）として、「値引き額」を設定する。 Next, in step S34, the control unit 11 sets "discount amount" as the item name (keyword) required to calculate the target item value.

次にステップＳ３５において、制御部１１は、前記キーワード（「値引き額」）に対応する金額候補を抽出する。例えば、制御部１１は、画像データＰ１の複数の金額候補において、「値引」を含む文字列を探索し、探索した当該文字列の右側又は下側に金額候補があるかどうかを判定する。制御部１１は、前記文字列の右側又は下側に金額候補がある場合、「値引き額」に対応する金額候補として抽出する。ここでは、制御部１１は、「値引」の文字列の下側に存在する「￥１０，０００」を金額候補として抽出する。 Next, in step S35, the control unit 11 extracts a candidate amount corresponding to the keyword ("discount amount"). For example, the control unit 11 searches for a character string containing "discount" among the multiple candidate amount values in the image data P1, and determines whether there is a candidate amount to the right or below the searched character string. If there is a candidate amount to the right or below the character string, the control unit 11 extracts it as the candidate amount corresponding to "discount amount". Here, the control unit 11 extracts "￥10,000" present below the character string "discount" as the candidate amount.

次にステップＳ３６において、制御部１１は、対象項目値を算出するために必要な項目名（キーワード）として、「標準価格」を設定する。 Next, in step S36, the control unit 11 sets "standard price" as the item name (keyword) required to calculate the target item value.

次にステップＳ３７において、制御部１１は、前記キーワード（「標準価格」）に対応する金額候補を抽出する。例えば、制御部１１は、画像データＰ１の複数の金額候補において、「標準」、「参考」などを含む文字列を探索し、探索した当該文字列の右側又は下側に金額候補があるかどうかを判定する。制御部１１は、前記文字列の右側又は下側に金額候補がある場合、「標準価格」に対応する金額候補として抽出する。ここでは、制御部１１は、「標準」の文字列の下側に存在する「￥５，０００，０００」を金額候補として抽出する。 Next, in step S37, the control unit 11 extracts a price candidate corresponding to the keyword ("standard price"). For example, the control unit 11 searches for character strings including "standard" or "reference" among the multiple price candidates in the image data P1, and determines whether there is a price candidate to the right or below the searched character string. If there is a price candidate to the right or below the character string, the control unit 11 extracts it as the price candidate corresponding to "standard price". Here, the control unit 11 extracts "￥5,000,000" present below the character string "standard" as the price candidate.

次にステップＳ３８において、制御部１１は、前記キーワードに対応する複数の金額候補の中から、１番大きい金額と２番目に大きい金額とを特定する。例えば、制御部１１は、「標準価格」の金額候補が見つかった場合は、当該金額候補を除外し、「値引き額」の金額候補が見つかった場合は、１番大きい金額と２番目に大きい金額との差分が値引き額に該当する場合に１番大きい金額を値引き前の金額として除外する。 Next, in step S38, the control unit 11 identifies the largest and second largest amounts from among the multiple amount candidates corresponding to the keyword. For example, if the control unit 11 finds a price candidate for "standard price," it excludes that price candidate, and if the control unit 11 finds a price candidate for "discount amount," it excludes the largest amount as the amount before discount if the difference between the largest and second largest amounts corresponds to the discount amount.

図２に示す見積書では、先ず、制御部１１は、ステップＳ３１で抽出した金額候補のうち１番大きい金額（「￥５，４８９，０００」）、２番目に大きい金額（「￥５，０００，０００」）、３番目に大きい金額（「￥４，９９０，０００」）を抽出する。次に、制御部１１は、この３つの金額の中に「標準価格」に対応する金額候補（「￥５，０００，０００」）があるか否かを判定する。図２の例では２番目に大きい金額（「￥５，０００，０００」）が標準価格の候補（「￥５，０００，０００」）と一致するため、制御部１１は、２番目の金額（「￥５，０００，０００」）を標準価格の可能性ありと判定する。次に、制御部１１は、１番大きい金額（「￥５，４８９，０００」）から２番目に大きい金額（「￥５，０００，０００」）、１番目に大きい金額（「￥５，４８９，０００」）から３番目に大きい金額（「￥４，９９０，０００」）、２番目に大きい金額（「￥５，０００，０００」）から３番目に大きい金額（「￥４，９９０，０００」）をそれぞれ減算する。制御部１１は、それぞれ減算した差分の中で値引き額に一致したものがある場合には引かれる方の金額を値引き前の金額として判定する。それぞれ減算した差分の中で全てが値引き額に一致しない場合、制御部１１は、標準価格の可能性ありと判定した金額候補があれば、その金額のみを除外する。制御部１１は、標準価格の可能性ありと判定した金額候補がなければそのままとする。図２に示す例では、先ず、１番大きい金額から２番目に大きい金額を減算した差分（「￥４８９，０００」）が値引き額（「￥１０，０００」）に一致しない。次に、１番大きい金額から３番目に大きい金額を減算した差分（「￥４９９，０００」）も値引き額（「￥１０，０００」）に一致しない。最後に、２番目に大きい金額から３番目に大きい金額を減算した差分（「￥１０，０００」）が値引き額（「￥１０，０００」）に一致する。このため、制御部１１は、２番目に大きい金額が値引き前の金額であると判定する。制御部１１は、標準価格の可能性があり、かつ、値引き前の金額であると判定した２番目に大きい金額（「￥５，０００，０００」）を除外する。制御部１１は、除外した金額（「￥５，０００，０００」）を除いて１番大きい金額（「￥５，４８９，０００」）と２番目に大きい金額（「￥４，９９０，０００」）とを抽出する。そして、制御部１１は、１番大きい金額として「￥５，４８９，０００」を特定し、２番目に大きい金額として「￥４，９９０，０００」を特定する。 In the quotation shown in FIG. 2, first, the control unit 11 extracts the largest amount ("¥5,489,000"), the second largest amount ("¥5,000,000"), and the third largest amount ("¥4,990,000") from the price candidates extracted in step S31. Next, the control unit 11 determines whether or not there is a price candidate ("¥5,000,000") that corresponds to the "standard price" among these three amounts. In the example of FIG. 2, since the second largest amount ("¥5,000,000") matches the standard price candidate ("¥5,000,000"), the control unit 11 determines that the second amount ("¥5,000,000") is likely to be the standard price. Next, the control unit 11 subtracts the second largest amount ("¥5,000,000") from the largest amount ("¥5,489,000"), the third largest amount ("¥4,990,000") from the first largest amount ("¥5,489,000"), and the third largest amount ("¥4,990,000") from the second largest amount ("¥5,000,000"). 
If any of the differences obtained by these subtractions matches the discount amount, the control unit 11 determines that the minuend (the amount from which the other amount was subtracted) is the amount before the discount. If none of the differences matches the discount amount, the control unit 11 excludes only the amount candidate determined to be a possible standard price, if there is one. If there is no amount candidate determined to be a possible standard price, the control unit 11 leaves the candidates as they are. In the example shown in FIG. 2, first, the difference ("¥489,000") obtained by subtracting the second largest amount from the largest amount does not match the discount amount ("¥10,000"). Next, the difference ("¥499,000") obtained by subtracting the third largest amount from the largest amount does not match the discount amount ("¥10,000") either. Finally, the difference ("¥10,000") obtained by subtracting the third largest amount from the second largest amount matches the discount amount ("¥10,000"). Therefore, the control unit 11 determines that the second largest amount is the amount before the discount. The control unit 11 excludes the second largest amount ("¥5,000,000"), which is both a possible standard price and the amount determined to be the amount before the discount. From the remaining candidates, the control unit 11 extracts the largest amount ("¥5,489,000") and the second largest amount ("¥4,990,000"). The control unit 11 then identifies "¥5,489,000" as the largest amount and "¥4,990,000" as the second largest amount.
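The exclusion logic of step S38 can be sketched as follows. It is an illustration under assumptions: the function name is hypothetical, amounts are integer yen values, and every minuend whose difference matches the discount amount is excluded, which is one reading of the description.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def top_two_after_exclusion(
    candidates: Sequence[int],
    standard_price: Optional[int],
    discount: Optional[int],
) -> Optional[Tuple[int, int]]:
    """Return the largest and second largest amounts after exclusion."""
    ordered = sorted(set(candidates), reverse=True)
    top3 = ordered[:3]
    excluded = set()
    # top3 is in descending order, so each pair is (minuend, subtrahend).
    if discount is not None:
        for minuend, subtrahend in combinations(top3, 2):
            if minuend - subtrahend == discount:
                excluded.add(minuend)  # treated as the pre-discount amount
    # If no difference matched, exclude only a possible standard price.
    if not excluded and standard_price in top3:
        excluded.add(standard_price)
    remaining = [a for a in ordered if a not in excluded]
    if len(remaining) < 2:
        return None
    return remaining[0], remaining[1]


# Values from the FIG. 2 example.
fig2 = [5_489_000, 5_000_000, 4_990_000, 499_000, 10_000]
pair = top_two_after_exclusion(fig2, standard_price=5_000_000, discount=10_000)
```

With the FIG. 2 values only the pair (5,000,000, 4,990,000) yields a difference of 10,000, so 5,000,000 is excluded and the result is (5,489,000, 4,990,000).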

次にステップＳ３９において、制御部１１は、見積金額（税込見積金額及び税抜見積金額）を判定する。ここでは、制御部１１は、１番大きい金額（「￥５，４８９，０００」）から２番目に大きい金額（「￥４，９９０，０００」）を減算した金額が「消費税」に対応する金額（「￥４９９，０００」）に一致する場合に、１番大きい金額（「￥５，４８９，０００」）を税込見積金額に決定し、２番目に大きい金額（「￥４，９９０，０００」）を税抜見積金額に決定する。 Next, in step S39, the control unit 11 determines the estimated amount (estimated amount including tax and estimated amount excluding tax). Here, if the amount obtained by subtracting the second largest amount ("¥4,990,000") from the largest amount ("¥5,489,000") matches the amount corresponding to "consumption tax" ("¥499,000"), the control unit 11 determines the largest amount ("¥5,489,000") as the estimated amount including tax, and determines the second largest amount ("¥4,990,000") as the estimated amount excluding tax.

ここで、キーワードとして「消費税」が存在しない場合には、制御部１１は、１番大きい金額と２番目に大きい金額との比率を求め、求めた比率が消費税率に該当するか否かを判定する。その際、制御部１１は、事前にＯＣＲ結果から書類の上部の発行日を抽出できている場合には、その発行日に従って、その年月日に対応する税率と計算で求めた税率とが一致する場合に１番大きい金額を税込見積金額に決定し、２番目に大きい金額を税抜見積金額に決定する。 If "consumption tax" is not present as a keyword, the control unit 11 finds the ratio between the largest amount and the second largest amount, and determines whether that ratio corresponds to the consumption tax rate. In doing so, if the issue date at the top of the document has been extracted from the OCR results in advance, the control unit 11 compares the tax rate applicable on that date with the calculated rate. If they match, it determines the largest amount as the estimated amount including tax and the second largest amount as the estimated amount excluding tax.

また、制御部１１は、「消費税」のキーワードとともに「１０％」などの文字列を抽出できている場合には、当該文字列の税率（パーセンテージ）と計算で求めた税率とが一致する場合に１番大きい金額を税込見積金額に決定し、２番目に大きい金額を税抜見積金額に決定する。 In addition, if the control unit 11 is able to extract a character string such as "10%" along with the keyword "consumption tax," and if the tax rate (percentage) in the character string matches the calculated tax rate, it will determine the largest amount as the estimated amount including tax, and the second largest amount as the estimated amount excluding tax.
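The rate check can be sketched as below. The standard Japanese consumption tax rates and their effective dates are public facts, but the table layout, the function names, and the one-yen rounding tolerance are assumptions. Reduced rates for particular goods are not handled.

```python
from datetime import date
from typing import Optional

# Standard consumption tax rates in Japan, newest first (percent).
_STANDARD_RATES = [
    (date(2019, 10, 1), 10),
    (date(2014, 4, 1), 8),
    (date(1997, 4, 1), 5),
    (date(1989, 4, 1), 3),
]


def rate_percent_on(issue_date: date) -> Optional[int]:
    """Standard rate in force on the issue date, or None before 1989-04-01."""
    for start, percent in _STANDARD_RATES:
        if issue_date >= start:
            return percent
    return None


def matches_rate(largest: int, second: int, percent: int, tolerance_yen: int = 1) -> bool:
    """True if largest - second equals percent % of second, within rounding."""
    # Compare in units of 1/100 yen to stay in integer arithmetic.
    return abs((largest - second) * 100 - second * percent) <= tolerance_yen * 100


rate = rate_percent_on(date(2022, 12, 1))  # a hypothetical issue date
ok = rate is not None and matches_rate(5_489_000, 4_990_000, rate)
```

When a percentage such as "10%" is printed next to the keyword, that value could be passed as `percent` in place of the date-based lookup.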

ここで、制御部１１は、書類の区分に応じて金額を決定してもよい。例えば、書類が見積書又は注文書の場合には、「消費税」のキーワードが存在せず、さらに１番大きい金額と２番目に大きい金額との比率が消費税率に該当しない場合に、制御部１１は、１番大きい金額の位置周辺に「税抜」、「税込」の文字列が存在するかどうかを判定し、「税抜」が存在する又はいずれも存在しない場合に税込金額表示なしと判定して、１番大きい金額を税抜金額に決定する。また、「税込」の文字列が存在する場合には、制御部１１は、税抜金額表示なしと判定して、１番大きい金額を税込金額に決定する。 Here, the control unit 11 may determine the amount according to the type of document. For example, if the document is a quotation or a purchase order, the keyword "consumption tax" is not present, and the ratio between the largest amount and the second largest amount does not correspond to the consumption tax rate, the control unit 11 determines whether the character strings "excluding tax" and "including tax" are present around the position of the largest amount. If "excluding tax" is present or neither is present, it determines that no amount including tax is displayed, and determines the largest amount as the amount excluding tax. If the character string "including tax" is present, the control unit 11 determines that no amount excluding tax is displayed, and determines the largest amount as the amount including tax.

Similarly, when the document is an invoice or a delivery note, the keyword "consumption tax" is not present, and the ratio between the largest and second-largest amounts does not correspond to a consumption tax rate, the control unit 11 checks whether the character string "excluding tax" (税抜) or "including tax" (税込) appears near the largest amount. If "including tax" is present, or if neither string is present, the control unit 11 determines that no tax-exclusive amount is shown and determines the largest amount to be the tax-inclusive amount. If "excluding tax" is present, the control unit 11 determines that no tax-inclusive amount is shown and determines the largest amount to be the tax-exclusive amount.
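
The two document-type rules above differ only in their default, which a short sketch makes visible. The document-type labels and the function name are hypothetical. The case where both strings appear near the amount is not addressed in the text, so the sketch resolves it by checking the strings in the order the text names them, which is an assumption.

```python
QUOTE_LIKE = {"quotation", "order_form"}    # default: tax-exclusive
BILL_LIKE = {"invoice", "delivery_note"}    # default: tax-inclusive

def label_largest_amount(doc_type, nearby_text):
    """Label the largest amount when no tax keyword or tax ratio was found.

    `nearby_text` is the OCR text around the largest amount.
    """
    has_excl = "税抜" in nearby_text   # "excluding tax"
    has_incl = "税込" in nearby_text   # "including tax"
    if doc_type in QUOTE_LIKE:
        # "税抜" present, or neither string present -> tax-exclusive.
        return "tax_exclusive" if has_excl or not has_incl else "tax_inclusive"
    if doc_type in BILL_LIKE:
        # "税込" present, or neither string present -> tax-inclusive.
        return "tax_inclusive" if has_incl or not has_excl else "tax_exclusive"
    raise ValueError(f"unknown document type: {doc_type}")
```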

If "consumption tax" is set as a target item name to be extracted, the control unit 11 also determines the consumption tax amount.

The control unit 11 executes the amount determination process as described above. Having determined the estimated amounts (tax-inclusive and tax-exclusive), which are the target item values, the control unit 11 outputs the result of the amount determination process in step S4 (see FIG. 8). Here, the control unit 11 causes the operation terminal 2 to display the extraction result page P2 shown in FIG. 5, which shows the tax-inclusive estimated amount ("¥5,489,000") and the tax-exclusive estimated amount ("¥4,990,000").

Next, in step S5, the control unit 11 determines whether the user has pressed the "OK" button K1. The user checks on the extraction result page P2 whether the estimated amounts (tax-inclusive and tax-exclusive) are correct and, if so, presses the "OK" button K1. When the user presses the "OK" button K1 (S5: Yes), the control unit 11 advances the process to step S6. When the user does not press the "OK" button K1 (S5: No), the control unit 11 advances the process to step S51.

In step S51, the control unit 11 determines whether a correction operation has been received from the user. A user who judges the estimated amounts (tax-inclusive and tax-exclusive) to be wrong corrects them to the right values. When a correction operation is received (S51: Yes), the control unit 11 corrects the estimated amounts and redisplays them in step S52. The control unit 11 then returns the process to step S5.

In step S6, the control unit 11 registers the result of the amount determination process. Here, the control unit 11 registers the determined estimated amounts (tax-inclusive and tax-exclusive) in a database together with the quotation details (for example, product name, quotation issue date, and company name). If the extraction targets include "consumption tax," the control unit 11 displays the consumption tax amount on the extraction result page P2 and also registers it in the database.

The control unit 11 executes the amount extraction process as described above, and does so each time it acquires image data P1 from an operation terminal 2.

As described above, the image processing system 10 according to this embodiment extracts, from image data P1, a target item value corresponding to a target item name to be extracted. The image processing system 10 extracts a plurality of item names and a plurality of item values corresponding to the respective item names from the image data P1, identifies among the extracted item names a plurality of related item names that are related to the target item name (for example, "standard price," "discount amount," and "consumption tax"), and identifies among the extracted item values a plurality of related item values corresponding to the respective related item names. The image processing system 10 then outputs, as the target item value, a calculated item value calculated from the identified related item values.
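
As a rough illustration of a calculated item value, the sketch below derives estimated amounts from the three related item values named above. The arithmetic (standard price minus discount, plus tax) is an assumption for the example, as this passage does not state the formula; the dictionary keys, the function name, and the input figures are likewise hypothetical.

```python
def calculate_estimate(related):
    """Derive estimated amounts from related item values (all in yen).

    Assumed arithmetic: tax-exclusive = standard price - discount,
    tax-inclusive = tax-exclusive + consumption tax (when tax is known).
    """
    tax_exclusive = related["standard_price"] - related["discount_amount"]
    tax = related.get("consumption_tax")
    tax_inclusive = tax_exclusive + tax if tax is not None else None
    return {"tax_exclusive": tax_exclusive, "tax_inclusive": tax_inclusive}

# Hypothetical related item values chosen to be consistent with page P2.
result = calculate_estimate(
    {"standard_price": 6_000_000, "discount_amount": 1_010_000, "consumption_tax": 499_000}
)
```

The point of the design is that the target amount need not be printed next to its own item name on the document; it can be rebuilt from values that are.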

For example, for the character string information (PDF character string information) of image data P1 recognized as a quotation or an order form, the control unit 11 extracts keywords such as "standard price," "discount amount," and "consumption tax" from the recognized character strings, extracts the numeric character string located to the right of or below each keyword as the value (amount) corresponding to that keyword, and extracts every numeric character string that looks like an amount from all recognized character strings as an amount candidate. If the keyword "standard price" or "discount amount" is present and its value has been found, the control unit 11 excludes the amounts related to that value from the amount candidates, extracts the largest and second-largest amounts from the remaining candidates, and determines the tax-inclusive and tax-exclusive amounts from these two amounts and the consumption tax amount or the consumption tax rate. If the keyword "consumption tax" is not present, the ratio between the largest and second-largest amounts does not correspond to the consumption tax rate, and neither "including tax" nor "excluding tax" is present, the control unit 11 determines that no tax-inclusive amount is shown and determines the largest amount to be the tax-exclusive amount.

Similarly, for the character string information (PDF character string information) of image data P1 recognized as an invoice or a delivery note, the control unit 11 extracts keywords such as "standard price," "discount amount," and "consumption tax" from the recognized character strings, extracts the numeric character string located to the right of or below each keyword as the value (amount) corresponding to that keyword, and extracts every numeric character string that looks like an amount from all recognized character strings as an amount candidate. If the keyword "standard price" or "discount amount" is present and its value has been found, the control unit 11 excludes the amounts related to that value from the amount candidates, extracts the largest and second-largest amounts from the remaining candidates, and determines the tax-inclusive and tax-exclusive amounts from these two amounts and the consumption tax amount or the consumption tax rate. If the keyword "consumption tax" is not present, the ratio between the largest and second-largest amounts does not correspond to the consumption tax rate, and neither "including tax" nor "excluding tax" is present, the control unit 11 determines that no tax-exclusive amount is shown and determines the largest amount to be the tax-inclusive amount.
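
The "right of or below the keyword" lookup can be sketched as a search over OCR tokens with bounding boxes. The `Token` layout, the alignment tolerances (half the keyword's height or width), and the nearest-candidate rule are assumptions for illustration; the text specifies only the direction of the search.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height

def parse_amount(text):
    """Return the integer amount for strings like '¥5,489,000', else None."""
    cleaned = text.strip().replace(",", "").replace("，", "")
    match = re.fullmatch(r"[¥￥]?(\d+)円?", cleaned)
    return int(match.group(1)) if match else None

def value_for(keyword, tokens):
    """Nearest amount to the right of (same row) or below (same column)
    the first token that contains `keyword`; None if nothing qualifies."""
    anchors = [t for t in tokens if keyword in t.text]
    if not anchors:
        return None
    key = anchors[0]
    best = None
    for tok in tokens:
        amount = parse_amount(tok.text)
        if tok is key or amount is None:
            continue
        same_row = abs(tok.y - key.y) <= key.h / 2 and tok.x > key.x
        same_col = abs(tok.x - key.x) <= key.w / 2 and tok.y > key.y
        if same_row or same_col:
            distance = abs(tok.x - key.x) + abs(tok.y - key.y)
            if best is None or distance < best[0]:
                best = (distance, amount)
    return best[1] if best else None
```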

With the above configuration, the item value to be extracted can be extracted appropriately from the image data P1 even when the positional relationship between an item value and its item name does not match a predetermined rule.

The present disclosure is not limited to the embodiment described above, and may also take the forms described below.

As another embodiment of the present disclosure, the control unit 11 may extract only numeric character strings from the image data P1. For example, a learning model is trained in advance on character strings that contain numbers, such as amounts, telephone numbers, postal codes, company numbers, and document numbers. When the control unit 11 acquires image data P1, it runs both AI inference to extract numeric character strings and OCR on the entire image. The control unit 11 then performs amount extraction using as inputs the OCR result (with per-string delimiter information) and the numeric string extraction result (the upper-left and lower-right coordinates of each character string rectangle). Because the AI infers, from image features, rectangular regions that appear to contain numeric character strings, once the model is accurate enough its output can be trusted over the OCR result: even if OCR misrecognizes a digit as a letter or another symbol, word correction can restore the correct digit. If the model can further be made to infer only character strings that look like amounts, part of the amount extraction process can be simplified, improving processing performance as well as accuracy.
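
The word-correction idea can be illustrated with a small sketch: OCR words that fall inside a region the detector marked as numeric have look-alike letters mapped back to digits. The confusion table, the 50% overlap threshold, and the data layout are assumptions chosen for the example; the text does not specify how the correction is performed.

```python
# Letters that OCR commonly confuses with digits (illustrative, not exhaustive).
LOOKALIKES = str.maketrans(
    {"O": "0", "o": "0", "D": "0", "l": "1", "I": "1", "|": "1",
     "Z": "2", "S": "5", "B": "8"}
)

def overlap_ratio(word_box, region_box):
    """Fraction of the word box (x1, y1, x2, y2) covered by the region box."""
    wx1, wy1, wx2, wy2 = word_box
    rx1, ry1, rx2, ry2 = region_box
    width = min(wx2, rx2) - max(wx1, rx1)
    height = min(wy2, ry2) - max(wy1, ry1)
    if width <= 0 or height <= 0:
        return 0.0
    return (width * height) / ((wx2 - wx1) * (wy2 - wy1))

def correct_numeric_words(ocr_words, numeric_regions, threshold=0.5):
    """Map look-alike letters to digits in words inside numeric regions.

    `ocr_words` is a list of (text, box); `numeric_regions` is a list of boxes
    given by upper-left and lower-right corners, as in the text above.
    """
    corrected = []
    for text, box in ocr_words:
        if any(overlap_ratio(box, region) >= threshold for region in numeric_regions):
            text = text.translate(LOOKALIKES)
        corrected.append((text, box))
    return corrected
```

Words outside every numeric region are left untouched, so ordinary text such as "TOTAL" is not damaged by the digit mapping.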

For example, for the character string information (PDF character string information) of image data P1 recognized as a quotation, an order form, a delivery note, or an invoice, the control unit 11 extracts keywords such as "standard price," "discount amount," and "consumption tax" from the recognized character strings, extracts the numeric character string located to the right of or below each keyword as the value (amount) corresponding to that keyword, and extracts every numeric character string that looks like an amount from the recognized numeric string regions as an amount candidate. If the keyword "standard price" or "discount amount" is present and its value has been found, the control unit 11 excludes the amounts related to that value from the amount candidates, extracts the largest and second-largest amounts from the remaining candidates, and determines the tax-inclusive and tax-exclusive amounts from these two amounts and the consumption tax amount or the consumption tax rate. If the keyword "consumption tax" is not present, the ratio between the largest and second-largest amounts does not correspond to the consumption tax rate, and neither "including tax" nor "excluding tax" is present, the control unit 11 proceeds according to the document type. For a quotation or an order form, it determines that no tax-inclusive amount is shown and determines the largest amount to be the tax-exclusive amount. For an invoice or a delivery note, it determines that no tax-exclusive amount is shown and determines the largest amount to be the tax-inclusive amount.

In the embodiment described above, the image processing device 1 alone corresponds to the image processing system according to the present disclosure, but the image processing system may instead be composed of the image processing device 1 and the operation terminal 2. For example, when components of the image processing device 1 and the operation terminal 2 cooperate to share execution of the amount extraction process, the system that includes those components corresponds to the image processing system according to the present disclosure. When the operation terminal 2 executes the amount extraction process, the operation terminal 2 alone may constitute the image processing system according to the present disclosure.

[Appendices to the Disclosure]
An overview of the disclosure extracted from the above embodiment is given in the appendices below. The configurations and processing functions described in the appendices may be selected and combined as desired.

<Appendix 1>
An image processing system that extracts, from image data, a target item value corresponding to a target item name to be extracted, the image processing system comprising:
an extraction processing unit that extracts, from the image data, a plurality of item names and a plurality of item values corresponding to the respective item names;
an identification processing unit that identifies, among the plurality of item names extracted by the extraction processing unit, a plurality of related item names related to the target item name, and identifies, among the plurality of item values extracted by the extraction processing unit, a plurality of related item values corresponding to the respective related item names; and
an output processing unit that outputs, as the target item value, a calculated item value calculated based on the plurality of related item values identified by the identification processing unit.

<Appendix 2>
The image processing system according to Appendix 1, wherein the identification processing unit identifies, as the plurality of related item names, those item names among the plurality of item names that are necessary for calculating the target item value.

<Appendix 3>
The image processing system according to Appendix 2, wherein the identification processing unit identifies the item value corresponding to each related item name based on the positional relationship between item names and item values.

<Appendix 4>
The image processing system according to Appendix 3, wherein the identification processing unit identifies an item value located to the right of or below a related item name as the item value corresponding to that related item name.

<Appendix 5>
The image processing system according to any one of Appendices 1 to 4, wherein
the identification processing unit identifies a first item name and a second item name relating to amounts, and identifies a first amount corresponding to the first item name and a second amount corresponding to the second item name, and
the output processing unit outputs, as the target item value, a third amount calculated based on the first amount and the second amount.

<Appendix 6>
The image processing system according to any one of Appendices 1 to 5, wherein the output processing unit causes a display terminal to display the target item value.

<Appendix 7>
The image processing system according to Appendix 6, wherein the output processing unit causes the display terminal to display the image data and to display, in an identifiable manner, the item value in the image data that corresponds to the target item value.

<Appendix 8>
The image processing system according to Appendix 6 or 7, wherein, when an operation to correct the target item value is received from a user on the display terminal, the output processing unit redisplays the corrected target item value.

<Appendix 9>
The image processing system according to Appendix 6, wherein the output processing unit causes the display terminal to display the image data and to display the plurality of related item values in the image data in an identifiable manner.

<Appendix 10>
The image processing system according to any one of Appendices 6 to 9, wherein a result of character recognition processing on the image data is displayed on an operation terminal, and, when an operation to correct the result of the character recognition processing is received from a user, the correction content is learned.

1: Image processing device
2: Operation terminal
10: Image processing system
11: Control unit
12: Storage unit
13: Operation display unit
14: Communication unit
21: Control unit
22: Storage unit
23: Operation display unit
24: Communication unit
111: Acquisition processing unit
112: Extraction processing unit
113: Identification processing unit
114: Determination processing unit
115: Output processing unit
P1: Image data
P2: Extraction result page

Claims

An image processing system that extracts, from image data, a target item value corresponding to a target item name to be extracted, the image processing system comprising:
an identification processing unit that identifies, in the image data including a plurality of item names and a plurality of item values corresponding to the respective item names, a plurality of related item names related to the target item name among the plurality of item names, and identifies a plurality of related item values corresponding to the respective related item names among the plurality of item values; and
an output processing unit that outputs, as the target item value, a calculated item value calculated based on the plurality of related item values identified by the identification processing unit.

The image processing system according to claim 1, wherein the identification processing unit identifies, as the plurality of related item names, those item names among the plurality of item names that are necessary for calculating the target item value.

The image processing system according to claim 2, wherein the identification processing unit identifies the item value corresponding to each related item name based on the positional relationship between item names and item values.

The image processing system according to claim 3, wherein the identification processing unit identifies an item value located to the right of or below a related item name as the item value corresponding to that related item name.

The image processing system according to claim 1, wherein
the identification processing unit identifies a first item name and a second item name relating to amounts, and identifies a first amount corresponding to the first item name and a second amount corresponding to the second item name, and
the output processing unit outputs, as the target item value, a third amount calculated based on the first amount and the second amount.

The image processing system according to claim 1, wherein the output processing unit causes a display terminal to display the target item value.

The image processing system according to claim 6, wherein the output processing unit causes the display terminal to display the image data and to display, in an identifiable manner, the item value in the image data that corresponds to the target item value.

The image processing system according to claim 6, wherein, when an operation to correct the target item value is received from a user on the display terminal, the output processing unit redisplays the corrected target item value.

The image processing system according to claim 6, wherein the output processing unit causes the display terminal to display the image data and to display the plurality of related item values in the image data in an identifiable manner.

The image processing system according to claim 6, wherein a result of object recognition processing on the image data is displayed on an operation terminal, and, when an operation to correct a character string rectangular area that is the result of the object recognition processing is received from a user, the correction content is learned.

An image processing method for extracting, from image data, a target item value corresponding to a target item name to be extracted, the method being executed by one or more processors and comprising:
extracting, from the image data, a plurality of item names and a plurality of item values corresponding to the respective item names;
identifying a plurality of related item names related to the target item name among the plurality of item names, and identifying a plurality of related item values corresponding to the respective related item names among the plurality of item values; and
outputting, as the target item value, a calculated item value calculated based on the plurality of related item values.

An image processing program for extracting, from image data, a target item value corresponding to a target item name to be extracted, the program causing one or more processors to execute:
extracting, from the image data, a plurality of item names and a plurality of item values corresponding to the respective item names;
identifying a plurality of related item names related to the target item name among the plurality of item names, and identifying a plurality of related item values corresponding to the respective related item names among the plurality of item values; and
outputting, as the target item value, a calculated item value calculated based on the plurality of related item values.