JP2024067409A

JP2024067409A - Character recognition device, character recognition method, and program

Info

Publication number: JP2024067409A
Application number: JP2022177461A
Authority: JP
Inventors: 秀策安田; Shusaku Yasuda; 拓也村田; Takuya Murata; 草太小川; Sota Ogawa; 真由美斎藤; Mayumi Saito
Original assignee: Mitsubishi Heavy Industries Ltd
Current assignee: Mitsubishi Heavy Industries Ltd
Priority date: 2022-11-04
Filing date: 2022-11-04
Publication date: 2024-05-17

Abstract

To provide a character recognition device which accurately recognizes characters described on a drawing sheet.

SOLUTION: A character recognition device includes: a learning data acquisition unit which acquires first learning data obtained by attaching alternative characters as a correct answer label to image data of a displayed object described on a drawing sheet and not being a character; a learning unit which creates a character recognition model by learning the first learning data; a recognition object data acquisition unit which acquires recognition object image data including displayed objects being a character and not a character; and a recognition unit which recognizes characters from the recognition object image data on the basis of the character recognition model, and outputs residual characters obtained by deleting the alternative characters from the recognized characters as a recognition result.

SELECTED DRAWING: Figure 1

Description

The present disclosure relates to a character recognition device, a character recognition method, and a program.

In plant design work, such as for nuclear power plants, engineers frequently cross-check multiple drawings using the symbols, equipment numbers, drawing numbers, and similar information that appear on them. At present, determining equipment types, reading equipment numbers, and associating the two are done by hand on paper drawings, which takes a great many man-hours. Automating this work requires reading equipment numbers by OCR (Optical Character Recognition), determining equipment types by symbol recognition, and associating equipment numbers with equipment information. Patent Document 1 discloses a technique for reading characters, such as the dimensions needed to calculate the weight of a building structure, from architectural drawings. In Patent Document 1, the columns, walls, steel frames, and other elements of the structure are recognized in the architectural drawing and separated from the characters, and recognition is then performed on the character regions to improve character recognition accuracy. Patent Document 2 discloses a technique for improving the recognition accuracy of symbols such as valves, pumps, and electronic elements drawn on drawings.

Patent Document 1: JP 2019-207530 A
Patent Document 2: JP 2022-63599 A

One factor that lowers character recognition accuracy on drawings is misrecognition caused by symbols and other objects that appear on the drawing, but Patent Documents 1 and 2 do not disclose an effective countermeasure against misrecognition of this kind. A method for improving character recognition on drawings is needed.

The present disclosure provides a character recognition device, a character recognition method, and a program that can solve the above problem.

The character recognition device of the present disclosure includes: a learning data acquisition unit that acquires first learning data in which a substitute character is attached, as a correct-answer label, to image data of a non-character display object written on a drawing; a learning unit that creates a character recognition model by learning the first learning data; a recognition target data acquisition unit that acquires recognition target image data containing characters and the non-character display object; and a recognition unit that recognizes characters from the recognition target image data based on the character recognition model and outputs, as a recognition result, the characters that remain after the substitute character is deleted from the recognized characters.

The character recognition method of the present disclosure includes the steps of: acquiring first learning data in which a substitute character is attached, as a correct-answer label, to image data of a non-character display object written on a drawing; creating a character recognition model by learning the first learning data; acquiring recognition target image data containing characters and the non-character display object; and recognizing characters from the recognition target image data based on the character recognition model and outputting, as a recognition result, the characters that remain after the substitute character is deleted from the recognized characters.

The program of the present disclosure causes a computer to execute the steps of: acquiring first learning data in which a substitute character is attached, as a correct-answer label, to image data of a non-character display object written on a drawing; creating a character recognition model by learning the first learning data; acquiring recognition target image data containing characters and the non-character display object; and recognizing characters from the recognition target image data based on the character recognition model and outputting, as a recognition result, the characters that remain after the substitute character is deleted from the recognized characters.

According to the character recognition device, character recognition method, and program described above, the recognition rate of characters written on a drawing can be improved.

FIG. 1 is a block diagram showing an example of a character recognition device according to an embodiment.
FIG. 2 is a diagram showing an example of the effect of a trained OCR model according to the embodiment.
FIG. 3A is a first diagram showing an example of misrecognition according to the embodiment.
FIG. 3B is a second diagram showing an example of misrecognition according to the embodiment.
FIG. 4 is a diagram illustrating a learning method for preventing misrecognition according to the embodiment.
FIG. 5A is a first diagram showing another example of misrecognition according to the embodiment.
FIG. 5B is a second diagram showing another example of misrecognition according to the embodiment.
The remaining drawings are a flowchart showing an example of processing for improving the recognition rate according to the embodiment, a flowchart showing an example of character recognition processing for a drawing according to the embodiment, a flowchart showing an example of post-processing according to the embodiment, a diagram showing an example of the effect of the post-processing according to the embodiment, and a diagram showing an example of the hardware configuration of the character recognition device according to the embodiment.

<Embodiment>
The character recognition method of the present disclosure will be described below with reference to the drawings.
(Configuration)
FIG. 1 is a block diagram showing an example of a character recognition device according to an embodiment. The character recognition device 10 is configured with one or more computers. The character recognition device 10 accurately recognizes characters written on a drawing. As illustrated, the character recognition device 10 includes a data acquisition unit 11, an input reception unit 12, a processing unit 13, an output unit 14, and a storage unit 15.

The data acquisition unit 11 acquires image data containing characters and/or non-character display objects written on a drawing of a machine, plant, or the like. For example, the data acquisition unit 11 acquires training image data to which correct-answer labels are attached, and image data on which character recognition is to be performed. Characters are the characters to be read from the drawing: alphanumeric characters, katakana, hiragana, kanji, and symbol characters to be read, such as parentheses and hyphens. Non-character display objects are notations on the drawing that are not to be read, such as symbols (for example, contact symbols) and the ruled lines of tables. When a general OCR model is used to read the characters on a drawing, non-character display objects are often misrecognized as characters to be read. As described later, the character recognition device 10 has a function for preventing such misrecognition.

The input reception unit 12 is configured using input devices such as a keyboard, mouse, touch panel, and input interface, and receives the various settings for improving the character recognition rate, instructions to execute character recognition processing, and other input entered through these devices.

The processing unit 13 executes processing such as character recognition and the creation of a deep learning model for character recognition (referred to as an OCR model). The processing unit 13 includes a learning unit 131, a character recognition unit 132, and a post-processing unit 133.
The learning unit 131 performs additional training on a general OCR model provided for character recognition (for example, FOTS or MaskTextSpotter) to create an OCR model specialized for character recognition on drawings. The OCR model is, for example, a deep learning model. A general pre-trained OCR model, trained on general images such as outdoor scenes, cannot recognize characters around symbols on a drawing or characters inside tables with practical accuracy. Drawings also contain a variety of symbols, tables, and figures in addition to characters, which readily induces misrecognition. Furthermore, characters written on drawings can differ greatly in form from the characters that OCR is generally designed to read, which lowers recognition accuracy. To address this, the learning unit 131 performs additional training of the OCR model using the two methods described later, creating an OCR model specialized for character recognition on drawings and improving the character recognition rate for drawings.

The character recognition unit 132 performs character recognition on the drawing using the OCR model created by the learning unit 131.
The post-processing unit 133 performs post-processing on the character recognition results of the character recognition unit 132. Post-processing is processing that corrects misrecognition that still occurs even when an OCR model specialized for character recognition on drawings is used. The post-processing unit 133 also performs integration processing, which integrates the recognition results for drawing characters obtained from multiple OCR models.

The output unit 14 outputs the results of the recognition of characters on the drawing by the processing unit 13 to a display device, another device, an electronic file, or the like.
The storage unit 15 stores various data needed for character recognition. For example, the storage unit 15 stores the image data acquired by the data acquisition unit 11 and the setting information received by the input reception unit 12. The storage unit 15 also stores one or more OCR models. For convenience, FIG. 1 shows OCR model 1. When, for example, two OCR models are stored, they are referred to as OCR model 1 and OCR model 2; when there is no need to distinguish between them, they are referred to simply as the OCR model. As described later, this embodiment improves the character recognition rate by combining, as an example, two OCR models, OCR model 1 and OCR model 2 (integration processing), but the number of OCR models combined is not limited. Three or more types of OCR model may be used, or only one. (When only one type of OCR model is used, the integration processing cannot contribute to the character recognition rate. However, as described later, this embodiment improves the character recognition rate for drawings through several methods, and those methods still improve the recognition rate when only one type of OCR model is used.)
When multiple OCR models are used, the learning unit 131 and the character recognition unit 132 each have a function (program) for each OCR model. For example, when the learning unit 131 creates OCR model 1, it performs the additional training using the function built for OCR model 1 (the learning unit for OCR model 1), and when the character recognition unit 132 performs character recognition using OCR model 2, it uses the function built for OCR model 2 (the character recognition unit for OCR model 2). In the following, rather than writing "the learning unit 131 for OCR model 1" or "the learning unit 131 for OCR model 2", the description simply states that the learning unit 131 creates OCR model 1 or that the character recognition unit 132 performs character recognition using OCR model 2; in practice, the processing is performed using the function for each OCR model as described above.

This embodiment provides two broad approaches to improving the character recognition rate: training a general OCR model to create a model specialized for character recognition on drawings, and improving recognition accuracy by post-processing the recognition results of the OCR model.

<Method of improving the character recognition rate by training the OCR model>
First, the approach of training a general OCR model to create a model specialized for character recognition on drawings will be described.
(Method A) Training data is created by attaching correct-answer labels to the characters to be read that are written on drawings, and the learning unit 131 performs training with this data (for example, by fine-tuning the deep learning model). This yields a deep learning model specialized for drawings. Training the deep learning model to specialize in drawings improves the accuracy of character recognition itself and, as a result, reduces misrecognition of symbols and the like. FIG. 2 shows an example of the drawing character recognition rate of OCR models created by learning training data suited to drawing character recognition. Graph 2a in FIG. 2 shows the character recognition rate of OCR model 1 before and after training. Graph 2b in FIG. 2 shows the character recognition rate before and after training of OCR model 2, which is a different model from OCR model 1. As shown, for both OCR models 1 and 2, learning characters actually written on drawings greatly improves the character string detection rate and the character string recognition rate.

(Method B) Substitute characters are assigned to display objects that tend to induce misrecognition, such as symbols and ruled lines on a drawing, and the deep learning model is trained so that it recognizes each such symbol or ruled line as its assigned substitute character. FIGS. 3A and 3B show examples of misrecognition. Ranges 3a and 3b in the figures indicate the target ranges of character recognition. FIG. 3A shows an example in which the drawing contains "ABC" followed by a symbol (a contact symbol, shaped like a circle with a line extending upward from it), and a general OCR model misrecognizes it as "ABCO" (the letter O). FIG. 3B shows an example in which "ABC" is written inside the frame of a table on the drawing, and a general OCR model misrecognizes the ruled line on the left side of the table as "1", producing "1ABC".
To deal with such misrecognition, a character that is not expected to be used on the drawing is assigned to each display object that induced it. For example, "@" is assigned to the symbol in FIG. 3A and "|" is assigned to the ruled line in FIG. 3B. In Method B, character recognition is first run on the drawing with the OCR model created by Method A, the misrecognitions that remain are identified from the result, and the objects for which substitute characters are to be learned are determined. Here, the misrecognitions described for FIGS. 3A and 3B are assumed to have been identified. FIG. 4 shows the processing performed after these objects have been determined.
1. First, training data is created in which the contact symbol and the table ruled line are represented by substitute characters. For example, the training data used in Method A is modified so that range 3a in FIG. 3A has the correct-answer label "ABC@" and range 3b in FIG. 3B has the correct-answer label "|ABC". A substitute character should preferably be a character that is not expected to be used on the drawing and that resembles in shape the display object causing the misrecognition. Choosing a substitute character with a similar shape reduces the load of the training performed next.
2. Next, the learning unit 131 trains the OCR model on the training data created in step 1. This produces an OCR model that recognizes the character string in range 3a of FIG. 3A as "ABC@" and the character string in range 3b of FIG. 3B as "|ABC".
3. Next, the character recognition unit 132 is configured to delete the substitute characters from the recognition results of the OCR model created in step 2. For example, when a substitute character is at the beginning or end of the recognized character string, the character recognition unit 132 simply deletes it. If the recognition result is "ABC@", the character recognition unit 132 deletes "@" and obtains "ABC". When a substitute character is in the middle of the recognition result, the character recognition unit 132 splits the character string at the substitute character and then deletes the substitute character. If the recognition result is "ABC@DEF", the character recognition unit 132 outputs "ABC" and "DEF". If the recognition result is "A|B|C", it outputs "A", "B", and "C". The character recognition unit 132 outputs the characters or character strings that remain after the substitute characters are deleted as the recognition result.
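The deletion rule in step 3 can be illustrated with a minimal Python sketch. It is not the implementation of the character recognition unit 132; the function name `strip_substitutes` and the constant `SUBSTITUTE_CHARS` are hypothetical, and the sketch assumes that "@" and "|" are the only substitute characters in use.

```python
import re
from typing import List

# Assumption: "@" stands for the contact symbol and "|" for a table ruled line,
# as in the example above. Other drawings may need other substitute characters.
SUBSTITUTE_CHARS = "@|"


def strip_substitutes(recognized: str, substitutes: str = SUBSTITUTE_CHARS) -> List[str]:
    """Split a recognized string at substitute characters and drop them.

    A substitute at the start or end is simply removed; a substitute in the
    middle splits the string into separate character strings.
    """
    # Build a character class such as [@\|]; re.escape protects "|".
    pattern = "[" + re.escape(substitutes) + "]"
    # re.split removes the substitutes; empty pieces arise from leading,
    # trailing, or adjacent substitutes and are discarded.
    return [piece for piece in re.split(pattern, recognized) if piece]


if __name__ == "__main__":
    for text in ["ABC@", "|ABC", "ABC@DEF", "A|B|C"]:
        print(text, "->", strip_substitutes(text))
```

Returning a list in every case keeps the leading, trailing, and mid-string cases uniform: a string with a substitute only at one end yields a single-element list.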

<Method of improving the character recognition rate by post-processing>
Next, the approach in which the post-processing unit 133 improves recognition accuracy by post-processing the recognition results of the OCR model will be described. The methods above can eliminate the influence of symbols and ruled lines that are easily misrecognized, but they have difficulty preventing misrecognition of symbol characters such as parentheses and hyphens, which carry important meaning on drawings and must themselves be recognized.

<Method 1>
When one character of a paired set of characters (for example, parentheses) is present at only one of the beginning and the end of a recognized character string, the post-processing unit 133 adds the other character of the pair at the other end of the string. An example is supplying the missing parenthesis when only one of a pair was recognized. FIG. 5A shows an example of a recognition result in which one parenthesis was not recognized: the drawing reads "(ABC1)", but the character recognition unit 132 recognized it as "ABC1)". When a parenthesis is present at the end of a character string and none is present at the beginning, the post-processing unit 133 supplies the opening parenthesis and corrects the result to "(ABC1)". In this way, based on the shape of the parenthesis, the post-processing unit 133 may supply the other parenthesis at the beginning of the character string when the parenthesis at the end has the shape of a closing parenthesis. Similarly, when only an opening parenthesis is present at the beginning, the post-processing unit 133 can supply the other parenthesis at the end of the character string. The brackets to be supplied are not limited to the type illustrated and may be of other shapes, and the targets are not limited to brackets but may be any other paired set of characters. The user may also be able to freely configure which brackets or character pairs are to be supplied. For example, the user configures the character recognition device 10 so that "<" and ">" are treated as a character pair. The input reception unit 12 receives the user's setting and registers it in the storage unit 15. The post-processing unit 133 then supplies brackets and the like based on the user's setting. Method 1 makes it possible, for example, to supply a missing parenthesis, which reduces the number of parentheses that go unrecognized.
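The bracket completion of Method 1 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the implementation of the post-processing unit 133: the function name `complete_brackets` and the table `DEFAULT_PAIRS` are hypothetical, and the sketch inspects only the first and last characters of the recognized string.

```python
from typing import Dict

# Assumption: opening character -> closing character. Per the text, a user
# could register further pairs such as "<" and ">".
DEFAULT_PAIRS: Dict[str, str] = {"(": ")", "[": "]"}


def complete_brackets(text: str, pairs: Dict[str, str] = DEFAULT_PAIRS) -> str:
    """Add the missing partner when only one end of the string has a bracket."""
    if not text:
        return text
    closers = {close: open_ for open_, close in pairs.items()}
    first, last = text[0], text[-1]
    # Opening bracket at the start without its partner at the end.
    if first in pairs and last != pairs[first]:
        return text + pairs[first]
    # Closing bracket at the end without its partner at the start.
    if last in closers and first != closers[last]:
        return closers[last] + text
    return text


if __name__ == "__main__":
    print(complete_brackets("ABC1)"))            # the FIG. 5A case
    print(complete_brackets("<X", {"<": ">"}))  # a user-registered pair
```

Passing the pair table as a parameter mirrors the user-configurable setting described above; a string that already has both brackets, or neither, is returned unchanged.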

<Method 2>
The post-processing unit 133 may compare the recognition result of the character recognition unit 132 with the character strings registered in a dictionary and, when the recognition result is similar to a registered character string, replace the recognition result with the registered character string. Specifically, misrecognition is corrected using a dictionary created from naming rules together with similar-character settings. For example, suppose that a certain part is named "ABC-D$" (where $ represents an arbitrary digit) and that this part name appears many times on the drawing. In this case, the user instructs the character recognition device 10 to register "ABC-D$" in the dictionary. The input reception unit 12 receives this instruction and registers "ABC-D$" in the dictionary in the storage unit 15. Registering "ABC-D$" in the dictionary means that, when the character recognition unit 132 recognizes a character string similar to "ABC-D$", the post-processing unit 133 converts (corrects) the recognized character string to "ABC-D$". For example, when the character recognition unit 132 recognizes the character string "RBC-D1", the post-processing unit 133 corrects this result to "ABC-D1".
As for which character strings are judged similar to "ABC-D$", similar-character settings may, for example, be registered in the storage unit 15, and the post-processing unit 133 may judge whether a given character string is similar to "ABC-D$" based on those settings. As an example of such a setting, "R" and "A" are registered in the storage unit 15 as similar characters (the OCR model tends to recognize "R" as "A" and "A" as "R"). Referring to this setting, the post-processing unit 133 converts "RBC-D1" to "ABC-D1" and compares the converted "ABC-D1" with "ABC-D$". Because the two match, with the digit matching "$", the post-processing unit 133 judges that "RBC-D1" is similar to the registered "ABC-D$" and corrects "RBC-D1" to "ABC-D1". When, on the other hand, the character recognition unit 132 recognizes the character string "FBC-D1", no similar-character setting exists for any of "F", "B", "C", or "D", so the post-processing unit 133 compares "FBC-D1" with "ABC-D$" as it is. Because "FBC" and "ABC" differ, the post-processing unit 133 judges that the recognized "FBC-D1" differs from the registered "ABC-D$" and does not apply a dictionary-based correction to "FBC-D1".
Dictionary registration may be performed directly by the user, or the output unit 14 may display the recognition results of the character recognition unit 132 on a display device so that entries are registered after confirmation by the user. For example, if the drawing contains "ABC-D1", "ABC-D2", and "ABC-D3" and the OCR model recognizes these character strings, converting the digits to $ yields three instances of "ABC-D$". The output unit 14 presents this result on the display device, and the user instructs that it be registered in the dictionary. The input reception unit 12 then registers "ABC-D$" in the dictionary stored in the storage unit 15. In this way, dictionary entries and similar-character settings can be made on the basis of expert knowledge of the drawings, which makes it possible to correct misrecognition using that knowledge.
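The dictionary-based correction of Method 2 can be sketched in Python as follows. The names `match_pattern` and `correct_with_dictionary` are hypothetical, and two points are assumptions rather than statements from the text: "$" is taken to stand for exactly one digit, and similarity is checked position by position against each registered pattern instead of converting the whole string first.

```python
from typing import Iterable, Optional, Set, Tuple


def match_pattern(recognized: str, pattern: str,
                  similar_pairs: Set[Tuple[str, str]]) -> Optional[str]:
    """Return the corrected string if `recognized` is similar to `pattern`, else None."""
    if len(recognized) != len(pattern):
        return None
    corrected = []
    for got, want in zip(recognized, pattern):
        if want == "$":
            # Assumption: "$" matches one digit, which is kept as recognized.
            if not got.isdigit():
                return None
            corrected.append(got)
        elif got == want or (got, want) in similar_pairs or (want, got) in similar_pairs:
            # Exact match, or a registered similar character: use the dictionary character.
            corrected.append(want)
        else:
            return None
    return "".join(corrected)


def correct_with_dictionary(recognized: str, dictionary: Iterable[str],
                            similar_pairs: Set[Tuple[str, str]]) -> str:
    """Apply the first matching dictionary pattern; otherwise return the input unchanged."""
    for pattern in dictionary:
        corrected = match_pattern(recognized, pattern, similar_pairs)
        if corrected is not None:
            return corrected
    return recognized


if __name__ == "__main__":
    dictionary = ["ABC-D$"]
    similar = {("R", "A")}  # "R" and "A" are easily confused by the OCR model
    print(correct_with_dictionary("RBC-D1", dictionary, similar))
    print(correct_with_dictionary("FBC-D1", dictionary, similar))
```

With the settings from the example above, "RBC-D1" is corrected to "ABC-D1", while "FBC-D1" is left as recognized because no similar-character setting links "F" to "A".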

<Method 3>
When one or more character strings satisfy a predetermined condition, the post-processing unit 133 corrects the recognition result by adding a predetermined character at a predetermined position in the recognized character string, based on a rule specifying that the predetermined character is included at that position. For example, suppose there is a rule that, when a character string in a drawing is written across multiple lines, a hyphen is placed at the beginning of every line except the top one. On the premise that such a rule exists, if a multi-line character string is recognized and the second or a subsequent line does not begin with a hyphen, the post-processing unit 133 adds a hyphen to the beginning of that line. (It is assumed here that, when multiple lines are recognized, they are not independent character strings but a single character string written across multiple lines.) FIG. 5B shows an example in which the second line of a character string spanning two lines is written as "-D1" but is recognized by the character recognition unit 132 as "D1". In such a case, the post-processing unit 133 adds "-" to the beginning of the second line. According to Method 3, setting appropriate rules makes it possible to supply a missing hyphen in a character string that is expected to contain one. This enables character recognition that handles equipment numbers containing hyphens and line breaks within character strings in drawings, and improves the character recognition rate for long text and the like.
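The hyphen rule of Method 3 might be sketched as below. The function name `add_leading_hyphens` and the representation of a multi-line character string as a list of lines are assumptions made for illustration only.

```python
def add_leading_hyphens(lines, mark="-"):
    """Apply the rule: every line after the first must start with `mark`."""
    if len(lines) < 2:
        # The condition (a string spanning multiple lines) is not met.
        return list(lines)
    fixed = [lines[0]]
    for line in lines[1:]:
        fixed.append(line if line.startswith(mark) else mark + line)
    return fixed


# The case of FIG. 5B: "-D1" on the second line was recognized as "D1".
print(add_leading_hyphens(["ABC", "D1"]))
```

Lines that already begin with the hyphen are left unchanged, so applying the rule twice gives the same result as applying it once.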

<Method 4>
The post-processing unit 133 integrates the recognition results of a plurality of OCR models. For example, when another OCR model recognizes a character that the OCR model with the highest recognition rate could not recognize, the post-processing unit 133 selects, as the recognition result, the character recognized by whichever of the other OCR models has the highest recognition rate. For example, if OCR model 2 has the higher character recognition rate of OCR models 1 and 2, the post-processing unit 133 adopts the character recognized by OCR model 2 for any character that OCR model 2 recognized, regardless of whether OCR model 1 also recognized it, and adopts the character recognized by OCR model 1 for any character that OCR model 2 could not recognize but OCR model 1 could. This respects the recognition results of OCR model 2, which has the higher recognition rate, while using the recognition results of OCR model 1 to fill in characters that OCR model 2 cannot recognize.
The combination of OCR model 1 and OCR model 2 may be a combination of models created based on different OCR models to be specialized for character recognition in drawings (for example, OCR model 1 is a model created based on FOTS, and OCR model 2 is a model created based on MaskTextSpotter), or a combination of models created by training the same OCR model under different conditions (for example, OCR model 1 is a model created by training FOTS with "training data 1", and OCR model 2 is a model created by training FOTS with "training data 2").
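The integration rule of Method 4 can be illustrated with a short sketch. It assumes that each model's output has already been aligned region by region and that an unrecognized region is represented by `None`; the disclosure does not specify how results are aligned, so this representation and the name `integrate` are hypothetical.

```python
def integrate(results_by_rank):
    """Merge per-model results, highest recognition rate first.

    results_by_rank -- list of result lists, one per OCR model, ordered from the
                       highest to the lowest recognition rate; entry i of every
                       list refers to the same region of the drawing (assumption).
    """
    merged = []
    for candidates in zip(*results_by_rank):
        # Take the result of the best-ranked model that recognized the region.
        merged.append(next((c for c in candidates if c is not None), None))
    return merged


model2 = ["ABC", None, "X1"]   # higher recognition rate
model1 = ["ABD", "P-2", None]  # lower recognition rate
print(integrate([model2, model1]))
```

Because the candidates are scanned in rank order, the same function also covers three or more models without change.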

(Operation)
Next, the process flow of the character recognition device 10 will be described.
First, with reference to FIG. 6, a preparatory process for character recognition in a drawing will be described.
FIG. 6 is a flowchart illustrating an example of a process for improving a recognition rate according to the embodiment.
The learning unit 131 learns learning data for drawing characters and creates an OCR model specialized for character recognition of drawings (step S11). This process corresponds to the above method A. Next, the learning unit 131 learns substitute characters for displayed objects that cause erroneous recognition, such as symbols and ruled lines (step S12). In addition, in accordance with the learning of substitute characters, the character recognition unit 132 is set to delete the substitute characters from the recognition result. This process corresponds to the above method B. Next, the user sets the post-processing related settings for the post-processing unit 133 (step S13). This process is preparation for the above methods 1 to 4. For method 1, characters such as parentheses to be completed are set. For method 2, dictionary registration and similar characters are set. For method 3, rules such as conditions for adding hyphens, etc. (when spanning two lines and there is no - at the beginning of the second line), characters to be added ("-"), and positions to add (the beginning of the second line) are set. For method 4, the order of character recognition rates is set for the multiple OCR models to be used. The input receiving unit 12 receives these settings and registers them in the storage unit 15. It is only necessary for the user to set, among methods 1 to 4, the method to be executed.
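The settings registered in step S13 could be held in a structure along the following lines. This is a sketch only; the class name `PostProcessSettings` and every field name are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class PostProcessSettings:
    # Method 1: pairs of characters to be completed, such as parentheses.
    bracket_pairs: List[Tuple[str, str]] = field(default_factory=lambda: [("(", ")")])
    # Method 2: dictionary entries and similar-character settings.
    dictionary: List[str] = field(default_factory=list)
    similar_characters: List[Tuple[str, str]] = field(default_factory=list)
    # Method 3: character to add and the first line to which the rule applies.
    added_character: str = "-"
    first_line_to_fix: int = 2
    # Method 4: OCR models ordered from highest to lowest recognition rate.
    model_rank: List[str] = field(default_factory=list)
    # Only the methods listed here are executed.
    enabled_methods: Set[int] = field(default_factory=lambda: {1, 2, 3, 4})


settings = PostProcessSettings(
    dictionary=["ABC-D$"],
    similar_characters=[("A", "R")],
    model_rank=["ocr_model_2", "ocr_model_1"],
    enabled_methods={2, 4},
)
print(settings)
```

Leaving a method out of `enabled_methods` corresponds to the user setting only the methods that are to be executed.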

Next, the process of recognizing characters on a drawing will be described with reference to FIGS. 7 and 8.
FIG. 7 is a flowchart illustrating an example of a character recognition process for a drawing according to the embodiment.
First, the data acquisition unit 11 acquires image data to be recognized (step S21). The data acquisition unit 11 records the image data in the storage unit 15. Next, the character recognition unit 132 performs processing to recognize characters from the image data using OCR models 1 and 2. The character recognition unit 132 performs character recognition on the image data recorded in the storage unit 15 using OCR model 1 (step S22). The character recognition unit 132 acquires the character recognition result of OCR model 1 and deletes substitute characters (step S23). The character recognition unit 132 outputs the result after deleting the substitute characters to the post-processing unit 133. Next, the post-processing unit 133 performs post-processing (step S24).

FIG. 8 shows an example of post-processing according to the embodiment. The post-processing unit 133 acquires the recognition result from the character recognition unit 132 (step S31). Next, the post-processing unit 133 performs post-processing on character strings that include parentheses (step S32). For example, the post-processing unit 133 corrects the recognition result "ABC)" to "(ABC)". This processing corresponds to method 1 above. Next, the post-processing unit 133 performs dictionary-based post-processing (step S33). For example, if "RBC-D1" is registered in the dictionary and "A" and "R" are set as similar characters, the post-processing unit 133 corrects the recognition result "RBC-D1" to "ABC-D1". This processing corresponds to method 2 above. Next, the post-processing unit 133 performs post-processing on character strings that include hyphens (step S34). For example, the post-processing unit 133 corrects the recognition result "D1" on the second line to "-D1".
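The parenthesis completion of step S32 can be sketched as follows, assuming the paired characters are supplied as (opening, closing) tuples. The function name `complete_pair` is hypothetical.

```python
def complete_pair(text, pairs=(("(", ")"),)):
    """Add the missing half of a pair when only one half is present.

    Only the beginning and the end of the string are inspected.
    """
    for open_ch, close_ch in pairs:
        starts = text.startswith(open_ch)
        ends = text.endswith(close_ch)
        if starts and not ends:
            return text + close_ch
        if ends and not starts:
            return open_ch + text
    return text


print(complete_pair("ABC)"))
print(complete_pair("(ABC"))
```

Strings that already carry both halves of the pair, or neither, are returned unchanged.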

Similarly, the character recognition unit 132 uses OCR model 2 to perform character recognition on the image data recorded in the storage unit 15 (step S25). The character recognition unit 132 acquires the character recognition result of OCR model 2 and deletes the substitute characters (step S26). The character recognition unit 132 outputs the result after deleting the substitute characters to the post-processing unit 133. Next, the post-processing unit 133 performs post-processing (step S27). The post-processing is the same as that described for OCR model 1 with reference to FIG. 8. Note that the processing of steps S22 to S24 and the processing of steps S25 to S27 may be performed in parallel, or either may be performed first.

Next, the post-processing unit 133 integrates the recognition results obtained through steps S22 to S24 with the recognition results obtained through steps S25 to S27 (step S28). The post-processing unit 133 treats the recognition result of OCR model 2, which has the higher recognition rate, as correct, and for any character that OCR model 2 could not recognize but OCR model 1 could, adopts the recognition result of OCR model 1. Next, the output unit 14 outputs the integrated recognition result to a display device or the like (step S29).
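The flow of steps S22 to S28 might be sketched as below. Everything concrete here is an assumption for illustration: each OCR model is a callable that returns one string (or `None` when nothing was recognized) per region, the substitute characters are passed as a string of single characters, and the two branches are run in parallel with a thread pool, which is one of the orderings the description permits.

```python
from concurrent.futures import ThreadPoolExecutor


def run_branch(model, image, substitutes, post_process):
    """Steps S22 to S24 (or S25 to S27) for a single OCR model."""
    results = []
    for text in model(image):                 # recognition (S22 / S25)
        if text is None:
            results.append(None)
            continue
        for ch in substitutes:                # delete substitute characters (S23 / S26)
            text = text.replace(ch, "")
        results.append(post_process(text))    # post-processing (S24 / S27)
    return results


def recognize(image, models_by_rank, substitutes, post_process):
    """Run every model, then integrate, preferring the best-ranked model (S28)."""
    with ThreadPoolExecutor() as pool:
        branches = list(pool.map(
            lambda m: run_branch(m, image, substitutes, post_process),
            models_by_rank))
    return [next((t for t in cands if t is not None), None)
            for cands in zip(*branches)]


# Stand-in models; "#" is a hypothetical substitute character.
ocr_model_2 = lambda image: ["ABC#", None]
ocr_model_1 = lambda image: ["ABD", "D1#"]
print(recognize(None, [ocr_model_2, ocr_model_1], "#", str.strip))
```

`pool.map` returns results in the order of `models_by_rank`, so the ranking is preserved even though the branches run concurrently.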

In the description of step S13 in FIG. 6, the order of the OCR models is set in descending order of character recognition rate. Alternatively, once steps S23 and S26 have been completed, or once steps S24 and S27 have been completed, it may be verified which OCR model has the higher recognition rate, and step S28 and the subsequent steps may be executed based on the verification result. In addition, post-processing methods 1 to 4 may be configured not to be executed at all, or only some of them may be executed. The processing of steps S32 to S34 in FIG. 8 may be executed in any order.

FIG. 9 shows an example of how the character string recognition rate improves through post-processing. Referring to FIG. 9, it can be confirmed that the recognition accuracy for drawing characters improves each time the post-processing of steps S31 to S34 and step S28 is executed.

(Effects)
As described above, according to the present embodiment, the recognition rate of characters written on drawings can be improved by learning characters on actual drawings, and by regarding symbols and lines specific to drawings as a type of character and performing character recognition once, and excluding substitute characters corresponding to the symbols specific to drawings from the character recognition results. In addition, the recognition rate of characters can be improved by performing character complementation and correction through post-processing after character recognition. Furthermore, by integrating the recognition results of multiple OCR models, it is possible to complement the advantages and disadvantages of multiple OCR models, and the recognition rate and robustness of the character recognition device 10 as a whole can be improved.

FIG. 10 is a diagram illustrating an example of a hardware configuration of the character recognition device according to the embodiment.
The computer 900 includes a CPU 901, a main storage device 902, an auxiliary storage device 903, an input/output interface 904, and a communication interface 905.
The character recognition device 10 described above is implemented in a computer 900. The above-described functions are stored in the auxiliary storage device 903 in the form of a program. The CPU 901 reads the program from the auxiliary storage device 903, loads it in the main storage device 902, and executes the above-described processing in accordance with the program. The CPU 901 also reserves a storage area in the main storage device 902 in accordance with the program. The CPU 901 also reserves a storage area in the auxiliary storage device 903 for storing data being processed in accordance with the program.

A program for implementing all or part of the functions of the character recognition device 10 may be recorded on a computer-readable recording medium, and the processing of each functional unit may be performed by causing a computer system to read and execute the program recorded on the recording medium. The term "computer system" as used herein includes an OS and hardware such as peripheral devices. If a WWW system is used, the term "computer system" also includes a home page providing environment (or display environment). The term "computer-readable recording medium" refers to portable media such as CDs, DVDs, and USB devices, and to storage devices such as hard disks built into a computer system. If the program is distributed to the computer 900 via a communication line, the computer 900 that receives the program may load it into the main storage device 902 and execute the above processing. The program may implement only part of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

Several embodiments of the present disclosure have been described above, but all of these embodiments are presented as examples and are not intended to limit the scope of the invention. These embodiments can be implemented in various other forms, and various omissions, substitutions, and modifications can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and likewise in the invention described in the claims and its equivalents.

<Additional Notes>
The character recognition device, the character recognition method, and the program described in each embodiment can be understood, for example, as follows.

(1) A character recognition device 10 according to a first aspect includes: a learning data acquisition unit that acquires first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing; a learning unit that learns the first learning data to create a character recognition model; a recognition target data acquisition unit that acquires image data of a recognition target including characters and non-character display objects; and a recognition unit that recognizes characters from the image data of the recognition target based on the character recognition model, and outputs, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.
This makes it possible to perform character recognition while reducing the influence of display objects that may cause erroneous character recognition, such as symbols and marks unique to drawings and table lines, thereby improving the character recognition rate.
The data acquisition unit 11 is an example of a learning data acquisition unit. The OCR models 1 and 2 are examples of character recognition models. The data acquisition unit 11 is an example of a recognition target data acquisition unit. The character recognition unit 132 is an example of a recognition unit.
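The deletion of substitute characters performed by the recognition unit can be illustrated with a minimal sketch. The particular substitute characters ("§" for a symbol and "¦" for a ruled line) and the names `SUBSTITUTES` and `remove_substitutes` are hypothetical; the sketch assumes only that the substitutes are single characters that do not otherwise appear in drawing text.

```python
# Hypothetical labels used when training on non-character display objects.
SUBSTITUTES = {"valve_symbol": "§", "ruled_line": "¦"}


def remove_substitutes(recognized, substitutes):
    """Delete every substitute character from a recognized string."""
    table = str.maketrans("", "", "".join(substitutes))
    return recognized.translate(table)


# A ruled line and a symbol were recognized as their substitute characters.
print(remove_substitutes("¦ABC-D1§", SUBSTITUTES.values()))
```

Because the model reads the symbol or ruled line as its own label rather than forcing it into a real character, what remains after deletion is only the text that was actually written.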

(2) A character recognition device 10 according to a second aspect is the character recognition device of (1), wherein the learning data acquisition unit further acquires second learning data in which correct answer labels are assigned to image data of characters shown in the drawing, and the learning unit learns the second learning data to create the character recognition model.
Characters on drawings may have different shapes from those used by general OCR models, which reduces the character recognition rate. By learning characters written on actual drawings with correct answer labels, an OCR model specialized for character recognition on drawings can be created, which can significantly improve the character recognition rate on drawings (Figure 2).

(3) A character recognition device 10 according to a third aspect is the character recognition device of (1) or (2), further including a post-processing unit that corrects the recognition result output by the recognition unit, wherein, when one of a predetermined pair of characters is included at only one of the beginning and the end of a character string of the recognition result, the post-processing unit corrects the recognition result by adding the other character of the pair to the other of the beginning and the end.
As a result, for example, one of the parentheses can be completed. By completing one of the parentheses, it is possible to reduce oversight of parentheses recognition and improve the character recognition rate. The post-processing unit 133 is an example of a post-processing unit.

(4) A character recognition device 10 according to a fourth aspect is the character recognition device of any one of (1) to (3), further including a post-processing unit that corrects the recognition result output by the recognition unit, wherein the post-processing unit compares the recognition result with a registered character string registered in a dictionary and, if the recognition result is similar to the registered character string, corrects the recognition result by replacing it with the similar registered character string.
By registering a dictionary based on expert knowledge about drawings and setting similar characters, it becomes possible to correct misrecognition based on expert knowledge about drawings. The post-processing unit 133 is an example of a post-processing unit.

(5) A character recognition device 10 according to a fifth aspect is the character recognition device of any one of (1) to (4), further including a post-processing unit that corrects the recognition result output by the recognition unit, wherein the post-processing unit corrects the recognition result by adding a predetermined character to the recognition result based on a rule specifying that, when one or more character strings satisfy a predetermined condition, the predetermined character is included at a predetermined position in the one or more character strings.
This makes it possible to supplement a character string that is assumed to include a "- (hyphen)" with a hyphen, for example. It also makes it possible to handle equipment numbers that include hyphens and line breaks in character strings in drawings, improving the character recognition rate for long texts, etc. The post-processing unit 133 is an example of a post-processing unit.

(6) A character recognition device 10 according to a sixth aspect is the character recognition device of any one of (1) to (5), further including an integration unit that, when the recognition unit recognizes characters from the image data of the recognition target based on each of a plurality of the character recognition models, integrates the recognition results based on the respective character recognition models, wherein, when another character recognition model recognizes a character that the character recognition model with the highest recognition rate could not recognize, the integration unit outputs, as the recognition result, the character recognized by whichever of the other character recognition models has the highest recognition rate.
This makes it possible to complement the strengths and weaknesses of multiple character recognition models (OCR models, deep learning models for character recognition), thereby improving the recognition rate and robustness of the entire system (character recognition device 10). The post-processing unit 133 is an example of an integration unit.

(7) A character recognition method according to a seventh aspect includes the steps of: acquiring first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing; learning the first learning data to create a character recognition model; acquiring image data of a recognition target including characters and non-character display objects; and recognizing characters from the image data of the recognition target based on the character recognition model, and outputting, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.

(8) A program according to an eighth aspect causes a computer to execute the steps of: acquiring first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing; learning the first learning data to create a character recognition model; acquiring image data of a recognition target including characters and non-character display objects; and recognizing characters from the image data of the recognition target based on the character recognition model, and outputting, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.

Reference Signs List
1, 2: OCR model
10: character recognition device
11: data acquisition unit
12: input receiving unit
13: processing unit
131: learning unit
132: character recognition unit
133: post-processing unit
14: output unit
15: storage unit
900: computer
901: CPU
902: main storage device
903: auxiliary storage device
904: input/output interface
905: communication interface

Claims

A character recognition device comprising:
a learning data acquisition unit that acquires first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing;
a learning unit that learns the first learning data to create a character recognition model;
a recognition target data acquisition unit that acquires image data of a recognition target including characters and non-character display objects; and
a recognition unit that recognizes characters from the image data of the recognition target based on the character recognition model, and outputs, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.

The character recognition device according to claim 1, wherein
the learning data acquisition unit further acquires second learning data in which correct answer labels are assigned to image data of characters shown in the drawing, and
the learning unit learns the second learning data to create the character recognition model.

The character recognition device according to claim 1, further comprising:
a post-processing unit that corrects the recognition result output by the recognition unit,
wherein, when one of a predetermined pair of characters is included at only one of the beginning and the end of the character string of the recognition result, the post-processing unit corrects the recognition result by adding the other character of the pair to the other of the beginning and the end.

The character recognition device according to claim 1, further comprising:
a post-processing unit that corrects the recognition result output by the recognition unit,
wherein the post-processing unit compares the recognition result with a registered character string registered in a dictionary and, if the recognition result is similar to the registered character string, corrects the recognition result by replacing it with the similar registered character string.

The character recognition device according to claim 1, further comprising:
a post-processing unit that corrects the recognition result output by the recognition unit,
wherein the post-processing unit corrects the recognition result by adding a predetermined character to the recognition result based on a rule specifying that, when one or more character strings satisfy a predetermined condition, the one or more character strings contain the predetermined character at a predetermined position.

The character recognition device according to claim 1, further comprising:
an integration unit that, when the recognition unit recognizes characters from the image data of the recognition target based on each of a plurality of the character recognition models, integrates the recognition results based on the respective character recognition models,
wherein, when another character recognition model recognizes a character that the character recognition model with the highest recognition rate could not recognize, the integration unit outputs, as the recognition result, the character recognized by whichever of the other character recognition models has the highest recognition rate.

A character recognition method comprising:
a step of acquiring first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing;
a step of creating a character recognition model by learning the first learning data;
a step of acquiring image data of a recognition target including characters and non-character display objects; and
a step of recognizing characters from the image data of the recognition target based on the character recognition model, and outputting, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.

A program that causes a computer to execute:
a step of acquiring first learning data in which substitute characters are assigned as correct answer labels to image data of non-character display objects shown in a drawing;
a step of creating a character recognition model by learning the first learning data;
a step of acquiring image data of a recognition target including characters and non-character display objects; and
a step of recognizing characters from the image data of the recognition target based on the character recognition model, and outputting, as a recognition result, the characters remaining after the substitute characters are deleted from the recognized characters.