JP5964078B2 - Character recognition device, character recognition method and program

Info

Publication number: JP5964078B2
Application number: JP2012042085A
Authority: JP
Inventors: Akio Nakamura (中村明生); Masataka Fuchida (淵田正隆)
Original assignee: Tokyo Denki University
Current assignee: Tokyo Denki University
Priority date: 2012-02-28
Filing date: 2012-02-28
Publication date: 2016-08-03
Anticipated expiration: 2032-02-28
Also published as: JP2013178659A

Description

The present invention relates to a character recognition device, a character recognition method, and a program.

Visually impaired people obtain information about their surroundings mainly through Braille and audio. However, text that carries important everyday information, such as the text on product containers and posted notices, is often not accompanied by Braille or audio.

Systems that generate audio from text lacking Braille or audio information include, for example, the devices described in Patent Document 1 and Non-Patent Document 1. The system of Patent Document 1 cuts out, from an input image, a region enclosed by a color having certain characteristics and extracts the characters in that region. It then converts the extracted characters into character codes, which a speech synthesis unit reads aloud.

In the device described in Non-Patent Document 1, the user traces characters in the input image with a fingertip, and the device recognizes the characters in the traced portion and reads them aloud.

Patent Document 1: JP-A-4-24885

Non-Patent Document 1: Masataka Fuchida and Akio Nakamura, "Examination of fingertip designation character extraction", 17th Image Sensing Symposium, Yokohama, June 2011, IS3-08-1 to IS3-08-4

However, the system of Patent Document 1 cannot extract characters unless the material has been marked in advance for that device. It therefore cannot recognize text on items such as commercially distributed products or posted notices, on which such marks are difficult to apply in advance.

The character recognition device of Non-Patent Document 1 applies dilation processing to the entire binarized original image and recognizes the resulting connected region as a character region. Dilating the entire original image, however, also enlarges noise in non-character background regions, which may then be misrecognized as character regions and reduce accuracy.

In view of the above problems, an object of the present invention is to provide a general-purpose character recognition device, character recognition method, and program that recognize characters traced with a fingertip with higher accuracy.

According to the present invention, there is provided a character recognition device comprising:
character candidate recognition means for recognizing, from a binarized image, character candidate regions, each being a region estimated to contain one character;
character string region recognition means for applying, to each character candidate region recognized by the character candidate recognition means, dilation processing that dilates the area having the color estimated to be that of a character, and for recognizing the region in which the dilated character candidate regions are connected to one another as a character string region; and
character string extraction means for extracting a character string from the image based on the character string region.

According to the present invention, there is provided a character recognition method in which a computer:
recognizes, from a binarized image, character candidate regions, each being a region estimated to contain one character;
applies, to each character candidate region, dilation processing that dilates the area having the color estimated to be that of a character, and recognizes the region in which the dilated character candidate regions are connected to one another as a character string region; and
extracts a character string from the image based on the character string region.

According to the present invention, there is provided a program for causing a computer to function as:
means for recognizing, from a binarized image, character candidate regions, each being a region estimated to contain one character;
means for applying, to each character candidate region, dilation processing that dilates the area having the color estimated to be that of a character, and for recognizing the region in which the dilated character candidate regions are connected to one another as a character string region; and
means for extracting a character string from the image based on the character string region.

According to the present invention, there are provided a general-purpose character recognition device, character recognition method, and program that can recognize characters traced with a fingertip with higher accuracy.

FIG. 1 is a block diagram showing the configuration of the character recognition device according to the first embodiment of the present invention.
FIG. 2 is a flowchart showing the processing flow of the character recognition device according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an example of a generated rectangle.
FIG. 4 is a diagram showing examples of rectangle placement.
FIG. 5 is a diagram explaining how a rectangle is enlarged.
FIG. 6 is a diagram explaining the flow of recognizing character candidate regions.
FIG. 7 is a diagram showing an example of the information that the character candidate recognition unit transmits to the character string region recognition unit.
FIG. 8 is a diagram showing an example of the character string region transmitted by the character string region recognition unit.
FIG. 9 is a diagram showing an example of a character string extracted by the character string extraction unit.
FIG. 10 is a diagram showing the flow of determining the color estimated to be that of a character.

Embodiments of the present invention are described below with reference to the drawings. In all the drawings, like components are given like reference numerals, and their description is omitted where appropriate.

(First Embodiment)
FIG. 1 is a block diagram showing the configuration of a character recognition device 10 according to the first embodiment of the present invention. The character recognition device 10 includes a character candidate recognition unit 102, a character string region recognition unit 104, and a character string extraction unit 106.

The character candidate recognition unit 102 recognizes character candidate regions, each being a region estimated to contain one character.

For each character candidate region recognized by the character candidate recognition unit 102, the character string region recognition unit 104 applies dilation processing that dilates the area having the color estimated to be that of a character, and recognizes the region in which the dilated character candidate regions are connected to one another as the character string region 112.

The character string extraction unit 106 extracts characters from the image based on the character string region 112.

Note that the components of the character recognition device 10 shown in the figures represent functional blocks, not hardware units. Each component is realized by any combination of hardware and software, centered on the CPU and memory of an arbitrary computer, a program loaded into the memory that implements the components shown in the figure, a storage medium such as a hard disk that stores the program, and a network connection interface. Various modifications of the implementation method and device are possible.

The processing flow in this embodiment is described with reference to FIGS. 2 to 9.

First, the character candidate recognition unit 102 acquires a binarized image (S102). It then performs character candidate region recognition on the binarized image (S104).

First, as shown in FIG. 3, the character candidate recognition unit 102 generates, on the binarized image, a rectangle 110 for identifying a character candidate region, based on the index 108 in the image. The character candidate region may fit within the rectangle 110, as in FIG. 4(a), or may not fit within it, as in FIGS. 4(b) and 4(c). In the case of FIG. 4(a), the unit may use the rectangle 110 as the character candidate region as is; in the cases of FIGS. 4(b) and 4(c), it enlarges the rectangle 110 as shown in FIG. 5 to identify the character candidate region. The enlargement procedure is described below.

As shown in FIG. 5, the character candidate recognition unit 102 scans the perimeter of the rectangle 110 and checks whether any pixel touching the perimeter has the color estimated to be that of a character. Which of the two colors of the binarized image is the character color can be determined, for example, from information stored in advance in a storage unit (not shown). If such a pixel exists, the character candidate recognition unit 102 enlarges the rectangle 110 by a fixed amount: it widens the rectangle horizontally when the pixel was found while scanning a vertical side of the perimeter, and extends it vertically when the pixel was found while scanning a horizontal side. At point A in FIG. 5, for example, the unit extends the rectangle 110 leftward by the fixed amount. The unit repeats this processing until no character-colored pixel remains on the perimeter of the rectangle 110, and then recognizes the region inside the enlarged rectangle 110' as a character candidate region. The total enlargement of the rectangle 110 is limited: once the limit is reached, the unit does not enlarge the rectangle further, which prevents unbounded growth. In addition, to keep the index 108 out of the rectangle, the unit does not enlarge the rectangle 110 into the region below the coordinates indicated by the index 108.
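The enlargement loop described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the embodiment: the function name `enlarge_rectangle`, the parameters `step` and `max_total`, the representation of the image as a list of rows of 0/1 values, and the choice to let the bottom edge reach but not pass the row of the index are all hypothetical.

```python
def enlarge_rectangle(img, rect, index_y, step=1, max_total=16, text=1):
    """Grow rect = (left, top, right, bottom), inclusive pixel coordinates,
    until no character-colored pixel lies on its perimeter.

    img       : list of rows; each pixel is 0 or 1 (assumed representation)
    index_y   : row of the index (fingertip); the rectangle never grows below it
    step      : fixed enlargement amount per move (assumed value)
    max_total : cap on the total enlargement in pixels (assumed value)
    """
    height, width = len(img), len(img[0])
    left, top, right, bottom = rect
    max_bottom = min(height - 1, index_y)
    budget = max_total

    def column_has_text(x):
        return any(img[y][x] == text for y in range(top, bottom + 1))

    def row_has_text(y):
        return any(img[y][x] == text for x in range(left, right + 1))

    changed = True
    while changed and budget > 0:
        changed = False
        # Vertical sides: a hit widens the rectangle horizontally.
        if budget > 0 and left > 0 and column_has_text(left):
            move = min(step, budget, left)
            left -= move
            budget -= move
            changed = True
        if budget > 0 and right < width - 1 and column_has_text(right):
            move = min(step, budget, width - 1 - right)
            right += move
            budget -= move
            changed = True
        # Horizontal sides: a hit extends the rectangle vertically.
        if budget > 0 and top > 0 and row_has_text(top):
            move = min(step, budget, top)
            top -= move
            budget -= move
            changed = True
        if budget > 0 and bottom < max_bottom and row_has_text(bottom):
            move = min(step, budget, max_bottom - bottom)
            bottom += move
            budget -= move
            changed = True
    return left, top, right, bottom
```

For a small hollow glyph occupying columns 1 to 4 and rows 1 to 3, a starting rectangle `(2, 1, 3, 3)` grows until its perimeter touches only background pixels. Lowering `max_total` stops the growth early, which corresponds to the limit on total enlargement.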

Based on the coordinates of the positions through which the index 108 has passed, which are stored in a storage unit (not shown), the character candidate recognition unit 102 repeats the above processing to identify a plurality of character candidate regions, as shown for example in FIG. 6. It then transmits all of the identified character candidate regions to the character string region recognition unit 104. FIG. 7 shows an example of the information transmitted to the character string region recognition unit 104.

The character string region recognition unit 104 recognizes the character string region 112 based on the information received from the character candidate recognition unit 102 (S106). First, in each recognized character candidate region, it applies dilation processing that dilates the area having the character color. Then, from among the connected regions formed where character candidate regions join one another, it removes noise and identifies the character string region 112. Whether a connected region is noise is determined from the shape of the character candidate region corresponding to it, using information such as that region's vertical and horizontal symmetry, the density ratio of the binarized colors, and the complexity and size of the shapes formed by the binarized colors. In this way, as shown for example in FIG. 8, noise is removed from the character candidate regions connected by the dilation processing, and the character string region 112 is recognized. As information for extracting the character string, the unit transmits to the character string extraction unit 106, for example, the coordinate values of the character string region 112 or the dilated image shown in FIG. 8 (mask information).
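A minimal Python sketch of dilating only the character candidate regions and then collecting the connected regions is given below. Several points are assumptions made for illustration: the square structuring element of radius `radius`, letting dilated pixels spill outside their candidate rectangle, 4-connectivity, and a size-only noise test (`min_pixels`) standing in for the symmetry, density, and complexity criteria named in the text. All function names are hypothetical.

```python
from collections import deque


def dilate_candidates(img, rects, radius=1, text=1):
    """Dilate character-colored pixels, but only those inside a candidate rect."""
    h, w = len(img), len(img[0])
    mask = [[0] * w for _ in range(h)]
    for left, top, right, bottom in rects:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                if img[y][x] != text:
                    continue
                for ny in range(max(0, y - radius), min(h, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(w, x + radius + 1)):
                        mask[ny][nx] = 1
    return mask


def connected_regions(mask):
    """Return the 4-connected regions of a 0/1 mask as lists of (y, x)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def string_regions(img, rects, radius=1, min_pixels=4):
    """Connected regions of the dilated candidates, minus small (noise) regions."""
    mask = dilate_candidates(img, rects, radius)
    return [r for r in connected_regions(mask) if len(r) >= min_pixels]
```

Because pixels outside every candidate rectangle are never dilated, isolated background specks do not grow and cannot merge with the characters, which is the contrast with dilating the whole image.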

If the source binarized image is tilted, the character candidate regions may be connected in a tilted state. Before removing noise, the character string region recognition unit 104 may therefore compute the principal axis of inertia of the connected region to obtain the tilt angle and correct the tilt by an affine transformation or the like. Correcting the tilt improves the accuracy with which the unit distinguishes character strings from noise.
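One common way to obtain the principal axis of inertia is from the second-order central moments of the region, using the angle 0.5 * atan2(2 * mu11, mu20 - mu02). The sketch below applies that formula and then rotates the pixel coordinates about the centroid, which is one simple affine correction. The text does not specify the formula or the transformation, so both are assumptions, and the function names are hypothetical.

```python
import math


def centroid(pixels):
    """Mean (y, x) of a list of (y, x) pixel coordinates."""
    n = len(pixels)
    return (sum(y for y, _ in pixels) / n, sum(x for _, x in pixels) / n)


def tilt_angle(pixels):
    """Angle (radians) of the principal axis relative to the x axis."""
    cy, cx = centroid(pixels)
    mu20 = sum((x - cx) ** 2 for _, x in pixels)
    mu02 = sum((y - cy) ** 2 for y, _ in pixels)
    mu11 = sum((x - cx) * (y - cy) for y, x in pixels)
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)


def deskew(pixels):
    """Rotate the coordinates by -tilt_angle about the centroid.

    Returns floating-point (y, x) pairs; resampling onto a pixel grid
    is left out of this sketch.
    """
    theta = tilt_angle(pixels)
    cy, cx = centroid(pixels)
    c, s = math.cos(theta), math.sin(theta)
    return [
        (cy - (x - cx) * s + (y - cy) * c, cx + (x - cx) * c + (y - cy) * s)
        for y, x in pixels
    ]
```

For pixels lying on a 45-degree diagonal, `tilt_angle` returns pi/4 and `deskew` maps them onto a single horizontal row.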

The character string extraction unit 106 extracts a character string from the binarized image based on the character string region 112 received from the character string region recognition unit 104 (S108). For example, when it receives the coordinate values of the character string region 112, it extracts the region corresponding to those coordinates as the character string. Alternatively, when it receives mask information, it extracts the corresponding region as the character string by taking the logical AND of the binarized image and the mask information. FIG. 9 shows an example of a character string extracted by the character string extraction unit 106.
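The two extraction paths can be illustrated with a short Python sketch. It assumes that the binarized image and the mask are lists of rows of 0/1 values and that the coordinate values take the form of an inclusive bounding box `(left, top, right, bottom)`; the text does not fix either representation, and the function names are hypothetical.

```python
def extract_with_mask(binary, mask):
    """Pixel-wise logical AND of the binarized image and the mask."""
    return [[b & m for b, m in zip(brow, mrow)] for brow, mrow in zip(binary, mask)]


def extract_with_bbox(binary, bbox):
    """Crop the binarized image to an inclusive bounding box."""
    left, top, right, bottom = bbox
    return [row[left:right + 1] for row in binary[top:bottom + 1]]
```

The mask path keeps the image size and zeroes everything outside the character string region, while the bounding-box path returns a smaller cropped image.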

As described above, this embodiment recognizes character candidate regions, each estimated to contain one character, based on the coordinates through which the index has passed, applies dilation processing to each character candidate region, and recognizes the character string region 112 from among the resulting connected regions. This configuration suppresses background noise and allows character strings to be extracted accurately.

In this embodiment, the fingertip itself may serve as the index, for example, or a substance that reflects light of a specific wavelength may be attached to the fingertip and used as the index.

(Second Embodiment)
In this embodiment, the character information recognized in the first embodiment is read aloud as speech.

In this embodiment, the character recognition device 10 of the first embodiment further includes an image input unit and a speech synthesis unit (neither shown). The image input unit acquires a color image from an input device such as a camera. The speech synthesis unit converts into text the character string that the character candidate recognition unit 102, the character string region recognition unit 104, and the character string extraction unit 106 extract from, for example, the surface of a product package, and reads it aloud.

The image input unit acquires a color image to be processed as the original image from, for example, a small camera. It then converts the original image into a binarized image consisting of "the color estimated to be that of a character" and "colors other than that color". FIG. 10 shows the flow for determining the color estimated to be that of a character. The image input unit generates a histogram of the image obtained by smoothing the original image and determines the positions at which to partition the histogram based on its local maxima, local minima, and the like. It then creates a color space based on those partition positions. Next, the image input unit generates a rectangle 110 based on the coordinates of the index 108 and projects the colors present in the rectangle 110 onto that color space. The color corresponding to the cell with the most votes in the color space becomes the candidate character color. The image input unit binarizes the image with this candidate and decides the character color for the rectangle 110 using criteria such as the density of the candidate color within the rectangle 110 and the number of regions of that color. If the color with the most votes does not satisfy the criteria, the image input unit applies the same test to the color of the cell with the next most votes. Finally, it creates an image binarized into the character color and all other colors and transmits it to the character candidate recognition unit 102.
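The voting step can be sketched in Python as below. This is a simplified illustration rather than the procedure of FIG. 10: it replaces the histogram-derived partition positions with uniform quantization into `bins` levels per channel, and it uses only a density window (`min_density`, `max_density`) as the acceptance criterion, omitting the count of regions. These parameters, their default values, and the function names are all assumptions.

```python
from collections import Counter


def quantize(color, bins=4):
    """Map an (R, G, B) color with 0-255 channels to a cell of the color space."""
    return tuple(c * bins // 256 for c in color)


def estimate_text_color(pixels, bins=4, min_density=0.05, max_density=0.5):
    """Pick the most-voted cell whose density inside the rectangle is plausible.

    pixels: colors found inside the rectangle generated from the index.
    Cells are tried in descending order of votes; None if none qualifies.
    """
    votes = Counter(quantize(p, bins) for p in pixels)
    total = len(pixels)
    for cell, count in votes.most_common():
        if min_density <= count / total <= max_density:
            return cell
    return None


def binarize(image, cell, bins=4):
    """1 where a pixel falls in the chosen cell, 0 elsewhere."""
    return [[1 if quantize(p, bins) == cell else 0 for p in row] for row in image]
```

With these assumed thresholds, a rectangle that is 70% white background and 25% black text rejects the white cell for being too dense and falls back to the black cell, mirroring the move to the cell with the next most votes.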

The processing flow from when the character candidate recognition unit 102 receives the binarized image to when the character string is extracted is the same as in the first embodiment, so its description is omitted. The character string extraction unit 106 transmits the extracted character string to the speech synthesis unit.

The speech synthesis unit converts the character string received from the character string extraction unit 106 into text using OCR (Optical Character Reader) or the like, and reads it aloud as audio using existing text-to-speech software or the like.

As described above, this embodiment further includes the image input unit and the speech synthesis unit. With this configuration, character information in the image acquired by the image input unit can be recognized as in the first embodiment and provided as audio by the speech synthesis unit. This makes it possible to provide a device that supports the daily life of visually impaired people, for example by reading aloud the product name on a product package.

Embodiments of the present invention have been described above with reference to the drawings, but they are merely examples of the present invention, and various configurations other than those described above may also be adopted.

Description of Symbols
10 Character recognition device
102 Character candidate recognition unit
104 Character string region recognition unit
106 Character string extraction unit
108 Index
110, 110' Rectangle
112 Character string region

Claims

A character recognition device comprising:
character candidate recognition means for detecting the position of an index in a binarized image and recognizing, based on the position of the index, a character candidate region that is a region estimated to contain one character;
character string region recognition means for applying, to each character candidate region recognized by the character candidate recognition means, dilation processing that dilates the area having the color estimated to be that of a character, and for recognizing the region in which the dilated character candidate regions are connected to one another as a character string region; and
character string extraction means for extracting a character string from the image based on the character string region.

The character recognition device according to claim 1,
wherein the character string extraction means holds two-dimensional coordinate values for the entire image, compares them with the two-dimensional coordinate values of the character string region, and extracts the character string from the image.

The character recognition device according to claim 1,
wherein the character string extraction means separately holds the image subjected to the dilation processing as mask information and extracts the character string from the logical AND of the image and the mask information.

The character recognition device according to any one of claims 1 to 3,
wherein the character candidate recognition means generates a reference region based on the position of the index moving within the image and recognizes the character candidate region from the image using the generated reference region.

A character recognition method in which a computer:
detects the position of an index in a binarized image and recognizes, based on the position of the index, a character candidate region that is a region estimated to contain one character;
applies, to each character candidate region, dilation processing that dilates the area having the color estimated to be that of a character, and recognizes the region in which the dilated character candidate regions are connected to one another as a character string region; and
extracts a character string from the image based on the character string region.

The character recognition method according to claim 5,
wherein the computer holds two-dimensional coordinate values for the entire image, compares them with the two-dimensional coordinate values of the character string region, and extracts the character string from the image.

The character recognition method according to claim 5,
wherein the computer separately holds the image subjected to the dilation processing as mask information and extracts the character string from the logical AND of the image and the mask information.

The character recognition method according to any one of claims 5 to 7,
wherein the computer generates a reference region based on the position of the index moving within the image and recognizes the character candidate region from the image using the generated reference region.

A program for causing a computer to function as:
means for detecting the position of an index in a binarized image and recognizing, based on the position of the index, a character candidate region that is a region estimated to contain one character;
means for applying, to each character candidate region, dilation processing that dilates the area having the color estimated to be that of a character, and for recognizing the region in which the dilated character candidate regions are connected to one another as a character string region; and
means for extracting a character string from the image based on the character string region.

The program according to claim 9, further causing the computer to function as:
means for holding two-dimensional coordinate values for the entire image; and
means for extracting the character string from the image by comparing the two-dimensional coordinate values of the entire image with the two-dimensional coordinate values of the character string region.

The program according to claim 9, further causing the computer to function as:
means for separately holding the image subjected to the dilation processing as mask information; and
means for extracting the character string from the logical AND of the image and the mask information.

The program according to any one of claims 9 to 11, further causing the computer to function as
means for generating a reference region based on the position of the index moving within the image and recognizing the character candidate region from the image using the generated reference region.