JP2011128916A - Object detection apparatus and method, and program

Info

Publication number: JP2011128916A
Application number: JP2009287063A
Authority: JP
Inventors: Yi Hu; 軼胡
Original assignee: Fujifilm Corp
Current assignee: Fujifilm Corp
Priority date: 2009-12-18
Filing date: 2009-12-18
Publication date: 2011-06-30

Abstract

PROBLEM TO BE SOLVED: To accurately detect objects such as faces without increasing the number of identifying apparatuses.

SOLUTION: A plurality of identifying apparatuses different in face direction to be identified are used to discriminate faces. The identifying apparatuses comprise a plurality of weak identifying apparatuses. The identifying apparatuses are classified into a prestage weak identifying apparatus group WC-F for determining a face direction, and a poststage weak identifying apparatus group WC-B for identifying a face facing the direction that can be identified by each of the identifying apparatuses. The pre- and post-stage weak identifying apparatus groups WC-F and WC-B produce first and second scores, respectively. A face is detected according to the sum of products of the first and second scores in each identifying apparatus.

COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to an object detection apparatus and method for detecting an object, such as a human face, in a detection target image, and to a program for causing a computer to execute the object detection method.

Conventionally, the color distribution of a person's face region in a snapshot taken with a digital camera has been examined in order to correct skin color, and persons appearing in digital video captured by the digital video camera of a surveillance system have been recognized. Such processing requires detecting the face region corresponding to a person's face in the digital image, so various techniques for detecting faces in digital images have been proposed. Among them, a technique regarded as particularly accurate and robust uses a discriminator that combines a plurality of weak discriminators generated by machine learning on sample images.

In this technique, a face sample image group consisting of a plurality of different face sample images and a non-face sample image group consisting of a plurality of different images known not to be faces are used to learn the features of a face, and a discriminator capable of determining whether a given image is a face image is generated in advance. Partial images are then cut out sequentially from the image in which faces are to be detected (hereinafter referred to as the detection target image), the discriminator determines whether each partial image is a face, and the regions of the partial images determined to be faces are extracted. The faces in the detection target image are detected in this way.

The images input to the discriminator described above include not only images in which the face looks straight ahead, but also images in which the face is rotated on the image plane (hereinafter referred to as "in-plane rotation") and images in which the orientation of the face within the image is rotated (hereinafter referred to as "out-of-plane rotation"). The range of face rotation that a single discriminator can handle is limited: it can distinguish face from non-face over about 30 degrees of in-plane rotation and about 30 to 60 degrees of out-of-plane rotation. To cover a wider range of face orientations, a multi-class discrimination technique has therefore been proposed in which a plurality of discriminators, each able to discriminate images of one orientation, are prepared, every discriminator determines whether the image is a face of its specific orientation, and the final outputs of the discriminators are used to decide whether the image is a face.

Within the multi-class discrimination technique, a method has also been proposed that reduces the processing required for discrimination: the front-stage portion of the weak discriminators (weak discriminator group) constituting each discriminator determines whether a face is included, and only the discriminator that yields the largest score goes on to determine, with its rear-stage weak discriminator group, whether the image is a face (see Patent Document 1 and Non-Patent Document 1). A method has also been proposed in which all discriminators perform discrimination and their outputs are added to determine whether the detection target image contains a face.

Patent Document 1: JP 2007-66010 A
Non-Patent Document 1: Paul Viola and Michael Jones, "Rapid object detection using a boosted cascade of simple features," IEEE CVPR, 2001

In the multi-class discrimination technique described above, each of the discriminators, which differ in the face orientation they can discriminate, has learned face images of its own orientation, so faces in those orientations can be discriminated accurately. However, if discriminators are to be prepared so that faces in every orientation can be detected, each discriminator must be configured to discriminate faces over a wide range of angles. Specifically, taking a face looking straight ahead as 0 degrees, face images turned left and right by 0 ± 15 degrees, 30 ± 15 degrees, 60 ± 15 degrees, and 90 ± 15 degrees are prepared as sample images for learning. A discriminator for frontal faces must then be trained on the 0 ± 15 degree samples, a discriminator for faces turned 30 degrees left or right on the 30 ± 15 degree samples, a discriminator for faces turned 60 degrees left or right on the 60 ± 15 degree samples, and a discriminator for faces turned 90 degrees left or right on the 90 ± 15 degree samples.

However, when faces over such a wide angular range must be discriminated, the number of weak discriminators in each discriminator becomes very large, and as a result faces can no longer be detected at high speed. Moreover, with discriminators configured in this way, faces whose orientations fall on the 30-degree steps are detected accurately, but detection accuracy still drops for faces at angles between those steps (for example, 15 degrees or 45 degrees).

The present invention has been made in view of the above circumstances, and its object is to detect a specific type of object, such as a face, accurately without increasing the number of discriminators.

An object detection apparatus according to the present invention comprises discriminating means that has a plurality of discriminators, each composed of a plurality of weak discriminators trained in advance on feature amounts extracted from the object to be discriminated and each differing in the object orientation it can discriminate, and that detects the object in a detection target image using feature amounts extracted from the detection target image, wherein
the plurality of weak discriminators of each discriminator are divided into a front-stage weak discriminator group and a rear-stage weak discriminator group, the front-stage weak discriminator group being trained to determine the orientation of the object, and the rear-stage weak discriminator group being trained to detect an object whose orientation corresponds to the orientation that the discriminator to which the group belongs can discriminate, and
the discriminating means obtains a first score, which is the output of the front-stage weak discriminator group, and a second score, which is the output of the rear-stage weak discriminator group, and detects the object based on the sum, over all the discriminators, of the product of the first score and the second score in each discriminator.

In the object detection apparatus according to the present invention, the front-stage weak discriminator groups preferably share the feature amounts across at least some of the plurality of discriminators.

A weak discriminator group has a structure in which a plurality of weak discriminators are linearly combined. Each weak discriminator calculates at least one feature amount from the detection target image and uses that feature amount to discriminate the object, and is therefore trained to discriminate the object using the feature amounts of a plurality of sample images. The statement that "the front-stage weak discriminator groups share the feature amounts across at least some of the plurality of discriminators" means that corresponding weak discriminators in different discriminators have been trained using the same feature amount. Weak discriminators trained on the same feature amount discriminate the object using the same feature amount of the detection target image. Not all of the corresponding weak discriminators across the discriminators need to be trained on a common feature amount; it suffices that at least some of them are.

The plurality of weak discriminators are divided into a front-stage weak discriminator group and a rear-stage weak discriminator group. In particular, when the front-stage weak discriminator groups share feature amounts across the discriminators, the front-stage group preferably contains more weak discriminators than the rear-stage group.

In the object detection apparatus according to the present invention, the front-stage weak discriminator group and the rear-stage weak discriminator group may be connected in series.

In the object detection apparatus according to the present invention, the discriminators may be trained using a reference sample image in which the object faces a predetermined direction, together with at least one of: a plurality of in-plane rotated sample images, in which the discrimination target of the reference sample image is rotated within the plane of the reference sample image by different rotation angles; and a plurality of out-of-plane rotated sample images, in which the orientation of the discrimination target within the reference sample image is rotated by different rotation angles.

An object detection method according to the present invention uses a plurality of discriminators, each composed of a plurality of weak discriminators trained in advance on feature amounts extracted from the object to be discriminated and each differing in the object orientation it can discriminate, and detects the object in a detection target image using feature amounts extracted from the detection target image, wherein
the plurality of weak discriminators of each discriminator are divided into a front-stage weak discriminator group and a rear-stage weak discriminator group, the front-stage weak discriminator group being trained to determine the orientation of the object, and the rear-stage weak discriminator group being trained to detect an object whose orientation corresponds to the orientation that the discriminator to which the group belongs can discriminate,
a first score, which is the output of the front-stage weak discriminator group, and a second score, which is the output of the rear-stage weak discriminator group, are obtained, and
the object is detected based on the sum, over all the discriminators, of the product of the first score and the second score in each discriminator.

The object detection method according to the present invention may also be provided as a program for causing a computer to execute the method.

According to the present invention, the front-stage weak discriminator group yields a first score representing the orientation of the object, and the rear-stage weak discriminator group yields a second score representing how likely the image is to be an object in the orientation that the discriminator can discriminate. The object is then detected based on the sum, over all the discriminators, of the product of the first score and the second score in each discriminator. The discrimination results of the discriminators, which differ in the orientation they can discriminate, are thus integrated to detect the object. Compared with the techniques of Patent Document 1 and Non-Patent Document 1, which detect only objects of one specific orientation, objects in differing orientations can be detected flexibly. Object detection accuracy can therefore be improved without increasing the number of weak discriminators.
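As an illustrative sketch only, the score integration described above (sum over all discriminators of first score times second score) might be written as follows. The function names, the example scores, and the thresholding rule are assumptions introduced for illustration; this passage does not specify how the summed value is compared against a threshold.

```python
from typing import Sequence, Tuple


def fuse_scores(score_pairs: Sequence[Tuple[float, float]]) -> float:
    """Sum, over all discriminators, of first_score * second_score."""
    return sum(first * second for first, second in score_pairs)


def is_object(score_pairs: Sequence[Tuple[float, float]], threshold: float) -> bool:
    """Assumed decision rule: detect when the fused score reaches the threshold."""
    return fuse_scores(score_pairs) >= threshold


# Hypothetical (first, second) scores for three discriminators,
# e.g. the 0-, 30- and 60-degree discriminators.
# A face near 15 degrees is supported by two neighbouring discriminators.
pairs = [(0.5, 0.8), (0.5, 0.6), (0.0, 0.1)]
fused = fuse_scores(pairs)
```

Because every discriminator contributes in proportion to its orientation score, a face lying between two trained orientations can still accumulate a high fused value, which is the flexibility the passage describes.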

Further, when the front-stage weak discriminator groups share feature amounts across at least some of the plurality of discriminators, a single weak discriminator can perform a discrimination step that would otherwise be carried out separately in each of those discriminators. The number of weak discriminators in the front-stage weak discriminator groups can therefore be reduced.

Further, by connecting the front-stage weak discriminator group and the rear-stage weak discriminator group in series, the processing of the two groups can be performed consecutively, which improves processing speed.

Block diagram showing the configuration of the face detection system in this embodiment
Schematic diagram showing how the window is scanned
Schematic block diagram showing the configuration of the face detection unit
Schematic block diagram showing the configuration of the candidate detection unit
Block diagram showing the configuration of the discriminating unit
Diagram for explaining the sharing of feature amounts
Flowchart showing the processing performed by a discriminator
Flowchart showing the flow of processing in the face detection system according to this embodiment
Flowchart of the detailed detection processing
Diagram showing examples of face sample images
Flowchart showing the learning method of a discriminator
Diagram showing the method of deriving the histogram of a weak discriminator
Diagram showing examples of sample images determined to be faces by the front-stage weak discriminator group

Embodiments of the present invention will now be described with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of a face detection system to which the object detection apparatus of the present invention is applied. This face detection system detects faces contained in a digital image. As shown in FIG. 1, the face detection system 1 comprises a multi-resolution conversion unit 10, which converts a detection target image S0 in which faces are to be detected into a plurality of images of different resolutions (hereinafter referred to as resolution images), and a face detection unit 20, which detects an image F0 representing a face contained in the detection target image S0 (hereinafter referred to as a face image).

The multi-resolution conversion unit 10 converts the resolution (image size) of the detection target image S0 to normalize it to a rectangular image of a predetermined resolution, for example VGA size (640 × 480 pixels). Starting from the normalized detection target image S0, the multi-resolution conversion unit 10 then performs resolution conversion to generate a plurality of resolution images S1 to S3, and so on, of different resolutions, as shown in FIG. 2. The normalized detection target image S0 is also counted among the resolution images.
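A minimal sketch of how the sizes of such a set of resolution images could be enumerated is given below. The reduction factor per level and the stopping rule (stop once the shorter side falls below the 32-pixel window used later) are assumptions; this passage states only the normalized VGA size and that several resolutions are generated.

```python
def pyramid_sizes(width=640, height=480, scale=2 ** (-1 / 3), min_side=32):
    """List the (width, height) of each resolution image, largest first.

    `scale` (the per-level reduction factor) and `min_side` are assumed values.
    The first entry is the normalized detection target image S0 itself.
    """
    if not 0.0 < scale < 1.0:
        raise ValueError("scale must be between 0 and 1")
    sizes = []
    w, h = float(width), float(height)
    while min(round(w), round(h)) >= min_side:
        sizes.append((round(w), round(h)))
        w *= scale
        h *= scale
    return sizes


# Halving at each level gives four resolution images for a 640 x 480 input.
halved = pyramid_sizes(scale=0.5)
```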

In this embodiment, as shown in FIG. 2, a window W of a set number of pixels (for example, 32 × 32 pixels) is scanned over each resolution image Sk (k = 0 to m), and the region enclosed by the window W is cut out to generate a partial image B of the set number of pixels. Even when a face (the discrimination target) does not fit inside the window W on a high-resolution image, it can fit inside the window W on a lower-resolution image, so faces of various sizes can be detected reliably.
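The window scan can be sketched as follows, purely as an illustration. The image is assumed to be a list of pixel rows, and the scan step of 4 pixels is an assumed value; the text gives only the example window size of 32 × 32 pixels.

```python
def window_positions(width, height, window=32, step=4):
    """Yield the top-left (x, y) of every window position in raster order.

    `step` (the movement interval of the window) is an assumed value.
    """
    for y in range(0, height - window + 1, step):
        for x in range(0, width - window + 1, step):
            yield x, y


def crop(image, x, y, window=32):
    """Cut the partial image B out of `image`, given as a list of pixel rows."""
    return [row[x:x + window] for row in image[y:y + window]]


# Example on a small 40 x 36 image whose pixel value encodes its position.
image = [[100 * r + c for c in range(40)] for r in range(36)]
positions = list(window_positions(40, 36))
patch = crop(image, 8, 4)
```

Running the same scan on every resolution image lets the fixed-size window cover faces of different sizes.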

The face detection unit 20 performs face detection processing on each of the resolution images Sk generated by the multi-resolution conversion unit 10 (hereinafter collectively referred to as the resolution image group Sg) and detects the face image F0 in each resolution image Sk. FIG. 3 is a schematic block diagram showing the configuration of the face detection unit 20. As shown in FIG. 3, the face detection unit 20 comprises: a detection control unit 21, which controls the units described below and mainly performs sequence control of the face detection processing; a resolution image selection unit 22, which sequentially selects from the resolution image group Sg the resolution images Sk to be subjected to face detection processing, in descending order of size; a window setting unit 23, which sequentially sets, on the resolution image Sk selected by the resolution image selection unit 22, the window W that cuts out the partial image B to be judged as a face image or not, shifting the window position each time; a candidate discriminating unit 24, which determines whether the cut-out partial image B is a face image; and a discriminating unit 25, which further determines whether a partial image determined to be a face image (hereinafter referred to as a candidate image CP) is in fact a face image.

The detection control unit 21 controls the resolution image selection unit 22 and the window setting unit 23 so that the face detection processing for detecting the face image F0 is performed on each resolution image Sk of the resolution image group Sg. For example, it instructs the resolution image selection unit 22 to select a resolution image Sk, instructs the window setting unit 23 on the setting conditions for the window W, and outputs the detection results obtained, as appropriate. The window setting conditions include the range of the image over which the window W is set and the movement interval of the window W (the coarseness of detection).

Under the control of the detection control unit 21, the resolution image selection unit 22 sequentially selects from the resolution image group Sg the resolution images Sk to be subjected to face detection processing, in descending order of size (from finest to coarsest resolution). The face detection technique of this embodiment determines, for each partial image B cut out sequentially from each resolution image, whether that partial image B is a face image, and extracts the regions of the partial images B determined to be face images, thereby detecting the face images in the detection target image S0. The resolution image selection unit 22 can therefore be regarded as setting the size of the face to be detected in the detection target image S0, changing it each time from small to large.

The window setting unit 23 sequentially sets the window W on the resolution image Sk selected by the resolution image selection unit 22, moving it according to the window setting conditions set by the detection control unit 21.

The candidate discriminating unit 24 performs a binary determination of whether the partial image B is a face image and, as shown in FIG. 4, includes a candidate discriminator 30 having a plurality of weak discriminators WC. The candidate discriminator 30 determines as faces both in-plane rotated images, in which the discrimination target is rotated on the image plane, and out-of-plane rotated images, in which the orientation of the discrimination target within the image is rotated.

The candidate discriminator 30 has a cascade structure in which a plurality of weak discriminators WC are linearly combined. Each weak discriminator WC extracts a feature amount from the partial image B by calculating at least one feature amount related to the distribution of pixel values (luminance) of the partial image B, and uses this feature amount to determine whether the partial image B is a face image. The candidate discriminator 30 determines whether the image is a face image using the determination results of the weak discriminators WC. In this embodiment, it outputs a determination result CR indicating whether the partial image B is a face image, based on the sum of the scores calculated by the individual weak discriminators WC.
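The sum-of-scores decision of the candidate discriminator could look like the following sketch. The pixel-difference feature, the weights, and the threshold of 0 are hypothetical choices made only to obtain a runnable example; the text says only that each weak discriminator computes a luminance-based feature amount and a score, and that the scores are summed.

```python
from typing import Callable, List, Sequence, Tuple

Patch = Sequence[Sequence[int]]
WeakDiscriminator = Callable[[Patch], float]


def make_pixel_difference_wc(p1: Tuple[int, int], p2: Tuple[int, int],
                             weight: float) -> WeakDiscriminator:
    """Hypothetical weak discriminator: +weight if pixel p1 is brighter than p2."""
    (y1, x1), (y2, x2) = p1, p2

    def classify(patch: Patch) -> float:
        return weight if patch[y1][x1] > patch[y2][x2] else -weight

    return classify


def candidate_decision(patch: Patch, weak_discriminators: List[WeakDiscriminator],
                       threshold: float = 0.0) -> Tuple[bool, float]:
    """Sum the weak scores; the patch is a face candidate if the sum >= threshold."""
    total = sum(wc(patch) for wc in weak_discriminators)
    return total >= threshold, total


# Tiny 2 x 2 "patch" and three weak discriminators, for illustration only.
patch = [[10, 200], [50, 60]]
wcs = [
    make_pixel_difference_wc((0, 1), (0, 0), 1.0),   # 200 > 10 -> +1.0
    make_pixel_difference_wc((1, 1), (1, 0), 0.5),   # 60 > 50  -> +0.5
    make_pixel_difference_wc((0, 0), (1, 0), 0.25),  # 10 > 50 is false -> -0.25
]
result, total = candidate_decision(patch, wcs)
```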

When the candidate discriminating unit 24 determines that a partial image B is a face image, the discriminating unit 25 further determines whether that partial image B, that is, the candidate image CP, is a face image. FIG. 5 is a diagram showing the configuration of the discriminating unit 25. As shown in FIG. 5, the discriminating unit 25 has n classes of discriminators 25-1 to 25-n, which differ in the face orientation they can discriminate, and a discrimination result output unit 25-L. The n classes of discriminators 25-i (i = 1 to n) comprise in-plane rotation discriminators, which discriminate the orientation of the face on the image plane, and out-of-plane rotation discriminators, which discriminate the orientation of the face within the image. The in-plane rotation discriminators consist of a plurality of discriminators for faces whose in-plane rotation angles differ in steps of 30 degrees, for example over the range of 30 to 330 degrees: a 0-degree in-plane rotation discriminator for faces in which the angle between the vertical direction of the image and the center line of the face is 0 degrees, a 30-degree in-plane rotation discriminator for faces at 30 degrees, and so on. The 0-degree in-plane rotation discriminator, for example, can discriminate faces whose rotation angle lies within −15 degrees (= 345 degrees) to +15 degrees, centered on 0 degrees.

Similarly, the out-of-plane rotation discriminators consist of discriminators for faces whose out-of-plane rotation angles differ in steps of 30 degrees, for example over the range of −90 to +90 degrees: a 0-degree out-of-plane rotation discriminator for faces whose orientation (angle) within the image is 0 degrees, that is, frontal faces, a 30-degree out-of-plane rotation discriminator for faces at 30 degrees, and so on. The 0-degree out-of-plane rotation discriminator, for example, can discriminate faces whose rotation angle lies within −15 to +15 degrees, centered on 0 degrees.
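To make the ±15 degree coverage concrete, the sketch below maps a rotation angle to the center angle of the discriminator class that covers it. The function names are hypothetical, and assigning an angle lying exactly on a boundary (for example 15 degrees) to the higher class is an assumption; the text does not say which class owns the boundary.

```python
import math


def in_plane_class(angle_deg: float, step: int = 30) -> int:
    """Center angle (0, 30, ..., 330) of the in-plane class covering angle_deg."""
    wrapped = angle_deg % 360  # e.g. -10 becomes 350
    return int(math.floor((wrapped + step / 2) / step) * step) % 360


def out_of_plane_class(angle_deg: float, step: int = 30, limit: int = 90) -> int:
    """Center angle (-90, -60, ..., +90) of the out-of-plane class covering angle_deg."""
    if not (-limit - step / 2 <= angle_deg < limit + step / 2):
        raise ValueError("angle outside the range covered by the discriminators")
    return int(math.floor((angle_deg + step / 2) / step) * step)
```

For instance, in-plane angles of 350 and 14 degrees both fall in the 0-degree class, matching the −15 (= 345) to +15 degree range given for the 0-degree in-plane rotation discriminator.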

As shown in FIG. 5, each discriminator 25-i has a cascade structure in which a plurality of weak discriminators WC are linearly combined. Each weak discriminator WC calculates at least one feature amount related to the distribution of pixel values (luminance) of the candidate image CP and uses this feature amount to determine whether the candidate image CP is a face image.

The plurality of weak discriminators WC in each discriminator 25-i are divided into a front-stage weak discriminator group WC-F and a rear-stage weak discriminator group WC-B. The front-stage weak discriminator group WC-F determines the in-plane and out-of-plane orientation of the candidate image CP, and the rear-stage weak discriminator group WC-B determines whether the candidate image is a face in the orientation that the discriminator 25-i to which the group belongs can discriminate. For this reason, in this embodiment the sample images used for learning differ slightly between the front-stage weak discriminator group WC-F and the rear-stage weak discriminator group WC-B. Learning of the weak discriminators is described later. In this embodiment, the score finally obtained in the front-stage weak discriminator group WC-F of every discriminator 25-i is output, and the score finally obtained in the rear-stage weak discriminator group WC-B of every discriminator 25-i is also output. The score output by the front-stage weak discriminator group WC-F is called the first score, and the score output by the rear-stage weak discriminator group WC-B is called the second score.

At least some of the weak classifiers WC in the front-stage group WC-F of each classifier 25-i share feature quantities across the classifiers 25-i. That is, at least some weak classifiers WC are trained with the same feature quantity and determine face orientation using that same feature quantity of the candidate image CP. In FIG. 5, weak classifiers WC that share a feature quantity are indicated by hatching.

In FIG. 5, for the sake of explanation, the weak classifiers WC that share a feature quantity are hatched in every classifier 25-i. In practice, only one weak classifier needs to be created for each shared feature quantity. FIG. 6 is a diagram for explaining feature sharing. For the sake of explanation, FIG. 6 shows only four classifiers 25-1 to 25-4, and only the first four weak classifiers WC of each. The weak classifiers in stages 1 to 4 of the classifiers 25-1 to 25-4 are denoted 25-1-1, 25-1-2, and so on.

As shown in the upper part of FIG. 6, suppose that the first-stage weak classifiers share a feature quantity across all of the classifiers 25-1 to 25-4, that the second-stage weak classifiers 25-1-2, 25-2-2, and 25-3-2 share a feature quantity, that the third-stage weak classifiers 25-1-3 and 25-4-3 share a feature quantity, and that three of the fourth-stage weak classifiers, including 25-2-4 and 25-3-4, share a feature quantity. Because only one weak classifier needs to be created for each shared feature quantity, the weak classifiers in stages 1 to 4 of the classifiers 25-1 to 25-4 are combined as shown in the lower part of FIG. 6. Stages 1 to 4 then require 1, 2, 3, and 2 weak classifiers respectively, so the number of weak classifiers is reduced from 16 to 8.
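The count of 16 versus 8 can be checked with a minimal sketch. The `sharing` layout below is a hypothetical encoding of FIG. 6: each stage lists groups of classifier indices that share one feature quantity. The third member of the fourth-stage group is an assumption, since the text names only 25-2-4 and 25-3-4 unambiguously.

```python
# Hypothetical encoding of the sharing pattern in FIG. 6.
# Key: stage number. Value: groups of classifier indices (1..4) that share
# one feature quantity at that stage; a group of one means "not shared".
sharing = {
    1: [[1, 2, 3, 4]],
    2: [[1, 2, 3], [4]],
    3: [[1, 4], [2], [3]],
    4: [[2, 3, 4], [1]],  # assumption: classifier 4 is the third sharer
}


def count_weak_classifiers(layout):
    """One weak classifier is built per group at each stage."""
    return sum(len(groups) for groups in layout.values())


# Without sharing, every classifier needs its own weak classifier per stage.
total_unshared = sum(len(g) for groups in sharing.values() for g in groups)
total_shared = count_weak_classifiers(sharing)
```

Under these assumptions `total_unshared` is 16 and `total_shared` is 8, matching the reduction described in the text.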

Next, specific processing in the discrimination unit 25 will be described. FIG. 7 is a flowchart showing the processing performed by each classifier 25-i in the discrimination unit 25. In the following description the classifiers 25-i run in parallel, but they may of course be run sequentially.

First, in the front-stage group WC-F of each classifier 25-i, the first weak classifier WC calculates a feature quantity from the candidate image CP in order to determine the orientation of the face in it (step ST1), calculates a score by looking up the feature quantity in a score table described later (step ST2), and adds its own score to the score calculated by the preceding weak classifier to obtain a cumulative score (step ST3). Because the first weak classifier has no preceding weak classifier, its own score becomes the cumulative score as is. It is then determined whether cumulative scores have been calculated for all weak classifiers in WC-F (step ST4). If the result of step ST4 is negative, processing moves to the next weak classifier (step ST5) and returns to step ST1. In this way the cumulative score over all weak classifiers of WC-F is calculated. If the result of step ST4 is affirmative, the cumulative score is output as the first score P1-i (i = 1 to n) (step ST6).

Processing then proceeds to the back-stage group WC-B. Following step ST6, the first weak classifier WC in WC-B calculates a feature quantity from the candidate image CP (step ST7), calculates a score by looking up the feature quantity in its score table (step ST8), and adds its own score to the score calculated by the preceding weak classifier to obtain a cumulative score (step ST9). Because the first weak classifier in WC-B has no preceding weak classifier within the group, it adds its own score to the first score P1-i to obtain the cumulative score. It is then determined whether cumulative scores have been calculated for all weak classifiers in WC-B (step ST10). If the result of step ST10 is negative, processing moves to the next weak classifier (step ST11) and returns to step ST7. In this way the cumulative score over all weak classifiers of WC-B is calculated. If the result of step ST10 is affirmative, the cumulative score for WC-B is output as the second score P2-i (i = 1 to n) (step ST12).
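The two accumulation loops (steps ST1 to ST6 and ST7 to ST12) can be sketched as follows. This is an illustrative sketch only: the function name `stage_scores`, the representation of a score table as a list indexed by a quantized feature value, and the argument layout are all assumptions, not part of the patent.

```python
def stage_scores(front_features, back_features, front_tables, back_tables):
    """Return (P1, P2) for one classifier.

    front_features / back_features: quantized feature indices, one per
    weak classifier (hypothetical representation).
    front_tables / back_tables: one score table (a list) per weak classifier.
    """
    # Steps ST1-ST5: accumulate scores over the front-stage group WC-F.
    cumulative = 0.0
    for feature, table in zip(front_features, front_tables):
        cumulative += table[feature]
    p1 = cumulative  # Step ST6: first score.

    # Steps ST7-ST11: the back-stage group WC-B continues from P1,
    # because its first weak classifier adds its score to the first score.
    for feature, table in zip(back_features, back_tables):
        cumulative += table[feature]
    p2 = cumulative  # Step ST12: second score.
    return p1, p2
```

Note that, following the text literally, the second score starts from the first score rather than from zero.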

Next, the discrimination result output unit 25-L multiplies the first score P1-i by the second score P2-i for each classifier 25-i and sums the products over all classifiers 25-i to obtain the final score PL for the candidate image CP (step ST13). The final score PL is calculated by equation (1), where Σ denotes the sum of (P1-i) × (P2-i) over i = 1 to n.

$$PL = \sum_{i=1}^{n} (P1\text{-}i) \times (P2\text{-}i) \qquad (1)$$

The discrimination result output unit 25-L then determines whether the candidate image CP is a face image according to whether the final score PL is equal to or greater than a predetermined threshold, and outputs the discrimination result R (step ST14).
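Equation (1) and the threshold test of step ST14 amount to a sum of products followed by a comparison. A minimal sketch, in which the function names and the list representation of the per-classifier scores are illustrative assumptions:

```python
def final_score(first_scores, second_scores):
    """Equation (1): sum over classifiers of P1-i times P2-i."""
    return sum(p1 * p2 for p1, p2 in zip(first_scores, second_scores))


def is_face(first_scores, second_scores, threshold):
    """Step ST14: face if the final score is at or above the threshold."""
    return final_score(first_scores, second_scores) >= threshold
```

For example, with first scores `[1.0, 2.0]` and second scores `[3.0, 0.5]` the final score is 1.0 × 3.0 + 2.0 × 0.5 = 4.0, so the candidate passes a threshold of 4.0 but fails a threshold of 4.5.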

In the present embodiment, the detection control unit 21, the resolution image selection unit 22, the window setting unit 23, the candidate discrimination unit 24, and the discrimination unit 25 together function as the determination unit of the present invention.

Next, the flow of processing in the face detection system 1 will be described. FIG. 8 is a flowchart showing the flow of processing in the face detection system according to the present embodiment. As shown in FIG. 8, when the detection target image S0 is input to the multi-resolution unit 10 (step ST21), the multi-resolution unit 10 converts the detection target image S0 to multiple resolutions and generates a resolution image group Sg consisting of a plurality of resolution images Sk (step ST22). In the face detection unit 20, the resolution image selection unit 22, on instruction from the detection control unit 21, selects resolution images Sk from the resolution image group Sg in descending order of image size (step ST23). Next, the detection control unit 21 instructs the window setting unit 23 to set the window W at its initial position, that is, at the first pixel of interest on the selected resolution image (step ST24). The window setting unit 23 sets the window W on the selected resolution image, cuts out the partial image B with the window W (step ST25), and inputs the partial image B to the candidate discrimination unit 24 (step ST26).

The candidate discrimination unit 24 determines whether the input partial image B is a face image. The detection control unit 21 acquires the discrimination result CR (step ST27) and determines whether the result CR indicates that the partial image B is a face image (step ST28). If the result CR indicates that the partial image B is not a face image (step ST28: negative), the detection control unit 21 determines whether the partial image B just cut out is the one located at the last pixel of interest, that is, the last partial image (step ST29). If the partial image B is not the last partial image, the position at which the window W is set is moved to the next pixel of interest (that is, the next position) (step ST30), processing returns to step ST25, and the window setting unit 23 cuts out a new partial image B.

If the partial image B is determined to be the last partial image, the detection control unit 21 determines whether the currently selected resolution image Sk is the last image to be examined, that is, the last resolution image Sm (step ST31). If it is the last resolution image Sm, the detection processing ends and the detection result is output (step ST32). If it is not the last resolution image, processing returns to step ST23, the resolution image selection unit 22 selects the resolution image one step smaller than the currently selected one, and face detection continues.

If, on the other hand, the discrimination result CR indicates that the partial image B is a face image, the candidate discrimination unit 24 treats the partial image B as a candidate image CP, and more detailed detection processing is performed (step ST33). FIG. 9 is a flowchart of the detailed detection processing. Because the candidate discrimination unit 24 has determined that the partial image B is a candidate image CP, the detection control unit 21 inputs the candidate image CP to the discrimination unit 25 (step ST41). The discrimination unit 25 determines whether the input candidate image CP is a face image, the detection control unit 21 acquires the discrimination result R (step ST42), and processing proceeds to step ST29 of the flowchart in FIG. 8. Through the above processing, images containing faces in various orientations can be detected from the detection target image S0.

As the detection result, if no face could be detected in the detection target image S0, a message to that effect is output. If a face was detected, the coordinates on the detection target image S0 of the partial image in which the face was detected are output.
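The scan described in steps ST21 to ST33 is a two-level loop: over resolution images from largest to smallest, and over window positions within each image, with the detailed check run only on windows that pass the candidate check. The sketch below is a simplified illustration. Images are plain nested lists, `is_candidate` and `is_face` are hypothetical callables standing in for the candidate discrimination unit 24 and the discrimination unit 25, and the result reports positions in each resolution image rather than mapping them back to S0.

```python
def crop(image, x, y, size):
    """Cut out the size-by-size partial image whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in image[y:y + size]]


def detect(pyramid, size, is_candidate, is_face, step=1):
    """Return (level, x, y) for every window judged to be a face."""
    detections = []
    # Step ST23: visit resolution images in descending order of size.
    ordered = sorted(pyramid, key=lambda im: len(im) * len(im[0]), reverse=True)
    for level, image in enumerate(ordered):
        height, width = len(image), len(image[0])
        # Steps ST24-ST30: move the window over every position.
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                patch = crop(image, x, y, size)
                # Steps ST27-ST28 then ST41-ST42: coarse check, then detailed check.
                if is_candidate(patch) and is_face(patch):
                    detections.append((level, x, y))
    return detections
```

An empty return list corresponds to the "no face detected" output described above.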

Next, the learning method (generation method) for the classifiers will be described. Learning is performed for each type of classifier, that is, for each face orientation to be discriminated.

The sample image group used for learning consists of a plurality of sample images known to be faces (the face sample image group) and a plurality of sample images known not to be faces (the non-face sample image group), all normalized to the size of the window W.

The face sample images cover the face orientations corresponding to the number of classes of the classifiers 25-i. Specifically, they consist of in-plane rotation sample images, which are 12 kinds of images in which a face placed at a set position (for example, the center) is rotated in 30° steps as shown in FIG. 10(a), and out-of-plane rotation sample images, which are 7 kinds of images in which the orientation of a face placed at a set position (for example, the center) is rotated in ±30° steps as shown in FIG. 10(b). The position and size of the face are normalized in every face sample image.

Using these face and non-face sample image groups, a classifier 25-i is trained for each face orientation, producing 19 kinds of classifiers (12 in-plane plus 7 out-of-plane). A specific learning method is described below.

FIG. 11 is a flowchart showing the learning method for a classifier. In the present embodiment, the front-stage group WC-F shares feature quantities across the classifiers 25-i, and the user may choose which feature quantities to share as appropriate at learning time. Each sample image is assigned a weight, that is, an importance. First, the initial weights of all sample images are set equally to 1 (step ST51). Next, feature quantities are acquired from the sample images, and a weak classifier is created for each feature quantity (step ST52). As a feature quantity, for example, the difference between the pixel values (luminance values) at two predetermined points in the sample image can be used. In the present embodiment, a histogram of the feature quantity is used as the basis of the score table of the weak classifier.

The creation of one weak classifier will be described with reference to FIG. 12. As shown in the sample image on the left of FIG. 12, the feature quantity used to create this weak classifier is the difference between the pixel values at a point O1 at the center of the right eye and a point O2 on the right cheek in the face sample image. The coordinates used to obtain the feature quantity for a given weak classifier are the same in all sample images. The feature quantity is obtained for the face sample images, and its histogram is created. The values the feature quantity can take depend on the number of luminance levels of the image. With 16-bit gradation, for example, there would be 65536 possible values for a single pixel-value difference, and learning and detection would require a very large number of samples and a great deal of time and memory. In the present embodiment, therefore, the feature quantity is quantized by dividing its range into suitable intervals and converting it to one of n values (for example, n = 100). The number of feature-quantity combinations then becomes n, which reduces the amount of data needed to represent the feature quantity.
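The feature extraction and quantization just described can be illustrated as follows. This is a sketch under stated assumptions: images are nested lists indexed as `image[y][x]`, the bins are of equal width over a caller-supplied range `[lo, hi]`, and the function names are hypothetical. The patent only says the range is divided into "suitable" intervals, so equal-width binning is one possible choice.

```python
def pixel_difference(image, point_a, point_b):
    """Feature quantity: pixel value at point_a minus pixel value at point_b."""
    (xa, ya), (xb, yb) = point_a, point_b
    return image[ya][xa] - image[yb][xb]


def quantize(value, lo, hi, n=100):
    """Map a value in [lo, hi] to one of n equal-width bins, 0 .. n-1."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    index = int((value - lo) * n / (hi - lo))
    # Clamp so that value == hi (or out-of-range input) stays in a valid bin.
    return min(max(index, 0), n - 1)
```

With 8-bit pixel values, for instance, the difference lies in [-255, 255], and `quantize(d, -255, 255)` reduces it to one of 100 bin indices that can be used directly as a score-table index.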

A histogram is created in the same way for the non-face sample images. For the non-face sample images, the pixel values at the positions corresponding to the pixels used to obtain the feature quantity in the face sample images are used. The histogram shown on the far right of FIG. 12, obtained by taking the logarithm of the ratio of the frequency values of these two histograms, is used as the basis of the score table of the weak classifier. The value on the vertical axis of this weak classifier histogram is hereinafter referred to as a score. With this weak classifier, an image whose feature-quantity combination corresponds to a positive score is likely to be a face, and the larger the absolute value of the score, the higher that likelihood. Conversely, an image whose feature-quantity combination corresponds to a negative score is likely not to be a face, and again the larger the absolute value of the score, the higher that likelihood. In step ST52, a plurality of weak classifiers in this histogram form are created for the feature-quantity combinations that may be used for discrimination.
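The log-ratio construction can be sketched as below. The inputs are assumed to be already-quantized bin indices, one per sample image. Two details are assumptions added for the sketch and are not stated in the patent: the histograms are normalized to relative frequencies so that unequal sample counts do not bias the ratio, and a small constant `eps` is added to avoid taking the logarithm of zero for empty bins.

```python
import math


def histogram(bin_indices, n):
    """Count how many samples fall into each of n bins."""
    counts = [0] * n
    for b in bin_indices:
        counts[b] += 1
    return counts


def score_table(face_bins, nonface_bins, n, eps=1e-6):
    """Score per bin: log of (face frequency / non-face frequency)."""
    face = histogram(face_bins, n)
    nonface = histogram(nonface_bins, n)
    n_face, n_nonface = len(face_bins), len(nonface_bins)
    return [
        math.log((face[k] / n_face + eps) / (nonface[k] / n_nonface + eps))
        for k in range(n)
    ]
```

A bin that is more common among face samples receives a positive score and a bin that is more common among non-face samples receives a negative score, which matches the interpretation of the score given in the text.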

Next, from the weak classifiers created in step ST52, the one most effective for determining whether an image is a face in the specific orientation is selected. This selection takes the weight of each sample image into account. In this example, the weighted correct-answer rates of the weak classifiers are compared, and the weak classifier with the highest weighted correct-answer rate is selected (step ST53). In the first pass through step ST53, every sample image has a weight of 1, so the weak classifier that correctly classifies the largest number of sample images is simply selected as the most effective. In the second pass through step ST53, after the weights have been updated in step ST55 described later, sample images with weights equal to 1, greater than 1, and less than 1 are mixed, and a sample image with a weight greater than 1 counts for correspondingly more than a sample image with a weight of 1 when the correct-answer rate is evaluated. As a result, from the second pass onward, step ST53 places more emphasis on correctly classifying sample images with large weights than those with small weights.

Next, it is checked whether the correct-answer rate of the combination of weak classifiers selected so far exceeds a predetermined threshold (step ST54). This rate is the proportion of sample images for which the result of using the selected weak classifiers in combination to decide whether the image is a face in the specific orientation agrees with the true answer. At the learning stage the weak classifiers need not be linearly combined. Either the currently weighted sample image group or an equally weighted sample image group may be used to evaluate the correct-answer rate of the combination. If the rate exceeds the threshold, the weak classifiers selected so far can determine whether an image is a face with a sufficiently high score, and learning ends. If the rate is at or below the threshold, processing proceeds to step ST56 in order to select an additional weak classifier to be used in combination with those already selected. In step ST56, the weak classifier selected in the most recent step ST53 is excluded so that it is not selected again.

Next, the weights of the sample images that the weak classifier selected in the most recent step ST53 failed to classify correctly as a face in the specific orientation are increased, and the weights of the sample images it classified correctly are decreased (step ST55). The weights are adjusted in this way so that the next selection gives priority to images that the already selected weak classifiers could not classify correctly, choosing a weak classifier that can classify those images and thereby making the combination of weak classifiers more effective. Processing then returns to step ST53, where the next most effective weak classifier is selected on the basis of the weighted correct-answer rate as described above.

Steps ST53 to ST56 are repeated, selecting weak classifiers corresponding to feature-quantity combinations that are suited to determining whether an image is a face in the specific orientation. When the correct-answer rate checked in step ST54 exceeds the threshold, the types of weak classifier and the discrimination conditions to be used are fixed (step ST57), and learning ends. The selected weak classifiers are linearly combined in descending order of weighted correct-answer rate to form one classifier. For each weak classifier, a score table for calculating a score from the feature-quantity combination is generated from the histogram obtained for it. The histogram itself can also be used as the score table, in which case the discrimination points of the histogram are used directly as scores. By performing learning in this way for each face sample image group, the 19 kinds of classifiers described above are generated.
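The selection loop of steps ST51 to ST57 can be sketched as follows. Several choices here are assumptions made only to keep the sketch concrete, since the patent leaves them open: weak classifiers are represented as callables returning True or False, the combined decision in step ST54 is a simple majority vote, and the weight update in step ST55 multiplies by fixed factors `up` and `down`. The sketch also returns the weak classifiers in selection order and omits the final reordering by weighted correct-answer rate.

```python
def combined_rate(selected, samples, labels):
    """Fraction of samples that a majority vote of `selected` classifies correctly."""
    correct = 0
    for sample, label in zip(samples, labels):
        votes = sum(1 if h(sample) else -1 for h in selected)
        correct += (votes > 0) == label
    return correct / len(samples)


def train(candidates, samples, labels, target_rate, up=2.0, down=0.5):
    """Select weak classifiers from `candidates` (assumed built in step ST52)."""
    weights = [1.0] * len(samples)                      # Step ST51
    remaining = list(candidates)
    selected = []
    while remaining:
        def weighted_rate(h):
            right = sum(w for w, s, y in zip(weights, samples, labels) if h(s) == y)
            return right / sum(weights)

        best = max(remaining, key=weighted_rate)        # Step ST53
        selected.append(best)
        if combined_rate(selected, samples, labels) > target_rate:  # Step ST54
            break                                       # Step ST57
        remaining.remove(best)                          # Step ST56
        weights = [w * (down if best(s) == y else up)   # Step ST55
                   for w, s, y in zip(weights, samples, labels)]
    return selected
```

The loop also stops when the candidate pool is exhausted, a practical guard that the flowchart does not mention.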

In the present embodiment, the front-stage group WC-F determines the orientation of the face in the candidate image CP, and the back-stage group WC-B determines whether the candidate image CP is a face in the orientation that the classifier 25-i to which the group belongs can discriminate. For this reason, once a predetermined number of weak classifiers have been selected, the weak classifiers selected up to that point are taken to form the front-stage group WC-F. The front-stage group WC-F is then used to classify the face sample image group and the non-face sample image group. Because the front-stage group WC-F alone represents a classifier 25-i that is only partway through learning, it cannot discriminate faces in the specific orientation accurately, and it may classify a non-face sample image that resembles a face in the specific orientation as such a face.

FIG. 13 shows examples of sample images determined to be faces by the front-stage group WC-F of the classifier that discriminates frontal faces. Of the four sample images shown in FIG. 13, sample images SP1 to SP3 are clearly frontal faces, whereas sample image SP4 is clearly not a face. In the present embodiment, therefore, among the sample images that the front-stage group WC-F has determined to be faces in the specific orientation, those that are clearly not faces are not used as non-face sample images for learning of the back-stage group WC-B. After the sample images to be used for learning have been selected in this way, learning of the weak classifiers that form the back-stage group WC-B continues on from the front-stage group WC-F. When the back-stage group WC-B is trained, setting the threshold used in step ST54 higher than the threshold used for training the front-stage group WC-F allows the back-stage group WC-B to discriminate with higher accuracy.

When the above learning method is adopted, the weak classifier is not limited to the histogram form described above. It may take any form that provides a criterion for distinguishing face images from non-face images using a combination of feature quantities, such as binary data, a threshold, or a function. Even within the histogram form, a histogram showing the distribution of the differences between the two histograms shown in the center of FIG. 12 may be used instead. The learning method is also not limited to the one described above, and other machine learning methods such as neural networks can be used.

As described above, according to the present embodiment, the front-stage group WC-F yields a first score P1-i representing the face orientation, and the back-stage group WC-B yields a second score P2-i representing that the image is a face in the orientation each classifier 25-i can discriminate. A face is then detected on the basis of the sum, over all classifiers 25-i, of the product of the first score P1-i and the second score P2-i of each classifier 25-i. Because the discrimination results of a plurality of classifiers 25-i, each handling a different face orientation, are integrated to detect a face, faces in different orientations can be detected more flexibly than with methods that detect only faces in a specific orientation, such as those of Patent Document 1 and Non-Patent Document 1. Face detection accuracy can therefore be improved without increasing the number of weak classifiers.

In addition, by having the front-stage groups WC-F of at least some of the classifiers 25-i share feature quantities, a single weak classifier WC can perform the discrimination processing for several of the classifiers 25-i at once. The number of weak classifiers WC in the front-stage groups WC-F can therefore be reduced.

In the above embodiment the detection target is a human face, but other objects, such as a human hand, may be detected instead. In that case, the classifiers may be trained using a group of sample images that contain the object and a group of sample images that do not.

The face detection system according to the embodiment of the present invention has been described above. A program that causes a computer to execute each process in the portion of this face detection system corresponding to the object detection apparatus of the present invention is also an embodiment of the present invention, as is a computer-readable recording medium on which such a program is recorded.

DESCRIPTION OF SYMBOLS
1 Face detection system
10 Multi-resolution unit
20 Face detection unit
21 Detection control unit
22 Resolution image selection unit
23 Window setting unit
24 Candidate discrimination unit
25 Discrimination unit

Claims

1. An object detection apparatus comprising a discrimination unit that has a plurality of classifiers, each composed of a plurality of weak classifiers trained in advance on feature quantities extracted from an object to be discriminated, the classifiers differing from one another in the object orientation they can discriminate, the discrimination unit detecting the object from a detection target image using feature quantities extracted from the detection target image, wherein
the plurality of weak classifiers of each classifier are divided into a preceding weak classifier group and a subsequent weak classifier group, the preceding weak classifier group being trained to determine the orientation of the object and the subsequent weak classifier group being trained to detect the object in the orientation discriminable by the classifier to which that group belongs, and
the discrimination unit is means for acquiring a first score that is the output of the preceding weak classifier group and a second score that is the output of the subsequent weak classifier group, and for detecting the object based on the sum, over all the classifiers, of the products of the first score and the second score in each of the plurality of classifiers.

2. The object detection apparatus according to claim 1, wherein the preceding weak classifier group shares the feature quantities across at least some of the plurality of classifiers.

3. The object detection apparatus according to claim 1, wherein the preceding weak classifier group and the subsequent weak classifier group are connected in series.

4. The object detection apparatus according to any one of claims 1 to 3, wherein the learning is performed using a reference sample image in which the object faces a predetermined direction, together with at least one of: a plurality of in-plane rotated sample images with different rotation angles, obtained by rotating the discrimination target of the reference sample image within the plane of the reference sample image; and a plurality of out-of-plane rotated sample images with different rotation angles, obtained by rotating the orientation of the discrimination target in the reference sample image.

5. An object detection method for detecting an object from a detection target image using feature quantities extracted from the detection target image and a plurality of classifiers, each composed of a plurality of weak classifiers trained in advance on feature quantities extracted from the object to be discriminated, the classifiers differing from one another in the object orientation they can discriminate, wherein
the plurality of weak classifiers of each classifier are divided into a preceding weak classifier group and a subsequent weak classifier group, the preceding weak classifier group being trained to determine the orientation of the object and the subsequent weak classifier group being trained to detect the object in the orientation discriminable by the classifier to which that group belongs, the method comprising:
acquiring a first score that is the output of the preceding weak classifier group and a second score that is the output of the subsequent weak classifier group; and
detecting the object based on the sum, over all the classifiers, of the products of the first score and the second score in each of the plurality of classifiers.

6. A program for causing a computer to execute an object detection method for detecting an object from a detection target image using feature quantities extracted from the detection target image and a plurality of classifiers, each composed of a plurality of weak classifiers trained in advance on feature quantities extracted from the object to be discriminated, the classifiers differing from one another in the object orientation they can discriminate, wherein
the plurality of weak classifiers of each classifier are divided into a preceding weak classifier group and a subsequent weak classifier group, the preceding weak classifier group being trained to determine the orientation of the object and the subsequent weak classifier group being trained to detect the object in the orientation discriminable by the classifier to which that group belongs, the program causing the computer to execute:
a procedure for acquiring a first score that is the output of the preceding weak classifier group and a second score that is the output of the subsequent weak classifier group; and
a procedure for detecting the object based on the sum, over all the classifiers, of the products of the first score and the second score in each of the plurality of classifiers.