JP2015056004A

JP2015056004A - Object detection device, method and program, and feature quantity deriving method and program

Info

Publication number: JP2015056004A
Application number: JP2013188673A
Authority: JP
Inventors: 載勲劉; Jaehoon Yu; 龍介宮本; Ryusuke Miyamoto; 孝雄尾上; Takao Onoe
Original assignee: Osaka University NUC
Current assignee: Osaka University NUC
Priority date: 2013-09-11
Filing date: 2013-09-11
Publication date: 2015-03-23

Abstract

PROBLEM TO BE SOLVED: To increase processing speed and improve object detection accuracy. SOLUTION: An object detection device generates, on the basis of an input image (31), an integral image of a predetermined scale for each of a plurality of channels including a predetermined number of luminance gradient directions and a gradient intensity. On the basis of the generated integral images of the predetermined scale, the object detection device calculates a gradient feature quantity for each luminance gradient direction and an intensity feature quantity for the gradient intensity, in units of a rectangular area (51, 61) corresponding to a search window (40). The object detection device identifies an object using, as the identification feature quantity for each luminance gradient direction, a normalized gradient feature quantity obtained by dividing the gradient feature quantity calculated from the gradient integral image of the predetermined scale by the intensity feature quantity calculated from the intensity integral image of the predetermined scale in the same rectangular area.

Description

The present invention relates to an object detection device, method, and program, and to a feature quantity derivation method and program. In particular, it relates to an object detection device, method, and program for detecting an object such as a pedestrian in an input image using luminance gradients, and to a feature quantity derivation method and program for deriving identification feature quantities used to detect an object in an input image.

There are various methods for detecting pedestrians based on image processing. One of the methods currently achieving the world's highest level of detection accuracy is called "Integral Channel Features" (Non-Patent Document 1). In this method, various features such as luminance gradient direction, gradient intensity, and color information are generated from an input image as integral images. Tree-structured weak classifiers whose nodes are rectangular features computed from these integral images are then configured, and a strong classifier is constructed by training these weak classifiers with boosting.

In the above method, integral images at a plurality of different scales are generated for each channel (feature) in order to handle pedestrians of arbitrary size. It is well known that detecting an object of arbitrary size in this way requires processing of resized images, as also shown in, for example, Japanese Patent Laying-Open No. 2008-27275 (Patent Document 1).

One method for speeding up the computation of Integral Channel Features is "FPDW (The Fastest Pedestrian Detector in the West)" (Non-Patent Document 2). In this method, integral images at multiple scales are generated at sparser scale intervals than in Integral Channel Features. The rectangular feature quantity of an integral image at a given scale is then approximately calculated (estimated) by multiplying the generated integral image by an experimentally obtained fixed constant.

Patent Document 1: JP 2008-27275 A

Non-Patent Document 1: Piotr Dollar and three others, "Integral Channel Features", [online], 2009, [retrieved July 3, 2013], Internet <http://www.loni.ucla.edu/~ztu/publication/dollarBMVC09ChnFtrs#0.pdf>
Non-Patent Document 2: Piotr Dollar and two others, "FPDW (The Fastest Pedestrian Detector in the West)", [online], 2010, [retrieved July 3, 2013], Internet <http://vision.ucsd.edu/sites/default/files/FPDW#0.pdf>

In Integral Channel Features (hereinafter abbreviated as "ICF"), integral images must be generated at many scales, which makes real-time pedestrian detection difficult. FPDW achieves a certain degree of scale independence, but the fixed constant used to estimate rectangular feature quantities must be determined by prior experiment in order to maintain detection accuracy.

An object of the present invention is to provide an object detection device, method, and program, and a feature quantity derivation method and program, capable of increasing processing speed and improving detection accuracy.

An object detection device according to one aspect of the present invention is an object detection device for detecting an object in an input image, comprising: generation means for generating, based on the input image, a gradient integral image and an intensity integral image of a predetermined scale for each of a plurality of channels including a predetermined number of luminance gradient directions and a gradient intensity; calculation means for calculating, in units of a rectangular area corresponding to a search window and based on the gradient integral images and the intensity integral image of the predetermined scale generated by the generation means, a gradient feature quantity for each luminance gradient direction and an intensity feature quantity for the gradient intensity; and identification means for identifying the presence or absence of the object in the rectangular area based on the gradient feature quantities and the intensity feature quantity calculated by the calculation means. The identification means identifies the presence or absence of the object using feature quantities that include a normalized gradient feature quantity, obtained by dividing the gradient feature quantity calculated from the gradient integral image of the predetermined scale by the intensity feature quantity calculated from the intensity integral image of the predetermined scale in the same rectangular area.

Preferably, for the gradient intensity channel, the generation means further generates, based on the input image, a reduced intensity integral image whose scale differs from the predetermined scale, and the calculation means further calculates an intensity feature quantity based on the reduced intensity integral image. When the channel under discrimination is the gradient intensity, the identification means identifies the presence or absence of the object based on the intensity feature quantities for a plurality of scales in the same rectangular area.

Preferably, the identification means includes normalization means for dividing the gradient feature quantity by the intensity feature quantity of the predetermined scale when the channel under discrimination is a luminance gradient direction.

Preferably, the plurality of channels further include color information. For the color information channels, the generation means generates, based on the input image, color integral images at a plurality of scales including the predetermined scale, and the calculation means calculates a color feature quantity in units of the rectangular area from each of the generated color integral images. When the channel under discrimination is color information, the identification means identifies the presence or absence of the object based on the color feature quantities for the plurality of scales in the same rectangular area.

Alternatively, it is also desirable that the plurality of channels further include color information, that the calculation means calculate, for the color information channels, a color feature quantity in units of the rectangular area from a color integral image of the predetermined scale generated based on the input image, and that the identification means identify the presence or absence of the object using a normalized color feature quantity obtained by dividing the color feature quantity by a computed value indicating the total of the colors in the target rectangular area.

The computed value may be a pixel value within the rectangular area.

Preferably, the device further comprises determination means for determining the position of the object in the input image based on the identification result obtained by the identification means for each rectangular area, and output means for outputting the determination result of the determination means.

An object detection method according to another aspect of the present invention is an object detection method for detecting an object in an input image, comprising the steps of: generating, based on the input image, a gradient integral image and an intensity integral image of a predetermined scale for each of a plurality of channels including a predetermined number of luminance gradient directions and a gradient intensity; calculating, in units of a rectangular area corresponding to a search window and based on the generated gradient integral images and intensity integral image of the predetermined scale, a gradient feature quantity for each luminance gradient direction and an intensity feature quantity for the gradient intensity; and identifying the presence or absence of the object in the rectangular area based on the calculated gradient feature quantities and intensity feature quantity. The identifying step includes a step of identifying the presence or absence of the object using feature quantities that include a normalized gradient feature quantity, obtained by dividing the gradient feature quantity calculated from the gradient integral image of the predetermined scale by the intensity feature quantity calculated from the intensity integral image of the predetermined scale in the same rectangular area.

An object detection program according to still another aspect of the present invention causes a computer to execute each step included in the object detection method described above.

A feature quantity derivation method according to still another aspect of the present invention is a method for deriving identification feature quantities for detecting an object in an input image, comprising the steps of: generating, based on the input image, an integral image for each of a plurality of channels representing predetermined characteristic information; calculating, based on the generated integral images, a rectangular feature quantity of each channel for each rectangular area corresponding to a search window; and calculating an identification feature quantity by dividing the calculated rectangular feature quantity by a computed value indicating the sum of the predetermined characteristic information in the same rectangular area.

A feature quantity derivation program according to still another aspect of the present invention causes a computer to execute each step included in the feature quantity derivation method described above.

According to the present invention, the presence or absence of an object is identified using a normalized gradient feature quantity that does not depend on scale. The processing speed can therefore be increased, and the detection accuracy for the object can also be improved.

A block diagram showing an example hardware configuration of the object detection device according to an embodiment of the present invention.
A functional block diagram showing the functional configuration of the object detection device according to the embodiment of the present invention.
A conceptual diagram showing the method for calculating the identification feature quantities for color information and gradient intensity.
A conceptual diagram showing the method for calculating the identification feature quantity for the luminance gradient.
A diagram conceptually showing the identification processing according to the embodiment of the present invention.
A flowchart showing the object detection processing according to the embodiment of the present invention.
A flowchart showing the identification processing in the embodiment of the present invention.
A graph showing the pedestrian detection accuracy when evaluated on the INRIA data set.

Embodiments of the present invention will be described in detail with reference to the drawings. In the drawings, identical or corresponding parts are denoted by the same reference numerals, and their description will not be repeated.

The object detection device according to the present embodiment detects, for example, a pedestrian as the object in an input image. Like the ICF described above, the object detection device according to the present embodiment extracts color information, luminance gradient, and luminance gradient intensity from the input image as characteristic information effective for detecting pedestrians. The color information is divided into, for example, three channels based on the LUV color system. The luminance gradient is divided into, for example, six channels of 0°, 30°, 60°, 90°, 120°, and 150°, obtained by dividing 180° into six. The luminance gradient intensity has a single channel. The object detection device generates an integral image for each channel, extracts rectangular features from the integral images, and calculates the score of a strong classifier, constructed by boosting, from the scores of the weak classifiers.

The weak classifiers are connected, for example, in a soft cascade. A method for detecting an object in an image using a soft cascade is described in U.S. Patent No. 7,634,142, and a method for speeding up image identification processing using a soft cascade is described in U.S. Patent Application Publication No. 2009/0018980. Methods for constructing a soft cascade are also known.
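The early-rejection idea behind a soft cascade can be sketched as follows. This is only an illustrative sketch, not the procedure of the cited U.S. documents: the function name `soft_cascade`, the per-stage rejection thresholds, and the rule "reject as soon as the running score falls below the stage threshold" are assumptions introduced here.

```python
def soft_cascade(weak_classifiers, rejection_thresholds, window):
    """Evaluate weak classifiers in order and stop early on rejection.

    weak_classifiers: list of callables mapping a window to a score
        (hypothetical interface).
    rejection_thresholds: one threshold per weak classifier (assumed to be
        obtained beforehand by training).
    Returns (accepted, accumulated_score, number_of_classifiers_evaluated).
    """
    total = 0.0
    pairs = zip(weak_classifiers, rejection_thresholds)
    for evaluated, (weak, reject_below) in enumerate(pairs, start=1):
        total += weak(window)
        if total < reject_below:
            # Running score too low: skip the remaining weak classifiers.
            return False, total, evaluated
    return True, total, len(weak_classifiers)
```

Because most search windows contain no object, most of them would be rejected after only a few weak classifiers under this scheme, which is where the speed-up comes from.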

In ICF and FPDW, integral images at a plurality of scales are generated for each channel with respect to a detection window of the same size. In the present embodiment, by contrast, a scale-independent feature quantity is constructed by applying an appropriate correction when rectangular features are computed from the integral images corresponding to the luminance gradient. The configuration and operation of an object detection device that derives such scale-independent identification feature quantities and uses them to detect an object are described in detail below.

<Configuration>
(Hardware configuration)
FIG. 1 is a block diagram showing an example hardware configuration of the object detection device 1 according to the embodiment of the present invention. As shown in FIG. 1, the object detection device 1 can be realized by a general-purpose computer such as a PC (Personal Computer).

The object detection device 1 includes a CPU (Central Processing Unit) 11 for performing various arithmetic operations, a ROM (Read Only Memory) 12 for storing various data and programs, a RAM (Random Access Memory) 13 for storing working data and the like, a hard disk 14 serving as a nonvolatile storage device, an operation unit 15 including a keyboard and the like, a display unit 16 for displaying various information, a drive device 17 capable of reading and writing data and programs from and to a recording medium 17a, a communication I/F (interface) 18 for network communication, and an image input unit 19 for inputting images (including still images and moving images). The image input unit 19 is realized by a camera that captures images. The recording medium 17a may be, for example, a CD-ROM (Compact Disc-ROM) or a memory card.

The object detection device 1 need not include the image input unit 19. In that case, the object detection processing may be performed on, for example, image data obtained through the communication I/F 18 or image data read from the recording medium 17a. In the present embodiment, an image obtained from outside in this way is also treated as an "input image".

(Functional configuration)
FIG. 2 is a functional block diagram showing the functional configuration of the object detection device 1 according to the embodiment of the present invention. Referring to FIG. 2, the object detection device 1 includes, as its functional configuration, an image storage unit 110, a conversion unit 120, a generation unit 130, a setting unit 140, a calculation unit 150, an identification unit 160, a determination unit 170, and an output unit 180.

The image storage unit 110 stores the input image. The input image is, for example, an image input from the image input unit 19. The image storage unit 110 is realized by, for example, the RAM 13 or the hard disk 14.

The conversion unit 120 generates, from the input image stored in the image storage unit 110, a plurality of converted images of a predetermined scale, one for each of the plurality of channels. As described above, the plurality of channels include three color information channels, six luminance gradient directions, and one gradient intensity. The conversion unit 120 accordingly includes a color information extraction unit 122, a luminance gradient extraction unit 124, and a gradient intensity extraction unit 126. The color information extraction unit 122 extracts color information from the input image and converts the input image into an LUV image. The luminance gradient extraction unit 124 extracts luminance gradients from the input image and converts the input image into a gradient image for each direction. The gradient intensity extraction unit 126 extracts the gradient intensity from the input image and converts the input image into a gradient intensity image. The gradient intensity represents the total over the luminance gradient directions.
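As a rough illustration of the gradient-related channels, the following sketch computes six orientation channels and one gradient intensity channel from a grayscale array. The use of NumPy, the function name `gradient_channels`, the finite-difference gradient, and the hard assignment of each pixel to a single 30° bin are all assumptions made for this example; the text above does not specify how the channels are computed, and the LUV conversion is not covered here.

```python
import numpy as np

def gradient_channels(gray, n_bins=6):
    """Return (orientation_channels, magnitude) for a 2-D grayscale array.

    orientation_channels has shape (n_bins, H, W); each pixel's gradient
    magnitude is placed in exactly one orientation bin (hard binning, an
    assumption of this sketch).
    """
    gray = np.asarray(gray, dtype=np.float64)
    gy, gx = np.gradient(gray)                 # differences along rows, columns
    magnitude = np.hypot(gx, gy)               # gradient intensity channel
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bin_width = 180.0 / n_bins
    # The trailing modulo guards against an angle that rounds to exactly 180.
    bin_index = np.floor(angle / bin_width).astype(int) % n_bins
    channels = np.zeros((n_bins,) + gray.shape)
    for b in range(n_bins):
        mask = bin_index == b
        channels[b][mask] = magnitude[mask]
    return channels, magnitude
```

With this binning, the orientation channels add up pixel by pixel to the gradient intensity channel, which matches the statement that the gradient intensity represents the total over the gradient directions.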

The generation unit 130 generates an integral image for each channel based on the converted images generated by the conversion unit 120. An integral image is a well-known data structure in which the results of an operation on the pixel values of a converted image are arranged according to the pixel layout of that image, so that a feature quantity can be derived by adding and subtracting the values at the four corners of a rectangular area. For the color information and gradient intensity channels, the generation unit 130 generates integral images at a plurality of scales in accordance with the FPDW described above. For the luminance gradient channels, it generates only an integral image of the predetermined scale based on the input image. The "predetermined scale" is preferably the same scale as the input image. In the present embodiment, the integral image is generated for the entire image, but it may be generated for only part of it.
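The four-corner property of an integral image can be illustrated with a minimal sketch. The function names and the zero-padded layout (one extra row and column of zeros) are choices made for this example, not details taken from the text.

```python
import numpy as np

def integral_image(channel):
    """Cumulative sum over rows and columns, padded with a zero row and column."""
    ch = np.asarray(channel, dtype=np.float64)
    ii = np.zeros((ch.shape[0] + 1, ch.shape[1] + 1))
    ii[1:, 1:] = ch.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the channel over a rectangle, using only four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]
```

For example, with `channel = np.arange(20).reshape(4, 5)`, the call `rect_sum(integral_image(channel), 1, 2, 2, 3)` gives the same value as `channel[1:3, 2:5].sum()`, regardless of the rectangle size.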

The setting unit 140 sets a search window on the input image. The order in which the search window scans the image is determined in advance.
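One possible predetermined scanning order is a raster scan, sketched below. The raster order, the `stride` parameter, and the function name are assumptions for illustration; the text only states that the order is fixed in advance.

```python
def search_windows(image_height, image_width, win_height, win_width, stride=4):
    """Yield (top, left, height, width) for each window position, row by row."""
    for top in range(0, image_height - win_height + 1, stride):
        for left in range(0, image_width - win_width + 1, stride):
            yield top, left, win_height, win_width
```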

The calculation unit 150 calculates the feature quantity (rectangular feature quantity) of each channel in units of the rectangular area corresponding to the search window set by the setting unit 140. For each channel, the calculation unit 150 calculates the rectangular feature quantity based on one or more integral images generated by the generation unit 130. The specific method by which the calculation unit 150 calculates the rectangular feature quantities is described later.

The identification unit 160 identifies the presence or absence of the object in the search window based on the rectangular feature quantity of each channel calculated by the calculation unit 150. In the present embodiment, the identification unit 160 includes a normalization unit 162 and a discrimination processing unit 164. The normalization unit 162 normalizes the rectangular feature quantity of the luminance gradient calculated by the calculation unit 150, that is, the gradient feature quantity. In other words, the normalization unit 162 derives a normalized gradient feature quantity as the identification feature quantity for the luminance gradient. The specific normalization method used by the normalization unit 162 is also described later.

The discrimination processing unit 164 performs discrimination processing based on the rectangular feature quantity of each channel. The discrimination processing is the processing performed at each node of a weak classifier and is based on a threshold set in advance by training. For the luminance gradient channels, the discrimination processing unit 164 performs the discrimination processing using the normalized gradient feature quantity; for the other channels, it performs the discrimination processing based on the rectangular feature quantity calculated by the calculation unit 150.
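The channel-dependent choice of feature at a node might look like the following sketch. The string labels for channel kinds, the `eps` guard against division by zero, and the direction of the threshold comparison are hypothetical details added for illustration.

```python
def identification_feature(channel_kind, rect_feature, intensity_feature=None,
                           eps=1e-12):
    """Pick the feature a node compares against its threshold.

    Gradient-direction channels use the rectangle sum divided by the
    gradient-intensity rectangle sum; all other channels use the raw sum.
    """
    if channel_kind == "gradient":
        if intensity_feature is None:
            raise ValueError("gradient channels need the intensity feature")
        return rect_feature / max(intensity_feature, eps)  # eps: assumed guard
    return rect_feature

def node_passes(feature, threshold):
    """Node test against a trained threshold (comparison direction assumed)."""
    return feature >= threshold
```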

The determination unit 170 determines the position of the pedestrian in the input image based on the identification result obtained by the identification unit 160 for each search window. The determination unit 170 outputs information specifying the position of the pedestrian to the output unit 180.

The output unit 180 outputs the determination result of the determination unit 170. The output unit 180 is realized by, for example, the display unit 16. In that case, each search window determined (identified) as containing a pedestrian is shown as a rectangular frame on the display unit 16. The output unit 180 may also be realized by, for example, the communication I/F 18. In that case, the determination result may be transmitted to an external device connected via a network.

The functional blocks shown in FIG. 2 other than the image storage unit 110 and the output unit 180 may be realized by the CPU 11 shown in FIG. 1 executing software stored in, for example, the hard disk 14, or at least one of them may be realized by hardware.

In the present embodiment, the method for calculating (deriving) the feature quantities used in the identification processing by the identification unit 160 differs between the luminance gradient channels and the other channels. This is explained specifically with reference to FIG. 3 and FIG. 4.

FIG. 3 is a conceptual diagram showing the method for calculating the identification feature quantities for color information and gradient intensity.

Referring to FIG. 3, image 31 is the input image. Images 32 to 36, ... conceptually represent the reduced images that would be generated at fine scale intervals under the ICF described above. In the present embodiment, reduced images 36, ... are generated at sparse scale intervals in accordance with the FPDW described above. That is, reduced images 32 to 35 are not generated in the present embodiment. The generation unit 130 extracts color information (three channels) and gradient intensity information (one channel) from the input image 31 and generates "integral image 1" for each channel. Likewise, for the reduced image 36, it generates "integral image 6" for each of the color information and gradient intensity channels. In ICF, all of integral images 1 to 6 corresponding to the respective scales would be generated.

A search window 40 of the same position and size is set for each of the images 31 and 36. Assume that rectangular regions 51 and 52 of integral images 1 and 6, respectively, correspond to the search window 40. In this case, the calculation unit 150 calculates a first intensity feature quantity from, for example, rectangular region 51 of integral image 1 of the gradient intensity (the intensity integral image at the same scale as the input image). That is, the sum (integrated value) over rectangular region 51 is calculated as the first intensity feature quantity. Similarly, a second intensity feature quantity is calculated from rectangular region 52 of integral image 6 of the gradient intensity (an intensity integral image at a scale different from that of the input image). That is, the sum (integrated value) over rectangular region 52 is calculated as the second intensity feature quantity. The calculation unit 150 estimates a multiplier from these two intensity feature quantities and processes "integral image 3" on the basis of integral image 1. Integral image 3 is the integral image at a scale between the scale of integral image 1 and the scale of integral image 6 (hereinafter, the "intermediate scale"). That is, by multiplying the first intensity feature quantity by an experimentally obtained fixed multiplier, the intensity feature quantity of integral image 3 (a third intensity feature quantity), which is not actually generated, is calculated approximately. For each channel of the color information, color feature quantities for a plurality of scales are calculated in the same way. For the color information and gradient intensity channels, the rectangular feature quantities for the plurality of scales calculated as described above serve as the identification feature quantities. By estimating the rectangular feature quantity of integral image 3 without actually generating it, the number of integral images to be generated can be made smaller than with ICF.
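The rectangle sums and the intermediate-scale estimate described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the 2x2 averaging used to build the reduced image, and the log-linear interpolation in `estimate_intermediate` are all assumptions introduced here, since the text does not state how the multiplier is derived.

```python
import numpy as np

def integral_image(channel):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((channel.shape[0] + 1, channel.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = channel.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the channel over a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def estimate_intermediate(f_first, f_second, t):
    """Assumed log-linear interpolation between two measured scales (0 <= t <= 1)."""
    return f_first * (f_second / f_first) ** t

rng = np.random.default_rng(0)
magnitude = rng.random((64, 32))                             # stand-in gradient intensity channel
reduced = magnitude.reshape(32, 2, 16, 2).mean(axis=(1, 3))  # hypothetical half-scale image

window = (8, 4, 16, 8)                                       # same position and size on both images
f1 = rect_sum(integral_image(magnitude), *window)            # first intensity feature quantity
f6 = rect_sum(integral_image(reduced), *window)              # second intensity feature quantity
f3 = estimate_intermediate(f1, f6, 0.4)                      # approximated, no third integral image built
```

Only two integral images are built per channel here; the value for the intermediate scale is obtained by arithmetic on the two measured sums.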

FIG. 4 conceptually illustrates the method of calculating the identification feature quantity for the luminance gradient.

Referring to FIG. 4, image 31 is the input image. For the luminance gradient, none of the reduced images 32 to 36, ... is generated. Instead, an integral image for each angle is generated from the input image 31 alone. Assume that rectangular region 61 of this gradient integral image, which has the same scale as the input image, corresponds to the same search window 40 as in FIG. 3. In this case, the calculation unit 150 calculates the sum (integrated value) over rectangular region 61 of the gradient integral image as the gradient feature quantity. In the present embodiment, this gradient feature quantity is divided by the sum (integrated value) over rectangular region 51 of the same position and size, calculated from the intensity integral image at the same scale as the input image (corresponding to integral image 1 in FIG. 3). The result is a normalized gradient feature quantity, which serves as the identification feature quantity for the luminance gradient.

The intensity feature quantity of rectangular region 51 represents the total luminance gradient within search window 40. The normalized gradient feature quantity for a given gradient direction therefore represents the proportion of rectangular region 61 accounted for by that gradient direction. This proportion does not depend on the size of the rectangular region, that is, of the search window. Consequently, for the luminance gradient channels, a scale-independent identification feature quantity can be obtained without generating reduced images at multiple scales and calculating multiple rectangular feature quantities.
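The scale independence argued above can be checked numerically with a small sketch. The setup is an idealization introduced here for illustration: the six orientation channels are random non-negative arrays, the gradient intensity is taken as their sum, and the larger scale is simulated by plain pixel replication with `np.kron`. All function names are hypothetical.

```python
import numpy as np

def integral_image(channel):
    """Summed-area table with a zero row and column prepended."""
    return np.pad(channel.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))

def rect_sum(ii, top, left, height, width):
    """Rectangle sum from four lookups in the summed-area table."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def normalized_gradient_feature(gradient_ii, intensity_ii, rect):
    """Gradient rectangle sum divided by the intensity sum of the same rectangle."""
    total = rect_sum(intensity_ii, *rect)
    return rect_sum(gradient_ii, *rect) / total if total > 0 else 0.0

rng = np.random.default_rng(0)
oriented = rng.random((6, 32, 16))     # stand-in for six gradient-direction channels
magnitude = oriented.sum(axis=0)       # stand-in for the gradient intensity channel
rect = (4, 2, 8, 6)                    # top, left, height, width

ratios = [normalized_gradient_feature(integral_image(ch), integral_image(magnitude), rect)
          for ch in oriented]

# Same scene at twice the size: every pixel replicated into a 2x2 block.
up = np.ones((2, 2))
rect_2x = (8, 4, 16, 12)
ratios_2x = [normalized_gradient_feature(integral_image(np.kron(ch, up)),
                                         integral_image(np.kron(magnitude, up)), rect_2x)
             for ch in oriented]
```

Under this idealization, both the numerator and the denominator grow by the same factor when the window is enlarged, so `ratios` and `ratios_2x` agree; with real resampling the agreement would only be approximate.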

FIG. 5 conceptually illustrates the identification processing according to the embodiment of the present invention.

Referring to FIG. 5, in the tree-structured weak classifiers 70A, 70B, ..., 70N, the nodes for the luminance gradient channels are shown hatched. For example, in the tree-structured weak classifier 70N, of the nodes N1 to N7, nodes N3 and N7 are luminance gradient channels, and the other nodes N1, N2, and N4 to N6 are each either a color information channel or the gradient intensity channel.

<Operation>
FIG. 6 is a flowchart showing the object detection processing according to the embodiment of the present invention. The object detection processing shown in FIG. 6 is realized by the CPU 11 reading out and executing a program stored in advance in, for example, the hard disk 14.

Referring to FIGS. 2 and 6, first, the extraction units 122, 124, and 126 included in the conversion unit 120 extract color information, luminance gradients, and gradient intensity, respectively, from the input image stored in the image storage unit 110 (step S2). As a result, an LUV image (3 channels), luminance gradient images for six directions (6 channels), and a gradient intensity image (1 channel) are generated from the input image. Next, the generation unit 130 generates, for each channel, an integral image at the same scale as the input image (step S4). For the color and gradient intensity channels, the generation unit 130 further generates a plurality of reduced integral images, for example at scale intervals in accordance with the FPDW method described above (step S6). The generated integral images are temporarily stored in, for example, the RAM 13.
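Steps S2 and S4 for the gradient-related channels might look like the sketch below. Several details are assumptions made for illustration only: the gradient is taken with `np.gradient`, orientation is treated as unsigned and hard-assigned to one of six equal bins, and the LUV conversion is omitted. The text does not specify any of these choices.

```python
import numpy as np

def gradient_channels(luminance, n_bins=6):
    """Return the gradient intensity and n_bins gradient-direction channels."""
    gy, gx = np.gradient(luminance.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)            # unsigned orientation in [0, pi)
    bins = np.minimum((angle / np.pi * n_bins).astype(int), n_bins - 1)
    oriented = np.zeros((n_bins,) + luminance.shape)
    for b in range(n_bins):
        mask = bins == b
        oriented[b][mask] = magnitude[mask]              # each pixel goes to exactly one bin
    return magnitude, oriented

def integral_image(channel):
    """Summed-area table with a zero row and column prepended."""
    return np.pad(channel.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))

rng = np.random.default_rng(1)
luminance = rng.random((48, 24))                         # stand-in for the input image
magnitude, oriented = gradient_channels(luminance)

# One integral image per channel, all at the scale of the input image (step S4).
gradient_integrals = [integral_image(ch) for ch in oriented]
intensity_integral = integral_image(magnitude)
```

With hard binning, the six direction channels sum back to the gradient intensity at every pixel, which is what makes the intensity rectangle sum a natural divisor for each direction.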

Next, when the setting unit 140 sets a search window, the calculation unit 150 extracts the rectangular feature quantities within the search window (step S8). For each luminance gradient channel, the calculation unit 150 calculates a gradient feature quantity from the single integral image generated in step S4. For each color information channel, it calculates color feature quantities for a plurality of scales from the integral image generated in step S4 and at least one reduced integral image generated in step S6. From these calculated color feature quantities, it then approximately calculates (estimates) the color feature quantity at an intermediate scale. For the gradient intensity channel, intensity feature quantities for a plurality of scales are calculated and estimated in the same way as for the color information.

Once the rectangular feature quantities have been extracted, the identification unit 160 executes the identification processing (step S10). The identification processing is described as a subroutine with reference to FIG. 7. This identification processing corresponds to the processing performed by each of the weak classifiers 70A to 70N in FIG. 5.

FIG. 7 is a flowchart showing the identification processing in the embodiment of the present invention.

Referring to FIG. 7, the identification unit 160 determines whether the channel to be discriminated is a luminance gradient channel (step S102). If it is, the processing proceeds to step S104; otherwise, it proceeds to step S108.

In step S104, the normalization unit 162 divides the gradient feature quantity, calculated in step S8 from the gradient integral image at the same scale as the input image, by the intensity feature quantity for the same rectangular region at the same scale, and thereby obtains a normalized feature quantity. The discrimination processing unit 164 executes the discrimination processing using this normalized feature quantity (step S106). The discrimination processing is the processing performed at the nodes N1 to N7 shown in FIG. 5.

In step S108, the discrimination processing unit 164 executes the discrimination processing using the feature quantities of each of the multi-scale integral images, on the basis of the FPDW method.

The above processing is repeated until the discrimination processing unit 164 has finished the discrimination processing for all levels of the tree (NO in step S110). When the discrimination processing for all levels has finished (YES in step S110), the identification unit 160 identifies whether a pedestrian is present in the search window (rectangular region) (step S112). If identification by all nodes (weak classifiers) has not finished (NO in step S114), the processing returns to step S102 and the above processing is repeated. When identification by all nodes has finished (YES in step S114), a score is calculated using a predetermined formula (step S116). Once the score has been calculated, processing returns to the main routine. The nodes representing the weak classifiers are connected in a soft cascade as described above, and the weak classification processing of steps S102 to S112 is performed in the order of that connection.
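The flow of FIG. 7 can be sketched roughly as below. This is an illustrative simplification under stated assumptions, not the classifier of the embodiment: the `Node` layout, the channel names, the leaf votes, the rejection thresholds, and the use of a plain running sum as the score are all hypothetical, and the multi-scale handling of step S108 is reduced to a single feature lookup.

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Node:
    channel: str                    # hypothetical names: "grad0".."grad5", "mag", "L", "U", "V"
    rect: tuple                     # (top, left, height, width) inside the search window
    threshold: float
    left: Union["Node", float]      # followed when feature < threshold; a float is a leaf vote
    right: Union["Node", float]

def node_feature(node, rect_sum_of):
    value = rect_sum_of(node.channel, node.rect)
    if node.channel.startswith("grad"):          # S102: is this a luminance gradient channel?
        total = rect_sum_of("mag", node.rect)    # S104: divide by the intensity feature
        return value / total if total > 0 else 0.0
    return value                                 # S108, simplified to a single scale

def evaluate_tree(tree, rect_sum_of):
    node = tree
    while isinstance(node, Node):                # descend until a leaf vote is reached (S110)
        feature = node_feature(node, rect_sum_of)
        node = node.left if feature < node.threshold else node.right
    return node

def soft_cascade_score(trees, reject_below, rect_sum_of):
    score = 0.0
    for tree, limit in zip(trees, reject_below):
        score += evaluate_tree(tree, rect_sum_of)
        if score < limit:                        # early exit: window rejected
            return None
    return score                                 # assumed score: sum of leaf votes (S116)

# Toy usage with made-up rectangle sums.
r = (0, 0, 4, 4)
sums = {("grad0", r): 3.0, ("mag", r): 4.0, ("L", r): 10.0}
lookup = lambda channel, rect: sums[(channel, rect)]
trees = [Node("grad0", r, 0.5, -1.0, 1.0), Node("L", r, 20.0, 0.5, -0.5)]
score = soft_cascade_score(trees, (-0.5, 0.0), lookup)
```

Passing the rectangle sums in through a callable keeps the cascade independent of how the integral images are stored.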

Referring again to FIG. 6, when the identification processing for the set search window has finished, it is determined whether the entire area has been covered (step S12), that is, whether the identification processing has finished for all search windows. If not (NO in step S12), the setting unit 140 moves the search window (step S14), and the feature quantity extraction (step S8) and identification processing (step S10) described above are repeated.

When the identification processing has finished for all search windows (YES in step S12), the determination unit 170 determines, on the basis of the results of the identification processing, whether a pedestrian is present in the input image (step S16). If it determines that a pedestrian is present, it specifies the position (the position of the search window) and passes it to the output unit 180. The determination result of the determination unit 170 is then output by the output unit 180 (step S18).
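The outer loop of FIG. 6 (steps S8 to S16) amounts to sliding the search window over the image and collecting the windows that pass identification. The sketch below assumes a fixed stride, a scoring callable that returns `None` for rejected windows, and an acceptance threshold; these are illustrative choices and none of them is specified in the text.

```python
def scan(image_height, image_width, win_h, win_w, stride, score_window, accept_at=0.0):
    """Move the search window over the whole image and keep accepted positions."""
    detections = []
    for top in range(0, image_height - win_h + 1, stride):       # S12/S14: move until all covered
        for left in range(0, image_width - win_w + 1, stride):
            score = score_window(top, left, win_h, win_w)        # S8 + S10 for this window
            if score is not None and score >= accept_at:
                detections.append((top, left, score))            # S16: position of a detection
    return detections

# Toy scorer: pretends a pedestrian is found only in the window at (8, 4).
def toy_score(top, left, height, width):
    return 1.0 if (top, left) == (8, 4) else None

found = scan(32, 16, 16, 8, 4, toy_score)
```

In practice the scorer would wrap the feature extraction and the cascade evaluation for the given window.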

This completes the object detection processing.

As described above, in the present embodiment the gradient feature quantity is corrected (normalized) not by a simple fixed constant but by a physically related value. The object detection device 1 according to the present embodiment can therefore handle arbitrary inputs without prior experiments. In addition, because only one integral image needs to be generated for each luminance gradient channel, a speed-up equal to or greater than that of FPDW can be achieved. Furthermore, the divisor used to normalize the gradient feature quantity is itself the feature quantity of one of the channels used for pedestrian detection, so the load of the normalization processing is also kept low.

FIG. 8 is a graph showing the pedestrian detection accuracy when evaluated on the INRIA dataset. The horizontal axis of the graph is the false detection rate (false positives per image), and the vertical axis is the miss rate. Curve L1 plots the false detection rate obtained when pedestrian detection is performed according to the present embodiment. Curve L2 plots the false detection rate of the FPDW method, and curve L3 that of the ICF method. Curves L2 and L3 are cited from evaluations on the INRIA dataset from 2012.

As shown in FIG. 8, when pedestrian detection accuracy is evaluated using the false detection rate as the index, the pedestrian detection method according to the present embodiment achieves the world's highest detection accuracy, exceeding the conventional methods at every false detection rate. For example, at a false detection rate of 10^-1, that is, at a detection threshold that produces one false detection when a full-image search is run on 10 input images, a detection accuracy of approximately 85% is achieved. The object detection device of the present embodiment thus realizes a detection accuracy high enough for practical use in automotive applications, for which market demand is strong. In recent years in particular there has been a move toward installing pedestrian detection systems in many automobiles, so the economic effect is large. The device is also easy to apply to surveillance and human-machine interfaces as well as to automotive use, and is expected to have a good chance of adoption in those markets too.
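As a reading aid for the two axes of FIG. 8, the following sketch computes both quantities from raw counts. The function name and the counts in the usage example are invented purely for illustration and are not results from the evaluation.

```python
def fppi_and_miss_rate(false_positives, missed, ground_truth, images):
    """False positives per image and miss rate from raw detection counts."""
    fppi = false_positives / images
    miss_rate = missed / ground_truth
    return fppi, miss_rate

# Hypothetical counts: 1 false detection over 10 images, 3 of 20 pedestrians missed.
fppi, miss_rate = fppi_and_miss_rate(false_positives=1, missed=3, ground_truth=20, images=10)
detection_rate = 1.0 - miss_rate
```

With these made-up counts the operating point is 10^-1 false positives per image at a detection rate of 85%, which is the kind of point the paragraph above reads off the curve.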

In the present embodiment, the normalization processing is performed within the identification processing of the identification unit 160 when the channel to be discriminated is a luminance gradient channel, but the normalization may instead be performed in advance, before the identification processing begins. That is, the computation for normalization performed in step S104 of FIG. 7 may be carried out between step S8 (calculation of the rectangular feature quantities) and step S10 (identification processing) of FIG. 6.

Also, in the present embodiment a scale-free identification feature quantity is derived only for the luminance gradient channels, but the rectangular feature quantities of the color information may be treated in the same way: the proportion of the rectangular region accounted for by each color is calculated, and this value is used as a scale-free identification feature quantity. In this case, the identification feature quantity can be a normalized color feature quantity obtained by dividing the color feature quantity by the total of the colors in the same rectangular region, for example the pixel values within the rectangular region.

Alternatively, color information is auxiliary information for pedestrian detection and is known to contribute only 2 to 3% to pedestrian detection. Color information may therefore be left out of the channels used for pedestrian detection. In that case, a further speed-up of the pedestrian detection processing can be expected.

In the present embodiment, for the luminance gradient channels, one integral image at the same scale as the input image is generated for each input image, but the scale need not be the same. In that case, it suffices that the gradient feature quantity calculated from the gradient integral image is normalized by the intensity feature quantity calculated from the intensity integral image at the same scale as the gradient integral image (that is, the predetermined scale).

If characteristic information useful for pedestrian detection exists in addition to color, luminance gradient, and gradient intensity, that characteristic information may also be corrected on the basis of the same concept as the luminance gradient. Specifically, the rectangular feature quantity calculated from the integral image of a given channel may be corrected into a normalized feature quantity representing the proportion of the rectangular region accounted for by that channel. In other words, the identification feature quantity may be derived by dividing the rectangular feature quantity calculated by the calculation unit by a computed value representing the total of the predetermined characteristic information in the same rectangular region. In that sense, the present embodiment can also provide a method for deriving identification feature quantities for pedestrian detection.

In the present embodiment the object is a pedestrian, but for a detection device that uses luminance gradients the object may be another kind of object, such as a vehicle.

The object detection method and the identification feature quantity derivation method executed by the object detection device of the present embodiment can also be provided as a program. Such a program can be provided recorded on an optical medium such as a CD-ROM (Compact Disc-ROM) or on a computer-readable non-transitory recording medium such as a memory card. The program can also be provided by download over a network.

The program according to the present invention may be one that calls the necessary modules, from among the program modules provided as part of the computer's operating system (OS), in a predetermined sequence and at predetermined timings to execute the processing. In that case, the program itself does not contain those modules, and the processing is executed in cooperation with the OS. A program that does not contain such modules can also be included in the program according to the present invention.

The program according to the present invention may also be provided incorporated into part of another program. In that case too, the program itself does not contain the modules included in the other program, and the processing is executed in cooperation with the other program. A program incorporated into another program in this way can also be included in the program according to the present invention.

The embodiment disclosed here should be considered illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims rather than by the above description, and is intended to include all modifications within the meaning and scope equivalent to the claims.

DESCRIPTION OF SYMBOLS: 1 object detection device, 11 CPU, 14 hard disk, 15 operation unit, 16 display unit, 17 drive device, 17a recording medium, 18 communication I/F, 19 image input unit, 110 image storage unit, 120 conversion unit, 122 color information extraction unit, 124 luminance gradient extraction unit, 126 gradient intensity extraction unit, 130 generation unit, 140 setting unit, 150 calculation unit, 160 identification unit, 162 normalization unit, 164 discrimination processing unit, 170 determination unit, 180 output unit.

Claims

An object detection device for detecting an object from an input image, comprising:
generation means for generating, for each of a plurality of channels including a predetermined number of luminance gradient directions and a gradient intensity, a gradient integral image and an intensity integral image of a predetermined scale based on the input image;
calculation means for calculating, on the basis of the gradient integral image and the intensity integral image of the predetermined scale generated by the generation means, a gradient feature amount for each of the luminance gradient directions and an intensity feature amount for the gradient intensity, in units of rectangular areas corresponding to a search window; and
identification means for identifying the presence or absence of the object in the rectangular area based on the gradient feature amount and the intensity feature amount calculated by the calculation means,
wherein the identification means identifies the presence or absence of the object using feature amounts including a normalized gradient feature amount obtained by dividing the gradient feature amount calculated based on the gradient integral image of the predetermined scale by the intensity feature amount calculated based on the intensity integral image of the predetermined scale in the same rectangular area.

The object detection device according to claim 1, wherein:
the generation means further generates, for the gradient intensity channel, a reduced intensity integral image having a scale different from the predetermined scale based on the input image;
the calculation means further calculates, for the gradient intensity channel, an intensity feature amount based on the reduced intensity integral image; and
the identification means identifies the presence or absence of the object based on the intensity feature amounts for a plurality of scales in the same rectangular area when the channel to be discriminated is the gradient intensity.

3. The object detection apparatus according to claim 2, wherein the identification means includes normalization means for dividing the gradient feature amount by the intensity feature amount of the predetermined scale when the channel to be discriminated corresponds to a luminance gradient direction.

The plurality of channels further include color information;
The generation means generates, for the color information channel, multi-scale color integral images including the predetermined scale based on the input image;
For the color information channel, the calculation means calculates a color feature amount in units of the rectangular area based on each of the generated multi-scale color integral images;
The identification means identifies the presence or absence of the object based on the color feature amounts for a plurality of scales in the same rectangular area when the channel to be discriminated is the color information. The object detection apparatus as described.

The plurality of channels further include color information;
The calculation means calculates, for the color information channel, a color feature amount in units of the rectangular area based on a color integral image of the predetermined scale based on the input image;
The identification means identifies the presence or absence of the object using a normalized color feature amount obtained by dividing the color feature amount by a calculated value indicating the sum of the colors in the target rectangular area. 2. The object detection apparatus described in 1.

The object detection apparatus according to claim 5, wherein the calculated value is a pixel value in the rectangular area.

Determination means for determining a position of the object in the input image based on an identification result for each rectangular area by the identification means;
The object detection apparatus according to claim 1, further comprising an output unit that outputs a determination result by the determination unit.

An object detection method for detecting an object from an input image, comprising the steps of:
generating, for each of a plurality of channels including a predetermined number of luminance gradient directions and a gradient intensity, a gradient integral image and an intensity integral image of a predetermined scale based on the input image;
calculating, based on the generated gradient integral image and intensity integral image of the predetermined scale, a gradient feature amount for each of the luminance gradient directions and an intensity feature amount for the gradient intensity, in units of rectangular areas corresponding to a search window; and
identifying the presence or absence of the object in the rectangular area based on the calculated gradient feature amount and intensity feature amount,
wherein the identifying step includes identifying the presence or absence of the object using feature amounts including a normalized gradient feature amount obtained by dividing the gradient feature amount calculated based on the gradient integral image of the predetermined scale by the intensity feature amount calculated based on the intensity integral image of the predetermined scale in the same rectangular area.

An object detection program for causing a computer to execute each step included in the object detection method according to claim 8.

A method for deriving a feature for identification for detecting an object from an input image,
Generating an integral image of each of a plurality of channels indicating predetermined characteristic information based on the input image;
Calculating a rectangular feature amount of each channel for each rectangular region corresponding to the search window based on the generated integral image;
A feature amount deriving method comprising: calculating the identification feature amount by dividing the calculated rectangular feature amount by an operation value indicating a sum of the predetermined characteristic information in the same rectangular region.

A feature quantity deriving program for causing a computer to execute each step included in the feature quantity deriving method according to claim 10.