JP2007049545A

JP2007049545A - Image composing apparatus, image composing method, and program

Info

Publication number: JP2007049545A
Application number: JP2005233373A
Authority: JP
Inventors: Masaaki Sasaki; 雅昭佐々木; Rei Hamada; 玲浜田; Shinichi Matsui; 紳一松井
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2005-08-11
Filing date: 2005-08-11
Publication date: 2007-02-22
Anticipated expiration: 2025-08-11
Also published as: JP4640032B2

Abstract

PROBLEM TO BE SOLVED: To generate a composite image while reducing the computational complexity required for image processing and maintaining sufficient tracking accuracy.

SOLUTION: A pyramid hierarchy is constructed by reducing the images to be composited stepwise, with a prescribed reduced size of the original image as the uppermost layer. A feature point is then tracked sequentially from the lowermost layer toward the uppermost layer. During tracking, the motion of the feature point is made to converge by multiplying the vector denoting that motion by a prescribed coefficient. Computational complexity is thus reduced because no image processing at the original image size is needed when tracking the feature point. Moreover, multiplying the motion vector of the feature point by the prescribed coefficient improves convergence, so the composite image is generated while sufficient tracking accuracy is maintained.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image composition device used in an imaging device such as a digital camera, and to an image composition method and program.

In recent years, a method has been considered for digital cameras in which continuous shooting is performed with an exposure time short enough to avoid camera shake, and the plurality of images obtained are superimposed to produce a sharp image that is free of blur and has reduced noise.

Even if the interval between frames in continuous shooting is short, a positional deviation may arise between the camera and the subject during that time. Therefore, when the images obtained by continuous shooting are superimposed and combined, this positional deviation must be corrected before superimposition.

For example, Patent Document 1 discloses a technique for calculating an optical flow from two images and aligning the two images based on that optical flow. Optical flow is the apparent motion of an object as seen from the camera.

One method of estimating optical flow is the gradient method, which calculates motion from the gradient (first derivative) of the pixel values (brightness). That is, optical flow estimation by the gradient method derives a relation between the spatio-temporal derivatives and the optical flow from the assumption that the brightness of a point on an object does not change after it moves, and uses that relation to estimate the motion of the target.
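The relation between the spatio-temporal derivatives and the optical flow is commonly written as the optical flow constraint equation. The form below is the standard textbook statement, not a formula quoted from this document; the symbols I (brightness), (u, v) (flow per frame), and t (frame time) are chosen here for illustration.

```latex
I(x + u,\; y + v,\; t + 1) = I(x, y, t)
\quad\Longrightarrow\quad
\frac{\partial I}{\partial x}\,u + \frac{\partial I}{\partial y}\,v + \frac{\partial I}{\partial t} \approx 0
```

The right-hand relation follows from a first-order Taylor expansion of the left-hand side, which is why the method needs only first derivatives of the pixel values.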

In image composition, feature points are tracked across the images using this gradient method to determine the projective transformation matrix H between the two images, and the matrix H is then used to correct the positional deviation between the images before they are superimposed.

More specifically, as shown in FIG. 8 for example, when image 1 and image 2 are to be combined, a predetermined number of feature points are first extracted at random from image 1, which serves as the reference, and these feature points are tracked by the gradient method on image 2, which is to be superimposed on image 1. If P1, P2, and P3 in the figure are the feature points extracted from image 1, the locations of these three feature points are tracked in image 2.

Once the feature points P1, P2, and P3 have been tracked, the projective transformation matrix H is calculated from the relationship between their coordinate positions on image 2 and on image 1. All feature points on image 1 are then shifted according to the matrix H (image 1′), and if they coincide with the corresponding feature points on image 2 to a certain degree of accuracy (image 1′ + image 2), the matrix H is used to correct the positional deviation between image 1 and image 2 and superimpose them.
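The check described here, shifting the feature points of image 1 by H and seeing how many coincide with their tracked positions in image 2, can be sketched as follows. This is a minimal illustration, assuming H is a 3x3 projective transformation given as nested lists; the function names, the one-pixel tolerance, and the sample data are hypothetical and do not come from the patent.

```python
def apply_projective(h, point):
    """Map (x, y) through a 3x3 projective matrix h given as nested lists."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (xh / w, yh / w)


def count_matches(h, points_1, points_2, tolerance=1.0):
    """Count image-1 points that land within `tolerance` pixels of their
    tracked image-2 position after being shifted by h."""
    matches = 0
    for p1, p2 in zip(points_1, points_2):
        qx, qy = apply_projective(h, p1)
        if (qx - p2[0]) ** 2 + (qy - p2[1]) ** 2 <= tolerance ** 2:
            matches += 1
    return matches


# Hypothetical data: H is a pure translation by (3, -2) pixels.
H = [[1.0, 0.0, 3.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
pts1 = [(10.0, 10.0), (50.0, 20.0), (30.0, 80.0)]
pts2 = [(13.0, 8.0), (53.0, 18.0), (40.0, 80.0)]  # last point was mistracked
score = count_matches(H, pts1, pts2)  # two of the three points agree with H
```

A count like `score` is one plausible form of evaluation value for comparing candidate matrices; the document does not specify how its evaluation value is computed.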

In practice, this extraction and tracking of feature points is repeated several times, and the projective transformation matrix H with the highest evaluation value is finally selected.

Incidentally, tracking feature points one by one over all the pixels of an image requires a large amount of computation and takes a very long time. Therefore, a method is used in which images of progressively lower resolution are prepared in advance and the feature points are tracked in order from the lowest-resolution layer toward the highest-resolution layer. This is called pyramidization.

FIG. 9 is a diagram for explaining a conventional method of tracking feature points by pyramidization. In the figure, k is the layer parameter and takes values k = 0 to m; f represents the original image size, and ∂f/∂x and ∂f/∂y represent the first derivatives at the point (x, y).

First, a noise removal filter is applied to the original image, and the result is taken as layer 0. Layer 0 is then subsampled (reduced by thinning out pixels) to give layer 1, layer 1 is subsampled to give layer 2, and so on, forming a pyramid hierarchy up to layer m.
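The layer construction can be sketched as below. This is an illustration under stated assumptions: images are plain nested lists of pixel values, subsampling keeps every second pixel in each direction, and the noise removal filter applied before layer 0 is omitted. The function names and the test image are hypothetical.

```python
def subsample(image):
    """Halve width and height by keeping every second pixel in each direction."""
    return [row[::2] for row in image[::2]]


def build_pyramid(layer0, m):
    """Return [layer 0, layer 1, ..., layer m]; each layer is half the
    side length of the one before it."""
    layers = [layer0]
    for _ in range(m):
        layers.append(subsample(layers[-1]))
    return layers


# Hypothetical 8x8 test image whose pixel value encodes its position.
image = [[10 * y + x for x in range(8)] for y in range(8)]
pyramid = build_pyramid(image, m=2)
sizes = [(len(layer), len(layer[0])) for layer in pyramid]
```

With m = 2 the layers shrink from 8x8 to 4x4 to 2x2, so the lowest layer holds one sixteenth of the pixels of layer 0.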

Feature point tracking starts at the lowest layer m. A differential filter is applied to the images of layer m through the highest layer 0, and the motion vector V of each feature point is obtained by the gradient method described above.

That is, when a feature point of image 1 (the reference image) is tracked in image 2, which follows image 1, the tracking start point Y in image 2 is first set to Y = X at the lowest layer, and the motion vector V of the feature point is obtained by the gradient method. The point is then updated to Y = Y + V, and the motion vector V is obtained again by the gradient method from the updated point. The point is updated to Y = Y + V in the same way. This is repeated, and the calculation ends when V has become sufficiently small or when a predetermined number of iterations has been performed.

Next, the same tracking is performed one layer up. The image size increases each time the layer goes up by one. Assuming that each side of the image doubles per layer, X and Y obtained in the lower layer are doubled and used as initial values, and feature point tracking (motion vector calculation) is repeated.

Finally, the feature points are tracked up to the highest layer while the motion vectors are passed on from layer to layer. The projective transformation matrix H described above is then calculated from the relative relationship between the finally obtained positions of the feature points on image 2 and the positions of the feature points on image 1.

Specific processing operations are shown in FIGS. 10 to 12. FIG. 10 is a flowchart showing the overall flow of conventional image composition processing. FIG. 11 is a flowchart showing the feature point extraction processing included in that image composition processing, and FIG. 12 is a flowchart showing the motion vector calculation processing included in that feature point extraction processing.

The following description assumes that a plurality of consecutive images obtained by continuous shooting are stored in a memory (not shown), and that the images are read out sequentially from the memory and superimposed to create a single composite image.

In the flowchart of FIG. 10, steps D11 to D19 are the pyramidization processing for the reference image, and steps D20 to D27 are the pyramidization processing for the image to be superimposed on the reference image.

First, the image that serves as the reference for image composition is read from the memory (step D11). As shown in FIG. 9, the layer number is set to k = 0 (step D12), and a noise removal filter is applied to the reference image to generate the layer 0 image (step D13). After feature points are extracted from the layer 0 image (step D14), a differential filter is applied to the image to obtain the gradient of the pixel values (step D15).

Next, k is incremented by 1 (step D16), and the layer 1 image is created by thinning the layer 0 image to 1/2 size vertically and horizontally with a subsampling filter (step D17). A differential filter is then applied to the layer 1 image to obtain the gradient of the pixel values (step D18).

Thereafter, the same processing is repeated while k is updated, until k = m (step D19). As a result, as shown in FIG. 9, hierarchical images are created with layer 0 as the highest level and layer m as the lowest level.

Next, the image to be superimposed on the reference image is read from the memory (step D20), k is set to 0 (step D21), and the same pyramidization processing is performed on that image (steps D22 to D27) to create the images of layers 0 to m.

Once the reference image and the image to be superimposed on it have been pyramidized in this way, feature point tracking is performed between the corresponding layer images (step D28), and the two images are aligned and combined based on the coordinate positions of the feature points obtained by the tracking (step D29). This is repeated for a predetermined number of images (step D30) to create a single composite image.

Next, the feature point tracking processing executed in step D28 will be described.

As shown in the flowchart of FIG. 11, i is first set to 0 (step E11). Here i is the number of the feature point being processed and takes values from 0 to n.

Let Xi be the position of the feature point in the reference image and Yi be the position (search position) of the feature point in the image to be superimposed on the reference image (hereinafter, the tracked image). The feature point positions at layer m can then be expressed as follows (step E12).

Xi = Xi / 2^m
Yi = Xi

Because feature point tracking starts from the lowest layer, k is set to m (step E13). The iteration count j (that is, the number of times the feature point is tracked in each layer) is set to its initial value 0 (step E14), and the motion vector Vj of feature point i on the tracked image is calculated (step E15).

As shown in the flowchart of FIG. 12, the motion vector Vj is calculated from Xi and Yi (step F11). At the start of tracking, Yi = Xi and Vj = 0. With this point as the tracking start point, the motion vector Vj is obtained by the gradient method, and the point is updated to Yi = Yi + Vj (step F12).

If the magnitude of the Vj obtained at this time has converged to a predetermined value or less (Yes in step F13), the processing ends. If it is larger than the predetermined value (No in step F13), the processing returns to step F11 and the same calculation is repeated. The iteration count j is incremented by 1 each time (step F15), and the processing ends when j reaches the preset maximum number of iterations (Yes in step F14).
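The loop of steps F11 to F15 can be illustrated with a one-dimensional analogue. This is a sketch under stated assumptions, not the patent's implementation: the signals are continuous functions rather than pixel arrays, the motion estimate is a least-squares gradient step over a small window, and the names `track_1d` and `bump`, the window size, and the thresholds are all hypothetical.

```python
import math


def track_1d(f1, f2, x, half_window=2.0, samples=9, eps=1e-6, max_iter=20):
    """Track point x of signal f1 in signal f2 by repeated gradient steps.

    Returns (y, iterations), where y is the tracked position in f2.
    """
    offsets = [-half_window + 2.0 * half_window * s / (samples - 1)
               for s in range(samples)]
    y = x  # tracking starts at Y = X
    for j in range(1, max_iter + 1):
        num = 0.0
        den = 0.0
        for o in offsets:
            grad = f2(y + o + 0.5) - f2(y + o - 0.5)  # central difference, step 1
            diff = f1(x + o) - f2(y + o)               # brightness mismatch
            num += grad * diff
            den += grad * grad
        v = num / den        # least-squares motion estimate V (assumes den > 0)
        y += v               # update Y = Y + V
        if abs(v) <= eps:    # V small enough: treat as converged
            break
    return y, j


def bump(center):
    """Hypothetical smooth test signal: a Gaussian bump of width 3."""
    return lambda t: math.exp(-((t - center) ** 2) / 18.0)


# The second signal is the first one shifted by 1.5, so x = 10 should map near 11.5.
y, iterations = track_1d(bump(10.0), bump(11.5), x=10.0)
```

The two exit conditions mirror the flowchart: the `abs(v) <= eps` test plays the role of step F13, and the bound `max_iter` plays the role of step F14.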

Returning to FIG. 11, once the motion vector Vj of the feature point at layer m has been obtained, Xi and Yi are doubled as follows so that the feature point can be tracked one layer up (step E16).

Xi = Xi × 2
Yi = Yi × 2

The value of the layer parameter k is then decremented by 1 (step E18), the processing returns to step E14, and the same tracking is performed one layer up. At this point, tracking proceeds by taking over the value of the motion vector Vj obtained in the lower layer. When tracking has reached the highest layer 1 (Yes in step E17), the same processing is repeated for another feature point. The processing is complete when i = n is reached (Yes in step E19), that is, when tracking has been finished for the n feature points.

In the image composition processing of step D29, the projective transformation matrix H is obtained based on the coordinate positions of these feature points, and the two images (the reference image and the tracked image) are aligned and combined according to the matrix H.

JP 2001-167249 A

As described above, when feature points are tracked between images, a method is used in which each image is pyramidized and tracking proceeds in order from the lowest-resolution layer toward the highest-resolution layer.

However, when this method is implemented in software, many filter operations must be applied to high-resolution images, and the amount of processing becomes enormous.

The present invention has been made in view of the above, and its object is to provide an image composition device, an image composition method, and a program that can reduce the amount of computation required for image processing and create a composite image while maintaining sufficient tracking accuracy.

An image composition device according to claim 1 of the present invention tracks, in a second image, feature points extracted from a first image and aligns and combines the first image and the second image based on the coordinate positions of the feature points. The device comprises: pyramidization means for forming a pyramid hierarchy by reducing the first and second images stepwise, with a predetermined reduced size of the original image as the highest layer; feature point tracking means for tracking the feature points in order from the lowest layer formed by the pyramidization means toward the highest layer; and convergence means for converging the motion of the feature points by multiplying the vector indicating the motion of the feature points, obtained by the feature point tracking means, by a predetermined coefficient.

With this configuration, image processing at the original image size is unnecessary when tracking feature points, so the amount of computation is greatly reduced. In addition, multiplying the motion vector of a feature point by a predetermined coefficient improves convergence, so a composite image can be created while sufficient tracking accuracy is maintained.

According to claim 2 of the present invention, in the image composition device of claim 1, the pyramidization means forms the pyramid hierarchy by reducing the first and second images stepwise with 1/4 the size of the original image as the highest layer.

With this configuration, the feature points are tracked in order from the lowest layer and tracking is completed at 1/4 the size of the original image, so the feature point tracking processing can be made faster.

According to claim 3 of the present invention, in the image composition device of claim 1, the convergence means changes the predetermined coefficient stepwise, in the direction that improves convergence, according to the number of times the feature points have been tracked in each layer.

With this configuration, the motion vector of a feature point is prevented from diverging and can be made to converge stably within a predetermined number of tracking iterations.
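The effect of multiplying the motion vector by a coefficient that changes with the tracking count can be illustrated in one dimension. This is a sketch under loud assumptions: the text here gives no coefficient values, so the schedule 1.0, 0.5, 0.25 is hypothetical, as are the function names and the deliberately badly behaved motion estimator used to show the difference.

```python
def coefficient(j):
    """Hypothetical schedule: shrink the step as the iteration count j grows."""
    if j < 2:
        return 1.0
    if j < 4:
        return 0.5
    return 0.25


def track_damped(y0, estimate_motion, schedule=coefficient, max_iter=20, eps=1e-3):
    """Iterate Y = Y + c(j) * V until the scaled step is small or max_iter is hit."""
    y = y0
    for j in range(max_iter):
        v = schedule(j) * estimate_motion(y)  # motion vector times coefficient
        y += v
        if abs(v) <= eps:
            break
    return y


# Hypothetical estimator that always overshoots the true position by a factor of 2.
target = 5.0
def overshoot(y):
    return 2.0 * (target - y)


damped = track_damped(0.0, overshoot)                            # settles on the target
undamped = track_damped(0.0, overshoot, schedule=lambda j: 1.0)  # keeps oscillating
```

With the coefficient fixed at 1.0 the estimate jumps between 0 and 10 until the iteration limit is reached, whereas the stepwise schedule lets it settle on the target within the same limit.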

An image composition method according to claim 4 of the present invention tracks, in a second image, feature points extracted from a first image and aligns and combines the first image and the second image based on the coordinate positions of the feature points. The method comprises: a first step of forming a pyramid hierarchy by reducing the first and second images stepwise, with a predetermined reduced size of the original image as the highest layer; a second step of tracking the feature points in order from the lowest layer formed in the first step toward the highest layer; and a third step of converging the motion of the feature points by multiplying the vector indicating the motion of the feature points, obtained in the second step, by a predetermined coefficient.

With this image composition method, executing the processing according to each of the steps provides the same operation and effects as the invention of claim 1.

A program according to claim 5 of the present invention tracks, in a second image, feature points extracted from a first image and aligns and combines the first image and the second image based on the coordinate positions of the feature points. The program causes a computer to execute: a first function of forming a pyramid hierarchy by reducing the first and second images stepwise, with a predetermined reduced size of the original image as the highest layer; a second function of tracking the feature points in order from the lowest layer formed by the first function toward the highest layer; and a third function of converging the motion of the feature points by multiplying the vector indicating the motion of the feature points, obtained by the second function, by a predetermined coefficient.

Therefore, when a computer executes the program for realizing each of these functions, the same operation and effects as the invention of claim 1 are obtained.

According to the present invention, when feature points are tracked by pyramidization, the amount of computation required for image processing can be reduced and a composite image can be created while sufficient tracking accuracy is maintained.

Embodiments of the present invention will be described below with reference to the drawings.

FIG. 1 shows the external configuration of a digital camera taken as an example of an image composition device according to an embodiment of the present invention. FIG. 1(a) is a perspective view showing mainly the front, and FIG. 1(b) is a perspective view showing mainly the rear.

The digital camera 1 has a photographing lens 3, a self-timer lamp 4, an optical viewfinder window 5, a strobe light emitting unit 6, a microphone unit 7, and the like on the front of a substantially rectangular thin-plate body 2, and a power key 8, a shutter key 9, and the like at the right end (as seen by the user) of the top surface.

The power key 8 is operated each time the power is turned on or off, and the shutter key 9 is used to indicate the shooting timing when shooting.

On the rear of the digital camera 1 are provided a shooting mode (R) key 10, a playback mode (P) key 11, an optical viewfinder 12, a speaker unit 13, a macro key 14, a strobe key 15, a menu (MENU) key 16, a ring key 17, a set (SET) key 18, a display unit 19, and the like.

When operated while the power is off, the shooting mode key 10 automatically turns the power on and puts the camera into the still image shooting mode; when operated repeatedly while the power is on, it cycles between the still image mode and the moving image mode. The still image mode is for shooting still images, and the moving image mode is for shooting moving images.

The shutter key 9 is used in common in these shooting modes. In the still image mode, a still image is shot at the moment the shutter key 9 is pressed. In the moving image mode, shooting of a moving image starts when the shutter key 9 is pressed and ends when the shutter key 9 is pressed again.

When operated while the power is off, the playback mode key 11 automatically turns the power on and puts the camera into the playback mode.

The macro key 14 is operated to switch between normal shooting and macro shooting in the still image shooting mode. The strobe key 15 is operated to switch the light emission mode of the strobe light emitting unit 6. The menu key 16 is operated to select various menu items, including the continuous shooting mode. The ring key 17 is formed as a single unit containing keys for selecting items in the up, down, left, and right directions, and the set key 18 located at the center of the ring key 17 is operated to confirm the item selected at that time.

The display unit 19 is a color liquid crystal panel with a backlight. In the shooting mode it serves as an electronic viewfinder and displays a live through image; in the playback mode it plays back and displays a selected image or the like.

Although not shown, the bottom surface of the digital camera 1 is provided with a memory card slot for inserting and removing a memory card used as a recording medium, and with a serial interface connector, for example a USB (Universal Serial Bus) connector, for connecting to an external personal computer or the like.

FIG. 2 is a block diagram showing the electronic circuit configuration of the digital camera 1.

The digital camera 1 includes an optical lens device 21, an image sensor 22, a memory 23, a display device 24, an image processing device 25, an operation unit 26, a computer interface unit 27, an external storage IO device 28, a program code storage device 29, a CPU 30, and a memory card 31.

The optical lens device 21 includes a lens optical system, comprising a focus lens and a zoom lens (not shown) that constitute the photographing lens 3, and a drive unit for it, and focuses light from the subject onto the image sensor 22 to form an image.

The image sensor 22 captures the formed image as digitized image data and is composed of, for example, a CCD (Charge Coupled Device). The image sensor 22 is controlled by the CPU 30. While the shutter key 9 is not pressed, it generates low-resolution digital image data for preview and sends this data to the memory 23 periodically, at about 30 frames per second. When the shutter key 9 is pressed, it generates high-resolution image data and sends it to the memory 23. The imaging sensitivity (ISO sensitivity) of the image sensor 22 can be set by the CPU 30.

The memory 23 temporarily stores the low-resolution preview images and high-resolution image data from the image sensor 22, the data of original images to be processed by the image processing device 25, and the processed image data. The memory 23 sends the temporarily stored image data to the display device 24 or the image processing device 25.

The display device 24 displays images on the display unit 19, which is a liquid crystal monitor. It displays on the display unit 19 the low-resolution preview image or high-resolution image temporarily stored in the memory 23.

The image processing device 25 performs image processing, such as compression of image data, on the image data temporarily stored in the memory 23.

The operation unit 26 consists of the shutter key 9 together with the power key 8, the shooting mode key 10, the playback mode key 11, the macro key 14, the strobe key 15, the menu key 16, the ring key 17, the set key 18, and the like, and signals generated by operating these keys are sent directly to the CPU 30.

The computer interface unit 27 operates as a USB storage class driver when the digital camera 1 is connected to a computer (not shown). As a result, when connected to the digital camera 1, the computer handles the memory card 31 as its external storage device.

The external storage IO device 28 inputs and outputs image data and the like to and from the memory card 31. The memory card 31 stores the image data and the like supplied from the external storage IO device 28.

The program code storage device 29 stores the programs executed by the CPU 30 and is composed of a ROM, a flash memory, or the like.

The CPU 30 controls the entire system according to the programs stored in the program code storage device 29. The memory 23 is also used as the working memory of the CPU 30.

操作部２６のスイッチ・キーが押下されることにより、操作部２６から操作情報が送信されると、ＣＰＵ３０は、この操作情報に基づいて、イメージセンサ２２、メモリ２３、表示装置２４、画像処理装置２５等を制御する。 When a switch or key of the operation unit 26 is pressed and operation information is sent from the operation unit 26, the CPU 30 controls the image sensor 22, the memory 23, the display device 24, the image processing device 25, and so on based on that operation information.

具体的には、操作部２６から撮影モードキー１０が押下された旨の操作情報が送信されると、ＣＰＵ３０は各部を撮影モードに設定する。この状態で、シャッタキー９が押下されなければ、イメージセンサ２２をプレビューモードに設定し、シャッタキー９が押下されれば、解像度の高い撮影対象画像を読み込む高解像度モードに設定する。その際、メニューキー１６の操作により連続撮影モードが設定されていれば、シャッタキー９の押下に伴い、所定枚数分の画像の読み込み処理が所定時間間隔で実行される。 Specifically, when operation information indicating that the shooting mode key 10 is pressed is transmitted from the operation unit 26, the CPU 30 sets each unit to the shooting mode. In this state, if the shutter key 9 is not pressed, the image sensor 22 is set to the preview mode, and if the shutter key 9 is pressed, the high-resolution mode for reading a high-resolution image to be captured is set. At this time, if the continuous shooting mode is set by operating the menu key 16, a predetermined number of image reading processes are executed at predetermined time intervals as the shutter key 9 is pressed.

また、再生モードキー１１が押下された旨の操作情報が送信されると、ＣＰＵ３０は、各部を再生モードに設定する。 When the operation information indicating that the playback mode key 11 is pressed is transmitted, the CPU 30 sets each unit to the playback mode.

また、ＣＰＵ３０は、外部記憶ＩＯ装置２８を介してメモリカード３１に、プレビュー画像、高解像度の画像のデータを記録したり、メモリカード３１から、記録された画像データを読み出したりする。ＣＰＵ３０は、メモリカード３１には、例えば、ＪＰＥＧフォーマットで圧縮した画像データを記録する。 Further, the CPU 30 records preview image and high-resolution image data on the memory card 31 via the external storage IO device 28, and reads the recorded image data from the memory card 31. For example, the CPU 30 records image data compressed in the JPEG format in the memory card 31.

ＣＰＵ３０は、メモリ２３に画像データを一時記憶する際、プレビュー画像、高解像度の画像データを異なる記憶領域に記録する。また、ＣＰＵ３０は、メモリカード３１には、画像データを画像ファイルに分けて記録する。 When temporarily storing image data in the memory 23, the CPU 30 records the preview image and the high-resolution image data in different storage areas. On the memory card 31, the CPU 30 records the image data in separate image files.

また、ＣＰＵ３０は、外部記憶ＩＯ装置２８を介してメモリカード３１に、プレビュー画像、高解像度の画像のデータを記録したり、メモリカード３１から、記録された画像データを読み出したりする。ＣＰＵ３０は、メモリカード３１に画像データを格納する画像ファイルを作成する。 Further, the CPU 30 records preview image and high-resolution image data on the memory card 31 via the external storage IO device 28, and reads the recorded image data from the memory card 31. The CPU 30 creates an image file for storing image data in the memory card 31.

次に、前記構成のデジタルカメラ１による画像合成処理について説明する。 Next, image composition processing by the digital camera 1 having the above-described configuration will be described.

図９で説明したように、一般的なピラミッド化による方法では、元画像に対してノイズ除去フィルタをかけた画像をレイヤ０として、そのレイヤ０に対してサブサンプリングしたものをレイヤ１、レイヤ１に対してサブサンプリングしたものをレイヤ２…として順にレイヤｍまで求める。そして、画像間の特徴点を追跡する場合に、最下位のレイヤｍから順番に最も解像度が高い最上位のレイヤ０に向かって追跡を行う。 As described with reference to FIG. 9, the general pyramid method takes the original image with a noise removal filter applied as layer 0, subsamples layer 0 to obtain layer 1, subsamples layer 1 to obtain layer 2, and so on down to layer m. When feature points are tracked between images, tracking proceeds in order from the lowest layer m toward the top layer 0, which has the highest resolution.

ここで、当然のことながら、全てのフィルタ処理は解像度が高いほど多くなる。よって、図９のような方法では、元画像サイズであるレイヤ０に対してのフィルタ処理が最も多くなり、ソフトウエアで実装する場合にその処理量が膨大となり、また、追跡処理に時間がかかるなどの問題がある。 Naturally, the amount of filter processing grows with resolution. In the method of FIG. 9, the filter processing for layer 0, which has the original image size, is therefore the largest. When the method is implemented in software, the processing load becomes enormous and the tracking process takes a long time.

そこで、本実施形態では、図３に示すように、元画像の１／４サイズ（縦横１／２サイズ）を最上位のレイヤ（ｋ＝１）としてピラミッド階層を構成し、特徴点の抽出処理および追跡処理をこのピラミッド階層で行う。これにより、図３から明らかなように、各レイヤの画像を作るのに必要なフィルタ処理の演算量が大幅に削減され、追跡処理時間も短縮化されることになる。 In the present embodiment, therefore, as shown in FIG. 3, the pyramid hierarchy is configured with 1/4 the size of the original image (1/2 in each of height and width) as the top layer (k = 1), and both feature point extraction and tracking are performed in this pyramid hierarchy. As is apparent from FIG. 3, this greatly reduces the amount of filter computation needed to create the image of each layer and also shortens the tracking time.
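
The size of the saving can be estimated with a rough pixel count. The following is a minimal sketch, assuming that per-layer processing cost is proportional to the number of pixels in that layer; the image size (2048 × 1536) and the lowest layer m = 4 are illustrative values that do not come from the embodiment, and `pyramid_pixels` is a hypothetical helper.

```python
def pyramid_pixels(width, height, top, bottom):
    """Total pixel count over layers top..bottom, where layer k is
    (width / 2^k) x (height / 2^k). Assumes cost is proportional to pixels."""
    return sum((width >> k) * (height >> k) for k in range(top, bottom + 1))

# Illustrative values only (not taken from the embodiment).
w, h, m = 2048, 1536, 4

conventional = pyramid_pixels(w, h, 0, m)  # layers 0..m, as in FIG. 9
proposed = pyramid_pixels(w, h, 1, m)      # layers 1..m, as in FIG. 3
ratio = proposed / conventional

print(conventional, proposed, round(ratio, 3))
```

Under this assumption the layers of the proposed hierarchy hold a little under one quarter of the pixels of the conventional one. The count covers only per-layer processing such as the differential filter; the combined filter that produces layer 1 still has to read the full-size original once.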

ただし、その代償として、オプティカルフローの精度が低下する問題がある。これは、図１２で説明した特徴点の動きベクトルの計算（Ｙｉ＝Ｘｉ＋Ｖ０＋Ｖ１＋…＋Ｖｊ）で各Ｖｊが振動することが原因である。この問題点を解決するため、各ＶｊをＹｉに単純に加算していくのではなく、ｊ（繰り返し回数）によって異なる係数をＶｊにかけ、加算値を徐々に小さくしていくという方法を用いる。 The trade-off, however, is that the accuracy of the optical flow decreases. The cause is that each Vj oscillates in the calculation of the feature point motion vector described with reference to FIG. 12 (Yi = Xi + V0 + V1 + ... + Vj). To solve this problem, instead of simply adding each Vj to Yi, Vj is multiplied by a coefficient that depends on j (the iteration count), so that the added value is made progressively smaller.

以下に、図４乃至図６のフローチャートを用いて具体的な処理について説明する。なお、以下の各フローチャートで示される処理は、マイクロコンピュータであるＣＰＵ３０がプログラムコード記憶装置２９に記憶されたプログラムを読み込むことにより、そのプログラムに記述された手順に従って実行される。 Specific processing is described below with reference to the flowcharts of FIGS. 4 to 6. The processing shown in each flowchart is executed by the CPU 30, a microcomputer, which reads the program stored in the program code storage device 29 and follows the procedure described in that program.

図４はデジタルカメラ１の画像合成処理の全体の流れを示すフローチャートである。図５はその画像合成処理に含まれる特徴点抽出処理を示すフローチャート、図６はその特徴点抽出処理に含まれる動きベクトルの計算処理を示すフローチャートである。 FIG. 4 is a flowchart showing an overall flow of image composition processing of the digital camera 1. FIG. 5 is a flowchart showing a feature point extraction process included in the image composition process, and FIG. 6 is a flowchart showing a motion vector calculation process included in the feature point extraction process.

今、連続撮影により複数枚の連続した画像が図２に示したメモリ２３に記憶されており、そのメモリ２３から各画像を順次読み出して重ね合わせることで１枚の合成画像を作成するものとして説明する。 The following description assumes that a plurality of consecutive images obtained by continuous shooting are stored in the memory 23 shown in FIG. 2, and that the images are read from the memory 23 one after another and superimposed to create a single composite image.

図４のフローチャートにおいて、ステップＡ１１〜Ａ１９は基準画像に対するピラミッド化の処理、ステップＡ２０〜Ａ２７はその基準画像に重ね合わせる画像に対するピラミッド化の処理を示している。 In the flowchart of FIG. 4, steps A11 to A19 indicate pyramidal processing for a reference image, and steps A20 to A27 indicate pyramidal processing for an image to be superimposed on the reference image.

すなわち、まず、ＣＰＵ３０は、メモリ２３から画像合成の基準となる画像を読み込む（ステップＡ１１）。その際、図３に示すように、ＣＰＵ３０は、階層数ｋ＝１として（ステップＡ１２）、前記基準画像にノイズ除去とサブサンプリングの合成フィルタをかけて、元画像の１／４（縦横１／２）サイズに縮小したレイヤ１の画像を生成する（ステップＡ１３）。本実施形態では、このレイヤ１の画像サイズがピラミッド階層の最上位となる。 That is, the CPU 30 first reads from the memory 23 the image that serves as the reference for composition (step A11). Then, as shown in FIG. 3, the CPU 30 sets the layer number k = 1 (step A12) and applies a combined noise-removal and subsampling filter to the reference image to generate a layer 1 image reduced to 1/4 the size of the original (1/2 in each of height and width) (step A13). In the present embodiment, this layer 1 image size is the top of the pyramid hierarchy.

ＣＰＵ３０は、このレイヤ１の画像の中で特徴点を抽出した後（ステップＡ１４）、当該画像に微分フィルタをかけて画素値の勾配を求める（ステップＡ１５）。 After extracting feature points from the layer 1 image (step A14), the CPU 30 applies a differential filter to the image to obtain a gradient of pixel values (step A15).

続いて、ＣＰＵ３０は、ｋを＋１更新し（ステップＡ１６）、サブサンプリングフィルタによりレイヤ１の画像を縦横１／２サイズに間引き処理することでレイヤ２の画像を作成する（ステップＡ１７）。そして、このレイヤ２の画像に微分フィルタをかけて画素値の勾配を求める（ステップＡ１８）。 Subsequently, the CPU 30 updates k by +1 (step A16), and creates a layer 2 image by thinning the layer 1 image into 1/2 size vertically and horizontally by the sub-sampling filter (step A17). Then, a differential filter is applied to the layer 2 image to obtain the gradient of the pixel value (step A18).

以後、ｋを更新しながら、ｋ＝ｍになるまで前記同様の処理を繰り返す（ステップＡ１９）。これにより、図３に示したように、レイヤ１を最上位、レイヤｍを最下位とした階層画像が作成されることになる。 Thereafter, while updating k, the same process is repeated until k = m (step A19). As a result, as shown in FIG. 3, a hierarchical image having the highest layer 1 and the lowest layer m is created.
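
Steps A11 to A19 can be sketched as follows. This is a minimal illustration under stated assumptions, not the embodiment's filters: a 2 × 2 box average stands in for the combined noise-removal and subsampling filter, plain decimation stands in for the subsampling filter, and a central difference stands in for the differential filter. All function names are hypothetical, and feature point extraction (step A14) is omitted.

```python
def denoise_and_halve(img):
    """Assumed stand-in for the combined filter of step A13:
    average each 2x2 block, giving an image half as wide and half as tall."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [[(img[2*r][2*c] + img[2*r][2*c+1]
              + img[2*r+1][2*c] + img[2*r+1][2*c+1]) / 4.0
             for c in range(cols)] for r in range(rows)]

def halve(img):
    """Assumed stand-in for the subsampling filter of step A17:
    keep every second pixel in each direction."""
    return [row[::2] for row in img[::2]]

def gradients(img):
    """Assumed stand-in for the differential filter (steps A15, A18):
    central differences, with indices clamped at the borders."""
    rows, cols = len(img), len(img[0])
    gx = [[(img[r][min(c+1, cols-1)] - img[r][max(c-1, 0)]) / 2.0
           for c in range(cols)] for r in range(rows)]
    gy = [[(img[min(r+1, rows-1)][c] - img[max(r-1, 0)][c]) / 2.0
           for c in range(cols)] for r in range(rows)]
    return gx, gy

def build_pyramid(original, m):
    """Return {k: (image, (gx, gy))} for layers k = 1..m; layer 1 is the top."""
    layers = {1: denoise_and_halve(original)}
    for k in range(2, m + 1):
        layers[k] = halve(layers[k - 1])
    return {k: (img, gradients(img)) for k, img in layers.items()}

# Tiny 8x8 ramp image for illustration.
original = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
pyramid = build_pyramid(original, m=2)
```

The same routine would be applied unchanged to each image to be superimposed on the reference image (steps A20 to A27), so that both images have layers 1 to m of matching sizes.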

続いて、ＣＰＵ３０は、前記メモリ２３から基準画像に重ね合わせる画像を読み出し（ステップＡ２０）、ｋ＝１にセットした後（ステップＡ２１）、その画像に対して前記同様のピラミッド化の処理を行うことで（ステップＡ２２〜Ａ２７）、レイヤ１〜ｍの画像を作成する。この場合、前記基準画像のときと同様に、レイヤ１では、当該画像に対してノイズ除去とサブサンプリングの合成フィルタをかけて、元画像の１／４（縦横１／２）サイズに縮小した画像を作成する。 Subsequently, the CPU 30 reads from the memory 23 an image to be superimposed on the reference image (step A20), sets k = 1 (step A21), and creates the images of layers 1 to m by applying the same pyramid construction to that image (steps A22 to A27). As with the reference image, the layer 1 image is created by applying the combined noise-removal and subsampling filter to the image, reducing it to 1/4 the size of the original (1/2 in each of height and width).

このようにして、基準画像とそれに重ね合わせる画像をピラミッド化すると、次に、それぞれのレイヤ画像間で特徴点の追跡処理を行い（ステップＡ２８）、その追跡処理で得られた特徴点の座標位置に基づいて両画像を位置合わせして合成する（ステップＡ２９）。これを所定枚数分繰り返して行うことで（ステップＡ３０）、１枚の合成画像を作成する。 Once the reference image and the image to be superimposed on it have been built into pyramids in this way, feature points are tracked between the corresponding layer images (step A28), and the two images are aligned and combined based on the coordinate positions of the feature points obtained by the tracking (step A29). Repeating this for a predetermined number of images (step A30) creates a single composite image.

次に、前記ステップＡ２８で実行される特徴点追跡処理について説明する。 Next, the feature point tracking process executed in step A28 will be described.

図５のフローチャートに示すように、まず、ｉ＝０とおく（ステップＢ１１）。ｉは処理対象となる特徴点の番号であり、０〜ｎの値を取るものとする。 As shown in the flowchart of FIG. 5, first, i = 0 is set (step B11). i is the number of the feature point to be processed, and takes a value of 0 to n.

今、基準画像の特徴点の位置をＸｉ、その基準画像に重ね合わせる画像（以下、被追跡画像と称す）の特徴点の位置（検索位置）をＹｉとすると、レイヤｍにおける特徴点位置は以下のように表せる（ステップＢ１２）。 Let Xi be the position of a feature point in the reference image, and let Yi be the position (search position) of that feature point in the image to be superimposed on the reference image (hereinafter, the tracked image). The feature point positions in layer m can then be expressed as follows (step B12).

Ｘｉ＝Ｘｉ／２^ｍ
Ｙｉ＝Ｘｉ
Xi = Xi / 2^m
Yi = Xi

ここで、特徴点の追跡は最下位のレイヤから始めるため、ｋ＝ｍとする（ステップＢ１３）。また、ｊ（繰り返し回数）を初期値０にセットして（ステップＢ１４）、被追跡画像上における特徴点ｉの動きベクトルＶｊを計算する（ステップＢ１５）。 Because tracking of feature points starts from the lowest layer, k = m is set (step B13). Then j (the iteration count) is initialized to 0 (step B14), and the motion vector Vj of feature point i on the tracked image is calculated (step B15).

図６のフローチャートに示すように、ＣＰＵ３０は、ＸｉとＹｉから動きベクトルＶｊを計算する（ステップＣ１１）。この場合、追跡開始時点ではＹｉ＝Ｘｉであり、Ｖｊ＝０である。この点を追跡の開始ポイントにして、勾配法により動きベクトルＶｊを求める。 As shown in the flowchart of FIG. 6, the CPU 30 calculates a motion vector Vj from Xi and Yi (step C11). In this case, Yi = Xi and Vj = 0 at the tracking start time. Using this point as a tracking start point, a motion vector Vj is obtained by the gradient method.

ここで、動きベクトルＶｊの振動を抑えるため、ＣＰＵ３０は、ｊ（繰り返し回数）によって異なる係数をＶｊに乗じる（ステップＣ１２〜Ｃ１５）。具体的には、最大反復数をＴ、係数をαとすると、下記のような条件でαが設定される。 Here, in order to suppress the vibration of the motion vector Vj, the CPU 30 multiplies Vj by a coefficient that differs depending on j (number of repetitions) (steps C12 to C15). Specifically, when the maximum number of iterations is T and the coefficient is α, α is set under the following conditions.

０≦ｊ＜Ｔ×１／３のとき、α＝１
Ｔ×１／３≦ｊ＜Ｔ×２／３のとき、α＝１／２
Ｔ×２／３≦ｊのとき、α＝１／４
When 0 ≤ j < T × 1/3, α = 1
When T × 1/3 ≤ j < T × 2/3, α = 1/2
When T × 2/3 ≤ j, α = 1/4

すなわち、例えば最大反復数Ｔが１２回とすると、ｊが４回未満であれば、係数α＝１である。よって、Ｖｊはそのままの値となる。一方、ｊが４回〜７回の間では、係数α＝１／２に設定される。よって、Ｖｊ＝Ｖｊ／２となる。また、ｊが８回以上であれば、係数α＝１／４に設定され、Ｖｊ＝Ｖｊ／４となる。 That is, if the maximum iteration count T is 12, for example, then α = 1 while j is less than 4, so Vj is used as is. While j is from 4 to 7, α is set to 1/2, so Vj = Vj / 2. Once j is 8 or more, α is set to 1/4, so Vj = Vj / 4.
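
The three conditions above translate directly into a small function. The sketch below is illustrative only; the name `damping_coefficient` is hypothetical, and the thresholds are compared in integer form (3j against T and 2T) simply to avoid rounding issues with T/3.

```python
def damping_coefficient(j, max_iterations):
    """Coefficient alpha for iteration j, given maximum iteration count T.

    0 <= j < T/3      -> 1
    T/3 <= j < 2T/3   -> 1/2
    2T/3 <= j         -> 1/4
    """
    if 3 * j < max_iterations:
        return 1.0
    if 3 * j < 2 * max_iterations:
        return 0.5
    return 0.25

# Worked example from the text: T = 12.
schedule = [damping_coefficient(j, 12) for j in range(12)]
print(schedule)
```

For T = 12 this gives 1 for j = 0 to 3, 1/2 for j = 4 to 7, and 1/4 for j = 8 to 11.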

このようにして、ｊに応じた係数を特徴点の動きベクトルＶｊに乗じた後、次のポイントをＹｉ＝Ｙｉ＋Ｖｊに更新する（ステップＣ１６）。そして、このとき得られたＶｊの動き量と所定値とを比較し、Ｖｊの動き量が所定値以下に収束していれば（ステップＣ１７のＹｅｓ）、終了とする。また、所定値より大きければ（ステップＣ１７のＮｏ）、ステップＣ１１に戻って同様の計算を繰り返す。その際、繰り返し回数ｊを＋１ずつ更新し（ステップＣ１９）、その値が予め設定された最大反復数に達した時点で（ステップＣ１８のＹｅｓ）、終了とする。 In this way, after multiplying the motion vector Vj of the feature point by a coefficient corresponding to j, the next point is updated to Yi = Yi + Vj (step C16). Then, the amount of motion of Vj obtained at this time is compared with a predetermined value, and if the amount of motion of Vj has converged below the predetermined value (Yes in step C17), the processing ends. If it is larger than the predetermined value (No in step C17), the process returns to step C11 and the same calculation is repeated. At that time, the number of repetitions j is updated by +1 (step C19), and when the value reaches the preset maximum number of repetitions (Yes in step C18), the process is terminated.
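
The loop of steps C11 to C19 can be sketched as below. This is a one-dimensional illustration under stated assumptions: `step` is a hypothetical stand-in for the gradient-method calculation of Vj, and `overshooting_step` is a contrived example that always overshoots the true position by a factor of two, so that the undamped iteration oscillates indefinitely.

```python
def damping_coefficient(j, max_iterations):
    """alpha = 1, 1/2, 1/4 for the first, second and last third of the iterations."""
    if 3 * j < max_iterations:
        return 1.0
    if 3 * j < 2 * max_iterations:
        return 0.5
    return 0.25

def track_point(x, step, max_iterations=12, eps=1e-3, damped=True):
    """Iteratively update the search position y, starting from y = x.

    Returns (y, converged). `step(x, y)` is an assumed callback that
    plays the role of the gradient-method motion vector calculation."""
    y = x
    for j in range(max_iterations):              # steps C18, C19
        v = step(x, y)                           # step C11
        if damped:
            v *= damping_coefficient(j, max_iterations)  # steps C12 to C15
        y += v                                   # step C16: Yi = Yi + Vj
        if abs(v) <= eps:                        # step C17: converged
            return y, True
    return y, False

def overshooting_step(x, y, target=10.0):
    """Contrived step that overshoots the true position by a factor of two."""
    return 2.0 * (target - y)

damped_result = track_point(0.0, overshooting_step)
undamped_result = track_point(0.0, overshooting_step, damped=False)
print(damped_result, undamped_result)
```

In this contrived case the undamped search keeps jumping between 0 and 20 until the iteration limit is reached, whereas the damped search settles on the true position 10 once α drops to 1/2. Real gradient-method steps behave less regularly, so the example shows the mechanism rather than typical numbers.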

図５に戻って、レイヤｍでの特徴点の動きベクトルＶｊが得られると、次に、ＣＰＵ３０は、１つの上の階層で特徴点の追跡処理を行うべく、ＸｉとＹｉを以下のように２倍にする（ステップＢ１６）。 Returning to FIG. 5, once the motion vector Vj of the feature point in layer m has been obtained, the CPU 30 doubles Xi and Yi as follows so that the feature point can be tracked in the next layer up (step B16).

Ｘｉ＝Ｘｉ×２
Ｙｉ＝Ｙｉ×２
Xi = Xi × 2
Yi = Yi × 2

そして、ＣＰＵ３０は、階層パラメータｋの値を−１して（ステップＢ１８）、ステップＢ１４に戻り、１つ上の階層で同様の追跡処理を行う。このとき、下の階層で得られた動きベクトルＶｊの値を引き継いで追跡処理を行うことになる。最上位のレイヤ１まで追跡すると（ステップＢ１７のＹｅｓ）、別の特徴点について同様の処理を繰り返す。そして、ｉ＝ｎに達したとき（ステップＢ１９のＹｅｓ）、つまり、ｎ個分の特徴点に対する追跡処理を終えた時点で完了とする。 The CPU 30 then decrements the layer parameter k by 1 (step B18), returns to step B14, and performs the same tracking process in the next layer up. The tracking there carries over the motion vector Vj obtained in the layer below. When tracking has reached the top layer 1 (Yes in step B17), the same processing is repeated for another feature point. The process is complete when i = n is reached (Yes in step B19), that is, when tracking has been finished for n feature points.
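
The coarse-to-fine loop of steps B12 to B18 for a single feature point can be sketched as follows. Positions are treated as scalars for brevity, and `track_in_layer` is a hypothetical callback standing in for the per-layer motion vector calculation of FIG. 6. The sketch follows the step numbering literally, doubling Xi and Yi (step B16) before the top-layer check (step B17); whether the doubling is also applied after layer 1 is an assumption drawn only from that numbering.

```python
def track_feature(x_original, m, track_in_layer):
    """Track one feature point from the lowest layer m up to layer 1.

    x_original: feature position in original-image coordinates.
    track_in_layer(k, x, y): assumed callback returning the refined
    search position y in layer k, starting from the inherited y."""
    x = x_original / 2 ** m      # step B12: Xi = Xi / 2^m
    y = x                        # step B12: Yi = Xi
    k = m                        # step B13
    while True:
        y = track_in_layer(k, x, y)   # steps B14, B15
        x *= 2                        # step B16: Xi = Xi * 2
        y *= 2                        # step B16: Yi = Yi * 2
        if k == 1:                    # step B17: top layer reached
            break
        k -= 1                        # step B18
    return x, y

# Mock tracker: the true displacement is 12 pixels at original size,
# so it appears as 12 / 2^k pixels in layer k.
def mock_tracker(k, x, y, displacement=12.0):
    return x + displacement / 2 ** k

result = track_feature(64.0, 3, mock_tracker)
print(result)
```

With this ordering the returned pair is expressed in original-image coordinates. Because Yi is doubled along with Xi, the displacement found in a lower layer is inherited as the starting point of the search in the layer above.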

前記ステップＡ２９の画像合成処理では、これらの特徴点の座標位置に基づいて射影変換の行列式Ｈを求め、その行列式Ｈに従って２つの画像（基準画像と被追跡画像）の位置合わせを行って合成することになる。 In the image composition processing of step A29, a projective transformation matrix H is obtained from the coordinate positions of these feature points, and the two images (the reference image and the tracked image) are aligned according to the matrix H and then combined.

図７は動きベクトルの収束の概念を説明するための図であり、図７（ａ）は発散状態、同図（ｂ）は収束状態を示す。 FIG. 7 is a diagram for explaining the concept of motion vector convergence; FIG. 7(a) shows a divergent state, and FIG. 7(b) shows a convergent state.

ある１つの特徴点を２枚の画像間で勾配法を繰り返して追跡した場合に、図７（ａ）に示すように、動きベクトルＶｊは徐々に収束していくが、最終段階で動きベクトルＶｊが振動して発散状態になることがある。この例では、動きベクトルＶｊがＶ１〜Ｖ７まで徐々に収束した後、発散状態となったことを示している。この場合、特徴点の位置を正確に特定できないため、射影変換の行列式Ｈの精度に影響がでる。 When a single feature point is tracked between two images by repeating the gradient method, the motion vector Vj gradually converges, as shown in FIG. 7(a), but in the final stage Vj may oscillate and diverge. In this example, Vj converges gradually from V1 to V7 and then diverges. In that case the position of the feature point cannot be determined accurately, which degrades the accuracy of the projective transformation matrix H.

そこで、図７（ｂ）の点線で示すように、動きベクトルＶｊに所定の係数を乗じることで振動を抑え込み、最終段階で収束させる。この例では、Ｖ６〜Ｖ９までのベクトル成分が係数によって減衰し、最終的に収束状態となったことを示している。この場合、係数の値は一定ではなく、そのときの繰り返し回数に応じて減衰量を多くする方向（つまり、収束性の高める方向）に段階的に変更される。具体的には、図６で説明したように、１→１／２→１／４といったように変更される。 Therefore, as indicated by the dotted line in FIG. 7B, the vibration is suppressed by multiplying the motion vector Vj by a predetermined coefficient, and is converged at the final stage. In this example, it is shown that the vector components from V6 to V9 are attenuated by the coefficient and finally become converged. In this case, the value of the coefficient is not constant, and is changed stepwise in the direction of increasing the amount of attenuation (that is, the direction of increasing the convergence) according to the number of repetitions at that time. Specifically, as described with reference to FIG. 6, the change is made such that 1 → 1/2 → 1/4.

以上のように、元画像の１／４サイズを最上位レイヤとしたピラミッド階層により特徴点を追跡する構成としたことで、従来のように元画像サイズを含めたピラミッド階層に比べ、元画像サイズでの画像処理が不要となるので、その分、演算量が大幅に削減される。また、特徴点の動きベクトルに所定の係数を乗じることで収束性を高め、十分な追跡精度を保ちながら合成画像を作成することができる。 As described above, feature points are tracked in a pyramid hierarchy whose top layer is 1/4 the size of the original image. Compared with a conventional pyramid hierarchy that includes the original image size, image processing at the original image size is no longer needed, and the amount of computation is greatly reduced accordingly. In addition, multiplying the motion vector of a feature point by a predetermined coefficient improves convergence, so a composite image can be created while maintaining sufficient tracking accuracy.

なお、前記実施形態では、元画像の１／４サイズを最上位レイヤとしたが、１／４サイズ以外の縮小サイズを最上位レイヤとしてピラミッド階層を構成することでも良い。ただし、追跡の最終段階となる最上位レイヤでの画像サイズが小さくなるほど、動きベクトルが振動して収束しづらくなるため、そのことを考慮して動きベクトルに乗じる係数を設定することが好ましい。 In the above-described embodiment, the ¼ size of the original image is set as the highest layer, but a pyramid hierarchy may be configured with a reduced size other than the ¼ size as the highest layer. However, as the image size in the uppermost layer, which is the final stage of tracking, becomes smaller, the motion vector vibrates and becomes more difficult to converge. Therefore, it is preferable to set a coefficient by which the motion vector is multiplied in consideration of this.

また、前記実施形態では、動きベクトルに乗じる係数を図６のように２段階に変更したが、さらに細かく段階的に変更することでも良く、また、その係数を乗じるタイミングも特徴点の追跡回数に応じて適宜変更可能である。要は、各階層で動きベクトルが発散することを防いで、安定して収束させることのできる減衰係数を適切なタイミングで乗じれば良い。 In the above embodiment, the coefficient to be multiplied by the motion vector is changed to two stages as shown in FIG. 6, but it may be changed more finely in stages, and the timing for multiplying the coefficient by the number of tracking of the feature points is also possible. It can be changed as appropriate. In short, it is only necessary to multiply at appropriate timing an attenuation coefficient that can prevent the motion vector from diverging in each layer and can be stably converged.

なお、前記実施形態では、デジタルカメラを例にして説明したが、本発明はこれに限られるものではなく、例えばカメラ付きの携帯電話など、撮像機能を備えた電子機器であれば同様に適用可能である。 In the above embodiment, the digital camera has been described as an example. However, the present invention is not limited to this, and can be similarly applied to an electronic device having an imaging function such as a mobile phone with a camera. It is.

さらに、予めカメラで連続撮影した複数枚の画像をＰＣ（パーソナルコンピュータ）等の情報処理装置に与えて、その情報処理装置内で前記のような処理を行うことで良い。画像の提供方法としては、着脱可能なメモリに画像を保存する方法の他に、通信ネットワークを介して提供することでも良い。 Furthermore, a plurality of images continuously photographed by a camera in advance may be given to an information processing apparatus such as a PC (personal computer) and the above-described processing may be performed in the information processing apparatus. As a method for providing an image, in addition to a method for storing an image in a removable memory, the image may be provided via a communication network.

要するに、本発明は前記実施形態そのままに限定されるものではなく、実施段階ではその要旨を逸脱しない範囲で構成要素を変形して具体化できる。また、前記実施形態に開示されている複数の構成要素の適宜な組み合わせにより、種々の発明を形成できる。例えば、実施形態に示される全構成要素から幾つかの構成要素を削除してもよい。さらに、異なる実施形態にわたる構成要素を適宜組み合わせてもよい。 In short, the present invention is not limited to the above-described embodiment as it is, and can be embodied by modifying the constituent elements without departing from the scope of the invention in the implementation stage. Moreover, various inventions can be formed by appropriately combining a plurality of constituent elements disclosed in the embodiment. For example, some components may be deleted from all the components shown in the embodiment. Furthermore, constituent elements over different embodiments may be appropriately combined.

さらに、上述した実施形態において記載した手法は、コンピュータに実行させることのできるプログラムとして、例えば磁気ディスク（フレキシブルディスク、ハードディスク等）、光ディスク（ＣＤ−ＲＯＭ、ＤＶＤ−ＲＯＭ等）、半導体メモリなどの記録媒体に書き込んで各種装置に適用したり、そのプログラム自体をネットワーク等の伝送媒体により伝送して各種装置に適用することも可能である。本装置を実現するコンピュータは、記録媒体に記録されたプログラムあるいは伝送媒体を介して提供されたプログラムを読み込み、このプログラムによって動作が制御されることにより、上述した処理を実行する。 Further, the method described in the above-described embodiment is a program that can be executed by a computer, for example, recording on a magnetic disk (flexible disk, hard disk, etc.), optical disk (CD-ROM, DVD-ROM, etc.), semiconductor memory, etc. The program can be written on a medium and applied to various apparatuses, or the program itself can be transmitted through a transmission medium such as a network and applied to various apparatuses. A computer that implements this apparatus reads a program recorded on a recording medium or a program provided via a transmission medium, and performs the above-described processing by controlling operations by this program.

図１は本発明の一実施形態に係る画像合成装置としてデジタルカメラを例にした場合の外観構成を示す図であり、図１（ａ）は主に前面の構成、同図（ｂ）は主に背面の構成を示す斜視図である。 FIG. 1 shows the external configuration of a digital camera taken as an example of an image composition device according to an embodiment of the present invention; FIG. 1(a) is a perspective view mainly showing the front, and FIG. 1(b) is a perspective view mainly showing the back.
図２は同実施形態におけるデジタルカメラの電子回路構成を示すブロック図である。 FIG. 2 is a block diagram showing the electronic circuit configuration of the digital camera in the embodiment.
図３は同実施形態におけるデジタルカメラによるピラミッド化の処理を説明するための図である。 FIG. 3 is a diagram for explaining the pyramid construction performed by the digital camera in the embodiment.
図４は同実施形態におけるデジタルカメラの画像合成処理の全体の流れを示すフローチャートである。 FIG. 4 is a flowchart showing the overall flow of the image composition processing of the digital camera in the embodiment.
図５は前記画像合成処理に含まれる特徴点抽出処理を示すフローチャートである。 FIG. 5 is a flowchart showing the feature point extraction process included in the image composition processing.
図６は前記特徴点抽出処理に含まれる動きベクトルの計算処理を示すフローチャートである。 FIG. 6 is a flowchart showing the motion vector calculation process included in the feature point extraction process.
図７は動きベクトルの収束の概念を説明するための図であり、図７（ａ）は発散状態、同図（ｂ）は収束状態を示す図である。 FIG. 7 is a diagram for explaining the concept of motion vector convergence; FIG. 7(a) shows a divergent state, and FIG. 7(b) shows a convergent state.
図８はオプティカルフロー推定を用いた画像合成技術を説明するための図である。 FIG. 8 is a diagram for explaining an image composition technique using optical flow estimation.
図９は従来のピラミッド化による特徴点の追跡方法を説明するための図である。 FIG. 9 is a diagram for explaining a conventional pyramid-based method of tracking feature points.
図１０は従来の画像合成処理の全体の流れを示すフローチャートである。 FIG. 10 is a flowchart showing the overall flow of conventional image composition processing.
図１１は前記画像合成処理に含まれる特徴点抽出処理を示すフローチャートである。 FIG. 11 is a flowchart showing the feature point extraction process included in the conventional image composition processing.
図１２は前記特徴点抽出処理に含まれる動きベクトルの計算処理を示すフローチャートである。 FIG. 12 is a flowchart showing the motion vector calculation process included in that feature point extraction process.

Explanation of symbols

１…デジタルカメラ、２…ボディ、３…撮影レンズ、４…セルフタイマランプ、５…光学ファインダ窓、６…ストロボ発光部、７…マイクロホン部、８…電源キー、９…シャッタキー、１０…撮影モードキー、１１…再生モードキー、１２…光学ファインダ、１３…スピーカ部、１４…マクロキー、１５…ストロボキー、１６…メニュー（ＭＥＮＵ）キー、１７…リングキー、１８…セット（ＳＥＴ）キー、１９…表示部、２１…光学レンズ装置、２２…イメージセンサ、２３…メモリ、２４…表示装置、２５…画像処理装置、２６…操作部、２７…コンピュータインタフェース部、２８…外部記憶ＩＯ装置、２９…プログラムコード記憶装置、３０…ＣＰＵ、３１…メモリカード。 1 ... digital camera, 2 ... body, 3 ... photographing lens, 4 ... self-timer lamp, 5 ... optical finder window, 6 ... strobe light emitting unit, 7 ... microphone unit, 8 ... power key, 9 ... shutter key, 10 ... shooting mode key, 11 ... playback mode key, 12 ... optical finder, 13 ... speaker unit, 14 ... macro key, 15 ... strobe key, 16 ... menu (MENU) key, 17 ... ring key, 18 ... set (SET) key, 19 ... display unit, 21 ... optical lens device, 22 ... image sensor, 23 ... memory, 24 ... display device, 25 ... image processing device, 26 ... operation unit, 27 ... computer interface unit, 28 ... external storage IO device, 29 ... program code storage device, 30 ... CPU, 31 ... memory card.

Claims

An image synthesizing apparatus that tracks, in a second image, feature points extracted from a first image, and aligns and synthesizes the first image and the second image based on the coordinate positions of the feature points, the apparatus comprising:
pyramid-forming means for configuring a pyramid hierarchy by reducing the first and second images in stages, with a predetermined reduced size of the original image as the top layer;
feature point tracking means for tracking the feature points in order from the lowest layer toward the top layer of the pyramid hierarchy configured by the pyramid-forming means; and
convergence means for converging the motion of the feature points by multiplying a vector indicating the motion of the feature points obtained by the feature point tracking means by a predetermined coefficient.

2. The image synthesizing apparatus according to claim 1, wherein the pyramid-forming means configures the pyramid hierarchy by reducing the first and second images in stages, with 1/4 the size of the original image as the top layer.

The image synthesizing apparatus according to claim 1, wherein the convergence means changes the predetermined coefficient stepwise, in the direction of improving convergence, in accordance with the number of times the feature points have been tracked in each layer.

An image synthesizing method for tracking, in a second image, feature points extracted from a first image, and aligning and synthesizing the first image and the second image based on the coordinate positions of the feature points, the method comprising:
a first step of configuring a pyramid hierarchy by reducing the first and second images in stages, with a predetermined reduced size of the original image as the top layer;
a second step of tracking the feature points in order from the lowest layer toward the top layer of the pyramid hierarchy configured in the first step; and
a third step of converging the motion of the feature points by multiplying a vector indicating the motion of the feature points obtained in the second step by a predetermined coefficient.

A program for tracking, in a second image, feature points extracted from a first image, and aligning and synthesizing the first image and the second image based on the coordinate positions of the feature points, the program causing a computer to realize:
a first function of configuring a pyramid hierarchy by reducing the first and second images in stages, with a predetermined reduced size of the original image as the top layer;
a second function of tracking the feature points in order from the lowest layer toward the top layer of the pyramid hierarchy configured by the first function; and
a third function of converging the motion of the feature points by multiplying a vector indicating the motion of the feature points obtained by the second function by a predetermined coefficient.