JP2014071619A

JP2014071619A - Subject detection device, subject detection method, and program

Info

Publication number: JP2014071619A
Application number: JP2012216635A
Authority: JP
Inventors: Yoshihiro Tejima; 義裕手島; Rei Hamada; 玲浜田
Original assignee: Casio Computer Co Ltd
Current assignee: Casio Computer Co Ltd
Priority date: 2012-09-28
Filing date: 2012-09-28
Publication date: 2014-04-21
Anticipated expiration: 2032-09-28
Also published as: JP6079102B2

Abstract

PROBLEM TO BE SOLVED: To properly estimate the attitude of a specific subject with respect to an imaging part. SOLUTION: A portable terminal 100 includes a coordinate detection part 6c that detects the coordinates of a plurality of points constituting the image area of the specific subject in an acquired image, a first attitude candidate estimation part 6d that estimates at least one attitude candidate of the specific subject with respect to an imaging part 3 based on the detected coordinates of the plurality of points, and an attitude specifying part 6f that specifies one of the estimated attitude candidates as the optimal solution for the attitude of the specific subject with respect to the imaging part.

Description

The present invention relates to a subject detection device, a subject detection method, and a program.

Conventionally, as an augmented reality (AR) technique, an image processing apparatus is known that displays a virtual object (for example, a three-dimensional model) in association with the image of a marker (a real object) in a captured image (see, for example, Patent Document 1).
Based on the result of image analysis of a captured frame image, this image processing apparatus detects the inclination of the marker image in the frame image (for example, its rotation angle from the directly facing state on the display screen) and records that inclination as an initial value. The apparatus then calculates the relative difference between the inclination of the marker in each sequentially captured frame image and the inclination recorded as the initial value, specifically, a difference in coordinate values. From the calculated difference in coordinate values, the apparatus sequentially displays the image of the virtual object tilted along a predetermined vector.

Patent Document 1: JP 2006-72667 A

However, depending on the environment and various conditions at the time the marker is imaged, errors may occur in detecting the four corner points of the marker in three-dimensional space, and the posture of the marker with respect to the imaging unit may then not be estimated properly.

The present invention has been made in view of this problem, and an object of the present invention is to provide a subject detection device, a subject detection method, and a program capable of properly estimating the posture of a specific subject with respect to an imaging unit.

To solve the above problem, a subject detection device according to the present invention comprises:
acquisition means for sequentially acquiring images, captured by an imaging unit, that include a specific subject; detection means for detecting the coordinates of a plurality of points constituting the image area of the specific subject in an image acquired by the acquisition means; candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as the optimal solution for the posture of the specific subject with respect to the imaging unit.

A subject detection method according to the present invention is
a subject detection method using a subject detection device, the method including: a process of sequentially acquiring images, captured by an imaging unit, that include a specific subject; a process of detecting the coordinates of a plurality of points constituting the image area of the specific subject in an acquired image; a process of estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the detected coordinates of the plurality of points; and a process of specifying any one of the estimated posture candidates as the optimal solution for the posture of the specific subject with respect to the imaging unit.

A program according to the present invention
causes a computer of a subject detection device to function as: acquisition means for sequentially acquiring images, captured by an imaging unit, that include a specific subject; detection means for detecting the coordinates of a plurality of points constituting the image area of the specific subject in an image acquired by the acquisition means; candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as the optimal solution for the posture of the specific subject with respect to the imaging unit.

According to the present invention, the posture of a specific subject with respect to the imaging unit can be estimated properly.

FIG. 1 is a block diagram showing the schematic configuration of the mobile terminal of Embodiment 1 to which the present invention is applied.
FIG. 2 is a flowchart showing an example of the operation of the marker detection process performed by the mobile terminal of FIG. 1.
FIG. 3 is a diagram schematically showing the positional relationship between the mobile terminal of FIG. 1 and a marker.
FIG. 4 is a diagram for explaining the marker detection process of FIG. 2.
FIG. 5 is a block diagram showing the schematic configuration of the mobile terminal of Embodiment 2 to which the present invention is applied.
FIG. 6 is a flowchart showing an example of the operation of the marker detection process performed by the mobile terminal of FIG. 5.

Specific embodiments of the present invention are described below with reference to the drawings. However, the scope of the invention is not limited to the illustrated examples.

[Embodiment 1]
FIG. 1 is a block diagram showing the schematic configuration of a mobile terminal 100 of Embodiment 1 to which the present invention is applied.
As shown in FIG. 1, the mobile terminal 100 of this embodiment includes a central control unit 1, a memory 2, an imaging unit 3, an imaging control unit 4, an image data generation unit 5, a marker detection processing unit 6, a display unit 7, a display control unit 8, a transmission/reception unit 9, a communication control unit 10, an operation input unit 11, and the like.
The mobile terminal 100 is configured as, for example, an imaging device having a communication function, a mobile station used in a mobile communication network such as a mobile phone or PHS (Personal Handy-phone System), or a PDA (Personal Data Assistants).

The central control unit 1 controls each unit of the mobile terminal 100. Specifically, the central control unit 1 includes a CPU (not shown) that controls each unit of the mobile terminal 100, and performs various control operations according to various processing programs (not shown).

The memory 2 is composed of, for example, a DRAM (Dynamic Random Access Memory) and includes a buffer memory that temporarily records image information and the like, a working memory for the central control unit 1 and other units, and a program memory that stores various programs and data related to the functions of the mobile terminal 100 (none of which are shown).

The imaging unit 3 images a specific subject (for example, a marker M) and generates an image signal of a frame image F (see FIG. 3).
Here, the specific subject may be a two-dimensional image or a three-dimensional solid. For example, when the specific subject is a two-dimensional image, it may be a substantially frame-shaped (picture-frame-shaped) marker M forming a square with sides of 12 mm (see FIG. 3).
The shape and dimensions of the marker M described above are merely an example; they are not limiting and can be changed as appropriate.

The imaging unit 3 includes a lens unit 3a and an electronic imaging unit 3b.
The lens unit 3a is composed of a plurality of lenses such as a zoom lens and a focus lens.
The electronic imaging unit 3b is composed of an image sensor such as a CCD (Charge Coupled Device) or CMOS (Complementary Metal-oxide Semiconductor) sensor, and converts the optical image that has passed through the lenses of the lens unit 3a into a two-dimensional image signal.
Although not shown, the imaging unit 3 may also include a diaphragm that adjusts the amount of light passing through the lens unit 3a.

The imaging control unit 4 controls the imaging of the subject by the imaging unit 3. Although not shown, the imaging control unit 4 includes a timing generator, a driver, and the like. Using the timing generator and the driver, the imaging control unit 4 scans and drives the electronic imaging unit 3b, causes the electronic imaging unit 3b to convert the optical image into a two-dimensional image signal at every predetermined period, reads out the frame image F one screen at a time from the imaging area of the electronic imaging unit 3b, and outputs it to the image data generation unit 5.
The imaging control unit 4 also performs adjustment control of the imaging conditions of the subject, such as AF (automatic focusing), AE (automatic exposure), and AWB (automatic white balance).

The image data generation unit 5 adjusts the gain of the analog-value signal of the frame image F transferred from the electronic imaging unit 3b for each of the RGB color components as appropriate, samples and holds the signal with a sample-and-hold circuit (not shown), and converts it into digital data with an A/D converter (not shown). A color process circuit (not shown) then performs color processing, including pixel interpolation and γ correction, after which the unit generates a digital luminance signal Y and color difference signals Cb and Cr (YUV data).
The image data generation unit 5 also reduces the YUV data of the generated frame image F by a predetermined factor both horizontally and vertically to generate low-resolution image data (for example, VGA or QVGA size) for live view display. Specifically, the image data generation unit 5 generates the low-resolution image data for live view display from the YUV data of the frame image F at a predetermined timing corresponding to the predetermined display frame rate of the live view image on the display unit 7.
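The conversion to luminance and color difference signals and the subsequent reduction can be illustrated with a minimal sketch. The text does not give the conversion coefficients or the reduction method, so the full-range BT.601 (JFIF) coefficients and the block averaging used here are assumptions, and the function names `rgb_to_ycbcr` and `downscale` are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: full-range BT.601 (JFIF) coefficients; the patent text
    does not state which conversion the color process circuit uses.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def downscale(plane, factor):
    """Reduce a 2-D plane (list of rows) by an integer factor in both
    directions by averaging factor x factor blocks (assumed method)."""
    height, width = len(plane), len(plane[0])
    out = []
    for by in range(0, height - height % factor, factor):
        row = []
        for bx in range(0, width - width % factor, factor):
            block = [plane[y][x]
                     for y in range(by, by + factor)
                     for x in range(bx, bx + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out
```

A live view pipeline along these lines would apply `rgb_to_ycbcr` per pixel and then `downscale` to each resulting plane.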

The image data generation unit 5 then sequentially outputs the generated YUV data of each frame image F to the memory 2, where it is stored.

The marker detection processing unit 6 includes a temporary posture designation unit 6a, an image acquisition unit 6b, a coordinate detection unit 6c, a first posture candidate estimation unit 6d, a determination unit 6e, a first posture specifying unit 6f, and an image generation unit 6g.
Each unit of the marker detection processing unit 6 is composed of, for example, a predetermined logic circuit, but this configuration is an example and is not limiting.

The temporary posture designation unit 6a designates in advance a temporary posture of the marker M with respect to the terminal body.
That is, the temporary posture designation unit (designation means) 6a designates in advance a temporary posture of the marker (specific subject) M with respect to the terminal body that includes the imaging unit 3. Specifically, for example, when the direction (for example, one of up, down, left, and right) of the normal to the two-dimensional plane of the marker M, which is a two-dimensional image (that is, the direction of the Z axis of the marker coordinate system in FIG. 3), is input based on a predetermined operation of the operation input unit 11 by the user, the temporary posture designation unit 6a designates the input direction of the normal as the temporary posture of the marker M with respect to the terminal body.
This method of designating the temporary posture is an example; it is not limiting and can be changed as appropriate. For example, a posture set in advance as a default may be designated automatically as the temporary posture.

The image acquisition unit 6b sequentially acquires a plurality of frame images F, ....
That is, the image acquisition unit (acquisition means) 6b sequentially acquires a plurality of images in which the marker (specific subject) M has been captured continuously. Specifically, the image acquisition unit 6b sequentially acquires from the memory 2 image data of a predetermined resolution (for example, luminance data) of the plurality of frame images F, ... that the image data generation unit 5 generated sequentially as the imaging unit 3 continuously imaged the marker M.

The coordinate detection unit 6c detects the coordinates of a plurality of points constituting the marker image area S (see FIG. 3).
That is, the coordinate detection unit (detection means) 6c detects the coordinates of a plurality of points (for example, the four corner points) constituting the marker image area S included in the frame image F acquired by the image acquisition unit 6b. Specifically, the coordinate detection unit 6c performs a predetermined feature extraction process on the image data of a binarized image, which is generated by applying a predetermined binarization process to the frame image F acquired by the image acquisition unit 6b, and extracts the substantially frame-shaped marker image area S. The coordinate detection unit 6c then detects the position coordinates of the four corners of the marker image area S.
Because the feature extraction process is a known technique, a detailed description is omitted here.
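The binarize-then-locate-corners flow can be sketched as follows. The text leaves the binarization and feature extraction unspecified, so this sketch swaps in a fixed threshold and a crude extreme-point rule (which only suits a roughly upright quadrilateral). The names `binarize` and `four_corners`, the threshold value, and the assumption that the marker is darker than its background are all hypothetical.

```python
def binarize(luma, threshold=128):
    """Mark pixels darker than the threshold as foreground (1).

    Assumption: the marker frame is darker than the background.
    """
    return [[1 if value < threshold else 0 for value in row] for row in luma]


def four_corners(binary):
    """Return (top_left, top_right, bottom_right, bottom_left) as (x, y)
    pixel coordinates of the foreground region, or None if it is empty.

    Crude stand-in for real feature extraction: corners are taken as the
    foreground pixels that extremize x + y and x - y.
    """
    points = [(x, y)
              for y, row in enumerate(binary)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    top_right = max(points, key=lambda p: p[0] - p[1])
    bottom_left = min(points, key=lambda p: p[0] - p[1])
    return top_left, top_right, bottom_right, bottom_left
```

A production detector would more likely trace contours and fit a quadrilateral; the point here is only the data flow from luminance data to four corner coordinates.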

The first posture candidate estimation unit 6d estimates at least one posture candidate of the marker M with respect to the terminal body. Specifically, the first posture candidate estimation unit 6d includes a first estimation unit d1 and a second estimation unit d2.

The first estimation unit (first estimation means) d1 estimates a first posture candidate of the marker M with respect to the terminal body based on the coordinates of the plurality of points detected by the coordinate detection unit 6c. Specifically, the first estimation unit d1 refers to a database (not shown) or the like recorded in predetermined recording means and, using the geometric positional relationship among the position coordinates of the four corner points of the marker image area S, calculates a first initial value for a coordinate transformation equation (see, for example, Equation (1), described later) that defines the correspondence between the coordinates of each point of the marker M in three-dimensional space and the two-dimensional plane coordinates in the frame image F.
For the coordinate transformation of the plurality of points of the marker image area S, either a central projection model, which is relatively accurate, or a pseudo-central projection model (see FIG. 4(a)), which is less accurate than the central projection model, may be used. In the pseudo-central projection model, for example, once a subject is at least a predetermined distance from the mobile terminal 100, projection within a narrow range in the front-rear direction (depth direction) relative to the terminal body is regarded as parallel projection, so that points 1 and 2 in FIG. 4(a) are both projected onto point a of the image.
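The difference between the two projection models can be made concrete with a small sketch. It treats the pseudo-central projection model as a weak-perspective projection in which every point in the narrow depth range is scaled by one shared reference depth. That reading, the function names, and the parameters `focal_length` and `reference_depth` are assumptions, not taken from the text.

```python
def central_projection(point, focal_length):
    """Project a 3-D point (x, y, z) in camera coordinates onto the image
    plane, dividing by the point's own depth z."""
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


def pseudo_central_projection(point, focal_length, reference_depth):
    """Weak-perspective reading of the pseudo-central projection model
    (an assumption): the point is first projected in parallel along the
    depth axis, then scaled using one shared reference depth."""
    x, y, _ = point
    return (focal_length * x / reference_depth,
            focal_length * y / reference_depth)


# Two points that differ only slightly in depth, like points 1 and 2
# described for FIG. 4(a).
point_1 = (10.0, 5.0, 1000.0)
point_2 = (10.0, 5.0, 1010.0)
same_image_point = (
    pseudo_central_projection(point_1, 500.0, 1000.0)
    == pseudo_central_projection(point_2, 500.0, 1000.0)
)
```

Under the pseudo-central model the two points land on the same image point, whereas `central_projection` keeps them apart. This is why the depth ordering of the marker corners, and hence the tilt direction of the marker, can become ambiguous.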

The first estimation unit d1 then solves the coordinate transformation equation of Equation (1), using the calculated first initial value as the initial value of the matrix R that represents the posture of the marker M with respect to the terminal body, and thereby estimates by bundle adjustment the matrices R and T that represent the posture and positional relationship of the marker M with respect to the terminal body. The first estimation unit d1 specifies the estimated matrix R as the first posture candidate of the marker M with respect to the terminal body.
Here, bundle adjustment is a technique for estimating the parameters of a geometric model from images with the aim of achieving high estimation accuracy; for example, a technique that minimizes the total reprojection error over the unknown parameters is used.

The coordinate transformation equation of Equation (1) is expressed as a homogeneous coordinate transformation matrix. In Equation (1), X_w, Y_w, and Z_w represent the coordinate position of each point constituting the marker M in the three-dimensional space defined by the mutually orthogonal X, Y, and Z axes, and x_i and y_i represent the corresponding coordinate position of each point of the marker M in the frame image F, which is a two-dimensional plane defined by the X and Y axes.
The matrix R representing the posture of the marker M with respect to the terminal body is, specifically, a matrix of n rows and m columns (n and m each being a natural number), for example a 3×3 rotation matrix. The matrix T representing the positional relationship of the marker M with respect to the terminal body is likewise a matrix of n rows and m columns (n and m each being a natural number), for example a 1×3 translation vector.
In Equation (1), C represents the internal parameters of the imaging unit 3, such as the focal length, and h represents the scale of the frame image F.
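Equation (1) itself does not appear in this text. Going by the symbol definitions above, one plausible form is the standard homogeneous camera projection, shown here together with a reprojection-error objective of the kind bundle adjustment minimizes. This is a hedged reconstruction, not the patent's own equation. It assumes C is a 3×3 internal parameter matrix and writes T as a 3×1 column (the transpose of the 1×3 vector mentioned above). The index k over the detected corner points, the function π (division by the third homogeneous component), and the symbol E are notation introduced for illustration.

```latex
% Assumed form of Equation (1): homogeneous projection of a marker point
h \begin{pmatrix} x_i \\ y_i \\ 1 \end{pmatrix}
  = C \begin{pmatrix} R & T \end{pmatrix}
    \begin{pmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{pmatrix}

% Assumed objective: total reprojection error over the corner points k,
% with \pi(u, v, w) = (u / w, \; v / w)
E(R, T) = \sum_{k}
  \left\lVert
    \begin{pmatrix} x_{i,k} \\ y_{i,k} \end{pmatrix}
    - \pi\!\left( C \left( R \, \mathbf{X}_{w,k} + T \right) \right)
  \right\rVert^{2}
```

An iterative minimization of E started from different initial values of R can converge to different local minima, which is consistent with the text's use of a first and a second initial value.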

The second estimation unit (second estimation means) d2 estimates, based on the estimation result of the first estimation unit d1, a second posture candidate of the marker M with respect to the terminal body that differs from the first posture candidate.
Specifically, the second estimation unit d2 estimates the second posture candidate of the marker M with respect to the terminal body when the determination unit 6e determines that the first posture candidate is not correct (described in detail later). That is, when the determination unit 6e determines that the first posture candidate is not correct, the first posture candidate is considered to be a local solution produced by bundle adjustment, so the first posture candidate estimation unit 6d has the second estimation unit d2 estimate a posture candidate of the marker M with respect to the terminal body (the second posture candidate) again.
For example, the second estimation unit d2 calculates the cross product of the axis (central axis) passing through a predetermined position of the terminal body (for example, the center of the terminal body, through which the optical axis passes) and the center of the marker M, and the normal vector of the two-dimensional plane of the marker M. The second estimation unit d2 then calculates a second initial value by rotating the matrix R, which is the first posture candidate, about the calculated cross product as an axis so that the result is opposite (for example, upward; the direction of the marker M represented by the solid line L1 in FIG. 4(b)) to the temporary posture of the marker M designated by the temporary posture designation unit 6a (for example, downward; the direction of the marker M represented by the broken line L2 in FIG. 4(b)). After that, the second estimation unit d2 solves the coordinate transformation equation of Equation (1), using the calculated second initial value as the initial value of the matrix R representing the posture of the marker M with respect to the terminal body, thereby estimating the matrices R and T representing the posture and positional relationship of the marker M with respect to the terminal body, and specifies the estimated matrix R as the second posture candidate of the marker M with respect to the terminal body.
This method of estimating the second posture candidate is an example; it is not limiting and can be changed as appropriate. For example, like the first estimation unit d1, the second estimation unit d2 may use the pseudo-central projection model with the coordinates of the four corner points of the marker image area S to calculate, as the initial value of the coordinate transformation equation (see, for example, Equation (1)), a second initial value different from the first initial value, and then use that second initial value to estimate by bundle adjustment the matrix R representing the posture of the marker M with respect to the terminal body (the second posture candidate).
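The rotation that produces the second initial value can be sketched as follows. The text states the rotation axis (the cross product of the central axis and the marker normal) but not the rotation angle. This sketch assumes the angle is chosen so that the marker normal is mirrored across the central axis, which flips the apparent tilt direction. The function names and the use of Rodrigues' rotation formula are likewise illustrative assumptions.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_about_axis(axis, angle):
    """Rodrigues' formula: rotation matrix for a unit axis and an angle."""
    x, y, z = axis
    k = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]
    k2 = matmul(k, k)
    s, one_minus_cos = math.sin(angle), 1.0 - math.cos(angle)
    return [[(1.0 if i == j else 0.0) + s * k[i][j] + one_minus_cos * k2[i][j]
             for j in range(3)] for i in range(3)]


def second_initial_value(r, central_axis):
    """Rotate the first posture candidate r (3x3 rotation matrix) about
    central_axis x normal so that the marker normal (third column of r)
    is mirrored across the central axis (assumed choice of angle)."""
    length = math.sqrt(dot(central_axis, central_axis))
    v = [c / length for c in central_axis]
    normal = [r[0][2], r[1][2], r[2][2]]
    axis = cross(v, normal)
    axis_length = math.sqrt(dot(axis, axis))
    if axis_length < 1e-12:
        # Normal lies along the central axis: there is nothing to flip.
        return [row[:] for row in r]
    axis = [c / axis_length for c in axis]
    theta = math.atan2(axis_length, dot(v, normal))
    return matmul(rotation_about_axis(axis, -2.0 * theta), r)
```

The returned matrix would then serve as the starting point for a second run of the iterative estimation, in place of the first initial value.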

In this way, because the second estimation unit d2 estimates the second posture candidate only when the determination unit 6e determines that the first posture candidate is not correct, the first posture candidate estimation unit 6d (candidate estimation means) estimates at least one posture candidate of the marker M with respect to the terminal body that includes the imaging unit 3.

The determination unit 6e determines whether the first posture candidate is correct.
That is, the determination unit (determination means) 6e determines, according to a predetermined criterion, whether the first posture candidate estimated by the first estimation unit d1 is correct. Specifically, the determination unit 6e determines whether the first posture candidate of the marker M with respect to the terminal body is correct based on the temporary posture of the marker M designated by the temporary posture designation unit 6a.
For example, when the matrix R representing the first posture candidate of the marker M with respect to the terminal body is a 3×3 rotation matrix, the elements r₁₁, r₂₁, and r₃₁ of the first column represent the direction of the X axis of the marker coordinate system in the coordinate system of the terminal body, the elements r₁₂, r₂₂, and r₃₂ of the second column represent the direction of the Y axis of the marker coordinate system, and the elements r₁₃, r₂₃, and r₃₃ of the third column represent the direction of the Z axis of the marker coordinate system. Here, for example, when the temporary posture designation unit 6a has designated in advance that the normal to the two-dimensional plane of the marker M points upward, the determination unit 6e checks the sign of a predetermined element of the matrix R representing the first posture candidate (for example, the element r₂₃ in the second row and third column) to determine whether the first posture candidate matches the temporary posture. That is, because the normal direction of the marker M coincides with the Z axis of the marker coordinate system, the determination unit 6e can determine whether the normal of the marker M points in the positive direction of the Y axis of the terminal body's coordinate system by checking the sign of r₂₃, which, among the third-column elements representing the direction of the Z axis of the marker coordinate system, is the element corresponding to the Y axis of the terminal body's coordinate system.
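The sign check on r₂₃ can be written as a short sketch. It assumes that the positive Y direction of the terminal body's coordinate system corresponds to "up", as the example in the text suggests. The string labels `"up"` and `"down"` and the function names are hypothetical.

```python
def normal_points_up(r):
    """True when the marker Z axis (third column of the 3x3 rotation
    matrix r, given as a list of rows) has a positive component along
    the terminal body's Y axis, i.e. when r23 > 0."""
    return r[1][2] > 0


def first_candidate_matches(r, temporary_posture):
    """Compare the first posture candidate r with the designated
    temporary posture, given as "up" or "down" (hypothetical labels)."""
    return normal_points_up(r) == (temporary_posture == "up")
```

Only a single matrix element is inspected, so the check costs almost nothing per frame, which fits its role as a gate before the more expensive second estimation.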

Note that the above method of judging the first posture candidate is only an example, and it may be changed as appropriate.
Specifically, for example, the determination unit 6e may determine whether the first posture candidate is correct based on how the posture of the marker M with respect to the terminal body, which is sequentially specified as the optimal solution by the first posture specifying unit 6f (described later), changes over time. For example, when the frame images F are captured and acquired at a predetermined imaging frame rate (for example, 60 fps), the determination unit 6e may calculate, from the predetermined element of the first posture candidates corresponding to the plurality of frame images F, … acquired within a predetermined time, the proportion of frames in which the posture of the marker M is in a predetermined orientation (for example, upward), and may determine that the first posture candidate is correct when that proportion is equal to or greater than a predetermined value. Alternatively, for example, the determination unit 6e may determine whether the first posture candidate is correct according to whether the predetermined element of the first posture candidate specified this time matches the predetermined element of the first posture candidate specified the previous time.
In this case, the mobile terminal 100 does not necessarily need to include the temporary posture designation unit 6a.
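The two history-based checks can be illustrated with a short sketch. The function names, the threshold `min_ratio`, and the reading of "the predetermined element matches" as "the sign of r₂₃ agrees" are assumptions made for illustration; the specification only speaks of a predetermined value and a predetermined element.

```python
def first_candidate_is_correct(r23_history, expect_positive_y, min_ratio=0.8):
    """Judge by the share of recent frames whose r23 sign shows the expected orientation."""
    if not r23_history:
        return False  # no frames observed yet, so nothing can be confirmed
    hits = sum(1 for r23 in r23_history if (r23 > 0) == expect_positive_y)
    return hits / len(r23_history) >= min_ratio


def same_as_previous(r23_prev, r23_now):
    """Alternative check: does the sign of r23 agree with the previous frame?"""
    return (r23_prev > 0) == (r23_now > 0)
```

At 60 fps, a window of one second would hold the r₂₃ values of 60 consecutive first posture candidates.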

The first posture specifying unit 6f specifies the optimal solution for the posture of the marker M with respect to the terminal body.
That is, the first posture specifying unit (specifying means) 6f specifies one of the at least one posture candidate estimated by the first posture candidate estimation unit 6d as the optimal solution for the posture of the marker M with respect to the terminal body. Specifically, the first posture specifying unit 6f specifies either the first posture candidate or the second posture candidate as the optimal solution.
For example, when the determination unit 6e determines that the first posture candidate is correct, the first posture specifying unit 6f specifies the first posture candidate as the optimal solution; when the first posture candidate is determined to be incorrect, it specifies the second posture candidate estimated by the second estimation unit d2 as the optimal solution.
In other words, when the determination unit 6e determines that the first posture candidate is correct, only one posture candidate (the first posture candidate) has been estimated by the first posture candidate estimation unit 6d, and the first posture specifying unit 6f specifies that first posture candidate as the matrix R representing the posture of the marker M with respect to the terminal body. In contrast, when the determination unit 6e determines that the first posture candidate is incorrect, the second estimation unit d2 of the first posture candidate estimation unit 6d additionally estimates the second posture candidate, and the first posture specifying unit 6f specifies that second posture candidate as the matrix R representing the posture of the marker M with respect to the terminal body.
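The selection between the two candidates can be sketched as below. The function and parameter names are hypothetical, and `is_correct` and `estimate_second` merely stand in for the determination unit 6e and the second estimation unit d2.

```python
def specify_optimal_posture(first_candidate, is_correct, estimate_second):
    """Return the first candidate if it is judged correct; otherwise estimate
    the second candidate from it and return that instead."""
    if is_correct(first_candidate):
        return first_candidate
    # The second candidate is only computed when the first one is rejected.
    return estimate_second(first_candidate)
```

Passing the second estimator as a callable mirrors the point made above: when the first candidate is accepted, no second candidate is ever estimated.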

The image generation unit 6g generates a virtual image (not shown) in which a virtual object (for example, a three-dimensional model; not shown) corresponding to the marker M is superimposed on the marker image region S of the frame images F sequentially acquired by the image acquisition unit 6b.
Specifically, for example, each time a frame image F is acquired, the image generation unit 6g adjusts the posture (orientation) and position of the marker image in the frame image F based on the posture and positional relationship of the marker M with respect to the terminal body specified as the optimal solution by the first posture specifying unit 6f, acquires the virtual object corresponding to the marker image from a predetermined recording means, and generates the image data of the virtual image. The image generation unit 6g then sequentially outputs the image data of each generated virtual image to the memory 2 and stores it there.
The image data of the virtual image generated by the image generation unit 6g may, for example, be encoded in a predetermined compression format (for example, JPEG) and recorded on a recording medium (not shown) such as a nonvolatile memory (flash memory).

The display unit 7 is composed of, for example, a liquid crystal display panel, and displays an image captured by the imaging unit 3 (for example, a live view image) on its display screen based on a video signal from the display control unit 8.

The display control unit 8 performs control to read out the display image data temporarily stored in the memory 2 and display it on the display unit 7.
Specifically, the display control unit 8 includes a VRAM (Video Random Access Memory), a VRAM controller, a digital video encoder, and the like. The luminance signal Y and the color difference signals Cb and Cr are read from the memory 2 and stored in the VRAM (not shown) under the control of the central control unit 1. The digital video encoder reads these signals from the VRAM via the VRAM controller at a predetermined playback frame rate (for example, 30 fps), generates a video signal from the data, and outputs it to the display unit 7.
For example, the display control unit 8 causes the display unit 7 to show a live view of the plurality of frame images F, … captured by the imaging unit 3 and the imaging control unit 4 and generated by the image data generation unit 5, updating them sequentially at a predetermined display frame rate. The display control unit 8 also, for example, sequentially acquires the image data of the virtual images from the memory 2 and causes the display unit 7 to show them sequentially in live view.

The transmission/reception unit 9 handles calls with an external user of an external device connected via the communication network N.
Specifically, the transmission/reception unit 9 includes a microphone 9a, a speaker 9b, a data conversion unit 9c, and the like. The transmission/reception unit 9 A/D-converts the user's transmitted voice input from the microphone 9a with the data conversion unit 9c and outputs the resulting transmitted voice data to the central control unit 1. Under the control of the central control unit 1, it also D/A-converts voice data, such as received voice data output from the communication control unit 10, with the data conversion unit 9c and outputs it from the speaker 9b.

The communication control unit 10 transmits and receives data via the communication network N and the communication antenna 10a.
That is, the communication antenna 10a is an antenna capable of transmitting and receiving data in accordance with the predetermined communication method that the mobile terminal 100 uses for communication with a radio base station (not shown), for example, the W-CDMA (Wideband Code Division Multiple Access) method or the GSM (Global System for Mobile Communications; registered trademark) method. The communication control unit 10 transmits and receives data to and from the radio base station via the communication antenna 10a over a communication channel set by that communication method, in accordance with the communication protocol corresponding to the method.
That is, based on an instruction signal output from the central control unit 1, the communication control unit 10 exchanges with the external device of the communication partner the voice of a call with the external user of that device, as well as e-mail data.
Note that this configuration of the communication control unit 10 is only an example and may be changed as appropriate. For example, although not illustrated, a wireless LAN module may be mounted so that the communication network N can be accessed via an access point.

The communication network N is a communication network to which the mobile terminal 100 is connected via a radio base station, a gateway server (not shown), or the like. The communication network N is built using dedicated lines or existing general public lines, and various line forms such as a LAN (Local Area Network) or a WAN (Wide Area Network) can be applied. The communication network N includes, for example, various communication networks such as a telephone network, an ISDN network, dedicated lines, a mobile communication network, communication satellite links, and a CATV network, as well as IP networks, VoIP (Voice over Internet Protocol) gateways, Internet service providers, and the like.

The operation input unit 11 is used to input various instructions to the terminal body.
Specifically, the operation input unit 11 includes various buttons (none of which are shown): a shutter button for instructing the shooting of a subject; up, down, left, and right cursor buttons and a decision button for selecting modes, functions, and the like; communication-related buttons for instructing the making and receiving of calls and the sending and receiving of e-mail; and numeric and symbol buttons for entering text.
When the user operates one of these buttons, the operation input unit 11 outputs an operation instruction corresponding to the operated button to the central control unit 1. The central control unit 1 causes each unit to execute a predetermined operation (for example, imaging a subject, making or receiving a call, or sending or receiving e-mail) in accordance with the operation instruction received from the operation input unit 11.

The operation input unit 11 may also include a touch panel provided integrally with the display unit 7, and may output to the central control unit 1 an operation instruction corresponding to a predetermined operation performed by the user on the touch panel.

Next, the marker detection process performed by the mobile terminal 100 will be described with reference to FIG. 2.
FIG. 2 is a flowchart illustrating an example of the operation of the marker detection process.
In the following, the marker M detected by the marker detection process is assumed to be placed at a predetermined position in the pot of, for example, a potted houseplant.

As shown in FIG. 2, first, the temporary posture designation unit 6a designates the orientation of the normal to the two-dimensional plane of the marker M (for example, downward), input through a predetermined user operation on the operation input unit 11, as the temporary posture of the marker M with respect to the terminal body (step S1).
Thereafter, the imaging control unit 4 causes the imaging unit 3 to image the marker M, and the image data generation unit 5 generates the image data of the plurality of frame images F, … transferred from the electronic imaging unit 3b (step S2). Here, for example, when the focal length of the lens unit is 40 mm (35 mm equivalent) and a substantially frame-shaped square marker M with 12 mm sides is imaged from about 40 cm away, one side of the marker image region S is about 50 pixels in a VGA-size frame image F. In such a situation, errors are likely to occur in detecting the coordinates of the four corner points of the marker image region S, described later.
The image data generation unit 5 then sequentially outputs the YUV data of each generated frame image F to the memory 2 and stores it there.

Next, the image acquisition unit 6b of the marker detection processing unit 6 acquires from the memory 2 the image data of one of the frame images F sequentially generated by the image data generation unit 5 (step S3). The coordinate detection unit 6c then performs a predetermined feature extraction process on the image data of a binarized version of the frame image F acquired by the image acquisition unit 6b to extract the substantially frame-shaped marker image region S, and detects the position coordinates of the four corner points constituting the extracted marker image region S (step S4).

Next, the first estimation unit d1 uses the geometric positional relationship of the coordinates of the four corner points of the marker image region S to calculate a first initial value for the coordinate transformation formula (for example, formula (1)) that defines the correspondence between the coordinates of each point of the marker M in three-dimensional space and the two-dimensional plane coordinates in the frame image F (step S5). The first estimation unit d1 then solves the coordinate transformation formula of formula (1) using the calculated first initial value, estimating by bundle adjustment the matrices R and T that represent the posture and positional relationship of the marker M with respect to the terminal body, and specifies the estimated matrix R as the first posture candidate of the marker M with respect to the terminal body (step S6).
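As background for the refinement step, the sketch below shows the reprojection error that a bundle adjustment minimizes, starting from an initial R and T. It assumes a generic pinhole (central projection) camera model rather than the specification's formula (1), which is not reproduced in this passage; the intrinsics `f`, `cx`, `cy` and all function names are hypothetical, and the iterative optimizer itself is omitted.

```python
def project(point, R, T, f, cx, cy):
    """Project a marker-frame 3D point to pixel coordinates with a pinhole model."""
    # Camera-frame coordinates: Xc = R * Xm + T
    xc, yc, zc = (
        sum(R[i][j] * point[j] for j in range(3)) + T[i] for i in range(3)
    )
    return (f * xc / zc + cx, f * yc / zc + cy)


def reprojection_error(corners_3d, corners_2d, R, T, f, cx, cy):
    """Sum of squared pixel distances between projected and detected corners."""
    total = 0.0
    for p3, (u, v) in zip(corners_3d, corners_2d):
        pu, pv = project(p3, R, T, f, cx, cy)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

Because this error surface can have more than one minimum for a small planar marker, the initial value supplied to the optimizer influences which solution it converges to.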

Next, the determination unit 6e determines whether the first posture candidate of the marker M with respect to the terminal body is correct, based on the temporary posture of the marker M designated by the temporary posture designation unit 6a (step S7). Specifically, the determination unit 6e checks the sign of a predetermined element (for example, the element r₂₃ in the second row and third column) of the matrix R representing the first posture candidate of the marker M with respect to the terminal body to determine whether the first posture candidate matches the temporary posture, and determines whether the first posture candidate is correct according to that result.

When it is determined in step S7 that the first posture candidate is correct (step S7; YES), the first posture specifying unit 6f specifies the first posture candidate as the optimal solution for the matrix R representing the posture of the marker M with respect to the terminal body (step S8).
The image generation unit 6g then adjusts the posture and position of the marker image in the frame image F based on the posture (first posture candidate) and positional relationship of the marker M with respect to the terminal body specified as the optimal solution, and generates the image data of a virtual image in which the virtual object corresponding to the marker image is superimposed on the frame image F. The display control unit 8 then displays the virtual image on the display unit 7 in live view (step S9).
Thereafter, the CPU of the central control unit 1 returns the process to step S3.

On the other hand, when it is determined in step S7 that the first posture candidate is incorrect (step S7; NO), the second estimation unit d2 calculates a second initial value for the coordinate transformation formula using the first posture candidate (step S10). Specifically, the second estimation unit d2 calculates the cross product of the axis passing through the center of the terminal body, through which the optical axis passes, and the center of the marker M (the central axis) and the normal vector of the two-dimensional plane of the marker M. The second estimation unit d2 then calculates the second initial value by rotating the matrix R, which is the first posture candidate, about the calculated cross product as the axis so that the posture becomes opposite (for example, upward) to the designated temporary posture of the marker M (for example, downward).
The second estimation unit d2 then solves the coordinate transformation formula of formula (1) using the calculated second initial value, estimating by bundle adjustment the matrices R and T that represent the posture and positional relationship of the marker M with respect to the terminal body, and specifies the estimated matrix R as the second posture candidate of the marker M with respect to the terminal body (step S11).
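One way to picture step S10 is the sketch below, which rotates the first posture candidate about the cross product of the central axis and the marker normal. The specification does not state the rotation angle; the choice made here, mirroring the normal across the central axis by rotating through −2θ (where θ is the angle between the two vectors), is an assumption, as are the function names and the use of T as the direction from the terminal body to the marker center.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]


def second_initial_value(R, T):
    """Rotate R about (central axis x marker normal) so that the normal is
    mirrored across the central axis. The angle -2*theta is an assumption."""
    v = unit(T)                              # central axis: terminal -> marker center
    n = [R[0][2], R[1][2], R[2][2]]          # marker normal = third column of R
    axis = cross(v, n)
    if dot(axis, axis) < 1e-12:              # normal parallel to the axis: nothing to flip
        return [row[:] for row in R]
    k = unit(axis)
    theta = math.atan2(math.sqrt(dot(axis, axis)), dot(v, n))
    phi = -2.0 * theta
    c, s = math.cos(phi), math.sin(phi)
    # Rodrigues' formula: Rot = c*I + s*[k]x + (1 - c) * k k^T
    K = [[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]]
    rot = [[c * (i == j) + s * K[i][j] + (1 - c) * k[i] * k[j]
            for j in range(3)] for i in range(3)]
    return [[sum(rot[i][m] * R[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]
```

When the marker lies near the optical axis, this mirroring reverses the sign of r₂₃, which is the property the sign check in step S7 relies on. The result is only an initial value and would still be refined by bundle adjustment in step S11.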

Thereafter, the first posture specifying unit 6f specifies the second posture candidate as the optimal solution for the matrix R representing the posture of the marker M with respect to the terminal body (step S12). The CPU of the central control unit 1 then moves the process to step S9 and controls the execution of the subsequent processes.
That is, in step S9, the image generation unit 6g adjusts the posture and position of the marker image in the frame image F based on the posture (second posture candidate) and positional relationship of the marker M with respect to the terminal body specified as the optimal solution, and generates the image data of a virtual image in which the virtual object corresponding to the marker image is superimposed on the frame image F. The display control unit 8 then displays the virtual image on the display unit 7 in live view.

The above processes are repeated each time a frame image F is acquired in step S3.

As described above, according to the mobile terminal 100 of Embodiment 1, at least one posture candidate of the marker M with respect to the imaging unit 3 (terminal body) is estimated based on the coordinates of the plurality of points constituting the image region S of the marker (specific subject) M, and one of the estimated posture candidates is specified as the optimal solution for the posture of the marker M with respect to the imaging unit 3. Therefore, even in an environment where detection errors in the points constituting the marker image region S arise from the imaging environment or other conditions and a local solution is likely to be calculated, the optimal solution can be properly specified without falling into a local solution, and the posture of the marker M with respect to the imaging unit 3 can be properly estimated.
That is, for example, when the distance from the imaging unit 3 of the mobile terminal 100 to the marker M is relatively long, the projection is regarded as a parallel projection, and detection errors in the points constituting the marker image region S are likely to occur. In strong sunlight or in darkness, the marker image region S cannot be properly identified, and as a result detection errors in the points constituting the marker image region S are likely to occur. The reliability of the detection results for those points also decreases when, for example, the imaging frame rate is relatively high. In such environments, a local solution is likely to be calculated even if the posture of the marker M with respect to the imaging unit 3 is estimated from the coordinates of the points in the marker image region S using a central projection model.
Therefore, in this embodiment, rather than directly estimating a single optimal solution for the posture of the marker M with respect to the imaging unit 3, posture candidates of the marker M with respect to the imaging unit 3, which may include a local solution, are estimated first, and the optimal solution is then specified from among those candidates. This suppresses the situation in which a local solution is ultimately specified as the posture of the marker M with respect to the imaging unit 3.
As a result, the posture of the marker M with respect to the imaging unit 3 can be properly estimated, and consequently the virtual object corresponding to the marker M can be properly displayed.

In addition, the first posture candidate of the marker (specific subject) M with respect to the imaging unit 3 is estimated based on the coordinates of the plurality of points in the marker image region S, and when that first posture candidate is determined to be incorrect, the second posture candidate of the marker M with respect to the imaging unit 3 is estimated based on the estimation result of the first posture candidate. Either the first posture candidate or the second posture candidate can therefore be specified as the optimal solution. That is, when the first posture candidate is determined to be correct, it can be specified as the optimal solution, and when it is determined to be incorrect, the second posture candidate can be specified as the optimal solution.
Specifically, for example, when the posture of the marker M with respect to the imaging unit 3 is estimated from the coordinates of the plurality of points in the marker image region S using a pseudo central projection model, a plurality of posture candidates that differ in the orientation of the normal to the two-dimensional plane of the marker M can be estimated. In this case, when the first posture candidate is determined to be correct, it can be specified as the optimal solution; when it is determined to be incorrect, a second posture candidate different from the first can be estimated in consideration of the estimation result of the first posture candidate and specified as the optimal solution.

In addition, whether the first posture candidate is correct can be properly determined based on the temporary posture of the marker (specific subject) M with respect to the imaging unit 3 designated in advance, or on how the posture of the marker M with respect to the imaging unit 3, sequentially specified as the optimal solution, changes. The optimal solution for the posture of the marker M with respect to the imaging unit 3 can therefore be properly specified.

In the marker detection process of the above embodiment, the object judged in step S7 for agreement with the temporary posture is the matrix R representing the posture of the marker M with respect to the terminal body, estimated by solving the coordinate transformation formula using the first initial value. This is only an example, however, and the object may instead be, for example, the matrix R representing the first initial value calculated in step S5 from the coordinates of the four corner points of the marker image region S using their geometric positional relationship.

[Embodiment 2]
The mobile terminal 200 of Embodiment 2 is described below.
The mobile terminal 200 of Embodiment 2 has substantially the same configuration as the mobile terminal 100 of Embodiment 1 except for the points described in detail below, and a detailed description of the common parts is omitted.

FIG. 5 is a block diagram illustrating a schematic configuration of the mobile terminal 200 of Embodiment 2 to which the present invention is applied.
As shown in FIG. 5, the marker detection processing unit 206 of the mobile terminal 200 of Embodiment 2 includes the temporary posture designation unit 6a, the image acquisition unit 6b, the coordinate detection unit 6c, a second posture candidate estimation unit 206d, a second posture specifying unit 206f, and the image generation unit 6g.
The temporary posture designation unit 6a, the image acquisition unit 6b, the coordinate detection unit 6c, and the image generation unit 6g have substantially the same configuration and processing as those of the mobile terminal 100 of Embodiment 1, and a detailed description is omitted here.

The second posture candidate estimation unit 206d estimates a plurality of posture candidates of the marker M with respect to the terminal body including the imaging unit 3.
That is, the second posture candidate estimation unit 206d estimates a plurality of posture candidates of the marker M with respect to the terminal body based on the coordinates of the plurality of points detected by the coordinate detection unit 6c. Specifically, the second posture candidate estimation unit 206d refers to a database (not shown) or the like recorded in a predetermined recording means and, using a pseudo central projection model, calculates from the position coordinates of the four corner points of the marker image region S a plurality of initial values for the coordinate transformation formula (for example, formula (1)) that defines the correspondence between the coordinates of each point of the marker M in three-dimensional space and the two-dimensional plane coordinates in the frame image F. In other words, the second posture candidate estimation unit 206d estimates, as posture candidates, a plurality of initial values of the matrix R representing the posture of the marker M with respect to the terminal body.

The second posture specifying unit 206f specifies the optimal solution for the posture of the marker M with respect to the terminal body.
Specifically, the second posture specifying unit 206f specifies, as the optimal solution, the one of the plurality of posture candidates estimated by the second posture candidate estimation unit 206d that substantially matches the temporary posture designated by the temporary posture designation unit 6a. For example, when the temporary posture designation unit 6a has designated in advance that the normal to the two-dimensional plane of the marker M points upward, the second posture specifying unit 206f checks the sign of a predetermined element (for example, the element r₂₃ in the second row and third column) of the matrix R representing each posture candidate (initial value) of the marker M with respect to the terminal body, and thereby determines whether that candidate matches the temporary posture. The second posture specifying unit 206f then specifies the posture candidate that substantially matches the temporary posture and sets it as the final initial value of the coordinate transformation formula (for example, formula (1)).
Thereafter, the second posture specifying unit 206f solves the coordinate transformation formula of formula (1) using the specified final initial value, estimating by bundle adjustment the matrices R and T that represent the posture and positional relationship of the marker M with respect to the terminal body, and specifies the estimated matrix R as the optimal solution for the posture of the marker M with respect to the terminal body.
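The sign check on r₂₃ can be illustrated with a minimal Python sketch. The function names, the row-major list representation of R, and the convention that a positive r₂₃ corresponds to an upward-pointing normal are all assumptions made for illustration; the text only states that the sign of a predetermined element is checked.

```python
from typing import Optional, Sequence

Matrix = Sequence[Sequence[float]]  # 3x3 rotation matrix, row-major (assumed layout)


def matches_temporary(R: Matrix, normal_up: bool) -> bool:
    """Compare the sign of r23 (second row, third column) with the temporary posture.

    Assumption: a positive r23 means the marker normal points upward.
    """
    r23 = R[1][2]
    return (r23 > 0) == normal_up


def select_initial_value(candidates: Sequence[Matrix], normal_up: bool) -> Optional[Matrix]:
    """Return the first candidate consistent with the temporary posture, or None."""
    for R in candidates:
        if matches_temporary(R, normal_up):
            return R
    return None
```

The selected matrix would then serve as the final initial value handed to the refinement step; returning `None` when no candidate matches is a choice of this sketch, not something the text specifies.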

Note that the above method of specifying the optimal solution is merely an example; it is not limited to this and can be changed as appropriate.
Specifically, for example, the second posture specifying unit 206f may sequentially specify the posture of the marker M with respect to the terminal body as the optimal solution and then specify the optimal solution again based on the manner in which that posture changes. For example, when the frame images F are captured and acquired at a predetermined imaging frame rate (for example, 60 fps), the second posture specifying unit 206f may calculate, based on a predetermined element of the posture corresponding to each of the plurality of frame images F, ... acquired within a predetermined time, the ratio at which the posture of the marker M is in a predetermined direction (for example, upward), and may determine that the posture of the marker M with respect to the terminal body is correct (is the optimal solution) when that ratio is equal to or greater than a predetermined value. Alternatively, for example, the second posture specifying unit 206f may compare a predetermined element of the posture of the marker M with respect to the terminal body specified this time with the corresponding element of the posture previously specified as the optimal solution, and determine whether the posture of the marker M with respect to the terminal body is correct according to whether the two match.
In this case, the mobile terminal 200 does not necessarily have to include the temporary posture designation unit 6a.
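The two temporal checks described above can be sketched as follows. This is only an illustration: the class name `PostureVote`, the window length, the threshold, and the use of the sign of r₂₃ as the "predetermined element" test are assumptions, since the text leaves the predetermined time and predetermined value unspecified.

```python
from collections import deque


class PostureVote:
    """Track, over a sliding window of frames, how often the posture points upward."""

    def __init__(self, window: int = 60, min_ratio: float = 0.8) -> None:
        self.flags = deque(maxlen=window)  # one boolean per recent frame
        self.min_ratio = min_ratio         # hypothetical "predetermined value"

    def update(self, R) -> bool:
        """Record the current frame and report whether the upward ratio is high enough."""
        self.flags.append(R[1][2] > 0)     # assumption: r23 > 0 means "upward"
        ratio = sum(self.flags) / len(self.flags)
        return ratio >= self.min_ratio


def same_orientation(previous_R, current_R) -> bool:
    """Compare the sign of r23 between the previous and the current optimal solution."""
    return (previous_R[1][2] > 0) == (current_R[1][2] > 0)
```

With a 60 fps stream, a window of 60 entries corresponds to roughly one second of frames; both numbers would need tuning for a real device.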

Next, the marker detection process performed by the mobile terminal 200 will be described with reference to FIG. 6.
FIG. 6 is a flowchart illustrating an example of an operation related to the marker detection process.
Except for the points described in detail below, the marker detection process performed by the mobile terminal 200 is substantially the same as that performed by the mobile terminal 100 of the first embodiment, and detailed description thereof is omitted.

As illustrated in FIG. 6, the mobile terminal 200 of the second embodiment executes the processes of steps S1 to S4 in the same manner as in the marker detection process of the first embodiment.
That is, in step S1, the temporary posture designation unit 6a designates the temporary posture of the marker M with respect to the terminal body, and in step S2, the image data generation unit 5 generates image data of the plurality of frame images F, ... transferred from the electronic imaging unit 3b. In step S3, the image acquisition unit 6b acquires the image data of any one of the frame images F sequentially generated by the image data generation unit 5, and in step S4, the coordinate detection unit 6c extracts the marker image region S from the frame image F and detects the position coordinates of the four corner points constituting the marker image region S.

Thereafter, the second posture candidate estimation unit 206d estimates, as posture candidates, a plurality of initial values of the matrix R representing the posture of the marker M with respect to the terminal body (step S25). Specifically, applying a pseudo-central projection model to the position coordinates of the four corner points of the marker image region S, the second posture candidate estimation unit 206d calculates a plurality of initial values for the coordinate transformation formula (for example, formula (1)) that defines the correspondence between the coordinates of each point of the marker M in three-dimensional space and the two-dimensional plane coordinates in the frame image F.

Next, the second posture specifying unit 206f specifies, from among the plurality of posture candidates estimated by the second posture candidate estimation unit 206d, the one that substantially matches the temporary posture designated by the temporary posture designation unit 6a (step S26). Specifically, the second posture specifying unit 206f checks the sign of a predetermined element (for example, the element r₂₃ in the second row and third column) of the matrix R representing each posture candidate of the marker M with respect to the terminal body to determine whether the candidate matches the temporary posture, and sets the posture candidate that substantially matches the temporary posture as the final initial value of the coordinate transformation formula (for example, formula (1)).
Subsequently, the second posture specifying unit 206f solves the coordinate transformation formula of formula (1) using the specified final initial value, thereby estimating by bundle adjustment the matrices R and T that represent the posture and positional relationship of the marker M with respect to the terminal body. The second posture specifying unit 206f then specifies the estimated matrix R as the optimal solution for the posture of the marker M with respect to the terminal body.
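Bundle adjustment is commonly formulated as minimizing a reprojection error starting from an initial value. Formula (1) is not reproduced in this section, so the sketch below assumes a simple pinhole projection with a hypothetical focal length `focal` and the principal point at the origin; it shows only the error term that such a refinement would reduce, not the optimizer itself.

```python
from typing import Sequence, Tuple

Vec2 = Tuple[float, float]
Vec3 = Tuple[float, float, float]


def project(R, T: Vec3, point: Vec3, focal: float) -> Vec2:
    """Map a marker point to image coordinates (assumed pinhole model)."""
    # Camera-frame coordinates: R * point + T
    xc, yc, zc = (
        sum(R[i][j] * point[j] for j in range(3)) + T[i] for i in range(3)
    )
    return (focal * xc / zc, focal * yc / zc)


def reprojection_error(
    R,
    T: Vec3,
    marker_points: Sequence[Vec3],
    observed: Sequence[Vec2],
    focal: float,
) -> float:
    """Sum of squared distances between projected and detected corner points."""
    total = 0.0
    for p, (u, v) in zip(marker_points, observed):
        pu, pv = project(R, T, p, focal)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total
```

An iterative solver would adjust R and T to reduce this value; because the error surface can have more than one minimum for a planar marker, the choice of initial value determines which minimum the refinement reaches.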

Then, as in the marker detection process of the first embodiment, in step S9 the image generation unit 6g generates image data of a virtual image in which the virtual object corresponding to the marker image is superimposed on the frame image F, and the display control unit 8 causes the display unit 7 to display the virtual image as a live view.
Thereafter, the CPU of the central control unit 1 returns the process to step S3.
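The per-frame flow of FIG. 6 (steps S3, S4, S25, S26, and S9, repeated for each frame) can be outlined as a loop over hypothetical callables. None of these function names come from the source, and skipping a frame when no marker or no matching candidate is found is an assumption of this sketch.

```python
from typing import Callable, Iterable


def run_marker_detection(
    frames: Iterable[object],
    detect_corners: Callable,
    estimate_candidates: Callable,
    select_initial: Callable,
    refine: Callable,
    render: Callable,
) -> None:
    """Process frames one by one, mirroring the loop that returns to step S3."""
    for frame in frames:                           # step S3: acquire one frame image
        corners = detect_corners(frame)            # step S4: four corner coordinates
        if corners is None:
            continue                               # assumption: no marker in this frame
        candidates = estimate_candidates(corners)  # step S25: several initial values of R
        initial = select_initial(candidates)       # step S26: match the temporary posture
        if initial is None:
            continue                               # assumption: no candidate matched
        R, T = refine(initial, corners)            # step S26: refinement from the initial value
        render(frame, R, T)                        # step S9: superimpose and display
```

Steps S1 and S2 (designating the temporary posture and generating the frame image data) happen before this loop and are represented here only by the `frames` argument and whatever state `select_initial` closes over.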

As described above, according to the mobile terminal 200 of the second embodiment, as with the mobile terminal 100 of the first embodiment, even in an environment where local solutions are easily calculated because detection errors arise in the plurality of points constituting the marker image region S owing to the environment, various conditions, and the like at the time the marker M is imaged, the optimal solution can be properly specified without falling into a local solution, and the posture of the marker M with respect to the imaging unit 3 can be properly estimated.
In particular, because a plurality of posture candidates are estimated in advance, the optimal solution can be specified by selection from among these candidates, so the optimal solution for the posture of the marker M with respect to the imaging unit 3 can be specified easily.

In addition, among the estimated plurality of posture candidates, the one that substantially matches the temporary posture of the marker (specific subject) M with respect to the imaging unit 3 designated in advance is specified as the optimal solution, or the posture of the marker M with respect to the imaging unit 3 is sequentially specified as the optimal solution and the optimal solution is specified again based on the manner of change in that posture. The optimal solution for the posture of the marker M with respect to the imaging unit 3 can therefore be properly specified.

The present invention is not limited to the above-described embodiments, and various improvements and design changes may be made without departing from the spirit of the present invention.
For example, although the matrix R representing the posture of the marker M with respect to the terminal body (imaging unit 3) is specified by solving a predetermined coordinate transformation formula, this is merely an example of the posture of the marker M with respect to the terminal body; it is not limited to this and can be changed as appropriate.
Further, although the matrix T representing the positional relationship of the marker M with respect to the terminal body is specified by solving the coordinate transformation formula, the positional relationship of the marker M with respect to the terminal body does not necessarily have to be specified.

Furthermore, the configuration of the mobile terminal 100 (200) illustrated in the above embodiments is merely an example and is not limited to this; it can be changed as appropriate as long as it includes at least the acquisition means, the detection means, the candidate estimation means, and the specifying means. That is, the mobile terminal 100 (200) does not necessarily have to include the imaging unit 3, and may acquire an image captured by an external imaging device and perform the process of detecting the specific subject (marker M).

In addition, in the above embodiments, the functions of the acquisition means, the detection means, the candidate estimation means, and the specifying means are realized by driving the image acquisition unit 6b, the coordinate detection unit 6c, the first posture candidate estimation unit 6d (second posture candidate estimation unit 206d), and the first posture specifying unit 6f (second posture specifying unit 206f) under the control of the central control unit 1 of the mobile terminal 100 (200). However, the configuration is not limited to this, and these functions may instead be realized by the CPU of the central control unit 1 executing a predetermined program or the like.
That is, a program including an acquisition processing routine, a detection processing routine, a candidate estimation processing routine, and a specifying processing routine is stored in a program memory that stores programs. The acquisition processing routine may cause the CPU of the central control unit 1 to function as means for sequentially acquiring images that include a specific subject and are captured by the imaging unit 3. The detection processing routine may cause the CPU of the central control unit 1 to function as means for detecting the coordinates of a plurality of points constituting the image region of the specific subject in the acquired image. The candidate estimation processing routine may cause the CPU of the central control unit 1 to function as means for estimating at least one posture candidate of the specific subject with respect to the imaging unit 3 based on the coordinates of the detected plurality of points. The specifying processing routine may cause the CPU of the central control unit 1 to function as means for specifying any one of the estimated posture candidates as the optimal solution for the posture of the specific subject with respect to the imaging unit 3.

Similarly, the first estimation means, the second estimation means, the determination means, and the designation means may also be realized by the CPU of the central control unit 1 executing a predetermined program or the like.

Furthermore, as a computer-readable medium storing the program for executing each of the above processes, in addition to a ROM, a hard disk, or the like, a nonvolatile memory such as a flash memory or a portable recording medium such as a CD-ROM can also be used. A carrier wave may also be used as a medium for providing the program data via a predetermined communication line.

[Appendix]
Although several embodiments of the present invention have been described, the scope of the present invention is not limited to the above-described embodiments and includes the scope of the invention described in the claims and its equivalents.
The inventions described in the claims originally attached to the request of this application are appended below. The claim numbers given in the appendix are those of the claims originally attached to the request of this application.
<Claim 1>
A subject detection device comprising:
acquisition means for sequentially acquiring images that include a specific subject and are captured by an imaging unit;
detection means for detecting the coordinates of a plurality of points constituting an image region of the specific subject in an image acquired by the acquisition means;
candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as an optimal solution for the posture of the specific subject with respect to the imaging unit.
<Claim 2>
The subject detection device according to claim 1, wherein
the candidate estimation means includes:
first estimation means for estimating a first posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
second estimation means for estimating, based on an estimation result of the first estimation means, a second posture candidate of the specific subject with respect to the imaging unit that differs from the first posture candidate, and
the specifying means specifies one of the first posture candidate and the second posture candidate as the optimal solution.
<Claim 3>
The subject detection device according to claim 2, further comprising determination means for determining, according to a predetermined determination criterion, whether the first posture candidate estimated by the first estimation means is correct, wherein
the second estimation means estimates the second posture candidate when the determination means determines that the first posture candidate is not correct.
<Claim 4>
The subject detection device according to claim 3, wherein
the specifying means specifies the first posture candidate as the optimal solution when the determination means determines that the first posture candidate is correct, and specifies the second posture candidate estimated by the second estimation means as the optimal solution when the first posture candidate is determined to be not correct.
<Claim 5>
The subject detection device according to claim 3 or 4, further comprising designation means for designating in advance a temporary posture of the specific subject with respect to the imaging unit, wherein
the determination means determines whether the first posture candidate is correct based on the temporary posture designated by the designation means.
<Claim 6>
The subject detection device according to claim 3 or 4, wherein
the determination means determines whether the first posture candidate is correct based on the manner of change in the posture of the specific subject with respect to the imaging unit that is sequentially specified as the optimal solution by the specifying means.
<Claim 7>
The subject detection device according to claim 1, further comprising designation means for designating in advance a temporary posture of the specific subject with respect to the imaging unit, wherein
the specifying means specifies, as the optimal solution, the one of the plurality of posture candidates estimated by the candidate estimation means that substantially matches the temporary posture designated by the designation means.
<Claim 8>
The subject detection device according to claim 1, wherein
the specifying means sequentially specifies the posture of the specific subject with respect to the imaging unit as the optimal solution and specifies the optimal solution again based on the manner of change in that posture.
<Claim 9>
The subject detection device according to any one of claims 1 to 8, wherein the specific subject is a marker.
<Claim 10>
A subject detection method using a subject detection device, the method comprising:
a process of sequentially acquiring images that include a specific subject and are captured by an imaging unit;
a process of detecting the coordinates of a plurality of points constituting an image region of the specific subject in an acquired image;
a process of estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the detected plurality of points; and
a process of specifying any one of the estimated posture candidates as an optimal solution for the posture of the specific subject with respect to the imaging unit.
<Claim 11>
A program causing a computer of a subject detection device to function as:
acquisition means for sequentially acquiring images that include a specific subject and are captured by an imaging unit;
detection means for detecting the coordinates of a plurality of points constituting an image region of the specific subject in an image acquired by the acquisition means;
candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as an optimal solution for the posture of the specific subject with respect to the imaging unit.

100, 200 Mobile terminal
1 Central control unit
3 Imaging unit
6, 206 Marker detection processing unit
6a Temporary posture designation unit
6b Image acquisition unit
6c Coordinate detection unit
6d First posture candidate estimation unit
d1 First estimation unit
d2 Second estimation unit
206d Second posture candidate estimation unit
6e Determination unit
6f First posture specifying unit
206f Second posture specifying unit
F Frame image
M Marker

Claims

1. A subject detection device comprising:
acquisition means for sequentially acquiring images that include a specific subject and are captured by an imaging unit;
detection means for detecting the coordinates of a plurality of points constituting an image region of the specific subject in an image acquired by the acquisition means;
candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as an optimal solution for the posture of the specific subject with respect to the imaging unit.

2. The subject detection device according to claim 1, wherein
the candidate estimation means includes:
first estimation means for estimating a first posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
second estimation means for estimating, based on an estimation result of the first estimation means, a second posture candidate of the specific subject with respect to the imaging unit that differs from the first posture candidate, and
the specifying means specifies one of the first posture candidate and the second posture candidate as the optimal solution.

3. The subject detection device according to claim 2, further comprising determination means for determining, according to a predetermined determination criterion, whether the first posture candidate estimated by the first estimation means is correct, wherein
the second estimation means estimates the second posture candidate when the determination means determines that the first posture candidate is not correct.

4. The subject detection device according to claim 3, wherein
the specifying means specifies the first posture candidate as the optimal solution when the determination means determines that the first posture candidate is correct, and specifies the second posture candidate estimated by the second estimation means as the optimal solution when the first posture candidate is determined to be not correct.

5. The subject detection device according to claim 3 or 4, further comprising designation means for designating in advance a temporary posture of the specific subject with respect to the imaging unit, wherein
the determination means determines whether the first posture candidate is correct based on the temporary posture designated by the designation means.

6. The subject detection device according to claim 3 or 4, wherein the determination means determines whether the first posture candidate is correct based on the manner of change in the posture of the specific subject with respect to the imaging unit that is sequentially specified as the optimal solution by the specifying means.

7. The subject detection device according to claim 1, further comprising designation means for designating in advance a temporary posture of the specific subject with respect to the imaging unit, wherein
the specifying means specifies, as the optimal solution, the one of the plurality of posture candidates estimated by the candidate estimation means that substantially matches the temporary posture designated by the designation means.

8. The subject detection device according to claim 1, wherein the specifying means sequentially specifies the posture of the specific subject with respect to the imaging unit as the optimal solution and specifies the optimal solution again based on the manner of change in that posture.

9. The subject detection device according to any one of claims 1 to 8, wherein the specific subject is a marker.

10. A subject detection method using a subject detection device, the method comprising:
a process of sequentially acquiring images that include a specific subject and are captured by an imaging unit;
a process of detecting the coordinates of a plurality of points constituting an image region of the specific subject in an acquired image;
a process of estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the detected plurality of points; and
a process of specifying any one of the estimated posture candidates as an optimal solution for the posture of the specific subject with respect to the imaging unit.

11. A program causing a computer of a subject detection device to function as:
acquisition means for sequentially acquiring images that include a specific subject and are captured by an imaging unit;
detection means for detecting the coordinates of a plurality of points constituting an image region of the specific subject in an image acquired by the acquisition means;
candidate estimation means for estimating at least one posture candidate of the specific subject with respect to the imaging unit based on the coordinates of the plurality of points detected by the detection means; and
specifying means for specifying any one of the posture candidates estimated by the candidate estimation means as an optimal solution for the posture of the specific subject with respect to the imaging unit.