JP2024081356A - Information processing device, information processing method, and program

Info

Publication number: JP2024081356A
Application number: JP2022194922A
Authority: JP
Inventors: 稜大市場; Ryota Ichiba
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-12-06
Filing date: 2022-12-06
Publication date: 2024-06-18

Abstract

To enable display of an image on which object information is superimposed so that its left-right relationship matches that of the subject. SOLUTION: An information processing device acquires an image of a subject and acquires object information to be superimposed on the image of the subject. The information processing device has a first mode, in which it generates a composite image by superimposing the object information on the image of the subject without flipping it horizontally, and a second mode, in which it generates a composite image by superimposing horizontally flipped object information on the image of the subject, and it accepts an instruction to switch to the first mode or the second mode. The information processing device also displays the generated composite image. SELECTED DRAWING: Figure 3

Description

本発明は、撮像された画像にオブジェクト情報を重畳する情報処理技術に関する。 The present invention relates to an information processing technology that superimposes object information onto a captured image.

ＡＲ（オーグメンテッドリアリティ、拡張現実感）技術を利用した仮想試着のサービスが展開されている。仮想試着サービスでは、カメラで撮像された人物画像の位置姿勢を推定し、その推定した位置姿勢に応じた衣料品画像を生成して人物画像に重畳して表示することにより、その人物が衣料品を着用した際の見えを表現する。近年、ＣＧ（コンピュータグラフィックス）技術や撮像技術、計算機性能の向上によって、仮想試着サービスはスマートフォン等の携帯端末上のアプリケーションプログラム（以下、アプリケーションとする）でも提供されるようになっている。スマートフォンにおいて仮想試着サービスを用いる場合には、内蔵カメラで撮像された人物画像に衣料品画像を重畳して内蔵ディスプレイに表示するようなことが行われる。 Virtual try-on services that use AR (Augmented Reality) technology are being developed. In virtual try-on services, the position and orientation of a person's image captured by a camera are estimated, and a clothing image corresponding to the estimated position and orientation is generated and displayed superimposed on the person's image, thereby expressing how the person will look when wearing the clothing. In recent years, with improvements in CG (computer graphics) technology, imaging technology, and computer performance, virtual try-on services are also being provided through application programs (hereinafter referred to as applications) on mobile devices such as smartphones. When using a virtual try-on service on a smartphone, a clothing image is superimposed on a person's image captured by a built-in camera and displayed on the built-in display.

ここで、スマートフォンは、内蔵カメラとして、表示画面と同じ面側に配されたインカメラと、表示画面に対して背面側に配されたアウトカメラとを備えていることが多い。そしてスマートフォンにおいてインカメラで撮像した画像を表示する際には、鏡に映る被写体の見え方を再現するために、撮像された画像の左右を反転させて表示するのが一般的である。つまり実在の鏡に映る被写体の像は左右が反転した鏡像であるため、インカメラで撮像された画像は、左右が反転された画像として表示される。一方、アウトカメラにて撮像された画像を表示する際には、その撮像画像の左右を反転させることなくそのまま表示される。このため、スマートフォンにおいて仮想試着サービスを用いた場合、インカメラで撮像された被写体画像には左右反転させた衣料品画像が重畳され、一方、アウトカメラで撮像された被写体画像には左右反転していない衣料品画像が重畳される。 Here, a smartphone often has built-in cameras, an in-camera arranged on the same side as the display screen, and an out-camera arranged on the rear side of the display screen. When displaying an image captured by the in-camera on a smartphone, the captured image is generally displayed with left and right inverted to reproduce the way the subject appears in a mirror. In other words, since the image of the subject reflected in a real mirror is a mirror image with left and right inverted, the image captured by the in-camera is displayed as an inverted image. On the other hand, when displaying an image captured by the out-camera, the captured image is displayed as is without being inverted. For this reason, when using a virtual fitting service on a smartphone, a left-right inverted clothing image is superimposed on the subject image captured by the in-camera, while a non-left-right inverted clothing image is superimposed on the subject image captured by the out-camera.

特許文献１には、外側撮像装置で撮像された画像であるか、或いは内側撮像装置で撮像された画像であるかに応じて、オブジェクトの画像を自動的に左右反転して撮像画像に重畳する手法が開示されている。 Patent document 1 discloses a method for automatically flipping an image of an object horizontally and superimposing it on a captured image depending on whether the image was captured by an outer imaging device or an inner imaging device.

特開２０１４－１８６５０７号公報 JP 2014-186507 A

しかしながら、特許文献１は、実在の鏡に映っている被写体像を撮像することを考慮していないため、被写体に対してオブジェクトの左右関係が適切に整合していない重畳画像が表示される場合がある。 However, since Patent Document 1 does not take into consideration capturing an image of a subject reflected in a real mirror, there are cases where a superimposed image is displayed in which the left-right relationship of the object is not properly aligned with the subject.

そこで、本発明は、被写体に対して左右関係が整合したオブジェクト情報を重畳した画像の表示を可能にすることを目的とする。 The present invention aims to make it possible to display an image in which object information that is consistent with the left-right relationship of the subject is superimposed.

本発明の情報処理装置は、被写体の画像を取得する画像取得手段と、前記被写体の画像に重畳するオブジェクト情報を取得するオブジェクト取得手段と、前記オブジェクト情報を左右反転させずに前記被写体の画像に重畳した合成画像を生成する第１のモードと、左右反転させた前記オブジェクト情報を前記被写体の画像に重畳した合成画像を生成する第２のモードとを含む合成手段と、前記第１のモードと前記第２のモードのいずれかへの切替指示を受け付ける受付手段と、前記合成手段にて生成された合成画像を表示する表示制御手段と、を有することを特徴とする。 The information processing device of the present invention is characterized by having an image acquisition means for acquiring an image of a subject, an object acquisition means for acquiring object information to be superimposed on the image of the subject, a synthesis means including a first mode for generating a synthetic image in which the object information is superimposed on the image of the subject without left-right inversion, and a second mode for generating a synthetic image in which the object information is superimposed on the image of the subject with the object information that has been left-right inverted, a reception means for receiving an instruction to switch to either the first mode or the second mode, and a display control means for displaying the synthetic image generated by the synthesis means.

本発明によれば、被写体に対して左右関係が整合したオブジェクト情報を重畳した画像の表示が可能となる。 The present invention makes it possible to display an image in which object information that is aligned with the left-right relationship of the subject is superimposed.

情報処理装置のハードウェア外観図である。 FIG. 1 is a diagram illustrating the external hardware appearance of the information processing device.
情報処理装置内部のハードウェア構成図である。 FIG. 2 is a diagram illustrating the internal hardware configuration of the information processing device.
第１の実施形態に係る情報処理装置の機能構成を示す図である。 FIG. 3 is a diagram illustrating a functional configuration of the information processing device according to the first embodiment.
第１の実施形態に係る情報処理のフローチャートである。 FIG. 4 is a flowchart of information processing according to the first embodiment.
被写体画像の一例を示す図である。 FIG. 5 is a diagram showing an example of a subject image.
衣料品情報取得から合成画像生成までの処理のフローチャートである。 FIG. 6 is a flowchart of the process from acquiring clothing information to generating a composite image.
衣料品情報に基づくオブジェクトの一例を示す図である。 FIG. 7 is a diagram showing an example of an object based on clothing information.
合成されるオブジェクト画像の例を示す図である。 FIG. 8 is a diagram showing an example of an object image to be synthesized.
衣料品オブジェクト画像が重畳された合成画像の例を示す図である。 FIG. 9 is a diagram showing an example of a composite image on which a clothing object image is superimposed.
合成画像とモード通知アイコンの表示例を示す図である。 FIG. 10 is a diagram showing display examples of a composite image and a mode notification icon.
第２の実施形態に係る情報処理装置の機能構成を示す図である。 FIG. 11 is a diagram illustrating a functional configuration of an information processing device according to a second embodiment.
第２の実施形態に係る情報処理のフローチャートである。 FIG. 12 is a flowchart of information processing according to the second embodiment.

以下、本発明に係る実施形態を、図面を参照しながら説明する。以下の実施形態は本発明を限定するものではなく、また、本実施形態で説明されている特徴の組み合わせの全てが本発明の解決手段に必須のものとは限らない。実施形態の構成は、本発明が適用される装置の仕様や各種条件（使用条件、使用環境等）によって適宜修正又は変更され得る。また、以下の実施形態において、同一の構成には同じ参照符号を付し、重複する説明は省略する。 The following describes an embodiment of the present invention with reference to the drawings. The following embodiment does not limit the present invention, and not all of the combinations of features described in the present embodiment are necessarily essential to the solution of the present invention. The configuration of the embodiment may be modified or changed as appropriate depending on the specifications of the device to which the present invention is applied and various conditions (conditions of use, environment of use, etc.). In the following embodiments, the same components are given the same reference symbols, and duplicated descriptions are omitted.

＜第１の実施形態＞
本実施形態では、情報処理装置としてスマートフォン型の情報端末を例に挙げる。本実施形態の情報処理装置には、被写体を撮像した画像に対して所定のオブジェクト画像を重畳した合成画像を生成して表示するアプリケーションプロブラムが実装されている。本実施形態では、被写体の一例として人物を挙げ、また所定のオブジェクト画像の一例として衣料品のオブジェクト画像を挙げる。すなわち本実施形態では、被写体である人物を撮像した画像に対して衣料品オブジェクト画像を重畳した合成画像を生成して表示する仮想試着アプリケーションプロブラム（以下、試着アプリケーションとする）を例に挙げる。本実施形態の試着アプリケーションでは、被写体の人物を撮像した画像からその人物の位置姿勢を推定し、当該推定した位置姿勢に沿うような衣料品オブジェクト画像を生成して被写体の画像に重畳した合成画像を生成して表示する。これにより、その被写体の人物が仮想的に衣料品を着用した際の見えを提供可能とする。
スマートフォンのインカメラでは被写体の全身が画角内に収まらない場合や、衣料品を着用した状態が他人からどのように見えるのかを確認したい場合、ユーザは、実在の鏡に映っている被写体像をアウトカメラで撮像することがある。実在の鏡に映っている被写体像は左右が反転した鏡像であり、一方、アウトカメラで撮像された被写体の画像は左右を反転させていない画像となる。このため仮想試着サービスを用いた場合、鏡に映っている被写体像をアウトカメラで撮影した被写体の画像に衣料品画像が重畳された画像は、被写体に対して衣料品の左右関係が適切に整合していない画像になってしまう。このため、鏡に映った左右反転した鏡像を撮像した画像を用いた場合であっても、被写体に対して左右関係が整合した衣料品等のオブジェクト情報を重畳した画像の表示を可能にすることが望まれる。
そこで本実施形態の試着アプリケーションは、被写体の画像にオブジェクト画像を重畳して合成画像を生成する際の生成モードとして、第１の生成モードと第２の生成モードを有している。第１の生成モードは、被写体を撮像して得られた画像に対して、被写体の位置姿勢に応じたオブジェクト画像を重畳して合成画像を生成するモード（以下、通常モードとする）である。第２の生成モードは、被写体の位置姿勢に応じた所定のオブジェクト画像を左右反転させ、その左右反転したオブジェクト画像を被写体の画像に重畳して合成画像を生成するモード（以下、左右反転モードとする）である。なお本実施形態において、画像を左右反転させる処理とは、鏡に映った鏡像のような画像を生成する処理であり、これは公知の処理であるためその詳細な説明は省略する。
さらに本実施形態の試着アプリケーションは、人物の画像に衣料品オブジェクト画像を重畳した合成画像の表示とともに所定の通知アイコンをも表示することによって、その合成画像がどちらの生成モードに属しているかをユーザに通知可能とする。そして試着アプリケーションは、その通知アイコンに対してユーザが所定の操作（例えばタップ操作）を行った場合には生成モードの切り替えを行う。これにより、ユーザは、実在の鏡に映っている被写体の人物の鏡像を撮像しているような場合に、その人物の画像に対して、左右関係が整合した状態の衣料品オブジェクト画像が重畳された合成画像の表示を選択することができる。すなわち本実施形態では、鏡に映った鏡像ではない被写体を撮像した画像だけでなく、鏡に映った左右反転した被写体の鏡像を撮像した画像のいずれであっても、被写体に対して左右関係が整合したオブジェクト画像を重畳した合成画像の表示を実現可能とする。
<First Embodiment>
In this embodiment, a smartphone-type information terminal is taken as an example of the information processing device. The information processing device of this embodiment is equipped with an application program that generates and displays a composite image in which a predetermined object image is superimposed on an image of a subject. In this embodiment, a person is taken as an example of a subject, and a clothing object image is taken as an example of a predetermined object image. That is, in this embodiment, a virtual try-on application program (hereinafter, referred to as a try-on application) that generates and displays a composite image in which a clothing object image is superimposed on an image of a subject person is taken as an example. In the try-on application of this embodiment, the position and orientation of a subject person is estimated from an image of the subject person, a clothing object image is generated in accordance with the estimated position and orientation, and a composite image is generated and displayed by superimposing it on the image of the subject. This makes it possible to provide an appearance when the subject person is virtually wearing the clothing.
A user may point the rear camera at a real mirror and capture the subject's reflection, for example when the subject's whole body does not fit within the angle of view of the smartphone's front camera, or when the user wants to check how the subject looks to others while wearing the clothing. The reflection in a real mirror is a mirror image with left and right reversed, whereas the image captured by the rear camera is not reversed left to right. Consequently, when a virtual try-on service superimposes a clothing image on an image of the reflection captured by the rear camera, the left-right relationship of the clothing does not properly match that of the subject. It is therefore desirable to be able to display an image on which object information such as clothing is superimposed with a matching left-right relationship, even when the captured image is of a left-right reversed mirror image.
Therefore, the try-on application of this embodiment has two generation modes for generating a composite image by superimposing an object image on an image of a subject: a first generation mode and a second generation mode. The first generation mode (hereinafter, the normal mode) generates a composite image by superimposing an object image that corresponds to the position and orientation of the subject on the image obtained by capturing the subject. The second generation mode (hereinafter, the left-right inversion mode) inverts a predetermined object image that corresponds to the position and orientation of the subject left to right, and superimposes the inverted object image on the image of the subject to generate a composite image. In this embodiment, inverting an image left to right means generating an image like the one reflected in a mirror; because this is a well-known process, a detailed description is omitted.
Furthermore, the try-on application of this embodiment displays a predetermined notification icon together with the composite image in which a clothing object image is superimposed on the image of a person, and thereby notifies the user of the generation mode in which the composite image was generated. When the user performs a predetermined operation (for example, a tap) on the notification icon, the try-on application switches the generation mode. As a result, when the user is capturing the mirror image of a person reflected in a real mirror, the user can choose to display a composite image in which the clothing object image is superimposed on the image of the person with a matching left-right relationship. That is, this embodiment can display a composite image in which an object image with a matching left-right relationship is superimposed on the subject, whether the captured image is of the subject directly or of the subject's left-right reversed mirror image.
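The two generation modes and the tap-to-switch behavior can be sketched roughly as follows. This is only an illustration: the names `GenerationMode`, `TryOnState`, and `on_icon_tapped` are hypothetical and do not appear in the specification, and starting in the normal mode is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class GenerationMode(Enum):
    """The two generation modes of the try-on application."""
    NORMAL = "normal"
    LEFT_RIGHT_INVERSION = "left-right inversion"


@dataclass
class TryOnState:
    # Assumption for this sketch: the application starts in the normal mode.
    mode: GenerationMode = GenerationMode.NORMAL

    def on_icon_tapped(self) -> GenerationMode:
        """Switch to the other generation mode and return the new mode."""
        if self.mode is GenerationMode.NORMAL:
            self.mode = GenerationMode.LEFT_RIGHT_INVERSION
        else:
            self.mode = GenerationMode.NORMAL
        return self.mode
```

In this sketch each tap on the notification icon simply alternates between the two modes, so a user photographing a mirror would tap once to obtain the left-right inversion mode and tap again to return to the normal mode.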

図１（ａ）と図１（ｂ）はスマートフォンに本実施形態の情報処理装置１を適用した場合の外観の一例を示した図であり、情報処理装置１は、ディスプレイ２０４、背面カメラ２０８、前面カメラ２０９等を備えている。図１（ａ）は情報処理装置１においてディスプレイ２が配されている前面側を示し、一方、図１（ｂ）は情報処理装置１の背面側を示している。ディスプレイ２０４は、液晶や有機ＥＬ等の表示パネルにいわゆるタッチパネル機能が付加された表示画面を備えた表示装置である。前面カメラ２０９は、ディスプレイ２０４が配されている面と同じ面（前面）に配置されている前面撮像装置（いわゆるインカメラ）である。背面カメラ２０８は、ディスプレイ２０４が配されている前面側に対して、背面側に配置されている背面撮像装置（いわゆるアウトカメラ）である。背面カメラ２０８と前面カメラ２０９は、撮像素子として例えばＣＭＯＳセンサを有し、例えば１０８０×１９２０ピクセルの２次元８ビットＲＧＢ画像を取得する。なお、図１（ｂ）に示されている座標軸１０は、情報処理装置１において設定されている固定座標軸であり、ｘ，ｙ，ｚの３軸方向を示している。 FIGS. 1(a) and 1(b) show an example of the external appearance of the information processing device 1 of this embodiment as applied to a smartphone. The information processing device 1 includes a display 204, a rear camera 208, a front camera 209, and so on. FIG. 1(a) shows the front side of the information processing device 1, on which the display 204 is arranged, and FIG. 1(b) shows its rear side. The display 204 is a display device whose screen is a display panel, such as a liquid crystal or organic EL panel, with a so-called touch panel function added. The front camera 209 is a front imaging device (a so-called in-camera) arranged on the same face (the front) as the display 204. The rear camera 208 is a rear imaging device (a so-called out-camera) arranged on the face opposite the front face on which the display 204 is arranged. The rear camera 208 and the front camera 209 each have, for example, a CMOS sensor as an imaging element and acquire, for example, a two-dimensional 8-bit RGB image of 1080 x 1920 pixels. The coordinate axes 10 shown in FIG. 1(b) are fixed coordinate axes set in the information processing device 1 and indicate the three axial directions x, y, and z.

図２は、情報処理装置１の内部ハードウェア構成例を示したブロック図である。
情報処理装置１は、前述したディスプレイ２０４、背面カメラ２０８、前面カメラ２０９に加え、ＣＰＵ（中央演算処理装置）２０１、ＲＯＭ（リードオンリーメモリー）２０２、ＲＡＭ（ランダムアクセスメモリ）２０３等を備える。さらに情報処理装置１は、大容量メモリ２０５、加速度センサ２０６、方位センサ２０７、ＮＩＣ（ネットワークインターフェースカード）２１０、マイクロフォン２１１等をも備えている。なお本実施形態ではスマートフォンへの適用を想定しているため、情報処理装置１は、図２に示した各構成に加えて、一般的なスマートフォンが有している各構成（不図示）をも備えている。 FIG. 2 is a block diagram showing an example of the internal hardware configuration of the information processing device 1.
In addition to the display 204, rear camera 208, and front camera 209 described above, the information processing device 1 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, a RAM (Random Access Memory) 203, and so on. It also includes a large-capacity memory 205, an acceleration sensor 206, an orientation sensor 207, a NIC (Network Interface Card) 210, a microphone 211, and so on. Because this embodiment assumes application to a smartphone, the information processing device 1 also includes, in addition to the components shown in FIG. 2, the components (not shown) that a typical smartphone has.

ＣＰＵ２０１は、ＲＡＭ２０３をワークメモリとして、ＲＯＭ２０２や大容量メモリ２０５などに格納されたＯＳ（オペレーティングシステム）と各種プログラムを実行する。ＣＰＵ２０１は、シリアルバス２１２を介して各構成を制御する。またＣＰＵ２０１は、大容量メモリ２０５やＲＯＭ２０２、インターネット上の記録メディアを、データ格納領域として使用する。またＣＰＵ２０１は、プログラムによって提供されるＵＩ（ユーザーインターフェース）をディスプレイ２０４に表示する表示制御を行い、ディスプレイ２０４のタッチパネルを介してユーザからの入力を受け付ける。 The CPU 201 uses the RAM 203 as a working memory and executes an OS (operating system) and various programs stored in the ROM 202, the large-capacity memory 205, etc. The CPU 201 controls each component via a serial bus 212. The CPU 201 also uses the large-capacity memory 205, the ROM 202, and recording media on the Internet as data storage areas. The CPU 201 also performs display control to display a UI (user interface) provided by a program on the display 204, and accepts input from the user via the touch panel of the display 204.

ＮＩＣ２１０は、インターネットに接続し、外部装置との間で情報の入力および出力を行う。
加速度センサ２０６は、例えば静電容量方式等により、図１（ｂ）に示した情報処理装置１の座標軸１０のｘ，ｙ，ｚの３軸方向の加速度を検出する。
方位センサ２０７は、地磁気を計測するセンサであり、図１（ｂ）に示した３軸のｘ，ｙ，ｚ座標系において北の方向を示す３次元ベクトルを出力する。
マイクロフォン２１１は、音声取得装置であり、例えばムービングコイル方式等により、音声を電気信号として出力する。 The NIC 210 connects to the Internet and inputs and outputs information to and from external devices.
The acceleration sensor 206 detects, for example by a capacitance method, acceleration along the three axes x, y, and z of the coordinate axes 10 of the information processing device 1 shown in FIG. 1(b).
The orientation sensor 207 is a sensor that measures geomagnetism, and outputs a three-dimensional vector that indicates the north direction in the three-axis x, y, z coordinate system shown in FIG. 1(b).
The microphone 211 is a sound capture device, and outputs sound as an electrical signal, for example, by a moving coil method.

図３は、本実施形態の情報処理装置１の機能ブロック構成を示した図である。
ＯＳ３は、オペレーティングシステムであり、入出力の制御及び各種アプリケーションの起動や切り替えを行う命令群である。デバイスドライバ３０７は、ＯＳ３に含まれる命令群であり、情報処理装置１のディスプレイ２０４、背面カメラ２０８、前面カメラ２０９、および各種センサを制御する。各種アプリケーションは、ＯＳ３に所定の命令を送ることによりこれらに機器を制御することができる。 FIG. 3 is a diagram showing a functional block configuration of the information processing device 1 of the present embodiment.
The OS 3 is an operating system: a set of commands that controls input and output and launches and switches between applications. The device driver 307 is a set of commands included in the OS 3 and controls the display 204, the rear camera 208, the front camera 209, and the various sensors of the information processing device 1. Applications can control these devices by sending predetermined commands to the OS 3.

試着アプリケーション２は、本実施形態の情報処理装置１において、被写体を撮像した画像に衣料品オブジェクト画像を重畳した合成画像を生成して表示する仮想試着アプリケーションプロブラムである。試着アプリケーション２は、大容量メモリ２０５やインターネット上の記録メディアからオブジェクト情報を取得し、そのオブジェクト情報を基に生成したオブジェクト画像を、被写体の画像に重畳した合成画像を生成してディスプレイ２０４に表示する命令群である。本実施形態では、衣料品のオブジェクト画像を用いる例を挙げているため、以下、オブジェクト情報は衣料品情報とする。すなわち、試着アプリケーション２は、衣料品情報を取得し、その衣料品情報を基に生成した衣料品オブジェクト画像を、被写体の画像に重畳した合成画像を生成してディスプレイ２０４に表示する。図２に示されたＣＰＵ２０１は、試着アプリケーション２の実行により、画像取得部３０１、操作受付部３０２、オブジェクト取得部３０３、姿勢推定部３０４、合成部３０５、表示部３０６の各機能部を実現する。なお、試着アプリケーション２による各機能部の幾つか若しくは全ては回路等により実現されてもよい。 The try-on application 2 is a virtual try-on application program that, in the information processing device 1 of this embodiment, generates and displays a composite image in which a clothing object image is superimposed on a captured image of a subject. The try-on application 2 is a set of commands that acquires object information from the large-capacity memory 205 or a recording medium on the Internet, generates an object image based on that object information, generates a composite image by superimposing the object image on the image of the subject, and displays the composite image on the display 204. Because this embodiment uses a clothing object image as its example, the object information is hereinafter referred to as clothing information. That is, the try-on application 2 acquires clothing information, generates a composite image in which a clothing object image generated from the clothing information is superimposed on the image of the subject, and displays the composite image on the display 204. By executing the try-on application 2, the CPU 201 shown in FIG. 2 realizes the following functional units: the image acquisition unit 301, the operation reception unit 302, the object acquisition unit 303, the posture estimation unit 304, the synthesis unit 305, and the display unit 306. Some or all of these functional units may instead be realized by circuits or the like.

画像取得部３０１は、前述した前面カメラ２０９と背面カメラ２０８のうち、例えば背面カメラ２０８が被写体の人物等を撮像したＲＧＢ画像（以下、被写体画像Ｉ１とする）を取得する。
オブジェクト取得部３０３は、例えばインターネット上の記録メディアなどから、衣料品情報を取得する。オブジェクト取得部３０３は、例えば、複数の衣料品情報の中からユーザが操作受付部３０２を通じて選択した衣料品情報を取得する。衣料品情報は、被写体画像Ｉ１内の被写体（人物の画像）に対して重畳される衣料品オブジェクト画像の基になる情報であり、衣料品の３次元形状を表す３次元オブジェクトモデル情報とその衣料品が装着される部位などを示す情報を含む。以下、３次元オブジェクトモデル情報を３次元モデル情報とする。 The image acquisition unit 301 acquires an RGB image of a subject such as a person (hereinafter referred to as a subject image I1) captured by, for example, the rear camera 208 of the front camera 209 and rear camera 208 described above.
The object acquisition unit 303 acquires clothing information, for example, from a recording medium on the Internet. The object acquisition unit 303 acquires clothing information selected by the user from a plurality of pieces of clothing information through the operation reception unit 302. The clothing information is information that serves as the basis of a clothing object image that is superimposed on the subject (image of a person) in the subject image I1, and includes three-dimensional object model information that represents the three-dimensional shape of the clothing item and information indicating the part of the body on which the clothing item is worn. Hereinafter, the three-dimensional object model information will be referred to as three-dimensional model information.

姿勢推定部３０４は、画像取得部３０１で取得された被写体画像Ｉ１から、その被写体である人物の位置姿勢を推定する。ここでの位置姿勢には、被写体である人物の向きや姿勢のみならず、腕や脚等の人体を構成する各部位の位置姿勢も含まれる。 The posture estimation unit 304 estimates the position and posture of the subject person from the subject image I1 acquired by the image acquisition unit 301. The position and posture here includes not only the orientation and posture of the subject person, but also the positions and postures of each part of the human body, such as the arms and legs.

操作受付部３０２は、ディスプレイ２０４に表示されたアイコン領域に対するタップ操作などのユーザ操作を受け付ける。詳細は後述するが、本実施形態の場合、操作受付部３０２は、試着アプリケーション２における生成モードを左右反転モードあるいは通常モードのいずれにするかのモード選択指示の操作などを受け付ける。つまり試着アプリケーション２では、操作受付部３０２を介したユーザからのモード選択指示に応じて生成モードの切り替えが行われる。 The operation reception unit 302 receives user operations such as a tap on an icon area displayed on the display 204. As described in detail later, in this embodiment the operation reception unit 302 receives, among other operations, a mode selection instruction that sets the generation mode of the try-on application 2 to either the left-right inversion mode or the normal mode. In other words, the try-on application 2 switches the generation mode in response to a mode selection instruction from the user via the operation reception unit 302.

合成部３０５は、姿勢推定部３０４で推定された被写体の位置姿勢を基に、前述した衣料品情報の３次元モデル情報に基づく３次元モデルを回転、変形、拡大・縮小等し、被写体が衣料品を着用した際の衣料品の見えを表す衣料品オブジェクト画像Ｉ２を生成する。また合成部３０５は、操作受付部３０２を介してユーザから受け付けたモード選択指示によって左右反転モードまたは通常モードのいずれが選択されたかに応じて、衣料品オブジェクト画像Ｉ２を左右反転させるか、あるいは左右反転させないかを決定する。前述したように、左右反転モードは左右反転させた衣料品を被写体が着用した際の見えを表す生成モードであり、通常モードは左右反転させていない衣料品を被写体が着用した際の見えを表す生成モードである。したがって合成部３０５は、いずれか選択された方の生成モードに応じた衣料品オブジェクト画像Ｉ２を生成し、画像取得部３０１で取得された被写体画像Ｉ１内の人物の画像に当該衣料品オブジェクト画像Ｉ２を重畳した合成画像Ｉ３を生成する。 Based on the position and posture of the subject estimated by the posture estimation unit 304, the synthesis unit 305 rotates, deforms, enlarges, or reduces the three-dimensional model based on the three-dimensional model information of the clothing information described above, and generates a clothing object image I2 that represents how the clothing appears when the subject wears it. The synthesis unit 305 also determines whether or not to invert the clothing object image I2 left to right, depending on whether the mode selection instruction received from the user via the operation reception unit 302 selected the left-right inversion mode or the normal mode. As described above, the left-right inversion mode is a generation mode that represents the appearance when the subject wears clothing that has been inverted left to right, and the normal mode is a generation mode that represents the appearance when the subject wears clothing that has not been inverted. The synthesis unit 305 therefore generates the clothing object image I2 according to the selected generation mode, and generates a composite image I3 by superimposing that clothing object image I2 on the image of the person in the subject image I1 acquired by the image acquisition unit 301.
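The mode-dependent superimposition can be illustrated with a small sketch. It assumes, purely for illustration, that the clothing object image I2 has already been rendered at the same size as the subject image I1 and that transparent pixels are marked with `None`; the function names `flip_left_right` and `compose` are hypothetical, and the rendering and alignment performed by the actual synthesis unit are not modeled here.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]            # 8-bit R, G, B values
Image = List[List[Pixel]]               # rows of pixels
Overlay = List[List[Optional[Pixel]]]   # None marks a transparent pixel


def flip_left_right(rows):
    """Return a copy in which the pixel order of every row is reversed."""
    return [list(reversed(row)) for row in rows]


def compose(subject: Image, clothing: Overlay, inverted: bool) -> Image:
    """Superimpose the clothing image on the subject image.

    When `inverted` is True (left-right inversion mode) the clothing image
    is flipped before it is superimposed. The subject image itself is left
    unchanged in both modes.
    """
    overlay = flip_left_right(clothing) if inverted else clothing
    return [
        [s if o is None else o for s, o in zip(s_row, o_row)]
        for s_row, o_row in zip(subject, overlay)
    ]
```

For a one-row image, `compose([[(0, 0, 0), (1, 1, 1)]], [[(9, 9, 9), None]], inverted=True)` places the clothing pixel on the right instead of the left, which is the only difference between the two modes in this sketch.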

表示部３０６は、合成部３０５によって生成された合成画像Ｉ３と、試着アプリケーション２における生成モードが左右反転モードと通常モードのどちらであるかを示す所定の通知アイコン等を、ディスプレイ２０４に表示する。 The display unit 306 displays on the display 204 the composite image I3 generated by the synthesis unit 305, together with a predetermined notification icon or the like that indicates whether the generation mode of the try-on application 2 is the left-right inversion mode or the normal mode.

図４は、試着アプリケーション２の実行による本実施形態の情報処理の流れを示したフローチャートである。以降の各フローチャートの説明では、各処理ステップの工程を符号の前に付した「Ｓ」により表す。
まずＳ４０１の処理として、画像取得部３０１は、ユーザからの取得指示に基づいて被写体画像Ｉ１を取得する。被写体画像Ｉ１は、前述したように背面カメラ２０８によって撮像された画像であるとする。なお背面カメラ２０８では、被写体をそのまま撮像する場合の他、実在の鏡に映った被写体の鏡像を撮像する場合も想定される。背面カメラ２０８で鏡に映った被写体の鏡像を撮像する想定例としては、前面カメラ２０９では画角内に収まらない場合や、衣料品を着用した状態の他人からの見えを確認したい場合などが考えられる。なお本実施形態において、背面カメラ２０８から取得される画像は、１０８０×１９２０ピクセルの２次元画像とし、各ピクセルがＲ，Ｇ，Ｂの色信号をそれぞれ８ビットの深度を保持しているとする。 FIG. 4 is a flowchart showing the flow of information processing in this embodiment when the try-on application 2 is executed. In the following descriptions of the flowcharts, each processing step is denoted by a reference number prefixed with "S".
First, in the process of S401, the image acquisition unit 301 acquires a subject image I1 based on an acquisition instruction from a user. The subject image I1 is assumed to be an image captured by the rear camera 208 as described above. In addition to the case where the rear camera 208 captures the subject as it is, it is also assumed that the rear camera 208 captures the mirror image of the subject reflected in a real mirror. Assumed examples of capturing the mirror image of the subject reflected in a mirror by the rear camera 208 include a case where the subject does not fit within the angle of view of the front camera 209, or a case where the user wants to check how the subject looks to others while wearing the clothing. In this embodiment, the image acquired by the rear camera 208 is a two-dimensional image of 1080 x 1920 pixels, and each pixel holds R, G, and B color signals with a depth of 8 bits.

また、被写体画像Ｉ１の取得方法は一例であり、被写体画像Ｉ１は例えば連続的に撮像された動画像のうちの１フレームでもよいし、予め撮像して大容量メモリ２０５やインターネット上の記録メディアに保存されている画像や動画像のうちの１フレームでもよい。図５は、画像取得部３０１にて取得された被写体画像Ｉ１の例に示した図である。図５に例示しているように、被写体画像Ｉ１中には人物の被写体５０１が写っている。もちろん画像取得部３０１は、前面カメラ２０９によって撮像された画像や動画像のうちの１フレームを取得することも可能である。 The method of acquiring subject image I1 is just one example, and subject image I1 may be, for example, one frame of a continuously captured video, or one frame of an image or video that has been captured in advance and stored in large-capacity memory 205 or a recording medium on the Internet. FIG. 5 is a diagram showing an example of subject image I1 acquired by image acquisition unit 301. As shown in FIG. 5, subject image I1 contains a human subject 501. Of course, image acquisition unit 301 can also acquire one frame of an image or video captured by front camera 209.

次にＳ４０２において、オブジェクト取得部３０３は衣料品情報を取得し、姿勢推定部３０４は画像取得部３０１にて取得された被写体画像Ｉ１から被写体５０１の位置姿勢を推定する。またＳ４０２において、合成部３０５は、生成モードが通常モードまたは左右反転モードのいずれであるかを判定し、その判定結果を基に、衣料品情報から被写体の位置姿勢に沿うような衣料品オブジェクト画像Ｉ２を生成する。そして合成部３０５は、その衣料品オブジェクト画像Ｉ２を、Ｓ４０１で取得された被写体画像Ｉ１内の被写体（人物）に重畳した合成画像Ｉ３を生成する。 Next, in S402, the object acquisition unit 303 acquires clothing information, and the posture estimation unit 304 estimates the position and posture of the subject 501 from the subject image I1 acquired by the image acquisition unit 301. Also in S402, the synthesis unit 305 determines whether the generation mode is the normal mode or the left-right inversion mode and, based on the result, generates from the clothing information a clothing object image I2 that conforms to the position and posture of the subject. The synthesis unit 305 then generates a composite image I3 by superimposing the clothing object image I2 on the subject (person) in the subject image I1 acquired in S401.

図６は、Ｓ４０２における衣料品情報の取得から合成画像の生成までの情報処理の詳細な流れを示したフローチャートである。
Ｓ６０１において、オブジェクト取得部３０３は、衣料品オブジェクト画像Ｉ２の基になる衣料品情報を取得する。例えば、オブジェクト取得部３０３は、大容量メモリ２０５やインターネット上の記録メディアに用意されている複数の衣料品情報の中から、例えばユーザが操作受付部３０２を通じて選択した衣料品情報を取得する。 FIG. 6 is a flowchart showing a detailed flow of information processing from obtaining clothing information in S402 to generating a composite image.
In S601, the object acquisition unit 303 acquires the clothing information on which the clothing object image I2 is based. For example, the object acquisition unit 303 acquires the clothing information that the user selected through the operation reception unit 302 from among a plurality of pieces of clothing information prepared in the large-capacity memory 205 or on a recording medium on the Internet.

図７は、衣料品情報の一例の説明に用いる図である。前述したように、衣料品情報は、衣料品の３次元形状を表す３次元モデル情報と、その衣料品が装着される人体部位を示す情報とを含む。衣料品の３次元形状を表す３次元モデル情報により、衣料品オブジェクト画像７０１が生成される。図７の例では、衣料品が装着される人体部位として、胴体、右袖、左袖の各部位を示す情報が含まれている。 Figure 7 is a diagram used to explain an example of clothing information. As described above, clothing information includes three-dimensional model information that represents the three-dimensional shape of the clothing item, and information indicating the body part on which the clothing item is to be worn. A clothing object image 701 is generated from the three-dimensional model information that represents the three-dimensional shape of the clothing item. In the example of Figure 7, information indicating the torso, right sleeve, and left sleeve as the body parts on which the clothing item is to be worn is included.

なお、オブジェクト取得部３０３が取得する衣料品情報は、図５に例示した被写体５０１の人物の位置姿勢に応じて見えを変更することができる情報であればよく、必ずしも３次元形状を表す３次元モデル情報を含んでいなくてもよい。例えば、３次元モデル情報に代えて、例えば衣料品を複数の角度から撮像した２次元画像の配列や、衣料品を正面から撮像した２次元画像１枚の情報が含まれていてもよい。すなわちオブジェクト取得部３０３は、それら２次元画像のオブジェクト画像を取得してもよい。このようにオブジェクト取得部３０３においてオブジェクト画像が取得される場合、合成部３０５ではオブジェクト情報の３次元モデル情報からオブジェクト画像を生成するような処理を行わなくてもよい。 The clothing information acquired by the object acquisition unit 303 may be information that can change its appearance depending on the position and posture of the person of the subject 501 illustrated in FIG. 5, and does not necessarily have to include 3D model information that represents the 3D shape. For example, instead of the 3D model information, it may include information such as an array of 2D images of the clothing captured from multiple angles, or one 2D image of the clothing captured from the front. In other words, the object acquisition unit 303 may acquire object images of these 2D images. When an object image is acquired in this way by the object acquisition unit 303, the synthesis unit 305 does not need to perform processing such as generating an object image from the 3D model information of the object information.
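One way to picture clothing information that may carry either a three-dimensional model or two-dimensional images is the sketch below. The class name `ClothingInfo`, its field names, and the use of raw `bytes` for model and image data are all assumptions made for illustration; the specification does not define a concrete data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClothingInfo:
    """Hypothetical container for one piece of clothing information."""
    body_parts: List[str]                  # e.g. torso, right sleeve, left sleeve
    model_3d: Optional[bytes] = None       # 3D model information, if provided
    images_2d: List[bytes] = field(default_factory=list)  # 2D images, if provided

    def __post_init__(self) -> None:
        # At least one representation is needed to produce an object image.
        if self.model_3d is None and not self.images_2d:
            raise ValueError("clothing information needs a 3D model or 2D images")

    def needs_rendering(self) -> bool:
        """True when the object image must be generated from the 3D model."""
        return not self.images_2d
```

In this sketch, clothing information that already carries two-dimensional object images reports `needs_rendering() == False`, reflecting the case in which the synthesis unit does not have to generate an object image from a three-dimensional model.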

次にＳ６０２において、合成部３０５は、生成モードが通常モードか否かを判定する。そして、生成モードが通常モードではないと判定された場合、つまり左右反転モードであると判定された場合、試着アプリケーション２の処理は、Ｓ６０３に進む。一方、Ｓ６０２において通常モードであると判定された場合、試着アプリケーション２の処理は、Ｓ６０４の処理に進む。なお、生成モードは、試着アプリケーション２の開始時点では例えば通常モードに設定されているとするが、もちろん試着アプリケーション開始時点で左右反転モードに設定されてもよい。また本実施形態の場合、生成モードは、後述するＳ４０４において操作受付部３０２に対するユーザからの指示に応じて切り替え可能となされている。 Next, in S602, the synthesis unit 305 determines whether the generation mode is normal mode or not. If it is determined that the generation mode is not normal mode, that is, if it is determined that the generation mode is horizontally reversed mode, the processing of the try-on application 2 proceeds to S603. On the other hand, if it is determined in S602 that the generation mode is normal mode, the processing of the try-on application 2 proceeds to S604. Note that the generation mode is set to, for example, normal mode when the try-on application 2 starts, but it may of course be set to horizontally reversed mode when the try-on application starts. In this embodiment, the generation mode can be switched in response to a user instruction to the operation reception unit 302 in S404, which will be described later.

Ｓ６０３の処理に進むと、姿勢推定部３０４は、画像取得部３０１にて取得された被写体画像Ｉ１から被写体５０１の位置姿勢を推定する。Ｓ６０３において、姿勢推定部３０４は、画像取得部３０１で取得された被写体画像Ｉ１を左右反転させる処理を行い、その左右反転処理後の被写体画像から被写体５０１の位置姿勢を推定する。なお図１（ａ）に示したような状態で情報処理装置１が使用される例を挙げた場合、姿勢推定部３０４は、被写体画像Ｉ１を、ディスプレイ２０４の短辺方向の画素の並び順を左右反転させた画像とする。また例えば、加速度センサ２０６によって情報処理装置１が横向きの状態であることを検知した場合、姿勢推定部３０４は、被写体画像Ｉ１を、ディスプレイ２０４の長辺方向の画素の並び順を左右反転させた画像とする。そして、姿勢推定部３０４は、左右反転処理後の被写体画像Ｉ１から、公知の深層学習を使った手法等によって被写体５０１の位置姿勢を推定する。深層学習を使った手法等による位置姿勢推定処理の説明は省略する。このＳ６０３の処理後、試着アプリケーション２はＳ６０５の処理に進む。 When the process proceeds to S603, the posture estimation unit 304 estimates the position and posture of the subject 501 from the subject image I1 acquired by the image acquisition unit 301. Specifically, in S603 the posture estimation unit 304 horizontally flips the subject image I1 acquired by the image acquisition unit 301 and estimates the position and posture of the subject 501 from the flipped image. When the information processing device 1 is used in the state shown in FIG. 1(a), the posture estimation unit 304 flips the subject image I1 by reversing the order of the pixels along the short side of the display 204. When the acceleration sensor 206 detects that the information processing device 1 is held sideways, the posture estimation unit 304 instead reverses the order of the pixels along the long side of the display 204. The posture estimation unit 304 then estimates the position and posture of the subject 501 from the flipped subject image I1 using a known technique, such as one based on deep learning; a description of that estimation process is omitted. After S603, the processing of the try-on application 2 proceeds to S605.
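
The orientation-dependent flip described for S603 can be illustrated with a minimal Python sketch. It assumes the frame is stored as a list of pixel rows in which each row runs along the short side of the display; this storage layout, the function name `flip_left_right`, and the `landscape` flag are hypothetical and are not specified in the description.

```python
def flip_left_right(frame, landscape=False):
    """Mirror a frame left-to-right as seen by the user.

    frame: list of rows; each row runs along the display's short side
           (an assumed storage layout, not one given in the description).
    landscape: True when the device is held sideways, for example as
               reported by an acceleration sensor.
    """
    if landscape:
        # Held sideways, the user's horizontal axis is the display's long
        # side, so the order of the rows is reversed.
        return [list(row) for row in reversed(frame)]
    # Held upright, the user's horizontal axis is the display's short
    # side, so the pixels inside each row are reversed.
    return [list(reversed(row)) for row in frame]


frame = [[1, 2, 3],
         [4, 5, 6]]
portrait_flip = flip_left_right(frame)
landscape_flip = flip_left_right(frame, landscape=True)
```

Applying the same flip twice returns the original frame, which is why the same operation can later be reused to flip the clothing object image back.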

一方、Ｓ６０４に進んだ場合、姿勢推定部３０４は、画像取得部３０１にて取得された被写体画像Ｉ１から、その画像内の被写体５０１の位置姿勢を推定する。すなわちＳ６０４において、姿勢推定部３０４は、Ｓ６０３のような被写体画像Ｉ１の左右反転処理は行わずに、画像取得部３０１にて取得された被写体画像Ｉ１内の被写体５０１の位置姿勢を推定する。Ｓ６０４での位置姿勢推定処理は、被写体画像Ｉ１の左右反転処理が行われていないこと以外、Ｓ６０３での位置姿勢推定処理と同様であるため、その説明は省略する。このＳ６０４の処理後、試着アプリケーション２の処理はＳ６０５に進む。 On the other hand, when the process proceeds to S604, the posture estimation unit 304 estimates the position and posture of the subject 501 in the subject image I1 acquired by the image acquisition unit 301. That is, in S604, the posture estimation unit 304 estimates the position and posture of the subject 501 in the subject image I1 without flipping the subject image I1 horizontally as is done in S603. Apart from the absence of this horizontal flip, the position and posture estimation in S604 is the same as that in S603, so its description is omitted. After S604, the processing of the try-on application 2 proceeds to S605.

Ｓ６０５に進むと、合成部３０５は、Ｓ６０３又はＳ６０４で推定された被写体の位置姿勢に沿うように、衣料品情報の３次元モデルを回転、変形、拡大・縮小等して、衣料品の見えを表す衣料品オブジェクト画像Ｉ２を生成する。 When the process proceeds to S605, the synthesis unit 305 generates a clothing object image I2 representing the appearance of the clothing by rotating, transforming, enlarging or reducing the 3D model of the clothing information so as to conform to the position and orientation of the subject estimated in S603 or S604.

図８（ａ）と図８（ｂ）は、Ｓ６０５において合成部３０５により生成される衣料品オブジェクト画像Ｉ２を示した図である。図８（ａ）は、生成モードが通常モードである場合の被写体５０１の位置姿勢に沿うように生成された衣料品オブジェクト画像７０１を示した図である。図８（ｂ）は、生成モードが左右反転モードである場合の被写体の位置姿勢、すなわち例えば左右反転させた被写体画像Ｉ１における被写体の位置姿勢に沿うように生成された衣料品オブジェクト画像７０１を示した図である。なお、衣料品情報として前述したような２次元画像を取得した場合、合成部３０５は、被写体５０１の位置姿勢に沿うように２次元変形を行ってもよいし、被写体５０１の位置姿勢に沿うような２次元画像を２次元画像の配列から選択してもよい。 FIGS. 8(a) and 8(b) are diagrams showing the clothing object image I2 generated by the synthesis unit 305 in S605. FIG. 8(a) shows the clothing object image 701 generated so as to conform to the position and orientation of the subject 501 when the generation mode is the normal mode. FIG. 8(b) shows the clothing object image 701 generated so as to conform to the position and orientation of the subject when the generation mode is the horizontally inverted mode, that is, for example, the position and orientation of the subject in the horizontally inverted subject image I1. Note that when a two-dimensional image such as that described above is acquired as the clothing information, the synthesis unit 305 may apply a two-dimensional deformation so as to conform to the position and orientation of the subject 501, or may select, from the array of two-dimensional images, the image that conforms to the position and orientation of the subject 501.
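
Selecting a two-dimensional image from an array of images captured at multiple angles could, for example, be done by picking the view whose capture angle is nearest to the subject's estimated orientation. The following Python sketch assumes that the orientation is reduced to a single yaw angle in degrees and that each view is stored with its capture angle; both assumptions, and the names `angular_distance` and `select_view`, are hypothetical and do not come from the description.

```python
def angular_distance(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def select_view(views, subject_yaw_deg):
    """Return the (angle, image) pair captured closest to the subject's yaw."""
    return min(views, key=lambda view: angular_distance(view[0], subject_yaw_deg))


# Hypothetical array of clothing images captured from four angles.
views = [(0.0, "front.png"), (90.0, "right.png"),
         (180.0, "back.png"), (270.0, "left.png")]
chosen = select_view(views, subject_yaw_deg=350.0)
```

The wrap-around handling in `angular_distance` matters here: a yaw of 350 degrees is only 10 degrees away from the frontal view, so the frontal image is selected.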

さらに次のＳ６０６において、合成部３０５は、生成モードが通常モードか否かに応じて処理を分岐させる。合成部３０５は、例えば通常モードでない場合つまり左右反転モードである場合にはＳ６０７に処理を進め、一方、通常モードである場合にはＳ６０８に処理を進める。 Furthermore, in the next step S606, the composition unit 305 branches the process depending on whether the generation mode is the normal mode or not. For example, if the generation mode is not the normal mode, that is, if the generation mode is the horizontally inverted mode, the composition unit 305 advances the process to S607, whereas if the generation mode is the normal mode, the composition unit 305 advances the process to S608.

左右反転モードであるためＳ６０７に進むと、合成部３０５は、Ｓ６０５で生成した衣料品オブジェクト画像Ｉ２を左右反転させる。Ｓ６０７における左右反転処理は、Ｓ６０３において姿勢推定部３０４で行われる左右反転処理と同様な処理である。そして、Ｓ６０７の後、合成部３０５はＳ６０８に処理を進める。 Because the left-right flip mode is selected, the process proceeds to S607, where the synthesis unit 305 left-right flips the clothing object image I2 generated in S605. The left-right flip process in S607 is the same as the left-right flip process performed by the orientation estimation unit 304 in S603. After S607, the synthesis unit 305 proceeds to S608.

Ｓ６０８に進むと、合成部３０５は、被写体画像Ｉ１内の被写体（人物）の画像に対して衣料品オブジェクト画像Ｉ２を重畳した合成画像Ｉ３を生成する。すなわちＳ６０６で通常モードであると判定されてＳ６０８に進んだ場合、合成部３０５では、画像取得部３０１で取得された被写体画像Ｉ１内の被写体の画像に対し、Ｓ６０５で生成された衣料品オブジェクト画像Ｉ２を重畳した合成画像Ｉ３が生成される。一方、Ｓ６０６で左右反転モードと判定され、Ｓ６０７で衣料品オブジェクト画像Ｉ２が左右反転されてＳ６０８に進んだ場合、合成部３０５では、被写体画像Ｉ１内の被写体の画像に左右反転した衣料品オブジェクト画像Ｉ２を重畳した合成画像Ｉ３が生成される。 When the process proceeds to S608, the synthesis unit 305 generates a composite image I3 by superimposing the clothing object image I2 on the image of the subject (person) in the subject image I1. That is, when the normal mode is determined in S606 and the process proceeds to S608, the synthesis unit 305 generates the composite image I3 by superimposing the clothing object image I2 generated in S605 on the image of the subject in the subject image I1 acquired by the image acquisition unit 301. On the other hand, when the horizontally inverted mode is determined in S606, the clothing object image I2 is horizontally inverted in S607, and the process proceeds to S608, the synthesis unit 305 generates the composite image I3 by superimposing the horizontally inverted clothing object image I2 on the image of the subject in the subject image I1.
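
The flow from S602 to S608 can be summarized in a short Python sketch. The pose estimator and the clothing renderer are passed in as callables because the description leaves them to known techniques; the toy stand-ins below (a single "P" pixel as the person, "R" and "L" pixels as the sleeves) and all function names are hypothetical and serve only to make the left-right behavior visible.

```python
def flip_h(image):
    """Reverse the pixel order of every row (left-right flip)."""
    return [row[::-1] for row in image]


def overlay(base, layer):
    """Draw layer over base; None in the layer means transparent."""
    return [
        [b if p is None else p for b, p in zip(base_row, layer_row)]
        for base_row, layer_row in zip(base, layer)
    ]


def generate_composite(subject_image, flipped_mode, estimate_pose, render_clothing):
    # S602 to S604: estimate the pose from the flipped or the original image.
    pose_input = flip_h(subject_image) if flipped_mode else subject_image
    pose = estimate_pose(pose_input)
    # S605: render the clothing object image for that pose.
    clothing = render_clothing(pose, len(subject_image[0]), len(subject_image))
    # S606 and S607: in the flipped mode, flip the clothing image.
    if flipped_mode:
        clothing = flip_h(clothing)
    # S608: superimpose the clothing on the unflipped subject image.
    return overlay(subject_image, clothing)


# Toy stand-ins: the renderer draws the right sleeve "R" to the image-left
# of the person and the left sleeve "L" to the image-right, as it would
# for a person who faces the camera directly.
def toy_estimate_pose(image):
    return image[0].index("P")


def toy_render_clothing(column, width, height):
    layer = [[None] * width for _ in range(height)]
    layer[0][column - 1] = "R"
    layer[0][column + 1] = "L"
    return layer


subject = [[".", "P", ".", ".", "."]]
normal = generate_composite(subject, False, toy_estimate_pose, toy_render_clothing)
mirrored = generate_composite(subject, True, toy_estimate_pose, toy_render_clothing)
```

In this toy setup the normal mode places "R" on the image-left of the person, while the flipped mode places "R" on the image-right, which is where the right arm of a mirror-image subject appears.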

図９（ａ）と図９（ｂ）は、Ｓ６０８の処理によって生成された合成画像の例を示した図である。図９（ａ）は通常モードである場合に生成される合成画像の例を示し、図９（ｂ）は左右反転モードである場合に生成される合成画像の例を示している。特に図９（ｂ）の合成画像では、鏡に映った被写体５０１を背面カメラ２０８で撮像した場合において、衣料品オブジェクト画像７０１の右袖が鏡像である被写体５０１の右腕側に重畳され、被写体５０１が衣料品を装着した際の見えが再現されている。一方、図９（ａ）の合成画像では、鏡に映った被写体５０１を背面カメラ２０８で撮像した場合、衣料品オブジェクト画像７０１の左袖が鏡像である被写体５０１の右腕側に重畳されており、被写体５０１が衣料品を装着した際の見えが左右で整合していない。 FIGS. 9(a) and 9(b) are diagrams showing examples of composite images generated by the processing of S608. FIG. 9(a) shows an example of a composite image generated in the normal mode, and FIG. 9(b) shows an example of a composite image generated in the left-right inversion mode. In the composite image of FIG. 9(b), where the subject 501 reflected in a mirror is captured by the rear camera 208, the right sleeve of the clothing object image 701 is superimposed on the right-arm side of the mirror-image subject 501, reproducing how the subject 501 looks when wearing the clothing. In the composite image of FIG. 9(a), by contrast, under the same capture conditions the left sleeve of the clothing object image 701 is superimposed on the right-arm side of the mirror-image subject 501, so the left-right relationship between the subject 501 and the clothing is inconsistent.

図４のフローチャートに説明を戻す。 Returning to the flowchart of FIG. 4.
Ｓ４０３の合成画像の表示処理の際、表示部３０６は、合成画像Ｉ３とともに図１０に示すようなＵＩを描画し、ディスプレイ２０４に表示する。 When performing the composite image display process in S403, the display unit 306 renders a UI such as that shown in FIG. 10 together with the composite image I3 and displays it on the display 204.
図１０（ａ）と図１０（ｂ）は、被写体５０１に衣料品オブジェクト画像７０１が重畳された合成画像が表示されたディスプレイ２０４の画面例を示した図である。図１０（ａ）は通常モードである場合の画面例を示し、図１０（ｂ）は左右反転モードである場合の画面例を示している。 FIGS. 10(a) and 10(b) are diagrams showing example screens of the display 204 displaying a composite image in which the clothing object image 701 is superimposed on the subject 501. FIG. 10(a) shows an example screen in the normal mode, and FIG. 10(b) shows an example screen in the left-right inversion mode.

表示部３０６は、被写体５０１に衣料品オブジェクト画像Ｉ２が重畳された合成画像Ｉ３に加えて、生成モードが左右反転モードと通常モードのどちらであるかを示す所定の通知アイコンとしてのモード通知アイコン１００１をも表示する。モード通知アイコン１００１は、生成モードが通常モードである場合と左右反転モードである場合とで、その表示形態が変えられて表示される。本実施形態の場合、モード通知アイコン１００１は、生成モードが通常モードにあるときには例えばグレー色で表示され、左右反転モードにあるときには白色で表示されるとする。これにより、ユーザに対して、現時点の生成モードが通常モードか左右反転モードのいずれであるかを通知することができる。なお、表示部３０６はモード通知アイコン１００１の表示とともに、終了ボタン１００２をも表示させる。終了ボタン１００２は、ユーザが試着アプリケーション２を終了させる際に例えばタップされるボタンである。この終了ボタン１００２がタップされた場合、試着アプリケーション２は、後述するＳ４０６において終了処理を行う。 In addition to the composite image I3 in which the clothing object image I2 is superimposed on the subject 501, the display unit 306 also displays a mode notification icon 1001 as a predetermined notification icon indicating whether the generation mode is the horizontally inverted mode or the normal mode. The mode notification icon 1001 is displayed in a different form depending on whether the generation mode is the normal mode or the horizontally inverted mode. In this embodiment, the mode notification icon 1001 is displayed, for example, in gray when the generation mode is the normal mode and in white when it is the horizontally inverted mode. This notifies the user whether the current generation mode is the normal mode or the horizontally inverted mode. The display unit 306 also displays an end button 1002 together with the mode notification icon 1001. The end button 1002 is a button that the user taps, for example, to end the try-on application 2. When the end button 1002 is tapped, the try-on application 2 performs end processing in S406, which will be described later.

前述のようなＳ４０３の処理後、試着アプリケーション２の処理はＳ４０４に進む。 After the process of S403 as described above, the process of the try-on application 2 proceeds to S404.
Ｓ４０４に進むと、操作受付部３０２は、ユーザからモード切替指示を受け付けたか否かを判定する。本実施形態の場合、操作受付部３０２は、ディスプレイ２０４の画面上でモード通知アイコン１００１がユーザによりタップされたか否かによって、モード切替指示が行われたか否かを判定する。そして、操作受付部３０２がユーザからのモード切替指示を受け付けた場合、試着アプリケーション２の処理は、Ｓ４０５に進む。 When the process proceeds to S404, the operation acceptance unit 302 determines whether or not a mode switching instruction has been received from the user. In the present embodiment, the operation acceptance unit 302 determines whether or not a mode switching instruction has been issued based on whether or not the user has tapped the mode notification icon 1001 on the screen of the display 204. When the operation acceptance unit 302 has accepted a mode switching instruction from the user, the process of the try-on application 2 proceeds to S405.

Ｓ４０５に進むと、合成部３０５は、生成モードを切り替える。すなわちＳ４０４において操作受付部３０２がユーザからモード切替指示を受け付ける前の時点で通常モードであった場合、合成部３０５は、生成モードを左右反転モードへ切り替える。一方、Ｓ４０４において操作受付部３０２がユーザからモード切替指示を受け付ける前の時点で左右反転モードであった場合、合成部３０５は、生成モードを通常モードへ切り替える。 When the process proceeds to S405, the composition unit 305 switches the generation mode. That is, if the normal mode was selected before the operation reception unit 302 received a mode switching instruction from the user in S404, the composition unit 305 switches the generation mode to the horizontally inverted mode. On the other hand, if the horizontally inverted mode was selected before the operation reception unit 302 received a mode switching instruction from the user in S404, the composition unit 305 switches the generation mode to the normal mode.

その後、Ｓ４０６において、操作受付部３０２は、ディスプレイ２０４上の終了ボタン１００２がユーザによりタップされたか否かを判定する。そして、操作受付部３０２において終了ボタン１００２がタップされたと判定された場合、試着アプリケーション２は、図４のフローチャートの処理を終了し、そうでない場合にはＳ４０１に処理を戻して以降の処理を繰り返す。すなわち例えばＳ４０５で生成モードの切り替えが行われた場合、Ｓ４０１からＳ４０３では、その切り替え後の生成モードに応じた処理が行われることになる。 Then, in S406, the operation acceptance unit 302 determines whether or not the user has tapped the end button 1002 on the display 204. If the operation acceptance unit 302 determines that the end button 1002 has been tapped, the try-on application 2 ends the processing of the flowchart in FIG. 4; otherwise, the process returns to S401 and the subsequent processing is repeated. That is, for example, if the generation mode is switched in S405, the processing in S401 to S403 is then performed according to the generation mode after the switch.
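
The loop formed by S401 to S406 can be sketched in Python as follows. The sketch assumes one user event per displayed frame and uses the hypothetical event names "tap_mode_icon" and "tap_end"; the class `GenerationMode` and the functions `toggle` and `run` are likewise illustrative and are not taken from the description.

```python
from enum import Enum


class GenerationMode(Enum):
    NORMAL = "normal"
    FLIPPED = "flipped"


def toggle(mode):
    """S405: switch to the other generation mode."""
    if mode is GenerationMode.NORMAL:
        return GenerationMode.FLIPPED
    return GenerationMode.NORMAL


def run(events, initial=GenerationMode.NORMAL):
    """Return the generation mode used for each displayed frame.

    events: iterable with one entry per frame, each being
            "tap_mode_icon", "tap_end" or None (no input).
    """
    mode = initial
    modes_used = []
    for event in events:
        modes_used.append(mode)        # S401 to S403: acquire, compose, display
        if event == "tap_mode_icon":   # S404: switching instruction received?
            mode = toggle(mode)        # S405: switch the generation mode
        if event == "tap_end":         # S406: end button tapped?
            break
    return modes_used


history = run([None, "tap_mode_icon", None, "tap_mode_icon", "tap_end", None])
```

A tap on the mode notification icon only takes effect from the next frame, which mirrors the statement that S401 to S403 are performed according to the generation mode after the switch.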

以上説明したように、第１の実施形態によれば、鏡に映った被写体人物の鏡像が背面カメラ２０８で撮像されて被写体人物と衣料品との左右関係が整合していない場合、ユーザからのモード切替指示に応じて、それらの左右関係が整合した画像を表示可能となる。 As described above, according to the first embodiment, when a mirror image of a subject person reflected in a mirror is captured by the rear camera 208 and the left-right relationship between the subject person and the clothing item is not consistent, an image in which the left-right relationship is consistent can be displayed in response to a mode switching instruction from the user.

＜第１の実施形態の変形例＞ <Modification of the first embodiment>
上述した第１の実施形態では、ユーザから受け付けるモード切替指示として、モード通知アイコン１００１へのタップ操作を例に挙げたが、モード通知アイコン１００１とは別の一部画面領域へのタップ操作をモード切替指示として受け付けてもよい。或いは、ユーザが情報処理装置１を振る動作を加速度センサ２０６にて検知可能とし、当該情報処理装置１を振る操作がなされた場合に、それをモード切替指示として受け付けるようにしてもよい。この例の場合、ユーザは、例えば片手で情報処理装置１を振るような動作のみで生成モードの切り替えを指示することができ、生成モードを切り替えるために情報処理装置１を持ち替えるなどのような姿勢を大きく変更する動作を行わなくてもよくなる。また例えば、ユーザが発する音声をマイクロフォン２１１にて検知し、ユーザからの所定の音声が入力された場合、それを以てモード切替指示の受け付けとしてもよい。この例の場合も、ユーザは、生成モードを切り替えるために姿勢を大きく変更する動作などを行わなくてもよくなる。これらの変形例は適宜組み合わせて使用することも可能である。 In the first embodiment described above, a tap operation on the mode notification icon 1001 was given as an example of a mode switching instruction received from the user, but a tap operation on a partial screen area other than the mode notification icon 1001 may instead be received as the mode switching instruction. Alternatively, the acceleration sensor 206 may be used to detect the user shaking the information processing device 1, and when the information processing device 1 is shaken, that operation may be received as the mode switching instruction. In this example, the user can instruct the generation mode to be switched simply by, for example, shaking the information processing device 1 with one hand, and does not need to make a large change in posture, such as changing the grip on the information processing device 1, in order to switch the generation mode. As a further example, the microphone 211 may detect the voice uttered by the user, and input of a predetermined voice from the user may be treated as acceptance of the mode switching instruction. In this example as well, the user does not need to make a large change in posture in order to switch the generation mode. These modifications can also be used in combination as appropriate.
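
Detecting a shake from accelerometer readings could be approached, for example, by counting how many samples in a short window exceed a magnitude threshold. The description does not state how the shake is detected, so the following Python sketch is only one possible approach; the threshold of 2.5 times gravity, the minimum of three peaks, and the function name `is_shake` are assumptions chosen for illustration.

```python
import math

GRAVITY = 9.81  # m/s^2


def is_shake(samples, threshold=2.5 * GRAVITY, min_peaks=3):
    """Return True when enough samples exceed the magnitude threshold.

    samples: iterable of (x, y, z) accelerations in m/s^2 from one
             short time window.
    """
    peaks = sum(
        1 for x, y, z in samples
        if math.sqrt(x * x + y * y + z * z) > threshold
    )
    return peaks >= min_peaks


# A device at rest reports roughly gravity only; a shaken device shows
# several strong peaks within the window.
resting = [(0.0, 0.0, 9.81)] * 10
shaking = [(30.0, 0.0, 9.81), (0.0, 0.0, 9.81),
           (-28.0, 5.0, 9.81), (31.0, -4.0, 9.81)]
```

Requiring several peaks rather than one is a design choice that reduces accidental mode switches caused by a single bump.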

＜第２の実施形態＞ <Second Embodiment>
第１の実施形態では、ユーザからのモード切替指示を受け付けた場合に、生成モードを切り替える例を挙げた。第２の実施形態では、画像取得部３０１にて取得された画像に実在の鏡が映っている場合に生成モードを自動的に切り替える処理と、第１の実施形態と同様にユーザからのモード切替指示に応じて生成モードを切り替える処理との両方を含む例について説明する。第２の実施形態の場合、画像取得部３０１で取得された画像に鏡が映っている場合、つまり画像が鏡像である可能性が高い場合には、左右反転モードに自動的に移行する。また第２の実施形態では、左右反転モードに移行して、衣料品オブジェクト画像の左右を反転させて表示したくない場合、ユーザは、モード切替指示を行うことで左右反転していない衣料品オブジェクト画像を重畳した合成画像が生成されるようにすることもできる。なお、衣料品オブジェクト画像を左右反転させて表示したくないユースケースとしては、例えば、衣料品に文字が記載されており、左右反転すると読み難くなる場合などが考えられる。第２の実施形態の情報処理装置１のハードウェア外観構成、ハードウェア構成は、前述した図１、図２と同じであるため、それらの図示と説明は省略する。 The first embodiment gave an example in which the generation mode is switched when a mode switching instruction is received from the user. The second embodiment describes an example that includes both a process of automatically switching the generation mode when a real mirror appears in the image acquired by the image acquisition unit 301 and, as in the first embodiment, a process of switching the generation mode in response to a mode switching instruction from the user. In the second embodiment, when a mirror appears in the image acquired by the image acquisition unit 301, that is, when the image is highly likely to be a mirror image, the mode is automatically switched to the left-right inversion mode. In addition, in the second embodiment, when the user does not want the mode to shift to the left-right inversion mode and the clothing object image to be displayed flipped, the user can issue a mode switching instruction so that a composite image is generated in which a clothing object image that is not flipped is superimposed. One use case in which flipping the clothing object image is undesirable is, for example, when text is printed on the clothing and would become hard to read if flipped. The external appearance and hardware configuration of the information processing device 1 of the second embodiment are the same as those in FIG. 1 and FIG. 2 described above, so their illustration and description are omitted.

図１１は、第２の実施形態の情報処理装置１における機能構成例を示した図である。図１１の機能構成において、鏡像判定部１１０１とモード選択部１１０２以外の各機能部は第１の実施形態の図３に示した各機能部と同じ構成であるため、それらの説明は省略する。 Fig. 11 is a diagram showing an example of the functional configuration of the information processing device 1 of the second embodiment. In the functional configuration of Fig. 11, the functional units other than the mirror image determination unit 1101 and the mode selection unit 1102 have the same configuration as the functional units shown in Fig. 3 of the first embodiment, so their description will be omitted.

鏡像判定部１１０１は、画像取得部３０１で取得された被写体画像内に鏡が写っているか否か、より詳細には被写体の画像が鏡像であるか否かを判定する。本実施形態では、公知の深層学習等を用いた方法により、画像に鏡が写っているか否かの処理を行うとする。 The mirror image determination unit 1101 determines whether or not a mirror is reflected in the subject image acquired by the image acquisition unit 301, and more specifically, whether or not the image of the subject is a mirror image. In this embodiment, the process of determining whether or not a mirror is reflected in the image is performed using a method using well-known deep learning, etc.

モード選択部１１０２は、鏡像判定部１１０１において画像内に鏡が写っていると判定された場合には生成モードを左右反転モードに設定し、一方、鏡が写っていないと判定された場合には生成モードを通常モードに設定する。そしてモード選択部１１０２は、その生成モードの設定情報を姿勢推定部３０４と合成部３０５に知らせる。このため第２の実施形態の場合、姿勢推定部３０４と合成部３０５では、モード選択部１１０２にて設定された生成モードに応じた処理を行う。 When the mirror image determination unit 1101 determines that a mirror is reflected in the image, the mode selection unit 1102 sets the generation mode to the horizontally inverted mode, and when it determines that a mirror is not reflected in the image, the mode selection unit 1102 sets the generation mode to the normal mode. The mode selection unit 1102 then notifies the orientation estimation unit 304 and the synthesis unit 305 of the generation mode setting information. Therefore, in the case of the second embodiment, the orientation estimation unit 304 and the synthesis unit 305 perform processing according to the generation mode set by the mode selection unit 1102.

図１２は、第２の実施形態の試着アプリケーション２において実行される情報処理の流れを示したフローチャートである。なお、Ｓ１２０１からＳ１２０３までの処理は、第１の実施形態のＳ４０１からＳ４０３までと同じ処理であるため、それらの説明は省略する。 FIG. 12 is a flowchart showing the flow of information processing executed in the try-on application 2 of the second embodiment. Note that the processing from S1201 to S1203 is the same as the processing from S401 to S403 of the first embodiment, and therefore a description thereof will be omitted.

Ｓ１２０３の処理後、Ｓ１２０４に進むと、鏡像判定部１１０１は、画像取得部３０１で取得された被写体画像Ｉ１に鏡が写っているか否か、より詳細には被写体の画像が鏡像であるか否かを判定する。そして鏡像判定部１１０１において鏡像判定処理で被写体画像Ｉ１に鏡が写っていると判定された場合、試着アプリケーション２の処理はＳ１２０５に進み、一方、鏡が写っていないと判定された場合にはＳ１２０６に進む。すなわち鏡像判定部１１０１は、被写体画像Ｉ１内に写っている鏡の領域の中に、被写体が写っているか否かを判定する。そして、鏡の領域の中に被写体が写っていると判定された場合、試着アプリケーション２の処理はＳ１２０５に進み、一方、鏡の中に被写体が写っていないと判定された場合にはＳ１２０６に進むようにする。 After processing S1203, the process proceeds to S1204, where the mirror image determination unit 1101 determines whether or not a mirror is reflected in the subject image I1 acquired by the image acquisition unit 301, more specifically, whether or not the image of the subject is a mirror image. If the mirror image determination unit 1101 determines in the mirror image determination process that a mirror is reflected in the subject image I1, the process of the try-on application 2 proceeds to S1205, whereas if it is determined that a mirror is not reflected, the process proceeds to S1206. That is, the mirror image determination unit 1101 determines whether or not a subject is reflected in the area of the mirror reflected in the subject image I1. If it is determined that a subject is reflected in the area of the mirror, the process of the try-on application 2 proceeds to S1205, whereas if it is determined that a subject is not reflected in the mirror, the process proceeds to S1206.

Ｓ１２０５に進むと、モード選択部１１０２は、生成モードを自動的に左右反転モードに設定する。一方、Ｓ１２０６に進んだ場合、モード選択部１１０２は、生成モードを通常モードに設定する。そして、Ｓ１２０５またはＳ１２０６の後、試着アプリケーション２の処理はＳ１２０７に進む。 If the process proceeds to S1205, the mode selection unit 1102 automatically sets the generation mode to the horizontally inverted mode. On the other hand, if the process proceeds to S1206, the mode selection unit 1102 sets the generation mode to the normal mode. Then, after S1205 or S1206, the process of the try-on application 2 proceeds to S1207.
Ｓ１２０７からＳ１２０９までの処理は、第１の実施形態のＳ４０４からＳ４０６までと概ね同様の処理であるため、それらの説明は省略する。なお、第２の実施形態の場合、Ｓ１２０７において、操作受付部３０２は、モード通知アイコン１００１を例えばオンとオフの２状態を備えるスイッチとして扱ってもよい。例えば、操作受付部３０２は、ユーザによってモード通知アイコン１００１がタップされたときに生成モード切替指示の受け付けを可能とし、再びタップがなされた場合にはモード切替指示を受け付けないようにしてもよい。 The processes from S1207 to S1209 are generally similar to those from S404 to S406 in the first embodiment, and therefore their description will be omitted. In the second embodiment, in S1207, the operation reception unit 302 may treat the mode notification icon 1001 as a switch having two states, for example, on and off. For example, the operation reception unit 302 may enable acceptance of a generation mode switching instruction when the user taps the mode notification icon 1001, and may stop accepting the mode switching instruction when the icon is tapped again.
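
The combination of automatic selection (S1204 to S1206) and the user's switch (S1207 onward) can be sketched in Python under one possible reading of the two-state icon: while the switch is on, the mode opposite to the automatically selected one is used, and while it is off, the automatic selection is used as is. This reading, the string mode names, and the function names `select_mode` and `apply_user_switch` are assumptions and are not stated in the description.

```python
def select_mode(mirror_detected):
    """S1204 to S1206: choose the mode from the mirror-image determination."""
    return "flipped" if mirror_detected else "normal"


def apply_user_switch(auto_mode, switch_on):
    """Use the opposite of the automatic choice while the icon switch is on.

    This is only one interpretation of the on/off icon behavior.
    """
    if not switch_on:
        return auto_mode
    return "normal" if auto_mode == "flipped" else "flipped"


# A mirror is detected, but the clothing carries text that the user wants
# to keep readable, so the user turns the switch on.
mode_with_override = apply_user_switch(select_mode(True), switch_on=True)
mode_without_override = apply_user_switch(select_mode(True), switch_on=False)
```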

以上説明したように第２の実施形態の情報処理装置１によれば、画像に鏡が写っている場合に、自動的に左右反転モードに移行する。また第２の実施形態では、衣料品オブジェクト画像を左右反転させて表示したくない場合には、ユーザからのモード切替指示に応じて左右反転していない衣料品オブジェクト画像を表示させることができる。 As described above, according to the information processing device 1 of the second embodiment, if a mirror is reflected in the image, the device automatically switches to the left-right inversion mode. Also, in the second embodiment, if the user does not want to display the clothing object image in a left-right inverted state, the device can display the clothing object image that is not left-right inverted in response to a mode switching instruction from the user.

＜第２の実施形態の変形例＞ <Modification of the second embodiment>
第２の実施形態において、画像取得部３０１で取得する被写体画像Ｉ１は、前面カメラ２０９で撮像した画像であってもよい。その際、モード選択部１１０２は、鏡像判定部１１０１にて画像内に鏡が写っていると判定された場合には生成モードを通常モードに設定し、一方、鏡が写っていないと判定された場合には左右反転モードに設定してもよい。 In the second embodiment, the subject image I1 acquired by the image acquisition unit 301 may be an image captured by the front camera 209. In this case, the mode selection unit 1102 may set the generation mode to the normal mode when the mirror image determination unit 1101 determines that a mirror appears in the image, and may set the generation mode to the left-right inversion mode when the mirror image determination unit 1101 determines that no mirror appears in the image.
また、第２の実施形態において、鏡像判定部１１０１は、画像取得部３０１にて取得された被写体画像Ｉ１の一部領域に鏡が写っているかを判定してもよい。なお、被写体画像Ｉ１中で被写体５０１が写っている領域を、被写体画像Ｉ１の一部領域としてもよい。 In the second embodiment, the mirror image determination unit 1101 may also determine whether a mirror appears in a partial area of the subject image I1 acquired by the image acquisition unit 301. Note that the area of the subject image I1 in which the subject 501 appears may be used as the partial area of the subject image I1.

また例えば、不図示の視線追跡装置等を用いて被写体画像Ｉ１中でユーザが注目して見ている注視領域を取得可能とし、それを前述した被写体画像Ｉ１の一部領域つまり鏡の領域としてもよい。すなわちユーザが鏡の領域を注視している場合、モード選択部１１０２は左右反転モードを選択する。これにより、ユーザが鏡の領域を注視している場合には左右反転モードになることで、鏡に映った被写体人物と衣料品との左右関係が整合した画像を表示可能となる。一方、ユーザが鏡の領域を注視していない場合、モード選択部１１０２は、生成モードの選択を行わずに、現時点の生成モードを維持するようにする。これにより被写体画像Ｉ１中に鏡は写っているが、ユーザがその鏡を注視していない場合、つまりユーザが鏡に注意を払っていない場合には、ユーザが意図していないときに生成モードが自動的に切り替えられてしまうのを防止することが可能となる。これらの各変形例は、適宜組み合わせて使用することも可能である。 For example, a gaze tracking device (not shown) may be used to obtain the gaze area in the subject image I1 that the user is focusing on, and this may be used as a partial area of the subject image I1, i.e., the mirror area. That is, when the user is gazing at the mirror area, the mode selection unit 1102 selects the left-right inversion mode. As a result, when the user is gazing at the mirror area, the left-right inversion mode is set, and an image in which the left-right relationship between the subject person and the clothing reflected in the mirror is consistent can be displayed. On the other hand, when the user is not gazing at the mirror area, the mode selection unit 1102 does not select a generation mode, but maintains the current generation mode. As a result, when a mirror is reflected in the subject image I1, but the user is not gazing at the mirror, that is, when the user is not paying attention to the mirror, it is possible to prevent the generation mode from being automatically switched when the user does not intend. These modified examples can also be used in appropriate combinations.
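
The gaze-based variation can be sketched in Python as follows, assuming the mirror area is available as an axis-aligned rectangle in image coordinates and the gaze is available as a single point. The rectangle representation and the function names `contains` and `select_mode_by_gaze` are hypothetical; the description only states that the left-right inversion mode is selected when the user gazes at the mirror area and that the current mode is otherwise kept.

```python
def contains(region, point):
    """Return True when point (x, y) lies inside region (left, top, right, bottom)."""
    left, top, right, bottom = region
    x, y = point
    return left <= x < right and top <= y < bottom


def select_mode_by_gaze(current_mode, mirror_region, gaze_point):
    """Select the flipped mode only while the user gazes at the mirror area.

    mirror_region: rectangle of the detected mirror, or None when no
                   mirror was found in the image.
    """
    if mirror_region is not None and contains(mirror_region, gaze_point):
        return "flipped"
    # Not gazing at a mirror: keep the current mode unchanged.
    return current_mode


mirror = (100, 50, 300, 400)
looking_at_mirror = select_mode_by_gaze("normal", mirror, (150, 200))
looking_elsewhere = select_mode_by_gaze("normal", mirror, (350, 200))
```

Returning `current_mode` in the second branch, rather than forcing the normal mode, is what prevents the mode from changing when the user is not paying attention to the mirror.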

なお前述した第１、第２の実施形態において、情報処理装置１のハードウェア構成のうち入出力に関連するディスプレイ２０４、加速度センサ２０６、背面カメラ２０８等は、外部装置として情報処理装置１とは別に構成されていてもよい。その場合、外部装置と情報処理装置１は、例えばシリアルバス等で接続される。 In the first and second embodiments described above, the display 204, acceleration sensor 206, rear camera 208, and the like, which are related to input and output, among the hardware configurations of the information processing device 1, may be configured as external devices separately from the information processing device 1. In that case, the external devices and the information processing device 1 are connected, for example, by a serial bus or the like.

本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。上述の実施形態は、何れも本発明を実施するにあたっての具体化の例を示したものに過ぎず、これらによって本発明の技術的範囲が限定的に解釈されてはならないものである。すなわち、本発明は、その技術思想、又はその主要な特徴から逸脱することなく、様々な形で実施することができる。 The present invention can also be realized by supplying a program that realizes one or more of the functions of the above-mentioned embodiments to a system or device via a network or storage medium, and having one or more processors in the computer of the system or device read and execute the program. It can also be realized by a circuit (e.g., an ASIC) that realizes one or more functions. The above-mentioned embodiments are merely examples of concrete implementations of the present invention, and the technical scope of the present invention should not be interpreted in a limiting manner by them. In other words, the present invention can be implemented in various forms without departing from its technical concept or main features.

各実施形態の開示は、以下の構成、方法、およびプログラムを含む。
（構成１）
被写体の画像を取得する画像取得手段と、
前記被写体の画像に重畳するオブジェクト情報を取得するオブジェクト取得手段と、
前記オブジェクト情報を左右反転させずに前記被写体の画像に重畳した合成画像を生成する第１のモードと、左右反転させた前記オブジェクト情報を前記被写体の画像に重畳した合成画像を生成する第２のモードとを含む合成手段と、
前記第１のモードと前記第２のモードのいずれかへの切替指示を受け付ける受付手段と、
前記合成手段にて生成された合成画像を表示する表示制御手段と、
を有することを特徴とする情報処理装置。
（構成２）
前記被写体の画像を基に前記被写体の位置姿勢を推定する推定手段をさらに有し、
前記合成手段は、前記推定された被写体の位置姿勢に基づいて、前記合成画像を生成することを特徴とする構成１に記載の情報処理装置。
（構成３）
前記受付手段は、ユーザからの所定の操作を、前記切替指示として受け付けることを特徴とする構成１または２に記載の情報処理装置。
（構成４）
前記表示制御手段は、前記合成手段が前記第２のモードにある場合には、前記左右反転させたオブジェクト情報を表示していることをユーザに通知することを特徴とする構成１乃至３のいずれか１構成に記載の情報処理装置。
（構成５）
前記受付手段は、前記合成手段で生成された合成画像の一部領域に対するユーザからの所定の操作を、前記切替指示として受け付けることを特徴とする構成１乃至４のいずれか１構成に記載の情報処理装置。
（構成６）
前記受付手段は、音声を取得する音声取得手段を含み、前記音声取得手段により取得した所定の音声を、前記切替指示として受け付けることを特徴とする構成１乃至５のいずれか１構成に記載の情報処理装置。
（構成７）
前記受付手段は、ユーザによる当該情報処理装置を振る動作を検知する検知手段を含み、前記検知手段による情報処理装置を振る動作の検知を、前記切替指示として受け付けることを特徴とする構成１乃至６のいずれか１構成に記載の情報処理装置。
（構成８）
前記被写体の画像が鏡像であるか否かを判定する判定手段と、
前記判定手段による判定の結果を基に、前記合成手段の前記第１のモードと前記第２のモードのいずれかを選択するモード選択手段と、
をさらに有し、
前記モード選択手段は、前記判定手段で前記被写体の画像が鏡像でないと判定した場合には前記第１のモードを選択し、前記判定手段で前記被写体の画像が鏡像であると判定した場合には前記第２のモードを選択することを特徴とする構成１乃至７のいずれか１構成に記載の情報処理装置。
（構成９）
前記受付手段は、前記モード選択手段にて前記選択されたモードを、前記切替指示に応じて切り替えることを特徴とする構成８に記載の情報処理装置。
（構成１０）
前記判定手段は、ユーザが注視している画像の領域が鏡像であるか否か判定し、
前記モード選択手段は、前記判定手段によってユーザが注視している画像の領域が鏡像であると判定された場合に前記第２のモードを選択することを特徴とする構成８または９に記載の情報処理装置。
（構成１１）
前記判定手段は、ユーザが注視している画像の領域が鏡像であるか否か判定し、
前記モード選択手段は、前記判定手段によってユーザが注視している画像の領域が鏡像でないと判定された場合にはモードの選択を行わないことを特徴とする構成８乃至１０のいずれか１構成に記載の情報処理装置。
（構成１２）
前記オブジェクト情報は、３次元オブジェクトモデルと２次元画像との少なくとも１つであることを特徴とする構成１乃至１１のいずれか１構成に記載の情報処理装置。
（構成１３）
ディスプレイと、
前記ディスプレイの画面が配された側に設けられる前面カメラと、
前記ディスプレイの画面に対して背面側に設けられる背面カメラと、
を備え、
前記画像取得手段が取得する前記被写体の画像は、前記背面カメラにて撮像された画像であり、
前記表示制御手段は、前記ディスプレイの画面に前記合成画像を表示することを特徴とする構成１乃至１２のいずれか１構成に記載の情報処理装置。
（方法１）
被写体の画像を取得する画像取得工程と、
前記被写体の画像に重畳するオブジェクト情報を取得するオブジェクト取得工程と、
前記オブジェクト情報を左右反転させずに前記被写体の画像に重畳した合成画像を生成する第１のモードと、左右反転させた前記オブジェクト情報を前記被写体の画像に重畳した合成画像を生成する第２のモードとを含む合成工程と、
前記第１のモードと前記第２のモードのいずれかへの切替指示を受け付ける受付工程と、
前記合成工程にて生成された合成画像を表示する表示制御工程と、
を有することを特徴とする情報処理方法。
（プログラム１）
コンピュータを、構成１乃至１３のいずれか１構成に記載の情報処理装置として機能させるプログラム。
The disclosure of each embodiment includes the following configurations, methods, and programs.
(Configuration 1)
An image acquisition means for acquiring an image of a subject;
an object acquisition means for acquiring object information to be superimposed on the image of the subject;
a synthesis means including a first mode for generating a synthetic image in which the object information is superimposed on the image of the subject without being inverted horizontally, and a second mode for generating a synthetic image in which the object information is superimposed on the image of the subject with the object information inverted horizontally;
a receiving means for receiving an instruction to switch to either the first mode or the second mode;
A display control means for displaying a composite image generated by the composite means;
13. An information processing device comprising:
(Configuration 2)
The method further includes estimating a position and orientation of the object based on an image of the object,
2. The information processing apparatus according to configuration 1, wherein the synthesis means generates the synthetic image based on the estimated position and orientation of the subject.
(Configuration 3)
3. The information processing apparatus according to configuration 1 or 2, wherein the accepting means accepts a predetermined operation from a user as the switching instruction.
(Configuration 4)
The information processing device according to any one of configurations 1 to 3, wherein the display control means notifies a user that horizontally inverted object information is being displayed when the synthesis means is in the second mode.
(Configuration 5)
The information processing device according to any one of configurations 1 to 4, wherein the receiving means receives, as the switching instruction, a predetermined operation by a user on a partial area of the composite image generated by the synthesis means.
(Configuration 6)
The information processing device according to any one of configurations 1 to 5, wherein the receiving means includes a voice acquisition means for acquiring voice, and receives a predetermined voice acquired by the voice acquisition means as the switching instruction.
(Configuration 7)
The information processing device according to any one of configurations 1 to 6, wherein the receiving means includes a detection means for detecting a user's action of shaking the information processing device, and receives the detection of that shaking action by the detection means as the switching instruction.
(Configuration 8)
The information processing device according to any one of configurations 1 to 7, further comprising:
a determination means for determining whether the image of the subject is a mirror image; and
a mode selection means for selecting either the first mode or the second mode of the synthesis means based on a result of the determination by the determination means,
wherein the mode selection means selects the first mode when the determination means determines that the image of the subject is not a mirror image, and selects the second mode when the determination means determines that the image of the subject is a mirror image.
(Configuration 9)
The information processing device according to configuration 8, wherein the receiving means switches the mode selected by the mode selection means in response to the switching instruction.
(Configuration 10)
The information processing device according to configuration 8 or 9, wherein the determination means determines whether an area of the image at which the user is gazing is a mirror image, and
the mode selection means selects the second mode when the determination means determines that the area of the image at which the user is gazing is a mirror image.
(Configuration 11)
The information processing device according to any one of configurations 8 to 10, wherein the determination means determines whether an area of the image at which the user is gazing is a mirror image, and
the mode selection means does not select a mode when the determination means determines that the area of the image at which the user is gazing is not a mirror image.
(Configuration 12)
The information processing device according to any one of configurations 1 to 11, wherein the object information is at least one of a three-dimensional object model and a two-dimensional image.
(Configuration 13)
The information processing device according to any one of configurations 1 to 12, further comprising:
a display;
a front camera provided on the side on which the screen of the display is disposed; and
a rear camera provided on the rear side with respect to the screen of the display,
wherein the image of the subject acquired by the image acquisition means is an image captured by the rear camera, and
the display control means displays the composite image on the screen of the display.
(Method 1)
An information processing method comprising:
an image acquisition step of acquiring an image of a subject;
an object acquisition step of acquiring object information to be superimposed on the image of the subject;
a synthesis step having a first mode for generating a composite image in which the object information is superimposed on the image of the subject without being horizontally inverted, and a second mode for generating a composite image in which horizontally inverted object information is superimposed on the image of the subject;
a receiving step of receiving an instruction to switch to either the first mode or the second mode; and
a display control step of displaying the composite image generated in the synthesis step.
(Program 1)
A program that causes a computer to function as the information processing device according to any one of configurations 1 to 13.
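
As a rough illustration of Configurations 1, 8, and 9, the following Python sketch models the two synthesis modes, the mirror-based mode selection, and a user-driven switch. All names (`Mode`, `compose`, `select_mode`, `toggle`) are hypothetical and do not appear in the disclosure. The sketch makes several simplifying assumptions: images are nested lists, `None` marks a transparent overlay pixel, the object information is flipped about the vertical center line of the frame, and the switching instruction acts as a toggle. Pose estimation (Configuration 2) and the mirror determination itself are outside its scope.

```python
from enum import Enum
from typing import List, Optional

Pixel = int
Image = List[List[Pixel]]
Overlay = List[List[Optional[Pixel]]]  # None marks a transparent pixel (assumption)


class Mode(Enum):
    FIRST = 1   # object information superimposed without horizontal inversion
    SECOND = 2  # object information superimposed after horizontal inversion


def flip_horizontal(overlay: Overlay) -> Overlay:
    """Mirror every row left to right (assumed flip about the frame center)."""
    return [row[::-1] for row in overlay]


def compose(subject: Image, overlay: Overlay, mode: Mode) -> Image:
    """Superimpose the overlay on the subject image.

    Only the overlay is flipped in SECOND mode; the subject image is untouched.
    Both inputs are assumed to have the same dimensions.
    """
    layer = flip_horizontal(overlay) if mode is Mode.SECOND else overlay
    return [
        [s if o is None else o for s, o in zip(s_row, o_row)]
        for s_row, o_row in zip(subject, layer)
    ]


def select_mode(subject_is_mirror_image: bool) -> Mode:
    """Configuration 8: SECOND mode for a mirror image, FIRST mode otherwise."""
    return Mode.SECOND if subject_is_mirror_image else Mode.FIRST


def toggle(mode: Mode) -> Mode:
    """Configuration 9, modeled as a toggle (assumption): switch to the other mode."""
    return Mode.FIRST if mode is Mode.SECOND else Mode.SECOND
```

For example, with `subject = [[0, 0, 0, 0]]` and `overlay = [[7, None, None, None]]` (a mark on the left edge of a garment), `compose(subject, overlay, select_mode(True))` places the mark on the right edge, and applying `toggle` to the selected mode restores the non-inverted placement.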

１：情報処理装置、２：試着アプリケーション、３０１：画像取得部、３０２：操作受付部、３０３：オブジェクト取得部、３０４：姿勢推定部、３０５：合成部、３０６：表示部 1: Information processing device, 2: Try-on application, 301: Image acquisition unit, 302: Operation reception unit, 303: Object acquisition unit, 304: Pose estimation unit, 305: Synthesis unit, 306: Display unit

Claims

An information processing device comprising:
an image acquisition means for acquiring an image of a subject;
an object acquisition means for acquiring object information to be superimposed on the image of the subject;
a synthesis means having a first mode for generating a composite image in which the object information is superimposed on the image of the subject without being horizontally inverted, and a second mode for generating a composite image in which horizontally inverted object information is superimposed on the image of the subject;
a receiving means for receiving an instruction to switch to either the first mode or the second mode; and
a display control means for displaying the composite image generated by the synthesis means.

The information processing device according to claim 1, further comprising an estimation means for estimating a position and orientation of the subject based on the image of the subject,
wherein the synthesis means generates the composite image based on the estimated position and orientation of the subject.

The information processing device according to claim 1, characterized in that the receiving means receives a predetermined operation from a user as the switching instruction.

The information processing device according to claim 1, characterized in that, when the synthesis means is in the second mode, the display control means notifies the user that horizontally inverted object information is being displayed.

The information processing device according to claim 1, characterized in that the receiving means receives, as the switching instruction, a predetermined operation by a user on a partial area of the composite image generated by the synthesis means.

The information processing device according to claim 1, characterized in that the receiving means includes a voice acquisition means for acquiring voice, and receives a predetermined voice acquired by the voice acquisition means as the switching instruction.

The information processing device according to claim 1, characterized in that the receiving means includes a detection means for detecting a user's action of shaking the information processing device, and receives the detection of that shaking action by the detection means as the switching instruction.

The information processing device according to claim 1, further comprising:
a determination means for determining whether the image of the subject is a mirror image; and
a mode selection means for selecting either the first mode or the second mode of the synthesis means based on a result of the determination by the determination means,
wherein the mode selection means selects the first mode when the determination means determines that the image of the subject is not a mirror image, and selects the second mode when the determination means determines that the image of the subject is a mirror image.

The information processing device according to claim 8, characterized in that the receiving means switches the mode selected by the mode selection means in response to the switching instruction.

The information processing device according to claim 8, wherein the determination means determines whether an area of the image at which the user is gazing is a mirror image, and
the mode selection means selects the second mode when the determination means determines that the area of the image at which the user is gazing is a mirror image.

The information processing device according to claim 8, wherein the determination means determines whether an area of the image at which the user is gazing is a mirror image, and
the mode selection means does not select a mode when the determination means determines that the area of the image at which the user is gazing is not a mirror image.

The information processing device according to claim 1, characterized in that the object information is at least one of a three-dimensional object model and a two-dimensional image.

The information processing device according to claim 1, further comprising:
a display;
a front camera provided on the side on which the screen of the display is disposed; and
a rear camera provided on the rear side with respect to the screen of the display,
wherein the image of the subject acquired by the image acquisition means is an image captured by the rear camera, and
the display control means displays the composite image on the screen of the display.

An information processing method comprising:
an image acquisition step of acquiring an image of a subject;
an object acquisition step of acquiring object information to be superimposed on the image of the subject;
a synthesis step having a first mode for generating a composite image in which the object information is superimposed on the image of the subject without being horizontally inverted, and a second mode for generating a composite image in which horizontally inverted object information is superimposed on the image of the subject;
a receiving step of receiving an instruction to switch to either the first mode or the second mode; and
a display control step of displaying the composite image generated in the synthesis step.

A program for causing a computer to function as an information processing device comprising:
an image acquisition means for acquiring an image of a subject;
an object acquisition means for acquiring object information to be superimposed on the image of the subject;
a synthesis means having a first mode for generating a composite image in which the object information is superimposed on the image of the subject without being horizontally inverted, and a second mode for generating a composite image in which horizontally inverted object information is superimposed on the image of the subject;
a receiving means for receiving an instruction to switch to either the first mode or the second mode; and
a display control means for displaying the composite image generated by the synthesis means.