JP2009246408A

JP2009246408A - Interaction device, image processing module, image processing method, and program

Info

Publication number: JP2009246408A
Application number: JP2008087048A
Authority: JP
Inventors: Shinji Namihira; 真二波平
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2008-03-28
Filing date: 2008-03-28
Publication date: 2009-10-22

Abstract

PROBLEM TO BE SOLVED: To provide an interaction device that can easily achieve a natural interaction environment in which the lines of sight of the speakers meet in interactive communication, and to provide an image processing module, an image processing method, and a program. SOLUTION: The interaction device has a camera 11 as an imaging means, an image processing module 13, a display section 32, an image communication section 33, and the like. The image processing module 13 changes the positions of the iris and the pupil by changing the pixels of the palpebral fissure region (the exposed region of the eyeball) of a person included in input image data, so that the line of sight of the person points to the front of the image data. The image processing module 13 has a face recognition section 14, a line-of-sight determination section 15, and a line-of-sight change section 16. The face recognition section 14 extracts information on a vector (face vector) that defines the direction of the whole face, based on the result of face recognition processing. When it is determined, based on the direction of the face vector, that the face points to a display screen, the line-of-sight change section 16 changes the pixels in the palpebral fissure region so that the line of sight of the person points to the front of the image data. COPYRIGHT: (C)2010,JPO&INPIT

Description

The present invention relates to an interaction device, an image processing module, an image processing method, and a program used for videophones, videoconferencing, and the like.

One problem in interactive communication systems such as videophones and videoconferencing is that the lines of sight of the interlocutors do not meet.

In general, when using an interactive communication system, a user tries to look at the image of the other party displayed on a display device. However, when the normal direction of the display screen of the display device differs from the optical-axis direction of the lens of the camera that captures the user, the image captured by the camera is not a frontal image of the user when the user looks at the image of the other party. In this case, although the user is looking at the image of the other party, the display device on the other side shows an image in which the user appears to be looking away.

A conventional technique for matching the lines of sight of interlocutors in this type of interactive communication system is disclosed in Japanese Unexamined Patent Application Publication No. 2004-326179 (Patent Document 1).

The image processing apparatus disclosed in Patent Document 1 includes approximate image generation means that generates, from a user's head image input from a camera, an approximate image as seen at a predetermined size and from a predetermined direction. When a videoconference is held with a plurality of interlocutors displayed simultaneously, the apparatus can make the head sizes of the interlocutors uniform and can automatically generate images in which the face of each interlocutor appears to face the camera.
Patent Document 1: JP 2004-326179 A

However, the conventional technique generates an image in which the face of each interlocutor faces the camera regardless of the direction in which that interlocutor is actually facing. As a result, a very unnatural image is generated for an interlocutor whose posture clearly shows no intention of facing the camera. In addition, because the conventional technique must change the orientation of the whole face of each interlocutor, it requires complicated image processing, and the image processing takes time.

The present invention has been made in view of the above circumstances, and its object is to provide an interaction device, an image processing module, an image processing method, and a program that can easily realize a natural dialogue environment in which the lines of sight of the interlocutors meet in interactive communication.

To solve the problem described above, an image processing module according to the present invention is an image processing module used in an interactive communication system, comprising: face recognition means for extracting, from input image data including a person's face, at least information on a face vector that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the pixels of the iris of the face; line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of display means; and line-of-sight changing means for changing pixels in the palpebral fissure region so that the person's line of sight faces the front of the image data when the face is determined to be facing the display screen of the display means.

To solve the problem described above, an interaction device according to the present invention comprises: display means; face recognition means for extracting, from input image data including a person's face, at least information on a face vector that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the pixels of the iris of the face; line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of the display means; and line-of-sight changing means for changing pixels in the palpebral fissure region so that the person's line of sight faces the front of the image data when the face is determined to be facing the display screen of the display means, wherein the display means displays the image data, received from the line-of-sight changing means, in which the pixels in the palpebral fissure region have been changed.

To solve the problem described above, an image processing method according to the present invention comprises the steps of: extracting, from input image data including a person's face, at least information on a face vector that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the pixels of the iris of the face; determining, based on the face vector information, whether the face is facing the display screen of display means; and changing pixels in the palpebral fissure region so that the person's line of sight faces the front of the image data when the face is determined to be facing the display screen of the display means.

Further, to solve the problem described above, a program according to the present invention causes a computer to function as: face recognition means for extracting, from input image data including a person's face, at least information on a face vector that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the pixels of the iris of the face; line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of display means; and line-of-sight changing means for changing pixels in the palpebral fissure region so that the person's line of sight faces the front of the image data when the face is determined to be facing the display screen of the display means.

According to the interaction device, image processing module, image processing method, and image processing program of the present invention, a natural dialogue environment in which the lines of sight of the interlocutors meet can be easily realized in interactive communication.

Embodiments of the interaction device, image processing module, image processing method, and program according to the present invention will be described with reference to the accompanying drawings.

FIG. 1 is a schematic overall configuration diagram showing an embodiment of the interaction device 10 according to the present invention. The interaction device 10 can be applied to, for example, a personal computer or a mobile phone.

As shown in FIG. 1, the interaction device 10 includes a camera 11 serving as imaging means, an image data acquisition unit 12, an image processing module 13, a display control unit 31, a display unit 32, and an image communication unit 33.

Using the interaction device 10, the user can hold a videophone call or a videoconference with a conversation partner who uses another interaction device 10 connected to it over a network so that images can be exchanged.

The camera 11 captures an image including the face of the user of the interaction device 10 and generates image data. A camera using, for example, a CMOS image sensor or a CCD image sensor can be used as the camera 11.

The image data acquisition unit 12 inputs to the image processing module 13 the image data of an image including the conversation partner's face acquired from the image communication unit 33 (hereinafter, received image data) and the image data of an image including the user's face acquired from the camera 11 (hereinafter, transmission image data).

The image processing module 13 changes the positions of the iris and the pupil by changing the pixels of the palpebral fissure region (the exposed region of the eyeball) of a person included in the input image data, so that the person's line of sight faces the front of the image data.

Note that a person's line of sight "facing the front of the image data" is synonymous with the person's line of sight "facing the camera 11".

The display control unit 31 generates an image from the image data output by the image processing module 13 and causes the display unit 32 to display it. The image data given to the display control unit 31 includes at least the received image data (image data including the conversation partner's face) output by the image processing module 13.

The following description covers an example in which both the received image data and the transmission image data (image data including the user's face) are given to the display control unit 31. In this case, by looking at the transmission image displayed on the display unit 32, the user can check whether the image has been processed so that the user's line of sight faces the camera 11 while the user is gazing at the display screen of the display unit 32.

The display unit 32 is a general display output device such as a liquid crystal display or a CRT display, and displays various images under the control of the display control unit 31.

The image communication unit 33 implements various information communication protocols according to the form of the network. Following these protocols, the image communication unit 33 connects the interaction device 10 to another interaction device 10. An electrical connection via an electronic network, for example, can be used for this connection. Through this connection, the image communication unit 33 exchanges image data with the other interaction device 10. Here, an electronic network means any information communication network that uses telecommunications technology, and includes LANs (Local Area Networks) and the Internet as well as telephone networks, optical fiber communication networks, cable communication networks, and satellite communication networks.

Next, the image processing module 13 will be described in more detail.

The image processing module 13 includes a face recognition unit 14, a line-of-sight determination unit 15, and a line-of-sight change unit 16.

The face recognition unit 14 performs face recognition processing on the input image data and extracts information on the position of the palpebral fissure and information on the pixels of the palpebral fissure region. Various face recognition methods are known, and any of them can be used. Based on the result of the face recognition processing, the face recognition unit 14 also extracts information on a vector that defines the orientation of the whole face (hereinafter, face vector).

The line-of-sight determination unit 15 determines whether the face of the person included in the image data is facing the display screen of the display means.

The line-of-sight change unit 16 includes an iris/pupil information acquisition unit 17, a model generation unit 18, a model position determination unit 19, and a palpebral fissure region changing unit 20.

The iris/pupil information acquisition unit 17 acquires pixel information on the iris and pupil (including information on the size and color of the iris and pupil) from the face recognition unit 14.

The model generation unit 18 generates an iris model and a pupil model based on the pixel information on the iris and pupil.

Based on the face vector information, the model position determination unit 19 determines where to place the iris model and the pupil model so that the person's line of sight faces the front of the image data.

The palpebral fissure region changing unit 20 changes the pixels in the palpebral fissure region to generate image data in which the user's line of sight faces the front of the image data, and gives this image data to the display control unit 31 and the image communication unit 33.

Next, an example of the operation of the interaction device 10 according to this embodiment (including the operation of the image processing module 13) will be described.

FIG. 2 is a flowchart showing the procedure by which the interaction device 10 shown in FIG. 1 changes, as necessary, the line of sight in the image data including the user's face generated by the camera 11, in order to easily realize a natural dialogue environment in which the lines of sight of the interlocutors meet in interactive communication. In FIG. 2, each reference sign consisting of S followed by a number denotes a step of the flowchart.

The following description covers an example in which the camera 11 is installed at the upper left as viewed from the user.

First, in step S1, the image data acquisition unit 12 receives image data including the user's face from the camera 11 and inputs this image data to the image processing module 13.

Next, in step S2, the face recognition unit 14 performs face recognition processing on the input image data and extracts information on the position of the palpebral fissure in the user's face and information on the pixels of the palpebral fissure region. The pixel information for the palpebral fissure region includes pixel information for the iris, the pupil, and the sclera (the white of the eye). The information for each of these includes its position, size, and color.

Based on the result of the face recognition processing, the face recognition unit 14 also extracts face vector information that defines the orientation of the whole face.

FIG. 3 is a diagram for explaining the face vector extracted from the image data when the user's face and line of sight are directed straight at the display screen of the display unit 32. The left side of FIG. 3 shows the image data generated by the camera 11. When the user's face is facing the camera 11, the user's face faces the front of the image data (the direction from the back of the page toward the viewer).

When the user's face is directed straight at the display screen of the display unit 32, the face vector is parallel to the normal vector of the display screen. In this case, however, as shown in FIG. 3, the face vector is slightly offset, according to the installation position of the camera 11, from the normal vector of the image data (the vector passing from the back of the page toward the viewer), which defines the front of the image data.

FIG. 4 is a diagram for explaining the face vector extracted from the image data when the orientation of the user's face deviates greatly from the state of facing the display screen of the display unit 32 straight on.

As shown in FIG. 4, when the user is not looking at the conversation partner (the display screen of the display means), the face vector is oriented far from the normal vector of the display screen.

If the line-of-sight change process is performed when the user's face is clearly not facing the conversation partner, an unnatural image is generated. It is therefore advisable to determine whether the user's face is facing the display screen of the display means according to whether the orientation of the face vector is within a predetermined range, and to perform the line-of-sight change process only when the orientation of the face vector is within that range.

In step S3, the line-of-sight determination unit 15 determines whether the face of the user included in the image data is facing the display screen of the display means by determining whether the orientation of the face vector is within the predetermined range. If the orientation of the face vector is within the predetermined range, that is, if the user's face is determined to be facing the display screen of the display means, the process proceeds to step S4 to perform the line-of-sight change process. If the orientation of the face vector is outside the predetermined range, that is, if the user's face is determined not to be facing the display screen of the display means, the line-of-sight determination unit 15 gives the image data, which has not undergone the line-of-sight change process, to the display control unit 31 and the image communication unit 33, and the process proceeds to step S8.
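The determination in step S3 can be sketched as follows. This is a minimal illustration only: the text says the face vector must lie within a "predetermined range" but does not define that range, so the cone test around the display normal, the function names, and the 15-degree default are all assumptions made for this example.

```python
import math


def angle_between_deg(u, v):
    """Return the angle in degrees between two non-zero 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("vectors must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def is_facing_display(face_vector, display_normal, max_angle_deg=15.0):
    """Hypothetical step S3 test: the face counts as facing the display
    when the face vector lies within a cone of max_angle_deg around the
    display normal. The threshold value is an assumed placeholder."""
    return angle_between_deg(face_vector, display_normal) <= max_angle_deg
```

With this sketch, a face vector of `(0.1, 0.0, 1.0)` against a display normal of `(0.0, 0.0, 1.0)` lies about 5.7 degrees off the normal and would pass to step S4, whereas a profile view such as `(1.0, 0.0, 0.0)` would skip directly to step S8.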

Next, in step S4, the iris/pupil information acquisition unit 17 acquires pixel information on the iris and pupil (including information on the size and color of the iris and pupil) from the face recognition unit 14.

FIG. 5 is an explanatory diagram showing an example of how the iris model and the pupil model are generated.

In step S5, the model generation unit 18 generates an iris model and a pupil model based on the pixel information on the iris and pupil. As a result, the iris model and the pupil model can be generated to match the colors of the user's iris and pupil, and also to match the sizes of the iris and pupil in the image data.

The iris model and the pupil model are used when performing image processing so that the user's line of sight faces the front of the image data. To an observer, the iris and pupil of a person facing the front appear substantially circular. The model generation unit 18 therefore only needs to generate an iris model and a pupil model that are substantially circular, and does not need to perform complicated shape calculations.
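A minimal sketch of step S5, under stated assumptions: each model is represented as a flat-colored disc, its color is taken as the mean of the sampled pixels, and the radii are assumed to come from the size information supplied by the face recognition unit 14. The `DiscModel` type and the function names are hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiscModel:
    """Hypothetical circular model: a radius in pixels and one RGB color."""
    radius: float
    color: tuple


def mean_color(pixels):
    """Average a non-empty list of (R, G, B) tuples channel by channel."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    n = len(pixels)
    return tuple(round(sum(p[i] for p in pixels) / n) for i in range(3))


def build_models(iris_pixels, pupil_pixels, iris_radius, pupil_radius):
    """Build substantially circular iris and pupil models whose color and
    size follow the pixels observed in the image data (assumed inputs)."""
    iris = DiscModel(iris_radius, mean_color(iris_pixels))
    pupil = DiscModel(pupil_radius, mean_color(pupil_pixels))
    return iris, pupil
```

Using a single mean color keeps the example short; a real implementation could equally keep a texture sampled from the original iris.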

Next, in step S6, the model position determination unit 19 determines where to place the iris model and the pupil model based on the face vector information. More specifically, based on the difference between the orientation of the face vector and the direction of the front of the image data, the model position determination unit 19 determines the placement of the iris model and the pupil model so that the user's line of sight faces the front of the image data.
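The source does not give a formula for this placement. One possible geometric sketch, offered purely as an assumption, treats the eye as a sphere of radius R (in pixels): the middle of the palpebral fissure lies on the sphere in the direction of the face vector f, and an iris looking at the camera lies in the direction of the image normal g = (0, 0, 1), so the iris is shifted from the fissure center by R(g - f) projected onto the image plane. The coordinate convention (x to the right, y downward, z out of the image toward the camera), the function name, and the parameters are all hypothetical.

```python
import math


def model_center(fissure_center, face_vector, eyeball_radius_px):
    """Return the (x, y) image position for the iris and pupil models.

    Assumed spherical-eye model: offset = R * (g - f) projected onto the
    image plane, with g = (0, 0, 1), which reduces to (-R*fx, -R*fy).
    """
    fx, fy, fz = face_vector
    norm = math.sqrt(fx * fx + fy * fy + fz * fz)
    if norm == 0.0:
        raise ValueError("face vector must be non-zero")
    fx, fy = fx / norm, fy / norm  # in-plane components of the unit vector
    cx, cy = fissure_center
    return (cx - eyeball_radius_px * fx, cy - eyeball_radius_px * fy)
```

When the face vector already coincides with the image normal, the offset is zero and the models sit at the center of the palpebral fissure; the larger the angle between the two directions, the further the models shift against the in-plane direction of the face.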

FIG. 6 is an explanatory diagram showing an example of how the pixels in the palpebral fissure region of the image data are changed. FIG. 7 is a diagram for explaining an example of image data before and after the line-of-sight change process.

In step S7, the palpebral fissure region changing unit 20 changes the pixels in the palpebral fissure region so that the iris model and the pupil model are placed at the position determined by the model position determination unit 19, thereby generating image data in which the user's line of sight faces the front of the image data (see FIG. 7). The palpebral fissure region changing unit 20 then gives the image data that has undergone the line-of-sight change process to the display control unit 31 and the image communication unit 33.

For example, as shown in FIG. 6, image data in which the user's line of sight faces the front of the image data can be generated by first filling the pixels in the palpebral fissure region with the sclera color acquired from the face recognition unit 14, then placing the substantially circular iris model and pupil model in the foreground, and finally trimming the iris model and the pupil model to fit the palpebral fissure.
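The fill, overlay, and trim procedure can be sketched as below. This is an illustration under assumptions, not the implementation described in the source: the image is a nested list of RGB tuples, the palpebral fissure is given as a boolean mask of the same shape, and each model is a `(radius, color)` pair. The three steps are folded into a single pass over the pixels, which yields the same result as performing them in sequence.

```python
def redraw_fissure(image, fissure_mask, sclera_color, center, iris, pupil):
    """Return a copy of image with the palpebral fissure region redrawn.

    image        : list of rows, each a list of (R, G, B) tuples
    fissure_mask : list of rows of bool, True inside the fissure
    center       : (x, y) position chosen for the iris and pupil models
    iris, pupil  : (radius, color) pairs for the circular models
    """
    cx, cy = center
    out = [row[:] for row in image]  # leave the input image untouched
    for y, mask_row in enumerate(fissure_mask):
        for x, inside in enumerate(mask_row):
            if not inside:
                continue  # trimming: pixels outside the fissure are kept
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            if d2 <= pupil[0] ** 2:
                out[y][x] = pupil[1]      # pupil model in the foreground
            elif d2 <= iris[0] ** 2:
                out[y][x] = iris[1]       # iris model behind the pupil
            else:
                out[y][x] = sclera_color  # remaining area filled as sclera
    return out
```

Because only pixels inside the mask are written, the eyelids and surrounding skin are left exactly as captured, which is what keeps the processing limited to the palpebral fissure region.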

Next, in step S8, the display control unit 31 generates an image from the image data (either the image data received from the line-of-sight determination unit 15, which has not undergone the line-of-sight change process, or the image data received from the palpebral fissure region changing unit 20, which has) and causes the display unit 32 to display it. The image communication unit 33 transmits the same image data (unprocessed data from the line-of-sight determination unit 15 or processed data from the palpebral fissure region changing unit 20) to the other interaction device 10.

Through the above procedure, the line of sight in the image data including the user's face generated by the camera 11 can be changed as necessary, and a natural dialogue environment in which the lines of sight of the interlocutors meet can be easily realized in interactive communication.

If the conversation partner's device does not include the image processing module 13 of the interaction device 10 according to this embodiment, so that the received image data received by the image communication unit 33 has not undergone the line-of-sight change process, the line-of-sight change process can be applied to the received image data by performing the procedure shown in FIG. 2 on it.

In this case, for example, in step S1 the image data acquisition unit 12 receives the image data including the conversation partner's face from the image communication unit 33 and inputs it to the image processing module 13. In step S7, the palpebral fissure region changing unit 20 needs to give the processed received image data only to the display control unit 31, and does not need to give it to the image communication unit 33.

Of course, if the conversation partner's device includes the image processing module 13 of the interaction device 10 according to this embodiment, the image data acquisition unit 12 may give the received image data directly to the display control unit 31.

本実施形態に係る対話装置１０は、顔ベクトルの情報を取得することができる。このため、この顔ベクトルの向きにもとづいて画像データに含まれる人物が対話相手を見ているか（表示画面を見ているか）どうかを判定することができる。したがって、本実施形態に係る対話装置１０によれば、画像データに含まれる人物が対話相手を見ている場合にのみ視線の変更処理を行うことができ、対話型通信において対話者同士の視線が一致した自然な対話環境を容易に実現することができる。 The dialogue apparatus 10 according to the present embodiment can acquire face vector information. For this reason, it is possible to determine whether the person included in the image data is looking at the conversation partner (looking at the display screen) based on the orientation of the face vector. Therefore, according to the dialogue apparatus 10 according to the present embodiment, it is possible to perform the line-of-sight change process only when a person included in the image data is looking at the conversation partner, and the lines of sight between the dialogue persons in interactive communication can be obtained. A consistent natural dialogue environment can be easily realized.

また、本実施形態に係る対話装置１０は、特別な物理デバイスを使用したり、複数画像の合成を行ったりすることなく、カメラ１１は１台のみを用い、画像処理は眼瞼裂領域の画素のみの変更でよい。このため、極めて簡便かつ高速に視線の変更処理を行うことができる。 In addition, the interactive apparatus 10 according to the present embodiment uses only one camera 11 without using a special physical device or synthesizing a plurality of images, and image processing is performed only for pixels in the eyelid region. You can change it. For this reason, the line-of-sight changing process can be performed very simply and at high speed.

Note that the present invention is not limited to the above embodiment as it stands; at the implementation stage, it can be embodied with the constituent elements modified without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiment. For example, some constituent elements may be deleted from the full set of constituent elements shown in the embodiment.

For example, part or all of the configuration of the image processing module 13 described above can also be implemented by having a computer execute a program. In this case, the image processing program is installed into the program memory of the computer from a CD-ROM, a magnetic disk, or the like, or is downloaded to the program memory via a communication line, and the computer then executes the image processing program stored in the program memory.

Further, in the embodiment of the present invention, the steps of the flowchart have been shown as an example of processing performed in time series in the order described. However, the steps need not be processed in time series; they also include processing that is executed in parallel or individually.

A schematic overall configuration diagram showing an embodiment of the dialogue apparatus according to the present invention.
A flowchart showing the procedure by which the dialogue apparatus shown in FIG. 1 changes, as necessary, the line of sight in image data that is generated by the camera and includes the user's face, in order to easily realize a natural dialogue environment in which the lines of sight of the interlocutors coincide in interactive communication.
A diagram for explaining the face vector extracted from the image data when the orientation of the user's face and the user's line of sight point straight at the display screen of the display unit.
A diagram for explaining the face vector extracted from the image data when the orientation of the user's face deviates greatly from the state of pointing straight at the display screen of the display unit.
An explanatory diagram showing an example of how the iris model and the pupil model are generated.
An explanatory diagram showing an example of how pixels in the palpebral fissure region of the image data are changed.
A diagram for explaining an example of image data before and after the line-of-sight change process.

Explanation of symbols

10 Dialogue apparatus
11 Camera
12 Image data acquisition unit
13 Image processing module
14 Face recognition unit
15 Line-of-sight determination unit
16 Line-of-sight changing unit
17 Dark-eye (iris and pupil) information acquisition unit
18 Model generation unit
19 Model position determination unit
20 Palpebral fissure region changing unit
31 Display control unit
32 Display unit
33 Image communication unit

Claims

An image processing module used in an interactive communication system, comprising:
face recognition means for extracting, from input image data including a person's face, at least face vector information that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the iris pixels of the face;
line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of display means; and
line-of-sight changing means for changing, when the face is determined to be facing the display screen of the display means, pixels in the palpebral fissure region so that the line of sight of the person faces the front of the image data.

The image processing module according to claim 1, wherein
the iris pixel information includes at least information on the position of the iris in the face, information on the size of the iris, and information on the color of the iris, and
the line-of-sight changing means is configured to generate a substantially circular iris model based on the iris pixel information, to determine, based on the face vector information, an arrangement position of the iris model such that the line of sight of the person faces the front of the image data, and to change the pixels in the palpebral fissure region so that the iris model is arranged at the arrangement position.

A dialogue apparatus comprising:
display means;
face recognition means for extracting, from input image data including a person's face, at least face vector information that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the iris pixels of the face;
line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of the display means; and
line-of-sight changing means for changing, when the face is determined to be facing the display screen of the display means, pixels in the palpebral fissure region so that the line of sight of the person faces the front of the image data,
wherein the display means displays the image data, received from the line-of-sight changing means, in which the pixels in the palpebral fissure region have been changed.

The dialogue apparatus according to claim 3, wherein
the iris pixel information includes at least information on the position of the iris in the face, information on the size of the iris, and information on the color of the iris, and
the line-of-sight changing means is configured to generate a substantially circular iris model based on the iris pixel information, to determine, based on the face vector information, an arrangement position of the iris model such that the line of sight of the person faces the front of the image data, and to change the pixels in the palpebral fissure region so that the iris model is arranged at the arrangement position.

The dialogue apparatus according to claim 3 or 4, further comprising
image communication means configured to transmit and receive image data to and from another dialogue apparatus,
wherein the face recognition means receives as input the image data received from the other dialogue apparatus via the image communication means.

The dialogue apparatus according to claim 5, wherein
the display means further displays the image data that was received from the other dialogue apparatus and in which the pixels in the palpebral fissure region have been changed, as received from the line-of-sight changing means.

The dialogue apparatus according to any one of claims 3 to 6, further comprising
imaging means for capturing an image including the person's face and generating image data.

An image processing method comprising:
extracting, from input image data including a person's face, at least face vector information that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the iris pixels of the face;
determining, based on the face vector information, whether the face is facing the display screen of display means; and
changing, when the face is determined to be facing the display screen of the display means, pixels in the palpebral fissure region so that the line of sight of the person faces the front of the image data.

The image processing method according to claim 8, wherein
the iris pixel information includes at least information on the position of the iris in the face, information on the size of the iris, and information on the color of the iris, and
the step of changing the pixels in the palpebral fissure region comprises:
generating a substantially circular iris model based on the iris pixel information;
determining, based on the face vector information, an arrangement position of the iris model such that the line of sight of the person faces the front of the image data; and
changing the pixels in the palpebral fissure region so that the iris model is arranged at the arrangement position.

A program for causing a computer to function as:
face recognition means for extracting, from input image data including a person's face, at least face vector information that defines the orientation of the face, information on the position of the palpebral fissure in the face, and information on the iris pixels of the face;
line-of-sight determination means for determining, based on the face vector information, whether the face is facing the display screen of display means; and
line-of-sight changing means for changing, when the face is determined to be facing the display screen of the display means, pixels in the palpebral fissure region so that the line of sight of the person faces the front of the image data.

The program according to claim 10, wherein
the iris pixel information includes at least information on the position of the iris in the face, information on the size of the iris, and information on the color of the iris, and
the line-of-sight changing means is configured to generate a substantially circular iris model based on the iris pixel information, to determine, based on the face vector information, an arrangement position of the iris model such that the line of sight of the person faces the front of the image data, and to change the pixels in the palpebral fissure region so that the iris model is arranged at the arrangement position.