JP2008135923A

JP2008135923A - Production method of videos interacting in real time, video production device, and video production system

Info

Publication number: JP2008135923A
Application number: JP2006319879A
Authority: JP
Inventors: Denko O; 傳宏王; Tsai-Yen Li; 蔡彦李; Wen-Hung Liao; 文宏廖
Original assignee: TAIWAN MUKOJO KAGI KOFUN YUGEN; TAIWAN MUKOJO KAGI KOFUN YUGENKOSHI
Current assignee: TAIWAN MUKOJO KAGI KOFUN YUGEN; TAIWAN MUKOJO KAGI KOFUN YUGENKOSHI
Priority date: 2006-11-28
Filing date: 2006-11-28
Publication date: 2008-06-12

Abstract

PROBLEM TO BE SOLVED: To provide an inexpensive and simple method and system for producing video that interacts in real time.
SOLUTION: A production method, a production device, and a production system for video that interacts in real time are disclosed. The video production system includes a display device provided with a screen 500; a computer provided with at least one processor, a memory, and a program; and a photographing device. The program provides media content and special effect command scripts. When the photographing device captures an on-site personnel video 401, and the video is matched with the special effect command scripts and played back, the media content is displayed on the screen 500 in real time and interacts with the on-site personnel video 401 on the screen 500.
COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a method, an apparatus, and a system for producing dynamic video, and more particularly to a method, an apparatus, and a system for producing video that interacts in real time.

As imaging devices such as digital cameras, Internet cameras, and camera-equipped mobile phones have become inexpensive and widespread, the convergence of home computers and consumer electronics has become an unstoppable trend. However, most current video multimedia applications deal mainly with still images. They usually focus on capturing, storing, and managing photographs, and offer only basic image processing and simple image composition functions. For dynamic video, applications for simple recording, file conversion, and playback are the mainstream. These are sometimes combined with a network to transmit video in real time, but they fall short in many respects when it comes to adding creative value to multimedia content or modifying it. Some game software also attempts to integrate the user's limb movements into interactive games, but the limited level of motion pattern understanding places serious constraints on the design of game scenarios, which in turn limits the variety of game content.

In addition, the special effects commonly seen in television content not only require costly hardware and software for production but also call for expert knowledge, so they form a specialized field with a high barrier to entry. Moreover, actors have to perform while imagining a counterpart who is not actually present, which is a heavy burden and very inconvenient for production.

In view of the complexity of conventional digital content production described above, the inventors of the present invention have devised a video production method, a video production apparatus, and a video production system that interact in real time. These provide a simple and natural man-machine interface and allow creators to produce rich digital content at low cost.
JP-A-10-13738

An object of the present invention is to provide an inexpensive and simple video production method, video production apparatus, and video production system that interact in real time.

To solve the above problem, the present invention provides a simple and natural man-machine interface with which a creator can produce rich digital content at low cost.

The present invention provides a video production method, a video production apparatus, and a video production system that interact in real time. The system involves a display device having a screen, on-site personnel, a computing device having at least one processor, a memory, and a program, and a photographing device. The program provides media content and special effect command scripts. The photographing device captures the on-site personnel video, which is matched with the special effect command scripts. Finally, the media content, the matched on-site personnel video, and the special effect command scripts are combined and displayed on the screen in real time.

That is, the invention of claim 1 is a real-time interactive video production method comprising the steps of: preparing a screen; capturing video in real time and displaying it on the screen; generating an image of a virtual object and displaying it on the screen; and causing the image of the object and the video to interact.

The invention of claim 2 is the real-time interactive video production method according to claim 1, wherein the video is captured by using an Internet camera to capture the scene in front of the Internet camera.
The invention of claim 3 is the real-time interactive video production method according to claim 1, wherein the step of generating the image of the object further includes a step of generating the image of the corresponding object in a preselected mode.
The invention of claim 4 is the real-time interactive video production method according to claim 1, wherein the interacting step includes a step of recognizing the position of the video.
The invention of claim 5 is the real-time interactive video production method according to claim 1, wherein the interacting step includes a step of tracking changes in the video.
The invention of claim 6 is the real-time interactive video production method according to claim 1, wherein the interacting step includes a step of generating a special effect script on the video.
The invention of claim 7 is the real-time interactive video production method according to claim 6, wherein the special effect script is selected from a collection of special effect commands written in a script language.
The invention of claim 8 is the real-time interactive video production method according to claim 1, wherein the object is selected from media content.
The invention of claim 9 is the real-time interactive video production method according to claim 1, wherein the step of preparing the screen includes a step in which a video provider is present in front of a photographing device and the photographing device and the screen are electrically connected.
The invention of claim 10 is the real-time interactive video production method according to claim 1, wherein the step of generating the image of the object includes a step in which the video is subjected to feature tracking of the video.
The invention of claim 11 is the real-time interactive video production method according to claim 1, wherein the step of generating the image of the object includes a step in which the video is subjected to posture analysis and recognition.

The invention of claim 12 is a real-time interactive video production apparatus that stores a plurality of programs readable by a media processing device, wherein the media processing device, based on the plurality of programs, executes the steps of: inputting data including background data and real-time video; recognizing the data; tracking changed portions of the data; preparing media content; combining the media content with the data; and displaying the combination of the media content and the data.

The invention of claim 13 is the real-time interactive video production apparatus according to claim 12, which stores a plurality of programs readable by the media processing device, wherein the step of preparing the media content further includes a step of reading the media content and a step of decoding the media content.
The invention of claim 14 is the real-time interactive video production apparatus according to claim 12, which stores a plurality of programs readable by the media processing device, wherein the step of combining the media content with the data further includes a step of re-tracking the changed portions of the data.
The invention of claim 15 is the real-time interactive video production apparatus according to claim 12, which stores a plurality of programs readable by the media processing device, further including a step of reading a special effect, a step of reprocessing the combination of the media content, the data, and the special effect, and a step of displaying the combination of the media content, the data, and the special effect.
The invention of claim 16 is the real-time interactive video production apparatus according to claim 15, which stores a plurality of programs readable by the media processing device, wherein the step of reading the special effect further includes a step of embedding the special effect in the background data.

The invention of claim 17 is a real-time interactive video production system comprising: a display device having a screen; a computing device having at least one processor, a memory, and a plurality of readable programs that include media content and special effect command scripts; and a photographing device that receives video, wherein, through processing of the special effect command scripts and composition with the media content, the media content and the video are displayed on the screen while interacting in real time.

The invention of claim 18 is the real-time interactive video production system according to claim 17, wherein the display device is a liquid crystal monitor.
The invention of claim 19 is the real-time interactive video production system according to claim 17, wherein the computing device is a computer.

The video production method, video production apparatus, and video production system of the present invention use the concept of an interactive special effect track: in addition to the video track and audio track of the raw video, special effects can be added in real time. Unlike ordinary video special effects, the special effects defined in the present invention can be generated in real time, the target to which they are applied does not have to be selected in advance, and they can change interactively.

A detailed description of the present invention is given below with reference to the drawings. In describing the embodiments, the drawings are not drawn to a uniform scale; parts are locally enlarged where that is convenient for explanation, but this does not limit the present invention.

The present invention is a real-time interactive video production method, production apparatus (a storage apparatus for video that interacts in real time), and production system. It includes a display device having a screen, a computing device having at least one processor, a memory, and a program, and a photographing device. The program in the computing device provides media content and special effect command scripts. When the photographing device captures on-site video and the video is matched with the special effect command scripts and played back, the media content is displayed on the screen in real time. The media content includes a virtual person, which can interact in real time with the on-site video on the screen.

In one embodiment of the present invention shown in FIG. 1, the following are prepared: a device having a processor and memory, such as a personal computer, set-top box, game console, or mobile phone; a display device such as a cathode ray tube display, liquid crystal display, or plasma display; and a photographing device. This embodiment uses a computer main unit 100, a liquid crystal display 101, and an Internet camera 102 that can transfer captured images over a communication line such as the Internet. Note that in this embodiment the computer main unit 100, the liquid crystal display 101, and the Internet camera 102 are connected to one another by wire or wirelessly, but the configuration is of course not limited to this. A photographing device can also be combined with a device in which the main unit and display are integrated, such as a notebook computer or tablet computer.

Next, regarding live recording, as shown in FIG. 1, the Internet camera 102 captures the on-site personnel 104 live. The Internet camera 102 captures video of the on-site personnel 104 and displays it on the screen 103 of the liquid crystal display 101, where it appears as the on-site personnel video 105. The on-site personnel video 105 shows, in real time, the on-site personnel 104 who are in front of the Internet camera 102 at that moment. In one selection mode of one embodiment, a virtual person 106 and the on-site personnel video 105 interact with each other. Note that the on-site personnel 104 are displayed on the screen 103 in real time as the on-site personnel video 105; "real time" here means that the movements of the on-site personnel 104 and the on-site personnel video 105 are synchronized. The scene in which the on-site personnel 104 appear and the manner in which the virtual person 106 interacts with the on-site personnel video 105 are not fixed in advance; the user can select them through a menu or a similar interface. The selection mode can be implemented as an application program, which is stored in, for example, the memory of the computer main unit 100.

FIG. 2 is a diagram showing a file architecture according to one embodiment of the media processing apparatus of the present invention. A preselected mode consists of main content and a special effect script file. In one embodiment, media content 201 and a scenario are set first, producing multimedia video content such as popular music, old songs, or classical music. Next, a special effect command script 202 that defines the corresponding interaction effects is designed. It contains basic data such as time parameters, relative space parameters, the type of special effect, and the target to which the special effect is applied; it is written in a specific language and saved as a command file. The user can design different themes according to factors such as gender and age and combine them with different special effects. In other words, a single piece of main content can carry multiple special effect commands. For example, when popular music is played, a virtual person can be loaded from the corresponding special effect script and the data integrated at playback time.

In use, the user first downloads the media content 201 and the special effect command script 202. Next, on-site personnel video capturing 203 is performed in combination with the real-time video capture of the photographing device. After the on-site personnel video 105 has been captured as shown in FIG. 1, it is serially integrated with the special effect command script 202. Finally, dynamic video composition 204 combines the serially integrated real-time video and special effect command script 202 with the media content 201, and the on-site personnel are displayed in the virtual world.
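The description lists the fields of a special effect command script (time parameters, relative space parameters, effect type, application target) but does not give its syntax. The following is a minimal sketch under assumed conventions: the one-command-per-line text format, the `EffectCommand` class, and the function names are all hypothetical and only illustrate how such a command file could be parsed and queried at playback time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectCommand:
    start: float    # time parameter: seconds from the start of the media content
    end: float      # time parameter: when the effect stops
    effect: str     # type of special effect, e.g. "blush"
    target: str     # application target, e.g. "cheek"
    offset: tuple   # relative space parameter: (dx, dy) relative to the target


def parse_script(text):
    """Parse the hypothetical line format: start end effect target dx dy."""
    commands = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        start, end, effect, target, dx, dy = line.split()
        commands.append(
            EffectCommand(float(start), float(end), effect, target,
                          (float(dx), float(dy))))
    return commands


def active_commands(commands, t):
    """Return the commands whose time window contains playback time t."""
    return [c for c in commands if c.start <= t < c.end]


SCRIPT = """
# start end effect target dx dy   (assumed format, not from the source)
2.0 5.0 blush cheek 0.0 0.0
4.0 9.0 ears  head  0.0 -0.5
"""

commands = parse_script(SCRIPT)
print([c.effect for c in active_commands(commands, 4.5)])
```

Because several commands can overlap in time, one piece of main content can carry multiple special effect commands, as the description states.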

FIGS. 3 and 4 are schematic diagrams showing the on-site personnel captured by the media processing apparatus combined with the virtual world and played back. The display device shows a single screen: a photographing device (not shown) captures the on-site personnel and displays them in real time on the screen 400 of the display device, so that the screen 400 contains the on-site personnel video 401. When the readable program of this embodiment is executed, the preselected mode can generate a virtual person 402 such as a human, a deity, an animated character, or a monster.

At this point the virtual person 402 can interact with the on-site personnel video 401 and be displayed on the screen 400 in real time. As shown in FIG. 4, the virtual person 402 can have many actions and special effects, and the on-site personnel video 401 can also make small movements such as moving left and right. For example, the virtual person 402 can sit on the shoulder of the on-site personnel video 401 or kiss its cheek. In response to the action of the virtual person 402, a blush effect 501 or a delight effect 502 can then be generated on the on-site personnel video 401. As another example, the virtual person 402 can cast a spell on the on-site personnel video 401. In response, ears 503 appear on the head of the on-site personnel video 401, and when the head sways slightly, the ears 503 sway with it. In other words, the virtual person 402, the on-site personnel video 401, and the various effects interact in real time: wherever the on-site personnel video 401 moves on the screen 500, the ears 503 always stay on top of its head.

Technically, recognition is used first to locate the hair of the on-site personnel video 401, tracking is then used to follow the movement of the head, and the special effect, the ears 503, is then added on top of the hair. Repeating this recognition and tracking produces a real-time interaction effect between the person and the virtual object on the screen.
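The recognize, track, and overlay cycle can be sketched as follows. This is an illustration under simplifying assumptions, not the method of the embodiment: each frame is a small binary mask in which 1 marks head pixels, recognition is a full-frame bounding-box scan, and tracking is the same scan restricted to a window around the previous box. All function names are hypothetical.

```python
def recognize_head(frame):
    """Full-frame recognition: bounding box (top, left, bottom, right) of 1-pixels."""
    rows = [r for r, row in enumerate(frame) if any(row)]
    cols = [c for row in frame for c, v in enumerate(row) if v]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def track_head(frame, prev_box, margin=2):
    """Tracking: search only a window around the previous box."""
    top, left, bottom, right = prev_box
    r0, r1 = max(0, top - margin), min(len(frame), bottom + margin + 1)
    c0, c1 = max(0, left - margin), min(len(frame[0]), right + margin + 1)
    window = [row[c0:c1] for row in frame[r0:r1]]
    box = recognize_head(window)
    if box is None:                      # target lost: fall back to recognition
        return recognize_head(frame)
    t, l, b, r = box
    return (t + r0, l + c0, b + r0, r + c0)


def ear_anchor(box):
    """Place the effect one row above the head, horizontally centered."""
    top, left, bottom, right = box
    return (top - 1, (left + right) // 2)


def make_frame(height, width, top, left, size=2):
    """Synthetic test frame with a size x size 'head' at (top, left)."""
    frame = [[0] * width for _ in range(height)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            frame[r][c] = 1
    return frame


frames = [make_frame(6, 8, 2, 1), make_frame(6, 8, 2, 3), make_frame(6, 8, 3, 5)]
box, anchors = None, []
for frame in frames:
    box = recognize_head(frame) if box is None else track_head(frame, box)
    anchors.append(ear_anchor(box))
print(anchors)
```

The anchor is recomputed from the tracked box on every frame, which is why the overlaid effect stays on the head as it moves.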

For the on-site personnel, the on-site personnel video can be divided into a half-body mode and a whole-body mode. In half-body mode the head and shoulders of the on-site personnel are displayed on the screen; in whole-body mode the body portion shown on the screen covers seven tenths of the whole body. Note that in building interactive digital content, real-time performance and accuracy are goals that are difficult to achieve at the same time, but in the present invention suitable processing and adjustment can be carried out according to the type of application. For example, when applied to dynamic close-up video of the face, detection and accurate localization of facial features are the main consideration. In motion mode, estimation of simple parameters of global motion is the main consideration. In whole-body mode, the interaction module focuses on tracking regional motion and recognizing configuration.

As for how the interaction between the virtual image and the on-site personnel is realized, feature detection, feature tracking, posture analysis, posture recognition, and the like are used to analyze the virtual image and the movements of the on-site personnel. Feature detection captures features at a low level (feature points) or a high level (facial features such as the eyes and mouth), depending on the nature of the target. Feature matching methods are divided into implicit rules and explicit rules. An explicit feature matching rule seeks a one-to-one correspondence between features. An implicit feature matching rule expresses the relationship between features in successive frames through parameters, transformations, or similar means. For example, explicit rules with low-level features correspond to feature point matching (limb tracking); explicit rules with high-level features correspond to facial expression analysis; implicit rules with low-level features correspond to dense optical flow; and implicit rules with high-level features correspond to facial organ detection and localization.
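The difference between explicit and implicit matching rules can be illustrated with low-level feature points. The sketch below is an assumption-laden illustration rather than the algorithm of the embodiment: the explicit rule is a greedy one-to-one nearest-neighbour assignment, and the implicit rule summarizes the frame-to-frame relationship as a single translation parameter estimated from centroids, without establishing any correspondence. Function names are hypothetical.

```python
def match_explicit(prev_pts, next_pts):
    """Explicit rule: greedy one-to-one nearest-neighbour correspondence."""
    candidates = sorted(
        ((ax - bx) ** 2 + (ay - by) ** 2, i, j)
        for i, (ax, ay) in enumerate(prev_pts)
        for j, (bx, by) in enumerate(next_pts)
    )
    used_prev, used_next, matches = set(), set(), {}
    for _, i, j in candidates:
        if i not in used_prev and j not in used_next:
            matches[i] = j               # prev point i corresponds to next point j
            used_prev.add(i)
            used_next.add(j)
    return matches


def match_implicit(prev_pts, next_pts):
    """Implicit rule: describe the change by a parameter (a global translation)."""
    n_prev, n_next = len(prev_pts), len(next_pts)
    dx = sum(x for x, _ in next_pts) / n_next - sum(x for x, _ in prev_pts) / n_prev
    dy = sum(y for _, y in next_pts) / n_next - sum(y for _, y in prev_pts) / n_prev
    return (dx, dy)


prev_pts = [(0, 0), (9, 0), (0, 9)]
next_pts = [(12, 3), (3, 3), (3, 12)]    # same points shifted by (3, 3), reordered

print(match_explicit(prev_pts, next_pts))
print(match_implicit(prev_pts, next_pts))
```

The explicit result says which point went where, while the implicit result only says how the set as a whole moved; the second is cheaper and tolerates missing points, which suits dense, low-level data.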

For feature detection, the following method and apparatus are used to achieve efficient and accurate face detection and organ localization. FIG. 5 is a sequence of diagrams showing the initial selection based on horizontal edge density. In the initial detection, the positions of the eyes and mouth are first estimated from the strength of the horizontal edge density in the grayscale image. The candidate areas 601 are the selected eye and mouth positions. Next, the many candidate areas 601 are filtered further using the relative positions and proportions of the organs. Finally, the positions are confirmed by eyeball detection. In one embodiment, skin color can also be used as supporting evidence. In one embodiment, organs such as the nose, eyebrows, and ears are localized by estimating their positions from proportional relationships. The contour of the face is represented by an ellipse equation. In one embodiment, in whole-body mode, skin color and hair feature detectors allow rapid detection, while the other parts of the human body are described by lower-level but grouped feature points.
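As a rough illustration of the initial selection step, the sketch below computes a per-row horizontal edge density on a grayscale image and keeps the rows whose density exceeds a threshold. It is a one-dimensional simplification under stated assumptions: the source works with two-dimensional candidate areas and does not specify the edge operator or the threshold, so the vertical-difference measure, the threshold value, and the function names are hypothetical.

```python
def horizontal_edge_density(gray):
    """Mean absolute vertical intensity change around each interior row.

    Horizontal edges (such as the eye line or the mouth line) show up as
    strong intensity changes between vertically adjacent pixels.
    """
    height, width = len(gray), len(gray[0])
    density = [0.0] * height
    for r in range(1, height - 1):
        total = 0
        for c in range(width):
            total += abs(gray[r][c] - gray[r - 1][c])
            total += abs(gray[r + 1][c] - gray[r][c])
        density[r] = total / width
    return density


def candidate_rows(density, threshold):
    """Rows whose edge density is at least the (assumed) threshold."""
    return [r for r, d in enumerate(density) if d >= threshold]


# Synthetic 9 x 4 "face": bright skin with a dark eye row and a dark mouth row.
row_values = [200, 200, 50, 200, 200, 200, 60, 200, 200]
gray = [[v] * 4 for v in row_values]

density = horizontal_edge_density(gray)
print(candidate_rows(density, threshold=200))
```

In a fuller pipeline these candidates would then be filtered by the relative positions and proportions of the organs, as described above.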

Regarding feature tracking, one embodiment of feature tracking in half-body operation mode focuses on continuous localization of the facial organs and estimation of the motion parameters of the overall region. In one embodiment of whole-body operation mode, features are compared and tracked by a grouped graph matching method, and the number of feature points is adjusted according to changes in the available computing resources. Note that in this embodiment the video capturing device does not track the face of the on-site personnel; instead, the on-site personnel must position their face to suit the shooting position of the video capturing device. This removes the need to consider pose estimation.

Regarding posture analysis and recognition, one embodiment for determining the shape of an object in a static state uses shape matching, with related techniques such as Shape Context. The algorithm can also be an Elastic Matching algorithm, further combined with a multi-resolution approach to tolerate small degrees of deformation and occlusion. One embodiment for analyzing and recognizing continuous motion uses a pyramidal optical flow tracking method (Pyramidal Optical Flow) to first compute the direction and speed of movement of the human body. A time-series method, for example a hidden Markov model or a recurrent neural network, is then used to analyze what the motion means.
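The second stage, interpreting a sequence of motion directions with a hidden Markov model, can be sketched as below. This is a toy illustration under explicit assumptions: the optical flow stage is taken as given and reduced to one mean horizontal displacement per frame, and the two gesture models ("wave" and "walk_right") with their probabilities are invented for the example. The forward algorithm itself is the standard one.

```python
def quantize(dx, min_speed=1.0):
    """Turn a horizontal displacement into a symbol: L(eft), R(ight), S(till)."""
    if abs(dx) < min_speed:
        return "S"
    return "R" if dx > 0 else "L"


def forward_likelihood(obs, start, trans, emit):
    """Standard HMM forward algorithm: probability of obs under the model."""
    states = list(start)
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        alpha = {
            s: emit[s][symbol] * sum(alpha[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(alpha.values())


# Hypothetical gesture models (all numbers are assumptions for illustration).
MODELS = {
    "wave": dict(
        start={"left": 0.5, "right": 0.5},
        trans={"left": {"left": 0.2, "right": 0.8},
               "right": {"left": 0.8, "right": 0.2}},
        emit={"left": {"L": 0.8, "R": 0.1, "S": 0.1},
              "right": {"L": 0.1, "R": 0.8, "S": 0.1}},
    ),
    "walk_right": dict(
        start={"go": 1.0},
        trans={"go": {"go": 1.0}},
        emit={"go": {"L": 0.1, "R": 0.8, "S": 0.1}},
    ),
}


def classify(displacements):
    """Pick the gesture model with the highest likelihood for the sequence."""
    obs = [quantize(dx) for dx in displacements]
    return max(MODELS, key=lambda name: forward_likelihood(obs, **MODELS[name]))


print(classify([-3.0, 4.0, -5.0, 3.0]))
print(classify([3.0, 4.0, 2.0, 5.0]))
```

A recurrent neural network, the other option named in the description, would replace `forward_likelihood` with a learned sequence classifier while keeping the same input.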

FIG. 6 is a flowchart showing one embodiment of the operation of the media processing apparatus, its system, and its software according to the present invention. Application program trigger 701, hardware detection 751, warning message 731, application program termination 704, and prompt message 732 are used in the step in which the program verifies its hardware requirements. When hardware detection 751 finds a problem, warning message 731 is generated; otherwise prompt message 732 is generated. Warning message 731 informs the user that hardware required at detection time is not installed or cannot operate, for example that the camera lens is not installed or is installed incompletely. Prompt message 732 informs the user that the subject has moved away from the lens, which prepares for the next capturing step. Preprocessing follows: background data collection 706 is performed and the result is saved as internally stored background data 707. Prompt message 733 is then generated, whose purpose is to bring the user into the shootable range. For example, a welcome screen invites the user to enter the shootable range so that the user's image appears on the display screen.
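The startup and preprocessing branch of the flowchart can be summarized in a few lines. The sketch below is an interpretation rather than the patented implementation: it assumes that application termination 704 follows warning message 731, the message texts are paraphrases, and `start_application` and `grab_frame` are hypothetical names.

```python
def start_application(camera_ok, grab_frame):
    """Sketch of steps 701 -> 751 -> (731, 704) or (732 -> 706/707 -> 733).

    camera_ok  -- result of hardware detection 751 (True if no problem found)
    grab_frame -- callable returning one frame, used for background collection
    Returns (log, background); background is None if startup was aborted.
    """
    log = []
    if not camera_ok:
        log.append("731: warning, required hardware missing or not operable")
        log.append("704: application terminated")   # assumed to follow 731
        return log, None
    log.append("732: subject is out of the camera view")
    background = grab_frame()                       # 706: collect background data
    log.append("707: background data stored")
    log.append("733: please step into the shootable range")
    return log, background


log, background = start_application(True, lambda: [[10, 10], [10, 10]])
for message in log:
    print(message)
```

Capturing the background while the subject is out of view gives the later recognition and tracking steps a reference against which changes can be detected.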

Recognition 709 can recognize the face and the whole of the limbs. Motion tracking 710 can detect the motion of the face and the whole of the limbs. Media data 761 can include the AVI or MPEG formats, which are types of extension files. In one embodiment, the media data 761 can be a compressed file such as a DLL file. Next, media data reading 711 and media data decoding 713 are performed. Recognition 709, motion tracking 710, and the internally stored background data 707 are combined with the following steps to generate a dynamic composite video.

After the media processing device performs compositing 714 of the camera video with the media data (the image of a virtual object) and motion re-tracking 715, display of the composite media data 716 takes place. Motion re-tracking 715 detects changes in the background and the video once more. Next, decision 752 determines whether to load a special effect; if so, the flow proceeds to loading of the special effect insertion 718. In one embodiment, the special effect loaded in step 718 can be of the class CEffect. The flow then proceeds to decision 753, whether to save the composite media data; if so, saving of the composite media data 720 is performed. The flow then proceeds to decision 754, whether the time has expired; if so, it proceeds to reprocessing of the stored composite media data 722, which in one embodiment can use the JPEG file format or the class CStyle. Finally, display of the reprocessed stored composite media data 723 and application program termination 724 are performed (a storage device or production device for video that interacts in real time).
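The decision structure of this part of FIG. 6 can be sketched as a loop. This is a minimal sketch under stated assumptions: the function `run_flow` and its parameters are hypothetical, decision 754 (time expired) is modeled as a fixed frame count, and the choices at decisions 752 and 753 are held constant for the whole session.

```python
from typing import List


def run_flow(num_frames: int, want_effect: bool, want_save: bool) -> List[int]:
    """Return the FIG. 6 reference numerals visited in the main loop."""
    trace: List[int] = []
    for _ in range(num_frames):   # decision 754: repeat until time expires
        trace += [714, 715, 716]  # composite, re-track, display
        if want_effect:           # decision 752
            trace.append(718)     # load special effect insertion
        if want_save:             # decision 753
            trace.append(720)     # save composite media data
    # Time has expired: reprocess, display the result, end the application.
    trace += [722, 723, 724]
    return trace


if __name__ == "__main__":
    print(run_flow(2, want_effect=True, want_save=True))
```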

It should be noted here that, after motion re-tracking 715, compositing 714 of the camera video and the media data by the media processing device leads to display of the composite media data 716 on the screen; the flow then proceeds to loading of the special effect insertion 718 and, after saving of the composite media data 720, returns to compositing 714. The real-time effect is produced in this way. As shown in FIGS. 3 and 4, after motion re-tracking 715 the virtual person 402 can know the positions of the shoulders and cheeks of the on-site personnel video 401. After saving of the composite media data 720 and motion re-tracking 715, the blush effect 501, a special effect, can be added in real time as shown in FIG. 3. Moreover, thanks to motion re-tracking 715, blush effect 501 can be generated at the correct position wherever the cheek moves.
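How an effect can follow a tracked position may be illustrated with a small tinting routine. This is a sketch only: `apply_blush`, its parameters, the square tint region, and the blending formula are assumptions, and the cheek coordinates are taken as given by whatever tracker performs step 715.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255
Frame = List[List[Pixel]]


def apply_blush(frame: Frame, cheek: Tuple[int, int],
                radius: int = 1, strength: float = 0.5) -> Frame:
    """Return a copy of frame with a red tint in a square around cheek (row, col)."""
    cy, cx = cheek
    out = [row[:] for row in frame]  # leave the input frame untouched
    for y in range(max(0, cy - radius), min(len(frame), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(frame[y]), cx + radius + 1)):
            r, g, b = frame[y][x]
            # Push red toward 255 and scale green and blue down.
            out[y][x] = (int(r + (255 - r) * strength),
                         int(g * (1 - strength)),
                         int(b * (1 - strength)))
    return out
```

Calling the function on every frame with the latest tracked cheek position moves the tint with the face, which is the behavior the paragraph describes for blush effect 501.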

The above describes one embodiment of the software operation process of the method, apparatus, and system for producing video that interacts in real time according to the present invention. When the present invention is applied in a personal computer, set-top box, game console, mobile phone, or the like, the media processing device has means for executing each of the steps described above, and each step can be executed by the corresponding means. As a further application, two users can play with each other. The two users are connected through a network such as the Internet or an intranet; each can select a virtual person for the other party or for themselves, issue commands at one end to operate the virtual person at the other end, and generate various visual special effects that are displayed on both users' monitors.
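The two-user scenario can be pictured as one end encoding a command and the other end applying it. The specification does not define a message format, so everything here is hypothetical: the JSON layout, the field names `target` and `effect`, and the functions `encode_command` and `apply_command`. Network transport is omitted.

```python
import json
from typing import Dict, List


def encode_command(target: str, effect: str) -> str:
    """Serialize a command issued at one end (hypothetical format)."""
    return json.dumps({"target": target, "effect": effect})


def apply_command(message: str,
                  active_effects: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Apply a received command and return the updated effect table."""
    cmd = json.loads(message)
    updated = {name: list(effects) for name, effects in active_effects.items()}
    updated.setdefault(cmd["target"], []).append(cmd["effect"])
    return updated
```

Both ends would apply the same command to their own effect table so that the resulting special effect appears on both monitors.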

In the embodiment of the present invention described above, both the interactivity of the application software and the realism of the compositing effect are taken into account, and the special effect module and the interaction module are designed together and integrated into a single package. As a result, processing is completed first, at the time the media content is laid out, so that system resources can be devoted fully to rendering realism during interaction.

The above description presents embodiments of the present invention and does not limit the scope of its claims. All changes or modifications that do not depart from the spirit of the present invention fall within the scope of the claims.

FIG. 1 is a schematic diagram showing an architecture according to one embodiment of the present invention.
FIG. 2 is a block diagram showing a file architecture according to one embodiment of the present invention.
FIG. 3 is a schematic diagram showing captured on-site personnel composited with a virtual world and played back.
FIG. 4 is another schematic diagram showing captured on-site personnel composited with a virtual world and played back.
FIG. 5 is a sequence of diagrams showing the initial selection in the horizontal edge density calculation.
FIG. 6 is a flowchart showing one embodiment of the operation of the software of the present invention.

Explanation of symbols

100 Computer mainframe
101 Liquid crystal display
102 Network camera
103, 400, 500 Screen
104 On-site personnel
105, 401 On-site personnel video
106, 402 Virtual person
201 Media content
202 Special effect command script
203 Capturing of on-site personnel video
204 Dynamic video compositing
501 Blush effect
502 Joy effect
503 Ear
601 Candidate area
701 Application program trigger
751 Hardware detection
731 Warning message
704 Application program termination
732, 733 Problem message
706 Background data collection
707 Internally stored background data
709 Recognition
710 Motion tracking
711 Reading of media data
761 Media data
713 Decoding of media data
714 Compositing of camera video and media data
715 Motion re-tracking
716 Display of composite media data
752 Whether to load a special effect
718 Loading of special effect insertion
753 Whether to save the composite media data
720 Saving of composite media data
754 Whether the time has expired
722 Reprocessing of stored composite media data
723 Display of reprocessed stored composite media data
724 Application program termination

Claims

1. A method for producing a video that interacts in real time, comprising:
a step of preparing a screen;
a step of capturing a video in real time and displaying it on the screen;
a step of generating an image of a virtual object and displaying it on the screen; and
a step of causing the image of the object and the video to interact.

2. The method for producing a video that interacts in real time according to claim 1, wherein the video is captured by using an internet camera to photograph the scene in front of the internet camera.

3. The method for producing a video that interacts in real time according to claim 1, wherein the step of generating an image of the object further includes a step of generating the image of the corresponding object in a preselected mode.

4. The method for producing a video that interacts in real time according to claim 1, wherein the step of interacting includes a step of recognizing the position of the video.

5. The method for producing a video that interacts in real time according to claim 1, wherein the step of interacting includes a step of tracking changes in the video.

6. The method for producing a video that interacts in real time according to claim 1, wherein the step of interacting includes a step of generating a special effect script on the video.

7. The method for producing a video that interacts in real time according to claim 6, wherein the special effect script is selected from a collection of special effect commands written in a script language.

8. The method for producing a video that interacts in real time according to claim 1, wherein the object is selected from media content.

9. The method for producing a video that interacts in real time according to claim 1, wherein the step of preparing the screen includes a step in which a provider of the video is present in front of a photographing device, the photographing device and the screen being electrically connected.

10. The method for producing a video that interacts in real time according to claim 1, wherein the step of generating an image of the object includes a step of following feature tracking of the video.

11. The method for producing a video that interacts in real time according to claim 1, wherein the step of generating an image of the object includes a step in which the video undergoes posture analysis and recognition.

12. A real-time interactive video production apparatus that stores a plurality of programs read by a media processing device, wherein the media processing device, based on the plurality of programs, executes:
a step of inputting data including background data and real-time video;
a step of recognizing the data;
a step of tracking changes in the data;
a step of preparing media content;
a step of combining the media content and the data; and
a step of displaying the combination of the media content and the data.

13. The real-time interactive video production apparatus according to claim 12, which stores a plurality of programs read by the media processing device, wherein the step of preparing media content further includes:
a step of reading the media content; and
a step of decoding the media content.

14. The real-time interactive video production apparatus according to claim 12, which stores a plurality of programs read by the media processing device, wherein the step of combining the media content and the data further includes a step of re-tracking the changed portion of the data.

15. The real-time interactive video production apparatus according to claim 12, which stores a plurality of programs read by the media processing device, further comprising:
a step of loading a special effect;
a step of reprocessing the combination of the media content, the data, and the special effect; and
a step of displaying the combination of the media content, the data, and the special effect.

16. The real-time interactive video production apparatus according to claim 15, which stores a plurality of programs read by the media processing device, wherein the step of loading the special effect further includes a step of inserting the special effect into the background data.

17. A real-time interactive video production system, comprising:
a display device comprising a screen;
a computing device comprising at least one processor, a memory, and a plurality of readable programs that include media content and special effect command scripts; and
an imaging device for receiving video,
wherein, through processing of the special effect command scripts and the media content, the media content and the video are displayed on the screen while interacting in real time.

18. The real-time interactive video production system according to claim 17, wherein the display device is a liquid crystal monitor.

19. The real-time interactive video production system according to claim 17, wherein the computing device is a computer.