JP2024064155A

JP2024064155A - Image synthesis frame generation device, image synthesis device and program

Info

Publication number: JP2024064155A
Application number: JP2022172533A
Authority: JP
Inventors: 敦志荒井; Atsushi Arai
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2022-10-27
Filing date: 2022-10-27
Publication date: 2024-05-14

Abstract

[Problem] To generate information that makes it easy to recognize, from a bird's-eye view, the direction and angle of view of camera video capturing the real world. [Solution] A camera parameter receiving unit 11 of a video synthesis frame generation device 2 converts the zoom value and focus value included in the camera parameters into a shooting angle of view θ and a focus distance f. A camera parameter interpolation unit 12 interpolates the camera parameter data so that the data rate of the camera parameters matches the frame rate of the camera video. A coordinate conversion unit 13 converts, for each display element, the world coordinate values of that display element into image coordinate values using the camera parameters. A synthesis frame generation unit 14 places the CG of each display element at its image coordinate values, determines the shooting area S within a map image that includes the camera position, generates a PinP image showing the shooting area S on the map image, places the PinP image at predetermined image coordinate values, and thereby generates a synthesis frame. [Selected Figure] Figure 3

Description

The present invention relates to a video synthesis frame generation device that generates frames used when synthesizing camera video, a video synthesis device that synthesizes such frames with camera video, and a program.

Conventionally, a device is known that detects the shooting direction of a camera with a sensor and synthesizes computer-generated virtual video in accordance with the movement of the camera (see, for example, Patent Document 1).

When synthesizing virtual video shot by a virtual camera with a real image shot by a camera, this device keeps the display positional relationship between the two images constant based on angle information that reflects manually operated changes in the camera's shooting direction.

A system is also known that displays map information related to a television program on the program video (see, for example, Patent Document 2).

In response to a viewer's remote control operation on a program in which location information is embedded, this system transmits the location information to a map server, receives map information related to the location information from the map server, and displays the map information on the television screen.

Patent Document 1: JP 2004-227332 A
Patent Document 2: JP 2008-295001 A

The device of Patent Document 1 can present the cameraman and viewers with information about the direction in which the camera is shooting, for example by synthesizing an image of a landmark's name with camera video that includes that landmark. In other words, the cameraman and viewers can recognize the direction in which the camera is facing from information such as the landmark name displayed in the camera video.

However, this device does not let them recognize, from a bird's-eye view of the real world in which the video was shot, which direction the camera video covers and with what angle of view. Recognizing the direction and angle of view is particularly difficult for cameramen and viewers who are unfamiliar with the area.

The system of Patent Document 2 obtains map information related to location information embedded in a program and presents a map related to the program to the viewer.

However, the map information used in this system does not indicate which direction the program's camera video covers or with what angle of view, so the map information cannot present the camera's shooting area.

For devices that perform this kind of video synthesis, it has been desired to present the cameraman operating a camera that shoots the real world, and the viewers of the camera video, with further information about the shooting together with the camera video, in addition to synthesized video annotated with the names of landmarks and the like. Such further information is especially necessary for cameramen and viewers who are unfamiliar with the location where the camera video is shot.

The present invention has been made to solve the above problem, and its object is to provide a video synthesis frame generation device, a video synthesis device, and a program that generate information making it easy to recognize, from a bird's-eye view, the direction and angle of view of camera video capturing the real world.

To solve the above problem, the video synthesis frame generation device of claim 1 generates a frame including information about the shooting of camera video as a synthesis frame to be used when the camera video is synthesized, and comprises:
a coordinate information storage unit that stores, for each display element of the information about the shooting of the camera video, the world coordinate values of that display element;
a map image storage unit that stores a map image including the position of the camera that shoots the camera video;
a CG storage unit that stores CG for each display element;
a camera parameter receiving unit that receives the camera parameters of the camera, treats the zoom value and focus value included in the camera parameters as the shooting angle of view and the focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters output by the camera parameter receiving unit so that their data rate matches the frame rate of the camera video;
a coordinate conversion unit that, for each display element, reads the world coordinate values from the coordinate information storage unit and converts them into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit; and
a synthesis frame generation unit that reads the map image from the map image storage unit, extracts the pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, based on the pan angle and the shooting angle of view, a shooting area in the map image referenced to the position of the camera, generates a PinP (picture-in-picture) image showing the shooting area on the map image, reads the CG from the CG storage unit for each display element, places the CG at the image coordinate values obtained by the coordinate conversion unit, and places the PinP image at preset image coordinate values, thereby generating the synthesis frame.
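The rate-matching step performed by the camera parameter interpolation unit could be sketched as follows. This is only an illustration: the text above does not state which interpolation method is used, so linear interpolation is an assumption, and the function name `interpolate_params` and the sample layout are hypothetical.

```python
from bisect import bisect_right


def interpolate_params(samples, frame_times):
    """Resample camera parameters at each video frame time.

    samples: list of (time_sec, {name: value}) sorted by time (hypothetical layout).
    frame_times: frame timestamps in seconds.
    Assumption: plain linear interpolation, clamped at both ends.
    Angles that wrap around (e.g. 359 deg to 1 deg) are not handled here.
    """
    times = [t for t, _ in samples]
    result = []
    for ft in frame_times:
        i = bisect_right(times, ft)
        if i == 0:
            # Frame precedes the first sample: hold the first value.
            result.append(dict(samples[0][1]))
        elif i == len(samples):
            # Frame is at or after the last sample: hold the last value.
            result.append(dict(samples[-1][1]))
        else:
            (t0, p0), (t1, p1) = samples[i - 1], samples[i]
            w = (ft - t0) / (t1 - t0)
            result.append({k: p0[k] + w * (p1[k] - p0[k]) for k in p0})
    return result
```

For example, with samples at 0 s and 1 s and frame times every 0.25 s, each frame receives its own parameter set even though only two samples arrived.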

The video synthesis frame generation device of claim 2 is the video synthesis frame generation device of claim 1, comprising a new camera parameter receiving unit in place of the camera parameter receiving unit, and further comprising a conversion information storage unit that stores the shooting angle of view corresponding to each zoom value and the focus distance corresponding to each focus value. The new camera parameter receiving unit receives the camera parameters of the camera, extracts the zoom value and the focus value from the camera parameters, reads from the conversion information storage unit the shooting angle of view corresponding to the zoom value and the focus distance corresponding to the focus value, and outputs camera parameters including the shooting angle of view and the focus distance. The camera parameter interpolation unit interpolates the camera parameters output by the new camera parameter receiving unit so that their data rate matches the frame rate of the camera video.
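The lookup against the conversion information storage unit could be sketched as follows. The table contents, the function name `lookup`, and the piecewise-linear interpolation between entries are all assumptions made for illustration; the claim only states that the storage unit holds the angle of view corresponding to each zoom value and the focus distance corresponding to each focus value.

```python
def lookup(table, raw):
    """Map a raw lens value to a physical value via a conversion table.

    table: list of (raw_value, physical_value) sorted by raw_value (hypothetical data).
    Assumption: values between entries are linearly interpolated and
    values outside the table are clamped to the nearest entry.
    """
    if raw <= table[0][0]:
        return table[0][1]
    if raw >= table[-1][0]:
        return table[-1][1]
    for (r0, v0), (r1, v1) in zip(table, table[1:]):
        if r0 <= raw <= r1:
            return v0 + (raw - r0) / (r1 - r0) * (v1 - v0)


# Hypothetical table: raw zoom value -> shooting angle of view in degrees.
ZOOM_TO_FOV = [(0, 60.0), (100, 30.0), (200, 10.0)]
```

A call such as `lookup(ZOOM_TO_FOV, 50)` would then give the angle of view for a zoom value halfway between the first two table entries.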

The video synthesis frame generation device of claim 3 is the video synthesis frame generation device of claim 1 or 2, in which the coordinate conversion unit extracts the pan angle, tilt angle, roll angle, focus distance and camera position from the camera parameters. With the pan angle denoted α, the tilt angle δ, the roll angle φ, the focus distance f, the camera position t_w = [t_wx, t_wy, t_wz], the world coordinate values (X_w, Y_w, Z_w), the image coordinate values (x_i, y_i), and half the number of pixels in the horizontal and vertical directions μ₀ and ν₀, the coordinate conversion unit calculates the orientation R from the pan angle α, the tilt angle δ and the roll angle φ by the following equation:

[Equation for the orientation R; not reproduced in this text]

and, for each display element, converts the world coordinate values (X_w, Y_w, Z_w) into the image coordinate values (x_i, y_i), using the focus distance f, the orientation R, the camera position t_w and the half pixel counts μ₀ and ν₀, by the following equation:

[Equation for the world-to-image conversion; not reproduced in this text]
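Because the equations of claim 3 are not reproduced in this text, the following is only a generic pinhole-projection sketch that uses the same symbols, not the patent's actual formulas. Everything about the conventions is an assumption: world axes X east, Y north, Z up; pan α measured clockwise from north; tilt δ positive upward; roll φ about the optical axis; f expressed in pixels; image y increasing downward. The function names are hypothetical.

```python
import math


def camera_axes(pan, tilt, roll):
    """Return the camera's right, up and forward unit vectors in world axes.

    Stacked as rows, these three vectors form an orientation matrix R
    (assumed convention, see the lead-in). Angles are in radians.
    """
    sa, ca = math.sin(pan), math.cos(pan)
    sd, cd = math.sin(tilt), math.cos(tilt)
    forward = (sa * cd, ca * cd, sd)
    right0 = (ca, -sa, 0.0)            # right vector before roll
    up0 = (-sa * sd, -ca * sd, cd)     # up vector before roll
    sr, cr = math.sin(roll), math.cos(roll)
    right = tuple(cr * r + sr * u for r, u in zip(right0, up0))
    up = tuple(-sr * r + cr * u for r, u in zip(right0, up0))
    return right, up, forward


def world_to_image(point, cam_pos, pan, tilt, roll, f, mu0, nu0):
    """Project world point (X_w, Y_w, Z_w) to image coordinates (x_i, y_i).

    Returns None when the point lies behind the camera.
    """
    right, up, forward = camera_axes(pan, tilt, roll)
    d = tuple(p - t for p, t in zip(point, cam_pos))   # X_w - t_w
    xc = sum(a * b for a, b in zip(d, right))
    yc = sum(a * b for a, b in zip(d, up))
    zc = sum(a * b for a, b in zip(d, forward))
    if zc <= 0:
        return None
    # Perspective division, then shift the origin by the half pixel counts.
    return (mu0 + f * xc / zc, nu0 - f * yc / zc)
```

Under these assumptions, a point straight ahead of the camera lands at the image centre (μ₀, ν₀), which is a quick sanity check for any chosen convention.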

The video synthesis frame generation device of claim 4 is the video synthesis frame generation device of claim 1 or 2, in which the synthesis frame generation unit extracts the pan angle and the shooting angle of view from the camera parameters, determines the orientation of the camera from the pan angle, determines from the shooting angle of view the angle subtended by the shooting area with the camera position as its vertex, and determines the shooting area in the map image from the camera orientation and that angle.
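The derivation of the shooting area from the pan angle and the shooting angle of view could be sketched as a fan-shaped polygon on the map. The assumptions here are not from the source: the map is north-up with pixel x increasing to the east and y increasing to the south, the pan angle is a compass bearing in degrees, and the drawn radius is an arbitrary display choice. The name `shooting_area` is hypothetical.

```python
import math


def shooting_area(cx, cy, pan_deg, fov_deg, radius, steps=8):
    """Return polygon vertices of the shooting area on a north-up map.

    (cx, cy): camera position in map pixels.
    pan_deg: camera bearing, clockwise from north (assumed convention).
    fov_deg: shooting angle of view; the area spans pan +/- fov/2.
    radius: length of the drawn fan in pixels (display choice, assumed).
    """
    start = pan_deg - fov_deg / 2.0
    vertices = [(cx, cy)]                      # apex at the camera position
    for k in range(steps + 1):
        bearing = math.radians(start + fov_deg * k / steps)
        vertices.append((cx + radius * math.sin(bearing),
                         cy - radius * math.cos(bearing)))
    return vertices
```

The returned vertex list can be filled or outlined on a copy of the map image to obtain the PinP image; narrowing the angle of view narrows the fan, and panning rotates it about the camera position.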

The video synthesis frame generation device of claim 5 is the video synthesis frame generation device of claim 1 or 2, in which the CG storage unit stores, for each display element, a plurality of CGs corresponding to the shooting angle of view or the focus distance, and the synthesis frame generation unit extracts the shooting angle of view or the focus distance from the camera parameters, reads from the CG storage unit, for each display element, the CG corresponding to that shooting angle of view or focus distance, places the CG at the image coordinate values, and places the PinP image at the preset location, thereby generating the synthesis frame.
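Selecting one of several stored CGs according to the shooting angle of view could look like the sketch below. The threshold scheme, the variant names and the function `select_cg` are hypothetical; the claim only says that multiple CGs are stored per display element and that the one corresponding to the angle of view or focus distance is read out.

```python
def select_cg(variants, fov_deg):
    """Pick the CG variant for the current shooting angle of view.

    variants: list of (max_fov_deg, cg_name) sorted by max_fov_deg ascending
    (hypothetical storage layout). The first variant whose upper bound
    covers fov_deg is used; the widest variant is the fallback.
    """
    for max_fov, cg_name in variants:
        if fov_deg <= max_fov:
            return cg_name
    return variants[-1][1]


# Hypothetical variants for one landmark: more detail when zoomed in.
STATION_CG = [(10.0, "station_detailed"), (30.0, "station_normal"), (90.0, "station_small")]
```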

The video synthesis frame generation device of claim 6 is the video synthesis frame generation device of claim 1 or 2, in which the synthesis frame generation unit extracts the shooting angle of view or the focus distance from the camera parameters, changes the scale of the PinP image based on the shooting angle of view or the focus distance, and places the rescaled PinP image at the preset location, thereby generating the synthesis frame.

Furthermore, the video synthesis device of claim 7 generates a frame including information about the shooting of camera video as a synthesis frame and synthesizes the synthesis frame with a frame of the camera video to generate synthesized video, and comprises:
a coordinate information storage unit that stores, for each display element of the information about the shooting of the camera video, the world coordinate values of that display element;
a map image storage unit that stores a map image including the position of the camera that shoots the camera video;
a CG storage unit that stores CG for each display element;
a camera parameter receiving unit that receives the camera parameters of the camera, treats the zoom value and focus value included in the camera parameters as the shooting angle of view and the focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters output by the camera parameter receiving unit so that their data rate matches the frame rate of the camera video;
a coordinate conversion unit that, for each display element, reads the world coordinate values from the coordinate information storage unit and converts them into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit;
a synthesis frame generation unit that reads the map image from the map image storage unit, extracts the pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, based on the pan angle and the shooting angle of view, a shooting area in the map image referenced to the position of the camera, generates a PinP (picture-in-picture) image showing the shooting area on the map image, reads the CG from the CG storage unit for each display element, places the CG at the image coordinate values obtained by the coordinate conversion unit, and places the PinP image at preset image coordinate values, thereby generating the synthesis frame;
a video delay unit that delays the frames of the camera video so that they are synchronized with the synthesis frames generated by the synthesis frame generation unit; and
a frame synthesis unit that synthesizes the synthesis frame generated by the synthesis frame generation unit with the frame of the camera video delayed by the video delay unit, and outputs the synthesized video.

Furthermore, the program of claim 8 causes a computer constituting a video synthesis frame generation device, which generates a frame including information about the shooting of camera video as a synthesis frame to be used when the camera video is synthesized, to function as:
a coordinate information storage unit that stores, for each display element of the information about the shooting of the camera video, the world coordinate values of that display element;
a map image storage unit that stores a map image including the position of the camera that shoots the camera video;
a CG storage unit that stores CG for each display element;
a camera parameter receiving unit that receives the camera parameters of the camera, treats the zoom value and focus value included in the camera parameters as the shooting angle of view and the focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters output by the camera parameter receiving unit so that their data rate matches the frame rate of the camera video;
a coordinate conversion unit that, for each display element, reads the world coordinate values from the coordinate information storage unit and converts them into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit; and
a synthesis frame generation unit that reads the map image from the map image storage unit, extracts the pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, based on the pan angle and the shooting angle of view, a shooting area in the map image referenced to the position of the camera, generates a PinP (picture-in-picture) image showing the shooting area on the map image, reads the CG from the CG storage unit for each display element, places the CG at the image coordinate values obtained by the coordinate conversion unit, and places the PinP image at preset image coordinate values, thereby generating the synthesis frame.

The program of claim 9 causes a computer constituting a video synthesis device, which generates a frame including information about the shooting of camera video as a synthesis frame and synthesizes the synthesis frame with a frame of the camera video to generate synthesized video, to function as:
a coordinate information storage unit that stores, for each display element of the information about the shooting of the camera video, the world coordinate values of that display element;
a map image storage unit that stores a map image including the position of the camera that shoots the camera video;
a CG storage unit that stores CG for each display element;
a camera parameter receiving unit that receives the camera parameters of the camera, treats the zoom value and focus value included in the camera parameters as the shooting angle of view and the focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters output by the camera parameter receiving unit so that their data rate matches the frame rate of the camera video;
a coordinate conversion unit that, for each display element, reads the world coordinate values from the coordinate information storage unit and converts them into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit;
a synthesis frame generation unit that reads the map image from the map image storage unit, extracts the pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, based on the pan angle and the shooting angle of view, a shooting area in the map image referenced to the position of the camera, generates a PinP (picture-in-picture) image showing the shooting area on the map image, reads the CG from the CG storage unit for each display element, places the CG at the image coordinate values obtained by the coordinate conversion unit, and places the PinP image at preset image coordinate values, thereby generating the synthesis frame;
a video delay unit that delays the frames of the camera video so that they are synchronized with the synthesis frames generated by the synthesis frame generation unit; and
a frame synthesis unit that synthesizes the synthesis frame generated by the synthesis frame generation unit with the frame of the camera video delayed by the video delay unit, and outputs the synthesized video.

As described above, the present invention makes it possible to generate information from which the direction and angle of view of camera video capturing the real world can easily be recognized from a bird's-eye view.

Fig. 1 is a schematic diagram showing an example of the overall configuration of the video synthesis system of Example 1.
Fig. 2 is a schematic diagram showing an outline of the operation of the video synthesis system of Example 1.
Fig. 3 is a block diagram showing a configuration example of the video synthesis frame generation device in Example 1.
Fig. 4 is a flowchart showing a processing example of the video synthesis frame generation device in Example 1.
Fig. 5 is a flowchart showing a processing example of the camera parameter receiving unit.
Fig. 6 is a flowchart showing a processing example of the coordinate conversion unit.
Fig. 7 is a diagram explaining a processing example of the coordinate conversion unit.
Fig. 8 is a flowchart showing a processing example of the synthesis frame generation unit.
Fig. 9 is a diagram showing an example of a synthesis frame.
Fig. 10 is a diagram showing an example of a PinP image.
Fig. 11 is a diagram showing a data configuration example of the conversion information storage unit.
Fig. 12 is a diagram showing a data configuration example of the coordinate information storage unit.
Fig. 13 is a diagram showing a data configuration example of the CG storage unit.
Fig. 14 is a diagram showing another data configuration example of the CG storage unit.
Fig. 15 is a flowchart showing another processing example of the video synthesis frame generation device in Example 1.
Fig. 16 is a block diagram showing a configuration example of the video frame synthesis device in Example 1.
Fig. 17 is a diagram showing an example of synthesized video before and after a pan operation.
Fig. 18 is a diagram showing an example of synthesized video before and after a tilt operation and a zoom operation.
Fig. 19 is a diagram showing an example of synthesized video before and after the landmark display is switched in accordance with a zoom operation.
Fig. 20 is a block diagram showing a configuration example of the video synthesis device of Example 2.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.
[Example 1]
First, Example 1 will be described. Fig. 1 is a schematic diagram showing an example of the overall configuration of the video synthesis system of Example 1, and Fig. 2 is a schematic diagram showing an outline of its operation.

The video synthesis system 1 comprises a video synthesis frame generation device 2 and a video frame synthesis device 3. The video synthesis system 1 receives as input the camera video shot by a camera operated by a cameraman, together with the parameters of the camera (camera parameters CP). The video synthesis system 1 then generates a synthesis frame that includes CG for each display element, such as a direction or a landmark, and a PinP image showing the shooting area S on a map image, synthesizes the synthesis frame with the frame of the camera video, and outputs the synthesized video in real time. The shooting area S is the area indicating which direction the camera is shooting and with what angle of view.

The video synthesis frame generation device 2 receives the camera parameters CP as input and, based on them, generates a frame containing information about the shooting of the camera video as a synthesis frame to be used when the video frame synthesis device 3, described below, synthesizes the camera video. It outputs the synthesis frame to the video frame synthesis device 3.

Specifically, the video synthesis frame generation device 2 converts the world coordinate values of each display element, such as a direction or a landmark, into image coordinate values (camera coordinate values). The video synthesis frame generation device 2 then determines the shooting area S within a map image that includes the camera position, and generates a PinP image showing the shooting area S on the map image. World coordinate values are coordinate values in the world coordinate system, and image coordinate values are coordinate values in the image coordinate system (camera coordinate system).

The video synthesis frame generation device 2 generates the synthesis frame by placing the CG of each display element and the PinP image at their respective predetermined image coordinate values, and outputs the synthesis frame to the video frame synthesis device 3.

As shown in the lower part of Fig. 2, the synthesis frame consists of, for example, the landmark display element "Asahikawa Station" placed at its image coordinate values and two PinP images placed at predetermined image coordinate values.

Returning to Fig. 1, the video frame synthesis device 3 receives the camera video as input, receives the synthesis frame from the video synthesis frame generation device 2, delays the camera video by a predetermined time, synthesizes the synthesis frame with the frame of the delayed camera video, and outputs the synthesized video.
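The delay applied to the camera video before synthesis could be sketched as a fixed-length first-in, first-out buffer. The class name `FrameDelay` and the choice to express the predetermined time as a whole number of frames are assumptions for illustration; the text only states that the camera video is delayed by a predetermined time so that it lines up with the synthesis frame.

```python
from collections import deque


class FrameDelay:
    """Delay a stream of frames by a fixed number of frames (assumed unit)."""

    def __init__(self, delay_frames):
        self._delay = delay_frames
        self._buffer = deque()

    def push(self, frame):
        """Store the newest frame; return the frame from delay_frames ago.

        Returns None while the buffer is still filling.
        """
        self._buffer.append(frame)
        if len(self._buffer) > self._delay:
            return self._buffer.popleft()
        return None
```

With a delay of two frames, the first two calls to `push` return `None`, and from the third call onward each call returns the frame that arrived two calls earlier, which can then be paired with the synthesis frame generated for it.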

合成映像（のフレーム）は、図２の右側に示すように、カメラ映像に、ランドマークの表示要素「旭川駅」及び２つのＰinＰ画像からなる合成用フレームが合成された映像である。 The composite image (frame) is an image in which the camera image is combined with a composite frame consisting of the landmark display element "Asahikawa Station" and two PinP images, as shown on the right side of Figure 2.

これにより、カメラマン及び視聴者は、合成映像に含まれるＰinＰ画像から、カメラがどの方角をどれだけの画角で撮影しているかを認識することができ、合成映像に含まれるランドマークの表示要素「旭川駅」の表示から、カメラ映像の方角及び画角を具体的に認識することができる。つまり、図１に示した映像合成システム１を用いることで、実世界が撮影されたカメラ映像について、俯瞰的に撮影の方角及び画角を容易に認識可能な情報を生成することができる。 This allows the cameraman and viewers to recognize from the PinP image included in the composite image which direction the camera is shooting at and what angle of view it is shooting at, and they can specifically recognize the direction and angle of view of the camera image from the display of the landmark display element "Asahikawa Station" included in the composite image. In other words, by using the image synthesis system 1 shown in Figure 1, it is possible to generate information that allows the direction and angle of view of camera images captured in the real world to be easily recognized from a bird's-eye view.

尚、この合成映像には、特定のランドマークのＣＧ（本例では「旭川駅」）が合成されているが、例えば東西南北等の単純な方角の名称を示すＣＧが合成されるようにしてもよい。合成される表示要素のＣＧは、映像合成用フレーム生成装置２を操作するユーザにより自由に設定されるものとする。 In addition, this composite image includes a CG image of a specific landmark ("Asahikawa Station" in this example), but it may also include a CG image showing the names of simple directions such as north, south, east and west. The CG image of the display elements to be composited can be freely set by the user operating the image composition frame generation device 2.

（映像合成用フレーム生成装置２／実施例１） (Video synthesis frame generation device 2 / Embodiment 1)
次に、図１に示した実施例１における映像合成用フレーム生成装置２について詳細に説明する。図３は、実施例１における映像合成用フレーム生成装置２の構成例を示すブロック図であり、図４は、その処理例を示すフローチャートである。 Next, the video synthesis frame generation device 2 in the first embodiment shown in FIG. 1 is described in detail. FIG. 3 is a block diagram showing a configuration example of the video synthesis frame generation device 2 in the first embodiment, and FIG. 4 is a flowchart showing an example of its processing.

この映像合成用フレーム生成装置２は、フレーム処理部１０及び記憶部２０を備えている。フレーム処理部１０は、カメラパラメータ受信部１１、カメラパラメータ補間部１２、座標変換部１３及び合成用フレーム生成部１４を備え、記憶部２０は、変換情報記憶部２１、座標情報記憶部２２、ＣＧ記憶部２３及び地図画像記憶部２４を備えている。 This video synthesis frame generation device 2 includes a frame processing unit 10 and a storage unit 20. The frame processing unit 10 includes a camera parameter receiving unit 11, a camera parameter interpolation unit 12, a coordinate conversion unit 13, and a synthesis frame generation unit 14, and the storage unit 20 includes a conversion information storage unit 21, a coordinate information storage unit 22, a CG storage unit 23, and a map image storage unit 24.

カメラパラメータ受信部１１は、カメラからカメラパラメータＣＰを受信し、変換情報記憶部２１に格納された変換情報を用いて、カメラパラメータＣＰに含まれるズーム値及びフォーカス値をそれぞれ撮影画角θ及びフォーカス距離ｆに変換し、撮影画角θ及びフォーカス距離ｆを含むカメラパラメータＣＰ１を生成する（ステップＳ４０１）。 The camera parameter receiving unit 11 receives the camera parameters CP from the camera, and converts the zoom value and focus value included in the camera parameters CP into the shooting angle of view θ and the focus distance f, respectively, using the conversion information stored in the conversion information storage unit 21, to generate camera parameters CP1 including the shooting angle of view θ and the focus distance f (step S401).

尚、カメラパラメータ受信部１１は、カメラパラメータＣＰに含まれるズーム値及びフォーカス値をそれぞれ撮影画角θ及びフォーカス距離ｆとして、カメラパラメータＣＰ１を生成するようにしてもよい。 The camera parameter receiving unit 11 may generate the camera parameters CP1 by setting the zoom value and focus value included in the camera parameters CP as the shooting angle of view θ and focus distance f, respectively.

カメラパラメータ受信部１１は、カメラパラメータＣＰ１をカメラパラメータ補間部１２に出力する。ここで、カメラパラメータ受信部１１が入力するカメラパラメータＣＰは、パン角α、チルト角δ、ロール角φ、ズーム値、フォーカス値、カメラ位置ｔ_w等により構成される。カメラパラメータ受信部１１が出力するカメラパラメータＣＰ１は、パン角α、チルト角δ、ロール角φ、撮影画角θ、フォーカス距離ｆ、カメラ位置ｔ_w等により構成される。後述するカメラパラメータＣＰ２についてもカメラパラメータＣＰ１と同様である。カメラパラメータ受信部１１の処理の詳細については後述する。 The camera parameter receiving unit 11 outputs the camera parameters CP1 to the camera parameter interpolation unit 12. Here, the camera parameters CP input by the camera parameter receiving unit 11 consist of a pan angle α, a tilt angle δ, a roll angle φ, a zoom value, a focus value, a camera position t_w, and so on. The camera parameters CP1 output by the camera parameter receiving unit 11 consist of a pan angle α, a tilt angle δ, a roll angle φ, a shooting angle of view θ, a focus distance f, a camera position t_w, and so on. The camera parameters CP2 described later have the same composition as the camera parameters CP1. Details of the processing by the camera parameter receiving unit 11 will be described later.

カメラパラメータ補間部１２は、カメラパラメータ受信部１１からカメラパラメータＣＰ１を入力する。そして、カメラパラメータ補間部１２は、カメラ映像のフレームレートに対し、カメラパラメータＣＰ１を構成するデータ列のデータレート（カメラにおいてカメラパラメータＣＰが更新されるレート）が不足している場合、カメラパラメータＣＰ１のデータレートがカメラ映像のフレームレートに一致するように、カメラパラメータＣＰ１のデータを補間することで、その値が滑らかに変化するカメラパラメータＣＰ２を生成する（ステップＳ４０２）。カメラパラメータ補間部１２は、カメラパラメータＣＰ２を座標変換部１３に出力する。 The camera parameter interpolation unit 12 inputs the camera parameters CP1 from the camera parameter receiving unit 11. Then, when the data rate of the data sequence constituting the camera parameters CP1 (the rate at which the camera parameters CP are updated in the camera) is insufficient for the frame rate of the camera image, the camera parameter interpolation unit 12 generates camera parameters CP2 whose values change smoothly by interpolating the data of the camera parameters CP1 so that the data rate of the camera parameters CP1 matches the frame rate of the camera image (step S402). The camera parameter interpolation unit 12 outputs the camera parameters CP2 to the coordinate conversion unit 13.

補間手法はユーザにより予め設定され、かつ選択できるものとする。その基本的な手法は、カメラパラメータＣＰ１のデータ列を入力した後、次のデータ列を入力するまでの間、同じ値を保持し続ける０次ホールドにより行う。また、より滑らかに補間する場合は、入力されたデータ列の区間毎に、異なる低次の多項式を用いて近似するスプライン補間を行う。 The interpolation method is preset and selectable by the user. The basic method is a zero-order hold, which holds the same value after the data string of the camera parameters CP1 is input, until the next data string is input. For smoother interpolation, spline interpolation is performed, approximating each section of the input data string using different low-order polynomials.
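
The zero-order hold described above can be sketched as follows. This is a minimal illustration rather than the implementation in the patent; the function name `zero_order_hold`, the `(timestamp, value)` sample format, and the 10 Hz and 60 fps rates are assumptions introduced here.

```python
from bisect import bisect_right


def zero_order_hold(samples, frame_times):
    """Resample (time, value) samples at frame_times by holding the latest value.

    samples: list of (timestamp_seconds, value), sorted by timestamp (assumed format).
    frame_times: timestamps of the video frames that need a parameter value.
    """
    times = [t for t, _ in samples]
    held = []
    for ft in frame_times:
        # Index of the most recent sample at or before this frame time.
        k = bisect_right(times, ft) - 1
        # Before the first sample arrives, fall back to the first value.
        held.append(samples[max(k, 0)][1])
    return held


# Hypothetical pan angles [rad] arriving at 10 Hz, resampled for 7 frames at 60 fps.
pan_samples = [(0.0, 0.10), (0.1, 0.20)]
frame_times = [i / 60 for i in range(7)]
pan_per_frame = zero_order_hold(pan_samples, frame_times)
```

The held value changes only when a new sample arrives, which is why the text turns to spline interpolation when smoother motion is needed.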

また、通常使用される３次スプライン補間は、補間対象の区間において、３次関数の係数をａ_j，ｂ_j，ｃ_j、対象区間の始点のデータをｄ_jとすると、以下の式で表される。

Furthermore, the commonly used cubic spline interpolation is expressed by the following equation, where a_j, b_j, and c_j are the coefficients of the cubic function in the interval to be interpolated, and d_j is the data at the start point of that interval.
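
A standard form of a cubic spline segment consistent with these definitions is sketched below. The symbols S_j (the segment function), t (the time variable), and t_j (the start of interval j) are assumptions introduced here for illustration and do not appear in the original text.

```latex
S_j(t) = a_j (t - t_j)^3 + b_j (t - t_j)^2 + c_j (t - t_j) + d_j,
\qquad t_j \le t \le t_{j+1}
```

With this form, S_j(t_j) = d_j, which matches the description of d_j as the data at the start point of the interval.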

無限区間において実時間で逐次に３次スプライン補間を行う場合、制御点を滑らかにつなぐ境界条件に加え、補間対象の前後区間における影響定数及び前後区間数を決定することにより係数ａ_j，ｂ_j，ｃ_jを求めることができ、補間が行われる。その他、ＬＳＴＭ（Long Short Term Memory）等の深層学習により、時系列データの予測を行う学習済み推論モデルを使用するようにしてもよい。 When cubic spline interpolation is performed sequentially in real time over an unbounded interval, the coefficients a_j, b_j, and c_j can be obtained by determining, in addition to the boundary conditions that smoothly connect the control points, the influence constants of the intervals before and after the interpolation target and the number of such intervals; the interpolation is then carried out. Alternatively, a trained inference model that predicts time-series data through deep learning, such as LSTM (Long Short-Term Memory), may be used.

座標変換部１３は、カメラパラメータ補間部１２からカメラパラメータＣＰ２を入力する。そして、座標変換部１３は、方角、ランドマーク等の表示要素毎に、カメラパラメータＣＰ２を用いて、座標情報記憶部２２に格納された世界座標値を画像座標値に変換する（ステップＳ４０３）。 The coordinate conversion unit 13 inputs the camera parameters CP2 from the camera parameter interpolation unit 12. Then, the coordinate conversion unit 13 converts the world coordinate values stored in the coordinate information storage unit 22 into image coordinate values using the camera parameters CP2 for each display element such as a direction or a landmark (step S403).

座標変換部１３は、表示要素毎の画像座標値及びカメラパラメータＣＰ２を合成用フレーム生成部１４に出力する。座標変換部１３の処理の詳細については後述する。 The coordinate conversion unit 13 outputs the image coordinate values and camera parameters CP2 for each display element to the synthesis frame generation unit 14. The processing of the coordinate conversion unit 13 will be described in detail later.

合成用フレーム生成部１４は、座標変換部１３から、表示要素毎の画像座標値及びカメラパラメータＣＰ２を入力する。そして、合成用フレーム生成部１４は、表示要素毎に、ＣＧ記憶部２３に格納された当該表示要素のＣＧを、入力した画像座標値の箇所に配置する。また、合成用フレーム生成部１４は、地図画像記憶部２４に格納された地図画像及びカメラパラメータＣＰ２を用いて、地図画像内のカメラの位置を基準とした撮影領域Ｓを求め、撮影領域Ｓを地図画像に表したＰinＰ画像を生成し、ＰinＰ画像を所定の画像座標値の箇所に配置する。 The synthesis frame generation unit 14 inputs the image coordinate values and camera parameters CP2 for each display element from the coordinate conversion unit 13. Then, for each display element, the synthesis frame generation unit 14 places the CG of that display element stored in the CG storage unit 23 at the location of the input image coordinate values. In addition, the synthesis frame generation unit 14 uses the map image stored in the map image storage unit 24 and the camera parameters CP2 to determine the shooting area S based on the position of the camera in the map image, generates a PinP image that shows the shooting area S on the map image, and places the PinP image at a location of a specified image coordinate value.

合成用フレーム生成部１４は、表示要素毎のＣＧ及び撮影領域Ｓを地図画像に表したＰinＰ画像を含む合成用フレームを生成し、合成用フレームを映像フレーム合成装置３へ出力する（ステップＳ４０４）。合成用フレーム生成部１４の処理の詳細については後述する。 The synthesis frame generation unit 14 generates a synthesis frame that includes the CG for each display element and the PinP image showing the shooting area S on the map image, and outputs the synthesis frame to the video frame synthesis device 3 (step S404). The processing of the synthesis frame generation unit 14 will be described in detail later.

映像合成用フレーム生成装置２は、当該映像合成用フレーム生成装置２による処理が完了したか否かを判定する（ステップＳ４０５）。映像合成用フレーム生成装置２は、ステップＳ４０５において、当該処理が完了していないと判定した場合（ステップＳ４０５：Ｎ）、ステップＳ４０１へ移行し、次のカメラパラメータＣＰを受信する等の前述のステップＳ４０１～Ｓ４０４の処理を継続する。一方、映像合成用フレーム生成装置２は、当該処理が完了したと判定した場合（ステップＳ４０５：Ｙ）、当該処理を終了する。 The video synthesis frame generating device 2 determines whether the processing by the video synthesis frame generating device 2 has been completed (step S405). If the video synthesis frame generating device 2 determines in step S405 that the processing has not been completed (step S405: N), it proceeds to step S401 and continues the processing of the above-mentioned steps S401 to S404, such as receiving the next camera parameters CP. On the other hand, if the video synthesis frame generating device 2 determines that the processing has been completed (step S405: Y), it ends the processing.

（カメラパラメータ受信部１１） (Camera parameter receiving unit 11)
次に、図３に示したカメラパラメータ受信部１１の処理の詳細について説明する。図５は、カメラパラメータ受信部１１の処理例を示すフローチャートであり、図４に示したステップＳ４０１に対応している。 Next, the processing of the camera parameter receiving unit 11 shown in FIG. 3 is described in detail. FIG. 5 is a flowchart showing an example of the processing of the camera parameter receiving unit 11, and corresponds to step S401 shown in FIG. 4.

カメラパラメータ受信部１１は、カメラからカメラパラメータＣＰを受信し（ステップＳ５０１）、カメラパラメータＣＰからズーム値及びフォーカス値を抽出する（ステップＳ５０２）。 The camera parameter receiving unit 11 receives the camera parameters CP from the camera (step S501) and extracts the zoom value and focus value from the camera parameters CP (step S502).

カメラパラメータ受信部１１は、変換情報記憶部２１から、ズーム値に対応する撮影画角θを読み出すと共に（ステップＳ５０３）、フォーカス値に対応するフォーカス距離ｆを読み出す（ステップＳ５０４）。 The camera parameter receiving unit 11 reads out the shooting angle of view θ corresponding to the zoom value from the conversion information storage unit 21 (step S503), and also reads out the focus distance f corresponding to the focus value (step S504).

図１１は、変換情報記憶部２１のデータ構成例を示す図である。変換情報記憶部２１には、ズーム値及び当該ズーム値に対応する撮影画角θが組となって、複数の組のデータが格納されている。また、変換情報記憶部２１には、フォーカス値及び当該フォーカス値に対応するフォーカス距離ｆが組となって、複数の組のデータが格納されている。これらのデータは予め設定され、ユーザにより変更することができる。 Figure 11 is a diagram showing an example of the data configuration of the conversion information storage unit 21. The conversion information storage unit 21 stores multiple sets of data, each of which is a pair of a zoom value and a shooting angle of view θ corresponding to that zoom value. The conversion information storage unit 21 also stores multiple sets of data, each of which is a pair of a focus value and a focus distance f corresponding to that focus value. These data are set in advance and can be changed by the user.

これにより、ズーム値及びフォーカス値が、それぞれ撮影画角θ及びフォーカス距離ｆに変換される。 This converts the zoom value and focus value into the shooting angle of view θ and focus distance f, respectively.
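
The table lookup in steps S503 and S504 can be sketched as below. The table contents, the function name `lookup`, and the linear interpolation between neighbouring entries are assumptions made for illustration; the original text only states that pairs of values are stored and read out.

```python
from bisect import bisect_left

# Hypothetical conversion table: (raw zoom value, shooting angle of view in degrees).
ZOOM_TO_ANGLE = [(0, 60.0), (1000, 30.0), (2000, 10.0)]


def lookup(table, raw):
    """Return the value paired with raw, interpolating linearly between entries."""
    keys = [k for k, _ in table]
    # Clamp to the range covered by the table.
    if raw <= keys[0]:
        return table[0][1]
    if raw >= keys[-1]:
        return table[-1][1]
    i = bisect_left(keys, raw)
    (k0, v0), (k1, v1) = table[i - 1], table[i]
    return v0 + (v1 - v0) * (raw - k0) / (k1 - k0)


theta = lookup(ZOOM_TO_ANGLE, 500)
```

The same function could serve a second table that pairs focus values with focus distances f.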

図５に戻って、カメラパラメータ受信部１１は、ズーム値及びフォーカス値の代わりに撮影画角θ及びフォーカス距離ｆを含むカメラパラメータＣＰ１、すなわちパン角α、チルト角δ、ロール角φ、撮影画角θ、フォーカス距離ｆ及びカメラ位置ｔ_wを含むカメラパラメータＣＰ１を生成する（ステップＳ５０５）。そして、カメラパラメータ受信部１１は、カメラパラメータＣＰ１をカメラパラメータ補間部１２に出力する（ステップＳ５０６）。 Returning to FIG. 5, the camera parameter receiving unit 11 generates camera parameters CP1 that include the shooting angle of view θ and the focus distance f in place of the zoom value and the focus value, that is, camera parameters CP1 including the pan angle α, the tilt angle δ, the roll angle φ, the shooting angle of view θ, the focus distance f, and the camera position t_w (step S505). The camera parameter receiving unit 11 then outputs the camera parameters CP1 to the camera parameter interpolation unit 12 (step S506).

これにより、カメラパラメータＣＰに含まれるズーム値及びフォーカス値が、それぞれズームレンズの画角及びフォーカスの合う距離を直接示す値でない場合、変換情報記憶部２１に格納された予め設定された変換情報を用いて、撮影画角θ及びフォーカス距離ｆに変換される。この場合、カメラの位置が変化しないで固定の場合は、既知の固定のカメラパラメータＣＰが入力され、既知の固定のカメラパラメータＣＰ１が出力されることとなる。 As a result, if the zoom value and focus value included in the camera parameters CP are not values that directly indicate the angle of view of the zoom lens and the focused distance, respectively, they are converted into the shooting angle of view θ and the focus distance f using pre-set conversion information stored in the conversion information storage unit 21. In this case, if the camera position is fixed and does not change, the known fixed camera parameters CP are input, and the known fixed camera parameters CP1 are output.

尚、カメラパラメータ受信部１１は、カメラパラメータＣＰに含まれるズーム値及びフォーカス値が、それぞれズームレンズの画角及びフォーカスの合う距離を直接示す値である場合、ズーム値及びフォーカス値をそれぞれ撮影画角θ及びフォーカス距離ｆとして、カメラパラメータＣＰ１を生成する。 When the zoom value and focus value included in the camera parameter CP directly indicate the angle of view of the zoom lens and the focused distance, respectively, the camera parameter receiving unit 11 generates the camera parameter CP1 by setting the zoom value and focus value to the shooting angle of view θ and the focus distance f, respectively.

（座標変換部１３） (Coordinate conversion unit 13)
次に、図３に示した座標変換部１３の処理の詳細について説明する。図６は、座標変換部１３の処理例を示すフローチャートであり、図４に示したステップＳ４０３に対応している。 Next, the processing of the coordinate conversion unit 13 shown in FIG. 3 is described in detail. FIG. 6 is a flowchart showing an example of the processing of the coordinate conversion unit 13, and corresponds to step S403 shown in FIG. 4.

座標変換部１３は、カメラパラメータ補間部１２からカメラパラメータＣＰ２を入力する（ステップＳ６０１）。そして、座標変換部１３は、座標情報記憶部２２から、表示要素毎の世界座標値を読み出す（ステップＳ６０２）。 The coordinate conversion unit 13 inputs the camera parameters CP2 from the camera parameter interpolation unit 12 (step S601). The coordinate conversion unit 13 then reads out the world coordinate values for each display element from the coordinate information storage unit 22 (step S602).

図１２は、座標情報記憶部２２のデータ構成例を示す図である。座標情報記憶部２２には、表示要素の識別子である表示要素ＩＤ、表示要素の名称、及び当該表示要素の世界座標系の位置を示す世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）が組となって、複数の組のデータが格納されている。これらの表示要素は、カメラが撮影可能な領域内に存在するランドマーク等の要素である。 FIG. 12 is a diagram showing an example of the data configuration of the coordinate information storage unit 22. The coordinate information storage unit 22 stores a plurality of sets of data, each set consisting of a display element ID that identifies a display element, the name of the display element, and world coordinate values (X_w, Y_w, Z_w) that indicate the position of the display element in the world coordinate system. These display elements are elements such as landmarks that exist within the area that the camera can capture.

例えば、表示要素ＩＤ＝１に対応して、ランドマークの表示要素の名称「旭川駅」及びその世界座標値（Ｘ_w1，Ｙ_w1，Ｚ_w1）が格納されている。これらのデータは予め設定され、ユーザにより変更することができる。 For example, the name of the landmark display element "Asahikawa Station" and its world coordinate values (X_w1, Y_w1, Z_w1) are stored for display element ID = 1. These data are set in advance and can be changed by the user.

図６に戻って、座標変換部１３は、後述するステップＳ６０３～Ｓ６０５において、表示要素毎に、カメラパラメータＣＰ２を用いて当該表示要素の世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）を画像座標値（ｘ_i，ｙ_i）に変換する。ここで、世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）を画像座標値（ｘ_i，ｙ_i）に変換するためには、世界座標系及び画像座標系を定義する必要がある。 Returning to FIG. 6, in steps S603 to S605 described later, the coordinate conversion unit 13 converts, for each display element, the world coordinate values (X_w, Y_w, Z_w) of that display element into image coordinate values (x_i, y_i) using the camera parameters CP2. To convert the world coordinate values (X_w, Y_w, Z_w) into image coordinate values (x_i, y_i), the world coordinate system and the image coordinate system must first be defined.

図７は、座標変換部１３の処理例を説明する図である。（１）は、座標を経度、緯度及び標高で表した世界測地系を示し、（２）は、座標を世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）で表した世界座標系を示し、（３）は、座標を画像座標値（ｘ_i，ｙ_i）で表した画像座標系を示している。 FIG. 7 is a diagram explaining an example of the processing by the coordinate conversion unit 13. (1) shows the world geodetic system, in which coordinates are expressed by longitude, latitude, and altitude; (2) shows the world coordinate system, in which coordinates are expressed by world coordinate values (X_w, Y_w, Z_w); and (3) shows the image coordinate system, in which coordinates are expressed by image coordinate values (x_i, y_i).

例えば、世界座標系は、カメラの視点の位置を原点にとり、原点を経線の東回り方向を＋Ｘ軸、緯線の北回り方向を＋Ｙ軸、地平面に対して垂直かつ上向きの軸を＋Ｚ軸とし、Ｘ，Ｙ，Ｚの順に右手系をなすように定義する。 For example, the world coordinate system is defined with the camera's viewpoint position as its origin, the eastward direction of the meridian from the origin as the +X axis, the northward direction of the latitude as the +Y axis, and the axis perpendicular to the horizon and pointing upward as the +Z axis, with the order X, Y, Z forming a right-handed system.

また、カメラ座標系を定義する。すなわち、カメラ座標系は、光軸方向を＋ｚ軸とし、カメラ座標系の原点が世界座標系の原点と一致し、かつ各軸の回転がない場合、カメラ座標系＋ｘ軸と世界座標系＋Ｘ軸、カメラ座標系＋ｙ軸と世界座標系－Ｚ軸、カメラ座標系＋ｚ軸と世界座標系＋Ｙ軸が一致する関係にあり、ｘ，ｙ，ｚの順に右手系をなすように定義する。 The camera coordinate system is also defined. That is, the camera coordinate system is defined so that the optical axis direction is the +z axis, the origin of the camera coordinate system coincides with the origin of the world coordinate system, and when there is no rotation of the axes, the +x axis of the camera coordinate system coincides with the +X axis of the world coordinate system, the +y axis of the camera coordinate system coincides with the -Z axis of the world coordinate system, and the +z axis of the camera coordinate system coincides with the +Y axis of the world coordinate system, forming a right-handed system in the order of x, y, z.
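
The axis correspondence stated above for the unrotated camera (+x with +X, +y with -Z, +z with +Y) can be written as a small mapping. This is only a sketch of that stated relationship; the function names are hypothetical, and the cross-product check is added here to confirm that the resulting camera axes are right-handed.

```python
def world_to_camera_axes(p_world):
    """Re-express a world-frame vector (X, Y, Z) in the axes of an unrotated camera."""
    X, Y, Z = p_world
    # Camera +x = world +X, camera +y = world -Z, camera +z = world +Y.
    return (X, -Z, Y)


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


# A point due north of the camera lies on the optical axis (+z).
north = world_to_camera_axes((0, 1, 0))
# A point straight above the camera lies in the -y direction (upward in the image).
up = world_to_camera_axes((0, 0, 1))

# Camera axes expressed in world coordinates: x = +X, y = -Z, z = +Y.
x_axis, y_axis, z_axis = (1, 0, 0), (0, 0, -1), (0, 1, 0)
is_right_handed = cross(x_axis, y_axis) == z_axis
```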

そうすると、図７に示すように、例えばランドマークの表示要素「旭川駅」の位置を、×印を〇で囲んだマークで示した場合、（１）の世界測地系における中央部の当該マークの位置に「旭川駅」があり、これに対応する（２）の世界座標系における世界座標値は（Ｘ_w1，Ｙ_w1，Ｚ_w1）となり、（３）の画像座標系における画像座標値は（ｘ_i1，ｙ_i1）となる。 Then, as shown in FIG. 7, if the position of the landmark display element "Asahikawa Station" is indicated by a mark consisting of an X enclosed in a circle, "Asahikawa Station" is located at the position of that mark in the center of the world geodetic system (1), the corresponding world coordinate values in the world coordinate system (2) are (X_w1, Y_w1, Z_w1), and the image coordinate values in the image coordinate system (3) are (x_i1, y_i1).

このランドマークの表示要素「旭川駅」については、座標変換部１３により、世界座標値（Ｘ_w1，Ｙ_w1，Ｚ_w1）が画像座標値（ｘ_i1，ｙ_i１）に変換される。 For the landmark display element "Asahikawa Station", the coordinate conversion unit 13 converts the world coordinate values (X_w1, Y_w1, Z_w1) into the image coordinate values (x_i1, y_i1).

図６に戻って、座標変換部１３は、ステップＳ６０２の後、カメラパラメータＣＰ２から、パン角α、チルト角δ、ロール角φ、撮影画角θ及びフォーカス距離ｆ及びカメラ位置ｔ_wを抽出する（ステップＳ６０３）。 Returning to FIG. 6, after step S602, the coordinate conversion unit 13 extracts the pan angle α, the tilt angle δ, the roll angle φ, the shooting angle of view θ, the focus distance f, and the camera position t_w from the camera parameters CP2 (step S603).

座標変換部１３は、ｙ軸の回転角をパン角α[rad]、ｘ軸の回転角をチルト角δ[rad]、ｚ軸の回転角をロール角φ[rad]として、以下の式により、パン角α、チルト角δ及びロール角φを用いて、姿勢Ｒを算出する（ステップＳ６０４）。

The coordinate conversion unit 13 defines the rotation angle about the y-axis as the pan angle α [rad], the rotation angle about the x-axis as the tilt angle δ [rad], and the rotation angle about the z-axis as the roll angle φ [rad], and calculates the attitude R using the pan angle α, tilt angle δ, and roll angle φ according to the following equation (step S604).
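
The attitude R can be assembled from the three elementary rotations named above. The elementary matrices below are the standard right-handed forms for rotation about the y, x, and z axes. The multiplication order (pan, then tilt, then roll) and the sign conventions are assumptions made here for illustration, since the text at hand does not state them.

```latex
R_y(\alpha) = \begin{pmatrix} \cos\alpha & 0 & \sin\alpha \\ 0 & 1 & 0 \\ -\sin\alpha & 0 & \cos\alpha \end{pmatrix},
\quad
R_x(\delta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\delta & -\sin\delta \\ 0 & \sin\delta & \cos\delta \end{pmatrix},
\quad
R_z(\phi) = \begin{pmatrix} \cos\phi & -\sin\phi & 0 \\ \sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{pmatrix}

R = R_y(\alpha)\, R_x(\delta)\, R_z(\phi) \quad \text{(assumed order)}
```

With α = δ = φ = 0, R reduces to the identity matrix, which corresponds to the unrotated camera described in the coordinate system definitions.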

ここで、世界座標系におけるカメラ座標系の原点の位置（カメラの視点位置）であるカメラ位置ｔ_wが以下の式で表されるものとする。

Here, the camera position t_w, which is the position of the origin of the camera coordinate system in the world coordinate system (the viewpoint position of the camera), is expressed by the following equation.

座標変換部１３は、以下の式により、表示要素毎に、フォーカス距離ｆ、姿勢Ｒ及びカメラ位置ｔ_wを用いて、世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）を画像座標値（ｘ_i，ｙ_i）に変換する（ステップＳ６０５）。 The coordinate conversion unit 13 converts, for each display element, the world coordinate values (X_w, Y_w, Z_w) into image coordinate values (x_i, y_i) according to the following equation, using the focus distance f, the attitude R, and the camera position t_w (step S605).

画像座標系は、画像左上を原点として水平右方向を＋ｘ軸、垂直下方向を＋ｙ軸とし、横・縦方向のピクセル数の半値をμ₀，ν₀とし、フォーカス距離ｆはピクセル単位の値とする。このように、座標変換部１３により、フォーカス距離ｆ、姿勢Ｒ、カメラ位置ｔ_w及び世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）から画像座標値（ｘ_i，ｙ_i）が算出される。 The image coordinate system has its origin at the upper left corner of the image, with the horizontal right direction as the +x axis and the vertical down direction as the +y axis; μ₀ and ν₀ are half the number of pixels in the horizontal and vertical directions, and the focus distance f is a value in pixels. In this way, the coordinate conversion unit 13 calculates the image coordinate values (x_i, y_i) from the focus distance f, the attitude R, the camera position t_w, and the world coordinate values (X_w, Y_w, Z_w).
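
The conversion in step S605 can be illustrated with a standard pinhole projection. The sketch below is written under stated assumptions and is not the equation from the patent: it assumes that R rotates vectors from the camera frame into the unrotated camera axes (so its transpose is applied), that points behind the camera are discarded, and that the function name `project` and all numeric values are hypothetical.

```python
def project(p_world, t_w, R, f, mu0, nu0):
    """Project a world point to pixel coordinates with a pinhole model (sketch)."""
    # Offset from the camera position, still expressed in world axes.
    dX, dY, dZ = (p - t for p, t in zip(p_world, t_w))
    # Unrotated camera axes: +x = +X, +y = -Z, +z = +Y.
    v = (dX, -dZ, dY)
    # Apply the inverse attitude R^T: component c is the sum over r of R[r][c] * v[r].
    xc, yc, zc = (sum(R[r][c] * v[r] for r in range(3)) for c in range(3))
    if zc <= 0:
        return None  # behind the camera; this handling is an assumption
    # Perspective division, then shift the origin to the upper left corner.
    return (f * xc / zc + mu0, f * yc / zc + nu0)


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
# Hypothetical 1920x1080 image (mu0 = 960, nu0 = 540) and f = 1000 pixels.
pixel = project((10, 100, 5), (0, 0, 0), IDENTITY, 1000, 960, 540)
```

In this example the point lies slightly east of and above the optical axis, so it lands to the right of and above the image center.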

座標変換部１３は、表示要素毎の画像座標値（ｘ_i，ｙ_i）及びカメラパラメータＣＰ２を合成用フレーム生成部１４に出力する（ステップＳ６０６）。 The coordinate conversion unit 13 outputs the image coordinate values (x_i, y_i) for each display element and the camera parameters CP2 to the synthesis frame generation unit 14 (step S606).

これにより、座標情報記憶部２２に格納された表示要素毎の世界座標値（Ｘ_w，Ｙ_w，Ｚ_w）が、画像座標値（ｘ_i，ｙ_i）に変換される。 As a result, the world coordinate values (X_w, Y_w, Z_w) for each display element stored in the coordinate information storage unit 22 are converted into image coordinate values (x_i, y_i).

（合成用フレーム生成部１４） (Synthesis frame generation unit 14)
次に、図３に示した合成用フレーム生成部１４の処理について詳細に説明する。図８は、合成用フレーム生成部１４の処理例を示すフローチャートであり、図４に示したステップＳ４０４に対応している。 Next, the processing of the synthesis frame generation unit 14 shown in FIG. 3 is described in detail. FIG. 8 is a flowchart showing an example of the processing of the synthesis frame generation unit 14, and corresponds to step S404 shown in FIG. 4.

合成用フレーム生成部１４は、座標変換部１３から、表示要素毎の画像座標値及びカメラパラメータＣＰ２を入力する（ステップＳ８０１）。そして、合成用フレーム生成部１４は、ＣＧ記憶部２３から、表示要素毎のＣＧを読み出す（ステップＳ８０２）。 The compositing frame generation unit 14 inputs the image coordinate values and camera parameters CP2 for each display element from the coordinate conversion unit 13 (step S801). The compositing frame generation unit 14 then reads out the CG for each display element from the CG storage unit 23 (step S802).

図１３は、ＣＧ記憶部２３のデータ構成例を示す図である。ＣＧ記憶部２３には、表示要素の識別子である表示要素ＩＤ、及び当該表示要素のＣＧが組となって、複数の組のデータが格納されている。 Figure 13 is a diagram showing an example of the data structure of the CG storage unit 23. The CG storage unit 23 stores multiple sets of data, each set consisting of a display element ID, which is an identifier of a display element, and the CG of that display element.

例えば、表示要素ＩＤ＝１に対応して、ランドマークの表示要素の名称「旭川駅」（図１２を参照）をグラフィック化したＣＧが格納されている。これらのデータは予め設定され、ユーザにより変更することができる。 For example, a CG that graphically represents the name of the landmark display element "Asahikawa Station" (see Figure 12) is stored in correspondence with display element ID = 1. This data is preset and can be changed by the user.

図８に戻って、合成用フレーム生成部１４は、地図画像記憶部２４から、地図画像を読み出す（ステップＳ８０３）。地図画像は、例えばカメラの位置を中央に配置した地図の画像である。 Returning to FIG. 8, the synthesis frame generation unit 14 reads a map image from the map image storage unit 24 (step S803). The map image is, for example, an image of a map with the camera position positioned in the center.

合成用フレーム生成部１４は、カメラパラメータＣＰ２から、パン角α及び撮影画角θを抽出する（ステップＳ８０４）。そして、合成用フレーム生成部１４は、パン角αからカメラの向きα１を求め、撮影画角θから、カメラの位置を基準とした撮影領域Ｓの２辺のなす角θ１を求める（後述する図９の右側を参照）（ステップＳ８０５）。 The synthesis frame generation unit 14 extracts the pan angle α and the shooting angle of view θ from the camera parameters CP2 (step S804). The synthesis frame generation unit 14 then determines the camera direction α1 from the pan angle α, and determines the angle θ1 between two sides of the shooting area S based on the camera position from the shooting angle of view θ (see the right side of Figure 9 described later) (step S805).

合成用フレーム生成部１４は、地図画像内のカメラの位置、カメラの向きα１及び撮影領域Ｓのなす角θ１から地図画像内の撮影領域Ｓを求め、撮影領域Ｓを地図画像内に配置することで、ＰinＰ画像を生成する（ステップＳ８０６）。地図画像内のカメラの位置は予め設定されているため（例えば地図画像内の中央位置に設定されているため）、カメラの向きα１及びなす角θ１から、地図画像において撮影領域Ｓを特定することができる。 The synthesis frame generation unit 14 determines the shooting area S in the map image from the camera position in the map image, the camera direction α1, and the angle θ1 of the shooting area S, and generates a PinP image by arranging the shooting area S in the map image (step S806). Since the camera position in the map image is set in advance (for example, because it is set to the center position in the map image), the shooting area S can be identified in the map image from the camera direction α1 and the angle θ1.
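
One way to picture step S806 is to build the shooting area S as a triangle whose apex is the camera position on the map. The sketch below assumes a north-up map in pixel coordinates (+x right, +y down), a camera direction α1 given as a bearing in degrees clockwise from north, and a fixed side length in pixels. These conventions and the function name `shooting_area` are assumptions for illustration, and the mapping from the pan angle α to α1 is not covered here.

```python
import math


def shooting_area(cam_xy, alpha1_deg, theta1_deg, radius_px):
    """Return the three vertices of a triangular shooting area on the map image."""
    cx, cy = cam_xy
    vertices = [(cx, cy)]  # apex at the camera position
    # The two sides lie half of theta1 to either side of the camera direction.
    for bearing in (alpha1_deg - theta1_deg / 2, alpha1_deg + theta1_deg / 2):
        b = math.radians(bearing)
        # North is up (-y) and east is right (+x) in map pixel coordinates.
        vertices.append((cx + radius_px * math.sin(b), cy - radius_px * math.cos(b)))
    return vertices


# Camera at the map center, facing north with a 90 degree angle between the sides.
area = shooting_area((200, 200), alpha1_deg=0, theta1_deg=90, radius_px=100)
```

The resulting polygon could then be filled with a semi-transparent color or outlined on the map image.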

ここで、ＰinＰ画像は、カメラの視点位置から投影される撮影領域Ｓを地図画像に表した画像である。撮影領域Ｓは３次元的にはカメラの姿勢及びセンサ形状及び画角によって変化する四角錘となるが、地図上に示す場合は、四角錘を２次元の地図に透視投影または平行投影したものでもよいし、レンズの水平画角から計算される三角形としてもよい。 Here, a PinP image is an image that shows the shooting area S projected from the viewpoint position of the camera on a map image. In three dimensions, the shooting area S is a quadrangular pyramid that changes depending on the camera's attitude, sensor shape, and angle of view, but when shown on a map, the quadrangular pyramid may be a perspective or parallel projection onto a two-dimensional map, or it may be a triangle calculated from the horizontal angle of view of the lens.

地図画像の地図がメルカトル図法等の３次元の球体モデルを投影した図法でない場合、図法による歪みを撮影領域Ｓに反映するようにしてもよい。撮影領域Ｓの表示方法は、その内部をユーザにより予め設定される色でアルファブレンディングしてもよいし、その境界となる線を示してもよい。 If the map of the map image is not a projection of a three-dimensional spherical model such as the Mercator projection, the distortion due to the projection may be reflected in the shooting area S. The shooting area S may be displayed by alpha blending the interior with a color preset by the user, or by showing the lines that form its boundaries.

撮影領域Ｓにおけるカメラの位置（視点位置）は、ＰinＰ画像内の地図の中心部としてもよいし、任意の位置に設定してもよく、撮影方向を示すカメラのアイコン画像を重ねて表示してもよい。また、ＰinＰ画像において、後述するステップＳ８０７のように、方角、ランドマーク等の表示要素の画像座標値の位置に、そのＣＧを表示してもよい。 The camera position (viewpoint position) in the shooting area S may be the center of the map in the PinP image, or may be set at any position, and a camera icon image indicating the shooting direction may be displayed superimposed. Also, in the PinP image, CG may be displayed at the position of the image coordinate values of display elements such as directions and landmarks, as in step S807 described below.

合成用フレーム生成部１４は、ステップＳ８０１にて入力した表示要素毎の画像座標値の箇所に、ステップＳ８０２にて読み出したそれぞれのＣＧを配置すると共に、ステップＳ８０６にて生成したＰinＰ画像を、予め設定された画像座標値の箇所に配置することで、合成用フレームを生成する（ステップＳ８０７）。合成用フレームにＰinＰ画像を配置するか否かは、ユーザにより予め設定されるものとする。 The compositing frame generating unit 14 generates a compositing frame by placing each CG read out in step S802 at the image coordinate values for each display element input in step S801, and placing the PinP image generated in step S806 at the image coordinate values set in advance (step S807). It is assumed that the user sets in advance whether or not to place the PinP image in the compositing frame.

図９は、合成用フレームの例を示す図である。この合成用フレームには、ランドマークの表示要素「旭川駅」のＣＧが、画像座標値（ｘ_i1，ｙ_i1）の箇所に配置され、撮影領域Ｓを地図画像に表した縮尺の異なる２つのＰinＰ画像が、予め設定された画像座標値の箇所に配置されている。 FIG. 9 is a diagram showing an example of a synthesis frame. In this synthesis frame, the CG of the landmark display element "Asahikawa Station" is placed at the image coordinate values (x_i1, y_i1), and two PinP images with different scales, each showing the shooting area S on a map image, are placed at preset image coordinate values.

この例では、撮影領域Ｓの頂点位置にカメラが設置されており、このカメラの位置を基準に、カメラの向き（光軸）α１の点線（図９の右側を参照）を中央にした２辺のなす角θ１により、撮影領域Ｓが設定されている。 In this example, the camera is located at the apex of the shooting area S. Taking this camera position as the reference, the shooting area S is defined by the angle θ1 between two sides that are centered on the dotted line indicating the camera direction (optical axis) α1 (see the right side of FIG. 9).

尚、この例は、縮尺の異なる２つのＰinＰ画像が配置されているが、１つのＰinＰ画像が配置されるようにしてもよいし、縮尺の異なる３つ以上のＰinＰ画像が配置されるようにしてもよい。また、合成用フレーム内のＰinＰ画像のサイズ及び位置は、ユーザにより自由に設定され、変更することができる。 In this example, two PinP images with different scales are placed, but one PinP image may be placed, or three or more PinP images with different scales may be placed. Also, the size and position of the PinP image within the composition frame can be freely set and changed by the user.

図１０は、ＰinＰ画像の例を示す図であり、縮尺を自由に変化させた場合の例を示している。ＰinＰ画像を生成する際に用いる地図画像は、例えばユーザにより予め設定された縮尺に応じて拡大または縮小する。そして、図１０に示すように、撮影領域Ｓを、拡大または縮小後の地図画像に表したＰinＰ画像が生成される。予め設定された縮尺は、ユーザにより変更することができる。 Figure 10 shows an example of a PinP image, where the scale can be freely changed. The map image used to generate the PinP image is enlarged or reduced according to a scale preset by the user, for example. Then, as shown in Figure 10, a PinP image is generated in which the shooting area S is displayed on the enlarged or reduced map image. The preset scale can be changed by the user.

つまり、合成用フレーム生成部１４は、図８のステップＳ８０３にて地図画像記憶部２４から読み出した地図画像を、予め設定された縮尺に応じて拡大または縮小し（または、地図画像記憶部２４から予め設定された縮尺の地図画像を読み出し）、その縮尺の地図画像を、ステップＳ８０６にてＰinＰ画像を生成する際に用いる。 In other words, the synthesis frame generation unit 14 enlarges or reduces the map image read from the map image storage unit 24 in step S803 of FIG. 8 according to a preset scale (or reads a map image of a preset scale from the map image storage unit 24), and uses the map image of that scale when generating the PinP image in step S806.

尚、地図画像の縮尺は、カメラパラメータＣＰ２に含まれる撮影画角θまたはフォーカス距離ｆに基づいて変化させるようにしてもよい。例えば合成用フレーム生成部１４は、カメラパラメータＣＰ２から抽出した撮影画角θのしきい値判定により、撮影画角θが小さいほど小縮尺となるような縮尺を設定し、撮影画角θが大きいほど大縮尺となるような縮尺を設定する。また、合成用フレーム生成部１４は、カメラパラメータＣＰ２から抽出したフォーカス距離ｆのしきい値判定により、フォーカス距離ｆが長いほど小縮尺となるような縮尺を設定し、フォーカス距離ｆが短いほど大縮尺となるような縮尺を設定する。 The scale of the map image may also be changed based on the shooting angle of view θ or the focus distance f included in the camera parameters CP2. For example, by comparing the shooting angle of view θ extracted from the camera parameters CP2 against thresholds, the synthesis frame generation unit 14 sets a smaller scale the smaller the shooting angle of view θ is, and a larger scale the larger the shooting angle of view θ is. Likewise, by comparing the focus distance f extracted from the camera parameters CP2 against thresholds, the synthesis frame generation unit 14 sets a smaller scale the longer the focus distance f is, and a larger scale the shorter the focus distance f is.
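
The threshold judgment on the shooting angle of view θ can be sketched as a short lookup. The threshold angles, the scale denominators, and the function name `map_scale_denominator` are illustrative assumptions; the original text does not give concrete values.

```python
def map_scale_denominator(theta_deg, thresholds=((10.0, 50000), (30.0, 25000)),
                          default=10000):
    """Return N for a map scale of 1:N chosen from the shooting angle of view.

    A narrower angle of view selects a smaller scale (larger N, wider area shown);
    a wider angle of view selects a larger scale (smaller N, more detail).
    """
    for limit_deg, denominator in thresholds:
        if theta_deg < limit_deg:
            return denominator
    return default


telephoto_scale = map_scale_denominator(5.0)
wide_scale = map_scale_denominator(45.0)
```

A similar function keyed on the focus distance f would return a smaller scale for longer distances.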

図８に戻って、合成用フレーム生成部１４は、ステップＳ８０７の後、合成用フレームを映像フレーム合成装置３へ出力する（ステップＳ８０８）。 Returning to FIG. 8, after step S807, the synthesis frame generation unit 14 outputs the synthesis frame to the video frame synthesis device 3 (step S808).

これにより、表示要素毎の画像座標値、カメラパラメータＣＰ２、ＣＧ記憶部２３に格納された表示要素毎のＣＧ及び地図画像記憶部２４に格納された地図画像を用いて、表示要素毎のＣＧ及び撮影領域Ｓを地図画像に表したＰinＰ画像が所定の画像座標値の箇所に配置され、合成用フレームが生成される。 As a result, using the image coordinate values for each display element, the camera parameters CP2, the CG for each display element stored in the CG storage unit 23, and the map image stored in the map image storage unit 24, the CG for each display element and a PinP image showing the shooting area S on the map image are placed at a location with a specified image coordinate value, and a synthesis frame is generated.

このように、図３に示した映像合成用フレーム生成装置２は、カメラ映像のフレームに同期した合成用フレームを毎フレーム連続で生成することで、カメラのパン、チルトまたはズーム等の動きに応じて、表示要素毎のＣＧ及びＰinＰ画像内の撮影領域Ｓが追従する合成用フレームを生成することができる。 In this way, by continuously generating, for every frame, a synthesis frame synchronized with the frames of the camera video, the video synthesis frame generation device 2 shown in FIG. 3 can produce synthesis frames in which the CG for each display element and the shooting area S in the PinP image follow camera movements such as panning, tilting, or zooming.

In this case, the video synthesis frame generating device 2 performs the following processes in series: step S401 by the camera parameter receiving unit 11, step S402 by the camera parameter interpolation unit 12, step S403 by the coordinate conversion unit 13, and step S404 by the synthesis frame generation unit 14.

That is, the video synthesis frame generating device 2 executes steps S401 to S404 shown in FIG. 4 in an infinite loop and keeps outputting synthesis frames: it inputs the camera parameters CP, uses interpolation to increase the number of camera parameters CP2 so that they are synchronized with the frame rate of the camera video, and generates a synthesis frame for each frame. Put differently, the video synthesis frame generating device 2 serially processes each camera parameter CP2 corresponding to the input camera parameters CP, that is, each frame of the camera video. Alternatively, the video synthesis frame generating device 2 may perform parallel processing instead of the serial processing shown in FIG. 4.

図１５は、実施例１における映像合成用フレーム生成装置２の他の処理例を示すフローチャートであり、図４に示した直列処理に代わる並列処理の例を示している。図１５に示すステップＳ１５０１～Ｓ１５０５は、それぞれ図４に示したステップＳ４０１～Ｓ４０５に対応している。 FIG. 15 is a flowchart showing another example of processing by the video synthesis frame generating device 2 in the first embodiment, and shows an example of parallel processing instead of the serial processing shown in FIG. 4. Steps S1501 to S1505 shown in FIG. 15 correspond to steps S401 to S405 shown in FIG. 4, respectively.

The video synthesis frame generating device 2 performs the processes of steps S1501 to S1505 in parallel, and starts each step as soon as the process of the preceding step is completed. That is, the components of the video synthesis frame generating device 2, from the camera parameter receiving unit 11 to the synthesis frame generation unit 14, execute their processes in parallel; after completing a process and outputting its data, each component inputs the next data from the preceding component or the like and continues processing.
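As a non-limiting illustration, the parallel arrangement can be sketched as a pipeline in which each component runs in its own thread and hands its output to the next component through a queue. The names `run_stage` and `run_pipeline`, the use of threads and FIFO queues, and the stop sentinel are assumptions made for this sketch; the description does not specify how the parallelism is implemented.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def run_stage(func, inbox, outbox):
    """Take items from inbox one by one, process them, and pass them on."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(stage_funcs, inputs):
    """Run every stage in its own thread, connected by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(stage_funcs) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(stage_funcs)
    ]
    for t in threads:
        t.start()
    for item in inputs:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is _STOP:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

In such a sketch, the stage functions would stand in for reception, interpolation, coordinate conversion, and synthesis frame generation. Because each stage is a single worker reading a FIFO queue, frames leave the pipeline in the order they entered. The sketch shows only the structure of the data flow; how much actual concurrency is obtained depends on the runtime.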

(Video frame synthesis device 3 / Example 1)
Next, the video frame synthesis device 3 in Example 1 shown in FIG. 1 will be described in detail. FIG. 16 is a block diagram showing a configuration example of the video frame synthesis device 3 in Example 1. The video frame synthesis device 3 includes a video delay unit 30 and a frame synthesis unit 31.

映像遅延部３０は、カメラ映像を入力し、カメラ映像のフレームが後述するフレーム合成部３１において合成用フレームに同期するように、予め設定された遅延量だけカメラ映像を遅延させ、遅延後のカメラ映像をフレーム合成部３１に出力する。これにより、カメラ映像のフレームのタイミングが、映像合成用フレーム生成装置２により生成された合成用フレームのタイミングに合致するように、カメラ映像の遅延量が調整される。 The video delay unit 30 inputs the camera video, delays the camera video by a preset delay amount so that the camera video frames are synchronized with the synthesis frames in the frame synthesis unit 31 described below, and outputs the delayed camera video to the frame synthesis unit 31. This adjusts the delay amount of the camera video so that the timing of the camera video frames matches the timing of the synthesis frames generated by the video synthesis frame generation device 2.

Suppose that a frame of the camera video and the camera parameters CP, synchronized at a certain time, are input to the video frame synthesis device 3 and the video synthesis frame generating device 2, respectively. If the camera parameter interpolation unit 12 of the video synthesis frame generating device 2 needs to sample future camera parameters CP2, the input time is treated as the future, and a delay corresponding to the amount of sampling therefore arises for the frames of the camera video.

このため、映像遅延部３０により、この遅延量も含めた図３に示したフレーム処理部１０の処理によって決定される合成用フレームの遅延量に応じて、カメラ映像が遅延することとなる。 Therefore, the video delay unit 30 delays the camera video according to the delay amount of the synthesis frame determined by the processing of the frame processing unit 10 shown in Figure 3, which includes this delay amount.

The delay amount can be set freely by the user and may be set to 0. For example, the user measures the processing time of the frame processing unit 10 shown in FIG. 3, from the input of the camera parameters CP to the output of the synthesis frame, and sets this processing time as the delay amount.
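As a non-limiting illustration, a fixed delay of this kind can be sketched as a first-in, first-out buffer of frames. The class name `VideoDelay`, the helper `delay_in_frames`, and the choice to express the delay as a whole number of frames (rounded up from the measured processing time) are assumptions made for this sketch.

```python
import math
from collections import deque


class VideoDelay:
    """Delay a stream of frames by a fixed number of frames."""

    def __init__(self, delay_frames: int):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self._delay = delay_frames
        self._buffer = deque()

    def push(self, frame):
        """Accept the newest frame and return the frame from delay_frames
        earlier, or None while the buffer is still filling."""
        self._buffer.append(frame)
        if len(self._buffer) > self._delay:
            return self._buffer.popleft()
        return None


def delay_in_frames(processing_time_s: float, frame_rate_hz: float) -> int:
    """Convert a measured processing time into a whole number of frames,
    rounding up (hypothetical policy); the small epsilon guards against
    floating-point noise on exact multiples."""
    return max(0, math.ceil(processing_time_s * frame_rate_hz - 1e-9))
```

With a delay of 0 the buffer passes each frame straight through, which corresponds to the case where the delay amount is set to 0.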

フレーム合成部３１は、映像遅延部３０から遅延後のカメラ映像を入力すると共に、映像合成用フレーム生成装置２から合成用フレームを入力する。前述のとおり、フレーム合成部３１が入力する遅延後のカメラ映像のフレームと合成用フレームとは同期しており、遅延後のカメラ映像のフレームのタイミングが合成用フレームのタイミングに合致している。 The frame synthesis unit 31 inputs the delayed camera video from the video delay unit 30, and inputs the synthesis frames from the video synthesis frame generation device 2. As described above, the delayed camera video frames and synthesis frames input by the frame synthesis unit 31 are synchronized, and the timing of the delayed camera video frames matches the timing of the synthesis frames.

The frame synthesis unit 31 synthesizes the synthesis frame onto the frame of the delayed camera video to generate a composite image, for example the one shown in FIG. 2, and outputs the composite image.
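As a non-limiting illustration, the synthesis of a synthesis frame onto a camera frame can be sketched as a per-pixel alpha blend. The description does not state the blending method; alpha compositing, the RGBA layout of the synthesis frame, and the function names below are assumptions made for this sketch.

```python
def composite_pixel(camera_rgb, overlay_rgba):
    """Blend one RGBA overlay pixel over one RGB camera pixel (8-bit values)."""
    r, g, b, a = overlay_rgba
    alpha = a / 255
    return tuple(
        round(alpha * o + (1 - alpha) * c)
        for o, c in zip((r, g, b), camera_rgb)
    )


def composite_frame(camera_frame, overlay_frame):
    """Blend an overlay frame over a camera frame of the same size.

    Both frames are lists of rows; camera pixels are (R, G, B) tuples and
    overlay pixels are (R, G, B, A) tuples.
    """
    if len(camera_frame) != len(overlay_frame):
        raise ValueError("frames must have the same number of rows")
    result = []
    for cam_row, ovl_row in zip(camera_frame, overlay_frame):
        if len(cam_row) != len(ovl_row):
            raise ValueError("rows must have the same number of pixels")
        result.append([composite_pixel(c, o) for c, o in zip(cam_row, ovl_row)])
    return result
```

Pixels of the synthesis frame that carry no CG or PinP image would have an alpha of 0 and leave the camera video unchanged.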

(Examples of composite images before and after camera operation)
Next, examples of the composite image output by the video frame synthesis device 3 in the video synthesis system 1 shown in FIG. 1 when the camera parameters CP are changed by the cameraman operating the camera will be described. The examples of composite images shown in FIGS. 17 to 19, described below, also apply to Example 2, described later.

図１７は、パン操作が行われる前後の合成映像の例を示す図である。左側の図は、パン操作が行われる前の合成映像を示し、右側の図は、ランドマークの表示要素「旭川駅」に対して左方向へパン操作が行われた後の合成映像を示している。 Figure 17 shows an example of a composite image before and after a panning operation. The image on the left shows the composite image before a panning operation is performed, and the image on the right shows the composite image after a panning operation is performed to the left on the landmark display element "Asahikawa Station."

図１７から、左方向のパン操作により、合成映像内に配置されたランドマークの表示要素「旭川駅」のＣＧの位置が、合成映像内で左から右へ移動していることがわかる。 From Figure 17, we can see that panning to the left causes the CG position of the landmark display element "Asahikawa Station" placed in the composite image to move from left to right within the composite image.

また、左方向のパン操作により、合成映像の右下に配置された２つのＰinＰ画像内の撮影領域Ｓの向き（カメラの向きα１（図９を参照））が、ＰinＰ画像内で左下方向から下方向へ移動していることがわかる。 It can also be seen that panning to the left causes the orientation of the shooting area S (camera orientation α1 (see Figure 9)) in the two PinP images located at the bottom right of the composite video to move from the bottom left to the bottom within the PinP images.

In this way, the CG of the landmark display element "Asahikawa Station" and the shooting area S in the PinP image can be made to follow the panning operation of the camera.

図１８は、チルト操作及びズーム操作が行われる前後の合成映像の例を示す図である。左側の図は、チルト操作及びズーム操作が行われる前の合成映像を示し、右側の図は、ランドマークの表示要素「十勝岳」に対してチルト操作及びズーム操作が行われた後の合成映像を示している。 Figure 18 shows an example of a composite image before and after a tilt operation and a zoom operation. The image on the left shows the composite image before a tilt operation and a zoom operation are performed, and the image on the right shows the composite image after a tilt operation and a zoom operation are performed on the landmark display element "Tokachi-dake."

From Figure 18, it can be seen that, as a result of the tilt and zoom operations, the CG of the landmark display elements "Asahikawa Airport" and "Tokachi-dake" displayed in the composite image is reduced to the CG of "Tokachi-dake" alone, and the image of "Tokachi-dake" is displayed enlarged.

また、チルト操作及びズーム操作により、合成映像の右下に配置された２つのＰinＰ画像内の撮影領域Ｓの向き（カメラの向きα１）は変化しないことがわかる。また、ズーム操作により、合成映像の右下に配置された２つのＰinＰ画像内の撮影領域Ｓの幅（なす角θ１（図９を参照））が狭く（なす角θ１が小さく）なっていることがわかる。 It can also be seen that the tilt operation and zoom operation do not change the orientation (camera orientation α1) of the shooting area S in the two PinP images located at the bottom right of the composite video. It can also be seen that the zoom operation narrows the width (angle θ1 (see FIG. 9)) of the shooting area S in the two PinP images located at the bottom right of the composite video (angle θ1 becomes smaller).

In this way, the CG of the landmark display elements "Asahikawa Airport" and "Tokachi-dake", as well as the shooting area S in the PinP image, can be made to follow the tilt and zoom operations of the camera.
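As a non-limiting illustration, a shooting area S characterized by a camera orientation α1 and an angle θ1 can be sketched as a sector polygon on the map image. The conventions used here are assumptions for this sketch only: α1 is treated as a bearing measured clockwise from north, north is taken to be up on the map image, pixel y coordinates grow downward, and the sector radius and function name are arbitrary.

```python
import math


def shooting_area_polygon(cam_xy, alpha1_deg, theta1_deg, radius_px, arc_steps=8):
    """Return the vertices of a sector with its apex at the camera position.

    cam_xy     : (x, y) pixel position of the camera on the map image
    alpha1_deg : direction of the sector's center line (assumed bearing)
    theta1_deg : angular width of the sector
    radius_px  : hypothetical drawing radius in pixels
    """
    cx, cy = cam_xy
    start = alpha1_deg - theta1_deg / 2
    points = [(cx, cy)]
    for k in range(arc_steps + 1):
        bearing = math.radians(start + theta1_deg * k / arc_steps)
        # Clockwise-from-north bearing; image y grows downward, so north is -y.
        points.append(
            (cx + radius_px * math.sin(bearing), cy - radius_px * math.cos(bearing))
        )
    return points
```

Under these assumptions, a pan operation changes only `alpha1_deg` and rotates the sector about the camera position, while a zoom operation changes only `theta1_deg` and narrows or widens it.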

図１９は、ズーム操作に伴いランドマークの表示が切り替えられる前後の合成映像の例を示す図である。左側の図は、ズーム操作が行われる前の合成映像を示し、右側の図は、ズーム操作が行われた後の合成映像を示している。 Figure 19 shows an example of a composite image before and after the landmark display is switched in response to a zoom operation. The image on the left shows the composite image before the zoom operation is performed, and the image on the right shows the composite image after the zoom operation is performed.

図１９から、ズーム操作により、合成映像内に表示されたランドマークの表示要素「旭山」のＣＧが、「旭山動物園」のＣＧに切り替えられていることがわかる。 From Figure 19, we can see that the zoom operation has switched the CG of the landmark display element "Asahiyama" displayed in the composite image to the CG of "Asahiyama Zoo."

また、ズーム操作により、合成映像の右下に配置された２つのＰinＰ画像内の撮影領域Ｓの向き（カメラの向きα１）は変化しないことがわかる。また、ズーム操作により、合成映像の右下に配置された２つのＰinＰ画像内の撮影領域Ｓの幅（なす角θ１）が狭く（なす角θ１が小さく）なっていることがわかる。 It can also be seen that the orientation (camera orientation α1) of the shooting area S in the two PinP images located at the bottom right of the composite video does not change due to the zoom operation. It can also be seen that the width (angle θ1) of the shooting area S in the two PinP images located at the bottom right of the composite video becomes narrower (the angle θ1 becomes smaller) due to the zoom operation.

このように、カメラのズーム操作に対して、ランドマークの表示要素「旭山」のＣＧ及び「旭山動物園」のＣＧ、並びにＰinＰ画像内の撮影領域Ｓを追従させることができる。また、以下に説明する処理により、カメラのズーム操作に応じて、表示要素のＣＧを変化させることができる。 In this way, the CG of the landmark display elements "Asahiyama" and "Asahiyama Zoo", as well as the shooting area S in the PinP image, can be made to follow the zoom operation of the camera. Furthermore, the process described below makes it possible to change the CG of the display elements in response to the zoom operation of the camera.

図１４は、図３に示したＣＧ記憶部２３の他のデータ構成例を示す図であり、図１９に示した合成映像の例のように、ズーム操作に伴いランドマークの表示を切り替える場合に適用がある。 Figure 14 shows another example of the data structure of the CG storage unit 23 shown in Figure 3, and is applicable to cases where the display of landmarks is switched in response to a zoom operation, as in the example of the composite image shown in Figure 19.

The CG storage unit 23 stores a plurality of sets of data, each consisting of a display element ID and the CG of that display element, for the case where the shooting angle of view θ is equal to or greater than a preset threshold θ_a (θ ≧ θ_a). It also stores a plurality of sets of data, each consisting of a display element ID and the CG of that display element, for the case where the shooting angle of view θ is smaller than the threshold θ_a (θ < θ_a).

For example, for display element ID = n in the case where the shooting angle of view θ is equal to or greater than the threshold θ_a (θ ≧ θ_a), CG that graphically renders the landmark display element name "Asahiyama" is stored. For display element ID = n in the case where the shooting angle of view θ is smaller than the threshold θ_a (θ < θ_a), CG that graphically renders the landmark display element name "Asahiyama Zoo" is stored. These data are preset and can be changed by the user.

In this case, in step S802 shown in FIG. 8, the synthesis frame generation unit 14 shown in FIG. 3 extracts the shooting angle of view θ from the camera parameters CP2 and determines whether the shooting angle of view θ is equal to or greater than the preset threshold θ_a or smaller than the threshold θ_a.

When the synthesis frame generation unit 14 determines that the shooting angle of view θ is equal to or greater than the threshold θ_a, it reads out, from the CG storage unit 23 shown in FIG. 14, the CG of the display elements for the case where the shooting angle of view θ is equal to or greater than the threshold θ_a (θ ≧ θ_a). On the other hand, when the synthesis frame generation unit 14 determines that the shooting angle of view θ is smaller than the threshold θ_a, it reads out, from the CG storage unit 23 shown in FIG. 14, the CG of the display elements for the case where the shooting angle of view θ is smaller than the threshold θ_a (θ < θ_a).

The synthesis frame generation unit 14 generates a synthesis frame by, for example, placing each CG read from the CG storage unit 23 at the location of the image coordinate values of the corresponding display element. In the examples of FIGS. 14 and 19, for display element ID = n, when the shooting angle of view θ is equal to or greater than the threshold θ_a (θ ≧ θ_a), a composite image in which the CG of the display element "Asahiyama" is placed is generated, as shown on the left side of FIG. 19. When a zoom operation of the camera then makes the shooting angle of view θ smaller than the threshold θ_a (θ < θ_a), a composite image in which the CG of the display element "Asahiyama Zoo" is placed is generated, as shown on the right side of FIG. 19.

これにより、カメラのズーム操作に対して、撮影画角θに対応した表示要素のＣＧを含む合成用フレームが生成され、結果として、撮影画角θに対応した表示要素のＣＧを含む合成画像が生成される。 As a result, a compositing frame containing CG of display elements corresponding to the shooting angle of view θ is generated in response to the camera zoom operation, and as a result, a composite image containing CG of display elements corresponding to the shooting angle of view θ is generated.
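As a non-limiting illustration, the selection of CG according to the threshold θ_a can be sketched as a lookup keyed by display element ID. The threshold value, the dictionary layout, the use of plain strings as stand-ins for CG data, and the function name `read_cg` are assumptions made for this sketch.

```python
THETA_A_DEG = 20.0  # hypothetical value of the threshold theta_a, in degrees

# Hypothetical stand-in for the CG storage unit:
# display element ID -> (CG for theta >= theta_a, CG for theta < theta_a).
# Strings are used in place of actual CG data.
CG_STORE = {
    "n": ("Asahiyama", "Asahiyama Zoo"),
}


def read_cg(element_id, theta_deg, theta_a_deg=THETA_A_DEG):
    """Return the CG registered for the band that theta_deg falls into."""
    wide_cg, narrow_cg = CG_STORE[element_id]
    return wide_cg if theta_deg >= theta_a_deg else narrow_cg
```

Calling `read_cg("n", theta)` on every frame would switch the returned CG from "Asahiyama" to "Asahiyama Zoo" at the frame where a zoom operation brings θ below the threshold.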

That is, the CG of a display element changes according to the shooting angle of view θ included in the camera parameters CP2. For example, when the camera video is zoomed in and a landmark display element is shown enlarged, the display switches from CG showing an overview of the landmark (for example, the display element "Tokachi Mountain Range" before the zoom operation) to CG showing the details of the landmark (for example, the display elements "Biei-dake", "Tokachi-dake", and "Furano-dake" after the zoom operation).

尚、図１４の例は、しきい値が１つの場合を示しているが、複数のしきい値により区別された表示要素ＩＤ及び当該表示要素のＣＧがＣＧ記憶部２３に格納されるようにしてもよい。 Note that while the example in FIG. 14 shows a case where there is one threshold, display element IDs and CGs of the display elements distinguished by multiple thresholds may be stored in the CG storage unit 23.

In the example described above, the CG of a display element included in the synthesis frame is changed according to the shooting angle of view θ included in the camera parameters CP2. That is, in the example of the shooting angle of view θ shown in FIG. 14, for a larger shooting angle of view θ, CG showing an overview of the landmark is read from the CG storage unit 23, and for a smaller shooting angle of view θ, CG showing the details of the landmark is read.

Alternatively, the CG of a display element may be changed according to the focus distance f included in the camera parameters CP2. In this case, for a shorter focus distance f, CG showing an overview of the landmark (the CG shown on the left side of FIG. 14) is read from the CG storage unit 23, and for a longer focus distance f, CG showing the details of the landmark (the CG shown on the right side of FIG. 14) is read.

以上のように、実施例１の映像合成システム１によれば、映像合成用フレーム生成装置２のカメラパラメータ受信部１１は、カメラからカメラパラメータＣＰを受信し、変換情報記憶部２１に格納された変換情報を用いて、カメラパラメータＣＰに含まれるズーム値及びフォーカス値をそれぞれ撮影画角θ及びフォーカス距離ｆに変換し、撮影画角θ及びフォーカス距離ｆを含むカメラパラメータＣＰ１を生成する。 As described above, according to the video synthesis system 1 of the first embodiment, the camera parameter receiving unit 11 of the video synthesis frame generating device 2 receives the camera parameters CP from the camera, and uses the conversion information stored in the conversion information storage unit 21 to convert the zoom value and focus value included in the camera parameters CP into the shooting angle of view θ and the focus distance f, respectively, to generate the camera parameters CP1 including the shooting angle of view θ and the focus distance f.

カメラパラメータ補間部１２は、カメラパラメータＣＰ１のデータレートがカメラ映像のフレームレートに一致するように、カメラパラメータＣＰ１のデータを補間することで、カメラパラメータＣＰ２を生成する。 The camera parameter interpolation unit 12 generates camera parameters CP2 by interpolating the data of the camera parameters CP1 so that the data rate of the camera parameters CP1 matches the frame rate of the camera image.
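As a non-limiting illustration, interpolating camera parameters up to the frame rate of the camera video can be sketched as follows. Linear interpolation between neighboring samples is an assumption made for this sketch, as are the function name and the dictionary representation of a parameter set; wrap-around of angles (for example, a pan angle crossing 360 degrees) is not handled.

```python
def interpolate_parameters(samples, frame_times):
    """Linearly interpolate camera parameters at each frame time.

    samples     : list of (time_s, {name: value}) sorted by time, len >= 2
    frame_times : ascending times lying within the sampled time range
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are required")
    out = []
    i = 0
    for t in frame_times:
        # Advance to the last segment whose start time is <= t.
        while i + 1 < len(samples) - 1 and samples[i + 1][0] <= t:
            i += 1
        (t0, p0), (t1, p1) = samples[i], samples[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append({k: p0[k] + w * (p1[k] - p0[k]) for k in p0})
    return out
```

Because a frame time between two samples needs the later sample before it can be computed, a scheme of this kind introduces latency, which is consistent with the delay handled by the video delay unit 30.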

座標変換部１３は、表示要素毎に、カメラパラメータＣＰ２及び座標情報記憶部２２に格納された当該表示要素の世界座標値を用いて、世界座標値を画像座標値に変換する。 For each display element, the coordinate conversion unit 13 converts the world coordinate values into image coordinate values using the camera parameters CP2 and the world coordinate values of the display element stored in the coordinate information storage unit 22.
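As a non-limiting illustration, a conversion from world coordinate values to image coordinate values can be sketched with a standard pinhole camera model. The conversion actually used is the one described with reference to FIG. 6; the sketch below substitutes assumptions of its own: camera coordinates are obtained as R(p_w − t_w), the optical axis is the camera +Z axis, θ is the horizontal angle of view, and the focal length in pixels is μ₀ / tan(θ/2).

```python
import math


def world_to_image(p_w, t_w, R, theta_deg, mu0, nu0):
    """Project a world point to image coordinates (x_i, y_i).

    p_w, t_w  : world point and camera position, each (X, Y, Z)
    R         : 3x3 rotation (list of rows) from world to camera axes
    theta_deg : horizontal shooting angle of view in degrees (assumed)
    mu0, nu0  : half the number of pixels horizontally and vertically
    Returns None when the point is not in front of the camera.
    """
    d = [p - t for p, t in zip(p_w, t_w)]
    xc, yc, zc = (sum(R[r][c] * d[c] for c in range(3)) for r in range(3))
    if zc <= 0:
        return None
    f_px = mu0 / math.tan(math.radians(theta_deg) / 2)
    return (mu0 + f_px * xc / zc, nu0 + f_px * yc / zc)
```

Under these assumptions, narrowing θ by zooming increases `f_px`, so a display element away from the optical axis moves toward the edge of the image and eventually leaves it.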

合成用フレーム生成部１４は、表示要素毎に、ＣＧ記憶部２３に格納された当該表示要素のＣＧを画像座標値の箇所に配置すると共に、地図画像記憶部２４に格納された地図画像を用いて、カメラの位置を含む地図画像内の撮影領域Ｓを求め、撮影領域Ｓを地図画像に表したＰinＰ画像を生成し、ＰinＰ画像を所定の画像座標値の箇所に配置する。そして、合成用フレーム生成部１４は、表示要素毎のＣＧ及び撮影領域Ｓを地図画像に表したＰinＰ画像を含む合成用フレームを生成する。 For each display element, the synthesis frame generation unit 14 places the CG of that display element stored in the CG storage unit 23 at the location of the image coordinate value, and uses the map image stored in the map image storage unit 24 to determine the shooting area S in the map image including the camera position, generates a PinP image that shows the shooting area S on the map image, and places the PinP image at the location of the specified image coordinate value. Then, the synthesis frame generation unit 14 generates a synthesis frame that includes the CG of each display element and the PinP image that shows the shooting area S on the map image.

As a result, synthesis frames synchronized with the frames of the camera video are generated continuously for every frame. That is, synthesis frames are generated in which the CG for each display element and the shooting area S in the PinP image follow camera movements such as panning, tilting, or zooming.

また、映像フレーム合成装置３の映像遅延部３０は、カメラ映像のフレームが映像合成用フレーム生成装置２により生成された合成用フレームに同期するように、予め設定された遅延量だけカメラ映像を遅延させる。 In addition, the video delay unit 30 of the video frame synthesis device 3 delays the camera video by a preset delay amount so that the camera video frame is synchronized with the synthesis frame generated by the video synthesis frame generation device 2.

The frame synthesis unit 31 generates a composite image by synthesizing the synthesis frame onto the frame of the delayed camera video.

これにより、カメラ映像のフレームに同期した合成映像が毎フレーム連続で生成される。つまり、カメラのパン、チルトまたはズーム等の動きに応じて、表示要素毎のＣＧ及びＰinＰ画像内の撮影領域Ｓが追従する合成映像が生成される。 As a result, a composite image synchronized with the camera image frames is generated continuously for each frame. In other words, a composite image is generated in which the CG for each display element and the shooting area S in the PinP image follow the camera's movements such as panning, tilting, or zooming.

したがって、実世界が撮影されたカメラ映像について、俯瞰的にその方角及び画角を容易に認識可能な情報を生成することができる。 This makes it possible to generate information that allows the direction and angle of view of a camera image capturing the real world to be easily recognized from a bird's-eye view.

In this way, for camera video capturing the real world, the CG for each display element, such as directions and landmarks, and the shooting area S on the map image in the PinP image can be made to follow the movement of the camera and be presented simultaneously in real time. A cameraman or viewer receiving this presentation can easily recognize in which direction and with what angle of view the video currently being shot is captured. This is particularly useful for cameramen and viewers who are unfamiliar with the area.

[Example 2]
Next, Example 2 will be described. FIG. 20 is a block diagram showing the configuration of the video synthesis device of Example 2.

This video synthesis device 4 includes a frame processing unit 10, a storage unit 20, a video delay unit 30, and a frame synthesis unit 31. The frame processing unit 10 includes a camera parameter receiving unit 11, a camera parameter interpolation unit 12, a coordinate conversion unit 13, and a synthesis frame generation unit 14, and the storage unit 20 includes a conversion information storage unit 21, a coordinate information storage unit 22, a CG storage unit 23, and a map image storage unit 24.

The video synthesis device 4 inputs the camera video captured by the camera together with the camera parameters CP. The video synthesis device 4 then generates a synthesis frame including the CG for each display element and a PinP image representing the shooting area S on a map image, synthesizes the synthesis frame onto the frame of the camera video, and outputs the composite image in real time.

Comparing the video synthesis system 1 of Example 1 shown in FIG. 1 (the video synthesis frame generating device 2 shown in FIG. 3 and the video frame synthesis device 3 shown in FIG. 16) with the video synthesis device 4 of Example 2, the video synthesis system 1 and the video synthesis device 4 have the same components. That is, the video synthesis device 4 combines the video synthesis frame generating device 2 and the video frame synthesis device 3, which constitute the video synthesis system 1 of Example 1 shown in FIG. 1, into a single device.

The processing of the frame processing unit 10 (camera parameter receiving unit 11, camera parameter interpolation unit 12, coordinate conversion unit 13, and synthesis frame generation unit 14), the storage unit 20 (conversion information storage unit 21, coordinate information storage unit 22, CG storage unit 23, and map image storage unit 24), the video delay unit 30, and the frame synthesis unit 31 provided in the video synthesis device 4 is the same as in Example 1, so its description is omitted here.

As described above, the video synthesis device 4 of Example 2 provides the same effects as the video synthesis system 1 of Example 1. That is, a composite image in which the CG for each display element and the shooting area S in the PinP image follow camera movements such as panning, tilting, or zooming is generated continuously for every frame.

以上、実施例１，２を挙げて本発明を説明したが、本発明は前記実施例１，２に限定されるものではなく、その技術思想を逸脱しない範囲で種々変形可能である。 The present invention has been described above using Examples 1 and 2, but the present invention is not limited to the above Examples 1 and 2, and various modifications are possible without departing from the technical concept thereof.

例えばカメラのロール角φは、カメラパラメータＣＰ，ＣＰ１，ＣＰ２に含まれるようにしたが、カメラマンがカメラをロールしない場合は、ロール角φを固定角として扱うようにしてもよい。つまり、座標変換部１３は、図６に示したステップＳ６０３において、カメラパラメータＣＰ２からロール角φを抽出することなく、ステップＳ６０４において、予め設定されたロール角φを用いて姿勢Ｒを算出する。 For example, the roll angle φ of the camera is included in the camera parameters CP, CP1, and CP2, but if the cameraman does not roll the camera, the roll angle φ may be treated as a fixed angle. In other words, the coordinate conversion unit 13 does not extract the roll angle φ from the camera parameter CP2 in step S603 shown in FIG. 6, but calculates the attitude R using the preset roll angle φ in step S604.

また、例えば合成用フレーム生成部１４は、図８に示したステップＳ８０２において、ＣＧ記憶部２３から例えば方角またはランドマークの表示要素のＣＧを読み出し、ステップＳ８０７において、画像座標値の箇所にＣＧを配置する等により、合成用フレームを生成するようにした。 In addition, for example, in step S802 shown in FIG. 8, the synthesis frame generation unit 14 reads out CG of a display element, such as a direction or a landmark, from the CG storage unit 23, and in step S807 generates a synthesis frame by, for example, placing the CG at the location of the image coordinate values.

Alternatively, the synthesis frame generation unit 14 may read out, from the CG storage unit 23, the CG of a display element representing the distance to an object. In this case, the CG storage unit 23 stores, in association with the display element ID, CG that graphically represents the actual distance. The synthesis frame generation unit 14 reads out the CG of the distance display element from the CG storage unit 23 and generates a synthesis frame by, for example, placing the CG at the location of the image coordinate values. The synthesis frame generation unit 14 may also assign a color corresponding to the distance to the CG of the distance display element it has read out, and then generate the synthesis frame. This allows the cameraman and viewers to easily grasp the distance to the object from the color of the CG.
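As a non-limiting illustration, assigning a color according to distance can be sketched as a banded lookup. The band limits, the RGB colors, and the function names are assumptions made for this sketch; the distance helper assumes the camera position and the object are given as world coordinate values in meters.

```python
import math

# Hypothetical distance bands: (upper limit in meters, RGB color).
DISTANCE_BANDS = [
    (1_000.0, (255, 0, 0)),     # nearer than 1 km: red
    (10_000.0, (255, 255, 0)),  # nearer than 10 km: yellow
]
FAR_COLOR = (0, 128, 255)       # 10 km or farther: blue


def distance_to_object(t_w, p_w):
    """Straight-line distance between the camera position and the object."""
    return math.dist(t_w, p_w)


def color_for_distance(distance_m):
    """Return the RGB color of the first band whose limit exceeds the distance."""
    for limit_m, color in DISTANCE_BANDS:
        if distance_m < limit_m:
            return color
    return FAR_COLOR
```

A continuous color ramp could be used instead of discrete bands; the banded form is chosen here only because it mirrors the threshold determinations used elsewhere in the description.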

Note that an ordinary computer can be used as the hardware configuration of the video synthesis frame generating device 2 and the video frame synthesis device 3 of Example 1 of the present invention, and of the video synthesis device 4 of Example 2. The video synthesis frame generating device 2, the video frame synthesis device 3, and the video synthesis device 4 are each configured by a computer equipped with a CPU, a volatile storage medium such as RAM, a non-volatile storage medium such as ROM, interfaces, and the like.

The functions of the frame processing unit 10 (camera parameter receiving unit 11, camera parameter interpolation unit 12, coordinate conversion unit 13, and synthesis frame generation unit 14) and the storage unit 20 (conversion information storage unit 21, coordinate information storage unit 22, CG storage unit 23, and map image storage unit 24) provided in the video synthesis frame generating device 2 are each realized by having the CPU execute a program that describes these functions.

また、映像フレーム合成装置３に備えた映像遅延部３０及びフレーム合成部３１の各機能は、これらの機能を記述したプログラムをＣＰＵに実行させることによりそれぞれ実現される。 Furthermore, the functions of the video delay unit 30 and the frame synthesis unit 31 provided in the video frame synthesis device 3 are each realized by having the CPU execute a program that describes these functions.

Furthermore, the functions of the frame processing unit 10, the storage unit 20, the video delay unit 30, and the frame synthesis unit 31 provided in the video synthesis device 4 are each realized by having the CPU execute a program that describes these functions.

これらのプログラムは、前記記憶媒体に格納されており、ＣＰＵに読み出されて実行される。また、これらのプログラムは、磁気ディスク（フロッピー（登録商標）ディスク、ハードディスク等）、光ディスク（ＣＤ－ＲＯＭ、ＤＶＤ等）、半導体メモリ等の記憶媒体に格納して頒布することもでき、ネットワークを介して送受信することもできる。 These programs are stored in the storage medium and are read and executed by the CPU. In addition, these programs can be distributed by storing them on storage media such as magnetic disks (floppy disks, hard disks, etc.), optical disks (CD-ROMs, DVDs, etc.), and semiconductor memories, and can also be transmitted and received via a network.

1 Video synthesis system
2 Video synthesis frame generating device
3 Video frame synthesis device
4 Video synthesis device
10 Frame processing unit
11 Camera parameter receiving unit
12 Camera parameter interpolation unit
13 Coordinate conversion unit
14 Synthesis frame generation unit
20 Storage unit
21 Conversion information storage unit
22 Coordinate information storage unit
23 CG storage unit
24 Map image storage unit
30 Video delay unit
31 Frame synthesis unit
CP, CP1, CP2 Camera parameters
S Shooting area
α Pan angle
δ Tilt angle
φ Roll angle
θ Shooting angle of view
f Focus distance
t_w Camera position
R Attitude
μ₀, ν₀ Half the number of pixels in the horizontal and vertical directions
(X_w, Y_w, Z_w) World coordinate values
(x_i, y_i) Image coordinate values
α1 Camera orientation
θ1 Included angle
θ_a Threshold

Claims

1. A video synthesis frame generating device that generates, as a synthesis frame used when synthesizing a camera image, a frame including information relating to the shooting of the camera image, the device comprising:
a coordinate information storage unit in which, for each display element of information relating to the shooting of the camera image, a world coordinate value of the display element is stored;
a map image storage unit in which a map image including the position of the camera that captures the camera image is stored;
a CG storage unit in which CG for each of the display elements is stored;
a camera parameter receiving unit that receives camera parameters of the camera, sets a zoom value and a focus value included in the camera parameters as a shooting angle of view and a focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters so that a data rate of the camera parameters output by the camera parameter receiving unit matches a frame rate of the camera image;
a coordinate conversion unit that reads out the world coordinate values from the coordinate information storage unit for each of the display elements, and converts the world coordinate values into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit;
and a synthesis frame generating unit that reads out the map image from the map image storage unit, extracts a pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, in the map image and with the position of the camera as a reference, a shooting area based on the pan angle and the shooting angle of view, generates a PinP (Picture-in-Picture) image in which the shooting area is represented on the map image, and generates the synthesis frame by reading out the CG from the CG storage unit for each of the display elements, arranging the CG at the location of the image coordinate values converted by the coordinate conversion unit, and arranging the PinP image at a location of preset image coordinate values.
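
The interpolation performed by the camera parameter interpolation unit can be illustrated with a minimal Python sketch. Claim 1 only requires that the data rate of the camera parameters match the frame rate; the use of linear interpolation, the clamping at the ends, and the names `CameraParams` and `interpolate_params` are assumptions made for this illustration. Wrap-around of the pan angle (for example from 359° to 0°) is not handled.

```python
from bisect import bisect_right
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class CameraParams:
    """One camera-parameter sample (field names are illustrative)."""
    t: float      # sample time in seconds
    pan: float    # pan angle alpha, degrees
    tilt: float   # tilt angle delta, degrees
    roll: float   # roll angle phi, degrees
    fov: float    # shooting angle of view theta, degrees
    focus: float  # focus distance f


def interpolate_params(samples, frame_times):
    """Resample parameter samples (sorted by t) onto the given frame times.

    Linear interpolation is an assumption; frame times outside the sampled
    range are clamped to the first or last sample.
    """
    times = [s.t for s in samples]
    out = []
    for ft in frame_times:
        i = bisect_right(times, ft)
        if i == 0:
            a = b = samples[0]
        elif i == len(samples):
            a = b = samples[-1]
        else:
            a, b = samples[i - 1], samples[i]
        # Interpolation weight between the two neighbouring samples.
        w = 0.0 if b.t == a.t else (ft - a.t) / (b.t - a.t)
        values = {
            f.name: getattr(a, f.name) + w * (getattr(b, f.name) - getattr(a, f.name))
            for f in fields(CameraParams) if f.name != "t"
        }
        out.append(CameraParams(t=ft, **values))
    return out
```

With parameters arriving at, say, half the frame rate, passing the timestamp of every video frame as `frame_times` yields one parameter set per frame.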

2. The video synthesis frame generating device according to claim 1,
further comprising a conversion information storage unit in which a shooting angle of view corresponding to the zoom value and a focus distance corresponding to the focus value are stored, and comprising a new camera parameter receiving unit in place of the camera parameter receiving unit, wherein
the new camera parameter receiving unit
receives camera parameters of the camera, extracts a zoom value and a focus value from the camera parameters, reads out the shooting angle of view corresponding to the zoom value and the focus distance corresponding to the focus value from the conversion information storage unit, and outputs camera parameters including the shooting angle of view and the focus distance, and
the camera parameter interpolation unit
interpolates the camera parameters so that a data rate of the camera parameters output by the new camera parameter receiving unit matches a frame rate of the camera image.
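
The table lookup of claim 2 can be sketched in Python as follows. This is only an illustration: the claim says the conversion information storage unit holds the angle of view for a zoom value, but says nothing about how values between stored entries are handled, so the piecewise-linear interpolation here is an assumption. The function name `lookup` and the table `ZOOM_TO_FOV`, including its numbers, are invented placeholders and not values from the patent.

```python
from bisect import bisect_left

# Hypothetical conversion table: (raw zoom value, shooting angle of view in degrees).
# The numbers are placeholders for illustration only.
ZOOM_TO_FOV = [(0, 60.0), (32768, 20.0), (65535, 2.0)]


def lookup(table, x):
    """Piecewise-linear lookup in a table of (input, output) pairs.

    The table must be sorted by strictly increasing input. Inputs outside
    the table are clamped to the nearest end entry.
    """
    xs = [p[0] for p in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, x)  # first entry whose input is >= x
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```

A second table of the same shape could map the focus value to the focus distance.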

3. The video synthesis frame generating device according to claim 1,
wherein the coordinate conversion unit
extracts the pan angle, tilt angle, roll angle, focus distance and camera position from the camera parameters and, denoting the pan angle by α, the tilt angle by δ, the roll angle by φ, the focus distance by f, the camera position by t_w = [t_wx, t_wy, t_wz], the world coordinate values by (X_w, Y_w, Z_w), the image coordinate values by (x_i, y_i), and half the number of pixels in the horizontal and vertical directions by μ₀ and ν₀,
calculates the attitude R from the pan angle α, the tilt angle δ and the roll angle φ by the following equation:

and, for each of the display elements, converts the world coordinate values (X_w, Y_w, Z_w) into the image coordinate values (x_i, y_i) from the focus distance f, the attitude R, the camera position t_w and the half-values μ₀ and ν₀ of the number of pixels by the following equation:
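
The equations referred to in claim 3 are not reproduced in this text. As a rough illustration of the kind of conversion described, the Python sketch below uses a standard pinhole camera model. Everything specific in it is an assumption rather than the patent's formula: the world axes coincide with the camera axes at zero angles (x right, y down, z along the optical axis), R is composed as pan about y, then tilt about x, then roll about z, R maps camera coordinates to world coordinates, and f is expressed in pixel units. The function names are hypothetical.

```python
import math


def rot_x(a):
    """Rotation about the x axis (used here for tilt)."""
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(a):
    """Rotation about the y axis (used here for pan)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def rot_z(a):
    """Rotation about the z axis (used here for roll)."""
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def attitude(pan, tilt, roll):
    """Camera-to-world rotation R = Ry(pan) Rx(tilt) Rz(roll), radians (assumed order)."""
    return matmul(rot_y(pan), matmul(rot_x(tilt), rot_z(roll)))


def world_to_image(p_w, R, t_w, f, mu0, nu0):
    """Project a world point to image coordinates; None if it is behind the camera."""
    d = [p_w[i] - t_w[i] for i in range(3)]
    # Camera coordinates: R^T (p_w - t_w), i.e. multiply by the transpose of R.
    xc, yc, zc = (sum(R[k][i] * d[k] for k in range(3)) for i in range(3))
    if zc <= 0:
        return None
    # Perspective division, then shift the origin to the image corner.
    return (f * xc / zc + mu0, f * yc / zc + nu0)
```

Under these assumptions a point on the optical axis lands at (μ₀, ν₀), the image centre, which is a convenient sanity check when adapting the sketch to another axis convention.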

4. The video synthesis frame generating device according to claim 1,
wherein the synthesis frame generating unit
extracts the pan angle and the shooting angle of view from the camera parameters, determines the orientation of the camera from the pan angle, determines, from the shooting angle of view, the angle of the shooting area with the position of the camera as a reference, and determines the shooting area in the map image from the orientation of the camera and the angle of the shooting area.
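
One way to picture the shooting area of claim 4 is as a sector drawn on the map with its apex at the camera position. The Python sketch below is an illustration under stated assumptions, not the claimed procedure: the map is taken to be north-up with pixel y growing downward, the pan angle is measured clockwise from north, and the sector radius is an arbitrary display length. The function name `shooting_area` and its parameters are hypothetical.

```python
import math


def shooting_area(cam_xy, pan_deg, fov_deg, radius, steps=16):
    """Return polygon vertices (map pixels) of the sector covered by the camera.

    cam_xy  : camera position in map pixel coordinates
    pan_deg : camera orientation, degrees clockwise from north (assumed)
    fov_deg : shooting angle of view, i.e. the opening angle of the sector
    radius  : drawn length of the sector in pixels (display choice)
    """
    cx, cy = cam_xy
    start = pan_deg - fov_deg / 2.0
    pts = [(cx, cy)]  # apex of the sector at the camera position
    for k in range(steps + 1):
        a = math.radians(start + fov_deg * k / steps)
        # North-up map with y growing downward: north is -y, east is +x.
        pts.append((cx + radius * math.sin(a), cy - radius * math.cos(a)))
    return pts
```

The returned vertex list can be handed to any polygon-fill routine to draw the area on the map image before it is placed as the PinP image.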

5. The video synthesis frame generating device according to claim 1, wherein
the CG storage unit stores, for each of the display elements, a plurality of CGs of the display element corresponding to the shooting angle of view or the focus distance, and
the synthesis frame generating unit
extracts the shooting angle of view or the focus distance from the camera parameters,
reads out, for each of the display elements, the CG corresponding to the shooting angle of view or the focus distance from the CG storage unit, arranges the CG at the location of the image coordinate values, and arranges the PinP image at the location of the preset image coordinate values, thereby generating the synthesis frame.
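
Selecting among several stored CGs by the shooting angle of view might look like the following Python sketch. The banding scheme (one CG per range of angle of view), the function name `select_cg`, and the file names in the usage note are assumptions for illustration; the claim does not state how the correspondence is stored.

```python
def select_cg(variants, fov_deg):
    """Pick the CG whose angle-of-view band contains fov_deg.

    variants: list of (max_fov_deg, cg_name) sorted by ascending max_fov_deg.
    Falls back to the last (widest) entry if fov_deg exceeds every band.
    """
    for max_fov, cg in variants:
        if fov_deg <= max_fov:
            return cg
    return variants[-1][1]
```

For example, with the hypothetical list `[(10.0, "label_large.png"), (30.0, "label_medium.png"), (60.0, "label_small.png")]`, a narrow angle of view (telephoto) selects the large label and a wide angle selects the small one.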

6. The video synthesis frame generating device according to claim 1, wherein
the synthesis frame generating unit
extracts the shooting angle of view or the focus distance from the camera parameters,
changes the scale of the PinP image based on the shooting angle of view or the focus distance, and generates the synthesis frame by arranging the PinP image with the changed scale at the location of the preset image coordinate values.
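
A scale factor derived from the shooting angle of view could be computed as in the Python sketch below. The linear mapping, the clamping limits and all default values are assumptions chosen for illustration, as is the function name `pinp_scale`; the claim states only that the scale changes with the angle of view or the focus distance.

```python
def pinp_scale(fov_deg, fov_wide=60.0, fov_tele=2.0, scale_wide=1.0, scale_tele=4.0):
    """Map the shooting angle of view to a PinP scale factor.

    The widest angle gives scale_wide and the narrowest gives scale_tele,
    with linear interpolation in between (an assumed mapping).
    """
    clamped = max(fov_tele, min(fov_wide, fov_deg))
    w = (fov_wide - clamped) / (fov_wide - fov_tele)
    return scale_wide + w * (scale_tele - scale_wide)
```

The resulting factor could be applied either to the magnification of the map or to the displayed size of the PinP image; the sketch does not commit to one interpretation.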

7. An image synthesizing device that generates, as a synthesis frame, a frame including information relating to the shooting of a camera image, and synthesizes the synthesis frame with a frame of the camera image to generate a synthetic image, the device comprising:
a coordinate information storage unit in which, for each display element of information relating to the shooting of the camera image, a world coordinate value of the display element is stored;
a map image storage unit in which a map image including the position of the camera that captures the camera image is stored;
a CG storage unit in which CG for each of the display elements is stored;
a camera parameter receiving unit that receives camera parameters of the camera, sets a zoom value and a focus value included in the camera parameters as a shooting angle of view and a focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters so that a data rate of the camera parameters output by the camera parameter receiving unit matches a frame rate of the camera image;
a coordinate conversion unit that reads out the world coordinate values from the coordinate information storage unit for each of the display elements, and converts the world coordinate values into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit;
a synthesis frame generating unit that reads out the map image from the map image storage unit, extracts a pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, in the map image and with the position of the camera as a reference, a shooting area based on the pan angle and the shooting angle of view, generates a PinP (Picture-in-Picture) image in which the shooting area is represented on the map image, and generates the synthesis frame by reading out the CG from the CG storage unit for each of the display elements, arranging the CG at the location of the image coordinate values converted by the coordinate conversion unit, and arranging the PinP image at a location of preset image coordinate values;
a video delay unit that delays the frame of the camera image so that it is synchronized with the synthesis frame generated by the synthesis frame generating unit; and
a frame synthesis unit that synthesizes the frame of the camera image delayed by the video delay unit with the synthesis frame generated by the synthesis frame generating unit, and outputs the synthetic image.
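
The interplay of the video delay unit and the frame synthesis unit can be sketched in Python as a fixed-length frame delay followed by an overlay. The fixed delay length, the alpha-blending rule, the pixel representation (nested lists of RGB and RGBA tuples) and the names `VideoDelay` and `synthesize` are all assumptions made for this illustration; the claim specifies only that the camera frame is delayed until it is synchronized with the synthesis frame and that the two are then synthesized.

```python
from collections import deque


class VideoDelay:
    """Delay frames by a fixed number of frames (assumed, pre-measured constant)."""

    def __init__(self, delay_frames):
        self.buf = deque()
        self.delay = delay_frames

    def push(self, frame):
        """Store a frame; return the frame pushed delay_frames earlier, or None while filling."""
        self.buf.append(frame)
        if len(self.buf) > self.delay:
            return self.buf.popleft()
        return None


def synthesize(camera_frame, synthesis_frame):
    """Alpha-blend an RGBA synthesis frame over an RGB camera frame of equal size.

    Pixels are (r, g, b) and (r, g, b, a) tuples with a in the range 0.0 to 1.0.
    """
    out = []
    for cam_row, syn_row in zip(camera_frame, synthesis_frame):
        row = []
        for (r, g, b), (sr, sg, sb, a) in zip(cam_row, syn_row):
            row.append(tuple(round(a * s + (1 - a) * c)
                             for s, c in ((sr, r), (sg, g), (sb, b))))
        out.append(row)
    return out
```

In use, each incoming camera frame is pushed into the delay, and whenever `push` returns a frame it is blended with the synthesis frame generated for the same instant.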

8. A program for causing a computer constituting a video synthesis frame generating device, which generates, as a synthesis frame used when synthesizing a camera image, a frame including information relating to the shooting of the camera image, and which comprises
a coordinate information storage unit in which, for each display element of information relating to the shooting of the camera image, the world coordinate values of the display element are stored,
a map image storage unit in which a map image including the position of the camera that captures the camera image is stored, and
a CG storage unit in which CG for each of the display elements is stored,
to function as:
a camera parameter receiving unit that receives camera parameters of the camera, sets a zoom value and a focus value included in the camera parameters as a shooting angle of view and a focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters so that a data rate of the camera parameters output by the camera parameter receiving unit matches a frame rate of the camera image;
a coordinate conversion unit that reads out the world coordinate values from the coordinate information storage unit for each of the display elements, and converts the world coordinate values into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit; and
a synthesis frame generating unit that reads out the map image from the map image storage unit, extracts a pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, in the map image and with the position of the camera as a reference, a shooting area based on the pan angle and the shooting angle of view, generates a PinP (Picture-in-Picture) image in which the shooting area is represented on the map image, and generates the synthesis frame by reading out the CG from the CG storage unit for each of the display elements, arranging the CG at the location of the image coordinate values converted by the coordinate conversion unit, and arranging the PinP image at a location of preset image coordinate values.

9. A program for causing a computer constituting an image synthesizing device, which generates, as a synthesis frame, a frame including information relating to the shooting of a camera image and synthesizes the synthesis frame with a frame of the camera image to generate a synthetic image, and which comprises
a coordinate information storage unit in which, for each display element of information relating to the shooting of the camera image, the world coordinate values of the display element are stored,
a map image storage unit in which a map image including the position of the camera that captures the camera image is stored, and
a CG storage unit in which CG for each of the display elements is stored,
to function as:
a camera parameter receiving unit that receives camera parameters of the camera, sets a zoom value and a focus value included in the camera parameters as a shooting angle of view and a focus distance, respectively, and outputs camera parameters including the shooting angle of view and the focus distance;
a camera parameter interpolation unit that interpolates the camera parameters so that a data rate of the camera parameters output by the camera parameter receiving unit matches a frame rate of the camera image;
a coordinate conversion unit that reads out the world coordinate values from the coordinate information storage unit for each of the display elements, and converts the world coordinate values into image coordinate values based on the camera parameters interpolated by the camera parameter interpolation unit;
a synthesis frame generating unit that reads out the map image from the map image storage unit, extracts a pan angle and the shooting angle of view from the camera parameters interpolated by the camera parameter interpolation unit, determines, in the map image and with the position of the camera as a reference, a shooting area based on the pan angle and the shooting angle of view, generates a PinP (Picture-in-Picture) image in which the shooting area is represented on the map image, and generates the synthesis frame by reading out the CG from the CG storage unit for each of the display elements, arranging the CG at the location of the image coordinate values converted by the coordinate conversion unit, and arranging the PinP image at a location of preset image coordinate values;
a video delay unit that delays the frame of the camera image so that it is synchronized with the synthesis frame generated by the synthesis frame generating unit; and
a frame synthesis unit that synthesizes the frame of the camera image delayed by the video delay unit with the synthesis frame generated by the synthesis frame generating unit, and outputs the synthetic image.