JP5198151B2

JP5198151B2 - Video search device and video search method

Info

Publication number: JP5198151B2
Application number: JP2008143825A
Authority: JP
Inventors: 寿雄西田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2008-05-30
Filing date: 2008-05-30
Publication date: 2013-05-15
Anticipated expiration: 2028-05-30
Also published as: JP2009290798A

Description

The present invention relates to a video search device and a video search method, and is suitable for application to, for example, a video surveillance system.

Facilities such as office buildings, hotels, and department stores have long used video surveillance systems, with surveillance cameras installed throughout the premises. The video captured by these cameras is displayed in real time in a central control room or the like, and is also recorded on a recording medium such as videotape or a hard disk device and retained for a fixed period. The recorded video is used to verify what happened when, for example, an incident occurs.

Such a facility usually has multiple surveillance cameras, and these cameras shoot continuously, day and night. The amount of captured video recorded on the recording medium therefore becomes enormous, and when a scene showing a specific person must be found in that video, the work takes a great deal of time and labor.

As one method of solving this problem, Patent Document 1, for example, discloses a method in which each subject carries an identification information element such as an IC tag, and the identification information obtained from that element is associated with the captured video, recorded on a recording medium, of the subject carrying the element.

However, because the method disclosed in Patent Document 1 requires every subject to carry an identification information element, it is difficult to apply to facilities that an unspecified large number of people enter and leave, such as hotels and department stores.

The present invention has been made in view of the above points, and proposes a video search device and method that can easily detect the video of a desired subject from an enormous amount of video.

To solve this problem, the video search device of the present invention comprises: a subject extraction unit that extracts each subject present in the video based on first video information supplied from outside and records the video information of each extracted subject on a recording medium; a feature part extraction unit that extracts the feature parts of each subject extracted by the subject extraction unit; a storage unit that stores feature information for each subject, which is information about the feature parts extracted by the feature part extraction unit, in association with the video information of that subject recorded on the recording medium; and a video search unit that, based on the feature information of a subject designated as a search target, searches the feature information stored in the storage unit for the feature information corresponding to the designated subject.

The video search method of the present invention comprises: a first step of extracting each subject present in the video based on first video information supplied from outside, recording the video information of each extracted subject on a recording medium, extracting the feature parts of each extracted subject, and storing feature information for each subject, which is information about the extracted feature parts, in association with the video information of that subject recorded on the recording medium; and a second step of searching, based on the feature information of a subject designated as a search target, the stored feature information for the feature information corresponding to the designated subject.

According to the present invention, the video information of a desired subject can be detected from the video information of a plurality of subjects recorded on a recording medium, which realizes a video search device and method that can easily detect the video of a desired subject from an enormous amount of video.

An embodiment of the present invention is described in detail below with reference to the drawings.

(1) First Embodiment
(1-1) Configuration of the Video Monitoring System According to This Embodiment
In FIG. 1, reference numeral 1 denotes the video monitoring system of this embodiment as a whole. The video monitoring system 1 extracts and collects information about the features of each subject (hereinafter, feature information) from the captured video by image processing and, based on the collected feature information, can search the video accumulated in a recording medium 8 for the video of a subject designated by the user.

In practice, as shown in FIG. 2, the video monitoring system 1 includes a camera unit 3 made up of a plurality of cameras 3A to 3n fixed in place so that the same subject 2 can be photographed from multiple viewpoints. FIG. 2 shows an example in which four cameras 3A to 3D are installed at the four corners of a room 4 so that they can photograph a subject 2 (for example, a person) located at the center; arrow a indicates the direction the subject 2 is facing.

The video signals of the captured video of the same subject 2 output from these cameras 3A to 3n, for example as shown in FIGS. 3(A) to 3(D) (hereinafter, camera video signals), are each supplied through a cable (not shown) to a video input unit 6 in a video search device 5.

The video input unit 6 applies predetermined signal processing to the camera video signals supplied from the cameras 3A to 3n, such as synchronization processing that synchronizes the camera video signals with one another, and sends each processed camera video signal to a video control unit 7.

The video control unit 7 includes a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. In normal mode, it sequentially stores captured video information based on each camera video signal supplied from the video input unit 6 (hereinafter, camera video information) in the recording medium 8, which is composed of, for example, a hard disk device or semiconductor memory.

The camera video information stored in the recording medium 8 is then read out one frame at a time by a subject extraction unit 9, which determines whether the subject 2 is present in the frame image based on that frame of camera video information. If the subject 2 is a person, for example, its presence can be determined by a conventional technique such as background subtraction or skin color detection. If the subject 2 is a "moving object" such as a person or a car, its presence can be determined by a conventional technique such as motion detection, which extracts the difference between frame images.
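
The motion detection mentioned above can be illustrated with a minimal frame-differencing sketch. The patent does not specify an algorithm beyond "a difference between frame images", so the function name `frame_has_subject` and both thresholds are assumptions made purely for illustration; frames are modeled as plain 2D lists of grayscale values.

```python
def frame_has_subject(prev_frame, curr_frame, pixel_threshold=30, min_changed=4):
    """Return True if enough pixels differ between two grayscale frames.

    pixel_threshold and min_changed are illustrative values, not taken
    from the patent.
    """
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            if abs(p - c) > pixel_threshold:
                changed += 1
    return changed >= min_changed


# A uniform 6x6 background, and the same scene with a bright 3x2 blob.
background = [[10] * 6 for _ in range(6)]
with_person = [row[:] for row in background]
for y in range(2, 5):
    for x in range(2, 4):
        with_person[y][x] = 200

print(frame_has_subject(background, background))
print(frame_has_subject(background, with_person))
```

A real system would more likely use a maintained background model than a single reference frame; the sketch only shows the thresholded-difference idea.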

When the subject extraction unit 9 determines that the subject 2 is present in a frame image, it extracts the video portion of the subject 2 from that frame image (hereinafter, subject video) and sends the video information of the extracted subject video to a feature part extraction unit 10 and a multi-view video generation unit 11. The multi-view video generation unit 11 sequentially stores the video information of the subject video supplied from the subject extraction unit 9 in the recording medium 8.

The subject extraction unit 9 repeats this processing frame by frame for each piece of camera video information stored in the recording medium 8. As a result, the video information of multiple subject videos of the same subject photographed from multiple viewpoints at the same time, as shown in FIGS. 4(A) to 4(D), accumulates in the recording medium 8. The subject extraction unit 9 may instead perform this processing only once every several frames to several tens of frames, which reduces both the load on the subject extraction unit 9 and the amount of subject video information accumulated in the recording medium 8.

Meanwhile, based on the video information of the subject video supplied from the subject extraction unit 9, the feature part extraction unit 10 extracts the video portions of predetermined characteristic parts (hereinafter, feature parts) in that subject video. When the subject is a person, for example, the feature parts include the eyes, eyebrows, nose, mouth, ears, the whole face, worn clothing (clothes, hats, and so on), accessories (necklaces, earrings, and so on), bags, and characters printed on clothes.

The feature part extraction unit 10 then stores the video information of each extracted feature part, together with management information for managing that feature part, as feature information in a feature information storage unit 8A in the recording medium 8, in association with both the video information of the subject video from which the feature part was extracted and the corresponding camera video information.

When the user operates an operation unit 13 and selects the subject search mode, the video control unit 7 supplies the camera video signal from the one camera among 3A to 3n selected by the user at that time (hereinafter, the selected camera video signal) to a monitor 15 via a video output unit 14. A camera video screen 16 as shown in FIG. 6(A) is thereby displayed on the monitor 15 based on the selected camera video signal.

The camera video screen 16 shows the camera video based on the selected camera video signal with a frame 16A drawn around one of the subjects present in that video. By operating the operation unit 13, the user can move the frame 16A so that it surrounds another subject on the camera video screen 16.

When the user then operates the operation unit 13 to enter a search command for the subject currently enclosed by the frame 16A (hereinafter, the search target subject), the video control unit 7 reads from the feature information storage unit 8A the video information of the feature parts of the search target subject that the feature part extraction unit 10 extracted at that time.

Based on this feature part video information, the video control unit 7 uses, for example, pattern matching to detect, among the feature part video information of each subject accumulated so far in the feature information storage unit 8A, the feature part video information presumed to belong to the search target subject. It then sends the multi-view video generation unit 11 a request to generate a multi-view video from the subject video associated with the detected feature part video information (hereinafter, a multi-view video generation request).
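
As a rough illustration of this search step, the sketch below compares a query feature patch against stored patches and returns the IDs of the associated subject videos. The similarity measure (fraction of pixels agreeing within a tolerance), the record layout, and all names and thresholds are assumptions for the example; the patent only says that pattern matching is used "for example".

```python
def similarity(patch_a, patch_b, tolerance=10):
    """Fraction of pixels whose values agree within `tolerance`.

    Patches of different sizes are treated as non-matching (an assumption).
    """
    if len(patch_a) != len(patch_b) or len(patch_a[0]) != len(patch_b[0]):
        return 0.0
    total = len(patch_a) * len(patch_a[0])
    agree = sum(
        1
        for row_a, row_b in zip(patch_a, patch_b)
        for a, b in zip(row_a, row_b)
        if abs(a - b) <= tolerance
    )
    return agree / total


def search_features(query_patch, stored_features, min_similarity=0.9):
    """Return subject video file IDs whose stored patch resembles the query."""
    return [
        record["subject_video_file_id"]
        for record in stored_features
        if similarity(query_patch, record["patch"]) >= min_similarity
    ]


# Hypothetical stored feature information (IDs are made up).
stored = [
    {"patch": [[0, 255], [255, 0]], "subject_video_file_id": "subj-001"},
    {"patch": [[255, 255], [255, 255]], "subject_video_file_id": "subj-002"},
    {"patch": [[5, 250], [250, 3]], "subject_video_file_id": "subj-003"},
]
query = [[0, 255], [255, 0]]
result = search_features(query, stored)
print(result)
```

The returned IDs stand in for the link from feature information to subject video that lets the video control unit 7 request a multi-view video for the matched subject.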

The multi-view video generation unit 11 holds management information for each of the cameras 3A to 3n, such as installation position and shooting direction (hereinafter, camera information), in advance in a camera information storage unit 12 composed of, for example, a hard disk device or semiconductor memory. When it receives a multi-view video generation request from the video control unit 7, the multi-view video generation unit 11 generates a multi-view video of the search target subject, for example as shown in FIGS. 5(A) and 5(B), from the camera information of the cameras 3A to 3n held in the camera information storage unit 12 and the video information of the subject video of the search target subject stored in the recording medium 8.

This multi-view video is a three-dimensional model, built from two subject videos, of the subject as seen from a viewpoint different from the viewpoints of those two subject videos. For example, the multi-view video shown in FIG. 5(A) is generated from the video information of the subject videos shown in FIGS. 4(A) and 4(C), and the multi-view video shown in FIG. 5(B) is generated from the video information of the subject videos shown in FIGS. 4(A) and 4(B). The viewpoint of the multi-view video can be changed by operating the operation unit 13, as described later, but the viewpoint when the video is first generated is a predetermined position.

The multi-view video generation unit 11 stores the video information of the multi-view video of the search target subject generated in this way in the recording medium 8.

The video control unit 7 also sends the feature part extraction unit 10 a request (hereinafter, a subject video part detection request) to detect the portions showing the search target subject in the recorded video based on the camera video information, among the camera video information stored in the recording medium 8, that is associated with the feature part video information presumed to belong to the search target subject.

On receiving this subject video part detection request, the feature part extraction unit 10 performs pattern matching using the feature part video information presumed to belong to the search target subject, detects the portions showing the search target subject in the recorded video based on the camera video information associated with that feature part video information, and reports the detection result to the video control unit 7.

The video control unit 7 then reads from the recording medium 8 the video information of the subject video of the search target subject detected as described above, the video information of the multi-view video that it had the multi-view video generation unit 11 generate from that subject video, and the camera video information of the portions of the corresponding recorded video that show the search target subject, and generates from them a search result display screen 17 as shown in FIG. 6(B). The video control unit 7 sends the screen signal of the search result display screen 17 to the monitor 15 via the video output unit 14, so that the search result display screen 17 is displayed on the monitor 15.

In FIG. 6(B), 17A and 17B are the subject videos generated from the camera video information output by predetermined cameras, among the subject videos of the search target subject generated from the camera video information output by the cameras 3A to 3n, and 17C is the multi-view video of the search target subject. The viewpoint of this multi-view video can be changed freely by operating the operation unit 13. 17D is the recorded video (moving image) showing the search target subject, displayed based on the camera video information read from the recording medium 8.

(1-2) Specific Processing of the Feature Part Extraction Unit in the Feature Part Extraction Process
As a means of extracting feature parts from a subject video based on the video information of the subject video supplied from the subject extraction unit 9 as described above, the feature part extraction unit 10 holds a template image table 20, as shown in FIG. 7, in an internal memory (not shown).

The template image table 20 is a table for managing template images of the feature parts to be extracted (eyes, eyebrows, nose, mouth, characters printed on clothes, and so on), and stores multiple pieces of template image information for each of these feature parts.

For example, for the feature parts "eye", "eyebrow", "nose", and "mouth", template images of several shapes are stored for each part; for the feature part "character", a template image is stored for each character to be extracted, such as letters of the alphabet, digits, hiragana, and katakana.

When the feature part extraction unit 10 receives the video information of a subject video from the subject extraction unit 9, it extracts feature parts from that subject video by performing pattern matching on it with each template image registered in the template image table 20.

When the feature part extraction unit 10 detects a feature part by this pattern matching, it extracts the video portion of that feature part from the subject video and stores the video information of that portion, together with management information about the feature part, in a feature information management table 21, as shown in FIG. 8, held in the feature information storage unit 8A.

The feature information management table 21 manages the video information of each feature part extracted from a subject video and the management information of that feature part in association with the video file of the subject video and the video file of the corresponding camera video information. It consists of a feature part video information column 21A, a management information column 21B, a subject video file ID column 21C, and a camera video file ID column 21D.

The feature part video information column 21A stores the video information of the feature part extracted from the subject video by the feature part extraction unit 10, and the subject video file ID column 21C stores the file ID of the video file containing the video information of that subject video. The camera video file ID column 21D stores the file ID of the video file of the recorded video from which the subject video was extracted.

The management information column 21B consists of an attribute column 21E, a recognized character column 21F, a time code column 21G, and so on. The attribute column 21E stores the attribute of the corresponding feature part (eye, nose, mouth, character, and so on), and the recognized character column 21F stores the character recognized for the feature part when its attribute is character. The time code column 21G stores the time code of the original frame image from which the subject video containing the feature part was extracted.
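
One way to picture a row of the feature information management table 21 is as a small record type. The sketch below mirrors columns 21A to 21G with Python dataclasses; the field names, the time code format, and the file IDs are hypothetical, since the patent defines only what each column holds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ManagementInfo:
    """Management information column 21B."""
    attribute: str                   # attribute column 21E, e.g. "eye", "character"
    recognized_text: Optional[str]   # recognized character column 21F (characters only)
    time_code: str                   # time code column 21G (format is an assumption)


@dataclass
class FeatureRecord:
    """One row of the feature information management table 21."""
    feature_image: List[List[int]]   # feature part video information column 21A
    management: ManagementInfo       # management information column 21B
    subject_video_file_id: str       # subject video file ID column 21C
    camera_video_file_id: str        # camera video file ID column 21D


def records_for_subject(table, subject_video_file_id):
    """Return every feature record linked to one subject video file."""
    return [r for r in table if r.subject_video_file_id == subject_video_file_id]


# Hypothetical table contents.
table = [
    FeatureRecord([[0, 255], [255, 0]],
                  ManagementInfo("eye", None, "00:12:03:15"),
                  "subj-001", "cam-A-0001"),
    FeatureRecord([[255, 255], [0, 0]],
                  ManagementInfo("character", "A", "00:12:03:15"),
                  "subj-001", "cam-A-0001"),
    FeatureRecord([[9, 9], [9, 9]],
                  ManagementInfo("mouth", None, "00:47:10:02"),
                  "subj-002", "cam-B-0003"),
]
hits = records_for_subject(table, "subj-001")
print([r.management.attribute for r in hits])
```

Keeping both file IDs on each row is what allows a matched feature part to lead back to the subject video and to the recorded camera video it came from.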

FIG. 9 shows the specific processing of the feature part extraction unit 10 for the feature part extraction described above (hereinafter, the feature part extraction process). Following a control program stored in an internal memory (not shown), the feature part extraction unit 10 executes the feature part extraction process shown in FIG. 9 each time it receives the video information of a subject video from the subject extraction unit 9.

That is, the feature part extraction unit 10 starts the feature part extraction process when it receives the video information of a subject video from the subject extraction unit 9, and first selects one template image from the template image table 20 (FIG. 7) (SP1).

Next, while raster-scanning the template image over the received subject video, the feature part extraction unit 10 compares the pixel value of each pixel in the region of the subject video overlapped by the template image with the pixel value of the corresponding pixel in the template image (SP2).

When the raster scan of the subject video is complete, the feature part extraction unit 10 determines whether the subject video contains a region in which at least a fixed number of pixel values match those of the template image (SP3).

If the result is affirmative, the feature part extraction unit 10 registers the video information of that region of the subject video, together with its management information, as feature information in the feature information management table 21 in the feature information storage unit 8A, in association with the file ID of the video file of the subject video and the file ID of the video file of the corresponding recorded video (SP4), and then proceeds to step SP7.

If the result is negative, the feature part extraction unit 10 determines whether all enlargements and reductions of the template image have already been tried (SP5). In this embodiment, the pattern matching described above is performed several times while enlarging the template image little by little, and likewise several times while reducing it little by little. Accordingly, if the result at step SP5 is negative, the feature part extraction unit 10 enlarges or reduces the template image currently in use (SP6) and returns to step SP2.

When all of this pattern matching has been completed and the result at step SP5 becomes affirmative, the feature part extraction unit 10 determines whether the template image table 20 still contains a template image that has not yet been used for pattern matching (SP7).

If the result is affirmative, the feature part extraction unit 10 replaces the template image with another template image stored in the template image table 20 (SP8) and returns to step SP2, after which it executes the processing from step SP2 onward in the same way.

When the same processing has eventually been completed for every template image registered in the template image table 20 and the result at step SP7 becomes negative, the feature part extraction unit 10 ends the feature part extraction process.
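
The flow of steps SP1 to SP8 can be sketched as two nested loops: an outer loop over template images and an inner loop over template scales, with a raster scan at each scale. This is a simplified illustration rather than the patent's implementation: the nearest-neighbour `resize`, the list of scales, the pixel tolerance, and the use of a match ratio in place of the patent's "fixed number" of matching pixels are all assumptions.

```python
def resize(template, scale):
    """Nearest-neighbour resize of a 2D list (illustrative helper)."""
    h, w = len(template), len(template[0])
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    return [
        [template[min(h - 1, int(y / scale))][min(w - 1, int(x / scale))]
         for x in range(new_w)]
        for y in range(new_h)
    ]


def find_match(image, template, min_ratio=0.9, tolerance=10):
    """SP2-SP3: raster-scan and return (x, y, w, h) of the first matching region."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    needed = min_ratio * th * tw
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            agree = sum(
                1
                for dy in range(th)
                for dx in range(tw)
                if abs(image[y + dy][x + dx] - template[dy][dx]) <= tolerance
            )
            if agree >= needed:
                return (x, y, tw, th)
    return None  # also reached when the template is larger than the image


def extract_features(image, templates, scales=(1.0, 1.5, 2.0, 0.5)):
    """Try each template (SP1, SP7, SP8) at each scale (SP5, SP6)."""
    found = []
    for name, template in templates.items():
        for scale in scales:
            region = find_match(image, resize(template, scale))
            if region is not None:
                found.append((name, scale, region))  # SP4: register the region
                break                                # then move on to SP7
    return found


# A 5x5 subject image containing a half-size copy of the 4x4 "eye" template.
image = [[100] * 5 for _ in range(5)]
image[1][1], image[1][2] = 255, 0
image[2][1], image[2][2] = 0, 255
templates = {
    "eye": [[255, 255, 0, 0], [255, 255, 0, 0],
            [0, 0, 255, 255], [0, 0, 255, 255]],
}
features = extract_features(image, templates)
print(features)
```

In this toy case the template fails at its original size and at the enlarged sizes, and matches only after reduction, which is the situation the enlarge/reduce loop of SP5 and SP6 is meant to handle.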

(1-3) Specific Processing of the Multi-View Video Generation Unit in the Multi-View Video Generation Process
Next, the specific processing by which the multi-view video generation unit 11 generates a multi-view video (hereinafter, the multi-view video generation process) is described. First, the camera information of the cameras 3A to 3n that the multi-view video generation unit 11 holds in the camera information storage unit 12 is explained.

FIG. 10 shows the specific contents of the camera information of the cameras 3A to 3n that the multi-view video generation unit 11 holds in the camera information storage unit 12. As FIG. 10 shows, each piece of camera information consists of position information, orientation information, angle-of-view information, and zoom magnification information for the corresponding camera.

The position information gives the world coordinates of the corresponding camera, treating the place where the cameras 3A to 3n are installed as a single space, and the orientation information gives the shooting direction (optical axis direction) of that camera. In FIG. 10, θ_n (n = 1, 2, 3, ...) is the tilt angle that the optical axis of the camera makes with the horizontal plane, and φ_n (n = 1, 2, 3, ...) is the angle that the optical axis of the camera makes with a reference direction in the horizontal plane.

The angle-of-view information represents the angle of view of the camera 3A to 3n. In FIG. 10, α_n (n = 1, 2, 3, ...) denotes the horizontal angle of view of the camera, and β_n (n = 1, 2, 3, ...) denotes its vertical angle of view. The zoom magnification information represents the zoom magnification of the camera 3A to 3n.

Of this camera information, the position information and the angle-of-view information are stored in the camera information storage unit 12 in advance by the user. The orientation information is updated automatically, for example by the video control unit 7, based on the orientation information of the corresponding camera 3A to 3n supplied to the video search device 5 from the holding mechanisms (not shown) that hold the respective cameras 3A to 3n. The zoom magnification information is updated automatically based on the current zoom magnification that each camera 3A to 3n successively reports to the video search device 5.
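The camera information described above can be pictured as one record per camera. The following is only a minimal sketch: the class name, field names, and the axis convention (z pointing up, φ measured from the +x axis) are assumptions made for illustration and are not taken from FIG. 10.

```python
import math
from dataclasses import dataclass


@dataclass
class CameraInfo:
    """One camera's entry in the camera information (field names are hypothetical)."""
    position: tuple          # world coordinates (x, y, z), set in advance by the user
    theta: float             # tilt angle of the optical axis from the horizontal plane [rad]
    phi: float               # angle of the optical axis from the reference direction [rad]
    alpha: float             # horizontal angle of view [rad], set in advance by the user
    beta: float              # vertical angle of view [rad], set in advance by the user
    zoom: float = 1.0        # current zoom magnification

    def update_orientation(self, theta, phi):
        # Called when the holding mechanism reports a new orientation.
        self.theta, self.phi = theta, phi

    def update_zoom(self, zoom):
        # Called when the camera reports its current zoom magnification.
        self.zoom = zoom

    def optical_axis(self):
        # Unit vector of the shooting direction, assuming z is up and
        # phi is measured from the +x axis on the horizontal plane.
        c = math.cos(self.theta)
        return (c * math.cos(self.phi), c * math.sin(self.phi), math.sin(self.theta))


cam = CameraInfo(position=(0.0, 0.0, 3.0), theta=0.0, phi=0.0, alpha=1.0, beta=0.8)
cam.update_zoom(2.0)
```

Separating the user-supplied fields from the two update methods mirrors the split in the text between values entered in advance and values refreshed automatically.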

Next, with reference to FIG. 11, a method will be described for generating a multi-view video 33 of a subject 30 as seen from an arbitrary viewpoint position, based on the video information of subject images 32A and 32B extracted from the videos captured by two cameras 31A and 31B that shoot the same subject 30.

In this case, corresponding points on the subject 30 are first detected in the subject images 32A and 32B. A corresponding point can be detected by finding, in the subject image 32B captured by the other camera 31B, the same feature part as a feature part in the subject image 32A captured by one camera 31A. Specifically, for example, the position of the "left eye" in the reference subject image 32A and the position of the "left eye" in the other subject image 32B are taken as corresponding points. In the same way, several corresponding points on the subject 30 are detected in the subject images 32A and 32B.

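Then, based on the coordinates of these detected corresponding points, a planar projection matrix H satisfying the following equation is determined:

$$\begin{pmatrix} a_2 \\ b_2 \\ 1 \end{pmatrix} \simeq H \begin{pmatrix} a_1 \\ b_1 \\ 1 \end{pmatrix} \qquad (1)$$

where (a₁, b₁) and (a₂, b₂) are the coordinates of a pair of corresponding points in the subject images 32A and 32B, respectively, and ≃ denotes equality up to a scale factor.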

次いで、この平面射影行列Ｈを用いて、基準とする一方の被写体映像３２Ａ内の全画素に対して行列変換を行う。これにより一方の被写体映像３２Ａ内の各画素とそれぞれ対応する他方の被写体映像３２Ｂ内の画素を検出することができる。 Next, using this planar projection matrix H, matrix conversion is performed on all the pixels in the one subject video 32A as a reference. As a result, it is possible to detect the pixels in the other subject image 32B corresponding to the pixels in the one subject image 32A.
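Determining H from corresponding points and applying it to pixels can be sketched as follows. This is an illustration rather than the method of the embodiment: it assumes H is a 3×3 homography acting on homogeneous pixel coordinates, uses the standard direct linear transform (which needs at least four correspondences), and the function names are hypothetical.

```python
import numpy as np


def estimate_homography(pts_a, pts_b):
    """Estimate a 3x3 matrix H with pts_b ~ H @ pts_a (direct linear transform)."""
    pts_a = np.asarray(pts_a, dtype=float)
    pts_b = np.asarray(pts_b, dtype=float)
    if len(pts_a) < 4 or len(pts_a) != len(pts_b):
        raise ValueError("need at least four point pairs of equal count")
    rows = []
    for (x, y), (u, v) in zip(pts_a, pts_b):
        # Each correspondence contributes two linear constraints on the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]  # fix the free scale (assumes h[2, 2] is not zero)


def apply_homography(h, pts):
    """Map 2D points through H and convert back from homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ h.T
    return mapped[:, :2] / mapped[:, 2:3]
```

In practice the point coordinates are usually normalized before the decomposition for numerical stability; that step is omitted here to keep the sketch short.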

After this, the ratio between the distance from one camera 31A to the viewpoint position of the multi-view video 33 to be generated and the distance from the other camera 31B to that viewpoint position is obtained. This ratio can be obtained based on the position information of each camera 3A to 3n stored in the camera information storage unit 12.

Here, let the ratio of the distance from one camera 31A to the viewpoint position of the multi-view video 33 to the distance from the other camera 31B to that viewpoint position be γ : (1 − γ) (where 0 ≤ γ ≤ 1), let P₁ be the position vector of a pixel on one subject image 32A, and let P₂ be the position vector of the corresponding pixel on the other subject image 32B. The position vector P₁₂ of the corresponding point on the multi-view video 33 at that viewpoint position can then be calculated by the following equation:

$$P_{12} = (1 - \gamma)\,P_1 + \gamma\,P_2 \qquad (2)$$
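As a worked illustration of this step, the sketch below derives γ from the camera positions and then interpolates a pixel position. It assumes the interpolation takes the form P₁₂ = (1 − γ)P₁ + γP₂, which is the form implied by the ratio γ : (1 − γ) (γ = 0 places the viewpoint at camera 31A, so the result must equal P₁); the function names are hypothetical.

```python
import math


def viewpoint_ratio(cam_a_pos, cam_b_pos, view_pos):
    """Return gamma such that dist(A, view) : dist(B, view) = gamma : (1 - gamma)."""
    d_a = math.dist(cam_a_pos, view_pos)
    d_b = math.dist(cam_b_pos, view_pos)
    if d_a + d_b == 0:
        raise ValueError("cameras and viewpoint coincide")
    return d_a / (d_a + d_b)


def interpolate_position(p1, p2, gamma):
    """Position of the corresponding point in the intermediate view."""
    return tuple((1 - gamma) * c1 + gamma * c2 for c1, c2 in zip(p1, p2))


gamma = viewpoint_ratio((0, 0, 0), (4, 0, 0), (1, 0, 0))   # viewpoint is closer to camera A
p12 = interpolate_position((10, 20), (30, 40), gamma)
```

With the viewpoint one unit from camera A and three units from camera B, γ is 0.25, so the interpolated position lies a quarter of the way from P₁ to P₂.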

Accordingly, once this ratio has been obtained, the coordinates (a₁₂, b₁₂) of the corresponding point on the multi-view video 33 at that viewpoint position are obtained from equation (2), based on the coordinates (a₁, b₁) of a pixel on one subject image 32A and the coordinates (a₂, b₂) of the corresponding pixel on the other subject image 32B.

After this, the pixel value of the pixel at that corresponding point on the multi-view video 33 is obtained by linear interpolation, using the pixel value at the coordinates (a₁, b₁) on one subject image 32A and the pixel value at the coordinates (a₂, b₂) on the other subject image 32B.

以上の処理を一方の被写体映像３２Ａ上のすべての画素について実行する。これにより同一の被写体３０を撮影する２台のカメラ３１Ａ，３１Ｂの各撮影映像から抽出した被写体映像３２Ａ，３２Ｂの映像情報に基づいて、任意の視点位置の多視点映像３３を得ることができる。 The above processing is executed for all pixels on one subject image 32A. As a result, a multi-view video 33 at an arbitrary viewpoint position can be obtained based on the video information of the subject videos 32A and 32B extracted from the shot videos of the two cameras 31A and 31B that shoot the same subject 30.
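The per-pixel procedure just described can be sketched as a simple forward mapping. Several points are assumptions made for this illustration: H maps homogeneous pixel coordinates of the first image to the second, pixel values are blended with the same weights (1 − γ) and γ as the positions, coordinates are rounded to the nearest pixel, and pixels that fall outside either image are skipped. The function name is hypothetical.

```python
import numpy as np


def synthesize_view(img_a, img_b, h, gamma):
    """Build an intermediate view by visiting every pixel of img_a."""
    height, width = img_a.shape[:2]
    out = np.zeros(img_a.shape, dtype=float)
    for b1 in range(height):
        for a1 in range(width):
            # Corresponding pixel (a2, b2) in the second image via H.
            x, y, w = h @ np.array([a1, b1, 1.0])
            if abs(w) < 1e-12:
                continue
            a2, b2 = x / w, y / w
            col_b, row_b = int(round(a2)), int(round(b2))
            if not (0 <= row_b < img_b.shape[0] and 0 <= col_b < img_b.shape[1]):
                continue
            # Position (a12, b12) of the corresponding point in the new view.
            a12 = int(round((1 - gamma) * a1 + gamma * a2))
            b12 = int(round((1 - gamma) * b1 + gamma * b2))
            if 0 <= b12 < height and 0 <= a12 < width:
                # Linear interpolation of the two pixel values.
                out[b12, a12] = (1 - gamma) * img_a[b1, a1] + gamma * img_b[row_b, col_b]
    return out
```

Forward mapping of this kind can leave unfilled pixels in the output when the two views differ strongly; a practical implementation would need some hole filling, which is outside the scope of this sketch.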

Based on the above principle, the multi-view video generation unit 11 generates the multi-view video of the search target subject by executing the multi-view video generation process shown in FIG. 12 in accordance with a control program stored in an internal memory (not shown).

In practice, when the above-described multi-view video generation request is given from the video control unit 7, the multi-view video generation unit 11 starts the multi-view video generation process shown in FIG. 12. First, as the two subject images used to generate the multi-view video, it refers to the camera information of the cameras 3A to 3n stored in the camera information storage unit 12 and selects the subject images based on the camera video signals supplied from the two cameras 3A to 3n located closest to the default viewpoint position (SP10).

Next, using the method described with reference to FIG. 11, the multi-view video generation unit 11 detects a plurality of corresponding points on these two subject images (SP11) and, based on the detection result, generates a multi-view video of the subject as seen from the default viewpoint position (SP12). The multi-view video generation unit 11 then ends this multi-view video generation process.

Thereafter, when the user gives an operation input to change the viewpoint position of the multi-view video 17C displayed on the search result display screen 17 described above with reference to FIG. 6(B), the video control unit 7 gives the multi-view video generation unit 11 a request to generate a multi-view video whose viewpoint position is the position designated by the user (a multi-view video generation request).

When this multi-view video generation request is given, the multi-view video generation unit 11 generates a multi-view video in the same manner as described above. The multi-view video generation process in this case is the same as described above, except that in step SP10 it selects the subject images based on the camera video signals supplied from the two cameras 3A to 3n located closest to the viewpoint position designated by the user.
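The camera selection in step SP10 amounts to ranking the cameras by distance to the requested viewpoint. A minimal sketch follows, assuming the position information is available as a mapping from a camera identifier to world coordinates and that plain Euclidean distance is the closeness measure; both the data layout and the function name are hypothetical.

```python
import math


def select_two_nearest(camera_positions, view_pos):
    """Return the ids of the two cameras closest to view_pos, nearest first."""
    if len(camera_positions) < 2:
        raise ValueError("at least two cameras are required")
    ranked = sorted(
        camera_positions,
        key=lambda cam_id: math.dist(camera_positions[cam_id], view_pos),
    )
    return ranked[0], ranked[1]


cameras = {"3A": (0, 0, 0), "3B": (10, 0, 0), "3C": (3, 0, 0)}
pair = select_two_nearest(cameras, (2, 0, 0))
```

The same function serves both cases in the text: it is called with the default viewpoint position for the initial request and with the user-designated position for later requests.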

(1-4) Specific processing of the video control unit in the subject search process
Meanwhile, when the user operates the operation unit 13 and selects the subject search mode, the video control unit 7 executes the subject search process shown in FIG. 13 based on a control program stored in an internal memory (not shown).

すなわち、映像制御部７は、被写体検索モードが選択されると、ユーザにより指定されたカメラ３Ａ〜３ｎから与えられる映像信号（選択カメラ映像信号）を上述の映像出力部１４を介してモニタ１５に送出することにより、この選択カメラ映像信号に基づく図６（Ａ）について上述したカメラ映像画面１６をモニタ１５に表示させる（ＳＰ２０）。 That is, when the subject search mode is selected, the video control unit 7 sends video signals (selected camera video signals) given from the cameras 3A to 3n designated by the user to the monitor 15 via the video output unit 14 described above. By sending out, the camera video screen 16 described above with reference to FIG. 6A based on the selected camera video signal is displayed on the monitor 15 (SP20).

また映像制御部７は、これと併せて、かかる選択カメラ映像信号に基づくカメラ映像内に被写体を検出したか否かを被写体抽出部９に定期的に問い合わせながら、被写体抽出部９がカメラ映像内の被写体を検出するのを待ち受ける（ＳＰ２１）。 In addition to this, the video control unit 7 periodically inquires the subject extraction unit 9 as to whether or not the subject has been detected in the camera video based on the selected camera video signal, and the subject extraction unit 9 It waits to detect the subject (SP21).

このとき被写体抽出部９は、通常モード時と同様の処理を行っており、選択カメラ映像信号に基づくカメラ映像内に少なくとも１つの被写体を検出した場合には、その旨を映像制御部７に通知すると共に、この後、かかる選択カメラ映像信号に基づくカメラ映像の各フレーム画像内にそれぞれ存在する各被写体の位置を映像制御部７に逐次報告する。 At this time, the subject extraction unit 9 performs the same processing as in the normal mode, and if it detects at least one subject in the camera video based on the selected camera video signal, notifies the video control unit 7 to that effect. Thereafter, the position of each subject existing in each frame image of the camera video based on the selected camera video signal is sequentially reported to the video control unit 7.

かくして映像制御部７は、被写体抽出部９から被写体を検出した旨の通知を受信すると、この後被写体抽出部９から逐次与えられる各被写体の位置情報に基づき生成した枠体画像信号を選択カメラ映像信号に重畳する（ＳＰ２２）。これにより、かかる選択カメラ映像信号に基づくカメラ映像内に存在する１つの被写体を取り囲むように枠体１６Ａ（図６（Ａ））が表示される。 Thus, when the video control unit 7 receives the notification that the subject is detected from the subject extraction unit 9, the frame image signal generated based on the position information of each subject that is sequentially given from the subject extraction unit 9 is selected camera video. It is superimposed on the signal (SP22). Thus, the frame 16A (FIG. 6A) is displayed so as to surround one subject existing in the camera video based on the selected camera video signal.

Thereafter, the video control unit 7 waits for the subject to be searched for (the search target subject) to be selected, while moving the frame 16A to positions surrounding other subjects in response to user operations (SP23). When the user eventually selects the search target subject, the video control unit 7 reads from the feature information storage unit 8A the video information of the feature parts of that search target subject detected at that time by the feature part extraction unit 10 (SP24).

Next, based on the read video information of the feature parts of the search target subject, the video control unit 7 searches, by pattern matching, the video information of the other feature parts accumulated in the feature information storage unit 8A for video information of feature parts presumed to belong to the search target subject (SP25).

The video control unit 7 then determines whether this search has detected video information of feature parts presumed to belong to the search target subject (SP26). If the result of this determination is negative, the video control unit 7 displays on the monitor 15 a message indicating that no subject video or the like of the search target subject could be detected (SP27), and then ends this subject search process.

これに対して映像制御部７は、ステップＳＰ２６の判断において肯定結果を得ると、そのとき検出した特徴部位の映像情報と関連付けられている被写体映像の映像ファイルのファイルＩＤと、当該特徴部位の映像情報と関連付けられている記録映像の映像ファイルのファイルＩＤとを特徴情報記憶部８Ａに格納されている特徴情報管理テーブル２１（図８）から読み出す（ＳＰ２８）。 On the other hand, if the video control unit 7 obtains a positive result in the determination at step SP26, the video ID of the video file of the subject video associated with the video information of the characteristic part detected at that time, and the video of the characteristic part The file ID of the video file of the recorded video associated with the information is read from the feature information management table 21 (FIG. 8) stored in the feature information storage unit 8A (SP28).

The video control unit 7 also sends to the multi-view video generation unit 11 a multi-view video generation request to generate a multi-view video using the subject video specified by the file ID acquired in step SP28 (SP29), and sends to the feature part extraction unit 10 a subject video portion detection request to detect, within the recorded video specified by the file ID acquired in step SP28, the video portion in which the search target subject appears (SP30).

Thereafter, the video control unit 7 reads from the recording medium 8 the video information of the subject video, the video information of the multi-view video generated by the multi-view video generation unit 11 based on that subject video, and the camera video information of the video portion, detected by the feature part extraction unit 10, of the corresponding recorded video in which the search target subject appears (SP31).

また映像制御部７は、記録媒体８から読み出したこれら被写体映像及び多視点映像の各映像情報と、カメラ映像情報とに基づいて図６（Ｂ）について上述した検索結果表示画面１７の画面信号を生成し、この画面信号を映像出力部１４を介してモニタ１５に与えることにより、当該検索結果表示画面１７をモニタ１５に表示させる（ＳＰ３２）。そして映像制御部７は、この後この被写体検索処理を終了する。 Further, the video control unit 7 outputs the screen signal of the search result display screen 17 described above with reference to FIG. 6B based on the video information of the subject video and the multi-view video read from the recording medium 8 and the camera video information. The search result display screen 17 is displayed on the monitor 15 by generating and supplying this screen signal to the monitor 15 via the video output unit 14 (SP32). Then, the video controller 7 thereafter ends this subject search process.
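The search in steps SP25 to SP28 can be pictured as scoring each stored feature patch against the query and then looking up the associated file IDs. The text does not specify the pattern matching method, so this sketch substitutes normalized cross-correlation over equally sized patches; the entry layout, the threshold value, and all names are hypothetical.

```python
import numpy as np
from dataclasses import dataclass


@dataclass
class FeatureEntry:
    """A stored feature part with its associated file IDs (layout is hypothetical)."""
    feature: np.ndarray        # image patch of the feature part
    subject_file_id: str       # file ID of the associated subject video
    recorded_file_id: str      # file ID of the associated recorded video


def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches, in [-1, 1]."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def search_features(query, entries, threshold=0.9):
    """Return (subject_file_id, recorded_file_id) for every entry matching the query."""
    return [
        (e.subject_file_id, e.recorded_file_id)
        for e in entries
        if e.feature.shape == query.shape and ncc(query, e.feature) >= threshold
    ]
```

An empty result list corresponds to the negative outcome of step SP26, in which case only the "not detected" message would be shown.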

(1-5) Effects of this embodiment
As described above, the video surveillance system 1 according to this embodiment acquires, based on the camera video signals supplied from the cameras 3A to 3n, the video information of the feature parts of each subject present in the camera videos based on those signals, together with its management information, and accumulates these in the feature information storage unit 8A as feature information of that subject, in association with the subject video and multi-view video of the subject.

In the video surveillance system 1, when a subject present in the camera video based on the camera video signal from any of the cameras 3A to 3n is subsequently designated (the search target subject), the feature information corresponding to the search target subject is searched for among the pieces of feature information accumulated in the feature information storage unit 8A, based on the feature information of the feature parts of that subject. Based on the search result, the subject video and multi-view video of that subject, and the portion of the camera video stored in the recording medium 8 in which that subject appears, are displayed on the monitor 15.

Therefore, according to this video surveillance system 1, a subject designated by the user can be searched for based on the feature information of each subject acquired from the videos captured by the cameras 3A to 3n. Video of a desired subject can thus be detected easily from an enormous amount of video, and the system is sufficiently practical as a video surveillance system for facilities where an unspecified large number of people come and go, such as hotels and department stores.

(2) Second embodiment
In FIG. 1, reference numeral 40 denotes, as a whole, a video surveillance system according to the second embodiment. This video surveillance system 40 differs from the video surveillance system 1 according to the first embodiment in that, in the video search device 41, the camera video screen 16 described above with reference to FIG. 6(A) and the search result display screen 17 described above with reference to FIG. 6(B) are combined and displayed on a single screen.

That is, in the video surveillance system 40 according to this embodiment, when the subject search mode is selected by a user operation, a search screen 42 as shown in FIG. 14 is displayed on the monitor 15.

この検索画面４２では、ユーザにより指定されたカメラ３Ａ〜３ｎから与えられる映像信号（選択カメラ映像信号）に基づくカメラ映像がカメラ映像表示欄４２Ａに表示され、そのカメラ映像表示欄４２Ａに表示されたカメラ映像内に存在するすべての被写体の被写体映像が被写体一覧表示欄４２Ｂに一覧表示される。 In this search screen 42, a camera video based on video signals (selected camera video signals) given from the cameras 3A to 3n designated by the user is displayed in the camera video display column 42A and displayed in the camera video display column 42A. Subject images of all subjects existing in the camera image are listed in the subject list display field 42B.

On the search screen 42, when the user operates the operation unit 13 and selects, from the subject images displayed in the subject list display field 42B, the subject image of the subject to be searched for (the search target subject), the subject video of that search target subject, its multi-view video, and the portion of the recorded video in which it appears are displayed in the basic video display field 42C, the multi-view video display field 42D, and the recorded video display field 42E, respectively.

図１５は、このような検索画面４２の表示を含めた第２の実施の形態による被写体検索処理に関する映像制御部４３（図１）の具体的な処理内容を示している。映像制御部４３は、ユーザにより操作部１３が操作されて被写体検索モードが選択されると、図示しない内部メモリに格納された制御プログラムに基づいてこの図１５に示す被写体検索処理を実行する。 FIG. 15 shows the specific processing contents of the video control unit 43 (FIG. 1) regarding the subject search processing according to the second embodiment including the display of the search screen 42 as described above. When the operation unit 13 is operated by the user and the subject search mode is selected, the video control unit 43 executes the subject search process shown in FIG. 15 based on a control program stored in an internal memory (not shown).

すなわち映像制御部４３は、被写体検索モードが選択されるとこの被写体検索処理を開始し、まず、所定の画面信号を映像出力部１４を介してモニタ１５に送出することにより、図１４について上述した検索画面４２をモニタ１５に表示させる（ＳＰ４０）。 That is, when the subject search mode is selected, the video control unit 43 starts this subject search process, and first sends a predetermined screen signal to the monitor 15 via the video output unit 14 to thereby describe the above-described FIG. The search screen 42 is displayed on the monitor 15 (SP40).

なお、この段階における検索画面４２では、選択カメラ映像信号に基づくカメラ映像が検索画面４２のカメラ映像表示欄４２Ａに表示されるが、被写体一覧表示欄４２Ｂや、基本映像表示欄４２Ｃ、多視点映像表示欄４２Ｄ及び記録映像表示欄４２Ｅには対応する映像は表示されない。 In the search screen 42 at this stage, the camera video based on the selected camera video signal is displayed in the camera video display column 42A of the search screen 42. However, the subject list display column 42B, the basic video display column 42C, and the multi-viewpoint video are displayed. The corresponding video is not displayed in the display column 42D and the recorded video display column 42E.

続いて映像制御部４３は、選択カメラ映像信号に基づくカメラ映像内に被写体を検出したか否かを被写体抽出部９に定期的に問い合わせながら、被写体抽出部９がカメラ映像内の被写体を検出するのを待ち受ける（ＳＰ４１）。 Subsequently, the video control unit 43 periodically inquires the subject extraction unit 9 as to whether or not the subject has been detected in the camera video based on the selected camera video signal, and the subject extraction unit 9 detects the subject in the camera video. (SP41).

When the video control unit 43 is eventually notified by the subject extraction unit 9 that a subject has been detected, it reads from the recording medium 8 the video information of the subject video of each subject detected at that time, which was generated by the subject extraction unit 9 and stored in the recording medium 8, and displays the subject videos based on that video information as a list in the subject list display field 42B (SP42).

続いて映像制御部４３は、操作部１３が操作されて検索画面４２の被写体一覧表示欄４２Ｂに表示された被写体映像の中から検索対象とする被写体（検索対象被写体）の被写体映像が選択されるのを待ち受け（ＳＰ４３）、やがて検索対象被写体が選択されると、図１３について上述した第１の実施の形態による被写体検索処理のステップＳＰ２４〜ステップＳＰ３１と同様にステップＳＰ４４〜ステップＳＰ５１を処理する。 Subsequently, the video control unit 43 operates the operation unit 13 to select the subject video of the subject to be searched (search target subject) from the subject videos displayed in the subject list display field 42B of the search screen 42. When a search target subject is eventually selected, step SP44 to step SP51 are processed in the same manner as step SP24 to step SP31 of the subject search process according to the first embodiment described above with reference to FIG.

Next, the video control unit 43 displays the subject video, the multi-view video, and the recorded video, based on the video information of the subject video, the multi-view video information, and the camera video information read from the recording medium 8 in step SP51, in the basic video display field 42C, the multi-view video display field 42D, and the recorded video display field 42E of the search screen 42, respectively (SP52), and then ends this subject search process.

以上のように本実施の形態による映像監視システム４０では、選択映像信号に基づくカメラ映像と、当該カメラ映像内の各被写体の被写体映像と、検索結果とが１つの検索画面４２内に表示されるため、第１の実施の形態による映像監視システム１と比較して、検索対象の視認性や、操作性を向上させることができる。 As described above, in the video monitoring system 40 according to the present embodiment, the camera video based on the selected video signal, the subject video of each subject in the camera video, and the search result are displayed in one search screen 42. Therefore, the visibility and operability of the search target can be improved as compared with the video monitoring system 1 according to the first embodiment.

(3) Third embodiment
In FIG. 1, reference numeral 50 denotes, as a whole, a video surveillance system according to the third embodiment. This video surveillance system 50 differs from the video surveillance system 1 according to the first embodiment in that the video search device 51 can search for a desired subject using a keyword.

That is, in the video surveillance system 50 according to this embodiment, when the subject search mode is selected, a keyword input dialog 52 as shown in FIG. 16 is displayed. When the user operates the operation unit 13 to enter a desired keyword in the keyword input field 52A of this keyword input dialog 52 and then presses the search button 52B, a search for subjects corresponding to that keyword is executed. A shooting date and time, characters, and the like can be used as the keyword. The search result is displayed on the monitor 15, for example as the search screen 54 shown in FIG. 17(A).

この検索画面５４では、そのときキーワード入力ダイアログ５２を用いてユーザが指定したキーワードがキーワード表示欄５４Ａに表示され、かかるキーワードに基づいて検出した各被写体の被写体映像が候補一覧表示欄５４Ｂに一覧表示される。 In this search screen 54, the keyword specified by the user using the keyword input dialog 52 at that time is displayed in the keyword display column 54A, and subject videos of each subject detected based on the keyword are displayed in a list in the candidate list display column 54B. Is done.

そして検索画面５４では、この後、ユーザが操作部１３を操作して候補一覧表示欄５４Ｂに表示された被写体画像の中から検索対象の被写体（検索対象被写体）の被写体画像を選択することによって、図１７（Ｂ）に示すように、その検索対象被写体の被写体映像、多視点映像及びその検索対象被写体が写っている記録映像の映像部分をそれぞれ検索画面５４内のそれまで空きスペースであった領域に表示させることができる。 On the search screen 54, the user then operates the operation unit 13 to select a subject image of a search target subject (search target subject) from the subject images displayed in the candidate list display field 54B. As shown in FIG. 17 (B), the subject video of the search subject, the multi-view video, and the video portion of the recorded video in which the search subject is shown, respectively, are areas that were previously empty spaces in the search screen 54. Can be displayed.

図１８は、このような検索画面５４の表示を含めた第３の実施の形態による被写体検索処理に関する映像制御部５３の具体的な処理内容を示している。映像制御部５３は、ユーザにより操作部１３が操作されて被写体検索モードが選択されると、図示しない内部メモリに格納された制御プログラムに基づいてこの図１８に示す被写体検索処理を実行する。 FIG. 18 shows the specific processing contents of the video control unit 53 related to the subject search processing according to the third embodiment including the display of the search screen 54 as described above. When the operation unit 13 is operated by the user and the subject search mode is selected, the video control unit 53 executes the subject search process shown in FIG. 18 based on a control program stored in an internal memory (not shown).

That is, when the subject search mode is selected, the video control unit 53 starts this subject search process and first sends a predetermined screen signal to the monitor 15 via the video output unit 14, thereby displaying the keyword input dialog 52 described above with reference to FIG. 16 on the monitor 15 (SP60).

Next, the video control unit 53 waits for a keyword to be entered in the keyword input field 52A of the keyword input dialog 52 and for the search button 52B to then be pressed (SP61). When a keyword has been entered in the keyword input field 52A and the search button 52B has been pressed, it searches for the corresponding entries in the feature information management table 21 (FIG. 8) (SP62).

For example, when the keyword is a shooting date and time, the video control unit 53 successively compares the time code stored in the time code column 21G of each entry of the feature information management table 21 (FIG. 8) with the keyword, and detects all entries whose time code matches the keyword. When the keyword is a character string, the video control unit 53 successively compares the characters stored in the recognition result column 21F of each entry whose attribute is "character" in the feature information management table 21 with the keyword, and detects all entries whose characters match the keyword.
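The two keyword comparisons described above can be sketched as a linear scan over table entries. The entry layout below is a simplified stand-in for the feature information management table 21: the field names, the string formats, and the explicit `kind` argument that tells the function how to interpret the keyword are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TableEntry:
    """Simplified table entry (field names are hypothetical)."""
    subject_file_id: str      # file ID of the subject video
    attribute: str            # e.g. "face" or "character"
    recognition_result: str   # recognized characters, if any
    time_code: str            # shooting date and time


def search_by_keyword(entries, keyword, kind):
    """Return every entry matching the keyword; kind is 'datetime' or 'character'."""
    if kind == "datetime":
        return [e for e in entries if e.time_code == keyword]
    if kind == "character":
        return [
            e for e in entries
            if e.attribute == "character" and e.recognition_result == keyword
        ]
    raise ValueError(f"unsupported keyword kind: {kind}")


table = [
    TableEntry("S1", "face", "", "2008-05-30 10:00"),
    TableEntry("S2", "character", "HITACHI", "2008-05-30 10:05"),
    TableEntry("S3", "character", "HITACHI", "2008-05-30 10:00"),
]
```

The subject video file IDs of the returned entries are what the next step would use to load the candidate subject videos.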

そして映像制御部５３は、この検索により対応するエントリを検出すると、そのエントリの被写体映像ファイルＩＤ欄２１Ｃ（図８）に格納されているファイルＩＤを読み出し、そのファイルＩＤが付与された映像ファイルのファイル情報（被写体映像の映像情報）を記録媒体８から読み出す。また映像制御部５３は、このようにして記録媒体８から読み出したファイル情報に基づく被写体映像と、キーワード入力ダイアログ５２を介して入力されたキーワードとが表示された検索画面５４（図１７）を、モニタ１５に表示させる（ＳＰ６３）。 When the video control unit 53 detects a corresponding entry by this search, the video control unit 53 reads the file ID stored in the subject video file ID column 21C (FIG. 8) of the entry, and the video file to which the file ID is assigned is read. File information (video information of the subject video) is read from the recording medium 8. In addition, the video control unit 53 displays a search screen 54 (FIG. 17) on which the subject video based on the file information read from the recording medium 8 in this way and the keywords input via the keyword input dialog 52 are displayed. It is displayed on the monitor 15 (SP63).

Thereafter, when one subject video is selected from the subject videos displayed in the candidate list display field 54B of the search screen 54 (SP64), the video control unit 53 performs steps SP65 to SP72 in the same manner as steps SP24 to SP31 of the subject search process according to the first embodiment described above with reference to FIG. 13. It thereby reads from the recording medium 8 the video information of the subject video and of the multi-view video of the subject of the subject video selected in step SP64 (the search target subject), and the video information of the captured video in which that subject appears.

Then, based on the video information of the subject video, the video information of the multi-view video, and the camera video information of the portion of the corresponding recorded video in which the search target subject appears, all read from the recording medium 8 in step SP72, the video control unit 53 displays the subject video and multi-view video of the search target subject, and the portion of the camera video recorded on the recording medium 8 in which the search target subject appears, at their respective predetermined positions on the search screen 54 (SP73), and then ends this subject search process.

As described above, the video surveillance system according to the present embodiment can search for the search target subject using keywords, so a desired subject can be found based on the shooting date and time or the characteristics of the subject. Usability is thus further improved over the video surveillance system 1 according to the first embodiment.
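The keyword lookup described for the third embodiment can be pictured with a minimal Python sketch. The entry layout is an assumption made for illustration: the text only states that each entry of the feature information management table carries a subject video file ID (column 21C), so the `shot_at` and `feature_label` fields, the class name `FeatureEntry`, and the substring matching rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureEntry:
    """Hypothetical stand-in for one entry of the feature information management table."""
    shot_at: str                 # assumed text field, e.g. "2008-05-30 14:05"
    feature_label: str           # assumed text field, e.g. "red jacket"
    subject_video_file_id: str   # plays the role of column 21C


def search_by_keywords(table, keywords):
    """Return the file IDs of entries whose text fields contain every keyword."""
    hits = []
    for entry in table:
        haystack = f"{entry.shot_at} {entry.feature_label}".lower()
        if all(keyword.lower() in haystack for keyword in keywords):
            # Several feature parts may point at the same subject video file.
            if entry.subject_video_file_id not in hits:
                hits.append(entry.subject_video_file_id)
    return hits


table = [
    FeatureEntry("2008-05-30 14:05", "red jacket", "F001"),
    FeatureEntry("2008-05-30 14:05", "black bag", "F001"),
    FeatureEntry("2008-05-31 09:12", "red hat", "F002"),
]
candidates = search_by_keywords(table, ["2008-05-30", "red"])
```

The returned file IDs correspond to the subject videos that would be listed as candidates on the search screen; the actual matching rule used by the device is not specified in the text.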

(4) Fourth Embodiment
In FIG. 19, in which parts corresponding to those in FIG. 1 are given the same reference numerals, 60 denotes as a whole a video surveillance system according to the fourth embodiment. The video surveillance system 60 differs from the video surveillance system 1 according to the first embodiment in that the video search device 61 is provided with a video information communication unit 62.

The video information communication unit 62 is an interface for communicating with an external database 63, and the video control unit 64 can exchange information with the external database 63 via the video information communication unit 62.

The external database 63 can store various kinds of information related to the video search described above.

For example, when an image list of wanted criminals is stored in the external database 63, the video control unit 64 reads the image list from the external database 63 via the video information communication unit 62 and records it on the recording medium 8. The video control unit 64 also controls the subject extraction unit 9 to extract the image of the criminal (hereinafter, the criminal image) from each image registered in the image list recorded on the recording medium 8, and controls the feature part extraction unit 10 to extract and collect the feature information of the feature parts (the video information of each feature part and its management information) from the criminal image.

The video control unit 64 then compares, by pattern matching, the video information of the feature parts of the criminal image extracted at this time with the video information of each entry of the feature information management table 21 (FIG. 8) stored in the feature information storage unit 8A. If a matching entry exists, it displays a notice to that effect on the monitor 15.
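As a rough illustration of the comparison step, the following Python sketch scores a query patch against stored patches using normalized cross-correlation. This is only one common pattern matching measure, swapped in for illustration; the text does not say which measure the device uses. The function names, the representation of a feature part as a small grayscale grid, and the threshold value are assumptions.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equal-size grayscale patches (lists of rows)."""
    xs = [p for row in patch_a for p in row]
    ys = [p for row in patch_b for p in row]
    if not xs or len(xs) != len(ys):
        raise ValueError("patches must be non-empty and the same size")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(
        sum((x - mean_x) ** 2 for x in xs) * sum((y - mean_y) ** 2 for y in ys)
    )
    # A flat patch has zero variance; treat it as uncorrelated.
    return num / den if den else 0.0


def find_matches(query_patch, entries, threshold=0.9):
    """Return the IDs of stored entries whose patch scores at or above the threshold."""
    return [
        entry_id
        for entry_id, patch in entries.items()
        if ncc(query_patch, patch) >= threshold
    ]


# Hypothetical data: one feature part from a criminal image, two stored entries.
query = [[10, 200], [30, 220]]
stored = {
    "E1": [[12, 198], [29, 223]],   # nearly the same pattern
    "E2": [[200, 10], [220, 30]],   # mirrored pattern
}
matched = find_matches(query, stored)
```

A non-empty result would correspond to the case in which a matching entry exists and a notice is shown on the monitor. A real system would also need to handle differences in scale and viewpoint, which this sketch ignores.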

As described above, the video surveillance system 60 according to the present embodiment enables search and collation using both the feature information of each subject accumulated in the feature information storage unit 8A and the data stored in the external database 63, so usability is further improved over the video surveillance system 1 according to the first embodiment.

(5) Other Embodiments
In the first to fourth embodiments described above, the video surveillance systems 1, 40, 50, 60 are configured as shown in FIG. 1 and FIG. 19. However, the present invention is not limited to this, and a wide variety of other configurations can be applied.

In the first to fourth embodiments described above, the feature information storage unit 8A is configured as a part of the recording medium 8. However, the present invention is not limited to this, and the feature information storage unit 8A and the recording medium 8 may be configured as separate units.

Further, in the first to fourth embodiments described above, the search result display screen 17 (FIG. 6(B)) and the search screens 42, 54 (FIG. 14, FIG. 17) display all of the following as the search result: the subject video of the search target subject, the multi-view video, and the portions of the camera video recorded on the recording medium 8 in which the search target subject appears. However, the present invention is not limited to this, and the search result display screen 17 and the search screens 42, 54 may display only some of these as the search result.

Further, in the first to fourth embodiments described above, the video surveillance systems 1, 40, 50, 60 store the camera video information from the cameras 3A to 3n on the recording medium 8 as is. However, the present invention is not limited to this. For example, as shown in FIG. 20, the feature part extraction unit 10 may, at an arbitrary timing and based on the feature information of each subject stored in the feature information storage unit 8A, divide the recorded video based on the camera video information recorded on the recording medium 8 into video portions for each subject, or delete video portions in which no subject appears. This can increase the processing speed when displaying search results on the search result display screen 17 and the search screens 42, 54, and can reduce the amount of video information to be recorded on the recording medium 8.
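The division and deletion described here can be sketched in Python under a simplifying assumption: that the feature information can be reduced to a per-frame set of visible subject IDs. That representation, and the function names below, are hypothetical and not taken from the text.

```python
def split_by_subject(frame_subjects):
    """Map each subject ID to the contiguous (first_frame, last_frame) ranges it appears in.

    frame_subjects: list in which item i is the set of subject IDs visible in frame i.
    """
    segments = {}
    open_start = {}  # subject ID -> first frame of its currently open range
    for index, visible in enumerate(frame_subjects):
        for subject_id in visible:
            open_start.setdefault(subject_id, index)
        # Close the range of every subject that has just left the frame.
        for subject_id in list(open_start):
            if subject_id not in visible:
                start = open_start.pop(subject_id)
                segments.setdefault(subject_id, []).append((start, index - 1))
    last = len(frame_subjects) - 1
    for subject_id, start in open_start.items():
        segments.setdefault(subject_id, []).append((start, last))
    return segments


def frames_to_keep(frame_subjects):
    """Indices of frames showing at least one subject; the rest may be deleted."""
    return [index for index, visible in enumerate(frame_subjects) if visible]


frames = [set(), {"A"}, {"A", "B"}, {"B"}, set(), {"A"}]
per_subject = split_by_subject(frames)
kept = frames_to_keep(frames)
```

With per-subject ranges precomputed, displaying a search result only requires reading the ranges for one subject, and frames outside `kept` need not be stored, which is consistent with the speed and storage benefits stated above.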

Further, in the first to fourth embodiments described above, the multi-view video generation unit 11 generates the multi-view video after the user designates the search target subject. However, the present invention is not limited to this. For example, the multi-view video may be generated immediately after the video information of the subject video supplied from the subject extraction unit 9 is stored on the recording medium 8, using that video information. Various other timings can also be applied as the timing at which the multi-view video generation unit 11 generates the multi-view video.
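The two timings mentioned here, generation on demand after the user designates a subject and generation as soon as the subject video is stored, can be contrasted with a small Python sketch. The class `MultiViewCache`, its methods, and the `build_model` callback are hypothetical names introduced for illustration; the actual generation of the three-dimensional model is left as a caller-supplied function.

```python
class MultiViewCache:
    """Hypothetical holder that builds a multi-view model either eagerly or on demand."""

    def __init__(self, build_model, eager=False):
        self._build = build_model   # caller-supplied function: views -> model
        self._eager = eager
        self._sources = {}          # subject ID -> stored subject videos
        self._models = {}           # subject ID -> generated multi-view model

    def on_subject_recorded(self, subject_id, views):
        """Called when subject video information has been stored."""
        self._sources[subject_id] = views
        self._models.pop(subject_id, None)  # drop any model built from older views
        if self._eager:
            self._models[subject_id] = self._build(views)

    def get(self, subject_id):
        """Called when the user designates the subject as the search target."""
        if subject_id not in self._models:
            self._models[subject_id] = self._build(self._sources[subject_id])
        return self._models[subject_id]


build_calls = []


def fake_build(views):
    # Stand-in for real model generation; records how often it is invoked.
    build_calls.append(len(views))
    return ("model", tuple(views))


lazy = MultiViewCache(fake_build, eager=False)
lazy.on_subject_recorded("s1", ["view_a", "view_b"])
calls_before_request = list(build_calls)
lazy_model = lazy.get("s1")
```

Eager generation moves the cost to recording time so that search results appear faster, while on-demand generation avoids building models for subjects that are never searched; which trade-off suits a given installation is not addressed in the text.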

Further, in the first to fourth embodiments described above, the video control units 7, 43, 53, 64 are given the function of a video search unit that, based on the feature information of the subject designated as the search target (the search target subject), searches the feature information stored in the feature information storage unit 8A for the feature information corresponding to the search target subject. However, the present invention is not limited to this, and a functional block having this function may be provided separately from the video control units 7, 43, 53, 64.

FIG. 1 is a block diagram showing the overall configuration of the video surveillance system according to the first to third embodiments.
FIG. 2 is a schematic diagram showing an example of camera installation positions.
FIG. 3(A) to (D) are schematic diagrams showing the images captured by the cameras of FIG. 2.
FIG. 4(A) to (D) are schematic diagrams showing subject videos based on the captured images of the cameras of FIG. 3.
FIG. 5(A) to (B) are schematic diagrams showing an example of creating a multi-view video based on the subject videos of FIG. 4.
FIG. 6(A) is a schematic diagram showing a configuration example of the camera video screen, and FIG. 6(B) is a schematic diagram showing a configuration example of the search result display screen.
FIG. 7 is a conceptual diagram for explaining the template image table.
FIG. 8 is a conceptual diagram for explaining the feature information management table.
FIG. 9 is a flowchart showing the procedure of the feature part extraction process.
FIG. 10 is a table showing a configuration example of the camera information.
FIG. 11 is a conceptual diagram for explaining the method of generating a multi-view video.
FIG. 12 is a flowchart showing the procedure of the multi-view video generation process.
FIG. 13 is a flowchart showing the procedure of the subject search process according to the first embodiment.
FIG. 14 is a schematic diagram showing the configuration of the search screen according to the second embodiment.
FIG. 15 is a flowchart showing the procedure of the subject search process according to the second embodiment.
FIG. 16 is a schematic diagram showing a configuration example of the keyword input dialog.
FIG. 17 is a schematic diagram showing the configuration of the search screen according to the third embodiment.
FIG. 18 is a flowchart showing the procedure of the subject search process according to the third embodiment.
FIG. 19 is a block diagram showing the overall configuration of the video surveillance system according to the fourth embodiment.
FIG. 20 is a conceptual diagram for explaining another embodiment.

Explanation of symbols

1, 40, 50, 60 ... video surveillance system, 2 ... subject, 3, 3A to 3n ... camera, 5, 41, 51, 61 ... video search device, 7, 43, 53, 64 ... video control unit, 8 ... recording medium, 8A ... feature information storage unit, 9 ... subject extraction unit, 10 ... feature part extraction unit, 11 ... multi-view video generation unit, 12 ... camera information storage unit, 13 ... operation unit, 15 ... monitor, 16 ... camera video screen, 17 ... search result display screen, 20 ... template image table, 21 ... feature information management table, 42, 54 ... search screen, 52 ... keyword input dialog, 62 ... video information communication unit, 63 ... external database.

Claims

A video search device comprising:
a subject extraction unit that extracts each subject present in a video based on first video information given from outside, and records the video information of each extracted subject on a recording medium;
a feature part extraction unit that extracts a feature part of each subject extracted by the subject extraction unit;
a storage unit that stores feature information for each subject, which is information on the feature part of that subject extracted by the feature part extraction unit, in association with the video information of that subject recorded on the recording medium;
an output unit that outputs the subjects extracted from the first video information to a display unit;
a designation unit that designates a subject to be searched from among the subjects displayed on the display unit, based on a user operation; and
a video search unit that, based on the subject designated as the search target, searches the feature information stored in the storage unit for the feature information corresponding to that subject, reads from the recording medium the video information of the subject associated with the feature information obtained as the search result, and outputs that video information via the output unit.

The video search device according to claim 1, wherein the video search unit reads from the recording medium the video information of the subject associated with the feature information detected by the search, and displays a video based on that video information as a search result.

The video search device according to claim 2, wherein
the first video information consists of a plurality of pieces of second video information obtained by shooting the same subject from multiple viewpoints,
the subject extraction unit, for each piece of the second video information, extracts each subject present in the video based on that second video information and records the video information of the extracted subject on the recording medium,
the video search device further comprises a multi-view video generation unit that generates a multi-view video composed of a three-dimensional model of the subject, based on the video information of the same subject present in each video based on the second video information, and
the video search unit displays, as a search result, the multi-view video generated by the multi-view video generation unit based on the video information of the subject designated as the search target.

The video search device according to claim 1, wherein
the feature information of the subject includes video information of the feature part of the subject, and
the video search unit, based on the feature information of the subject designated as the search target, searches the feature information stored in the storage unit for the feature information corresponding to that subject by video pattern matching processing.

The video search device according to claim 1, wherein
the video search unit records the first video information on the recording medium, and
the feature part extraction unit divides the first video information recorded on the recording medium into video portions for each subject, based on the feature information stored in the storage unit.

The video search device according to claim 5, wherein the feature part extraction unit deletes, from the first video information recorded on the recording medium, the video information of video portions in which no subject appears.

A video search method comprising:
a first step of extracting each subject present in a video based on first video information given from outside, recording the video information of each extracted subject on a recording medium, extracting a feature part of each extracted subject, and storing feature information for each subject, which is information on the extracted feature part of that subject, in association with the video information of that subject recorded on the recording medium; and
a second step of outputting the subjects extracted from the first video information to a display unit, designating a subject to be searched from among the subjects displayed on the display unit based on a user operation, searching the stored feature information for the feature information corresponding to the subject designated as the search target, reading from the recording medium the video information associated with the feature information detected by the search, and outputting that video information via the display unit.

The video search method according to claim 7, wherein, in the second step, the video information of the subject associated with the feature information detected by the search is read from the recording medium, and a video based on that video information is displayed as a search result.

The video search method according to claim 8, wherein
the first video information consists of a plurality of pieces of second video information obtained by shooting the same subject from multiple viewpoints,
in the first step, for each piece of the second video information, each subject present in the video based on that second video information is extracted and the video information of the extracted subject is recorded on the recording medium, and
in the second step, a multi-view video that is generated based on the video information of the same subject present in each video based on the second video information, and that is composed of a three-dimensional model of the subject, is displayed as a search result of the search.

The video search method according to claim 7, wherein
the feature information of the subject includes video information of the feature part of the subject, and
in the second step, based on the feature information of the subject designated as the search target, the stored feature information is searched for the feature information corresponding to that subject by video pattern matching processing.

The video search method according to claim 7, wherein, in the first step, the first video information is recorded on the recording medium, and the first video information recorded on the recording medium is divided into video portions for each subject based on the stored feature information.

The video search method according to claim 11, wherein, in the first step, the video information of video portions in which no subject appears is deleted from the first video information recorded on the recording medium.