JP2019197973A

JP2019197973A - Image processing device, control method thereof, program, and image processing system

Info

Publication number: JP2019197973A
Application number: JP2018090105A
Authority: JP
Inventors: 智恵菊地; Chie Kikuchi
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2018-05-08
Filing date: 2018-05-08
Publication date: 2019-11-14

Abstract

PROBLEM TO BE SOLVED: To provide a technology that allows an image generating device to generate a high-quality virtual viewpoint image while each of the devices, which have the plurality of imaging means used for generating the virtual viewpoint image, communicates within the allowable range of its communication path.

SOLUTION: An image processing device that provides an image at one viewpoint among the images of a plurality of viewpoints used for generating a virtual viewpoint image includes a control section and a transmission section. The control section performs code amount control when the total data amount of the coded image data received from a first other device and the coded image data obtained by losslessly encoding an image captured by the device's own imaging part exceeds a preset threshold. When the total data amount is equal to or less than the threshold, the transmission section transmits the received coded image data and the coded image data encoded by the device itself to a second other device; when the total data amount exceeds the threshold, it transmits the coded image data obtained through the code amount control performed by the control section to the second other device.

SELECTED DRAWING: Figure 2

Description

The present invention relates to a technique for transferring captured images used to generate a virtual viewpoint image.

In recent years, attention has been drawn to techniques that install multiple cameras at different positions, capture synchronized images with them, and use the resulting multi-viewpoint images to generate virtual viewpoint content (a virtual viewpoint image) as seen from a virtual viewpoint. Such techniques can, for example, generate highlight scenes of a soccer or basketball game from viewpoints that a real camera cannot reach, giving the user a greater sense of presence than ordinary images.

Virtual viewpoint content based on multi-viewpoint images is generated and viewed by aggregating the images captured by the cameras in an image processing unit such as a server, performing processing such as three-dimensional model generation and rendering in that unit, and transferring the result to a user terminal.

Patent Document 1 describes connecting multiple cameras by optical fiber through control units, each paired with one camera, accumulating each camera's image frames in its control unit, and using the accumulated frames to output images that express continuous motion.

In a system such as that of Patent Document 1, which aggregates the images captured by multiple cameras on a server and generates virtual viewpoint content, the network transmission load increases with the number of cameras. Various methods therefore exist for reducing the total amount of image information transferred to the server. For example, Patent Document 2 describes reducing the amount of image information obtained by combining cameras with a high bit depth and cameras with a low bit depth. Patent Document 3 describes reducing the amount of image information obtained by combining high-resolution and low-resolution cameras.

Patent Document 1: U.S. Pat. No. 7,106,361
Patent Document 2: Japanese Patent No. 4578566
Patent Document 3: Japanese Patent No. 4578567

However, the methods described in Patent Documents 2 and 3 require very complex processing to generate virtual viewpoint content from images that differ in bit depth or resolution. Furthermore, when a person or similar subject is the target of the virtual viewpoint content, corresponding points must be found for the subject's fine details, and accurate corresponding points cannot be obtained from images that differ in bit depth or resolution. It is therefore desirable to transfer losslessly the regions, such as persons, from which the virtual viewpoint content is created.

In addition, in a system such as that of Patent Document 1, which aggregates the images captured by multiple cameras on a server and generates virtual viewpoint content, both the network transmission load and the computational load on the server increase with the number of cameras. For this reason, the objects used for free viewpoint synthesis are sometimes extracted and only the extracted image regions are transferred to the server. However, even though only part of the image is involved, losslessly encoding the virtual viewpoint content increases the amount of data transferred.

There is also demand for creating virtual viewpoint content in real time. Meeting it requires a small data transfer delay and, at the same time, lightweight decoding on the server side.

For these reasons, it is desirable to transmit virtual viewpoint content at high image quality, to limit the amount of data transferred, and to reduce the decoding load on the server.

The present invention has been made in view of the above problems. It aims to provide a technique that enables an image generating device to generate a high-quality virtual viewpoint image while each of the devices, which have the plurality of imaging means used for generating the virtual viewpoint image, communicates within the allowable range of its communication path.

To solve this problem, an image processing apparatus of the present invention has, for example, the following configuration. That is,
an image processing apparatus that provides an image at one viewpoint position among images at a plurality of viewpoint positions used for generating a virtual viewpoint image, comprising:
acquisition means for acquiring an image from imaging means;
encoding means for transforming the image acquired by the acquisition means and generating lossless encoded data;
receiving means capable of receiving one or more pieces of encoded image data from a first other device;
control means for performing code amount control when the total data amount of the encoded image data received by the receiving means and the encoded image data obtained by the encoding means exceeds a preset threshold; and
transmission means for transmitting, when the total data amount is equal to or less than the threshold, the encoded image data received by the receiving means and the encoded image data obtained by the encoding means to a second other device, and for transmitting, when the total data amount exceeds the threshold, the encoded image data obtained through the code amount control by the control means to the second other device,
wherein the control means includes:
specifying means for specifying, from among the encoded image data received by the receiving means and the encoded image data obtained by the encoding means, the encoded image data whose code amount is to be reduced, and for specifying the transform coefficients in the specified encoded image data whose code amount is to be reduced; and
reducing means for reducing the code amount of the specified transform coefficients of the encoded image data specified by the specifying means,
and wherein the specification by the specifying means and the reduction by the reducing means are repeated until the total data amount becomes equal to or less than the threshold.
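
The repeat-until-threshold behaviour of the control means can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class `EncodedImage`, the function `control_code_amount`, the per-group byte counts, and the selection policy (pick the largest image, then discard its last non-empty coefficient group) are all hypothetical and are not taken from the source, which states only that a target image and target transform coefficients are specified and reduced until the total is at or below the threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EncodedImage:
    source: str              # hypothetical label, e.g. "received" or "own"
    group_sizes: List[int]   # code amount (bytes) per transform-coefficient group

    @property
    def size(self) -> int:
        return sum(self.group_sizes)


def control_code_amount(images: List[EncodedImage], threshold: int) -> List[EncodedImage]:
    """Repeat 'specify, then reduce' until the total code amount fits the threshold."""
    # Work on copies so the caller's lossless data is left untouched.
    work = [EncodedImage(img.source, list(img.group_sizes)) for img in images]
    while sum(img.size for img in work) > threshold:
        reducible = [img for img in work if img.size > 0]
        if not reducible:
            break  # nothing left to reduce
        # Specify (assumed policy): the largest image, then its last non-empty group.
        target = max(reducible, key=lambda img: img.size)
        index = max(i for i, s in enumerate(target.group_sizes) if s > 0)
        # Reduce (assumed policy): discard that group's code entirely.
        target.group_sizes[index] = 0
    return work


received = EncodedImage("received", [100, 50, 30])
own = EncodedImage("own", [80, 40, 20])
result = control_code_amount([received, own], threshold=250)
```

When the total is already at or below the threshold the loop body never runs, so the data is forwarded unchanged, which corresponds to the first branch of the transmission means.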

According to the present invention, an image generating device can generate a high-quality virtual viewpoint image while each of the devices, which have the plurality of imaging means used for generating the virtual viewpoint image, communicates within the allowable range of its communication path.

A block diagram showing the configuration of an imaging system using multiple cameras.
A flowchart showing the transfer data creation processing in the first embodiment.
A flowchart showing the code amount control processing for transmission in the first embodiment.
A flowchart showing the code data reduction processing in the first embodiment.
A diagram for explaining the subbands generated by the wavelet transform.
A flowchart showing the code data reduction processing in a modification of the first embodiment.
A diagram showing the difference in code data deletion results between the first embodiment and its modification.
A flowchart showing the code data reduction processing in the second embodiment.
A flowchart showing the code data reduction processing in the third embodiment.
A flowchart showing the transfer data creation processing in the fourth embodiment.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. The following embodiments do not limit the present invention, and not all combinations of the features described in the embodiments are essential to the solution provided by the invention. Identical components are denoted by the same reference numerals.

[First Embodiment]
FIG. 1(a) shows the sensor systems that make up the image processing system of the embodiment installed so as to surround a stadium (a soccer field in the illustrated case). FIG. 1(b) is a block diagram of the image processing system 100 of the embodiment. The image processing system 100 includes sensor systems 101A to 101L, an image computing server 200 that functions as a virtual viewpoint image generation device, a switching hub 150, a network 300, and an end user terminal 400.

The end user terminal 400 is an information processing apparatus such as a personal computer. It sets the position and direction of the virtual viewpoint and transmits that information to the image computing server 200 via the network 300. In accordance with the information, the image computing server 200 generates a virtual viewpoint image for the designated virtual viewpoint position and direction from the image data supplied by the sensor systems 101A to 101L, and encodes it. The image computing server 200 then transmits the encoded image data to the end user terminal 400, which decodes and displays it. The communication and processing between the image computing server 200 and the end user terminal 400 are not the focus of the present invention.

The following therefore describes how the images obtained by the 12 sensor systems 101A to 101L are transmitted to the image computing server 200.

The sensor systems 101A to 101L in the image processing system 100 of this embodiment are daisy-chained (connected in series) by transmission paths. The type of transmission path is not restricted; one example is the Ethernet cable used in networks. The sensor system 101A has a camera 111A and a camera adapter 120A, which encodes the images captured by the camera 111A and transfers the resulting encoded image data to the communication path 130A. The other sensor systems 101B to 101L have the same configuration. That is, the sensor system 101B located downstream of the sensor system 101A has a camera 111B and a camera adapter 120B, and the sensor system 101L located at the end of the chain likewise has a camera 111L and a camera adapter 120L.

In this embodiment, when any one of the 12 sensor systems 101A to 101L is meant without distinction, it is referred to simply as the sensor system 101, and its internal components are referred to as the camera 111 and the camera adapter 120. The embodiment describes 12 sensor systems, but this is only an example and the number is not limited to 12. Unless otherwise noted, the term "image" in this embodiment covers both moving images and still images; that is, the image processing system 100 of this embodiment can process both. This embodiment describes an example in which the virtual viewpoint content provided by the image processing system 100 is a virtual viewpoint image, but the content is not limited to this. For example, a microphone may be included in the sensor system 101 so that the virtual viewpoint content contains both images and sound. Since sound is not the focus of the present invention, however, it is not described in this embodiment.

As shown in FIG. 1(a), the image processing system 100 of the embodiment has a plurality of sensor systems 101 for capturing a subject from a plurality of directions. The sensor systems 101 are connected to one another in a daisy chain. It is noted here that, as image data volumes grow with higher resolutions such as 4K and 8K and with higher frame rates, this connection form reduces the number of connection cables and the labor of wiring.

The connection form is not limited to this. A star network configuration may be used instead, in which each of the sensor systems 101A to 101L is connected to the switching hub 150 and data is exchanged between the sensor systems 101 via the switching hub 150.

FIG. 1(a) shows a configuration in which all of the sensor systems 101A to 101L are cascaded into a single daisy chain, but the configuration is not limited to this. For example, the sensor systems 101 may be divided into several groups and daisy-chained within each group. The camera adapter 120 at the end of each group may then be connected to the switching hub and input its images to the image computing server 200.

The camera adapter 120 in a sensor system 101 receives and holds the various setting information sent from its upstream device (the image computing server 200 for the camera adapter 120A, the sensor system 101A for the camera adapter 120B) and performs the corresponding setting processing.

The camera adapter 120 also separates the subject that is the target of the virtual viewpoint image from the image captured by the camera 111. It then encodes the separated subject image and transfers the resulting encoded image data toward the downstream device.

For example, the camera adapter 120A separates the subject that is the target of the virtual viewpoint image from the image captured by the camera 111A and encodes the separated image. It then transfers the encoded image data to the downstream sensor system 101B via the transmission path 130A.

Like the camera adapter 120A of the sensor system 101A, the camera adapter 120B of the sensor system 101B separates the subject from the image captured by the camera 111B and encodes it. The camera adapter 120B then transfers this encoded image data, together with the encoded image data received from the sensor system 101A, to the downstream sensor system 101C via the communication path 130B. The same applies to the sensor system 101C and those after it.

In other words, as shown in FIG. 1(a), only the encoded image data A created by the sensor system 101A is transferred between the sensor systems 101A and 101B. Between the sensor systems 101B and 101C, the encoded image data A and B created by the sensor systems 101A and 101B are transferred. Likewise, between the sensor systems 101C and 101D, the three pieces of encoded image data A, B, and C created by the sensor systems 101A to 101C are transferred. Consequently, the 11 pieces of encoded image data A, B, ..., K created by the sensor systems 101A to 101K arrive at the sensor system 101L at the end of the daisy chain. The sensor system 101L adds the encoded image data L that it creates itself to the 11 pieces A to K transferred from the sensor system 101K, and transfers the resulting 12 pieces of encoded image data A, B, ..., L to the image computing server 200.
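
The way the payload grows along the daisy chain can be checked with a small Python simulation. This is a sketch for illustration only: the function name `simulate_chain` and the use of single letters as stand-ins for the encoded image data are assumptions, and no code amount control is applied, so every adapter forwards everything it received plus its own data.

```python
import string
from typing import Dict, List, Tuple


def simulate_chain(num_systems: int) -> Dict[Tuple[str, str], List[str]]:
    """Return, for each link (sender, receiver), the encoded image data it carries."""
    names = string.ascii_uppercase[:num_systems]
    links: Dict[Tuple[str, str], List[str]] = {}
    carried: List[str] = []
    for index, name in enumerate(names):
        # Each adapter appends its own encoded image to what arrived from upstream.
        carried = carried + [name]
        receiver = names[index + 1] if index + 1 < num_systems else "server"
        links[(name, receiver)] = list(carried)
    return links


links = simulate_chain(12)
for (sender, receiver), payload in links.items():
    print(f"{sender} -> {receiver}: {len(payload)} item(s)")
```

The link closest to the server carries the most data, which is why a per-link limit on the total data amount matters most for the downstream adapters.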

In this embodiment the camera 111 and the camera adapter 120 are separate units, but they may be integrated in a single housing.

Next, the configuration and operation of the image computing server 200 are described. In hardware terms, the image computing server 200 of this embodiment is an information processing apparatus of the kind typified by a personal computer, consisting of a CPU, ROM, RAM, a network I/F, and a storage device such as an HDD. When the power is turned on, the CPU loads the OS (operating system) from the HDD into the RAM and executes it, so that the apparatus functions as an information processing apparatus. The CPU then loads a server program from the HDD into the RAM and executes it under the OS, so that the apparatus functions as the image computing server 200.

The image computing server 200 processes the encoded image data A to L acquired from the sensor system 101L. For this purpose, it has a front-end server 210, a database 220 (hereinafter also referred to as the DB), a back-end server 230, and a time server 240. In the embodiment, a single apparatus functions as the front-end server 210, the database 220, the back-end server 230, and the time server 240 by executing software, but each of these functions may instead be implemented by an independent apparatus.

The time server 240 distributes the time and a synchronization signal to the sensor systems 101A to 101L via the switching hub 150. On receiving them, the camera adapters 120A to 120L genlock the cameras 111A to 111L to the time and synchronization signal, synchronizing the image frames that are captured. In other words, the time server 240 synchronizes the capture timing of the cameras 111. The image processing system 100 can therefore generate a virtual viewpoint image from multiple images captured at the same timing, which suppresses the loss of virtual viewpoint image quality that would result from differences in capture timing.

The front-end server 210 performs setting processing for the sensor systems 101 and writes the images acquired from the sensor system 101L into the database 220 according to the camera identifier, the data type, and the frame number. The setting processing includes notifying each of the sensor systems 101A to 101L of its position counted from the most upstream sensor system, setting the counter C described later, and setting the quantization parameter set for each decomposition level and each subband used in re-encoding.

The back-end server 230 reads the corresponding images from the database 220 based on the position and direction of the virtual viewpoint designated by the end user terminal 400, and performs rendering to generate a virtual viewpoint image. It then encodes the generated virtual viewpoint image and transmits the encoded image data to the end user terminal 400.

In the embodiment, the position and direction of the virtual viewpoint are set at the end user terminal 400, but the image computing server 200 may take over this function. In that case, the image computing server 200 is provided with a user interface (a display device and an input device for receiving instructions from the user) and executes an application program for setting the position and direction of the virtual viewpoint. The synchronization scheme of the sensor systems and the configuration of the image computing server 200 are not limited to those described here and can take various forms, but since they are not the focus of the present case, a detailed description is omitted.

In this way, in the image processing system 100, the back-end server 230 generates a virtual viewpoint image from image data captured by the cameras 111, which photograph the subject from a plurality of directions. The image processing system 100 of this embodiment is not limited to the logical configuration described above and may be realized with physically independent components.

Next, the image processing that the camera adapter 120 applies to the separated subject image is described. Various methods can be used to separate the subject, such as detecting objects learned in advance, or storing an image without the subject as a background image and detecting regions that differ from it as moving objects. Since this is not the focus of the present case, a detailed description is omitted.

The camera adapter 120 losslessly compresses the separated subject image and transfers it over the transmission path 130 to the adjacent downstream camera adapter. In doing so, the camera adapter 120 selects one of two lossless compression methods prepared in advance and uses it to encode the subject image. The first method losslessly compresses the subject image pixel by pixel without applying a DWT (discrete wavelet transform); it is hereinafter called the first lossless compression. The second method applies the DWT to the entire subject image a preset number of times and losslessly compresses the wavelet transform coefficients (hereinafter, DWT coefficients); it is hereinafter called the second lossless compression.

The first lossless compression applies, for example, Golomb coding to each pixel in raster scan order. Because it performs no DWT (the number of DWT passes can be regarded as zero), its encoding and decoding loads are very small. However, the first lossless compression is poorly suited to code amount control. In this embodiment, therefore, encoded image data generated by the first lossless compression is not subject to code amount control.
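
As a rough illustration of per-pixel Golomb coding in raster scan order, the following Python sketch encodes one row of pixels. Several details are assumptions not stated in the source: it uses the Golomb-Rice special case (a power-of-two divisor `2**k`), predicts each pixel from the previous pixel in the row, maps signed residuals to non-negative integers with a zigzag mapping, and returns the bits as a text string for readability. The names `zigzag`, `rice_encode`, and `encode_row` are hypothetical.

```python
from typing import List


def zigzag(residual: int) -> int:
    """Map a signed residual to a non-negative integer: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * residual if residual >= 0 else -2 * residual - 1


def rice_encode(value: int, k: int) -> str:
    """Golomb-Rice code: unary quotient, a terminating '0', then a k-bit remainder."""
    quotient = value >> k
    remainder = value & ((1 << k) - 1)
    suffix = format(remainder, "b").zfill(k) if k > 0 else ""
    return "1" * quotient + "0" + suffix


def encode_row(pixels: List[int], k: int) -> str:
    """Encode one raster-scan row, predicting each pixel from the previous one."""
    bits = []
    previous = 0  # assumed predictor for the first pixel
    for pixel in pixels:
        bits.append(rice_encode(zigzag(pixel - previous), k))
        previous = pixel
    return "".join(bits)


bitstream = encode_row([10, 10, 11, 9], k=2)
```

Small residuals yield short codewords, so smooth regions compress well, and no transform is needed before coding, which is consistent with the low processing load described above.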

The second lossless compression, in contrast, performs the DWT and therefore has a larger processing load than the first. However, encoded image data obtained by the second lossless compression can easily be converted into lossy encoded image data by adjusting the quantization parameters. Encoded image data obtained by the second lossless compression is therefore subject to code amount control. The DWT is required in the initial encoding, but converting lossless encoded image data that has already been generated into lossy encoded data only requires restoring the DWT coefficients and then performing quantization and entropy coding (for example, Golomb coding). In other words, converting lossless encoded data generated by the second lossless compression into lossy encoded data has a small processing load because the DWT itself does not have to be repeated.
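
The conversion from lossless to lossy data can be sketched in Python as a re-quantization of coefficients that have already been restored by entropy decoding. The quantizer shown, a uniform quantizer that truncates magnitudes toward zero, is an assumption chosen for illustration; the source says only that the quantization parameters are adjusted. The function names `quantize` and `dequantize` are hypothetical, and entropy decoding and re-encoding are left out.

```python
from typing import List


def quantize(coefficients: List[int], step: int) -> List[int]:
    """Divide each coefficient magnitude by step, truncating toward zero."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [(abs(c) // step) * (1 if c >= 0 else -1) for c in coefficients]


def dequantize(indices: List[int], step: int) -> List[int]:
    """Approximate reconstruction performed on the decoder side."""
    return [q * step for q in indices]


# DWT coefficients of one subband, as restored from the lossless code (example values).
subband = [7, -7, 0, 3, 12, -1]

lossless = quantize(subband, step=1)   # step 1 leaves the coefficients unchanged
lossy = quantize(subband, step=4)      # larger step, smaller magnitudes, fewer bits
restored = dequantize(lossy, step=4)
```

No wavelet transform appears anywhere in this path: the coefficients are reused as they are, which is the source of the low processing load noted above.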

Next, the transfer data processing that each of the camera adapters 120A to 120L performs using the two compression methods is described with reference to the flowchart of FIG. 2. In the following, one of the camera adapters 120A to 120L is defined as the camera adapter 120 of interest, and the processing is described as being performed by it. The other camera adapters should be understood to perform the same processing.

ステップＳ２０１にて、着目カメラアダプタ１２０は、自身が有する内部メモリから、初期カウンタＴ、および、カウンタ間隔Ｇを取得する。これら初期カウンタＴ及びカウンタ間隔Ｇは、本画像処理システム１００が稼動する際に画像コンピューティングサーバ２００から設定されるものである。カウンタ間隔Ｇは、全センサシステム１０１に共通な値であり、実施形態では“３”である。また、初期カウンタ値Ｔは、カメラアダプタ１２０Ａ、１２０Ｂ，１２０Ｃ、１２０Ｄ…の順番に、０、１、２、０、１、２…と、上流から下流に向かって０、１、２が繰り返されるように設定される。例えば、カメラアダプタ１２０Ｃには、Ｔ＝２、Ｇ＝３が設定されることになる。また、実施形態では、センサシステムは１２個存在する。故に、初期カウンタＴが０として設定されるカメラアダプタの個数、初期カウンタＴが１として設定されるカメラアダプタの個数、初期カウンタＴが２として設定されるカメラアダプタの個数はそれぞれ４つとなる。 In step S201, the camera adapter 120 of interest acquires the initial counter T and the counter interval G from its own internal memory. The initial counter T and the counter interval G are set by the image computing server 200 when the image processing system 100 starts operating. The counter interval G is common to all the sensor systems 101 and is “3” in the embodiment. The initial counter T is assigned to the camera adapters 120A, 120B, 120C, 120D, ... in that order as 0, 1, 2, 0, 1, 2, ..., so that the values 0, 1, 2 repeat from upstream to downstream. For example, T = 2 and G = 3 are set in the camera adapter 120C. In the embodiment there are 12 sensor systems, so four camera adapters have the initial counter T set to 0, four have it set to 1, and four have it set to 2.

ステップＳ２０２にて、着目カメラアダプタ１２０は、内部メモリに予め確保された変数としてのカウンタＣに初期カウンタＴを代入することで、カウンタＣを初期化する。この結果、カメラアダプタ１２０Ａ、１２０Ｂ、…それぞれのカウンタＣは、それぞれの初期カウンタＴの値と同じ、０、１、２、０、１、２…が設定されることになる。 In step S202, the camera adapter 120 of interest initializes the counter C, a variable reserved in advance in the internal memory, by assigning the initial counter T to it. As a result, the counters C of the camera adapters 120A, 120B, ... are set to 0, 1, 2, 0, 1, 2, ..., the same values as their respective initial counters T.
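
The assignment of initial counters in steps S201 and S202 can be sketched as follows. This is a minimal illustration, assuming the adapters are indexed 0 to 11 from upstream to downstream; the function name `initial_counters` is hypothetical and does not appear in the source.

```python
def initial_counters(num_adapters, interval):
    """Return the initial counter T of each adapter, upstream to downstream.

    Assumption: adapter 120A has index 0, 120B index 1, and so on.
    """
    return [index % interval for index in range(num_adapters)]


G = 3                          # counter interval of the embodiment
T = initial_counters(12, G)    # 12 sensor systems in the embodiment
print(T)                       # [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
print(T.count(0), T.count(1), T.count(2))  # 4 4 4
```

With these values, the third adapter (120C) receives T = 2, matching the example in the text.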

ステップＳ２０３にて、着目カメラアダプタ１２０は、撮影が開始されるタイミングを待つ。このタイミングは、既に説明したように、タイムサーバ２４０から設定に従ったものとなる。 In step S203, the camera adapter 120 of interest waits for the timing at which shooting is started. This timing is according to the setting from the time server 240, as already described.

撮影が開始されると、着目カメラアダプタ１２０は処理をステップＳ２０４へ進める。このステップＳ２０４において、着目カメラアダプタ１２０は、カメラ１１１で撮像して得られた画像から分離した被写体領域の画像を、圧縮対象の着目フレーム画像として取得する。すなわち、仮想視点コンテンツの対象を撮影画像から検出し、その検出領域のみが圧縮対象の着目フレーム画像として取得される。この場合、被写体の領域は１領域とは限らないため、複数の領域を圧縮対象のフレーム画像として取得されることもある。ただし、本実施形態では説明を簡略化するために、１つの領域のみが検出され、圧縮対象のフレーム画像とするものとして説明する。 When shooting is started, the camera adapter 120 of interest advances the process to step S204. In step S204, the camera adapter 120 of interest acquires an image of the subject area separated from the image obtained by imaging with the camera 111 as a frame image of interest to be compressed. That is, the target of the virtual viewpoint content is detected from the captured image, and only the detection area is acquired as the target frame image to be compressed. In this case, since the area of the subject is not limited to one area, a plurality of areas may be acquired as frame images to be compressed. However, in the present embodiment, in order to simplify the description, it is assumed that only one region is detected and used as a frame image to be compressed.

ステップＳ２０５にて、着目カメラアダプタ１２０は、カウンタＣが“０”であるか否かを判定することで、着目フレーム画像を第１のロスレス圧縮、第２のロスレス圧縮のいずれに従って符号化するかを判定する。実施形態では、カウンタＣが“０”の場合、着目カメラアダプタ１２０は着目フレーム画像を第１のロスレス圧縮（符号量制御不可の符号化法）で符号化する。それ故、カウンタＣが“０”の場合、着目カメラアダプタ１２０は処理をステップＳ２０６に進める。 In step S205, the camera adapter 120 of interest determines whether the counter C is “0”, and thereby decides whether the frame image of interest is to be encoded by the first lossless compression or by the second lossless compression. In the embodiment, when the counter C is “0”, the camera adapter 120 of interest encodes the frame image of interest by the first lossless compression (an encoding method that does not allow code amount control). Therefore, when the counter C is “0”, the camera adapter 120 of interest advances the process to step S206.

また、カウンタＣが“０”以外の場合（“１”又は“２”のいずれかである場合）、着目カメラアダプタは着目フレーム画像を第２のロスレス圧縮（符号量制御可の符号化法）で符号化する。それ故、カウンタＣが“０”以外の場合、着目カメラアダプタ１２０は処理をステップＳ２０８に進める。 When the counter C is other than “0” (that is, “1” or “2”), the camera adapter of interest encodes the frame image of interest by the second lossless compression (an encoding method that allows code amount control). Therefore, when the counter C is other than “0”, the camera adapter 120 of interest advances the process to step S208.

ステップＳ２０６にて、着目カメラアダプタ１２０は、着目フレーム画像を第１のロスレス圧縮に従って符号化する。実施形態では、ラスタースキャン順に、着目画素をゴロム符号化を行うものとしている。なお、画素単位に単純に可逆符号化できれば良いので、直前の画素値との差分を求め、その差分に所定の可逆符号化（ゴロム符号）を適用して符号化する等の他の手法でも良い。 In step S206, the camera adapter 120 of interest encodes the frame image of interest according to the first lossless compression. In the embodiment, Golomb encoding is performed on the target pixel in the raster scan order. Note that since it is only necessary to be able to simply perform lossless encoding for each pixel, other methods such as obtaining a difference from the immediately preceding pixel value and applying predetermined lossless encoding (Golomb code) to the difference may be used. .
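
The alternative mentioned for step S206 (taking the difference from the immediately preceding pixel and applying a Golomb code to it) can be sketched as follows. This is a minimal illustration and not the encoder of the embodiment. It assumes the Golomb-Rice special case with a fixed parameter `k`, a zigzag mapping of signed differences to non-negative integers, and a bit string as output; none of these details is specified in the source.

```python
def zigzag(value):
    """Map a signed difference to a non-negative integer (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...)."""
    return 2 * value if value >= 0 else -2 * value - 1


def unzigzag(code):
    """Inverse of zigzag()."""
    return code // 2 if code % 2 == 0 else -(code + 1) // 2


def rice_encode(values, k):
    """Golomb-Rice code: unary quotient, a '0' terminator, then k remainder bits."""
    bits = []
    for value in values:
        quotient, remainder = value >> k, value & ((1 << k) - 1)
        bits.append("1" * quotient + "0")
        if k:
            bits.append(format(remainder, f"0{k}b"))
    return "".join(bits)


def rice_decode(bits, count, k):
    """Decode `count` non-negative integers from a Golomb-Rice bit string."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1                                   # skip the '0' terminator
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append((quotient << k) | remainder)
    return values


def encode_row(pixels, k=2):
    """Losslessly encode one raster row as differences from the previous pixel."""
    previous, residuals = 0, []
    for pixel in pixels:
        residuals.append(zigzag(pixel - previous))
        previous = pixel
    return rice_encode(residuals, k)


def decode_row(bits, count, k=2):
    """Recover the original pixel values from encode_row() output."""
    previous, pixels = 0, []
    for code in rice_decode(bits, count, k):
        previous += unzigzag(code)
        pixels.append(previous)
    return pixels


row = [10, 12, 11, 11, 200, 0]
assert decode_row(encode_row(row), len(row)) == row   # the round trip is lossless
```

Because every pixel is coded independently of any transform, this kind of stream offers no simple way to reduce its size afterwards, which is consistent with the text treating the first lossless compression as not subject to code amount control.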

また、ステップＳ２０８に処理が進んだ場合、着目カメラアダプタ１２０は、着目フレーム画像を第２のロスレス圧縮に従って符号化する。すなわち、着目カメラアダプタ１２０は、着目フレーム画像に対し、予め設定された回数分ＤＷＴ変換（ウェーブレット変換）を行い、ＤＷＴ係数を得る。そして、着目カメラアダプタ１２０はそのＤＷＴ係数を、コンポーネン毎にロスレス符号化する。実施形態におけるＤＷＴ変換する回数は３回とする。またＤＷＴ係数のロスレス符号化はゴロム符号化を用いるものとするが、他の符号化を用いても良い。 When the process proceeds to step S208, the camera adapter 120 of interest encodes the frame image of interest according to the second lossless compression. That is, the camera adapter 120 of interest performs DWT transformation (wavelet transformation) on the subject frame image for a preset number of times to obtain a DWT coefficient. Then, the camera adapter 120 of interest performs lossless encoding of the DWT coefficient for each component. The number of DWT conversions in the embodiment is three. In addition, lossless coding of DWT coefficients uses Golomb coding, but other coding may be used.

これまでの説明から明らかなように、実施形態の画像処理システムには１２個のカメラアダプタ１２０が存在する。よって、そのうちの４つがステップＳ２０６の第１のロスレス圧縮処理を行い、残りの８つがステップＳ２０８の第２のロスレス圧縮処理を行うことになる。 As is apparent from the above description, there are twelve camera adapters 120 in the image processing system of the embodiment. Therefore, four of them perform the first lossless compression process of step S206, and the remaining eight perform the second lossless compression process of step S208.

ステップＳ２０９にて、着目カメラアダプタ１２０は、カウンタＣに“１”を加算し、その加算した値をカウンタ間隔Ｇで除算した際の余りで、カウンタＣの値を更新する。つまり、カウンタＣの値は、０〜Ｇ−１の間を循環するように更新される。カメラアダプタ１２０Ａは、初期カウンタＴは“０”であるので、１フレームの符号化画像データを生成するたびに、０、１、２、０、１、２…の順にカウンタＣが更新されることになる。また、カメラアダプタ１２０Ｂは、初期カウンタＴは“１”であるので、１フレームの符号化画像データを生成するたびに、１、２、０、１、２、０…の順にカウンタＣが更新されることになる。 In step S209, the camera adapter 120 of interest adds “1” to the counter C and updates the counter C with the remainder obtained by dividing the sum by the counter interval G. That is, the value of the counter C cycles through 0 to G-1. Since the initial counter T of the camera adapter 120A is “0”, its counter C is updated in the order 0, 1, 2, 0, 1, 2, ... each time one frame of encoded image data is generated. Likewise, since the initial counter T of the camera adapter 120B is “1”, its counter C is updated in the order 1, 2, 0, 1, 2, 0, ... each time one frame of encoded image data is generated.

上記を、センサシステム１０１Ａ〜１０１Ｌの全体について表現すれば、次の通りである。 The above is expressed for the entire sensor systems 101A to 101L as follows.

センサシステム１０１Ａ〜１０１ＬそれぞれのカウンタＣは、撮影を開始した最初のフレームでは、０、１、２、０、１、２、…、２であったが、その次のフレームでは、１、２、０、１、２、０、…、０となる。そして、その次のフレーム撮影時には、２、０、１、２、０、…、１となる。なお、このカウンタＣの値は、各符号データのデータ番号として、後の符号量制御の際に参照される。 The counters C of the sensor systems 101A to 101L are 0, 1, 2, 0, 1, 2, ..., 2 in the first frame after shooting starts, and become 1, 2, 0, 1, 2, 0, ..., 0 in the next frame. In the frame after that, they become 2, 0, 1, 2, 0, ..., 1. The value of the counter C is referred to later, during code amount control, as the data number of each piece of code data.
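
The selection in step S205 and the counter update in step S209 can be sketched as follows. This is a minimal illustration; the function names and the string labels for the two compression methods are hypothetical.

```python
def select_method(counter):
    """Step S205: counter 0 selects the first lossless compression, anything else the second."""
    return "first_lossless" if counter == 0 else "second_lossless"


def next_counter(counter, interval):
    """Step S209: add 1 and keep the remainder of division by the counter interval G."""
    return (counter + 1) % interval


def counters_for_frame(initial, interval, frame):
    """Counter C of every adapter when frame number `frame` (0-based) is encoded."""
    return [(t + frame) % interval for t in initial]


G = 3
T = [0, 1, 2] * 4                       # initial counters of adapters 120A to 120L
for frame in range(3):
    counters = counters_for_frame(T, G, frame)
    methods = [select_method(c) for c in counters]
    # In every frame, 4 adapters use the first method and 8 use the second.
    print(frame, counters, methods.count("first_lossless"))
```

The adapter that produces uncontrolled lossless data therefore rotates from frame to frame, while the 4-to-8 split between the two methods stays constant.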

ステップＳ２１０にて、着目カメラアダプタ１２０は、下流に転送することになる符号データの総量に基づき、符号量制御処理を実施する。この符号量制御処理の具体例については図３を用いて後述する。 In step S210, the camera adapter 120 of interest performs code amount control processing based on the total amount of code data to be transferred downstream. A specific example of the code amount control process will be described later with reference to FIG.

ステップＳ２１１にて、着目カメラアダプタ１２０は、ステップＳ２１０による符号量制御によって得たデータを、下流に転送するため、内部の送信バッファに出力し、送信する。そして、ステップＳ２１２にて、着目カメラアダプタ１２０は、画像コンピューティングサーバ２００から、撮影終了を示す指示を受信したか否かを判定し、否の場合には処理をステップＳ２０４に戻す。また、画像コンピューティングサーバ２００から、撮影終了を示す指示を受信した場合、カメラアダプタ１２０は本処理を終了する。 In step S211, the camera adapter 120 of interest outputs and transmits the data obtained by the code amount control in step S210 to the internal transmission buffer in order to transfer the data downstream. In step S212, the camera adapter 120 of interest determines whether or not an instruction indicating the end of shooting has been received from the image computing server 200. If not, the process returns to step S204. When receiving an instruction indicating the end of shooting from the image computing server 200, the camera adapter 120 ends this process.

次に、ステップＳ２１０の符号量制御の処理について、図３の処理フローを用いて説明する。 Next, the code amount control process in step S210 will be described with reference to the process flow of FIG.

本処理を分かりやすくするため、カメラアダプタ１２０Ａ〜１２０Ｌに、上流から下流になる順に番号０〜１１を割り振る。そして、番号ｉが示すカメラアダプタ１２０で生成される符号化データをＤ（ｉ）と表し、その符号量をＳ（ｉ）と表す。そして、図３の処理を行う着目カメラアダプタは、便宜的に、第ｎ番目（ｎ＝０、１、…、１１のいずれか）であるとする。すると、着目カメラアダプタ１２０は、上流のカメラアダプタから符号化画像データＤ（０）、Ｄ（１）、…、Ｄ（ｎ−１）を受信することになる。そして、着目カメラアダプタ１２０は、自身が接続されたカメラ１１１の撮像した画像から分離した被写体画像の符号化画像データＤ（ｎ）を生成することになる。この生成する符号化データＤ（ｎ）は、図２のステップＳ２０６、又は、Ｓ２０８で生成されたものである。そして、着目カメラアダプタ１２０は、符号化画像データＤ（０）、…、Ｄ（ｎ）を下流に転送することになる。かかる点を踏まえ、着目カメラアダプタ１２０の符号量制御処理を、図３のフローチャートに従って説明する。 In order to make this processing easy to understand, numbers 0 to 11 are assigned to the camera adapters 120A to 120L in order from upstream to downstream. The encoded data generated by the camera adapter 120 indicated by the number i is represented as D (i), and the code amount is represented as S (i). The camera adapter of interest that performs the processing of FIG. 3 is assumed to be the nth (n = 0, 1,..., 11) for convenience. Then, the camera adapter 120 of interest receives the encoded image data D (0), D (1),..., D (n−1) from the upstream camera adapter. Then, the camera adapter 120 of interest generates the encoded image data D (n) of the subject image separated from the image captured by the camera 111 to which the camera adapter 120 is connected. The generated encoded data D (n) is generated in step S206 or S208 in FIG. Then, the camera adapter 120 of interest transfers the encoded image data D (0),..., D (n) downstream. Based on this point, the code amount control processing of the camera adapter 120 of interest will be described according to the flowchart of FIG.

ステップＳ３０１にて、着目カメラアダプタ１２０は、上流から伝送路１３０を通じて符号画像データ群Ｇｃを受信し、その符号化画像データ群のデータサイズＧｓを求める。Ｇｃ，Ｇｓは次のように表せる。 In step S301, the camera adapter 120 of interest receives the encoded image data group Gc from upstream through the transmission path 130, and obtains the data size Gs of that encoded image data group. Gc and Gs can be expressed as follows.
Gc = {D(0), ..., D(n-1)}
Gs = S(0) + ... + S(n-1)

ステップＳ３０２にて、着目カメラアダプタ１２０は、自身が生成した符号化データＤ（ｎ）（ステップＳ２０６またはＳ２０８で生成した符号データ）と、その符号量Ｓ（ｎ）を取得する。 In step S302, the camera adapter 120 of interest acquires the encoded data D (n) generated by itself (code data generated in step S206 or S208) and the code amount S (n).

そして、ステップＳ３０３にて、着目カメラアダプタ１２０は、下流に転送することになる符号化画像データの総データ量TotalSizeを算出する。総データ量TotalSizeは次式で得られる。 In step S303, the camera adapter 120 of interest calculates the total data amount TotalSize of the encoded image data to be transferred downstream. The total data amount TotalSize is obtained by the following equation.
TotalSize = Gs + S(n)

ステップＳ３０４にて、着目カメラアダプタ１２０は、内部メモリより、転送可能な上限サイズを示す閾値Ｍを取得する。このＭは、本画像処理システムにおける画像コンピューティングサーバ２００から設定されるものであり、センサシステム１０１Ａ〜１０１Ｌを接続する伝送帯域、もしくは本システムがその伝送帯域で利用許可された帯域を示す値である。 In step S304, the camera adapter 120 of interest acquires, from the internal memory, the threshold M indicating the upper limit of the size that can be transferred. M is set by the image computing server 200 of the image processing system, and is a value indicating either the transmission band connecting the sensor systems 101A to 101L or the portion of that transmission band that this system is permitted to use.

ステップＳ３０５にて、着目カメラアダプタ１２０は、ステップＳ３０４で取得した閾値Ｍと、ステップＳ３０３で算出した送信すべき符号データの総データ量TotalSizeを比較する。 In step S305, the camera adapter 120 of interest compares the threshold value M acquired in step S304 with the total data amount TotalSize of the code data to be transmitted calculated in step S303.

Ｍ≧TotalSizeの関係にある場合、着目カメラアダプタ１２０は、符号量制御無しに、上流から受信した符号化データＧｃ、及び、自身が生成した可逆符号化データＤ（ｎ）をそのまま下流に転送しても構わないと判断し、本処理を終える。 When M ≥ TotalSize, the camera adapter 120 of interest determines that the encoded data Gc received from upstream and the lossless encoded data D(n) generated by itself may be transferred downstream as they are, without code amount control, and ends this process.

一方、Ｍ＜TotalSizeの関係にある場合、着目カメラアダプタは、処理をステップＳ３０６に進め、TotalSizeをＭ以下とするため、符号化画像データＤ（０）、…、Ｄ（ｎ）の符号量削減処理を行う。 On the other hand, when M < TotalSize, the camera adapter of interest advances the process to step S306 and performs code amount reduction processing on the encoded image data D(0), ..., D(n) so that TotalSize becomes M or less.
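
Steps S301 to S305 amount to a sum and a comparison, which can be sketched as follows. This is a minimal illustration; the function name and the use of plain integers for code amounts are assumptions.

```python
def check_total_size(upstream_sizes, own_size, threshold):
    """Return (TotalSize, needs_reduction) for one adapter.

    upstream_sizes: code amounts S(0), ..., S(n-1) received from upstream (step S301)
    own_size:       code amount S(n) of the adapter's own data (step S302)
    threshold:      upper limit M (step S304)
    """
    gs = sum(upstream_sizes)            # Gs = S(0) + ... + S(n-1)
    total_size = gs + own_size          # TotalSize = Gs + S(n)   (step S303)
    return total_size, total_size > threshold   # M < TotalSize -> go to step S306


print(check_total_size([100, 200], 300, 600))   # (600, False): forward as is
print(check_total_size([100, 200], 301, 600))   # (601, True): reduce the code amount
```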

ステップＳ３０６の処理の説明の前に、第２のロスレス圧縮におけるＤＷＴで生成されるサブバンドデータについて説明する。一般に、ＤＷＴをｍ回行った場合に得られるサブバンドの個数は、｛３×ｍ＋１｝になる。実施形態では、ＤＷＴを３回行うものとしているので、得られるサブバンドは図５のように１０個となる。 Before describing the processing in step S306, subband data generated by DWT in the second lossless compression will be described. In general, the number of subbands obtained when DWT is performed m times is {3 × m + 1}. In the embodiment, since DWT is performed three times, ten subbands are obtained as shown in FIG.

本実施形態では、低周波サブバンドＬＬから順に０、１、２、…、９の番号を付与して説明する。また、最小の解像度の階層番号を０とし、解像度が大きくなるにつれて階層番号が１増えるような階層番号を付与する。 In the present embodiment, numbers 0, 1, 2,..., 9 are assigned in order from the low frequency subband LL. Further, the hierarchy number with the lowest resolution is set to 0, and a hierarchy number is assigned so that the hierarchy number increases by 1 as the resolution increases.

したがって、ＤＷＴの実行回数＝１の時には、階層番号は最小解像度から０、１の２階層となり、階層番号０（最小解像度）のサブバンド０（ＬＬ）と、階層番号１のサブバンド１（ＨＬ）、２（ＬＨ）、３（ＨＨ）が得られることになる。 Therefore, when the DWT is executed once, there are two layers, numbered 0 and 1 from the minimum resolution: subband 0 (LL) in layer 0 (the minimum resolution), and subbands 1 (HL), 2 (LH), and 3 (HH) in layer 1.

そして、ＤＷＴの実行回数＝３の場合、サブバンドの階層は４階層になる。故に、階層番号は最小解像度から０、１、２、３となる。つまり、ＤＷＴの回数＝最大解像度の階層番号となる。 When the number of executions of DWT = 3, the subband hierarchy is four. Therefore, the hierarchy number is 0, 1, 2, 3 from the minimum resolution. That is, the number of DWTs = the layer number of the maximum resolution.

また、解像度を１サイズ大きくするのに必要になるサブバンドをまとめて階層と呼ぶ。階層番号＝０はサブバンド０（ＬＬ）の１つのみから構成されるが、階層番号が１以上は高周波サブバンド｛ＨＬ，ＬＨ，ＨＨ｝から構成される。すなわち、階層番号＝１は、サブバンド１，２，３から、階層番号＝２はサブバンド４，５，６から構成される。よって、階層番号＝Ａは、サブバンド３×Ａ−２、３×Ａ−１、３×Ａから構成される。 In addition, subbands required to increase the resolution by one size are collectively referred to as a hierarchy. The layer number = 0 is composed of only one subband 0 (LL), but the layer number of 1 or more is composed of high-frequency subbands {HL, LH, HH}. That is, layer number = 1 is composed of subbands 1, 2, and 3, and layer number = 2 is composed of subbands 4, 5, and 6. Therefore, the layer number = A is composed of subbands 3 × A−2, 3 × A−1, and 3 × A.
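
The subband and layer numbering described above can be checked with a short sketch. The function names are hypothetical; the formulas 3 × m + 1 and {3 × A − 2, 3 × A − 1, 3 × A} are taken from the text.

```python
def subband_count(dwt_passes):
    """Number of subbands after applying the DWT `dwt_passes` times: 3 * m + 1."""
    return 3 * dwt_passes + 1


def subbands_in_layer(layer):
    """Subband numbers that make up a layer.

    Layer 0 holds only subband 0 (LL); layer A >= 1 holds the three
    high-frequency subbands 3A-2 (HL), 3A-1 (LH) and 3A (HH).
    """
    if layer == 0:
        return [0]
    return [3 * layer - 2, 3 * layer - 1, 3 * layer]


m = 3                                    # DWT is applied three times in the embodiment
print(subband_count(m))                  # 10
for layer in range(m + 1):               # layers 0..3, so the maximum layer number equals m
    print(layer, subbands_in_layer(layer))
```

Taken together, layers 0 to 3 cover subbands 0 to 9 exactly once, which agrees with the ten subbands stated for three DWT passes.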

上記を踏まえ、ステップＳ３０６の符号量制御処理を図４のフローチャートに従って説明する。ここでもう一度、着目カメラアダプタ１２０は、センサシステム１０１Ａ〜１０１Ｌにおける、最上流を０番目とするｎ番目のセンサシステムに属する点に注意されたい。また、着目カメラアダプタ１２０は、上流より符号化画像データＤ（０）、…、Ｄ（ｎ−１）を受信し、自身は符号化画像データＤ（ｎ）を生成する点も注意されたい。 Based on the above, the code amount control processing in step S306 will be described with reference to the flowchart of FIG. Here again, it should be noted that the camera adapter 120 of interest belongs to the nth sensor system in the sensor systems 101A to 101L where the most upstream is the 0th. It should also be noted that the camera adapter 120 of interest receives the encoded image data D (0),..., D (n-1) from the upstream and generates the encoded image data D (n) itself.

また、着目カメラアダプタ１２０は、自身が保持するカウンタＣの値から、他のカメラアダプタが保持しているカウンタＣの値を特定できる。例えば、着目カメラアダプタ１２０がカメラアダプタ１２０Ｂであり、そのカウンタＣが仮に１であるとき、上流のカメラアダプタ１２０Ａが保持しているカウンタＣは０であり、カメラアダプタ１２０Ｃが保持しているカウンタＣは２、カメラアダプタ１２０Ｄが保持しているカウンタＣは０であると特定できる。換言すれば、着目カメラアダプタ１２０は、上流より受信した符号化画像データＤ（０）、…、Ｄ（ｎ−１）、及び、自身が生成した符号化画像データＤ（ｎ）のいずれが第１のロスレス圧縮に従って得た符号化画像データであり、いずれが第２のロスレス圧縮に従って得た符号化データであるかを判定できる。 Further, the camera adapter 120 of interest can determine the value of the counter C held by any other camera adapter from the value of the counter C that it holds itself. For example, when the camera adapter 120 of interest is the camera adapter 120B and its counter C is 1, the counter C held by the upstream camera adapter 120A is 0, the counter C held by the camera adapter 120C is 2, and the counter C held by the camera adapter 120D is 0. In other words, the camera adapter 120 of interest can determine which of the encoded image data D(0), ..., D(n-1) received from upstream and the encoded image data D(n) generated by itself was obtained by the first lossless compression and which was obtained by the second lossless compression.
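
The inference described above follows from the fact that neighbouring adapters always differ by one step in the 0 to G-1 cycle. A minimal sketch, assuming adapters are indexed 0 to 11 from upstream to downstream and that `own_counter` is the counter value associated with the frame being processed (the function name is hypothetical):

```python
def counter_of(other_index, own_index, own_counter, interval):
    """Counter C of adapter `other_index`, inferred from this adapter's own counter."""
    return (own_counter + (other_index - own_index)) % interval


# Example from the text: the adapter of interest is 120B (index 1) with C = 1.
G = 3
print([counter_of(i, 1, 1, G) for i in range(4)])   # [0, 1, 2, 0] for 120A..120D

# Data with counter 0 was produced by the first lossless compression.
first_lossless = [i for i in range(4) if counter_of(i, 1, 1, G) == 0]
print(first_lossless)                                # [0, 3]
```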

そこで、本実施形態における着目カメラアダプタは、符号化画像データＤ（０）、…、Ｄ（ｎ）のうち、第２のロスレス圧縮に従って生成された符号化画像データを特定し、その中からロスレス符号化されている階層のＤＷＴ係数をロッシー符号化に変更することでデータ量を削減する。実施形態では、データ量を削減する符号化画像データは、その符号化データを生成した際のカウンタＣの値によって選択される。例えば、カウンタＣが“１”であるときに生成した符号化画像データのうち、最大階層の符号化データをロッシー符号化し、データ量の削減を図る。更に削減が必要な場合には、カウンタＣが“２”であるとき生成した符号化画像データのうち、最大階層の符号化データをロッシー符号化し、データ量を削減する。そして、それでも、更に削減が必要な場合には、カウンタＣが“１”であるときに生成した符号化データのうち、最大階層から１つ下階層の符号化データをロッシー符号化に変更する。以下、かかる処理をステップＳ３０５が示す条件が満足するまで繰り返す。 Therefore, the camera adapter of interest in the present embodiment identifies, among the encoded image data D(0), ..., D(n), the encoded image data generated by the second lossless compression, and reduces the data amount by changing the DWT coefficients of a lossless-encoded layer in that data to lossy encoding. In the embodiment, the encoded image data whose data amount is to be reduced is selected by the value of the counter C at the time the data was generated. For example, for the encoded image data generated when the counter C was “1”, the encoded data of the highest layer is lossy-encoded to reduce the data amount. If further reduction is necessary, the encoded data of the highest layer of the encoded image data generated when the counter C was “2” is lossy-encoded. If still further reduction is necessary, the encoded data one layer below the highest layer of the encoded image data generated when the counter C was “1” is changed to lossy encoding. This processing is repeated until the condition checked in step S305 is satisfied.

図４は、図３のステップＳ３０６の処理の具体例である。なお、以下に説明する各変数は、着目カメラアダプタ１２０内の不図示の内部メモリに確保されているものである。 FIG. 4 is a specific example of the process of step S306 of FIG. 3. Each variable described below is reserved in an internal memory (not shown) in the camera adapter 120 of interest.

ステップＳ４０１にて、着目カメラアダプタ１２０は、削減対象の階層番号を表すための変数Ｒに最大階層番号を設定する。これは、変数Ｒが示す階層番号のロスレス符号データがロッシー符号データに変換されることを意味する。本実施形態では、ＤＷＴは３回実行するものとしているので、変数Ｒは“３”で初期化されることになる。なお、以下の説明で、変数Ｒが示す階層を単に階層Ｒと表現する。 In step S401, the camera adapter 120 of interest sets the maximum layer number in the variable R, which represents the layer number to be reduced. This means that the lossless code data of the layer number indicated by the variable R will be converted into lossy code data. In this embodiment, since the DWT is executed three times, the variable R is initialized to “3”. In the following description, the layer indicated by the variable R is simply referred to as layer R.

ステップＳ４０２にて、着目カメラアダプタ１２０は、階層Ｒのロスレス符号化データをロッシー符号化データに変換する際の量子化パラメータを内部メモリより取得する。この量子化パラメータは全センサシステムのカメラアダプタに共通であり、画像コンピューティングサーバ２００からの指示に従って格納されるものである。 In step S402, the camera adapter 120 of interest acquires the quantization parameter for converting the lossless encoded data of the layer R into the lossy encoded data from the internal memory. This quantization parameter is common to the camera adapters of all sensor systems, and is stored according to an instruction from the image computing server 200.

ステップＳ４０３にて、着目カメラアダプタ１２０は、データ量削減対象の符号化画像データを選定するために用いる変数Ｐを、“１”で初期化する。実施形態では、カウンタＣ＝０の場合に生成された符号化画像データは第１のロスレス圧縮であって、符号量削減非対象の符号化データである。よって、変数Ｐは“０”以外であれば、“２”で初期化しても構わない。 In step S403, the camera adapter 120 of interest initializes the variable P used for selecting the encoded image data to be reduced in data amount to “1”. In the embodiment, the encoded image data generated when the counter C = 0 is the first lossless compression and is encoded data that is not subject to code amount reduction. Therefore, if the variable P is other than “0”, it may be initialized with “2”.

上記によって、符号化画像データＤ（０）、…、Ｄ（ｎ）のうちの変数Ｐで特定される符号化画像データ（１つとは限らない）の、階層Ｒのロスレス符号化データがロッシー符号データに変換されるための条件設定が行われたことになる。 With the above, the conditions have been set for converting to lossy code data the lossless encoded data of layer R in the encoded image data (not necessarily only one) identified by the variable P among the encoded image data D(0), ..., D(n).

ステップＳ４０４にて、着目カメラアダプタ１２０は、符号化画像データを特定するための変数ｉに“０”を代入し、初期化する。 In step S404, the camera adapter 120 of interest substitutes “0” for the variable i for specifying the encoded image data and initializes it.

次に、ステップＳ４０５にて、着目カメラアダプタ１２０は、符号化画像データＤ（ｉ）を生成した際のカウンタＣの値が、変数Ｐが示す値と一致するか否かを判定する。符号化画像データＤ（ｉ）を生成したときのカウンタＣの値が、着目カメラアダプタ１２０が保持するカウンタＣの値から判定できるのは、既に説明した通りである。なお、符号化画像データのヘッダに、カウンタＣの値を格納されるようにするのであれば、それに従って判定しても構わない。 Next, in step S405, the camera adapter 120 of interest determines whether or not the value of the counter C when the encoded image data D (i) is generated matches the value indicated by the variable P. As described above, the value of the counter C when the encoded image data D (i) is generated can be determined from the value of the counter C held by the camera adapter 120 of interest. If the value of the counter C is stored in the header of the encoded image data, the determination may be made accordingly.

符号化画像データＤ（ｉ）を生成したときのカウンタＣの値が、変数Ｐで示される値と一致した場合、符号化画像データＤ（ｉ）は符号量削減対象の候補となる。そこで、着目カメラアダプタ１２０は処理をステップＳ４０６に進める。また、符号化画像データＤ（ｉ）を生成したときのカウンタＣの値が、変数Ｐで示される値と不一致であった場合、着目カメラアダプタ１２０は、符号化画像データＤ（ｉ）が符号量削減候補ではないと判断し、処理をステップＳ４０９に進める。 When the value of the counter C at the time the encoded image data D(i) was generated matches the value indicated by the variable P, the encoded image data D(i) is a candidate for code amount reduction, and the camera adapter 120 of interest advances the process to step S406. When the value of the counter C at the time the encoded image data D(i) was generated does not match the value indicated by the variable P, the camera adapter 120 of interest determines that the encoded image data D(i) is not a candidate for code amount reduction, and advances the process to step S409.

ステップＳ４０６にて、着目カメラアダプタ１２０は、符号化画像データＤ（ｉ）における、階層Ｒのデータがロスレスデータであるかを確認する。ロスレスデータであるかどうかは、符号化画像データＤ（ｉ）の階層Ｒの量子化パラメータを参照し、量子化されていない場合はロスレスデータであると判断する。階層Ｒがロスレスデータである場合、着目カメラアダプタ１２０は、符号化画像データＤ（ｉ）の階層Ｒの符号データを削減することが可能であると判断し、処理をステップＳ４０７へ進める。また、階層Ｒのデータがロッシーデータと判断した場合、着目カメラアダプタ１２０は、階層Ｒについてはこの段階でのデータ削減処理を行わないものと決定し、処理をステップＳ４０９へ進める。なお、階層Ｒのデータがロッシーデータとなっている状況は、着目カメラアダプタよりも上流に位置する他のカメラアダプタにて既にロッシーデータに変更されたことを意味する。 In step S406, the camera adapter 120 of interest checks whether the data of the layer R in the encoded image data D (i) is lossless data. Whether or not the data is lossless data is determined by referring to the quantization parameter of the layer R of the encoded image data D (i), and when it is not quantized, it is determined as lossless data. If the layer R is lossless data, the camera adapter 120 of interest determines that the code data of the layer R of the encoded image data D (i) can be reduced, and the process proceeds to step S407. If it is determined that the data in the hierarchy R is lossy data, the camera adapter 120 of interest determines that the data reduction process at this stage is not performed for the hierarchy R, and the process proceeds to step S409. Note that the situation where the data in the hierarchy R is lossy data means that the data has already been changed to lossy data in another camera adapter located upstream from the camera adapter of interest.

ステップＳ４０７にて、着目カメラアダプタ１２０は、符号化画像データＤ（ｉ）の階層Ｒを復号し、ＤＷＴ係数を復元する。そして、ステップＳ４０８にて、着目カメラアダプタ１２０は、復元して得たＤＷＴ係数を、先のステップＳ４０２で取得した階層Ｒの量子化パラメータを使って量子化し、量子化後のＤＷＴ係数を再符号化する。 In step S407, the camera adapter 120 of interest decodes layer R of the encoded image data D(i) and restores the DWT coefficients. In step S408, the camera adapter 120 of interest quantizes the restored DWT coefficients using the quantization parameter for layer R acquired in step S402, and re-encodes the quantized DWT coefficients.

ステップＳ４０９にて、着目カメラアダプタ１２０は、変数ｉに“１”を加算することで、変数ｉを更新する。そして、ステップＳ４１０にて、着目カメラアダプタ１２０は、変数ｉの値と、着目カメラアダプタの順位を表す値“ｎ”とを比較する。ｉ≦ｎの関係にある場合、着目カメラアダプタ１２０は処理をステップＳ４０５に戻し、ｉ＞ｎの関係になるまで上述した処理を繰り返す。そして、ｉ＞ｎの関係を満たした場合、着目カメラアダプタ１２０は処理をステップＳ４１１に進める。 In step S409, the camera adapter 120 of interest updates the variable i by adding “1” to the variable i. In step S410, the camera adapter 120 of interest compares the value of the variable i with the value “n” representing the order of the camera adapter of interest. If the relationship is i ≦ n, the camera adapter 120 of interest returns the process to step S405 and repeats the above-described processing until the relationship of i> n is satisfied. If the relationship i> n is satisfied, the camera adapter 120 of interest advances the process to step S411.

上記の結果、着目カメラアダプタ１２０は、符号化画像データＤ（０）、…、Ｄ（ｎ）のうちの幾つかが再符号化され、その符号量が削減されることになる。それ故、ステップＳ４１１において、着目カメラアダプタ１２０は、総データ量TotalSizeを改めて算出して更新する。そして、ステップＳ４１２にて、着目カメラアダプタ１２０は、更新後の総データ量TotalSizeと閾値Ｍとの比較を行う。Ｍ≧TotalSizeの関係にある場合、着目カメラアダプタ１２０は、下流に転送すべき符号化画像データに対する削減処理が完了したと判断し、本処理を終える。この結果、データ量が削減された符号化画像データを含む符号化画像データＤ（０）、…、Ｄ（ｎ）が下流に転送されることになる。一方、Ｍ＜TotalSizeの関係にある場合、着目カメラアダプタ１２０は、現状の符号化画像データ量では送信できないと判断し、処理をステップＳ４１３に進める。 As a result of the above, some of the encoded image data D(0), ..., D(n) have been re-encoded in the camera adapter 120 of interest, and their code amounts have been reduced. Therefore, in step S411, the camera adapter 120 of interest recalculates and updates the total data amount TotalSize. In step S412, the camera adapter 120 of interest compares the updated total data amount TotalSize with the threshold M. When M ≥ TotalSize, the camera adapter 120 of interest determines that the reduction processing for the encoded image data to be transferred downstream is complete, and ends this process. As a result, the encoded image data D(0), ..., D(n), including the encoded image data whose data amount has been reduced, is transferred downstream. On the other hand, when M < TotalSize, the camera adapter 120 of interest determines that the data cannot be transmitted at its current size, and advances the process to step S413.

ステップＳ４１３にて、着目カメラアダプタ１２０は、変数Ｐに“１”を加算し、更新する。そして、ステップＳ４１４にて、変数Ｐが示す値がカウンタ間隔を示す変数Ｇと一致するか否かを判定する。変数Ｐの値は、０〜Ｇ−１の範囲の値しかとらない。そのため、変数ＰとＧが一致していれば、第２のロスレス圧縮処理で得た全ての符号化画像データの階層Ｒのロッシーデータへの変更が完了したことを意味する。それ故、変数ＰとＧが一致した場合、着目カメラアダプタ１２０は、ステップＳ４１５に処理を進める。また、変数ＰとＧが不一致である場合、着目カメラアダプタ１２０は処理をステップＳ４０４に戻す。 In step S413, the camera adapter 120 of interest adds “1” to the variable P and updates it. In step S414, it is determined whether or not the value indicated by the variable P matches the variable G indicating the counter interval. The value of the variable P takes only a value in the range of 0 to G-1. Therefore, if the variables P and G match, it means that the change of all the encoded image data obtained by the second lossless compression process to the lossy data of the layer R has been completed. Therefore, when the variables P and G match, the camera adapter 120 of interest advances the process to step S415. If the variables P and G do not match, the camera adapter 120 of interest returns the process to step S404.

ステップＳ４１５にて、着目カメラアダプタ１２０は、ロッシー符号化する対象の階層を、これまでよりも１つ下とすべく、変数Ｒから“１”減じ、変数Ｒを更新する。そして着目カメラアダプタ１２０は処理をステップＳ４０２に戻す。 In step S415, the camera adapter 120 of interest updates the variable R by subtracting “1” from the variable R so that the hierarchy to be lossy-encoded is one level lower than before. Then, the camera adapter 120 of interest returns the process to step S402.
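
The loop structure of FIG. 4 (steps S401 to S415) can be sketched as follows. This is a minimal illustration of the control flow only. The class `EncodedImage`, the per-layer size bookkeeping, and the parameter `lossy_ratio` (a stand-in for the size change caused by decoding, quantizing, and re-encoding in steps S407 and S408) are all assumptions. The source also does not state what happens once every layer has been processed; this sketch simply returns False in that case.

```python
from dataclasses import dataclass, field


@dataclass
class EncodedImage:
    counter: int                 # value of counter C when the data was generated
    layer_sizes: dict            # layer number -> code amount of that layer
    lossy_layers: set = field(default_factory=set)   # layers already quantized

    def size(self):
        return sum(self.layer_sizes.values())


def total_size(data):
    return sum(item.size() for item in data)


def reduce_code_amount(data, threshold, interval=3, max_layer=3, lossy_ratio=0.5):
    """Reduce the code amount of data D(0..n) until total_size(data) <= threshold."""
    for layer in range(max_layer, -1, -1):           # R: S401, decremented in S415
        for p in range(1, interval):                 # P: S403, S413, S414 (P = 0 is skipped)
            for item in data:                        # i: S404, S409, S410
                if item.counter != p:                # S405: not a candidate
                    continue
                if layer in item.lossy_layers:       # S406: already lossy upstream
                    continue
                # S407/S408 stand-in: model requantization as a size reduction.
                item.layer_sizes[layer] = int(item.layer_sizes[layer] * lossy_ratio)
                item.lossy_layers.add(layer)
            if total_size(data) <= threshold:        # S411, S412
                return True
    return False                                     # assumption: all layers exhausted


data = [EncodedImage(c, {0: 10, 1: 20, 2: 40, 3: 80}) for c in (0, 1, 2)]
print(reduce_code_amount(data, threshold=400), total_size(data))   # True 370
```

In this run, layer 3 of the data generated with counter 1 is reduced first (total 410), then layer 3 of the data generated with counter 2 (total 370), while the data generated with counter 0 is never touched.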

以上説明したように実施形態によれば、ＤＷＴ係数を符号化したデータの階層データを使って符号量制御を行うことで、転送時間を制御し、伝送遅延を小さくできる。さらに、第１のロスレス圧縮の符号化画像データは一定間隔（実施形態では３）で生成され、必ずロスレス符号化データとして転送されるため、仮想視点コンテンツ作成時に、ロッシー符号化による画質劣化の影響を小さくすることができる。 As described above, according to the embodiment, by performing code amount control using hierarchical data of data obtained by encoding DWT coefficients, it is possible to control transfer time and reduce transmission delay. Furthermore, the first lossless compressed encoded image data is generated at regular intervals (3 in the embodiment) and is always transferred as lossless encoded data. Therefore, when virtual viewpoint content is created, the effect of image quality deterioration due to lossy encoding is affected. Can be reduced.

また、符号量を削減する対象の符号化画像データは、センサシステム１０１Ａ〜１０１Ｌの論理的な並びの間に２つ入れて選択されていく。従って、センサシステム１０１Ａ〜１０１Ｌが物理的に図１（ａ）に示すように配置されている場合には、隣接する視点位置の符号化画像データが削減されるのでなく、分散した視点位置の符号化画像データの符号量が削減されていくので、符号量削減による画質劣化が偏った視点位置に集中することを抑制できる。 In addition, the encoded image data whose code amount is to be reduced is selected at every third position in the logical order of the sensor systems 101A to 101L, that is, with two sensor systems skipped between successive selections. Therefore, when the sensor systems 101A to 101L are physically arranged as shown in FIG. 1(a), the code amount is reduced for encoded image data at dispersed viewpoint positions rather than at adjacent viewpoint positions, so that image quality degradation caused by code amount reduction can be kept from concentrating at particular viewpoint positions.

また、再符号化する際には、ＤＷＴ係数まで復元できれば良いため（ＤＷＴ処理を行う必要はないため）、その処理負荷を小さくできる。 In addition, when re-encoding is performed, it is only necessary to restore the DWT coefficient (since it is not necessary to perform DWT processing), the processing load can be reduced.

なお、上記実施形態では、ＤＷＴを行う回数を３回に固定して説明したが、カメラアダプタの数とフレーム間隔に応じて適宜変更してもよい。たとえば、カウンタ間隔Ｇが小さい値の時には、削減対象となる符号化画像データ群Ｄ（ｘ）として特定される符号化画像データの数が多くなるため、階層数が少なくてもよい。一方、カウンタ間隔Ｇが大きい値の場合には、削減対象となる符号化画像データ群Ｄ（ｘ）として特定される符号化画像データの数が少なくなるため、階層数(ＤＷＴの実行回数）を増やして、削減可能な範囲を広げてもよい。 In the above-described embodiment, the number of times of performing DWT is fixed to 3 times. However, the number may be appropriately changed according to the number of camera adapters and the frame interval. For example, when the counter interval G is a small value, the number of encoded image data specified as the encoded image data group D (x) to be reduced increases, so the number of hierarchies may be small. On the other hand, when the counter interval G is a large value, the number of encoded image data specified as the encoded image data group D (x) to be reduced decreases, so the number of layers (number of executions of DWT) is set. You may increase and extend the range which can be reduced.

ここで上記実施形態におけるデータ転送量の上限を示す閾値Ｍと総データ量TotalSizeとの比較について考察する。上流に位置するカメラアダプタほど、それより上流のカメラアダプタから受信する符号化データの個数は少ないと言える。つまり、上流に位置するカメラアダプタほど、総データ量は閾値以下となる傾向にある。換言すれば、下流に位置するカメラアダプタほど総データ量は閾値を超える蓋然性が高くなる。これは、第２のロスレス圧縮で生成された符号化画像データのロッシーデータへの変更処理を行う頻度は、下流のカメラアダプタほど多くなることを意味する。そこで、この処理負荷の偏りを分散させるため、閾値Ｍをカメラアダプタ毎に異なるようにしても良い。 Here, the comparison between the threshold M, which indicates the upper limit of the data transfer amount, and the total data amount TotalSize in the above embodiment is considered. The further upstream a camera adapter is located, the fewer pieces of encoded data it receives from the camera adapters upstream of it. That is, the total data amount tends to be at or below the threshold for camera adapters located further upstream. In other words, the further downstream a camera adapter is located, the more likely its total data amount is to exceed the threshold. This means that the further downstream a camera adapter is, the more often it has to change encoded image data generated by the second lossless compression into lossy data. To spread this uneven processing load, the threshold M may be made different for each camera adapter.

For example, let B be the bandwidth allowed to this system out of the communication bandwidth available on the transmission path connecting the sensor systems 101A to 101L, and let N be the number of sensor systems 101. The front-end server 210 of the image computing server 200 then obtains the threshold M for the camera adapter of the i-th sensor system 101 (i = 0, 1, ..., N−1) from equation (1):
M = {B / N} × (i + 1) … (1)
The front-end server 210 then has each camera adapter set the value M calculated for it as its threshold. The thresholds M set in the camera adapters 120A to 120L differ from one another, but each camera adapter simply performs processing according to the flowcharts of FIGS. 2 to 4. As a result, the load of the code amount control processing can be kept from concentrating on the downstream side.
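As an illustration of equation (1), the following Python sketch computes the per-adapter thresholds. It is a minimal sketch and not part of the disclosed embodiment: the function name `adapter_thresholds` and the example bandwidth value are assumptions introduced here, and the unit of B is left abstract.

```python
def adapter_thresholds(bandwidth: float, num_adapters: int) -> list[float]:
    """Return the threshold M for adapters i = 0 .. N-1 per equation (1).

    bandwidth    -- allowed bandwidth B (unit left abstract; an assumption)
    num_adapters -- number N of daisy-chained sensor systems
    """
    if num_adapters <= 0:
        raise ValueError("num_adapters must be positive")
    share = bandwidth / num_adapters          # B / N
    # Adapter i forwards its own data plus that of the i adapters upstream,
    # so its budget is (i + 1) shares.
    return [share * (i + 1) for i in range(num_adapters)]


# Hypothetical example: 12 sensor systems sharing a budget of 1200 units.
thresholds = adapter_thresholds(bandwidth=1200.0, num_adapters=12)
```

Under this allocation the most downstream adapter receives the full budget B, while each upstream adapter is limited to a proportional share, which is what moves part of the reduction work upstream.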

[Modification of First Embodiment]
In the first embodiment, step S408 processes the DWT coefficients of all subbands belonging to layer R (the three subbands HL, LH, and HH). Because relatively many DWT coefficients are targeted, the code amount reduction for each layer removes a relatively large amount of data, and the number of repetitions of the processing from S405 onward in FIG. 4 can be kept small. However, this may reduce the data amount more than necessary, which in turn makes greater image quality degradation more likely.

This modification therefore describes a method of reducing the data amount in units of subbands rather than in units of layers. The modification differs from the first embodiment only in the flow of the code data reduction processing shown in FIG. 4. The configuration of the image processing system is therefore the same as in the first embodiment, and its description is omitted.

FIG. 6 is a flowchart showing the data amount reduction processing in this modification. As noted above, it replaces FIG. 4; that is, the processing of FIG. 6 corresponds to the processing of step S306 shown in FIG. 3. In FIG. 6, processing that is the same as in the first embodiment is given the same reference numerals as in FIG. 4, and its detailed description is omitted.

In step S601, the camera adapter 120 of interest initializes the variable S, which identifies the subband number targeted for data reduction, by assigning it the maximum subband number. In this embodiment the DWT is executed three times, so the maximum subband number is "9" (see FIG. 5). Hereinafter, the subband indicated by the variable S is simply referred to as subband S.

In step S602, the camera adapter 120 of interest acquires, from internal memory, the quantization parameter used to change the lossless data of subband S into lossy data.

Steps S403 to S405 operate in the same manner as in the first embodiment.

In step S405, the camera adapter 120 of interest determines whether the value of the counter C at the time the encoded image data D(i) was generated equals the value indicated by the variable P. If it does, the camera adapter 120 of interest advances the process to step S603.

In step S603, the camera adapter 120 of interest determines whether the data of subband S of the encoded image data D(i) is lossless data; subband S is judged to be lossless if it has not been quantized. If the data of subband S is lossless, the camera adapter 120 of interest advances the process to step S604; otherwise, it advances the process to step S409.

In step S604, the camera adapter 120 of interest decodes the data of subband S of the encoded image data D(i) and obtains the DWT coefficients of subband S. Then, in step S605, it quantizes the obtained DWT coefficients using the quantization parameter acquired in step S602 and re-encodes them.
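The quantization in step S605 can be pictured with the following Python sketch. The text does not specify the form of the quantization parameter, so a uniform step size with truncation toward zero is assumed here purely for illustration; `quantize` and `dequantize` are hypothetical names, and the entropy coding stage is omitted.

```python
def quantize(coeffs: list[int], step: int) -> list[int]:
    """Quantize integer DWT coefficients with a uniform step (assumed form).

    Truncates toward zero so that small coefficients collapse to 0,
    which is where the code amount saving comes from.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    return [(-1 if c < 0 else 1) * (abs(c) // step) for c in coeffs]


def dequantize(levels: list[int], step: int) -> list[int]:
    """Approximate reconstruction on the decoder side (lossy)."""
    return [v * step for v in levels]


# Hypothetical coefficients of one subband and a step size of 4.
levels = quantize([7, -7, 3, 0], step=4)
restored = dequantize(levels, step=4)
```

Because the quantized values differ from the originals after reconstruction, a subband processed this way is no longer lossless, which is the condition checked in step S603.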

Steps S409 to S414 operate in the same manner as in the first embodiment. In step S414, the camera adapter 120 of interest compares the value of the variable P with the counter interval G. If the two match, it advances the process to step S606; if they do not match, it returns the process to S404.

The process reaches step S606 when subband S of every piece of encoded image data generated while the counter C had the value indicated by the variable P has been changed to lossy data. Therefore, in step S606, the camera adapter 120 of interest decrements the variable S by "1" so that code amount control moves to the next subband, and returns the process to step S602.

According to this modification, the subbands HH, LH, and HL of the maximum layer "3" are converted to lossy data in that order, reducing the code amount. If the reduction is still insufficient, the reduction processing moves on to the subbands HH, LH, and HL of the next layer "2", in that order.
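The order in which subbands are visited can be sketched as follows. FIG. 5 is not reproduced in this text, so the numbering used here (0 for LL, then HL, LH, HH for each layer in ascending order) is an assumption chosen to agree with the maximum subband number "9" and the HH, LH, HL order stated above. The sketch also assumes the LL subband is not a reduction target; the function names are hypothetical.

```python
ORIENTATIONS = ("HL", "LH", "HH")  # assumed numbering order within one layer


def subband_info(s: int) -> tuple[int, str]:
    """Map a subband number to (layer, orientation) under the assumed numbering."""
    if s < 0:
        raise ValueError("subband number must be non-negative")
    if s == 0:
        return (0, "LL")
    return ((s - 1) // 3 + 1, ORIENTATIONS[(s - 1) % 3])


def reduction_order(num_dwt: int) -> list[tuple[int, int, str]]:
    """List (S, layer, orientation) in the order S is decremented in step S606."""
    return [(s, *subband_info(s)) for s in range(3 * num_dwt, 0, -1)]


# Three DWT passes, as in the embodiment.
order = reduction_order(3)
```

With three DWT passes the list starts at subband 9 (layer 3, HH) and ends at subband 1 (layer 1, HL), so the finest detail is sacrificed first.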

The difference in data amount between the encoded image data of the first embodiment and that of this modification is explained with reference to the schematic diagram of FIG. 7. For simplicity, the figure shows the data structure of encoded image data generated by a single sensor system. Reference numeral 701 indicates the state in which all subbands are losslessly encoded. Reference numeral 702 indicates an example of encoded image data obtained by the code amount reduction processing of this modification. Reference numeral 703 indicates the encoded image data after the code amount reduction processing of the first embodiment. In the encoded image data 703 of the first embodiment, all six subbands included in layers 1 and 2 are changed to lossy, whereas in this modification only the three subbands included in layer 2 and, in layer 2, subband 3 become lossy. The data can therefore clearly be transferred with better image quality than in the first embodiment.

In this modification as well, all camera adapters in the sensor systems reduce data by the same method; however, the threshold M set for each sensor system may be set according to equation (1) above so that the re-encoding load is distributed.

[Second Embodiment]
In the first embodiment, lossless data is converted into lossy data layer by layer. However, when finding corresponding points for generating virtual viewpoint content, it may be preferable to be able to reproduce the data losslessly even if only at low resolution. The second embodiment therefore describes an example that gives priority to transferring low-resolution data as lossless data. The second embodiment differs from the first only in the code amount reduction processing of step S306 in FIG. 3.

Accordingly, the details of step S306 are described below with reference to the flowchart of FIG. 8. FIG. 8 replaces FIG. 4 of the first embodiment. Processing that is the same as in FIG. 4 is given the same reference numerals, and its description is omitted.

The processing in steps S401 to S405 is the same as in the first embodiment. In step S406, the camera adapter 120 of interest determines whether the data of layer R of the encoded image data D(i) is lossless data. If layer R is lossless, the camera adapter 120 of interest advances the process to step S407, as in the first embodiment. If the data of layer R of the encoded image data D(i) is lossy, it advances the process to step S801.

In step S801, the camera adapter 120 of interest deletes all the encoded data of layer R of the encoded image data D(i). Then, in step S802, it rewrites the header information of the encoded image data D(i) to indicate that all the data of layer R has been deleted. For example, if the header records the code data size of each layer, the camera adapter 120 of interest writes 0 as the code data size of layer R. Alternatively, the image size may be rewritten to the image size decodable at layer "R−1". Either way, it suffices to output information from which it can be determined, when the virtual viewpoint content is finally generated, that layer R has no data. After step S802, the camera adapter 120 of interest advances the process to step S409 and updates the variable i by adding "1" to it. Steps S410 to S414 operate exactly as in the first embodiment, so their description is omitted here.

According to the second embodiment, in contrast to the first embodiment, the encoded data of the highest layer passes through the stages lossless data → lossy data → deletion. As a result, the lower the resolution, the more likely lossless data is to be preserved.
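The progression of a layer through these stages can be sketched as a small state transition in Python. This is only an illustration of the branching in steps S406 to S408 and S801 to S802; the names `LayerState` and `reduce_layer` are hypothetical, and the actual decoding, quantization, and header rewriting are not modeled.

```python
from enum import Enum


class LayerState(Enum):
    LOSSLESS = "lossless"
    LOSSY = "lossy"
    DELETED = "deleted"


def reduce_layer(state: LayerState) -> LayerState:
    """Apply one reduction pass to layer R and return its new state."""
    if state is LayerState.LOSSLESS:
        # Corresponds to S407/S408: quantize and re-encode the layer.
        return LayerState.LOSSY
    # Corresponds to S801/S802: drop the layer and mark it in the header.
    # A layer that is already deleted simply stays deleted.
    return LayerState.DELETED


# Two successive passes over the highest layer.
first_pass = reduce_layer(LayerState.LOSSLESS)
second_pass = reduce_layer(first_pass)
```

Since the highest layer absorbs two reduction passes before any lower layer is touched, the lower-resolution layers stay lossless for longer.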

[Third Embodiment]
In the first and second embodiments, the image quality of the data obtained by encoding after the DWT is lowered uniformly, so that the code data generated by all of the sensor systems 101A to 101L reaches the image computing server 200, even if at low image quality.

However, not all of the sensor systems 101A to 101L necessarily operate normally at all times. With such situations in mind, the image computing server 200 may implement a highly robust algorithm that can generate virtual viewpoint content even when code data from some of the sensor systems is missing. The third embodiment therefore describes a method of controlling the transfer data amount by thinning out data obtained by losslessly encoding the DWT coefficients.

Specifically, the camera adapter 120 of interest identifies, among the encoded image data D(0), ..., D(n), one piece of encoded image data D(i) that was generated at the counter C value equal to the value indicated by the variable P. It then deletes encoded data from D(i), starting from the maximum-resolution layer, until the total data amount TotalSize of the code data to be transmitted becomes equal to or less than the threshold M. If further reduction is needed even after deleting down to the LL subband, that is, after deleting D(i) entirely, the camera adapter of interest does the same for the other encoded image data generated at the counter C value indicated by the variable P. If the reduction is still insufficient, it updates the value of the variable P and identifies new encoded image data to be reduced.

The third embodiment differs from the first and second embodiments only in the code data reduction processing of step S306 in FIG. 3. FIG. 9 is a flowchart of the code data deletion processing in the third embodiment; it should be understood as replacing FIGS. 4 and 6.

In step S901, the camera adapter 120 of interest initializes the variable P, which identifies the counter C value targeted for deletion, by assigning it "1". In step S902, the camera adapter 120 of interest initializes the variable i, which identifies a piece of encoded image data, by assigning it "0".

In step S903, the camera adapter 120 of interest determines whether the encoded image data D(i) has been deleted. If it has, the camera adapter 120 of interest advances the process to step S912. If the encoded image data D(i) still exists, it advances the process to step S904. In step S904, the camera adapter 120 of interest determines whether the value of the counter C at the time the encoded image data D(i) was generated matches the value indicated by the variable P. If they do not match, it determines that D(i) is not a deletion target and advances the process to step S912. If they match, it determines that D(i) is a deletion target and advances the process to step S905.

In step S905, the camera adapter 120 of interest assigns to the variable R the number of the highest layer among the encoded data contained in the encoded image data D(i). In step S906, it deletes all the encoded data of layer R of D(i). In step S907, it rewrites the header information of D(i) to indicate that all the data of layer R has been deleted. Steps S906 and S907 are the same processing as steps S801 and S802 in FIG. 8. In step S908, the camera adapter 120 of interest recalculates and updates the total data amount TotalSize. In step S909, it compares the updated TotalSize with the threshold M. If M ≥ TotalSize, it determines that the reduction processing for the encoded image data to be transferred downstream is complete and ends this processing. If M < TotalSize, it determines that further encoded data must be deleted and advances the process to step S910.

In step S910, the camera adapter 120 of interest subtracts "1" from the variable R to move the deletion target one layer down. In step S911, it compares the variable R with "0". If R is 0 or greater, it determines that deletable encoded data remains in the encoded image data D(i) and returns the process to step S906. If R is less than "0" (that is, "−1"), it determines that the encoded data of all layers of D(i) has been deleted, in other words that D(i) no longer exists, and advances the process to step S912.

In step S912, the camera adapter 120 of interest adds "1" to the variable i. In step S913, it compares the updated value of i with the value "n" representing the position of the camera adapter of interest in the chain. If i ≤ n, it returns the process to step S903. If i > n, it advances the process to step S914. In step S914, the camera adapter 120 of interest adds "1" to the variable P in order to change the encoded data targeted for deletion, and then returns the process to step S902.
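The deletion loop of steps S901 to S914 can be sketched in Python as follows. This is a simplified model under stated assumptions, not the disclosed implementation: each piece of encoded image data is represented as a dictionary holding its counter value and a list of per-layer sizes (index 0 being the LL layer), the header rewrite of S907 is omitted, and the two guards (the early return and the upper bound on P given by the counter interval) are additions made here so that the sketch always terminates. All names are hypothetical.

```python
def delete_until_fits(images: list[dict], threshold: int, counter_interval: int) -> int:
    """Delete layers, highest first, until the total size fits the threshold.

    images           -- [{"counter": int, "layers": [size of layer 0, 1, ...]}]
    threshold        -- threshold M
    counter_interval -- counter interval G; counter 0 data is never touched
    Returns the resulting total size (TotalSize). Mutates `images` in place.
    """
    total = sum(sum(img["layers"]) for img in images)
    if total <= threshold:                      # guard added in this sketch
        return total
    for p in range(1, counter_interval):        # S901, S914
        for img in images:                      # S902, S912, S913
            # S903: already deleted?  S904: generated at counter value P?
            if not img["layers"] or img["counter"] != p:
                continue
            while img["layers"]:                # S905, S910, S911
                total -= img["layers"].pop()    # S906, S908: drop highest layer
                if total <= threshold:          # S909
                    return total
    return total                                # nothing more may be deleted


# Hypothetical data: three images of four layers each (300 units in total).
example = [{"counter": c, "layers": [10, 20, 30, 40]} for c in (0, 1, 2)]
remaining = delete_until_fits(example, threshold=250, counter_interval=3)
```

Images generated at counter value 0 are never selected because P starts at 1, so their lossless data always reaches the server intact.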

According to the processing described above, fewer pieces of code data are transferred than in the first and second embodiments, but more lossless code data can be received on the server side.

Although it is desirable to have images from many viewpoint positions in order to generate a virtual viewpoint image, a virtual viewpoint image of the originally intended quality can be generated as long as a certain suitable minimum number of viewpoint images is obtained. In that sense, the reduction processing described above may be performed down to this minimum number.

[Fourth Embodiment]
In the first to third embodiments, of the twelve sensor systems 101A to 101L, the four sensor systems for which the counter C is 0 perform the first lossless compression, and the eight sensor systems for which the counter C is 1 or 2 perform the second lossless compression.
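The split between the two compression modes can be illustrated with a short Python sketch. The way the counter C is assigned is described elsewhere in the document; here it is simply assumed that C cycles through 0, 1, ..., G−1 along the daisy chain with a counter interval G of 3, which reproduces the four-to-eight split stated above. The function names and mode labels are hypothetical.

```python
def compression_mode(counter: int) -> str:
    """Return the compression mode used for a given counter value C."""
    # C == 0: first lossless compression (pixel values encoded directly).
    # C != 0: second lossless compression (DWT followed by encoding).
    return "first_lossless" if counter == 0 else "second_lossless"


def assign_modes(num_systems: int, counter_interval: int) -> list[str]:
    """Assign a mode to each sensor system, assuming C = index mod G."""
    if counter_interval <= 0:
        raise ValueError("counter_interval must be positive")
    return [compression_mode(i % counter_interval) for i in range(num_systems)]


# Twelve sensor systems with an assumed counter interval of 3.
modes = assign_modes(num_systems=12, counter_interval=3)
```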

The fourth embodiment describes an example in which all sensor systems perform the second lossless compression, which executes the DWT.

The fourth embodiment differs from the first embodiment in the processing for creating transfer data shown in FIG. 2. The transfer data creation flow in the fourth embodiment is therefore described with reference to the flowchart of FIG. 10.

FIG. 10 differs from FIG. 2 only in that it has neither the determination processing of step S205 nor the lossless compression of pixel values of step S206. Everything else is the same as in FIG. 2, so the same reference numerals are used and the description is omitted.

Steps S201 to S204 operate exactly as in the first embodiment. The frame image acquired by the camera adapter of interest in step S204 always undergoes the second lossless compression in step S208, which generates lossless encoded image data. The processing from step S209 onward is also exactly the same as in the first embodiment, so its description is omitted here.

Because the processing in step S210 is the same as in the first embodiment, in the fourth embodiment all sensor systems generate encoded image data by the second lossless compression, but the encoded image data generated when the counter C is "0" is supplied to the image computing server 200 without being re-encoded.

As described above, according to the first to fourth embodiments, the code amount of the entire system can be controlled while the server side still receives at least a minimum amount of lossless data. The server side can therefore create high-quality virtual viewpoint content in real time.

[Other Embodiments]
The fourth embodiment described a method in which every sensor system executes the DWT on the subject image and data is reduced layer by layer as in the first embodiment. However, it may also be combined with the subband-based data reduction shown in the modification of the first embodiment, the layer-based data reduction shown in the second embodiment, or the code data deletion shown in the third embodiment.

In the embodiments, the DWT coefficients are losslessly compressed using Golomb codes, but JPEG 2000 lossless compression may be used instead. That is, the DWT coefficients are produced with the JPEG 2000 5/3 filter and then bit-plane coded to create DWT-based lossless code data. In that case, code amount control reduces the code amount by deleting data from the tail of each subband's code data. Because bit-plane coding has a larger processing load than Golomb coding, the load of encoding in the sensor systems 101 and of decoding in the image computing server 200 is expected to be heavier. During code amount control, however, there is no need to decode, quantize, and re-encode the DWT coefficients; bit-plane data can be deleted from the code data as it is, so the processing load of code amount control is reduced.

As described above, the processing characteristic of the embodiments is performed by the camera adapter in each sensor system. All functions of the camera adapter may be implemented by a program, or part of them may be implemented by hardware or circuitry. When implemented by a program, the program may be supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus may read and execute the program. The functions can also be implemented by a circuit (for example, an ASIC) that realizes one or more of them.

DESCRIPTION OF SYMBOLS
100 ... Image processing system, 101A to 101L ... Sensor system, 111A to 111L ... Camera, 120A to 120L ... Camera adapter, 150 ... Switching hub, 200 ... Image computing server, 300 ... Network, 400 ... End user terminal

Claims

An image processing apparatus for providing an image at one viewpoint position among images at a plurality of viewpoint positions used for generating a virtual viewpoint image, the apparatus comprising:
acquisition means for acquiring an image from imaging means;
encoding means for transforming the image acquired by the acquisition means to generate lossless encoded image data;
receiving means capable of receiving one or more pieces of encoded image data from a first other device;
control means for performing code amount control when the total data amount of the encoded image data received by the receiving means and the encoded image data obtained by the encoding means exceeds a preset threshold; and
transmission means for transmitting, to a second other device, the encoded image data received by the receiving means and the encoded image data obtained by the encoding means when the total data amount is equal to or less than the threshold, and for transmitting, to the second other device, the encoded image data obtained through the code amount control by the control means when the total data amount exceeds the threshold,
wherein the control means includes:
specifying means for specifying, among the encoded image data received by the receiving means and the encoded image data obtained by the encoding means, encoded image data whose code amount is to be reduced, and for specifying transform coefficients whose code amount is to be reduced in the specified encoded image data; and
reducing means for reducing the code amount of the specified transform coefficients of the encoded image data specified by the specifying means,
and wherein the specification by the specifying means and the reduction by the reducing means are repeated until the total data amount becomes equal to or less than the threshold.

The image processing apparatus according to claim 1, wherein the transform performed by the encoding unit is a wavelet transform.

The image processing apparatus according to claim 2, wherein the encoding means generates lossless encoded image data by applying a predetermined encoding, without quantization, to the wavelet transform coefficients of the subbands of each resolution obtained by applying the wavelet transform to the image acquired by the acquisition means.

The image processing apparatus according to claim 3, wherein the specifying unit sequentially specifies encoded image data corresponding to viewpoint positions at predetermined intervals with respect to an array of a plurality of viewpoint positions.

The specifying means specifies, as a reduction target, reversible wavelet transform coefficients in all subbands belonging to one resolution in the order from the maximum resolution to the minimum resolution in the specified encoded image data,
The reducing means quantizes the wavelet transform coefficient specified by the specifying means according to a preset quantization parameter, and re-encodes the quantized wavelet transform coefficient. The image processing apparatus according to claim 4.

The image processing apparatus according to claim 4, wherein the specifying means preferentially specifies, as a reduction target, the reversible wavelet transform coefficients of one subband at a time, in order from the maximum resolution to the minimum resolution in the specified encoded image data and, within one resolution, according to a preset order, and
the reducing means quantizes the wavelet transform coefficients specified by the specifying means according to a preset quantization parameter and re-encodes the quantized wavelet transform coefficients.

The specifying unit specifies the wavelet transform coefficients in all subbands belonging to one resolution in the order from the maximum resolution to the minimum resolution in the specified encoded image data as a reduction target,
The reduction means is
If the identified wavelet transform coefficient is reversible, quantize according to a preset quantization parameter, re-encode the quantized wavelet transform coefficient,
The image processing apparatus according to claim 4, wherein if the specified wavelet transform coefficient is irreversible, the specified wavelet transform coefficient is deleted.

The image processing apparatus according to claim 4, wherein the specifying means specifies, as a reduction target, the wavelet transform coefficients in all subbands belonging to one resolution, proceeding in order from the maximum resolution to the minimum resolution in the specified encoded image data, and
the reducing means deletes the specified wavelet transform coefficients.

The image processing apparatus according to claim 1, further comprising the imaging unit.

The image processing apparatus according to claim 1, wherein the image processing apparatus is daisy-chained with the first other apparatus upstream and the second other apparatus downstream, and
where B is the allowable bandwidth of the communication band of the daisy-chain connection, N is the number of devices connected in the daisy chain, and the image processing apparatus is the i-th device counting the most upstream device as the 0th,
the threshold is defined by the following formula:
threshold = (B / N) × (i + 1)
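The threshold formula can be evaluated directly, as in the short sketch below. The function name and the example figures (a bandwidth of 10 units shared by 5 devices) are assumptions chosen for illustration, not values from the specification.

```python
def threshold(bandwidth: float, num_devices: int, index: int) -> float:
    """Return (B / N) * (i + 1) for the i-th device, counting from 0."""
    if num_devices < 1 or not 0 <= index < num_devices:
        raise ValueError("index must satisfy 0 <= index < num_devices")
    return bandwidth / num_devices * (index + 1)


# Hypothetical chain: bandwidth B = 10 units, N = 5 devices.
limits = [threshold(10.0, 5, i) for i in range(5)]
print(limits)
```

Each device gets an equal share B / N of the link, and the threshold grows linearly down the chain because the i-th device forwards the data of i + 1 devices. The most downstream device (i = N − 1) is therefore allowed the full bandwidth B.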

A control method for an image processing apparatus that provides an image at one viewpoint position among images at a plurality of viewpoint positions used for generating a virtual viewpoint image, the method comprising:
an acquisition step of acquiring an image from an imaging means;
an encoding step of transforming the image acquired in the acquisition step to generate losslessly encoded image data;
a receiving step capable of receiving one or more pieces of encoded image data from a first other device;
a control step of performing code amount control when the total data amount of the encoded image data received in the receiving step and the encoded image data obtained by encoding in the encoding step exceeds a preset threshold; and
a transmission step of transmitting, when the total data amount is less than or equal to the threshold, the encoded image data received in the receiving step and the encoded image data obtained by encoding in the encoding step to a second other device, and transmitting, when the total data amount exceeds the threshold, the encoded image data obtained by the code amount control in the control step to the second other device,
wherein the control step includes:
a specifying step of specifying, from among the encoded image data received in the receiving step and the encoded image data obtained in the encoding step, the encoded image data whose code amount is to be reduced, and of specifying a transform coefficient whose code amount is to be reduced in the specified encoded image data; and
a reduction step of reducing the code amount of the specified transform coefficient of the encoded image data specified in the specifying step,
and wherein the specification in the specifying step and the reduction in the reduction step are repeated until the total data amount becomes equal to or less than the threshold.
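The repeat-until-under-threshold structure of the control step can be sketched as a loop. In this simplified model each encoded image is a list of layer sizes ordered from maximum to minimum resolution, the target image is chosen round-robin, and "reduction" means deleting the finest remaining layer. All of these choices, and the rule of always keeping the coarsest layer, are assumptions for illustration rather than the claimed method itself.

```python
from typing import List


def total_size(streams: List[List[int]]) -> int:
    """Total data amount over all encoded images."""
    return sum(sum(layers) for layers in streams)


def control_code_amount(
    streams: List[List[int]], threshold: int
) -> List[List[int]]:
    """Drop finest layers, one image at a time, until the total fits."""
    streams = [list(layers) for layers in streams]  # work on a copy
    target = 0
    # Stop when within budget, or when every image is down to one layer.
    while total_size(streams) > threshold and any(len(s) > 1 for s in streams):
        if len(streams[target]) > 1:
            streams[target].pop(0)  # reduction: delete finest remaining layer
        target = (target + 1) % len(streams)  # specification: next image
    return streams


# Two received images plus the locally encoded one (sizes are made up).
received_and_own = [[50, 30, 20], [40, 25, 15], [60, 20, 10]]
result = control_code_amount(received_and_own, threshold=200)
print(result, total_size(result))
```

Because the loop re-checks the total after every single reduction, it stops as soon as the data fits and leaves the remaining images untouched.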

A program that, when read and executed by a computer having an imaging means and a communication means for communicating with other devices, causes the computer to function as an image processing apparatus that provides an image at one viewpoint position among images at a plurality of viewpoint positions used for generating a virtual viewpoint image, the program causing the computer to execute:
an acquisition step of acquiring an image from the imaging means;
an encoding step of transforming the image acquired in the acquisition step to generate losslessly encoded image data;
a receiving step capable of receiving one or more pieces of encoded image data from a first other device;
a control step of performing code amount control when the total data amount of the encoded image data received in the receiving step and the encoded image data obtained by encoding in the encoding step exceeds a preset threshold; and
a transmission step of transmitting, when the total data amount is less than or equal to the threshold, the encoded image data received in the receiving step and the encoded image data obtained by encoding in the encoding step to a second other device, and transmitting, when the total data amount exceeds the threshold, the encoded image data obtained by the code amount control in the control step to the second other device,
wherein the control step includes:
a specifying step of specifying, from among the encoded image data received in the receiving step and the encoded image data obtained in the encoding step, the encoded image data whose code amount is to be reduced, and of specifying a transform coefficient whose code amount is to be reduced in the specified encoded image data; and
a reduction step of reducing the code amount of the specified transform coefficient of the encoded image data specified in the specifying step,
and wherein the specification in the specifying step and the reduction in the reduction step are repeated until the total data amount becomes equal to or less than the threshold.

An image processing system comprising a plurality of imaging devices that capture images at a plurality of viewpoint positions, and an image generation device that receives encoded data representing the images at the plurality of viewpoint positions from the imaging device positioned at the end of the plurality of imaging devices and generates a virtual viewpoint image,
wherein the plurality of imaging devices are connected in series, and
each of the plurality of imaging devices includes:
acquisition means for acquiring an image from an imaging means;
encoding means for transforming the image acquired by the acquisition means to generate losslessly encoded image data;
receiving means capable of receiving one or more pieces of encoded image data transferred from upstream;
control means for performing code amount control when the total data amount of the encoded image data received by the receiving means and the encoded image data obtained by encoding by the encoding means exceeds a preset threshold; and
transmission means for transmitting downstream, when the total data amount is less than or equal to the threshold, the encoded image data received by the receiving means and the encoded image data obtained by encoding by the encoding means, and for transmitting downstream, when the total data amount exceeds the threshold, the encoded image data obtained by the code amount control by the control means,
wherein the control means includes:
specifying means for specifying, from among the encoded image data received by the receiving means and the encoded image data obtained by the encoding means, the encoded image data whose code amount is to be reduced, and for specifying a transform coefficient whose code amount is to be reduced in the specified encoded image data; and
reducing means for reducing the code amount of the specified transform coefficient of the encoded image data specified by the specifying means,
and wherein the specification by the specifying means and the reduction by the reducing means are repeated until the total data amount becomes equal to or less than the threshold.