KR102233494B1

KR102233494B1 - Multi-object tracking device and method in 360 degree video space

Info

Publication number: KR102233494B1
Application number: KR1020190062448A
Authority: KR
Inventors: 이명진; 심유정; 손종웅
Original assignee: 한국항공대학교산학협력단
Priority date: 2019-05-28
Filing date: 2019-05-28
Publication date: 2021-03-26
Also published as: KR20200136649A

Abstract

The present disclosure relates to a device and method for tracking multiple objects in a 360-degree video space. The method, performed by a multi-object tracking device, includes: (a) detecting an object presence candidate region within a 360-degree video frame included in the 360-degree video space; (b) extracting, from the 360-degree video frame, a viewport frame of a predetermined angle of view that contains the object presence candidate region, and converting it into a planar viewport frame; (c) detecting, in the extracted planar viewport frame, an object region containing an object; (d) when an object region is detected, extracting a corresponding image signal for the detected object region and converting the position of the detected object region into the 360-degree video space to generate a transformed position; and (e) tracking the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position. When a plurality of object presence candidate regions are detected in step (a), steps (b) to (e) may be performed individually for each of the candidate regions.

Description

Multi-object tracking device and method in 360-degree video space {MULTI-OBJECT TRACKING DEVICE AND METHOD IN 360 DEGREE VIDEO SPACE}

The present invention relates to a multi-object tracking device and method in a 360-degree video space that detects multiple objects in 360-degree video and tracks the movement trajectory of each object, for video security systems, entertainment applications, and the like.

Recently, as crimes occur frequently in public places and restricted areas, 360-degree video equipment, including CCTV (Closed Circuit Television), has been installed to monitor designated areas. However, when an operator monitors CCTV in real time, criminal activity often goes unnoticed. In such cases, the incident is detected after the fact by analyzing video recorded on a VCR (Video Cassette Recorder) or DVR (Digital Video Recorder). To compensate for this, intelligent video detection systems that detect incidents in real time by applying video analysis to IP (Internet Protocol) camera footage are attracting attention. Determining whether an incident shows signs of abnormality relies on detecting objects in the video and tracking the detected objects. For object detection, methods that model the background image, such as a GMM (Gaussian Mixture Model), to detect moving objects are most commonly used. Once an object is detected, it is tracked using a Kalman filter, mean shift, bounding-box overlap, or a similar method. However, because a 360-degree image has structural distortion that a planar image does not, a tracking method different from the conventional ones is required.

The background art of the present application is disclosed in Korean Registered Patent Publication No. 10-1735365.

The present application is intended to solve the above-described problems of the prior art. One object is to provide a screen that detects and displays the in-frame positions of a plurality of objects of a user-specified type in ERP (Equirectangular Projection) or CMP (Cube Map Projection) 360-degree video that is either acquired from a single 360-degree omnidirectional camera or composed by stitching videos acquired from several ordinary cameras.

Another object of the present application is to provide a screen that displays the movement trajectories over time of a plurality of objects of a user-specified type in 360-degree video in ERP or CMP format.

A further object of the present application is to provide narrow-angle-of-view viewport screens that contain one or more user-designated objects in 360-degree video in ERP or CMP format.

However, the technical problems to be solved by the embodiments of the present application are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above technical objects, a method of tracking multiple objects in a 360-degree video space by a multi-object tracking device according to a first aspect of the present application includes: (a) detecting an object presence candidate region within a 360-degree video frame included in the 360-degree video space; (b) extracting, from the 360-degree video frame, a viewport frame of a predetermined angle of view that contains the object presence candidate region, and converting it into a planar viewport frame; (c) detecting, in the extracted planar viewport frame, an object region containing an object; (d) when an object region is detected, extracting a corresponding image signal for the detected object region and converting the position of the detected object region into the 360-degree video space to generate a transformed position; and (e) tracking the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position. When a plurality of object presence candidate regions are detected in step (a), steps (b) to (e) may be performed individually for each of the candidate regions.

In addition, the 360-degree video space may be in ERP (Equirectangular Projection) or CMP (Cube Map Projection) format.

In addition, steps (a) to (e) are performed for each of a plurality of 360-degree video frames included in the 360-degree video space. Step (e) may track the movement of an object over time in the 360-degree video space by considering the corresponding image signal and transformed position of an object region detected in a first frame of the plurality of 360-degree video frames, together with the corresponding image signal and transformed position of an object region detected in a second frame that follows the first frame in time.

In addition, the predetermined angle of view may be set so that the horizontal and vertical proportions of the planar viewport frame occupied by the object presence candidate region are at least a preset horizontal ratio and a preset vertical ratio, respectively, to the extent that no part of the object presence candidate region is excluded from the planar viewport frame.
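The angle-of-view constraint above can be sketched as follows. This is only one possible reading, under stated assumptions: the candidate region is centered in a rectilinear (perspective) viewport, its extent is given as an angle, and the occupied ratio along one axis is tan(region/2) / tan(fov/2). The function names and the choice to return the widest admissible angle are illustrative and do not come from the patent.

```python
import math

def max_viewport_fov(region_extent_deg, min_ratio):
    """Widest rectilinear FOV (degrees) in which a centered region spanning
    region_extent_deg still fills at least min_ratio of the viewport axis.

    Assumption: occupied ratio = tan(region / 2) / tan(fov / 2).
    """
    if not 0.0 < region_extent_deg < 180.0:
        raise ValueError("region extent must be in (0, 180) degrees")
    if not 0.0 < min_ratio <= 1.0:
        raise ValueError("min_ratio must be in (0, 1]")
    half_region = math.radians(region_extent_deg) / 2.0
    # min_ratio <= 1 guarantees fov >= region, so the region is never cropped.
    half_fov = math.atan(math.tan(half_region) / min_ratio)
    return math.degrees(2.0 * half_fov)

def choose_viewport_fov(region_w_deg, region_h_deg, min_ratio_h, min_ratio_v):
    """Return (horizontal_fov, vertical_fov) in degrees for one candidate region."""
    return (max_viewport_fov(region_w_deg, min_ratio_h),
            max_viewport_fov(region_h_deg, min_ratio_v))
```

Any angle between the region's own extent and the returned value would satisfy both conditions under these assumptions; a ratio of 1.0 yields a viewport that exactly frames the region.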

In addition, the corresponding image signal is a signal that distinguishes the detected object region from other regions so that the detected object region can be tracked, and may take the form of an image feature vector for the detected object region.
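As a rough illustration of how a feature vector and a transformed position might be used together for tracking, the sketch below matches one tracked object against the detections of the next frame. The cost function (cosine distance plus a weighted angular distance on the sphere), the weight `pos_weight`, and the dictionary layout are all assumptions made for this example; the patent does not specify a matching rule.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def angular_distance_deg(p, q):
    """Great-circle angle in degrees between two (yaw, pitch) positions."""
    yaw1, pitch1 = (math.radians(v) for v in p)
    yaw2, pitch2 = (math.radians(v) for v in q)
    c = (math.sin(pitch1) * math.sin(pitch2)
         + math.cos(pitch1) * math.cos(pitch2) * math.cos(yaw1 - yaw2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def best_match(track, detections, pos_weight=0.01):
    """Index of the detection with the lowest combined cost, or None.

    track and each detection are dicts with hypothetical keys
    'feature' (list of floats) and 'position' ((yaw, pitch) in degrees).
    """
    if not detections:
        return None
    def cost(det):
        appearance = 1.0 - cosine_similarity(track["feature"], det["feature"])
        motion = angular_distance_deg(track["position"], det["position"])
        return appearance + pos_weight * motion
    return min(range(len(detections)), key=lambda i: cost(detections[i]))
```

Measuring distance as an angle on the sphere, rather than in ERP pixels, keeps the cost meaningful near the poles and across the left/right frame boundary.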

In addition, step (a) may detect a motion region as the object presence candidate region, in consideration of the likelihood that an object is present in the motion region.

In addition, step (a) may perform background generation and foreground extraction on the 360-degree video frame and extract the motion region through background subtraction, detecting the motion region as the object presence candidate region only when it is at least a predetermined size.
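The background-subtraction step can be sketched in a few lines. The example below is a simplified stand-in, not the patented implementation: it assumes grayscale frames stored as lists of rows, uses an exponential running average in place of a full background model such as a GMM, and treats the learning rate, difference threshold, and minimum area as hypothetical parameters.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background (exponential running average)."""
    return [[(1.0 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def foreground_mask(background, frame, threshold=25):
    """Binary mask: 1 where the frame differs from the background by more
    than threshold, 0 elsewhere."""
    return [[1 if abs(f - b) > threshold else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def keep_large_regions(regions, min_area):
    """Keep only motion regions, given as (x, y, w, h) boxes, whose area is
    at least min_area pixels."""
    return [r for r in regions if r[2] * r[3] >= min_area]
```

A typical loop would compute the mask for each incoming frame, group the mask pixels into regions, filter them with `keep_large_regions`, and then update the background.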

In addition, step (a) may use an object detector designed for conventional rectangular video frames to detect the object presence candidate region, rather than to detect the object region directly.

In addition, step (b) may convert the coordinates of the viewport frame of the predetermined angle of view into u, v coordinates, obtain the yaw and pitch coordinates of the center point, convert them into a spherical coordinate system, obtain the coordinates in the spherical coordinate system using a projection equation, and then extract the planar viewport frame using a transformation that includes a rotation.
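The viewport extraction described above can be illustrated with a small sketch for the ERP case. It assumes a pinhole (rectilinear) viewport with square pixels, a camera frame with x to the right, y up, and z forward, yaw in [-180, 180) increasing to the right, pitch in [-90, 90] increasing upward, and nearest-neighbor sampling. These conventions and the function names are assumptions for illustration; the patent does not fix them.

```python
import math

def viewport_pixel_to_yaw_pitch(i, j, vp_w, vp_h, fov_h_deg,
                                center_yaw_deg, center_pitch_deg):
    """Map viewport pixel (column i, row j) to (yaw, pitch) in degrees."""
    t = math.tan(math.radians(fov_h_deg) / 2.0)
    # Ray through the pixel center on the image plane z = 1.
    x = (2.0 * (i + 0.5) / vp_w - 1.0) * t
    y = (1.0 - 2.0 * (j + 0.5) / vp_h) * t * vp_h / vp_w
    z = 1.0
    # Rotate about the x axis so the optical axis points at the center pitch.
    p = math.radians(center_pitch_deg)
    y, z = y * math.cos(p) + z * math.sin(p), -y * math.sin(p) + z * math.cos(p)
    # Rotate about the vertical axis so the optical axis points at the center yaw.
    a = math.radians(center_yaw_deg)
    x, z = x * math.cos(a) + z * math.sin(a), -x * math.sin(a) + z * math.cos(a)
    norm = math.sqrt(x * x + y * y + z * z)
    yaw = math.degrees(math.atan2(x, z))
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, y / norm))))
    return yaw, pitch

def extract_viewport(erp, vp_w, vp_h, fov_h_deg, center_yaw_deg, center_pitch_deg):
    """Sample a planar viewport from an ERP image given as a list of rows."""
    erp_h, erp_w = len(erp), len(erp[0])
    out = []
    for j in range(vp_h):
        row = []
        for i in range(vp_w):
            yaw, pitch = viewport_pixel_to_yaw_pitch(
                i, j, vp_w, vp_h, fov_h_deg, center_yaw_deg, center_pitch_deg)
            col = int((yaw / 360.0 + 0.5) * erp_w) % erp_w   # wraps horizontally
            line = min(erp_h - 1, max(0, int((0.5 - pitch / 180.0) * erp_h)))
            row.append(erp[line][col])
        out.append(row)
    return out
```

Iterating over viewport pixels and looking up the source pixel (inverse mapping) avoids holes in the output, which a forward mapping from ERP pixels would leave.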

In addition, when a plurality of object presence candidate regions are detected in step (a) and two or more of them at least partially overlap, step (b) may perform the conversion and extraction of the planar viewport frame for only one of the at least partially overlapping candidate regions.
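One simple way to realize this overlap rule is a greedy filter over axis-aligned boxes, sketched below. The rule for which of the overlapping regions to keep (here, the largest) is an assumption, since the text only says that one of them is used; the sketch also ignores regions that wrap around the left and right edges of an ERP frame.

```python
def boxes_overlap(a, b):
    """True if two (x, y, w, h) boxes share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def select_non_overlapping(regions):
    """Keep one region out of every group of overlapping candidates.

    Regions are visited from largest to smallest area (an illustrative
    choice), and a region is dropped if it overlaps one already kept.
    """
    kept = []
    for region in sorted(regions, key=lambda r: r[2] * r[3], reverse=True):
        if not any(boxes_overlap(region, k) for k in kept):
            kept.append(region)
    return kept
```

For example, `select_non_overlapping([(0, 0, 10, 10), (5, 5, 10, 10), (30, 30, 5, 5)])` keeps the first and third boxes, so only two viewports would be extracted.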

In addition, step (c) may detect the object region in the extracted planar viewport frame using an object detector and classify the type of the object.

In addition, in step (e), when tracking object movement across a discontinuous boundary, that is, when the first frame is a frame in which the tracked object is located at a first boundary portion of the 360-degree video space and the second frame is a frame in which the tracked object is located at a second boundary portion opposite the first, the yaw and pitch values corresponding to the second boundary portion may be compensated with reference to the yaw and pitch values corresponding to the first boundary portion so that the yaw and pitch values are continuous across the discontinuous boundary.
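For the ERP case, the boundary compensation can be sketched as angle unwrapping on yaw. The example assumes yaw is expressed in degrees with a wrap at the left and right frame edges, and that an object moves less than 180 degrees between consecutive frames. Only yaw is handled here; how the patent compensates pitch is not detailed in this passage, so that part is left out.

```python
def compensate_yaw(prev_yaw_deg, curr_yaw_deg):
    """Return curr_yaw shifted by a multiple of 360 degrees so that it is
    continuous with prev_yaw (shortest signed step from the previous value)."""
    delta = (curr_yaw_deg - prev_yaw_deg + 180.0) % 360.0 - 180.0
    return prev_yaw_deg + delta

def unwrap_track(yaws_deg):
    """Apply the compensation along a whole trajectory of yaw values."""
    if not yaws_deg:
        return []
    out = [yaws_deg[0]]
    for yaw in yaws_deg[1:]:
        out.append(compensate_yaw(out[-1], yaw))
    return out
```

An object seen at yaw 179 in one frame and at -179 in the next is then reported at 181, a 2-degree step rather than a 358-degree jump, so a motion model such as a Kalman filter sees a smooth trajectory.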

In addition, the method may further include: (f) displaying, on the 360-degree video frame, a marker corresponding to the tracked position of the object; and (g) when the marker is selected by a user, converting a viewport frame of a predetermined angle of view that contains the selected object region corresponding to the selected marker into a planar viewport frame and displaying it.

In addition, when a plurality of object regions are detected in step (c), a plurality of markers corresponding to the detected object regions are displayed in step (f), and when two or more markers are selected by the user in step (g), the viewport frames of the predetermined angle of view corresponding to each of the selected markers may each be converted into a planar viewport frame and displayed.

Meanwhile, a multi-object tracking device in a 360-degree video space according to a second aspect of the present application includes: an object presence candidate region extractor that detects an object presence candidate region within a 360-degree video frame included in the 360-degree video space; a viewport image extractor that extracts, from the 360-degree video frame, a viewport frame of a predetermined angle of view that contains the object presence candidate region, converting it into a planar viewport frame; an object detector that detects, in the extracted planar viewport frame, an object region containing an object; a 360-degree coordinate converter that, when an object region is detected, extracts a corresponding image signal for the detected object region and converts the position of the detected object region into the 360-degree video space to generate a transformed position; and an object tracker that tracks the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position. When a plurality of object presence candidate regions are detected by the object presence candidate region extractor, the viewport image extractor and the object detector may operate individually on each of the candidate regions.

The above-described means for solving the problems are merely exemplary and should not be construed as limiting the present application. In addition to the exemplary embodiments described above, additional embodiments may exist in the drawings and the detailed description of the invention.

According to the above-described means, when an object of interest designated by users appears and moves in a 360-degree video with a wide omnidirectional field of view, the object can be detected automatically, the efficiency with which a user searches the 360-degree video space can be improved, and a video security system or a standard-angle-of-view video content generator can be built by composing a separate viewport that contains the object.

However, the effects obtainable from the present application are not limited to those described above, and other effects may exist.

FIG. 1 shows the coordinate systems obtained by projecting a 360-degree image with ERP (Equirectangular Projection) and CMP (Cube Map Projection) according to an embodiment of the present application: the ERP-format coordinate system (FIG. 1(a)) and the CMP-format coordinate system (FIG. 1(b)).
FIG. 2 is a diagram illustrating the operation flow of a multi-object tracking device in a 360-degree video space according to an embodiment of the present application.
FIG. 3 is a diagram illustrating the operation flow of an object presence candidate region extractor according to an embodiment of the present application.
FIG. 4 is a diagram illustrating the operation flow of a viewport image extractor according to an embodiment of the present application.
FIG. 5 is a diagram illustrating the operation flow of an object detector, a 360-degree coordinate converter, and an object tracker according to an embodiment of the present application.
FIG. 6 is a diagram illustrating the operation flow for converting the position of an object in a viewport frame into ERP or CMP space according to an embodiment of the present application.
FIG. 7 is a diagram illustrating a screen on which objects are marked in a 360-degree video frame according to an embodiment of the present application.
FIG. 8 is a diagram illustrating a 360-degree video frame containing a selected object region and the corresponding planar viewport frame according to an embodiment of the present application.
FIG. 9 is a flowchart of a method of controlling a multi-object tracking device in a 360-degree video space according to an embodiment of the present application.

Hereinafter, embodiments of the present application are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present application pertains can readily practice them. However, the present application may be implemented in many different forms and is not limited to the embodiments described herein. In the drawings, parts unrelated to the description are omitted for clarity, and similar parts are given similar reference numerals throughout the specification.

Throughout this specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "electrically connected" or "indirectly connected" with another element in between.

Throughout this specification, when a member is said to be located "on", "above", "at the top of", "under", "below", or "at the bottom of" another member, this includes not only the case where the member is in contact with the other member but also the case where a further member exists between the two.

Throughout this specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

The present application relates to a multi-object tracking device and method in a 360-degree video space. The technology can extract a viewport for an object presence candidate region detected in 360-degree video, detect objects in the viewport, and present them on the 360-degree video; it can provide a screen that detects and displays the in-frame positions of a plurality of objects of a user-specified type in 360-degree video in ERP or CMP format; and it can provide a viewport screen of a predetermined angle of view that contains one or more user-designated objects in 360-degree video in ERP or CMP format. Hereinafter, a multi-object tracking device in a 360-degree video space according to an embodiment of the present application (hereinafter "the present device") is described.

FIG. 1 shows the ERP-format coordinate system (FIG. 1(a)) and the CMP-format coordinate system (FIG. 1(b)) obtained by projecting a 360-degree image with ERP (Equirectangular Projection) and CMP (Cube Map Projection) according to an embodiment of the present application, and FIG. 2 is a diagram illustrating the operation flow of a multi-object tracking device in a 360-degree video space according to an embodiment of the present application.

Because 360-degree video projects an omnidirectional image onto a single frame, the image has structural distortion. The present device 1 is intended to solve the problems of false object detection caused by this structural distortion and of failing to track small objects, and may extract object presence candidate regions by detecting motion regions.

Referring to FIGS. 1 and 2, the present device 1 may include an object presence candidate region extractor 100, a viewport image extractor 200, an object detector 300, a 360-degree coordinate converter 400, and an object tracker 500.

The object presence candidate region extractor 100 may detect an object presence candidate region within a 360-degree video frame included in the 360-degree video space.

360-degree video, which records omnidirectional imagery simultaneously, can be played back using various projection methods. The object presence candidate region extractor 100 may obtain, from the 360-degree video, an ERP-format coordinate system in which the 360-degree image is projected by ERP (Equirectangular Projection), or a CMP-format coordinate system in which it is projected by CMP (Cube Map Projection). That is, the 360-degree video space in the present device 1 may be in ERP or CMP format.
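To make the ERP coordinate system concrete, the following sketch converts between ERP pixel coordinates and yaw/pitch angles. It assumes a commonly used convention: normalized u runs left to right and maps linearly to yaw in [-180, 180), normalized v runs top to bottom and maps linearly to pitch from +90 to -90, and pixel centers sit at half-integer offsets. The convention actually shown in FIG. 1 may differ.

```python
def erp_pixel_to_yaw_pitch(x, y, width, height):
    """ERP pixel (x, y) -> (yaw, pitch) in degrees."""
    u = (x + 0.5) / width        # normalized horizontal coordinate in (0, 1)
    v = (y + 0.5) / height       # normalized vertical coordinate in (0, 1)
    yaw = (u - 0.5) * 360.0      # frame center column is yaw 0
    pitch = (0.5 - v) * 180.0    # frame top is pitch +90
    return yaw, pitch

def yaw_pitch_to_erp_pixel(yaw, pitch, width, height):
    """(yaw, pitch) in degrees -> fractional ERP pixel (x, y)."""
    u = yaw / 360.0 + 0.5
    v = 0.5 - pitch / 180.0
    return u * width - 0.5, v * height - 0.5
```

The two functions are exact inverses of each other, so a position detected in a viewport can be carried to ERP pixels and back without drift.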

The object presence candidate region extractor 100 may detect candidate object positions within a 360-degree video frame.

FIG. 3 is a diagram illustrating the operation flow of the object presence candidate region extractor according to an embodiment of the present application.

Referring to FIG. 3(a), the object presence candidate region extractor 100 may detect a motion region as an object presence candidate region, in consideration of the likelihood that an object is present in the motion region. The extractor 100 may perform background generation and foreground extraction on the 360-degree video frame and extract the motion region through background subtraction, detecting it as an object presence candidate region only when it is at least a predetermined size. Specifically, the extractor 100 generates a background for the incoming 360-degree video, extracts the foreground, and uses background subtraction to find regions in which motion exists between frames. The viewport image extractor 200, described later, may then convert the top-left coordinates of each such region into u, v, yaw, pitch form.

First, the object presence candidate region extractor 100 may perform background generation and foreground extraction on 360-degree video in ERP or CMP format ("background generation/foreground extraction" in FIG. 3) and detect candidate object positions through background subtraction ("background subtractor" in FIG. 3). More specifically, the extractor 100 may generate object presence candidate regions from the representative object-region coordinates, in ERP or CMP coordinate form, that are extracted in the ERP- or CMP-format 360-degree video frame space through background generation and foreground extraction, background subtraction, and morphological operations and labeling ("morphology operation/labeling" in FIG. 3).

Here, a morphological operation is one of the techniques for analyzing and processing the shape of an image. It extracts only the desired part of an image by reflecting information about the target object, and is used to extract the elements needed to represent shapes such as boundaries, skeletons, and blocks in order to extract the structure of objects in the image. The most basic morphological operations include erosion and dilation, and morphological operations may be applied to binary or grayscale images. Labeling means grouping adjacent pixels in a binarized image and numbering the groups in order to distinguish individual objects; that is, adjacent pixels are bundled and treated as a single object. Labeling may be divided into 4-way (4-connected) and 8-way (8-connected) labeling. Because morphological operations and labeling are well-known techniques, a detailed description is omitted.
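Although the specification omits the details, a minimal sketch may help readers unfamiliar with these operations. The example below implements erosion and dilation with a 3x3 square structuring element on a binary mask stored as a list of rows, and connected-component labeling that returns one bounding box per component. The structuring element, the border handling, and the box output format are choices made for this illustration only.

```python
from collections import deque

def _morph(mask, op):
    """Apply op (min for erosion, max for dilation) over each 3x3 neighborhood.
    Neighborhoods are clipped at the image border."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = op(mask[ny][nx]
                           for ny in range(max(0, y - 1), min(h, y + 2))
                           for nx in range(max(0, x - 1), min(w, x + 2)))
    return out

def erode(mask):
    return _morph(mask, min)

def dilate(mask):
    return _morph(mask, max)

def label_regions(mask, connectivity=8):
    """Group adjacent foreground pixels and return their (x, y, w, h) boxes."""
    h, w = len(mask), len(mask[0])
    if connectivity == 4:
        steps = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    else:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy, dx in steps:
                    ny, nx = cy + dy, cx + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes
```

Applying `dilate(erode(mask))` (an opening) before `label_regions` removes isolated noise pixels from the foreground mask, so that only coherent motion regions are labeled.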

The object presence candidate region extractor 100 may perform background generation, foreground extraction, and background subtraction on a 360-degree video frame in ERP or CMP format, and then perform morphological operations or labeling, thereby generating representative object-region coordinates, that is, object presence candidate regions.

The object presence candidate region extractor 100 may perform background generation and foreground extraction on the 360-degree video frame and extract the motion region through background subtraction, detecting the motion region as an object presence candidate region only when it is at least a predetermined size. That is, the extractor 100 may detect a motion region, extracted in the ERP- or CMP-format 360-degree video frame space through background generation and foreground extraction, background subtraction, and morphological operations and labeling and located by its top-left coordinates, as an object presence candidate region only when the size of the region is at or above a certain level.

Referring to FIG. 3(b), the object presence candidate region extractor 100 may extract object presence candidate regions through background generation/foreground extraction, a background subtractor, and morphology operation/labeling, but is not limited thereto; it may instead use the object detector 300, described later, to detect representative coordinates of object regions in the ERP or CMP coordinate system, that is, object presence candidate regions. The object presence candidate region extractor 100 may detect candidate positions of objects using the object detector 300 as applied to conventional rectangular video frames, and may extract the upper-left coordinates of the object regions that the object detector 300 detects in the ERP or CMP 360-degree video frame space. In other words, the object presence candidate region extractor 100 uses the object detector 300, which is designed for conventional rectangular video frames, not to detect objects directly but to detect object presence candidate regions. The object detector 300 is described in more detail below.

By creating a viewport with a narrow angle of view for each object presence candidate region and detecting objects in the viewport image, the apparatus 1 can eliminate false detections caused by the structural distortion of the image.

FIG. 4 is a diagram illustrating the operation flow of the viewport image extractor according to an embodiment of the present application.

The viewport image extractor 200 may check for overlap to prevent multiple viewports of the predetermined angle of view from being generated at a single location, and may generate a distortion-free planar viewport image of a fixed size at that position.

Referring to FIG. 4, the viewport image extractor 200 may convert a viewport frame of a predetermined angle of view that includes the object presence candidate region into a planar viewport frame and extract it from the 360-degree video frame.

The viewport image extractor 200 may convert, from the 360-degree video frame, a viewport frame of the predetermined angle of view centered on the object presence candidate region. The viewport image extractor 200 may generate this viewport, centered on the object presence candidate region, on a frame-by-frame basis.

The viewport image extractor 200 may convert the coordinates of the viewport frame of the predetermined angle of view into u, v coordinates, obtain the yaw and pitch coordinates of the center point, convert them into the sphere coordinate system, obtain the coordinates in the sphere coordinate system using a projection equation, and then extract the planar viewport frame using a transformation that includes a rotation. Here, the predetermined angle of view may be set so that, without excluding any part of the object presence candidate region from the planar viewport frame, the horizontal and vertical proportions of the frame occupied by the object presence candidate region are at least a preset horizontal ratio and a preset vertical ratio. For example, the viewport image extractor 200 may form the viewport frame with an angle of view that leaves no large margin around the object presence candidate region. To improve computation speed, the viewport frame is preferably formed so as to include as little area outside the object presence candidate region as possible, so the predetermined angle of view may be set narrow (for example, 54° horizontally and 90° vertically). That is, the narrow angle of view may be an angle of view just wide enough to include the entire object presence candidate region without excluding any part of it.
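The extraction of a planar viewport from an ERP frame can be sketched as below. The specification does not give the projection or rotation equations, so this example assumes a common convention: yaw 0 at the center column of the ERP image and increasing to the right, pitch +90° at the top row, a pinhole (rectilinear) viewport, pitch applied before yaw, and nearest-neighbor sampling. The function names are hypothetical; the default 54° by 90° angle of view is taken from the example in the text.

```python
import math

def viewport_pixel_to_lonlat(i, j, vw, vh, yaw_deg, pitch_deg, hfov_deg, vfov_deg):
    """Map viewport pixel (column i, row j) to (longitude, latitude) in radians."""
    # Ray through the pixel center on an image plane at distance 1
    # (camera axes: x right, y up, z forward).
    x = math.tan(math.radians(hfov_deg) / 2) * (2 * (i + 0.5) / vw - 1)
    y = -math.tan(math.radians(vfov_deg) / 2) * (2 * (j + 0.5) / vh - 1)
    z = 1.0
    p, t = math.radians(pitch_deg), math.radians(yaw_deg)
    # Rotate about the x axis by the pitch, then about the y axis by the yaw.
    y, z = y * math.cos(p) + z * math.sin(p), z * math.cos(p) - y * math.sin(p)
    x, z = x * math.cos(t) + z * math.sin(t), z * math.cos(t) - x * math.sin(t)
    return math.atan2(x, z), math.atan2(y, math.hypot(x, z))

def extract_viewport(erp, yaw_deg, pitch_deg, hfov_deg=54.0, vfov_deg=90.0, vw=54, vh=90):
    """Sample a planar viewport centered on (yaw, pitch) from an ERP image
    given as a list of rows; nearest-neighbor lookup, wrapping horizontally."""
    h, w = len(erp), len(erp[0])
    view = []
    for j in range(vh):
        row = []
        for i in range(vw):
            lon, lat = viewport_pixel_to_lonlat(i, j, vw, vh, yaw_deg, pitch_deg,
                                                hfov_deg, vfov_deg)
            col = int((lon / (2 * math.pi) + 0.5) * w) % w          # wrap at the left/right seam
            r = min(h - 1, max(0, int((0.5 - lat / math.pi) * h)))  # clamp at the poles
            row.append(erp[r][col])
        view.append(row)
    return view
```

Because the horizontal index wraps, a viewport centered near yaw ±180° is assembled from both edges of the ERP frame, which is the region where a detector applied directly to the ERP image would see an object cut in two.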

Referring to FIG. 4, the viewport image extractor 200 may be implemented as a single device that generates a viewport of the predetermined angle of view centered on the object presence candidate region on a frame-by-frame basis, but is not limited thereto; it may generate the viewport using a u, v coordinate converter 210, a sphere coordinate system converter 220, and a viewport extractor 230 included in the viewport image extractor 200. The u, v coordinate converter 210 may receive the 360-degree video frame in ERP or CMP format ('ERP/CMP coordinates' in FIG. 4) from the object presence candidate region extractor 100, convert the coordinates of the 360-degree image into u, v coordinates, and convert the center-point coordinates into yaw and pitch coordinates. The sphere coordinate system converter 220 may receive the u, v coordinates from the u, v coordinate converter 210, convert them into the sphere coordinate system, and obtain the coordinates in the sphere coordinate system using a projection equation. The viewport extractor 230 may generate a planar viewport image from the sphere coordinates using a rotation transformation or the like.

In this case, when the object presence candidate region extractor 100 detects a plurality of object presence candidate regions and two or more of them at least partially overlap, the viewport image extractor 200 may perform the conversion and extraction of a planar viewport frame for only one of the overlapping regions. For example, when two object presence candidate regions are detected and they overlap each other, the viewport image extractor 200 may keep the region detected first and exclude the region detected later. As another example, when three object presence candidate regions are detected and they overlap one another, the region located in the middle of the three may be kept and the regions on either side excluded. This prevents one object from being displayed more than once for an overlapping area when viewport frames of the predetermined angle of view are extracted from the full ERP or CMP 360-degree video frame, and avoids the duplication that would arise from extracting another viewport image of the predetermined angle of view within a certain distance. The way the viewport image extractor 200 selects one region when two or more object presence candidate regions overlap is not limited to these examples.
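The first selection rule above (keep the earlier detection, drop a later one that overlaps it) can be sketched as follows. Regions are assumed to be axis-aligned `(left, top, width, height)` rectangles listed in detection order; the function names are hypothetical, and wrap-around at the left and right edges of an ERP frame is deliberately ignored to keep the example short.

```python
def overlaps(a, b):
    """True if two (left, top, width, height) rectangles share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def drop_overlapping(regions):
    """Keep each region only if it does not overlap a region kept earlier."""
    kept = []
    for region in regions:  # regions are assumed to be in detection order
        if not any(overlaps(region, earlier) for earlier in kept):
            kept.append(region)
    return kept
```

Only one viewport is then extracted per kept region, so an object covered by several overlapping candidates is not processed or displayed more than once.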

FIG. 5 is a diagram illustrating the operation flow of the object detector, the 360-degree coordinate converter, and the object tracker according to an embodiment of the present application.

Referring to FIG. 5, the object detector 300 may detect an object region containing an object in the extracted planar viewport frame.

When the apparatus 1 has extracted viewport frames with a narrow angle of view for the object presence candidate regions in one frame, it may perform object detection on each viewport frame to obtain the object coordinates within the viewport, convert them into coordinates on the 360-degree video, and pass them to the tracker. That is, the apparatus 1 may detect the position and image signal of each object using each narrow viewport frame, and convert the detected position of each object into the 360-degree video frame space.

Specifically, the object detector 300 of the apparatus 1 may detect a rectangle containing an object in the converted viewport frame of the predetermined angle of view. The object detector 300 may detect the rectangle containing the object in the viewport frame converted from the original ERP or CMP 360-degree image, and may classify the type of the object. That is, the apparatus 1 may use the object detector 300 to detect an object region in the extracted planar viewport frame and classify the type of the object. Object types may be divided into, for example, people and non-person objects. As another example, object types may be divided into people, means of transportation used by people, and other objects. In these examples, the object detector 300 may detect a rectangular object region containing a tracking target that is a person or a means of transportation used by a person.

When the object detector 300 detects an object region, the 360-degree coordinate converter 400 may extract a corresponding image signal for the detected object region and convert the position of the detected object region to the 360-degree image space to generate a transformed position. Here, the corresponding image signal is a signal that distinguishes the detected object region from other regions so that the region can be tracked, and may take the form of an image feature vector for the object region detected by the object detector 300.

Referring to FIG. 5, the object detector 300 may detect an object region containing an object in the extracted planar viewport frame ('viewport image' in FIG. 5), and the 360-degree coordinate converter 400 may extract the corresponding image signal, in the form of a CNN-based feature vector, for the detected object region and convert the position of the detected object region to the 360-degree image space to generate the transformed position ('360-degree image object coordinates' in FIG. 5). In summary, once the object detector 300 extracts the position and image signal of the tracked object within the narrow viewport frame for an object presence candidate region of the 360-degree video frame, the 360-degree coordinate converter 400 may convert the object position in each viewport back into the ERP or CMP 360-degree video frame space.

The object tracker 500 may track the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position. The object tracker 500 may track the position of each object using the converted per-object position (ERP or CMP) and the image signal within the rectangle containing the object (viewport or ERP/CMP).

If the object presence candidate region extractor 100 detects a plurality of object presence candidate regions, the viewport image extractor 200, the object detector 300, the 360-degree coordinate converter 400, and the object tracker 500 may each operate individually on each of the plurality of object presence candidate regions.

The object presence candidate region extractor 100, the viewport image extractor 200, the object detector 300, the 360-degree coordinate converter 400, and the object tracker 500 operate on each of a plurality of 360-degree video frames included in the 360-degree image space. The object tracker 500 may track the movement of an object over time in the 360-degree image space by considering the corresponding image signal and transformed position of the object region detected in a first frame among the plurality of 360-degree video frames together with the corresponding image signal and transformed position of the object region detected in a second frame that follows the first frame in time. For example, the second frame may be the frame immediately following the first frame, or a frame that comes after at least one intervening frame.
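One way to combine the image signal and the transformed position across two frames is sketched below. The specification does not state how the two cues are combined, so the cost function, the greedy matching, the gating angle, and every name here are assumptions: positions are `(yaw, pitch)` in degrees, features are plain lists of floats, and a track is matched to the detection with the lowest blend of angular distance and feature dissimilarity.

```python
import math

def angular_distance(a, b):
    """Great-circle angle in degrees between two (yaw, pitch) positions in degrees."""
    y1, p1 = math.radians(a[0]), math.radians(a[1])
    y2, p2 = math.radians(b[0]), math.radians(b[1])
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2)
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def associate(tracks, detections, max_angle=20.0, weight=0.5):
    """Greedily match first-frame tracks to second-frame detections.

    tracks and detections map an id to (position, feature).
    Returns {track_id: detection_id}; unmatched ids are simply absent.
    """
    pairs = []
    for tid, (tpos, tfeat) in tracks.items():
        for did, (dpos, dfeat) in detections.items():
            angle = angular_distance(tpos, dpos)
            if angle <= max_angle:  # ignore detections that are too far away
                cost = (weight * angle / max_angle
                        + (1 - weight) * (1 - cosine_similarity(tfeat, dfeat)))
                pairs.append((cost, tid, did))
    matches, used = {}, set()
    for _, tid, did in sorted(pairs):  # cheapest pairs first
        if tid not in matches and did not in used:
            matches[tid] = did
            used.add(did)
    return matches
```

Measuring distance on the sphere rather than in ERP pixels means that an object at yaw 179° and a detection at yaw -179° are treated as 2° apart, not as being on opposite sides of the image.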

During tracking, the object tracker 500 may compensate the yaw and pitch values so that they are continuous across the discontinuous boundaries of the ERP or CMP format. When tracking object movement across a discontinuous boundary, that is, where the tracked object in the first frame is located at a first boundary portion of the 360-degree image space and then at a second boundary portion opposite the first boundary portion, the object tracker 500 may compensate the yaw and pitch values corresponding to the second boundary portion with reference to the yaw and pitch values corresponding to the first boundary portion so that the values are continuous across the boundary. For example, when an object at the left edge of the first frame appears at the right edge of the first frame, which is connected to the left edge, the yaw and pitch values corresponding to the right edge may be adjusted with reference to those corresponding to the left edge.
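For the left/right seam of an ERP frame, the compensation described above amounts to shifting the new yaw by a multiple of 360° so that it stays close to the previous value. The sketch below assumes yaw is expressed in degrees and handles only this ERP yaw seam; the face boundaries of CMP and any pitch handling are outside the example, and the function names are hypothetical.

```python
def continuous_yaw(previous_yaw, current_yaw):
    """Shift current_yaw by a multiple of 360 so it lies within 180 degrees of previous_yaw."""
    delta = (current_yaw - previous_yaw + 180.0) % 360.0 - 180.0
    return previous_yaw + delta

def unwrap_track(yaws):
    """Make a sequence of per-frame yaw values continuous across the ERP seam."""
    out = []
    for yaw in yaws:
        out.append(continuous_yaw(out[-1], yaw) if out else yaw)
    return out
```

With this compensation, an object that moves from yaw 178° to yaw -175° is recorded as moving from 178° to 185°, so the tracker sees a 7° step instead of a 353° jump.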

FIG. 6 is a diagram illustrating the operation flow for converting the position of an object in a viewport frame into the ERP or CMP space according to an embodiment of the present application.

Referring to FIG. 6, the apparatus 1 may convert the position of an object into ERP or CMP coordinates using the u, v coordinate converter 210 and the sphere coordinate system converter 220, based on the per-object position coordinates in the viewport obtained from the object tracker 500. The u, v coordinate converter 210 receives the per-object position coordinates in the viewport, converts them into u, v coordinates, and obtains the yaw and pitch coordinates of the center point; the sphere coordinate system converter 220 then converts the u, v coordinates into sphere coordinates and converts those into the ERP or CMP coordinate system using a 2D projection equation.
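The chain from viewport position to u, v coordinates, to the sphere, and then to ERP pixels can be sketched for a single point as follows. The equations are not given in the specification, so the example assumes normalized u, v in [0, 1] with the origin at the top-left of the viewport, a rectilinear viewport rotated by pitch and then yaw, and the standard equirectangular mapping with yaw -180° at the left edge and pitch +90° at the top. Function names and default angles of view are illustrative.

```python
import math

def viewport_point_to_sphere(u, v, center_yaw, center_pitch, hfov=54.0, vfov=90.0):
    """Map normalized viewport coordinates (u, v) to (yaw, pitch) in degrees."""
    # Ray through the point on an image plane at distance 1 (x right, y up, z forward).
    x = math.tan(math.radians(hfov) / 2) * (2 * u - 1)
    y = math.tan(math.radians(vfov) / 2) * (1 - 2 * v)
    z = 1.0
    p, t = math.radians(center_pitch), math.radians(center_yaw)
    # Rotate by the viewport pitch (about x), then by the viewport yaw (about y).
    y, z = y * math.cos(p) + z * math.sin(p), z * math.cos(p) - y * math.sin(p)
    x, z = x * math.cos(t) + z * math.sin(t), z * math.cos(t) - x * math.sin(t)
    yaw = math.degrees(math.atan2(x, z))
    pitch = math.degrees(math.atan2(y, math.hypot(x, z)))
    return yaw, pitch

def sphere_to_erp(yaw, pitch, width, height):
    """Equirectangular 2D projection from (yaw, pitch) in degrees to pixel coordinates."""
    x = (yaw / 360.0 + 0.5) * width % width  # wrap at the left/right seam
    y = (0.5 - pitch / 180.0) * height
    return x, y
```

For a detection box in a viewport of size `vw` by `vh`, each corner `(px, py)` would be passed as `u = px / vw`, `v = py / vh`, and the resulting ERP points describe where the object lies in the full 360-degree frame.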

FIG. 7 is a diagram illustrating a screen on which objects are displayed on a 360-degree video frame according to an embodiment of the present application, and FIG. 8 is a diagram illustrating a 360-degree video frame including a selected object region and the corresponding planar viewport frame according to an embodiment of the present application.

The apparatus 1 may display, on the 360-degree video frame, a marker corresponding to the position of each object converted into the ERP or CMP coordinate system. When the user selects a marker, the apparatus 1 may convert a viewport frame of the predetermined angle of view that includes the selected object region corresponding to the selected marker into a planar viewport frame and display it. If the object detector 300 detects a plurality of object regions, a plurality of markers may be displayed, one for each detected object region, and when the user selects two or more markers, the viewport frame of the predetermined angle of view corresponding to each selected marker may be converted into a planar viewport frame and displayed.

The user can use a viewport image for an object being tracked and, to track an object, can click (select) it to generate a new viewport image that follows the object. As described above, when a plurality of object regions are detected and the user selects several objects, the apparatus 1 may generate viewports according to the user's selections, making it possible to track objects of interest in the 360-degree video at an ordinary angle of view.

In summary, the apparatus 1 may simultaneously display on screen the positions of the objects converted into the 360-degree video frame space, let the user select some of the displayed objects, and convert and play back a viewport frame of the predetermined angle of view that includes an object selected by the user. When displaying the object positions simultaneously, the apparatus 1 may show the objects observed in the narrow viewport frames together on the ERP or CMP 360-degree video frame screen. When narrow viewport frames are extracted from the full ERP or CMP 360-degree video frame, the apparatus 1 may, by means of the object tracker 500 described above, prevent one object from being displayed more than once for an overlapping area, and may avoid the duplication that would arise from extracting another viewport image of the predetermined angle of view within a certain distance. The apparatus 1 may also handle the user's selection of tracked objects displayed in the ERP or CMP video frame space: when the user selects an object to track, for example with a mouse click, the apparatus 1 obtains the object coordinates at the selected location, which allows multiple objects to be selected, and performs the viewport frame conversion of the predetermined angle of view including the selected object in order to track and play back that object. A plurality of objects may be selected, and while a selected object is being tracked, a viewport video sequence of the predetermined angle of view that follows the object may be generated so that its movement is easy to observe.

According to the multi-object tracking apparatus 1 in a 360-degree image space according to an embodiment of the present application, objects of interest designated by users can be detected automatically when they appear and move in a 360-degree video with a wide omnidirectional field of view, the efficiency with which a user explores the 360-degree video space can be increased, and a video security system or a video content generator with a standard angle of view can be built from separate viewports that contain the objects.

Hereinafter, the operation flow of the present application is briefly described based on the details given above.

FIG. 9 is an operation flowchart of a multi-object tracking method in a 360-degree image space according to an embodiment of the present application.

The multi-object tracking method in a 360-degree image space shown in FIG. 9 may be performed by the multi-object tracking apparatus 1 described above. Accordingly, even where details are omitted below, the description given for the multi-object tracking apparatus 1 applies equally to the multi-object tracking method.

Referring to FIG. 9, in the multi-object tracking method in a 360-degree image space according to an embodiment of the present application, an object presence candidate region may be detected within a 360-degree video frame included in the 360-degree image space (S901).

Next, a viewport frame of a predetermined angle of view that includes the object presence candidate region may be converted into a planar viewport frame and extracted from the 360-degree video frame (S902).

Next, an object region containing an object may be detected in the extracted planar viewport frame (S903).

Next, when an object region is detected, a corresponding image signal may be extracted for the detected object region, and the position of the detected object region may be converted to the 360-degree image space to generate a transformed position (S904).

Next, the position of the object in the 360-degree video frame may be tracked based on the corresponding image signal and the transformed position (S905).

In this case, when a plurality of object presence candidate regions are detected in step S901, steps S902 to S905 may be performed individually for each of the plurality of object presence candidate regions.
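The control flow of steps S901 to S905, including the per-candidate repetition of S902 to S905, can be sketched as below. The five stage functions are hypothetical callables supplied by the caller; their names and signatures are assumptions made only to show the loop structure, not interfaces defined by the specification.

```python
def track_frame(frame, detect_candidates, extract_viewport, detect_objects,
                to_360, update_tracks):
    """Run steps S901 to S905 once for a single 360-degree video frame."""
    results = []
    for candidate in detect_candidates(frame):                   # S901
        viewport = extract_viewport(frame, candidate)            # S902
        for box in detect_objects(viewport):                     # S903
            signal, position = to_360(viewport, box, candidate)  # S904
            results.append(update_tracks(signal, position))      # S905
    return results
```

Because each candidate region is handled in its own loop iteration, a candidate whose viewport contains no object simply contributes nothing to the result, and the remaining candidates are unaffected.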

In the above description, steps S901 to S905 may be further divided into additional steps or combined into fewer steps, depending on the implementation of the present application. In addition, some steps may be omitted as necessary, and the order of the steps may be changed.

The multi-object tracking method in a 360-degree image space according to an embodiment of the present application may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

In addition, the above-described multi-object tracking method in a 360-degree image space may also be implemented in the form of a computer program or application stored on a recording medium and executed by a computer.

The foregoing description of the present application is for illustration, and those of ordinary skill in the art to which the present application pertains will understand that it can be readily modified into other specific forms without changing its technical spirit or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present application is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present application.

1: multi-object tracking device in a 360-degree image space
100: object presence candidate region extractor
200: viewport image extractor
300: object detector
400: 360-degree coordinate converter
500: object tracker

Claims

A multi-object tracking method in a 360-degree image space, performed by a multi-object tracking device, the method comprising:
(a) detecting, within a 360-degree video frame included in the 360-degree image space, a motion region as an object presence candidate region, in consideration of the possibility that an object exists in the motion region;
(b) in consideration of the fact that a 360-degree image has structural distortion different from that of a planar image, converting a viewport frame having a predetermined angle of view and including the object presence candidate region into a planar viewport frame and extracting it from the 360-degree video frame;
(c) detecting an object region including an object by performing object detection on the planar viewport frame extracted for the object presence candidate region;
(d) when the object region is detected, extracting a corresponding image signal corresponding to the detected object region, and generating a transformed position by converting the position of the detected object region to correspond to the 360-degree image space; and
(e) tracking the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position,
wherein, when a plurality of object presence candidate regions are detected in step (a), steps (b) to (e) are performed individually for each of the plurality of object presence candidate regions, and
wherein the predetermined angle of view is set within a limit such that no part of the object presence candidate region is excluded from the planar viewport frame.
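
As an illustration only, the control flow of steps (a) to (e) can be sketched as a per-candidate loop. The function name `track_frame` and all five callables passed to it are hypothetical placeholders, not components named in the application; the sketch also assumes that the detector returns a (box, feature) pair per object.

```python
def track_frame(frame, find_candidates, extract_viewport, detect_objects,
                to_sphere, update_tracker):
    """Run steps (a) to (e) once for a single 360-degree frame.

    Every callable is a hypothetical stand-in supplied by the caller.
    """
    results = []
    for region in find_candidates(frame):               # step (a)
        viewport = extract_viewport(frame, region)      # step (b)
        # step (c); the feature is the "corresponding image signal" of step (d)
        for box, feature in detect_objects(viewport):
            position = to_sphere(viewport, box)         # step (d)
            results.append(update_tracker(feature, position))  # step (e)
    return results
```

Each candidate region gets its own viewport and its own detection pass, which mirrors the requirement that steps (b) to (e) run individually per candidate region.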

The method of claim 1,
wherein the 360-degree image space is in ERP (Equirectangular Projection) or CMP (Cube Map Projection) format.

The method of claim 1,
wherein steps (a) to (e) are performed for each of a plurality of 360-degree video frames included in the 360-degree image space, and
wherein step (e) tracks the movement of the object over time in the 360-degree image space in consideration of the corresponding image signal and transformed position for an object region detected in a first frame among the plurality of 360-degree video frames, and the corresponding image signal and transformed position for an object region detected in a second frame that temporally follows the first frame among the plurality of 360-degree video frames.

The method of claim 1,
wherein the predetermined angle of view is set such that the horizontal ratio and the vertical ratio occupied by the object presence candidate region are equal to or greater than a preset horizontal ratio and a preset vertical ratio, respectively.
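
One possible reading of this constraint, offered as a sketch under assumptions rather than as the claimed computation: if the candidate region is centred in a planar (gnomonic) viewport, a region of angular extent e seen through an angle of view f fills a fraction tan(e/2)/tan(f/2) of the viewport along that axis. The helper `max_fov_deg` and its parameters are hypothetical names.

```python
import math

def max_fov_deg(region_extent_deg, min_ratio):
    """Largest angle of view (degrees) for which a centred region of the given
    angular extent still fills at least min_ratio of the planar viewport
    along one axis. Assumes a gnomonic (perspective) viewport."""
    if not 0.0 < min_ratio <= 1.0:
        raise ValueError("min_ratio must be in (0, 1]")
    if not 0.0 < region_extent_deg < 180.0:
        raise ValueError("region_extent_deg must be in (0, 180)")
    half_extent = math.radians(region_extent_deg) / 2.0
    return 2.0 * math.degrees(math.atan(math.tan(half_extent) / min_ratio))
```

Calling the helper once with the horizontal extent and preset horizontal ratio, and once with the vertical values, gives an upper bound per axis; the region's own angular extent is the lower bound, since the viewport must still contain the whole region.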

The method of claim 1,
wherein the corresponding image signal is a signal that distinguishes the detected object region from other regions so that the detected object region can be tracked, and takes the form of an image feature vector for the detected object region.
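
To illustrate how a feature vector and a transformed position might be used together for tracking, the following sketch matches detections to existing tracks. The cost function, its weights, the greedy matching strategy, and the dictionary keys `feature` and `position` are all assumptions made for illustration; the application does not specify them. Positions are (yaw, pitch) pairs in degrees.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def angular_distance_deg(p, q):
    """Great-circle angle in degrees between two (yaw, pitch) positions."""
    y1, p1 = math.radians(p[0]), math.radians(p[1])
    y2, p2 = math.radians(q[0]), math.radians(q[1])
    c = (math.sin(p1) * math.sin(p2)
         + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def match_cost(track, detection, max_angle_deg=30.0, weight=0.5):
    """Blend appearance dissimilarity and normalised angular distance."""
    appearance = 1.0 - cosine_similarity(track["feature"], detection["feature"])
    angle = angular_distance_deg(track["position"], detection["position"])
    motion = min(angle / max_angle_deg, 1.0)
    return weight * appearance + (1.0 - weight) * motion

def associate(tracks, detections, max_cost=0.5):
    """Greedy matching: cheapest pairs first, each track and detection used once."""
    pairs = sorted((match_cost(t, d), ti, di)
                   for ti, t in enumerate(tracks)
                   for di, d in enumerate(detections))
    used_tracks, used_dets, matches = set(), set(), {}
    for cost, ti, di in pairs:
        if cost > max_cost:
            break
        if ti in used_tracks or di in used_dets:
            continue
        matches[ti] = di
        used_tracks.add(ti)
        used_dets.add(di)
    return matches
```

Measuring distance on the sphere rather than in pixel coordinates means two positions on either side of the yaw seam (for example 179 and -179 degrees) are treated as close neighbours.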

(Deleted)

The method of claim 1,
wherein step (a) performs background generation and foreground extraction on the 360-degree video frame, extracts the motion region through background subtraction, and detects the motion region as the object presence candidate region only when the motion region is larger than a predetermined region size.
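
A minimal sketch of this kind of candidate detection on a grayscale frame stored as nested lists is given below. The running-average background model, the 4-connected flood fill, and the parameter names `alpha`, `diff_threshold`, and `min_area` are illustrative assumptions; the application does not prescribe a particular background model. The sketch also ignores the horizontal wrap-around of an ERP frame.

```python
from collections import deque

def update_background(background, frame, alpha=0.05):
    """Running-average background model (one assumed way to 'generate' it)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def motion_regions(background, frame, diff_threshold=25, min_area=4):
    """Return bounding boxes (x0, y0, x1, y1) of foreground blobs whose pixel
    count is larger than min_area."""
    h, w = len(frame), len(frame[0])
    # Foreground mask from the absolute background difference.
    fg = [[abs(frame[y][x] - background[y][x]) > diff_threshold
           for x in range(w)] for y in range(h)]
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not fg[y][x] or seen[y][x]:
                continue
            # Flood-fill one 4-connected blob and track its extent.
            queue = deque([(y, x)])
            seen[y][x] = True
            area, x0, x1, y0, y1 = 0, x, x, y, y
            while queue:
                cy, cx = queue.popleft()
                area += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and fg[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area > min_area:  # keep only sufficiently large motion regions
                boxes.append((x0, y0, x1, y1))
    return boxes
```

The size filter discards isolated noisy pixels so that only plausible object-sized motion regions go on to viewport extraction.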

The method of claim 1,
wherein step (a) uses an object detector designed for conventional rectangular video frames to detect the object presence candidate region rather than to directly detect an object region.

The method of claim 1,
wherein step (b) converts the coordinates of the viewport frame of the predetermined angle of view into u and v coordinates, obtains the yaw and pitch coordinates of the center point, converts them into a spherical coordinate system, and extracts the coordinates in planar viewport frame form using a transformation that includes a rotation transformation.
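
The coordinate chain described here can be sketched for the ERP case as follows. The axis conventions (x forward, yaw increasing toward +y, z up), the normalised range of u and v, and the ERP pixel mapping are assumptions chosen for illustration, as are the function names; they are not taken from the application.

```python
import math

def viewport_to_sphere(u, v, fov_h_deg, fov_v_deg,
                       center_yaw_deg, center_pitch_deg):
    """Map normalised viewport coordinates u, v in [-1, 1] to (yaw, pitch)
    in degrees on the sphere, for a viewport centred at the given yaw/pitch."""
    # Ray through the viewport plane located at distance 1 in front of the camera.
    x = 1.0
    y = u * math.tan(math.radians(fov_h_deg) / 2.0)
    z = v * math.tan(math.radians(fov_v_deg) / 2.0)
    p = math.radians(center_pitch_deg)
    a = math.radians(center_yaw_deg)
    # Rotation 1: tilt the ray up or down by the centre pitch.
    x, z = x * math.cos(p) - z * math.sin(p), x * math.sin(p) + z * math.cos(p)
    # Rotation 2: turn the ray by the centre yaw.
    x, y = x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)
    yaw = math.degrees(math.atan2(y, x))
    pitch = math.degrees(math.atan2(z, math.hypot(x, y)))
    return yaw, pitch

def sphere_to_erp(yaw_deg, pitch_deg, width, height):
    """Map (yaw, pitch) to fractional ERP pixel coordinates (col, row),
    assuming yaw 0 at the image centre and pitch +90 at the top row."""
    col = ((yaw_deg / 360.0 + 0.5) * width) % width
    row = (0.5 - pitch_deg / 180.0) * height
    return col, row
```

Evaluating both functions for every pixel of the output viewport and sampling the ERP frame at the resulting (col, row) yields the planar viewport frame; the same rotation applied to a detected box gives its position on the sphere.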

The method of claim 1,
wherein, when a plurality of object presence candidate regions are detected in step (a) and at least two of them at least partially overlap,
step (b) performs the conversion into and extraction of a planar viewport frame for only one of the two or more at least partially overlapping object presence candidate regions.
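
A simple way to pick one region from each overlapping group is sketched below. Which region is kept is not stated in the claim; keeping the largest is an assumption made here, and the function names are hypothetical. Regions are axis-aligned boxes (x0, y0, x1, y1) with x1 and y1 exclusive.

```python
def overlaps(a, b):
    """True if two boxes (x0, y0, x1, y1) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def box_area(b):
    return (b[2] - b[0]) * (b[3] - b[1])

def select_non_overlapping(regions):
    """Keep the largest region first, then drop any region that overlaps
    one that has already been kept."""
    kept = []
    for region in sorted(regions, key=box_area, reverse=True):
        if not any(overlaps(region, k) for k in kept):
            kept.append(region)
    return kept
```

Skipping overlapping candidates avoids extracting and running detection on several nearly identical viewports for the same moving object.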

The method of claim 1,
wherein step (c) detects the object region in the extracted planar viewport frame using an object detector and classifies the type of the object.

The method of claim 3,
wherein, in step (e), when tracking object movement across a discontinuous boundary, in which the first frame is a frame in which the object to be tracked is located at a first boundary portion of the 360-degree image space and the second frame is a frame in which the object to be tracked is located at a second boundary portion opposite the first boundary portion, the yaw and pitch values corresponding to the second boundary portion are compensated with reference to the yaw and pitch values corresponding to the first boundary portion so that the yaw and pitch values are continuous across the discontinuous boundary.
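
For the yaw component in an ERP frame, this kind of compensation can be sketched as an angle unwrap relative to the first-frame value. The function name `compensate_yaw` is hypothetical, and the sketch covers yaw only; how pitch would be compensated (for example across CMP face boundaries) is not addressed here.

```python
def compensate_yaw(reference_deg, observed_deg):
    """Shift observed_deg by a multiple of 360 so that it lies within
    180 degrees of reference_deg, keeping the trajectory continuous."""
    delta = (observed_deg - reference_deg + 180.0) % 360.0 - 180.0
    return reference_deg + delta
```

For example, an object at yaw 179 degrees in the first frame and -179 degrees in the second is reported as moving to 181 degrees, a 2-degree step instead of an apparent 358-degree jump.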

The method of claim 1,
further comprising:
(f) displaying, on the 360-degree video frame, a display mark corresponding to the position of the tracked object; and
(g) when the display mark is selected by a user, converting a viewport frame having a predetermined angle of view and including the selected object region corresponding to the selected display mark into a planar viewport frame and displaying it.

The method of claim 13,
wherein, when a plurality of object regions are detected in step (c),
in step (f), a plurality of display marks are displayed, one for each of the detected object regions, and
in step (g), when two or more display marks are selected by the user, the viewport frames of the predetermined angle of view corresponding to the selected display marks are each converted into planar viewport frames and displayed.

A multi-object tracking device in a 360-degree image space, comprising:
an object presence candidate region extractor that detects, within a 360-degree video frame included in the 360-degree image space, a motion region as an object presence candidate region, in consideration of the possibility that an object exists in the motion region;
a viewport image extractor that, in consideration of the fact that a 360-degree image has structural distortion different from that of a planar image, converts a viewport frame having a predetermined angle of view and including the object presence candidate region into a planar viewport frame and extracts it from the 360-degree video frame;
an object detector that detects an object region including an object by performing object detection on the planar viewport frame extracted for the object presence candidate region;
a 360-degree coordinate converter that, when the object region is detected, extracts a corresponding image signal corresponding to the detected object region and generates a transformed position by converting the position of the detected object region to correspond to the 360-degree image space; and
an object tracker that tracks the position of the object in the 360-degree video frame based on the corresponding image signal and the transformed position,
wherein, when a plurality of object presence candidate regions are detected by the object presence candidate region extractor, the viewport image extractor and the object detector operate individually on each of the plurality of object presence candidate regions, and
wherein the predetermined angle of view is set within a limit such that no part of the object presence candidate region is excluded from the planar viewport frame.