KR102015082B1

KR102015082B1 - syntax-based method of providing object tracking in compressed video

Info

Publication number: KR102015082B1
Application number: KR1020170176597A
Authority: KR
Inventors: 이현우; 정승훈; 이성진; 배현성
Original assignee: 이노뎁 주식회사
Priority date: 2017-12-20
Filing date: 2017-12-20
Publication date: 2019-08-27
Also published as: KR20190074900A; WO2019124634A1

Abstract

The present invention relates generally to a technique for efficiently tracking objects in compressed video such as H.264 AVC and H.265 HEVC. More specifically, for compressed video generated by, for example, a CCTV camera, the invention extracts regions containing meaningful motion (moving object regions) on the basis of syntax such as the motion vector and coding type of each image block of the compressed video, without the complex image processing required in the prior art, tracks the movement of those regions, and provides the result in normalized form to a video surveillance device. Because moving object regions can be tracked quickly in CCTV footage without complex processing such as decoding, downscale resizing, difference-image computation, and image analysis, the invention improves the crime-prevention and evidence-gathering value of a video surveillance system. In addition, because the tracking result is mathematically normalized before it is provided to the video surveillance device, compatibility with a wide range of equipment can be ensured.

Description

Syntax-based method of providing object tracking in compressed video

The present invention relates generally to a technique for efficiently tracking objects in compressed video such as H.264 AVC and H.265 HEVC.

More specifically, for compressed video generated by, for example, a CCTV camera, the invention extracts regions containing meaningful motion (moving object regions) on the basis of syntax such as the motion vector and coding type of each image block of the compressed video, without the complex image processing required in the prior art, tracks the movement of those regions, and provides the result in normalized form to a video surveillance device.

In recent years it has become common to build CCTV-based video surveillance systems for crime prevention and for securing evidence after the fact. Many CCTV cameras are installed in each area, and the video they generate is displayed on monitors and stored in storage devices. When an operator spots a crime or an accident in progress, an appropriate response is made immediately; when necessary, the stored video is searched to secure evidence afterward.

In practice, however, the number of operators is far too small relative to the number of installed CCTV cameras. For such a limited staff to monitor video effectively, simply displaying CCTV video on monitor screens is not enough. It is preferable to detect the motion of objects in each CCTV video and to add some marking to the corresponding region in real time so that the motion is readily noticed. The operator then need not watch the entire CCTV image with uniform attention and can concentrate on the parts where objects are moving.

Meanwhile, video surveillance systems use compressed video to make efficient use of storage space. In particular, as the number of installed CCTV cameras has grown rapidly and high-resolution cameras have become the norm, complex video compression technologies with high compression ratios, such as H.264 AVC and H.265 HEVC, have been adopted.

A camera that generates video data produces compressed video according to one of these standards, and a device that plays the video, on receiving the compressed video, decodes it by reversing the encoding according to the same standard. To determine whether motion is present in CCTV video to which video compression has been applied, the prior art had to decode the compressed video to obtain the playback video, that is, the decompressed original video, and then apply image processing to it.

FIG. 1 is a block diagram showing the general configuration of a video decoding apparatus according to the H.264 AVC standard. Referring to FIG. 1, such an apparatus comprises a parser (11), an entropy decoder (12), an inverse transformer (13), a motion vector calculator (14), a predictor (15), and a deblocking filter (16).

These hardware modules process the compressed video in sequence to decompress it and restore the original video data. In this process the parser (11) parses the motion vector and coding type for each coding unit of the compressed video. A coding unit is generally an image block such as a macroblock or sub-block, although depending on the standard the two may not coincide exactly.

FIG. 2 is a flowchart showing how a conventional video analysis solution tracks objects in compressed video.

Referring to FIG. 2, the prior art decodes the compressed video according to H.264 AVC, H.265 HEVC, or the like (S10), and downscales the frame images of the decoded video to a small size, for example about 320x240 (S20). The downscaling is done to reduce, at least somewhat, the processing load of the subsequent steps. Difference images are then computed from the resized frame images, and moving objects are extracted through image analysis (S30). Finally, the movement path of each moving object is identified through image analysis of a series of frame images (S40).

To extract moving objects, the prior art thus performs compressed-video decoding, downscale resizing, and image analysis. These are highly complex processes, so the number of channels that a single video analysis server can process simultaneously in a conventional video surveillance system is quite limited. Even a high-performance video analysis server can currently cover at most about 16 CCTV channels. Because many CCTV cameras are installed, a video surveillance system has needed many video analysis servers, which raises costs and makes it difficult to secure physical space.

Building and maintaining a large-scale video surveillance system requires a considerable budget, and a commensurate utility is expected in return. The basic aim of that expectation is crime prevention and the securing of criminal evidence. Accordingly, beyond simply recording and storing the surroundings or reporting the presence of moving objects, the system needs to provide higher-level detection in which software identifies specific situations that experience shows to be problematic in real life. An efficient implementation is also desired, given the practical constraints of system construction cost and physical space.

An object of the present invention is to provide a technique for efficiently tracking objects in compressed video such as H.264 AVC and H.265 HEVC.

In particular, an object of the invention is to provide a technique that, for compressed video generated by, for example, a CCTV camera, extracts regions containing meaningful motion (moving object regions) on the basis of syntax such as the motion vector and coding type of each image block of the compressed video, without the complex image processing required in the prior art, tracks the movement of those regions, and provides the result in normalized form to a video surveillance device.

To achieve the above object, the syntax-based object tracking method for compressed video according to the present invention comprises: a first step of parsing the bitstream of the compressed video to obtain a motion vector and a coding type for each coding unit; a second step of obtaining, for each of a plurality of image blocks constituting the compressed video, a cumulative motion vector value over a preset first time period; a third step of comparing the cumulative motion vector value of each image block with a preset first threshold; a fourth step of marking image blocks whose cumulative motion vector value exceeds the first threshold as moving object regions; and a fifth step of obtaining, for a moving object region specified as the tracking target by a user operation (hereinafter the 'tracking-target moving object region'), a series of coordinates of that region across a series of image frames of the compressed video and providing the resulting coordinate sequence to a video surveillance device.

In the present invention, the image blocks constituting the compressed video may include macroblocks and sub-blocks.

The fifth step may comprise: step 5a of newly issuing and assigning a Unique ID to a moving object region that has no ID assigned; step 5b of setting a specific moving object region (hereinafter the 'tracking-target moving object region') as the tracking target according to a user operation; step 5c of identifying the Unique ID assigned to the tracking-target moving object region (hereinafter the 'tracking-target Unique ID'); step 5d of sequentially computing, for the series of image frames constituting the compressed video, the rectangle coordinates of the moving object region to which the tracking-target Unique ID is assigned, and setting them as the coordinate sequence of the tracking-target moving object region; step 5e of normalizing the series of rectangle coordinates in that coordinate sequence according to the resolution of the compressed video; step 5f of providing the normalized coordinate sequence of the tracking-target moving object region to the video surveillance device; and step 5g of revoking the Unique ID assigned to a moving object region when that region disappears from the series of image frames.
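The ID lifecycle of steps 5a and 5g can be sketched as follows. This is only an illustrative sketch: the text here does not say how a region in one frame is matched to a region in the next, so the rectangle-overlap test, the `RegionIds` class, and the `(x, y, dx, dy)` tuple layout are assumptions made for the example.

```python
from itertools import count

def overlaps(a, b):
    """True if two (x, y, dx, dy) rectangles intersect (assumed matching rule)."""
    ax, ay, adx, ady = a
    bx, by, bdx, bdy = b
    return ax < bx + bdx and bx < ax + adx and ay < by + bdy and by < ay + ady

class RegionIds:
    """Hypothetical helper that issues and revokes Unique IDs per frame."""

    def __init__(self):
        self._next_id = count(1)
        self.active = {}  # Unique ID -> rectangle seen in the previous frame

    def update(self, rects):
        """Assign IDs to this frame's rectangles; IDs not matched are revoked."""
        unused = dict(self.active)
        current = {}
        for rect in rects:
            match = next((i for i, old in unused.items() if overlaps(old, rect)), None)
            if match is None:
                match = next(self._next_id)  # step 5a: issue a new Unique ID
            else:
                del unused[match]            # the existing ID carries over
            current[match] = rect
        self.active = current                # step 5g: IDs left in `unused` are dropped
        return current

ids = RegionIds()
frame1 = ids.update([(0, 0, 10, 10), (50, 50, 10, 10)])
frame2 = ids.update([(2, 2, 10, 10)])                  # second region disappeared
frame3 = ids.update([(2, 2, 10, 10), (80, 80, 5, 5)])  # a new region appears
```

Collecting `frame_n[target_id]` for a chosen ID over successive frames would then give the coordinate sequence of step 5d. A revoked ID is not reused in this sketch, which is one possible design choice rather than something the text requires.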

Preferably, the rectangle coordinates comprise the top-left coordinates (x, y), the horizontal length (dx), and the vertical length (dy) of a virtual rectangle formed to optimally enclose the moving object region, and the normalization, reflecting the horizontal resolution (x_res) and vertical resolution (y_res) of the compressed video, divides the top-left x coordinate and the horizontal length (dx) by x_res and divides the top-left y coordinate and the vertical length (dy) by y_res.
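The normalization described above amounts to four divisions. The following minimal sketch shows it in Python; the names `Rect` and `normalize_rect` are hypothetical and chosen only for illustration, while `x`, `y`, `dx`, `dy`, `x_res`, and `y_res` follow the text.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    x: float   # top-left x coordinate
    y: float   # top-left y coordinate
    dx: float  # horizontal length
    dy: float  # vertical length

def normalize_rect(rect: Rect, x_res: int, y_res: int) -> Rect:
    """Divide x and dx by the horizontal resolution, y and dy by the vertical one."""
    if x_res <= 0 or y_res <= 0:
        raise ValueError("resolution must be positive")
    return Rect(rect.x / x_res, rect.y / y_res, rect.dx / x_res, rect.dy / y_res)

# A rectangle in a 1920x1080 frame becomes resolution-independent values in [0, 1].
normalized = normalize_rect(Rect(480, 270, 960, 540), x_res=1920, y_res=1080)
```

A receiving device can recover pixel coordinates for its own display size by multiplying the normalized values by its own width and height, which is presumably what makes the result portable across equipment.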

The object tracking method according to the present invention may further comprise: step a of identifying a plurality of image blocks adjacent to a moving object region (hereinafter 'neighbor blocks'); step b of comparing the motion vector value of each neighbor block with a preset second threshold; step c of additionally marking neighbor blocks whose motion vector value exceeds the second threshold as moving object regions; step d of additionally marking neighbor blocks whose coding type is intra picture as moving object regions; and step e of performing interpolation on the moving object regions so that unmarked image blocks surrounded by moving object regions, up to a preset number, are additionally marked as moving object regions.

Meanwhile, the computer-readable non-volatile recording medium according to the present invention stores a program for causing a computer to execute the syntax-based object tracking method for compressed video described above.

According to the present invention, moving object regions are extracted from CCTV video without subjecting the compressed CCTV video to complex processing such as decoding, downscale resizing, difference-image computation, and image analysis, which yields roughly a 20-fold performance improvement over existing video analysis servers.

In addition, because moving object regions can be tracked quickly in CCTV footage without complex processing of the compressed video such as decoding, downscale resizing, difference-image computation, and image analysis, the invention improves the crime-prevention and evidence-gathering value of a video surveillance system. Furthermore, because the tracking result is mathematically normalized before it is provided to the video surveillance device, compatibility with a wide range of equipment can be ensured.

FIG. 1 is a block diagram showing the general configuration of a video decoding apparatus.
FIG. 2 is a flowchart showing the process of tracking objects in compressed video in the prior art.
FIG. 3 is a flowchart showing the overall process of tracking objects in compressed video according to the present invention.
FIG. 4 is a flowchart showing an implementation example of the process of detecting effective motion from compressed video in the present invention.
FIG. 5 shows an example of the result of applying the effective motion region detection process according to the present invention to compressed CCTV video.
FIG. 6 is a flowchart showing an implementation example of the process of detecting the boundary region of a moving object region in the present invention.
FIG. 7 shows an example of the result of applying the boundary region detection process according to the present invention to the CCTV image of FIG. 5.
FIG. 8 shows an example of the result of consolidating the moving object regions of the CCTV image of FIG. 7 through interpolation.
FIG. 9 is a flowchart showing an implementation example of the process of tracking and identifying, in compressed video, the tracking-target moving object region designated by the user in the present invention.
FIG. 10 shows an example in which Unique IDs are assigned to moving object regions in the present invention.
FIG. 11 shows an example in which rectangle coordinates are set for moving object regions in the present invention.

The present invention is described in detail below with reference to the drawings.

FIG. 3 is a flowchart showing the overall process of tracking objects in compressed video according to the present invention. The object tracking process according to the invention is well suited to execution by a video analysis server in a system that handles a series of compressed videos, for example a CCTV video surveillance system.

Without needing to decode the compressed video, the present invention parses its bitstream and quickly extracts moving object regions from the syntax information of each image block (macroblock, sub-block, and so on), preferably the motion vector and coding type. As the images attached to this specification show, the moving object regions obtained this way do not precisely follow the outlines of the moving objects, but they are obtained quickly and with high reliability. Based on these regions, the invention then tracks the specific moving object region designated by an operator.

Meanwhile, the present invention makes it possible to extract moving object regions and track objects without decoding the compressed video. The scope of the invention is not, however, limited to devices or software that refrain from decoding the compressed video.

The concept of the process of tracking objects in compressed video according to the present invention is described below with reference to FIG. 3.

Step (S100): First, effective motion, that is, motion substantial enough to be considered meaningful, is detected from the compressed video on the basis of its motion vectors, and the image regions in which effective motion is detected are set as moving object regions.

To this end, the motion vector and coding type of each coding unit of the compressed video are parsed in accordance with a video compression standard such as H.264 AVC or H.265 HEVC. The size of a coding unit typically ranges from 64x64 pixels down to 4x4 pixels and can be set flexibly.

For each image block, motion vectors are accumulated over a preset period (e.g., 500 msec), and the resulting cumulative motion vector value is checked against a preset first threshold (e.g., 20). If the value of an image block exceeds the threshold, effective motion is considered to have been found in that block, and it is marked as a moving object region. Conversely, even if motion vectors occurred, a block whose cumulative value over the period does not exceed the first threshold is presumed to show only negligible change and is ignored.
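A minimal sketch of this accumulate-and-threshold step is given below. It assumes the motion vectors have already been parsed from the bitstream into per-frame dictionaries keyed by block position, and it assumes that "accumulating" means summing vector magnitudes; the text does not specify either, and the function name `mark_moving_blocks` is hypothetical.

```python
import math
from collections import defaultdict

def mark_moving_blocks(frames, window_ms=500, threshold=20.0):
    """Return the blocks whose accumulated motion exceeds the first threshold.

    frames: list of (timestamp_ms, {(row, col): (mvx, mvy)}) in time order.
    Accumulation here is the sum of vector magnitudes (an assumption).
    """
    if not frames:
        return set()
    start = frames[0][0]
    accumulated = defaultdict(float)
    for timestamp, vectors in frames:
        if timestamp - start >= window_ms:
            break  # only the first window is evaluated in this sketch
        for block, (mvx, mvy) in vectors.items():
            accumulated[block] += math.hypot(mvx, mvy)
    return {block for block, total in accumulated.items() if total > threshold}

frames = [
    (0,   {(0, 0): (3, 4), (0, 1): (1, 0)}),
    (100, {(0, 0): (6, 8)}),
    (200, {(0, 0): (6, 8)}),
    (600, {(0, 1): (30, 40)}),  # outside the 500 ms window, so ignored
]
marked = mark_moving_blocks(frames)
```

Block (0, 0) accumulates 5 + 10 + 10 = 25 and is marked, while block (0, 1) accumulates only 1 within the window and is ignored, mirroring the way small or brief motion is filtered out.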

Step (S200): Next, for the moving object regions detected in (S100), the extent of the boundary region is determined on the basis of motion vectors and coding types. To this end, the image blocks adjacent to each block marked as a moving object region are examined, and any adjacent block whose motion vector exceeds a second threshold (e.g., 0) or whose coding type is intra picture is also marked as a moving object region. In effect, such blocks end up forming a single mass with the moving object region detected in (S100).

An image block that shows some degree of motion near a moving object region in which effective motion was found is likely to belong to the same mass as that region, so it is marked as a moving object region. For an intra picture, no motion vector exists, so no determination based on motion vectors is possible. An intra picture located adjacent to an image block already detected as a moving object region is therefore provisionally presumed to form a single mass with the previously extracted region.
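The boundary expansion can be sketched as a single pass over the neighbors of the marked blocks. The 8-neighborhood, the magnitude comparison, the `"intra"` label, and the function name `expand_boundary` are assumptions for illustration; the text only says that adjacent blocks are examined.

```python
import math

def expand_boundary(marked, motion, coding_type, second_threshold=0.0):
    """Add neighbors of marked blocks that move or are intra-coded.

    marked:       set of (row, col) blocks already marked as moving object regions
    motion:       {(row, col): (mvx, mvy)} for the current frame
    coding_type:  {(row, col): "intra" or "inter"} (labels are hypothetical)
    """
    added = set()
    for row, col in marked:
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                neighbor = (row + d_row, col + d_col)
                if neighbor in marked:
                    continue  # also skips the block itself
                if coding_type.get(neighbor) == "intra":
                    added.add(neighbor)  # no motion vector exists, so presume same mass
                elif neighbor in motion and math.hypot(*motion[neighbor]) > second_threshold:
                    added.add(neighbor)
    return marked | added

marked = {(1, 1)}
motion = {(1, 2): (1, 0), (0, 0): (0, 0)}
coding_type = {(2, 1): "intra", (1, 2): "inter", (0, 0): "inter", (3, 3): "intra"}
expanded = expand_boundary(marked, motion, coding_type)
```

Here (1, 2) is added because it moves, (2, 1) because it is intra-coded, (0, 0) is skipped because its motion vector is zero, and (3, 3) is skipped because it is not adjacent to a marked block.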

Step (S300): Interpolation is applied to the moving object regions detected in (S100) and (S200) to clean up fragmentation. Because the preceding steps decide block by block whether a block belongs to a moving object region, a single moving object (e.g., a person) may be split into several moving object regions by unmarked blocks scattered within it. Accordingly, if one or a few unmarked image blocks are surrounded by blocks marked as moving object regions, they are additionally marked as moving object regions. This merges the fragmented regions into one; the effect of the interpolation is evident from a comparison of FIG. 7 and FIG. 8.
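One way to realize this interpolation is to treat it as hole filling on the block grid: find connected groups of unmarked blocks and mark those that are fully enclosed and small enough. The sketch below takes that approach; the 4-connectivity, the border test used to decide "surrounded", and the names `fill_small_holes` and `max_hole` are assumptions, not details given in the text.

```python
from collections import deque

def fill_small_holes(marked, rows, cols, max_hole=2):
    """Mark unmarked blocks that are enclosed by marked blocks.

    A group of unmarked blocks is filled when it does not touch the frame
    border and contains at most `max_hole` blocks.
    """
    filled = set(marked)
    seen = set()
    for row in range(rows):
        for col in range(cols):
            start = (row, col)
            if start in marked or start in seen:
                continue
            # Breadth-first search over one connected group of unmarked blocks.
            component, touches_border = [], False
            queue = deque([start])
            seen.add(start)
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nb[0] < rows and 0 <= nb[1] < cols
                            and nb not in marked and nb not in seen):
                        seen.add(nb)
                        queue.append(nb)
            if not touches_border and len(component) <= max_hole:
                filled.update(component)
    return filled

# A ring of marked blocks around the unmarked block (2, 2) in a 5x5 grid.
ring = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2), (3, 3)}
merged = fill_small_holes(ring, rows=5, cols=5)
```

The enclosed block (2, 2) is filled, while the unmarked blocks outside the ring are left alone because their group reaches the frame border.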

Step (S400): Through the steps above, moving object regions have been quickly extracted from the compressed video on the basis of coding-unit syntax (motion vector, coding type). In step (S400), using this extraction result, when an operator specifies a moving object region on the monitor screen, for example with the mouse, and requests object tracking, the movement path of that region is traced through the compressed video. Because real-time performance matters in this process, the present invention is well suited to it. The aim is to improve crime prevention by tracking and highlighting anything in the video surveillance system that the operator judges to be suspicious. The object tracking information is also useful for securing evidence after the fact.

To this end, for the moving object region specified as the tracking target by a user operation (hereinafter the 'tracking-target moving object region'), the invention obtains a series of coordinates of that region across a series of image frames of the compressed video (e.g., a sequence at 30 frames per second). The series of coordinates thus obtained, that is, the coordinate sequence, is provided to the video surveillance device as the tracking result for the moving object region. In each image frame, the video surveillance device displays the moving object region at the corresponding coordinates so that it stands out to the operator.

The specific process of tracking objects in compressed video is described in detail later with reference to FIG. 9.

FIG. 4 is a flowchart showing an implementation example of the process of detecting effective motion from compressed video in the present invention, and FIG. 5 shows an example of the result of applying the effective motion region detection process according to the invention to compressed CCTV video. The process of FIG. 4 corresponds to step (S100) of FIG. 3.

Step (S110): First, the coding units of the compressed video are parsed to obtain motion vectors and coding types. Referring to FIG. 1, a video decoding apparatus performs syntax analysis (header parsing) and motion vector computation on the compressed video stream in accordance with a video compression standard such as H.264 AVC or H.265 HEVC. Through this process, the motion vector and coding type are parsed for each coding unit of the compressed video.

Step (S120): For each of the plurality of image blocks constituting the compressed video, a cumulative motion vector value over a preset time period (e.g., 500 ms) is obtained.

This step is intended to detect effective motion in the compressed video, that is, motion substantial enough to be meaningful, such as a moving car, a running person, or a crowd fighting. Swaying leaves, momentary ghosting, and shadows that shift slightly with reflected light do move, but they are effectively meaningless objects and should not be detected.

To this end, motion vectors are accumulated in units of one or more image blocks over a preset period (e.g., 500 ms) to obtain the motion vector cumulative value. Here, the term image block is used as a concept encompassing both macroblocks and sub-blocks.

Steps S130 and S140: For the plurality of image blocks, the motion vector cumulative value is compared with a preset first threshold (e.g., 20), and any image block whose motion vector cumulative value exceeds the first threshold is marked as a moving object region.

If an image block with a motion vector cumulative value above this level is found, meaningful motion, that is, effective motion, is deemed to have been found in that block, and the block is marked as a moving object region. The aim is to select and detect motion significant enough to merit the attention of control operators in a video control system, for example a person running. Conversely, even when motion vectors occur, if their cumulative value over the period is too small to exceed the first threshold, the change in the image is presumed to be insignificant and is ignored at the detection stage.
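
Steps S120 to S140 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the input layout (a list of timestamped per-block motion vectors), the function name `mark_moving_blocks`, and the use of |mvx| + |mvy| as the quantity being accumulated are all assumptions, since the description does not define how a vector is reduced to a scalar.

```python
from collections import defaultdict

def mark_moving_blocks(frames, window_ms=500, threshold=20):
    """Return the blocks whose accumulated motion exceeds the first threshold.

    frames: list of (timestamp_ms, {block: (mvx, mvy)}) in time order.
            This layout is hypothetical; a real parser would supply it.
    """
    if not frames:
        return set()
    start = frames[0][0]
    accumulated = defaultdict(float)
    for timestamp, vectors in frames:
        if timestamp - start >= window_ms:
            break  # only the preset window (e.g. 500 ms) is accumulated
        for block, (mvx, mvy) in vectors.items():
            # Assumed scalar measure of one motion vector.
            accumulated[block] += abs(mvx) + abs(mvy)
    return {block for block, total in accumulated.items() if total > threshold}

frames = [
    (0,   {(0, 0): (3, 4), (1, 0): (1, 0)}),
    (100, {(0, 0): (6, 8), (1, 0): (0, 1)}),
    (600, {(1, 0): (100, 100)}),  # outside the 500 ms window, ignored
]
moving = mark_moving_blocks(frames)
```

Block (0, 0) accumulates 21 and is marked, while block (1, 0) accumulates only 2 inside the window and is ignored, which mirrors how small jitter such as swaying leaves is filtered out.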

Step S150: The moving object region is displayed on the playback screen of the compressed video so that it is distinguished from the ordinary image. FIG. 5 shows an example of the result of applying the effective motion region detection process in the present invention: a number of image blocks whose motion vector cumulative value exceeds the first threshold are marked as moving object regions and shown in red on the monitor screen. In FIG. 5, sidewalk blocks, the road, and shadowed areas are not marked as moving object regions, whereas walking people and moving cars are.

FIG. 6 is a flowchart illustrating an example implementation of the process of detecting a boundary region for a moving object region in the present invention, and FIG. 7 shows an example of the result of applying the boundary region detection process of FIG. 6 to the CCTV image of FIG. 5, to which the effective motion region detection process had been applied. The process of FIG. 6 corresponds to step S200 in FIG. 3.

Looking at FIG. 5 above, the moving objects are not fully marked; only parts of them are. That is, for a walking person or a moving car, only some of the blocks of the object are marked rather than the whole object. Moreover, in many cases a plurality of moving object regions are marked for a single moving object. This means that the criterion adopted in S100 for determining moving object regions is very useful for filtering out ordinary regions but is quite strict. A process is therefore needed that detects the boundary of a moving object by examining the surroundings of the moving object regions.

Step S210: First, a plurality of image blocks adjacent to the image blocks marked as moving object regions in S100 are identified. For convenience, these are referred to herein as 'neighbor blocks'. These neighbor blocks were not marked as moving object regions in S100; the process of FIG. 6 examines them more closely to determine whether any of them should be included within the boundary of a moving object region.

Steps S220 and S230: For the plurality of neighbor blocks, the motion vector value is compared with a preset second threshold (e.g., 0), and any neighbor block whose motion vector value exceeds the second threshold is marked as a moving object region. If an image block lies adjacent to a moving object region in which substantively meaningful effective motion has been recognized, and some degree of motion is also found in it, then, given the nature of captured video, it is likely to form one mass with that moving object region. Such neighbor blocks are therefore also marked as moving object regions.

Step S240: In addition, among the plurality of neighbor blocks, those whose coding type is intra picture are marked as moving object regions. Because no motion vector exists for an intra picture, it is inherently impossible to determine from motion vectors whether motion exists in such a neighbor block. In this case, for an intra picture located adjacent to an image block already detected as a moving object region, it is safer to carry over the moving object region setting that has already been extracted.
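
The boundary expansion of steps S210 to S240 might look like the sketch below. The function name `expand_boundary`, the block addressing by (column, row) tuples, and the choice of the 8 surrounding blocks as "adjacent" are assumptions made for illustration; the description only says that neighbor blocks are adjacent to marked blocks.

```python
def expand_boundary(marked, mv_value, intra_blocks, second_threshold=0):
    """Add neighbor blocks that show any motion or are intra-coded.

    marked:       set of (bx, by) blocks marked in S100
    mv_value:     {block: scalar motion vector value} (hypothetical layout)
    intra_blocks: set of blocks whose coding type is intra
    """
    # S210: collect unmarked blocks adjacent to a marked block.
    neighbors = set()
    for bx, by in marked:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidate = (bx + dx, by + dy)
                if candidate not in marked:
                    neighbors.add(candidate)
    # S220/S230: motion above the second threshold; S240: intra coding type.
    added = {
        block for block in neighbors
        if mv_value.get(block, 0) > second_threshold or block in intra_blocks
    }
    return marked | added

expanded = expand_boundary(
    marked={(5, 5)},
    mv_value={(5, 6): 2, (4, 5): 0, (9, 9): 50},
    intra_blocks={(6, 5)},
)
```

Here (5, 6) joins because it moves at all, (6, 5) joins because it is intra-coded, (4, 5) is left out because its value does not exceed 0, and (9, 9) is ignored because it is not adjacent to a marked block despite its large motion.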

Step S250: The moving object region is displayed on the playback screen of the compressed video so that it is distinguished from the ordinary image. FIG. 7 shows an example of the result after applying the process up to boundary region detection, in which the image blocks marked as moving object regions through the above process are shown in blue on the monitor screen. In FIG. 7, the blue moving object regions extend further around the moving object regions that were shown in red in FIG. 5, to the point that they now cover the moving objects entirely.

FIG. 8 shows an example of the result of tidying up the moving object regions through interpolation according to the present invention, applied to the CCTV image of FIG. 7, to which the boundary region detection process had been applied.

Step S300 applies interpolation to the moving object regions detected in S100 and S200 in order to tidy up their fragmentation. In FIG. 7, unmarked image blocks are found in between the moving object regions shown in blue. When unmarked image blocks are interspersed like this, it is difficult to judge whether the regions are several separate moving objects or should be regarded as a single mass. In particular, because the regions appear mottled on the monitor screen of the CCTV video control system, control operators cannot grasp them at a glance. Furthermore, if the moving object regions are fragmented, the result of step S400 may become inaccurate, and the process of step S400 also becomes more complex because the number of moving object regions increases.

Accordingly, in the present invention, if one or a small number of unmarked image blocks are surrounded by a plurality of image blocks marked as moving object regions, they are also marked as moving object regions; this is referred to as interpolation. Comparing FIG. 8 with FIG. 7, all the unmarked image blocks that had been present in between the moving object regions are now marked as moving object regions. This yields a moving object detection result that is more intuitive and accurate for control operators to refer to.
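
One possible reading of this interpolation is hole filling: any small group of unmarked blocks that is completely enclosed by marked blocks gets marked. The sketch below follows that reading. The function name `interpolate`, the 4-connectivity used to group unmarked blocks, and the default limit of two blocks per hole are assumptions; the description says only "one or a small number".

```python
from collections import deque

def interpolate(marked, width, height, max_hole=2):
    """Mark small enclosed holes inside moving object regions.

    marked: set of (bx, by) blocks; width/height: grid size in blocks.
    max_hole: assumed upper bound on the size of a hole that is filled.
    """
    filled = set(marked)
    seen = set()
    for x in range(width):
        for y in range(height):
            if (x, y) in marked or (x, y) in seen:
                continue
            # Flood-fill one connected group of unmarked blocks.
            component, enclosed = [], True
            queue = deque([(x, y)])
            seen.add((x, y))
            while queue:
                cx, cy = queue.popleft()
                component.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if not (0 <= nx < width and 0 <= ny < height):
                        enclosed = False  # group reaches the frame edge
                    elif (nx, ny) not in marked and (nx, ny) not in seen:
                        seen.add((nx, ny))
                        queue.append((nx, ny))
            if enclosed and len(component) <= max_hole:
                filled.update(component)
    return filled

# A 3x3 ring of marked blocks with an unmarked centre, inside a 5x5 grid.
ring = {(x, y) for x in range(1, 4) for y in range(1, 4)} - {(2, 2)}
result = interpolate(ring, width=5, height=5)
```

The centre block (2, 2) is enclosed and is filled, while the unmarked blocks around the outside of the ring reach the frame edge and are left alone.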

FIG. 9 is a flowchart illustrating an example implementation of the process of tracking and identifying, from the compressed video, a tracking-target moving object region designated by a user in the present invention; it corresponds to step S400 in FIG. 3.

As described above, the present invention extracts moving object regions based on syntax information obtainable directly from the coding units of the compressed video. The prior-art process of decoding the compressed video and then obtaining and analyzing difference images of the original images becomes unnecessary, and according to the inventors' tests this improved processing speed by up to 20 times. However, this approach has the weakness of lower precision. It differs conceptually in that it extracts not the moving object itself but a mass of image blocks presumed to contain the moving object. Reflecting this difference, the present invention also adopts an approach different from the prior art for tracking, over time, a specific object designated by a control operator in CCTV footage.

Hereinafter, an embodiment of the object tracking process adopted in the present invention is described in detail.

Step S410: First, in order to treat a moving object region as a single object, when a moving object region with no identification information (ID) assigned is found, a Unique ID is newly issued and assigned to it. That is, a mass of mutually connected image blocks marked as a moving object region in the preceding process is treated as a single object. To implement this in software, a Unique ID is assigned to each moving object region (mass of image blocks) and managed.

Accordingly, the subsequent steps in FIG. 9 are preferably performed on the basis of the Unique ID assigned to each moving object region. FIG. 10 shows an example in which Unique IDs are assigned to moving object regions.

Meanwhile, step S410 must be able to determine whether a mass of mutually connected image blocks marked as a moving object region is the same one across successive image frames. Only then can it be determined whether a Unique ID was previously assigned to the moving object region currently being handled.

Because the present invention does not deal with the content of the original image but only checks whether each image block is a moving object region, it cannot precisely verify whether masses of moving object regions in successive image frames are identical. That is, since the image content is not examined, a change such as a cat being replaced by a dog at the same location between successive frames would not be identified. However, considering that the time interval between frames is very short and that the objects observed by a video control system move at ordinary speeds, such an event is very unlikely.

Accordingly, in the present invention, masses of moving object regions in successive frames are presumed to be the same moving object region when the ratio or number of image blocks overlapping between them is at or above a certain threshold. With this approach, even without knowing the content of the original image, it can be determined whether a particular moving object region is moving, whether a new moving object region has appeared, or whether an existing moving object region has disappeared. Although this determination is less accurate than the prior art, it dramatically increases data processing speed and is therefore advantageous in practical applications.
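
A minimal sketch of this overlap-based identity check is given below. It assumes that each moving object region has already been grouped into a set of connected blocks, and it uses the ratio form of the criterion. The function name `assign_ids`, the 0.5 ratio, the greedy best-match strategy, and the use of `itertools.count` as the ID source are illustrative choices, not details from the description.

```python
from itertools import count

def assign_ids(previous, current, id_source, min_ratio=0.5):
    """Carry Unique IDs from the previous frame to the current frame.

    previous:  {unique_id: set of blocks} from the previous frame
    current:   list of block sets found in the current frame
    id_source: iterator yielding fresh Unique IDs
    min_ratio: assumed threshold on (overlapping blocks / region size)
    """
    assigned, used = {}, set()
    for region in current:
        if not region:
            continue  # an empty region cannot be matched
        best_id, best_ratio = None, 0.0
        for uid, old_region in previous.items():
            if uid in used:
                continue
            ratio = len(region & old_region) / len(region)
            if ratio > best_ratio:
                best_id, best_ratio = uid, ratio
        if best_id is not None and best_ratio >= min_ratio:
            uid = best_id          # same region, it has moved
        else:
            uid = next(id_source)  # newly appeared region
        used.add(uid)
        assigned[uid] = region
    # IDs in `previous` that are absent from `assigned` have disappeared.
    return assigned

ids = count(1)
frame1 = assign_ids({}, [{(0, 0), (1, 0)}], ids)
frame2 = assign_ids(frame1, [{(1, 0), (2, 0)}, {(8, 8)}], ids)
frame3 = assign_ids(frame2, [{(8, 8)}], ids)
```

In this run the first region keeps ID 1 while it shifts by one block, the region at (8, 8) receives the new ID 2, and ID 1 is absent from the third frame, which is the situation in which step S480 revokes an ID.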

Steps S420 and S430: Next, a specific moving object region is set as the tracking target in response to an operation by a user, for example a CCTV control operator. Suppose the operator, while viewing CCTV footage, notices a criminal running on the screen; the operator can then designate that criminal for tracking on the CCTV monitor screen. In the present invention, this is interpreted not as tracking the criminal but as tracking the moving object region to which the criminal belongs, and that region is set as the tracking target. For convenience, this region is referred to herein as the 'tracking-target moving object region'.

As described above, it is preferable to handle moving object regions on the basis of their Unique IDs in order to manage the identity of each moving object region over time across the series of image frames constituting the compressed video. Accordingly, the present invention identifies the Unique ID assigned to the tracking-target moving object region; for convenience, this Unique ID is referred to herein as the 'tracking-target Unique ID'.

Step S440: Next, the one or more moving object regions found in each of the series of image frames constituting the compressed video are examined. Unique IDs are assigned to these moving object regions, and among them the moving object region whose Unique ID equals the tracking-target Unique ID is identified. The identified moving object region is the tracking-target moving object region, and its rectangular coordinates are calculated sequentially for each image frame. The resulting series of rectangular coordinates is set as the coordinate sequence for the tracking-target moving object region.

Preferably, rectangular coordinates are calculated for the tracking-target moving object region. As one example, the rectangular coordinates of a moving object region may consist of the upper-left coordinates (x, y), horizontal length (dx), and vertical length (dy) of a virtual rectangle formed so as to optimally enclose the moving object region. That is, the rectangular coordinates of a moving object region take the form (x, y, dx, dy). FIG. 11 shows an example in which rectangular coordinates are set for each of three moving object regions (Unique ID = 001, 002, 003).
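
The rectangle and the coordinate sequence of step S440 can be illustrated as follows. The sketch reads "optimally enclose" as the tightest axis-aligned rectangle around the region's blocks, which is an assumption. The helper names `bounding_rect` and `coordinate_sequence`, the per-frame `{unique_id: block set}` layout, and the 16-pixel block size (the H.264 macroblock size; HEVC coding units can differ) are likewise assumptions for illustration.

```python
def bounding_rect(region, block_size=16):
    """Smallest axis-aligned rectangle (x, y, dx, dy), in pixels,
    that encloses every block of a non-empty region."""
    xs = [bx for bx, _ in region]
    ys = [by for _, by in region]
    x = min(xs) * block_size
    y = min(ys) * block_size
    dx = (max(xs) - min(xs) + 1) * block_size
    dy = (max(ys) - min(ys) + 1) * block_size
    return (x, y, dx, dy)

def coordinate_sequence(frames, target_id, block_size=16):
    """Rectangles of the tracking-target region, frame by frame.

    frames: list of {unique_id: set of (bx, by) blocks}, one dict per frame.
    Frames in which the target ID is absent contribute no entry.
    """
    return [
        bounding_rect(regions[target_id], block_size)
        for regions in frames
        if target_id in regions
    ]

rect = bounding_rect({(2, 3), (4, 3), (3, 5)})
frames = [
    {1: {(0, 0)}, 2: {(3, 3)}},
    {1: {(1, 0), (2, 0)}},
    {2: {(3, 3)}},
]
sequence = coordinate_sequence(frames, target_id=1)
```

With 16-pixel blocks, `rect` is (32, 48, 48, 48) and `sequence` is [(0, 0, 16, 16), (16, 0, 32, 16)].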

Steps S450 and S460: Then, the series of rectangular coordinates included in the coordinate sequence of the tracking-target moving object region is normalized with respect to the resolution of the compressed video. Normalization maps values into a specific range, for example real values between 0 and 1; in the present invention it is preferably employed to overcome differences in the resolution of video control monitors and to maintain compatibility.

One example of such normalization is dividing the rectangular coordinates by the horizontal and vertical resolutions (x_res, y_res) of the compressed video. That is, the upper-left x coordinate and the horizontal length (dx) are divided by the horizontal resolution (x_res), and the upper-left y coordinate and the vertical length (dy) are divided by the vertical resolution (y_res). All values are thereby normalized to real values between 0 and 1. For example, if the resolution of the compressed video is 100 horizontally and 100 vertically and the rectangular coordinates are (0, 0, 50, 50), the coordinates after normalization are (0.0, 0.0, 0.5, 0.5). This normalization is applied to each of the rectangular coordinates in the coordinate sequence.
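
The division described above is small enough to write out directly. The function names `normalize_rect` and `normalize_sequence` are hypothetical; the arithmetic and the worked example come from the text.

```python
def normalize_rect(rect, x_res, y_res):
    """Map (x, y, dx, dy) in pixels to values between 0 and 1."""
    x, y, dx, dy = rect
    return (x / x_res, y / y_res, dx / x_res, dy / y_res)

def normalize_sequence(sequence, x_res, y_res):
    """Apply the normalization to every rectangle in a coordinate sequence."""
    return [normalize_rect(rect, x_res, y_res) for rect in sequence]

normalized = normalize_rect((0, 0, 50, 50), x_res=100, y_res=100)
normalized_sequence = normalize_sequence(
    [(0, 0, 50, 50), (25, 50, 50, 25)], x_res=100, y_res=100
)
```

For the example in the text, `normalized` is (0.0, 0.0, 0.5, 0.5).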

Next, the normalized coordinate sequence of the tracking-target moving object region is provided to the video control apparatus so that the tracking-target moving object region can be shown on the monitor used by the CCTV control operator. As described above, normalized rectangular coordinates have the advantage of always presenting a consistent appearance at the video control apparatus regardless of the monitor's display resolution.
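
On the receiving side, a video control apparatus could recover pixel coordinates for its own display by multiplying back. The description does not specify this step; the function name `to_display_rect` and the rounding to whole pixels are assumptions.

```python
def to_display_rect(norm_rect, display_width, display_height):
    """Scale a normalized (x, y, dx, dy) to a given display resolution."""
    x, y, dx, dy = norm_rect
    return (
        round(x * display_width),
        round(y * display_height),
        round(dx * display_width),
        round(dy * display_height),
    )

full_hd = to_display_rect((0.0, 0.0, 0.5, 0.5), 1920, 1080)
hd = to_display_rect((0.0, 0.0, 0.5, 0.5), 1280, 720)
```

The same normalized rectangle covers the upper-left quarter of the screen on both monitors, (0, 0, 960, 540) and (0, 0, 640, 360), which is the compatibility benefit the text describes.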

Step S470: On the playback screen of the compressed video, for each image frame, the moving object region corresponding to the coordinate sequence of the tracking-target moving object region is displayed so that it is distinguished from the ordinary image. Since the coordinate sequence contains rectangular coordinates, either the entire rectangular area may be highlighted, or the moving object region optimally fitted within that rectangular area may be highlighted.

For example, the moving object region currently being tracked is preferably displayed in a special color. Here, different colors are preferably assigned so that tracked objects are distinguished from one another. This allows the control operator of the video control system to immediately recognize the point in the image where object tracking is being performed and thus to observe it with greater attention. This is equally helpful in the process of securing evidence after the fact.

Step S480: Meanwhile, when a moving object region disappears from the series of image frames, the Unique ID assigned to it in step S410 is revoked, thereby retiring the moving object region.

Meanwhile, the present invention may be embodied as computer-readable code on a computer-readable non-volatile recording medium. Such non-volatile recording media include all kinds of storage devices that store computer-readable data, for example hard disks, SSDs, CD-ROMs, NAS, magnetic tape, web disks, and cloud disks; the invention may also be implemented in a form in which the code is stored and executed in a distributed manner across multiple storage devices connected over a network.

Claims

A syntax-based object tracking method for a compressed video, comprising:
a first step of parsing the bitstream of the compressed video to obtain a motion vector and a coding type for each coding unit;
a second step of obtaining, for each of a plurality of image blocks constituting the compressed video, a motion vector cumulative value over a preset first time;
a third step of comparing the motion vector cumulative value of each of the plurality of image blocks with a first threshold;
a fourth step of marking each image block whose motion vector cumulative value exceeds the first threshold as a moving object region; and
a fifth step of acquiring, for a moving object region designated as a tracking target by a user operation (hereinafter, 'tracking-target moving object region'), a coordinate sequence corresponding to the series of positions of the tracking-target moving object region over a series of image frames of the compressed video, and providing the coordinate sequence to a video control apparatus,
wherein the fifth step comprises:
a sub-step of newly issuing and assigning a Unique ID to a moving object region that has no ID assigned;
a sub-step of setting a specific moving object region as the tracking-target moving object region in accordance with a user operation;
a sub-step of identifying the Unique ID assigned to the tracking-target moving object region (hereinafter, 'tracking-target Unique ID');
a sub-step of sequentially calculating, over the series of image frames constituting the compressed video, rectangular coordinates of the moving object region to which the tracking-target Unique ID is assigned, and setting them as the coordinate sequence of the tracking-target moving object region;
a sub-step of normalizing the series of rectangular coordinates included in the coordinate sequence of the tracking-target moving object region with respect to the resolution of the compressed video;
a sub-step of providing the normalized coordinate sequence of the tracking-target moving object region to the video control apparatus; and
a sub-step of revoking the assigned Unique ID when the moving object region disappears from the series of image frames.

(Deleted)

The method according to claim 1,
wherein the rectangular coordinates comprise the upper-left coordinates (x, y), horizontal length (dx), and vertical length (dy) of a virtual rectangle formed so as to optimally enclose the moving object region, and
wherein the normalization divides the upper-left x coordinate and the horizontal length (dx) by the horizontal resolution (x_res) of the compressed video and divides the upper-left y coordinate and the vertical length (dy) by the vertical resolution (y_res) of the compressed video.

The method according to claim 1,
further comprising, between the fourth step and the fifth step:
step a of identifying a plurality of image blocks adjacent to the moving object region (hereinafter, 'neighbor blocks');
step b of comparing, for the plurality of neighbor blocks, the motion vector value obtained in the first step with a preset second threshold; and
step c of additionally marking, as a moving object region, each neighbor block among the plurality of neighbor blocks whose motion vector value exceeds the second threshold as a result of the comparison in step b.

The method according to claim 4,
further comprising, after step c:
step d of additionally marking, as a moving object region, each neighbor block among the plurality of neighbor blocks whose coding type is intra picture.

The method according to claim 5,
further comprising, after step d:
a step of performing interpolation on the plurality of moving object regions to additionally mark, as moving object regions, up to a predetermined number of unmarked image blocks surrounded by moving object regions.

The method according to claim 1,
wherein the image blocks include macroblocks and sub-blocks.

A non-transitory computer-readable recording medium having recorded thereon a program for executing the syntax-based object tracking method for a compressed video according to any one of claims 1 and 3 to 7.