KR102152144B1

KR102152144B1 - Method Of Fast And High Efficiency Video Codec Image Coding Based On Object Information Using Machine Learning

Info

Publication number: KR102152144B1
Application number: KR1020180116237A
Authority: KR
Inventors: 이윤진
Original assignee: 강원호
Priority date: 2018-09-28
Filing date: 2018-09-28
Publication date: 2020-09-04
Also published as: KR20200039040A

Abstract

The present invention relates to a high-speed, high-efficiency video codec image encoding method based on object information using machine learning.
The method includes:
(A) a video image generation unit capturing a given area over time and generating video images that differ over time;
(B) an object information providing unit machine-learning specific objects and extracting object information of the learned objects;
(C) an object region extraction unit receiving, from the video image generation unit, a video image divided at regular intervals into a plurality of first blocks, receiving the object information from the object information providing unit, comparing the object information with stored reference objects, and extracting unmatched objects as non-learning objects and matched objects as learning objects;
(D) a learning object encoding unit dividing the plurality of first blocks included in a learning object into a plurality of learning object sub-blocks, assigning weights to the learning object sub-blocks, and encoding them; and
(E) a non-learning object encoding unit encoding the first blocks included in a non-learning object.

Description

Method Of Fast And High Efficiency Video Codec Image Coding Based On Object Information Using Machine Learning

The present invention relates to a technique for encoding video images quickly and efficiently using objects learned through machine learning.

Machine learning methods are used to classify the various objects in a single image and to extract information about the classified objects. More specifically, a machine learning method detects people, vehicles, bicycles, automobiles, and so on in a single image and classifies them by object type.

Machine learning classifies and detects objects, learns about them, and can classify and detect objects more accurately based on the learned data. Deep learning is a representative example of a machine learning method.

Currently, because machine learning technology is coupled to a camera or video input device to extract specific objects from video, it is used in applications that detect illegal parking, illegal dumping of garbage, and product defects.

However, the machine learning techniques developed so far perform unnecessary classification and unnecessary encoding in the course of classifying and encoding objects. This increases the complexity of machine learning and lowers coding efficiency.

Korean Registered Patent No. 10-1851099 (publication date 2018.04.20)

Accordingly, the problem to be solved by the present invention is to address these issues: the invention increases the coding efficiency of video, and reduces the complexity of video encoding, based on object information learned through machine learning.

The problems to be solved by the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the description below.

To achieve the above object, the object-information-based high-speed, high-efficiency video codec image encoding method using machine learning of the present invention includes step (A), in which a video image generation unit captures a given area over time and generates video images that differ over time;

step (B), in which an object information providing unit machine-learns specific objects and extracts object information of the learned objects;

step (C), in which an object region extraction unit receives, from the video image generation unit, a video image divided at regular intervals into a plurality of first blocks, receives the object information from the object information providing unit, compares the object information with stored reference objects, and extracts unmatched objects as non-learning objects and matched objects as learning objects;

step (D), in which a learning object encoding unit divides the plurality of first blocks included in the learning object into a plurality of learning object sub-blocks, assigns weights to the learning object sub-blocks, and encodes them; and

step (E), in which the non-learning object encoding unit encodes the first blocks included in the non-learning object.

Step (C) may further include, when the object region extraction unit extracts a learning object, setting a learning object region that is larger than the learning object.

The object-information-based high-speed, high-efficiency video codec image encoding method using machine learning according to the present invention classifies learning objects and non-learning objects in a video image based on objects learned through machine learning, and encodes learning objects and non-learning objects with different encoding processes.

That is, the present invention applies different encodings within a single video image and can thereby improve the coding efficiency of the video image. The present invention also applies different block division processes within a single video image and encodes the divided blocks, which can increase the encoding speed of the video image.

FIG. 1 is a block diagram of an object-information-based high-speed, high-efficiency video codec image encoding system using machine learning according to an embodiment of the present invention.
FIG. 2 is a diagram showing video images captured by the video image generation unit.
FIG. 3 is a diagram showing an example device in which the video image generation unit and the object information providing unit are combined.
FIG. 4 is a diagram showing the object region extraction unit of FIG. 1 processing the second video image of FIG. 2.
FIG. 5 is a diagram showing the process in which the first block of FIG. 4 is divided into learning object sub-blocks.
FIG. 6 is a diagram showing the encoding process of the learning object encoding unit of the encoding system.
FIG. 7 is a diagram in which the object region extraction unit has marked the regions of the learning objects included in a video image.
FIG. 8 is a diagram showing weights marked on the learning object regions.
FIG. 9 is a diagram showing how the learning object encoding unit and the non-learning object encoding unit divide images that are consecutive in time.
FIG. 10 is a diagram showing an early termination method for the coding unit split structure using SKIP mode.
FIG. 11 is a diagram showing a termination method that limits the coding unit split structure.
FIG. 12 is a flowchart of a termination method that limits the motion estimation search range.
FIG. 13 is a diagram showing an Advanced Motion Vector Prediction (AMVP) search method.
FIG. 14 is a flowchart of an object-information-based high-speed, high-efficiency video codec image encoding method using machine learning according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods for achieving them, will become clear from the embodiments described in detail below together with the accompanying drawings.

In this specification, to keep the description concise and clear, the object-information-based high-speed, high-efficiency video codec image encoding system using machine learning, which performs the encoding method, is described first, before the encoding method itself.

Accordingly, everything described throughout the specification about the encoding system applies equally to the encoding method.

In addition, each step of the method described in this specification is performed by a computer. The encoding method of the present invention can therefore be processed in the encoding system that performs it, that is, in a computer.

Hereinafter, the object-information-based high-speed, high-efficiency video codec image encoding system using machine learning according to an embodiment of the present invention is described in detail with reference to FIGS. 1 to 11. Based on this, the encoding method is then described in detail with reference to FIG. 12.

First, the encoding system according to an embodiment of the present invention is described in detail with reference to FIG. 1.

The object-information-based high-speed, high-efficiency video codec image encoding system (1) using machine learning classifies learning objects and non-learning objects in a video image based on objects learned through machine learning, and encodes learning objects and non-learning objects with different encoding processes.

The encoding system (1) includes a video image generation unit (10), an object information providing unit (20), an object region extraction unit (30), a learning object encoding unit (40), and a non-learning object encoding unit (50).

The encoding system (1) may also include an encoding termination determination unit (60).

The video image generation unit (10) captures a given area over time and generates video images that differ over time. The video image generation unit (10) may be a camera capable of capturing a given area in which objects are located and generating video images. For example, it may capture a given area containing a plurality of objects that match the object information and generate a first video image (110), as shown in FIG. 2(a), and a second video image (120), as shown in FIG. 2(b). Here, the plurality of objects may be a car object, a bicycle object, and a dog object corresponding to the object information machine-learned by the object information providing unit (20). In a block-based codec, several such objects may exist within a single block. Object information about an object, for example flag information, may be included in the bitstream in which the coded video is transmitted.

The object information providing unit (20) extracts the objects included in a video image (or video frame), machine-learns them, and extracts object information from the learned objects.

The object information providing unit (20) may be installed in the video image generation unit (10). In other words, the video image generation unit (10) and the object information providing unit (20) may be formed as a single integrated device. For example, as shown in FIG. 3, the device may be a CCTV camera in which the machine-learning object information providing unit (20) is installed in the video image generation unit (10).

In addition, as shown in FIG. 3, the object information providing unit (20) may machine-learn various specific objects, for example bicycles, cars, and dogs, before the video image generation unit (10) generates video frames. The object information providing unit (20) may learn the objects using a machine learning technique such as deep learning. The object information providing unit (20) also makes the object information needed for encoding the video image available.

The object information providing unit (20) may detect learned objects in a video image captured by the video image generation unit (10) and extract information from those objects, that is, object information.

The video image generation unit (10) transmits the generated video images, and the object information providing unit (20) transmits the extracted object information, to the object region extraction unit (30).

As shown in FIG. 4, the object region extraction unit (30) receives a video image that has been divided at regular intervals into a plurality of first blocks.

The object region extraction unit (30) compares the stored reference objects, that is, the machine-learned object information, with the object information extracted from the video image. It extracts unmatched objects as non-learning objects (Ba1) and matched objects as learning objects (Aa1, Ab1, Ac1).
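
The matching step can be illustrated with a minimal sketch. It assumes that "matching" means comparing a detected object's class label against a set of learned labels; the `DetectedObject` type, its fields, and the function name are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectedObject:
    label: str   # class name reported by a detector (hypothetical field)
    bbox: tuple  # (x, y, width, height) in pixels (hypothetical field)


def split_objects(detections, reference_labels):
    """Return (learning_objects, non_learning_objects).

    An object whose label is among the stored reference labels is treated
    as a learning object; every other object is a non-learning object.
    """
    learning, non_learning = [], []
    for obj in detections:
        if obj.label in reference_labels:
            learning.append(obj)
        else:
            non_learning.append(obj)
    return learning, non_learning


# Illustrative data only: the labels follow the dog, bicycle, and car example.
reference = {"dog", "bicycle", "car"}
detections = [
    DetectedObject("dog", (64, 128, 60, 40)),
    DetectedObject("tree", (300, 10, 80, 200)),
    DetectedObject("car", (400, 220, 120, 70)),
]
learning, non_learning = split_objects(detections, reference)
```

A real system would probably compare richer object information than a label, so this only shows the two-way partition that the later encoding units rely on.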

The learning object encoding unit (40) may divide a first block (Aa11) included in a learning object extracted by the object region extraction unit (30), for example the dog learning object (Aa1), into a plurality of learning object sub-blocks (Aa21).

Here, as shown in FIG. 5, the learning object sub-blocks (Aa21) may be a second block (Aa12) whose width and height are 1/2 those of the first block, a third block (Aa13) whose width and height are 1/2 those of the second block, and a fourth block (Aa14) whose width and height are 1/2 those of the third block.

The learning object encoding unit (40) may assign the fourth block a larger weight than the third block, the third block a larger weight than the second block, and the second block a larger weight than the first block, and may then encode the weighted blocks. For example, starting from the initial first block, it may increase the weight by 1 for each further division, assigning weight 0 to the first block, weight 1 to the second block, weight 2 to the third block, and weight 3 to the fourth block, and then encode each block. That is, the learning object encoding unit (40) divides the region in which a learning object is detected many times from the first block, so that heavily weighted third and fourth blocks are formed there, and encodes the divided blocks.
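
The relationship between division level, block size, and weight can be tabulated with a short sketch. It assumes a 64x64 first block (the LCU size the description uses for HEVC) and the example weights 0 to 3; the function name is hypothetical.

```python
def sub_block_table(first_block_size=64, max_depth=3):
    """List (depth, block_size, weight) for each division level.

    Each division halves the width and height, and the weight grows by 1
    per division, following the example weights 0, 1, 2, 3.
    """
    table = []
    size = first_block_size
    for depth in range(max_depth + 1):
        table.append((depth, size, depth))
        size //= 2
    return table


for depth, size, weight in sub_block_table():
    print(f"block {depth + 1}: {size}x{size}, weight {weight}")
```

With the defaults, the first to fourth blocks come out as 64x64, 32x32, 16x16, and 8x8 with weights 0 to 3.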

More specifically, when encoding, the learning object encoding unit (40) divides the image into largest coding units (LCU), the basic unit of the coding unit (CU), and encodes them. The CU plays a role similar to the macroblock (MB), the basic block of the existing H.264/AVC video codec. Unlike the macroblock, which has a fixed size of 16x16, the CU can have a variable size. For efficient encoding, an LCU can be further divided into several CUs smaller than the LCU. A 64x64 LCU can be divided into a plurality of CUs in various ways, for example as shown in FIG. 5.

The process by which the first block is divided into learning object sub-blocks is described below with reference to FIG. 5. Here, ① in FIG. 5 represents the first block (Aa11) included in the dog learning object (Aa1) of FIG. 4. As shown at ② in FIG. 5, the first block (Aa11), an LCU, may be divided into 32x32 CUs at split depth 1. As shown at ④, ⑧, and ⑫ in FIG. 5, a 32x32 CU may be divided into 16x16 CUs at split depth 2. As shown at ⑥ and ⑩ in FIG. 5, a 16x16 CU may be divided into 8x8 CUs at split depth 3.
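
The recursive division from 64x64 down to 8x8 can be sketched as a quadtree. The split decision here is a placeholder (split any block that contains a given pixel), chosen only to make the example deterministic; it is an assumption and not the patent's criterion. All names are hypothetical.

```python
def quadtree_split(x, y, size, depth, should_split, max_depth=3):
    """Return the leaf CUs as (x, y, size, depth) tuples.

    A block is divided into four half-size blocks while should_split
    says so and the maximum split depth has not been reached.
    """
    if depth < max_depth and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_split(
                    x + dx, y + dy, half, depth + 1, should_split, max_depth
                )
        return leaves
    return [(x, y, size, depth)]


def contains_point(px, py):
    """Placeholder split rule: split any block that covers pixel (px, py)."""
    return lambda x, y, size: x <= px < x + size and y <= py < y + size


# One 64x64 LCU with a point of interest at pixel (10, 10).
cus = quadtree_split(0, 0, 64, 0, contains_point(10, 10))
```

In this example only the quadrant that contains the point keeps splitting, so the LCU ends up as three 32x32 CUs, three 16x16 CUs, and four 8x8 CUs that together tile the full 64x64 area.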

An LCU may thus be divided into a plurality of CUs. The split structure of the LCU may serve as the split information of the coding unit. The learning object encoding unit (40) generates various LCU split structures and stores them as LCU split structure candidates. Then, in the step of determining the optimal LCU split structure, it selects, for each LCU, one of the candidates as the optimal split structure.

The CU candidate is selected by rate-distortion optimization, which determines the split structure with the best coding efficiency.
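
The patent does not state its cost function. A common formulation of rate-distortion optimization minimizes the Lagrangian cost J = D + λR, and the sketch below assumes that form. The candidate names, the distortion and rate figures, and the λ values are invented purely for illustration.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def pick_best(candidates, lam):
    """Return the split-structure candidate with the lowest RD cost."""
    return min(
        candidates,
        key=lambda c: rd_cost(c["distortion"], c["rate_bits"], lam),
    )


# Made-up numbers: finer splits lower distortion but cost more bits.
candidates = [
    {"name": "no_split_64x64", "distortion": 900.0, "rate_bits": 40},
    {"name": "split_to_32x32", "distortion": 500.0, "rate_bits": 120},
    {"name": "split_to_16x16", "distortion": 350.0, "rate_bits": 400},
]
best = pick_best(candidates, lam=2.0)
best_low_lambda = pick_best(candidates, lam=0.2)
```

A larger λ penalizes bits more heavily and favors coarser structures, while a smaller λ favors finer splits, which is why the two calls select different candidates.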

Performing encoding on a per-LCU basis, using an LCU split structure suited to the characteristics of the image, can increase coding efficiency. The encoding process of the learning object encoding unit may be as shown in FIG. 6.

The encoding process of the learning object encoding unit may be the HEVC video codec encoding process. As shown in FIG. 6, this process takes the block-partitioned image as input and may perform coding unit and structure determination, inter prediction, interpolation, filtering, transform, and so on.

The non-learning object encoding unit (50) may encode the first blocks included in a non-learning object extracted by the object region extraction unit (30). In addition, the non-learning object encoding unit (50) may overlay the first video image (110) and the second video image (120). When a learning object of the second video image overlaps a region that belongs to a non-learning object of the first video image, that is, when a learning object moves into a non-learning object region, the unit may divide the first block (Aa11) of the learning object of the second video image into blocks no larger than the first block, as the learning object encoding unit does. That is, the non-learning object encoding unit (50) may divide the first block into second blocks whose width and height are 1/2 those of the first block. In particular, the non-learning object encoding unit (50) holds a split-block size value that limits division of the first block, and may divide the first block until that size value is reached.

The non-learning object encoding unit (50) may also assign the second block a larger weight than the first block, for example weight 0 to the first block and weight 1 to the second block, and may encode the blocks divided in this way. Thus, in regions where no object is detected and where object motion or pixel change is small, the non-learning object encoding unit (50) divides the first block little, at most once, so that lightly weighted second blocks are formed, and encodes the divided blocks.
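
The overlay check and the size limit can be sketched together. The sketch assumes each frame's learning-object area is available as a set of first-block grid coordinates and that the split-block size value is 32, so a 64x64 first block is divided at most once. The coordinates and function names are hypothetical.

```python
def split_sizes(first_block_size, min_block_size):
    """Block sizes reachable by halving until the size limit is reached."""
    if min_block_size < 1:
        raise ValueError("min_block_size must be at least 1")
    sizes = [first_block_size]
    while sizes[-1] // 2 >= min_block_size:
        sizes.append(sizes[-1] // 2)
    return sizes


def blocks_entered_by_object(prev_object_blocks, curr_object_blocks):
    """First blocks that were background before and hold an object now."""
    return sorted(set(curr_object_blocks) - set(prev_object_blocks))


# The object occupied blocks (1, 1) and (2, 1), then moved one block right.
prev_blocks = {(1, 1), (2, 1)}
curr_blocks = {(2, 1), (3, 1)}
entered = blocks_entered_by_object(prev_blocks, curr_blocks)

# Each newly entered block may be divided down to the size limit only.
plan = {block: split_sizes(64, 32) for block in entered}
```

Here only block (3, 1) is newly covered by the object, and its permitted sizes are 64 and 32, which corresponds to a single division.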

Other features of the object region extraction unit are described below with reference to FIGS. 7 and 8. As shown in FIG. 7, the object region extraction unit (30) may set a learning object region (OA) that contains the learning object and is larger than the learning objects (Aa1, Ab1, Ac1). More specifically, the object region extraction unit (30) may set as the learning object region the area enclosed by one layer of first blocks surrounding the outside of the learning object.

FIG. 8(a) shows the first step by which the object region extraction unit obtains the non-learning object region. In a video image divided into first blocks, that is, LCUs, a weight of '2' is assigned to each LCU that contains part of an object region, and a weight of '0' is assigned to every other LCU. FIG. 8(b) shows the second step, in which a weight of '1' is assigned to each LCU adjacent to an LCU with weight '2'. Through this process, the non-learning object region, which excludes the object region with weight 2 and the adjacent region with weight 1, can be obtained accurately.
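
The two weighting steps can be sketched on a small LCU grid. The sketch assumes that "adjacent" includes diagonal neighbours (8-connectivity), which matches the idea of one surrounding layer of first blocks but is not stated explicitly in the source. The grid size and object position are illustrative.

```python
def weight_map(rows, cols, object_lcus):
    """Build a per-LCU weight grid from (row, col) object positions.

    Step 1: weight 2 for LCUs that contain part of an object, else 0.
    Step 2: weight 1 for zero-weight LCUs next to a weight-2 LCU.
    """
    weights = [[0] * cols for _ in range(rows)]
    for r, c in object_lcus:
        weights[r][c] = 2
    for r, c in object_lcus:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and weights[nr][nc] == 0:
                    weights[nr][nc] = 1
    return weights


grid = weight_map(4, 5, {(1, 2)})
for row in grid:
    print(row)

# LCUs that still have weight 0 form the non-learning object region.
background = [(r, c) for r in range(4) for c in range(5) if grid[r][c] == 0]
```

For this 4x5 grid with one object LCU, the object cell gets weight 2, its eight neighbours get weight 1, and the remaining eleven LCUs stay at 0.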

이하, 도 9를 참조하여, 학습객체부호화부(40)와 비학습객체부호화부(50)의 또 다른 특징에 대해 설명하도록 한다. Hereinafter, another feature of the learning object encoding unit 40 and the non-learning object encoding unit 50 will be described with reference to FIG. 9.

도 9의 (a)와 (b)는 시간차를 두고 촬영된 영상이미지이다. 이러한 영상이미지는 시간 축 상에 고속 고효율 비디오 코덱으로 부호화된 연속된 영상으로 그것의 코딩유닛(CU) 분할구조의 일 예를 나타내고 있다. 코딩유닛(CU) 분할 구조는 영상의 특성에 따라 결정되는 특성을 보여주고 있다. 보다 구체적으로 코딩유닛 분할 구조는 객체 또는 그것의 경계와 같이 움직임이 많거나 복잡한 영역에서는 최대코딩유닛(LCU)의 분할구조가 배경과 같은 비학습 객체영역에서의 분할구조와 비교해 극명한 차이를 나타내고 있다.9A and 9B are image images taken with a time difference. This video image is a continuous video coded with a high-speed and high-efficiency video codec on a time axis, and shows an example of a coding unit (CU) division structure thereof. The coding unit (CU) split structure shows characteristics determined according to the characteristics of an image. More specifically, the division structure of the coding unit shows a marked difference in the divisional structure of the maximum coding unit (LCU) in a region with a lot of motion or complexities such as an object or its boundary compared to the divisional structure in a non-learning object region such as a background. .

도 9에서 표시된 바와 같이, (1)과 같이 비학습된 객체가 위치하는 영역에는 최대코딩유닛(LCU)이 위치하고, (2)와 같이 학습객체가 위치하는 영역에는 최대코딩유닛(LCU)과 달리 화소 변화가 적어 상대적으로 큰 크기의 코딩유닛(CU)로 분할되어 부호화가 수행되고 있다. 일례로, (2)와 같이 학습객체가 위치하는 영역이라도, 화소가 동일한 객체의 일부분이 움직일 경우, 화소 변화가 적어 움직인 부분에 대해서는 최대코딩유닛(LCU)에서 적게 분할되어 부호화가 수행될 수 있다.As shown in Fig. 9, the maximum coding unit (LCU) is located in the area where the unlearned object is located as shown in (1), and unlike the maximum coding unit (LCU) in the area where the learning object is located as shown in (2). Since there is little change in pixels, the coding is performed by being divided into coding units (CU) having a relatively large size. As an example, even in the area where the learning object is located as shown in (2), if a part of the object with the same pixel moves, the moving part due to the small change in the pixel is divided into less by the LCU and encoding can be performed. have.

That is, comparing the maximum coding units (LCUs) at (1) and (2) in FIGS. 9(a) and 9(b) shows that the partition structures of temporally co-located LCUs, that is, LCUs at the same position in pictures captured at different times, are highly similar. This similarity is useful information for predicting the shape of the partition structure in advance when determining the coding unit (CU) partition structure.

In particular, this object information makes it possible to expect that a maximum coding unit (LCU) located in a non-learning object region will generally have a shallow partition depth and a partition structure made up of large coding units (CUs). By referring to the object information for the temporally co-located LCU, it can also be expected that the partition structures will be similar.

Conventional methods determine the partition structure by performing encoding on every coding unit (CU). Using this object information, the partition structure can instead be predicted according to whether a region belongs to an object or to the background, which reduces encoding complexity.

The following methods may be applied to the maximum coding unit and coding unit partition structure described above: an early termination method for the coding unit partition structure using SKIP mode, a coding unit partition structure restriction method, and a motion prediction search range restriction method.

First, the application of the SKIP-mode early termination method shown in FIG. 10 to the maximum coding unit and coding unit partition structure described above is explained.

The early termination method using SKIP mode relies on the fact that blocks in a non-learning object region generally show little change in motion and are therefore encoded in SKIP mode. When the current coding unit (CU) belongs to the background and the optimal CU is determined during encoding to be SKIP mode of size 2Nx2N, no further CU division is performed and partitioning terminates. This reduces the computational complexity of block division and improves the encoding speed of the block.

More specifically, the early termination method using SKIP mode may proceed as a series of seven steps.

The series begins with a first step that determines whether the region of the current coding unit (CU) is a non-learning object region and determines the optimal prediction unit (PU) mode. After the first step, a second step determines whether the current CU is the smallest-size CU. If it is not the smallest-size CU, the third step is performed; if it is, the sixth step is performed.

In the third step, if the coding unit belongs to a non-learning object region, the fourth step is performed. Otherwise, the fifth step is performed.

In the fourth step, it is determined whether the current coding unit has been determined to be in SKIP mode; if so, the sixth step is performed.

In the fifth step, the current coding unit is divided into four coding units, each with half its width and half its height, and the first step is performed again.

When division of the coding unit ends, a sixth step encodes the coding unit at its current size and stores it among the coding unit partition structure candidates. Then a seventh step uses rate-distortion optimization to select, from the coding unit (CU) candidates stored for the current maximum coding unit (LCU), the CU partition structure that is most efficient in terms of image quality and bit amount. The series of steps ends with the seventh step.
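The first six steps of the SKIP-mode early termination method can be sketched as a recursive quadtree walk. This is only a minimal illustration under stated assumptions: the `CU` type, the `is_non_learning` and `best_mode` callbacks, and the smallest CU size of 8 are hypothetical and do not come from the specification. The text does not say what happens in the fourth step when the mode is not SKIP, so the sketch assumes division continues with the fifth step. The rate-distortion selection of the seventh step is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

MIN_CU_SIZE = 8  # assumption: smallest coding unit size


@dataclass(frozen=True)
class CU:
    x: int
    y: int
    size: int


def collect_candidates(
    cu: CU,
    is_non_learning: Callable[[CU], bool],  # hypothetical region test
    best_mode: Callable[[CU], str],         # hypothetical PU mode decision
    candidates: List[CU],
) -> None:
    """Walk steps 1 to 6 for one coding unit and record where division stops."""
    # Step 1: decide the optimal prediction unit mode for this CU.
    mode = best_mode(cu)
    # Step 2: the smallest CU cannot be divided further, so go to step 6.
    # Steps 3 and 4: a non-learning-region CU in SKIP mode stops early.
    if cu.size <= MIN_CU_SIZE or (is_non_learning(cu) and mode == "SKIP"):
        candidates.append(cu)  # Step 6: encode and store as a candidate.
        return
    # Step 5: divide into four half-width, half-height CUs, back to step 1.
    half = cu.size // 2
    for dy in (0, half):
        for dx in (0, half):
            collect_candidates(
                CU(cu.x + dx, cu.y + dy, half),
                is_non_learning, best_mode, candidates,
            )


# Example: the left half of a 64x64 LCU is background, every CU prefers SKIP.
found: List[CU] = []
collect_candidates(
    CU(0, 0, 64),
    lambda cu: cu.x + cu.size <= 32,
    lambda cu: "SKIP",
    found,
)
```

In this example the two 32x32 CUs in the left half stop immediately, while the right half is divided down to the assumed minimum size, so the early exit removes the deeper recursion only where the region test and the SKIP decision agree.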

In addition, the coding unit partition structure restriction method shown in FIG. 11 may be applied to the coding unit partition structure of FIG. 9.

The coding unit partition structure restriction method is based on the observation that a block in a non-learning object region generally has a partition structure similar to that of the temporally co-located coding unit: when the partition structure of the temporally co-located coding unit is simple, the partition structure of the coding unit to be encoded is also simple.

In this method, when the current coding unit belongs to a non-learning object region and the minimum coding unit size in the temporally correlated maximum coding unit (LCU) is not at or below the set coding unit size, the current LCU likewise terminates without dividing into coding units below that size.

The coding unit partition structure restriction method is described in more detail below. To keep the description concise and clear, the set coding unit size is taken to be 32x32.

The coding unit partition structure restriction method may proceed as a series of eight steps.

The series begins with a first step that determines whether the region of the current coding unit is a non-learning object region and determines the optimal prediction unit mode. After the first step, a second step determines whether the current coding unit is the smallest-size coding unit. If it is not the smallest-size coding unit, the third step is performed; if it is, the seventh step is performed.

In the third step, it is determined whether the current coding unit belongs to a non-learning object region. If it does, the fourth step is performed.

In the fourth step, it is determined whether the size of the current coding unit is smaller than 32x32. If it is smaller than 32x32, the fifth step is performed; if it is greater than or equal to 32x32, the sixth step is performed.

In the fifth step, it is determined whether the minimum coding unit (CU) size of the co-located maximum coding unit (LCU) in the reference frame is greater than or equal to 32x32. If that minimum CU size is greater than or equal to 32x32, the seventh step is performed; if it is smaller than 32x32, the sixth step is performed.

In the sixth step, the current coding unit is divided into four coding units, each with half its width and half its height, and the first step is performed again.

In the seventh step, the coding unit is encoded at its current size and stored among the coding unit partition structure candidates, and the eighth step is performed.

In the eighth step, rate-distortion optimization is used to select, from the coding unit candidates stored for the current maximum coding unit (LCU), the coding unit partition structure that is most efficient in terms of image quality and bit amount. The series of steps ends with the eighth step.

In addition, the motion prediction search range restriction method shown in FIG. 12 may be applied to the coding unit partition structure of FIG. 9.
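The eight-step coding unit partition structure restriction method described above can be sketched in the same recursive form. This is only an illustration: the `Block` tuple, both callbacks, and the smallest CU size of 8 are assumptions. The text does not state where the third step goes when the coding unit is outside the non-learning object region, so the sketch assumes ordinary division (the sixth step). The rate-distortion selection of the eighth step is omitted.

```python
from typing import Callable, List, Tuple

MIN_CU_SIZE = 8  # assumption: smallest coding unit size
LIMIT = 32       # set coding unit size used in the description (32x32)

Block = Tuple[int, int, int]  # (x, y, size), a hypothetical representation


def restricted_partition(
    block: Block,
    is_non_learning: Callable[[Block], bool],  # hypothetical region test
    colocated_min_cu: Callable[[Block], int],  # min CU size in co-located LCU
    candidates: List[Block],
) -> None:
    """Walk steps 1 to 7 and record the blocks at which division stops."""
    x, y, size = block
    # Step 2: the smallest CU goes straight to step 7.
    stop = size <= MIN_CU_SIZE
    # Steps 3 to 5: a non-learning CU smaller than the limit stops when the
    # co-located LCU was never divided below the limit.
    if not stop and is_non_learning(block) and size < LIMIT:
        stop = colocated_min_cu(block) >= LIMIT
    if stop:
        candidates.append(block)  # Step 7: encode and store as a candidate.
        return
    # Step 6: divide into four half-size CUs and return to step 1.
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            restricted_partition(
                (x + dx, y + dy, half),
                is_non_learning, colocated_min_cu, candidates,
            )


# Example: a background LCU whose co-located LCU has a minimum CU size of 32.
simple: List[Block] = []
restricted_partition((0, 0, 64), lambda b: True, lambda b: 32, simple)
```

Following the steps literally, the size test in the fourth step only takes effect below 32x32, so in this example division stops at 16x16 rather than continuing to the assumed minimum size.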

As one example, the motion prediction search range restriction method limits the search range to 1/2 of the existing motion prediction search range when motion prediction is performed while encoding a non-learning object region.

The motion prediction search range restriction method proceeds in the processing order shown in FIG. 12. When the current prediction unit (PU) performs a motion search and the current coding unit belongs to the background region of the video image, the motion prediction search range is set to 32, half of the set value of 64; in all other cases the existing method is followed. Restricting the search range in this way reduces computational complexity compared with the existing methods. Here, the existing methods may be Advanced Motion Vector Prediction (AMVP) and Merge.
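The search range rule can be written as a one-line decision. The sketch below is illustrative only: the function names are hypothetical, and the position count assumes an exhaustive search over a square window, which is not the search pattern the specification describes and serves only to show the scale of the saving.

```python
DEFAULT_SEARCH_RANGE = 64  # set value given in the description


def motion_search_range(is_background: bool,
                        default_range: int = DEFAULT_SEARCH_RANGE) -> int:
    """Halve the search range for background CUs, otherwise keep the default."""
    return default_range // 2 if is_background else default_range


def full_search_positions(search_range: int) -> int:
    """Candidate positions in a square window of +/- search_range pixels
    (assumption: exhaustive search, used here only for comparison)."""
    side = 2 * search_range + 1
    return side * side


background = motion_search_range(True)   # 32
foreground = motion_search_range(False)  # 64
```

Under the exhaustive-search assumption, halving the range from 64 to 32 reduces the window from 16641 to 4225 candidate positions, roughly a quarter.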

Of these existing methods, Advanced Motion Vector Prediction (AMVP), as shown in FIG. 13, first applies a diamond search to positions within the motion search area of the reference frame to find the position of the block that most closely matches the current PU, and then applies a two-point search around that position, comparing it closely with neighboring positions to obtain the optimal motion vector.

The Merge method can be encoded in SKIP mode. SKIP mode encodes only the motion information of the PU and omits the residual signal: the pixel values of the current PU and the reference block are treated as identical, and the reference block is taken over as the current PU as is, with no additional pixel information. Furthermore, SKIP mode applies only to 2Nx2N prediction units (PUs), and whether a PU has been encoded in SKIP mode is determined using SKIP_FLAG.

In addition, the range over which each of the methods applied to the coding unit partition structure described above is used may vary with block size, CU depth, and so on. The variable that determines the application range (that is, size or depth information) may be set so that the encoder and decoder use a predetermined value, or so that a value determined by the profile or level is used. Alternatively, if the encoder writes the variable value in the bitstream, the decoder can obtain the value from the bitstream and use it.

When the application range varies with coding unit depth, there may be, as illustrated in the table below, Method A, which applies only to depths greater than or equal to a given depth; Method B, which applies only to depths less than or equal to the given depth; and Method C, which applies only to the given depth.

The following is an example of how the application range of the methods of the present invention is determined when the given coding unit depth is 2. (O: applied at that depth, X: not applied at that depth.)

Coding unit depth indicating application range   Method A   Method B   Method C
0                                                X          O          X
1                                                X          O          X
2                                                O          O          O
3                                                O          X          X
4                                                O          X          X
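The three range rules reduce to simple comparisons against the given depth. The sketch below reproduces the O/X table for a given depth of 2; the function name `applies` and the use of the strings "A", "B", and "C" as method identifiers are illustrative choices, not part of the specification.

```python
def applies(method: str, depth: int, given_depth: int) -> bool:
    """Return True when the method is applied at this coding unit depth."""
    if method == "A":   # only at the given depth or deeper
        return depth >= given_depth
    if method == "B":   # only at the given depth or shallower
        return depth <= given_depth
    if method == "C":   # only at exactly the given depth
        return depth == given_depth
    raise ValueError(f"unknown method: {method}")


# Rebuild the table rows for depths 0 to 4 with a given depth of 2.
table = {
    depth: "".join("O" if applies(m, depth, 2) else "X" for m in "ABC")
    for depth in range(5)
}
```

Each value in `table` is the row for one depth, read in the order Method A, Method B, Method C.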

When the methods of the present invention are not applied at any depth, this may be indicated using an arbitrary flag. It may also be expressed by signaling, as the coding unit (CU) depth value indicating the application range, a value one greater than the maximum CU depth.

In addition, the methods described above may be applied differently to chrominance blocks depending on the size of the luminance block. They may also be applied differently to the luminance signal image and the chrominance image.

Luminance block size           Chrominance block size               Luminance applied   Chrominance applied   Methods
4 (4x4, 4x2, 2x4)              2 (2x2)                              O or X              O or X                가 (Ga) 1, 2, ..
                               4 (4x4, 4x2, 2x4)                    O or X              O or X                나 (Na) 1, 2, ..
                               8 (8x8, 8x4, 4x8, 2x8, etc.)         O or X              O or X                다 (Da) 1, 2, ..
                               16 (16x16, 16x8, 4x16, 2x16, etc.)   O or X              O or X                라 (Ra) 1, 2, ..
                               32 (32x32)                           O or X              O or X                마 (Ma) 1, 2, ..
8 (8x8, 8x4, 2x8, etc.)        2 (2x2)                              O or X              O or X                바 (Ba) 1, 2, ..
                               4 (4x4, 4x2, 2x4)                    O or X              O or X                사 (Sa) 1, 2, ..
                               8 (8x8, 8x4, 4x8, 2x8, etc.)         O or X              O or X                아 (A) 1, 2, ..
                               16 (16x16, 16x8, 4x16, 2x16, etc.)   O or X              O or X                자 (Ja) 1, 2, ..
                               32 (32x32)                           O or X              O or X                카 (Ka) 1, 2, ..
16 (16x16, 8x16, 4x16, etc.)   2 (2x2)                              O or X              O or X                타 (Ta) 1, 2, ..
                               4 (4x4, 4x2, 2x4)                    O or X              O or X                파 (Pa) 1, 2, ..
                               8 (8x8, 8x4, 4x8, 2x8, etc.)         O or X              O or X                하 (Ha) 1, 2, ..
                               16 (16x16, 16x8, 4x16, 2x16, etc.)   O or X              O or X                개 (Gae) 1, 2, ..
                               32 (32x32)                           O or X              O or X                내 (Nae) 1, 2, ..

Table 2 shows an example of combinations of the methods.

Among the variant methods in Table 2, method "사 1" (Sa 1) covers the case where the luminance block size is 8 (8x8, 8x4, 2x8, etc.) and the chrominance block size is 4 (4x4, 4x2, 2x4); in this case the method of the specification may be applied to both the luminance signal and the chrominance signal. Method "파 2" (Pa 2) covers the case where the luminance block size is 16 (16x16, 8x16, 4x16, etc.) and the chrominance block size is 4 (4x4, 4x2, 2x4); in this case the method of the specification may be applied to the luminance signal but not to the chrominance signal. In other variants, the method of the specification may be applied only to the luminance signal and not to the chrominance signal or, conversely, only to the chrominance signal and not to the luminance signal.

Hereinafter, based on the description so far of the object information-based high-speed, high-efficiency video codec image encoding system 1 using machine learning, which detects objects, the object information-based high-speed, high-efficiency video codec image encoding method using machine learning according to an embodiment of the present invention is described in detail.

The description given above of the components of the object information-based high-speed, high-efficiency video codec image encoding system using machine learning, and of the features of those components, applies equally to the encoding method described below.

The object information-based high-speed, high-efficiency video codec image encoding method using machine learning follows the flowchart of FIG. 14.

The object information-based high-speed, high-efficiency video codec image encoding method using machine learning proceeds through steps (A) to (F) and can encode video images.

Each step of the object information-based high-speed, high-efficiency video codec image encoding method using machine learning is described below.

The method begins with step (A), in which the video image generator 10 photographs a certain area over time and generates video images that differ over time (S110).

Next, in step (B), the object information providing unit 20 extracts object information of a learned object, a specific object having been machine-learned (S120).

Next, in step (C), the object region extraction unit 30 receives from the video image generator 10 a video image that has been divided at regular intervals to form a plurality of first blocks, receives the object information from the object information providing unit 20, compares the object information with stored reference objects, and extracts unmatched objects as non-learning objects and matched objects as learning objects (S200).

Next, the method proceeds either to step (D), in which the learning object encoding unit 40 divides the plurality of first blocks included in a learning object into a plurality of learning object sub-blocks, assigns weights to the learning object sub-blocks, and encodes them (S410), or to step (E), in which the non-learning object encoding unit 50 encodes the first blocks included in a non-learning object (S420). Then, in step (F), the encoding end determination unit 60 causes steps (C), (D), and (E) to be repeated until encoding of the video image is finished (S500). With step (F), the series of steps of the object information-based high-speed, high-efficiency video codec image encoding method using machine learning may end.
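Steps (C) to (E) can be sketched as a routing decision per detected object. This is a rough illustration under several assumptions: object matching is reduced to a label lookup, blocks are hypothetical `(x, y, size)` tuples, and the sub-block rule (halving each side, weight 1 for sub-blocks and weight 0 for undivided first blocks) is borrowed from the claims. No actual encoding is performed; the function only returns the planned blocks and weights.

```python
from typing import Dict, List, Set, Tuple

Block = Tuple[int, int, int]  # (x, y, size) in pixels, a hypothetical layout


def classify_blocks(
    detections: Dict[str, List[Block]],  # object label -> its first blocks
    reference_objects: Set[str],         # stored reference object labels
) -> Tuple[List[Block], List[Block]]:
    """Step (C): route the first blocks of each object by reference match."""
    learning: List[Block] = []
    non_learning: List[Block] = []
    for label, blocks in detections.items():
        target = learning if label in reference_objects else non_learning
        target.extend(blocks)
    return learning, non_learning


def plan_encoding(
    learning: List[Block], non_learning: List[Block]
) -> List[Tuple[Block, int]]:
    """Steps (D) and (E): return (block, weight) pairs in encoding order."""
    plan: List[Tuple[Block, int]] = []
    for x, y, size in learning:
        half = size // 2
        # Step (D): divide into half-size sub-blocks and give them weight 1.
        for dy in (0, half):
            for dx in (0, half):
                plan.append(((x + dx, y + dy, half), 1))
    # Step (E): first blocks of non-learning objects stay whole, weight 0.
    plan.extend((block, 0) for block in non_learning)
    return plan


learning, non_learning = classify_blocks(
    {"person": [(0, 0, 64)], "tree": [(64, 0, 64)]},
    {"person"},
)
plan = plan_encoding(learning, non_learning)
```

Step (F) would correspond to calling these two functions for each video image in turn until none remain.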

In addition, in step (C), when the object region extraction unit 30 extracts a learning object, the method may include a step of setting a learning object region larger than the learning object.

Through these steps, the object information-based high-speed, high-efficiency video codec image encoding method using machine learning classifies learning objects and non-learning objects in the video on the basis of machine-learned data and encodes them with different encoding processes, which can improve both the encoding efficiency and the encoding speed of the video.

The method according to the present invention may be produced as a program to be executed on a computer and stored on a computer-readable recording medium. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include media implemented in the form of a carrier wave (for example, transmission over the Internet).

The computer-readable recording medium may be distributed over network-connected computer systems so that computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the method can be readily inferred by programmers in the technical field to which the present invention belongs.

Embodiments of the present invention have been described above with reference to the accompanying drawings, but a person of ordinary skill in the art to which the present invention belongs will understand that the invention may be practiced in other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive.

1: object information-based high-speed, high-efficiency video codec image encoding system using machine learning
10: video image generator    110: first video image
120: second video image
20: object information providing unit
30: object region extraction unit    40: learning object encoding unit
50: non-learning object encoding unit    60: encoding end determination unit

Claims

An object information-based high-speed, high-efficiency video codec image encoding method using machine learning, comprising:
(A) a step in which a video image generator photographs a certain area over time and generates video images that differ over time;
(B) a step in which an object information providing unit extracts object information of a learned object, a specific object having been machine-learned;
(C) a step in which an object region extraction unit receives from the video image generator a video image divided at regular intervals to form a plurality of first blocks, receives the object information from the object information providing unit, compares the object information with a stored reference object, and extracts an unmatched object as a non-learning object and a matched object as a learning object;
(D) a step in which a learning object encoding unit divides the plurality of first blocks included in the learning object into a plurality of learning object sub-blocks, assigns weights to the learning object sub-blocks, and encodes them; and
(E) a step in which a non-learning object encoding unit encodes the first block included in the non-learning object,
wherein in step (A), the video image generator generates a first video image and a second video image that differ over time,
in step (E), the non-learning object encoding unit superimposes the first video image and the second video image and, when the learning object of the second video image overlaps a region included in the non-learning object of the first video image, divides the first block of the learning object of the second video image into blocks of a size equal to or smaller than the first block,
step (C) further comprises, when the object region extraction unit extracts a learning object, setting a learning object region that includes the learning object and is larger than the learning object,
in step (E), the non-learning object encoding unit holds a divided-block size value that limits division of the first block, and divides the first block until it reaches the divided-block size value,
the first block is divided into second blocks whose horizontal and vertical lengths are 1/2 those of the first block, and the second blocks are encoded, and
a weight of '1' is assigned to the second block and a weight of '0' is assigned to the first block so that the first block and the second block are encoded differently.

delete