KR102464851B1

KR102464851B1 - Learning method and image classification method using multi-scale feature map

Info

Publication number: KR102464851B1
Application number: KR1020200028845A
Authority: KR
Inventors: 곽진태; 김일
Original assignee: 세종대학교산학협력단
Priority date: 2020-03-09
Filing date: 2020-03-09
Publication date: 2022-11-07
Also published as: KR20210113738A

Abstract

A learning method and an image classification method using multi-scale feature maps extracted by an artificial neural network are disclosed. The learning method using multi-scale feature maps includes: generating multi-scale feature maps for a training image using an artificial neural network; generating an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel; calculating the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computing, pixel by pixel, a binary pattern code for each of the multi-scale feature maps; and performing learning on the training image using the average feature map and the binary pattern code.

Description

Learning method and image classification method using multi-scale feature map {LEARNING METHOD AND IMAGE CLASSIFICATION METHOD USING MULTI-SCALE FEATURE MAP}

The present invention relates to a learning method and an image classification method using an artificial neural network, and more particularly to a learning method and an image classification method using multi-scale feature maps extracted by an artificial neural network.

The field of artificial intelligence has been developing rapidly, and artificial neural networks are used to classify input images. Convolution is widely used to extract feature maps from an input image, and recently, to improve image classification performance, learning has been performed on feature maps extracted at multiple scales rather than on a single feature map.

There are two general approaches to using multiple scales. In the first, the input image is converted into images of several scales, which are combined and then fed to a single deep learning model to classify the image.

In the second, the input image is converted into images of several scales, each of which is fed to the same deep learning model to obtain multi-scale feature maps, and the image is classified by simply combining those feature maps. Concatenation, addition, and convolution are the methods mainly used to combine multi-scale feature maps.

These methods do not use the multi-scale information in an organically interrelated way, and they have difficulty properly reflecting the relationships between scales.

Related prior art documents include Korean Patent Registration No. 10-2046240 and Korean Patent Publication No. 2019-0097205.

An object of the present invention is to provide a learning method and an image classification method that can classify images more accurately using multi-scale feature maps.

Another object of the present invention is to provide a feature map extraction method that can select, from multi-scale feature maps, the feature maps of scales useful for learning and extract useful information from them.

According to an embodiment of the present invention for achieving the above objects, there is provided a learning method using multi-scale feature maps, the method including: generating multi-scale feature maps for a training image using an artificial neural network; generating an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel; calculating the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computing, pixel by pixel, a binary pattern code for each of the multi-scale feature maps; and performing learning on the training image using the average feature map and the binary pattern code.

According to another embodiment of the present invention for achieving the above objects, there is provided an image classification method using multi-scale feature maps, the method including: generating multi-scale feature maps for an input image using an artificial neural network; generating an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel; calculating the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computing, pixel by pixel, a binary pattern code for each of the multi-scale feature maps; and classifying the input image using the average feature map and the binary pattern code.

According to yet another embodiment of the present invention for achieving the above objects, there is disclosed a feature map extraction method for image classification, the method including: generating multi-scale feature maps for an input image using an artificial neural network; generating an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel; calculating the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computing, pixel by pixel, a binary pattern code for each of the multi-scale feature maps; and generating an integrated feature map in which the binary pattern code is reflected.

According to an embodiment of the present invention, binary patterning of the multi-scale feature maps makes it possible to select, from among feature maps of several scales, those of scales useful for learning; performing learning with them can improve learning ability and also image classification performance.

FIG. 1 is a diagram for explaining a learning method using multi-scale feature maps according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an artificial neural network according to an embodiment of the present invention.
FIG. 3 is a diagram for explaining binary patterning according to an embodiment of the present invention.
FIG. 4 is a diagram for explaining an image classification method using multi-scale feature maps according to an embodiment of the present invention.
FIG. 5 is a diagram for explaining an image classification result according to an embodiment of the present invention.
FIG. 6 is a diagram for explaining a feature map extraction method for image classification according to an embodiment of the present invention.

Since the present invention may be modified in various ways and may have various embodiments, specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and it should be understood to include all modifications, equivalents, and substitutes falling within the spirit and technical scope of the present invention. In describing the drawings, like reference numerals are used for like elements.

The present invention relates to a learning method for image classification and a method of classifying an input image based on machine learning, and uses multi-scale feature maps to increase the classification accuracy for the input image.

An embodiment of the present invention applies binary patterning to the multi-scale feature maps in order to organically combine the feature values of the input image contained in the multi-scale feature maps and to extract useful information from those feature values for use in classifying the input image.

The learning method and the image classification method using multi-scale feature maps according to an embodiment of the present invention can be used in various fields in which an input image is classified into preset classes, and can be performed on a computing device including a processor and memory, such as a desktop computer, a laptop computer, or a server.

Hereinafter, embodiments according to the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining a learning method using multi-scale feature maps according to an embodiment of the present invention, and FIG. 2 is a diagram illustrating an artificial neural network according to an embodiment of the present invention.

Referring to FIGS. 1 and 2, a computing device performing the learning method according to an embodiment of the present invention generates, using an artificial neural network, multi-scale feature maps 220 for a training image 210 used for learning (S110). That is, the computing device generates feature maps 220 of different scales; in one embodiment, the feature maps 220 may be extracted from the training image 210 in a convolution layer L1.

The computing device may generate the multi-scale feature maps by using kernels of different sizes or by adjusting the number of convolutions applied to the training image. Here, scale may be a concept that covers both the size and the resolution of a feature map, and multi-scale feature maps may mean a plurality of feature maps that differ in at least one of size and resolution. FIG. 2 describes an embodiment in which five multi-scale feature maps 220 of different sizes and resolutions are used.

In step S110, the computing device may also convert the sizes of the multi-scale feature maps 220 so that they equal the size of one of them; once the feature maps have the same size, their resolutions may also become the same.

In one embodiment, the computing device may adjust the size of the feature maps using a transition layer L2, which may include a 1x1 convolution layer and a resizing layer. The 1x1 convolution layer may make the number of channels of the multi-scale feature maps equal, and the resizing layer may make the sizes of the first, second, fourth, and fifth feature maps 221, 222, 224, and 225 equal to the size of the third feature map 223. The resizing layer may convert the size of a feature map by, for example, averaging neighboring pixel values or interpolating between neighboring pixel values.
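By way of a non-limiting illustration, the resizing described above can be sketched in plain Python. The sketch assumes single-channel feature maps stored as nested lists whose sides are divisible by an integer factor; the helper names `downsample_mean` and `upsample_nearest` are hypothetical, and nearest-neighbor repetition is used here only as the simplest form of interpolation, not as the method of the embodiment.

```python
def downsample_mean(fmap, factor):
    """Shrink a 2D map by averaging each factor x factor block of neighboring pixels."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for r in range(0, h, factor):
        row = []
        for c in range(0, w, factor):
            block = [fmap[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def upsample_nearest(fmap, factor):
    """Enlarge a 2D map by repeating each pixel `factor` times in both directions."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# A larger map (4x4) and a smaller map (1x1) are both brought to 2x2,
# playing the role of the reference size of the third feature map.
large = [[1, 3, 5, 7],
         [1, 3, 5, 7],
         [2, 2, 6, 6],
         [2, 2, 6, 6]]
small = [[10]]

resized_large = downsample_mean(large, 2)
resized_small = upsample_nearest(small, 2)
```

After this step both maps have the same height and width, so their pixels can be compared position by position.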

The computing device then performs learning on the training image using the multi-scale feature maps. In one embodiment, the computing device may perform learning by binary-patterning the multi-scale feature maps using a binary patterning layer L3.

More specifically, the computing device generates an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel (S120). It then calculates the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computes, pixel by pixel, a binary pattern code for each of the multi-scale feature maps (S130). Steps S130 and S140 may be performed in an SBP encoder and are described in more detail with reference to FIG. 3.

The computing device performs learning on the training image using the average feature map and the binary pattern code (S140). Given a class label for the training image, the artificial neural network can learn what kind of image the training image is.

In one embodiment, the computing device may learn at least one of the presence or absence of a tumor and the differentiation grade of the tumor in the training image. For tumor detection, images containing tumor cells and images consisting only of normal cells may be used as training images; for predicting the tumor differentiation grade, images corresponding to several levels of differentiation may be used as training images. The presence or absence of tumor cells, or the tumor differentiation grade, may be given as the class label.

In step S140, the computing device may apply convolution and max pooling to each of the average feature map 240 and the integrated feature map 230 obtained from the binary pattern code, combine the results into a final feature map 250, and perform learning using it as the feature map for the training image 210. That is, learning may be performed by inputting the final feature map to a fully connected neural network.
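A reduced sketch of how the final feature map might be assembled is given below. It is illustrative only: the learned convolution is omitted, 2x2 max pooling with stride 2 is an assumed configuration, "combining" is modeled as stacking the two pooled maps as channels, and the function name `max_pool_2x2` and the sample values are hypothetical.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 over a 2D map with even height and width."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[r][c], fmap[r][c + 1],
                 fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, w, 2)]
            for r in range(0, h, 2)]


# Illustrative stand-ins for the average feature map and the integrated feature map.
average_map = [[0.5, 1.5, 2.0, 0.0],
               [1.0, 0.5, 3.0, 1.0]]
integrated_map = [[7, 3, 31, 0],
                  [1, 15, 8, 16]]

# Pool each map separately, then stack the results as two channels.
final_feature_map = [max_pool_2x2(average_map), max_pool_2x2(integrated_map)]
```

In a full network the stacked result would be flattened and passed to the fully connected layers.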

FIG. 3 is a diagram for explaining binary patterning according to an embodiment of the present invention.

As shown in FIG. 3, in step S120 the computing device performing the learning method according to an embodiment of the present invention averages the k-th pixel values 326 at the same position in the multi-scale feature maps 321 to 325, which have been converted to the same size, and generates the average feature map 240 having the resulting mean value 327 as its k-th pixel value. The average feature map 240 has the same size as the converted multi-scale feature maps 321 to 325, and because the plural pixel values of the multi-scale feature maps 321 to 325 are averaged into a single pixel value, one average feature map 240 is generated from the multi-scale feature maps 321 to 325.

The mean value $\bar{f}^{c}_{k}$ of the k-th pixel values $f^{c}_{k,j}$ of the multi-scale feature maps 321 to 325 may be calculated as in [Equation 1].

[Equation 1]
$$\bar{f}^{c}_{k} = \frac{1}{S}\sum_{j=1}^{S} f^{c}_{k,j}$$

Here, j is the index of a multi-scale feature map, S is the number of multi-scale feature maps, and c is the channel of the multi-scale feature maps. The mean value $\bar{f}^{c}_{k}$ of the k-th pixel values corresponds to the k-th pixel value of the average feature map.
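The pixel-wise averaging of step S120 can be sketched as follows, as a minimal illustration for a single channel. It assumes the feature maps have already been resized to a common size and are stored as nested lists; the function name `average_feature_map` and the sample values are hypothetical.

```python
def average_feature_map(feature_maps):
    """Average S same-sized 2D feature maps pixel by pixel (one channel)."""
    s = len(feature_maps)
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    return [[sum(fm[r][c] for fm in feature_maps) / s for c in range(w)]
            for r in range(h)]


# Five 1x2 maps standing in for the five resized multi-scale feature maps.
maps = [
    [[1.0, 2.0]],
    [[2.0, 2.0]],
    [[3.0, 2.0]],
    [[4.0, 2.0]],
    [[5.0, 2.0]],
]
mean_map = average_feature_map(maps)
```

The result has the same height and width as each input map, which matches the statement that one average feature map is produced from the set.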

In step S130, the computing device calculates the difference between the k-th pixel value 326 of each of the multi-scale feature maps 321 to 325 and the k-th pixel value 327 of the average feature map 240, and may compute, pixel by pixel, the binary pattern code for each of the multi-scale feature maps according to the calculated difference. In one embodiment, the binary pattern code is assigned 1 when the pixel value difference is 0 or greater and 0 when it is less than 0.

In FIG. 3, the difference between the k-th pixel value of the first and second feature maps 321 and 322 and the k-th pixel value of the average feature map 240 is less than 0, while the difference for the third to fifth feature maps 323, 324, and 325 is 0 or greater. The binary pattern code for the k-th pixel is therefore 00111.

The position of each binary digit in the binary pattern code may be determined by the scale of the corresponding multi-scale feature map. For example, as in FIG. 3, the digits may be placed in order from the most significant position to the least significant position as the scale increases.

Because the first feature map 321 has the smallest scale, the binary digit 0 obtained from it corresponds to the most significant position of the binary pattern code; because the fifth feature map 325 has the largest scale, the binary digit 1 obtained from it corresponds to the least significant position.
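As an illustration of the thresholding and digit ordering just described, the following sketch computes the code for a single pixel. The pixel values are made up so that the first two maps fall below the mean and the last three do not, mirroring the FIG. 3 example; the function name `binary_pattern_code` is hypothetical.

```python
def binary_pattern_code(pixel_values):
    """Return the binary pattern code for one pixel position.

    pixel_values holds the k-th pixel of each feature map, ordered from the
    smallest scale (most significant digit) to the largest scale (least
    significant digit).
    """
    mean = sum(pixel_values) / len(pixel_values)
    # A digit is 1 when the difference from the mean is 0 or greater, else 0.
    return "".join("1" if v - mean >= 0 else "0" for v in pixel_values)


code = binary_pattern_code([1, 2, 6, 7, 9])  # mean is 5.0
```

Note that a value exactly equal to the mean yields 1, because the rule assigns 1 when the difference is 0 or greater.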

Once the binary pattern code has been computed for each pixel of the multi-scale feature maps, the computing device converts the binary pattern code of each pixel into a single decimal number to generate the integrated feature map 230, and combines the average feature map 240 with the integrated feature map 230 to perform learning on the training image. That is, the binary pattern code for the k-th pixel is converted into one decimal number, and the k-th pixel value of the integrated feature map 230 corresponds to that decimal number. The average feature map 240 and the integrated feature map 230 have the same size.

In one embodiment, the computing device may compute the binary pattern code and convert it into a decimal number using [Equation 2].

[Equation 2]
$$F^{c}_{k} = \sum_{j=1}^{S} s\left(f^{c}_{k,j} - \bar{f}^{c}_{k}\right) 2^{\,j-1}, \qquad s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

Here, $F^{c}_{k}$ denotes the pixel value of the k-th pixel of the integrated feature map 230 for channel c of the multi-scale feature maps, $\bar{f}^{c}_{k}$ denotes the pixel value of the k-th pixel of the average feature map 240 for channel c of the multi-scale feature maps, and $f^{c}_{k,j}$ is the k-th pixel value of the j-th multi-scale feature map. j is the index of a multi-scale feature map: for the five multi-scale feature maps 321 to 325 it takes a natural number from 1 to 5, with a larger value assigned to a smaller scale.

In FIG. 3, the binary pattern code for the k-th pixel is 00111, so the pixel value of the k-th pixel of the integrated feature map 230 is the decimal number 7.
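The conversion to a decimal value can be sketched as a weighted sum of the binary digits. The weight 2^(j-1) used below is an assumption chosen to agree with the description (the smallest scale receives the largest index j and therefore the most significant digit) and with the worked value 7; the function name `integrated_pixel_value` and the sample pixel values are hypothetical.

```python
def integrated_pixel_value(pixel_values):
    """Decimal value of the binary pattern code for one pixel position.

    pixel_values is ordered from the smallest scale to the largest scale.
    """
    s = len(pixel_values)
    mean = sum(pixel_values) / s
    value = 0
    for position, v in enumerate(pixel_values):
        j = s - position                 # smallest scale gets the largest index
        bit = 1 if v - mean >= 0 else 0  # 1 when the difference is 0 or greater
        value += bit * 2 ** (j - 1)      # assumed digit weight
    return value


value = integrated_pixel_value([1, 2, 6, 7, 9])  # code 00111
```

The same number is obtained by reading the code string as a base-2 integer, for example with `int("00111", 2)`.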

The integrated feature map 230 is then combined with the average feature map 240 and input to the fully connected artificial neural network for use in learning.

Thus, according to an embodiment of the present invention, the 1s in the binary pattern code are reflected in the decimal number, which means that, among the multi-scale feature maps, the feature maps of the scales corresponding to 1 are used for learning as useful information. In the embodiment described above, the third to fifth feature maps 323, 324, and 325, from which the binary digit 1 was obtained, can be regarded as the useful information used for learning.

Consequently, according to an embodiment of the present invention, binary patterning of the multi-scale feature maps makes it possible to select, from among feature maps of several scales, those of scales useful for learning, and performing learning with them can improve learning ability.

FIG. 4 is a diagram for explaining an image classification method using multi-scale feature maps according to an embodiment of the present invention.

The image classification method according to an embodiment of the present invention classifies an input image into one of the preset classes using the artificial neural network trained as described with reference to FIGS. 1 to 3. Accordingly, as in the learning process described above, multi-scale feature maps are generated and binary patterning is performed.

Referring to FIG. 4, a computing device performing the image classification method according to an embodiment of the present invention generates multi-scale feature maps for the input image using the artificial neural network (S410). The computing device may convert the sizes of the multi-scale feature maps so that they equal the size of one of them.

The computing device then generates an average feature map by averaging the pixel values of the multi-scale feature maps pixel by pixel (S120).

The computing device then calculates the difference between the pixel values of the average feature map and of each of the multi-scale feature maps, and computes, pixel by pixel, a binary pattern code for each of the multi-scale feature maps (S130). Here, the position of each binary digit in the binary pattern code may be determined by the scale of the corresponding multi-scale feature map, and the binary pattern code may be a code in which 1 is assigned when the pixel value difference is 0 or greater and 0 is assigned when it is less than 0.

The computing device then classifies the input image using the average feature map and the binary pattern code (S140), and may classify the input image into one of the classes given by the class labels used in the learning process. In one embodiment, the computing device may classify the input image into a class according to the presence or absence of a tumor, or into a class according to the tumor differentiation grade.

The computing device may convert the binary pattern code of each pixel into a single decimal number to generate an integrated feature map, and may classify the input image by combining the average feature map and the integrated feature map.

FIG. 5 is a diagram for explaining an image classification result according to an embodiment of the present invention.

The learning method and the image classification method according to an embodiment of the present invention can be used to classify images generated in various fields such as security and traffic, and can also be used in the medical field to detect tumors and predict the tumor differentiation grade.

For tumor detection, images containing tumor cells and images consisting only of normal cells may be used as training images; for predicting the tumor differentiation grade, images corresponding to several levels of differentiation may be used as training images. The differentiation grade refers to the degree of similarity between tumor cells and normal cells; good differentiation indicates that the tumor cells closely resemble normal cells.

FIG. 5 shows detection results and differentiation-grade prediction results for colorectal cancer, obtained by training and testing on 11294 image patches of 1024x1024 pixels generated from colorectal cancer tissue images. Of the 11294 image patches, the training data set consists of 1525 normal-cell image patches, 1684 low-differentiation tumor image patches, 2925 medium-differentiation tumor image patches, and 1358 high-differentiation tumor image patches, and the test data set consists of 1512 normal-cell image patches, 638 low-differentiation tumor image patches, 1180 medium-differentiation tumor image patches, and 472 high-differentiation tumor image patches.
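The patch counts stated above can be tallied with a few lines of Python to confirm that the training and test sets together account for all 11294 patches. The dictionary keys are shorthand labels chosen for this sketch, not names used in the embodiment.

```python
# Patch counts as stated in the description of FIG. 5.
train_patches = {"normal": 1525, "low": 1684, "medium": 2925, "high": 1358}
test_patches = {"normal": 1512, "low": 638, "medium": 1180, "high": 472}

train_total = sum(train_patches.values())
test_total = sum(test_patches.values())
grand_total = train_total + test_total

print(train_total, test_total, grand_total)
```

This gives 7492 training patches and 3802 test patches, roughly a two-to-one split.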

FIG. 5 shows results obtained with the artificial neural networks ResNet50 and MobileNetV1. In FIG. 5, ResNet denotes the detection and prediction results using a single-scale feature map; FF_concat-ResNet, the results when the multi-scale feature maps are combined by concatenation; FF_add-ResNet, the results when they are combined by addition; FF_conv-ResNet, the results when they are combined by convolution; and MSBP-Net-ResNet, the detection and prediction results according to an embodiment of the present invention.

Likewise, in FIG. 5, MobileNet denotes the detection and prediction results using a single-scale feature map; FF_concat-MobileNet, the results when the multi-scale feature maps are combined by concatenation; FF_add-MobileNet, the results when they are combined by addition; FF_conv-MobileNet, the results when they are combined by convolution; and MSBP-Net-ResNet, the detection and prediction results according to an embodiment of the present invention.

As shown in FIG. 5, the tumor detection accuracy ($\mathrm{ACC}_{BvsC}$) and the differentiation-grade prediction accuracy ($\mathrm{ACC}_{Grade}$) according to an embodiment of the present invention are 99.45% and 86.61%, the highest among the compared methods. The F1 scores for tumor detection ($F_1^{BN}$) and for low-differentiation ($F_1^{WD}$), medium-differentiation ($F_1^{MD}$), and high-differentiation ($F_1^{PD}$) prediction according to an embodiment of the present invention are 0.9930, 0.7399, 0.7881, and 0.8337, showing that the tumor detection and differentiation-grade prediction results according to an embodiment of the present invention are the best.
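For readers unfamiliar with the metrics, accuracy and the F1 score are conventionally computed as sketched below. The counts used here are invented purely to exercise the formulas and are unrelated to the values reported in FIG. 5; the function names are hypothetical, and the description does not state how the reported scores were computed.

```python
def accuracy(correct, total):
    """Fraction of samples classified correctly."""
    return correct / total


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Made-up counts for illustration only.
example_acc = accuracy(correct=9, total=10)
example_f1 = f1_score(tp=8, fp=2, fn=2)
```

A per-class F1 score treats one class as positive and all others as negative when counting true positives, false positives, and false negatives.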

FIG. 6 is a diagram for explaining a feature map extraction method for image classification according to an embodiment of the present invention.

A computing device that performs the feature map extraction method according to an embodiment of the present invention generates multi-scale feature maps for an input image using an artificial neural network (S610), and generates an average feature map by averaging the pixel values of the multi-scale feature maps on a pixel-by-pixel basis (S620). The computing device then calculates the difference between the pixel values of the average feature map and each of the multi-scale feature maps, and calculates, on a pixel-by-pixel basis, a binary pattern code for each of the multi-scale feature maps (S630). Finally, it generates an integrated feature map in which the calculated binary pattern codes are reflected (S640).
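Steps S620 and S630 can be illustrated with a small sketch. It is not the patented implementation: it uses plain Python lists instead of a neural network framework, assumes the feature maps already share one size, and assumes the difference is taken as the map value minus the average value with a bit of 1 assigned when that difference is 0 or more. The names `average_feature_map` and `binary_pattern_bits` and the sample values are hypothetical.

```python
from typing import List

FeatureMap = List[List[float]]  # one 2-D feature map: rows of pixel values


def average_feature_map(maps: List[FeatureMap]) -> FeatureMap:
    """Pixel-wise mean over feature maps of identical size (cf. S620)."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
            for r in range(rows)]


def binary_pattern_bits(maps: List[FeatureMap],
                        avg: FeatureMap) -> List[List[List[int]]]:
    """One bit map per scale: 1 where (map - average) >= 0, else 0 (cf. S630).

    Assumption: the difference is map value minus average value.
    """
    return [[[1 if value - mean >= 0 else 0
              for value, mean in zip(row, avg_row)]
             for row, avg_row in zip(m, avg)]
            for m in maps]


# Three hypothetical 2x2 feature maps, one per scale, already the same size.
scales = [
    [[1.0, 4.0], [2.0, 0.0]],
    [[3.0, 4.0], [2.0, 3.0]],
    [[5.0, 1.0], [2.0, 6.0]],
]
avg = average_feature_map(scales)
bits = binary_pattern_bits(scales, avg)
print(avg)
print(bits)
```

Each pixel thus receives one bit per scale, which records whether that scale responds at or above the cross-scale average at that location.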

In step S640, the computing device may generate the integrated feature map by converting the binary pattern code into a single decimal number for each pixel. Each digit of the binary number constituting the binary pattern code may be determined according to the scale of the corresponding multi-scale feature map. The binary pattern code may be a code in which 1 is assigned when the difference between the pixel values is greater than or equal to 0, and 0 is assigned when the difference is less than 0.
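The conversion in step S640 can be sketched as follows. The text says only that each binary digit is determined by the scale; which scale maps to which digit is not specified here, so this sketch assumes that scale i contributes the digit with weight 2^i. The helper names and the sample differences are hypothetical.

```python
from typing import List


def sign_bit(difference: float) -> int:
    """1 when the pixel-value difference is >= 0, otherwise 0."""
    return 1 if difference >= 0 else 0


def integrated_feature_map(diffs: List[List[List[float]]]) -> List[List[int]]:
    """Pack the per-scale bits at each pixel into one decimal value (cf. S640).

    diffs[i][r][c] is the difference for scale i at pixel (r, c).
    Assumption: scale i contributes the binary digit with weight 2**i.
    """
    rows, cols = len(diffs[0]), len(diffs[0][0])
    return [[sum(sign_bit(d[r][c]) << i for i, d in enumerate(diffs))
             for c in range(cols)]
            for r in range(rows)]


# Hypothetical differences for three scales on a 2x2 grid.
diffs = [
    [[-2.0, 1.0], [0.0, -3.0]],
    [[0.0, 1.0], [0.0, 0.0]],
    [[2.0, -2.0], [0.0, 3.0]],
]
codes = integrated_feature_map(diffs)
print(codes)  # [[6, 3], [7, 6]]
```

With three scales, each pixel of the integrated feature map takes a value from 0 to 7; for example, the bits (0, 1, 1) for scales 0 to 2 give 0·1 + 1·2 + 1·4 = 6.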

The generated integrated feature map may be used for training for image classification or for image classification itself and, in one embodiment, may be used in combination with the average feature map as described above.
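How the two maps are combined is not detailed in this passage. One plausible reading, shown here purely as an assumption, is to stack them as two channels per pixel, scaling the decimal codes to the range 0 to 1 so that both channels have comparable magnitude. The function `combine_maps`, the scaling step, and the sample values are hypothetical.

```python
from typing import List, Tuple


def combine_maps(avg: List[List[float]],
                 codes: List[List[int]],
                 num_scales: int) -> List[List[Tuple[float, float]]]:
    """Stack the average map and the integrated map as two channels per pixel.

    Assumption: codes are divided by the largest possible code
    (2**num_scales - 1) before stacking; the source does not specify this.
    """
    max_code = (1 << num_scales) - 1
    return [[(a, c / max_code) for a, c in zip(avg_row, code_row)]
            for avg_row, code_row in zip(avg, codes)]


# Hypothetical 2x2 average map and integrated map built from three scales.
avg = [[3.0, 3.0], [2.0, 3.0]]
codes = [[6, 3], [7, 6]]
combined = combine_maps(avg, codes, num_scales=3)
print(combined[1][0])  # (2.0, 1.0)
```

The stacked result could then be passed to the classification layers of the network; other combinations, such as element-wise addition, would fit the same description.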

The technical content described above may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiments, or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. A hardware device may be configured to operate as one or more software modules in order to perform the operations of the embodiments, and vice versa.

As described above, the present invention has been described with reference to specific matters such as specific components, and to limited embodiments and drawings. These are provided only to aid a more general understanding of the present invention, and the present invention is not limited to the above embodiments. Those of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description. Therefore, the spirit of the present invention should not be limited to the described embodiments, and not only the claims set out below but also everything that is equal or equivalent to those claims falls within the scope of the spirit of the present invention.

Claims

A learning method using a multi-scale feature map, performed on a computing device, the method comprising:
generating multi-scale feature maps for a training image using an artificial neural network, and transforming the sizes of the multi-scale feature maps to be the same as the size of one of the multi-scale feature maps;
generating an average feature map by averaging pixel values of the transformed multi-scale feature maps on a pixel-by-pixel basis;
calculating a difference between pixel values of the average feature map and each of the transformed multi-scale feature maps, and calculating, on a pixel-by-pixel basis, a binary pattern code for each of the transformed multi-scale feature maps according to the calculated difference; and
performing learning on the training image using the average feature map and the binary pattern code,
wherein performing learning on the training image comprises:
generating an integrated feature map by converting the binary pattern code into a single decimal number for each pixel; and
performing the learning by combining the average feature map and the integrated feature map, and
wherein each digit of the binary number constituting the binary pattern code is determined according to the respective scale of the multi-scale feature maps.

delete

The method of claim 1,
wherein the binary pattern code is a code in which 1 is assigned when the difference between the pixel values is greater than or equal to 0, and 0 is assigned when the difference between the pixel values is less than 0.

delete

The method of claim 1,
wherein performing learning on the training image comprises learning at least one of the presence or absence of a tumor in the training image and the degree of differentiation of the tumor.

An image classification method using a multi-scale feature map, performed on a computing device, the method comprising:
generating multi-scale feature maps for an input image using an artificial neural network, and transforming the sizes of the multi-scale feature maps to be the same as the size of one of the multi-scale feature maps;
generating an average feature map by averaging pixel values of the transformed multi-scale feature maps on a pixel-by-pixel basis;
calculating a difference between pixel values of the average feature map and each of the transformed multi-scale feature maps, and calculating, on a pixel-by-pixel basis, a binary pattern code for each of the transformed multi-scale feature maps according to the calculated difference; and
classifying the input image using the average feature map and the binary pattern code,
wherein classifying the input image comprises:
generating an integrated feature map by converting the binary pattern code into a single decimal number for each pixel; and
classifying the input image by combining the average feature map and the integrated feature map, and
wherein each digit of the binary number constituting the binary pattern code is determined according to the respective scale of the multi-scale feature maps.

delete

The method of claim 7,
wherein the binary pattern code is a code in which 1 is assigned when the difference between the pixel values is greater than or equal to 0, and 0 is assigned when the difference between the pixel values is less than 0.

delete

The method of claim 7,
wherein classifying the input image comprises classifying the input image according to at least one of the presence or absence of a tumor and the degree of differentiation of the tumor.

A feature map extraction method for image classification, performed on a computing device, the method comprising:
generating multi-scale feature maps for an input image using an artificial neural network, and transforming the sizes of the multi-scale feature maps to be the same as the size of one of the multi-scale feature maps;
generating an average feature map by averaging pixel values of the transformed multi-scale feature maps on a pixel-by-pixel basis;
calculating a difference between pixel values of the average feature map and each of the transformed multi-scale feature maps, and calculating, on a pixel-by-pixel basis, a binary pattern code for each of the multi-scale feature maps according to the calculated difference; and
generating an integrated feature map in which the binary pattern code is reflected,
wherein generating the integrated feature map comprises converting the binary pattern code into a single decimal number for each pixel to generate the integrated feature map, and
wherein each digit of the binary number constituting the binary pattern code is determined according to the respective scale of the multi-scale feature maps.

delete

The method of claim 13,
wherein the binary pattern code is a code in which 1 is assigned when the difference between the pixel values is greater than or equal to 0, and 0 is assigned when the difference between the pixel values is less than 0.