KR20170025535A

KR20170025535A - Method of modeling a video-based interactive activity using the skeleton posture dataset

Info

Publication number: KR20170025535A
Application number: KR1020150122086A
Authority: KR
Inventors: 이승룡; 윤용익; 티엔현더
Original assignee: 경희대학교 산학협력단; 숙명여자대학교산학협력단
Priority date: 2015-08-28
Filing date: 2015-08-28
Publication date: 2017-03-08
Also published as: KR101762010B1

Abstract

The present invention relates to a method of modeling video-based interactive activity using a skeleton posture dataset. An interactive activity modeling method according to one embodiment includes: receiving a dataset of two-dimensional skeleton positions extracted from a video; computing position coordinates of an object from the input dataset; calculating tracking features, including the movement speed and movement direction of the object, from the computed position coordinates; determining interacting objects based on the interaction zone corresponding to the object and the calculated tracking features; calculating features from the skeleton dataset of the determined interacting objects; and modeling the calculated features into topics of single human actions and interactive group activities.

Description

METHOD OF MODELING A VIDEO-BASED INTERACTIVE ACTIVITY USING THE SKELETON POSTURE DATASET

The present disclosure relates to a technique for generating a video-based interactive activity model using a skeleton posture dataset.

Although human activity recognition has received growing attention from the computer vision and artificial intelligence communities in recent decades, it remains a challenging problem because of appearance variation, mutual occlusion, and interactions among multiple objects.

Earlier techniques recognized human activity using the motion of human body parts as input features, whereas recent techniques concentrate on collecting low-level features. For example, because of the limitations of image processing techniques, the trend is to collect low-level features such as positional spatio-temporal features rather than to represent the human body with a model such as a skeleton.

US Patent No. 7,366,645 (title: Method of recognition of human motion, vector sequences and speech)

Y. Yang and D. Ramanan, "Articulated human detection with flexible mixtures of parts", Pattern Analysis and Machine Learning, IEEE Transaction on, vol. 36, no. 9, pp. 1775-1788, Sept 2014.

W. Yang, Y. Wang, and G. Mori, "Recognizing Human Actions from Still Images with Latent Poses", in Computer Vision and Pattern Recognition (CVPR), 2010 International Conference on. San Francisco, USA, IEEE, 2010, pp. 2030-2037.

A technique is presented for representing human interactions from video captured by surveillance cameras.

A technique is presented for modeling interactions between interacting human objects within a group or between groups.

One objective is to reduce computational cost by omitting non-interacting objects.

Another objective is to improve the quality of the feature training dataset, because outliers are not included in the dataset.

A further objective is to improve classification accuracy in distinguishing single object action recognition from interactive group activity recognition.

An interactive activity modeling method according to one embodiment includes: receiving a dataset of two-dimensional skeleton positions extracted from a video; computing position coordinates of an object from the input dataset; calculating tracking features, including the movement speed and movement direction of the object, from the computed position coordinates; determining interacting objects based on the interaction zone corresponding to the object and the calculated tracking features; calculating features from the skeleton dataset of the determined interacting objects; and modeling the calculated features into topics of single human actions and interactive group activities.

According to one embodiment, computing the position coordinates of the object includes detecting the position coordinates of the object from the input dataset using four joints of the torso.

According to one embodiment, the interacting objects and the corresponding non-interacting objects are determined using the interaction potential zone and the tracking features.

According to one embodiment, calculating the tracking features, including the movement speed and movement direction of the object, includes extracting the spatio-temporal joint distances of the object and the movement direction between human objects from the skeleton position dataset.

According to one embodiment, the modeling step includes generating, using a modeling algorithm, a probability model for the single human actions and the interactive group activities.

An interacting object identification method according to one embodiment includes: receiving position coordinates of objects; determining, based on the input position coordinates, single interaction potential zones located within a predetermined range of each object; calculating the ratio of the overlapping area for each object based on the determined single interaction potential zones; and identifying each object by comparing the calculated ratio with a threshold assigned to the group ID of that object.

According to one embodiment, determining the single interaction potential zones includes determining them based on the position coordinates of the object and the radius of a circle.

According to one embodiment, the single interaction potential zones and the ratio of the overlapping area are identified for each object.

According to one embodiment, the ratio is compared with a threshold used to determine the group ID of each object.

A method of constructing feature datasets according to one embodiment includes: receiving group IDs; comparing the number of objects in each group corresponding to the group IDs; extracting, in consideration of the compared number of objects, features for at least one of the [...] and [...] coordinates; recognizing the dataset corresponding to the extracted features; and obtaining an intra-object feature dataset and an inter-object feature dataset based on the recognized dataset.

According to one embodiment, extracting the features includes, when there is one object in the group, extracting features for the x=y coordinates.

According to one embodiment, extracting the features includes, when there is more than one object in the group, extracting features for the [...] and [...] coordinates.

According to one embodiment, the method of constructing feature datasets further includes classifying the objects into two groups, non-interacting objects and interacting objects, in consideration of the result of comparing the number of objects.

According to one embodiment, classifying the objects into the two groups of non-interacting objects and interacting objects includes extracting spatio-temporal joint distance and movement direction features for the objects.

According to one embodiment, extracting the features includes extracting at least one of intra-object features for single action recognition, and a dataset and inter-object features for interactive activity recognition.

A probability model generation method according to one embodiment includes: receiving feature datasets; clustering the features in the feature datasets into codewords by applying a K-means clustering algorithm; mapping the clustered intra-object features and inter-object features to codeword histograms of actions and activities; encoding words using a hierarchical model based on the mapped histograms; and outputting a probability model using the encoded words.
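The clustering and histogram-mapping steps of this method can be illustrated with a minimal sketch. It is not the patented implementation: the function names `kmeans` and `codeword_histogram`, the deterministic initialization from the first k points, and the toy feature vectors are all assumptions made for illustration, and the hierarchical topic-model step is not covered.

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means; initial centroids are the first k points (an assumption)."""
    centroids = [tuple(float(v) for v in p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each feature vector to its nearest centroid.
            idx = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[idx].append(p)
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centroids.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids

def codeword_histogram(features, codebook):
    """Count how many feature vectors fall on each codeword (nearest centroid)."""
    hist = [0] * len(codebook)
    for f in features:
        idx = min(range(len(codebook)), key=lambda c: math.dist(f, codebook[c]))
        hist[idx] += 1
    return hist

# Toy 2D feature vectors standing in for intra-/inter-object features.
training = [(0, 0), (10, 10), (0, 1), (10, 11)]
codebook = kmeans(training, k=2)
histogram = codeword_histogram([(0, 0), (1, 1), (9, 9)], codebook)
```

In this sketch the centroids returned by `kmeans` play the role of the codewords, and one histogram would be built per action or activity sample before it is passed to the topic model.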

According to one embodiment, outputting the probability model includes generating the probability model by topic modeling based on the hierarchical model.

An interactive activity modeling program according to one embodiment includes: an instruction set for receiving a dataset of two-dimensional skeleton positions extracted from a video; an instruction set for computing position coordinates of an object from the input dataset; an instruction set for calculating tracking features, including the movement speed and movement direction of the object, from the computed position coordinates; an instruction set for determining interacting objects based on the interaction zone corresponding to the object and the calculated tracking features; an instruction set for calculating features from the skeleton dataset of the determined interacting objects; and an instruction set for modeling the calculated features into topics of single human actions and interactive group activities.

According to embodiments, human interactions can be represented from video captured by surveillance cameras.

According to embodiments, interactions between interacting human objects can be modeled within a group or between groups.

According to embodiments, computational cost can be reduced by omitting non-interacting objects.

According to embodiments, the quality of the feature training dataset can be improved, because outliers are not included in the dataset.

According to embodiments, classification accuracy can be improved in distinguishing single object action recognition from interactive group activity recognition.

FIG. 1 is a flowchart of a modeling method for single action and interactive activity recognition.
FIG. 2 is a diagram illustrating the 14-joint human pose, the determination of the center point, and the distance and movement direction of an object.
FIG. 3 is a diagram illustrating interaction zone determination and object group establishment.
FIG. 4 is a diagram illustrating the determination of four features using joint position information.
FIG. 5 shows the process of interaction zone identification and object group creation.
FIG. 6 shows the process of constructing two feature datasets, an intra-object feature dataset and an inter-object feature dataset.
FIG. 7 shows the process of codebook generation and topic modeling for the two feature datasets.
FIG. 8 is a diagram illustrating an embodiment that maps one feature vector to a histogram of codewords.
FIG. 9 is a diagram showing the hierarchical model for a topic model with a four-level structure.

Hereinafter, embodiments are described in detail with reference to the accompanying drawings. The scope of rights is not, however, restricted or limited by these embodiments. Like reference numerals in the drawings denote like elements.

The terms used in the following description were selected as general and common terms in the relevant technical field, but other terms may exist depending on developments or changes in technology, convention, or the preferences of practitioners. Accordingly, the terms used below should not be understood as limiting the technical idea, but as exemplary terms for describing the embodiments.

In certain cases, some terms were chosen arbitrarily by the applicant, and in such cases their meaning is given in detail in the corresponding part of the description. The terms used below should therefore be understood on the basis of their meaning and the content of the specification as a whole, not simply by their names.

FIG. 1 is a flowchart of a modeling method for single action and interactive activity recognition.

The interactive activity modeling method according to one embodiment receives a dataset of two-dimensional skeleton positions extracted from a video (step 101).

The input data includes human object skeletons together with joint position information.

The output corresponding to the input data is a probability model, based on single object actions and interactive group activities, that is needed to support a classifier. To this end, the method computes the position coordinates of the objects from the input dataset (step 102). That is, pose estimation requires the joint positions to be determined, and step 102 computes the object position coordinates for this purpose.

For example, to compute the position coordinates of an object, the method may detect the position coordinates from the input dataset using four joints of the torso.

According to the work of Yang et al., each human pose comprises 14 joint elements.

The 14 joint elements are described in detail below with reference to FIG. 2.

Next, the method calculates tracking features, including the movement speed and movement direction of the object, from the computed position coordinates (step 103). The method then determines the interacting objects based on the interaction zone corresponding to each object and the calculated tracking features (step 104), and calculates features from the skeleton dataset of the determined interacting objects (step 105).

For example, to calculate the tracking features, the method may extract the spatio-temporal joint distances of the object and the movement direction between human objects from the skeleton position dataset.

The method models the calculated features into topics of single human actions and interactive group activities (step 106), and generates a probability model from the modeling result (step 107).

For example, the method may use a modeling algorithm to generate a probability model for the single human actions and the interactive group activities.

FIG. 2 is a diagram illustrating the 14-joint human pose, the determination of the center point, and the distance and movement direction of an object.

As shown in FIG. 2, the human pose (201) comprises 14 joint elements. To localize the coordinates of an object, the center point must be determined using the four joints of the torso (202).

Specifically, each point shown in FIG. 2 is an object coordinate and can be calculated by [Equation 1].

[Equation 1]

Here, [...] are the joint position coordinates of the torso of human object x appearing in the [...]-th frame.
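As a hedged illustration of the center-point step, the sketch below takes the object coordinate to be the arithmetic mean of the four torso joints. The averaging rule, the joint names in `TORSO_JOINTS`, and the function name `torso_center` are assumptions made for this example; the text only states that four torso joints are used.

```python
from statistics import fmean

# Hypothetical joint names; the text only refers to "four joints of the torso".
TORSO_JOINTS = ("left_shoulder", "right_shoulder", "left_hip", "right_hip")

def torso_center(skeleton):
    """Return the (x, y) mean of the four torso joints of one 2D skeleton."""
    xs = [skeleton[name][0] for name in TORSO_JOINTS]
    ys = [skeleton[name][1] for name in TORSO_JOINTS]
    return (fmean(xs), fmean(ys))

# One skeleton in one frame: joint name -> (x, y) pixel position.
skeleton = {
    "left_shoulder": (0, 0),
    "right_shoulder": (2, 0),
    "left_hip": (0, 4),
    "right_hip": (2, 4),
}
center = torso_center(skeleton)
```

Using only torso joints keeps the center stable when arms and legs move, which is one plausible reason for restricting the computation to the torso.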

The tracking algorithm assumes the movement speed [...] and direction [...] depicted at reference numeral 203 of FIG. 2.

The tracking features from [...] and [...], corresponding to frames [...] and [...], can be calculated from the object coordinates by [Equation 2] and [Equation 3] below.

[Equation 2]

[Equation 3]
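A minimal sketch of the two tracking features follows, assuming that speed is the Euclidean displacement of the object center between consecutive frames divided by the frame interval, and that direction is the angle of that displacement. These definitions, the function name `tracking_features`, and the `dt` parameter are illustrative assumptions rather than the equations of the specification.

```python
import math

def tracking_features(prev_center, curr_center, dt=1.0):
    """Return (speed, direction) of an object center between two frames.

    speed     : displacement magnitude per unit time
    direction : displacement angle in radians, measured from the +x axis
    """
    dx = curr_center[0] - prev_center[0]
    dy = curr_center[1] - prev_center[1]
    speed = math.hypot(dx, dy) / dt
    direction = math.atan2(dy, dx)
    return speed, direction

# Object center in frame t-1 and frame t.
speed, direction = tracking_features((0.0, 0.0), (3.0, 4.0))
```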

What matters in representing interactions is how to identify the objects that interact with other objects in the current scene.

Instead of computing features extracted from all detected objects, performing an identification process before the feature extraction step yields the following advantages.

Computational cost can be reduced by omitting non-interacting objects.

The quality of the feature training dataset can be improved, because outliers are not included in the dataset. This means that single objects are not considered in the detection and recognition of interactions.

Classification accuracy can be improved in distinguishing single object action recognition from interactive group activity recognition.

In the present invention, an interaction potential zone (IPZ) algorithm may be used.

The IPZ algorithm is described in detail with reference to FIG. 3.

FIG. 3 is a diagram illustrating interaction zone determination and object group establishment.

As shown in FIG. 3, the IPZ is the basic unit required to detect the group interaction zone (GIZ).

Each object has an operating zone. The operating zone surrounds the object and is defined as a circle of radius [...].

Thus, based on the object center coordinates (501), this zone is identified by the radius [...].

Next, the ratio (503) of the overlapping area between the computed IPZs is calculated, as at reference numeral 301 of FIG. 3. This ratio (503) is the ratio of the overlapping area to the entire area covered by the interacting human objects, and can be calculated by [Equation 4].

[Equation 4]

Here, [...] is the IPZ of the [...] human object.

[...] is the number of persons with overlapping IPZs.

If [...], there is only one object standing alone, as at reference numeral 301 of FIG. 3, and [...] is the result for this parameter.
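For the two-object case, the overlap ratio described here can be sketched as intersection area over union area of two circular zones. This is an assumption-laden illustration: the function name `circle_overlap_ratio` is hypothetical, only two circles are handled, and the general N-object form of the specification's equation is not reproduced.

```python
import math

def circle_overlap_ratio(c1, c2, r1, r2):
    """Intersection area divided by union area of two circles (radii > 0)."""
    d = math.dist(c1, c2)
    a1, a2 = math.pi * r1 ** 2, math.pi * r2 ** 2
    if d >= r1 + r2:
        inter = 0.0                      # disjoint zones
    elif d <= abs(r1 - r2):
        inter = min(a1, a2)              # one zone lies inside the other
    else:
        # Standard lens-area formula for two partially overlapping circles.
        alpha = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
        beta = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
        kite = 0.5 * math.sqrt(
            (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
        )
        inter = r1 * r1 * alpha + r2 * r2 * beta - kite
    return inter / (a1 + a2 - inter)

# Two persons one radius apart, each with a zone of radius 1.
ratio = circle_overlap_ratio((0.0, 0.0), (1.0, 0.0), 1.0, 1.0)
```

The ratio is 0 for zones that do not touch and 1 for identical zones, so a single threshold between those extremes can separate weak from strong overlap.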

A set of human objects is designated as interacting through a comparison operation that can be performed by [Equation 5].

[Equation 5]

Here, [...] is a threshold that controls how readily a set of human objects is placed in the same group.

Group assignment can be described in terms of the following three situations.

If the current object stands alone with no overlapping area, corresponding to [...], a new group identifier (GID, GroupID) is assigned.

If the current object and another object overlap over only a small area, according to [...], the two objects are assigned new, different group identifiers (GID, GroupID).

If the area in which the two objects currently overlap satisfies the condition [...], the two objects are assigned the same group identifier (GID, GroupID).

The output is the set of group identifiers (GID, GroupID) of the objects.
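The three situations above can be sketched as a grouping pass over pairwise overlap ratios. The sketch assumes that two objects share a GID when their ratio is greater than or equal to the threshold, and it uses a union-find structure so that chains of overlapping objects end up in one group; the function name `assign_group_ids`, the `overlap_ratio` callback, and the integer GIDs are all hypothetical choices.

```python
def assign_group_ids(n_objects, overlap_ratio, threshold):
    """Return one integer GID per object.

    overlap_ratio(i, j) -> float in [0, 1] for objects i < j.
    Objects whose ratio reaches the threshold are merged into one group;
    isolated or weakly overlapping objects each keep their own GID.
    """
    parent = list(range(n_objects))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n_objects):
        for j in range(i + 1, n_objects):
            if overlap_ratio(i, j) >= threshold:
                parent[find(j)] = find(i)

    # Relabel group representatives as consecutive GIDs 0, 1, 2, ...
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n_objects)]

# Precomputed pairwise ratios for three objects (illustrative values).
ratios = {(0, 1): 0.4, (0, 2): 0.05, (1, 2): 0.0}
gids = assign_group_ids(3, lambda i, j: ratios.get((i, j), 0.0), threshold=0.2)
```

Here objects 0 and 1 overlap strongly and share a GID, while object 2 overlaps object 0 only slightly and receives its own GID.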

However, there are special cases in assigning group identifiers (GID, GroupID).

For example, as shown at reference numeral 301 of FIG. 3, the object with GID=A tends to move toward the object group with GID=B.

In such a situation, the group identifier (GID, GroupID) needs to be assigned accordingly.

In this situation, dynamic objects must be considered, that is, objects for which F_v^x(t-1, t) ≥ δ, where δ is the speed threshold used to identify whether the current object is moving or stationary.

Given the speed and direction of movement from the current position, the position of the object at the next time (t+1) is calculated as in [Equation 6].

[Equation 6]

If the next position lies within the IPZ of another group, the group identifier (GID, GroupID) of the object, [...], is changed to the group identifier of the next destination group, identified as [...], as in [Equation 7].

[Equation 7]

The case indicated by reference numeral 302 is as shown in FIG. 3.

If the next position lies outside the IPZs of the other groups, the group identifier of the object, identified as [...], is changed to a new group identifier as in [Equation 8]. The new group identifier does not coincide with any existing GID (denoted [...]).

[Equation 8]

This can be applied to the situation indicated by reference numeral 302 of FIG. 3.
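The reassignment logic for moving objects can be sketched as follows. The sketch assumes a linear motion model (the next position is the current position plus speed times the unit direction vector), circular zones of a common radius around the other objects, and integer GIDs; the names `predict_position` and `reassign_gid` and the dictionary layout are hypothetical.

```python
import math

def predict_position(center, speed, direction, dt=1.0):
    """Extrapolate the object center one time step ahead along its direction."""
    return (center[0] + speed * dt * math.cos(direction),
            center[1] + speed * dt * math.sin(direction))

def reassign_gid(obj, others, radius, speed_threshold):
    """Return the GID the object should carry at the next time step.

    obj    : dict with 'center', 'speed', 'direction', 'gid'
    others : list of dicts with 'center' and 'gid'
    """
    if obj["speed"] < speed_threshold:
        return obj["gid"]                      # treated as stationary: keep GID
    nxt = predict_position(obj["center"], obj["speed"], obj["direction"])
    for other in others:
        inside = math.dist(nxt, other["center"]) <= radius
        if other["gid"] != obj["gid"] and inside:
            return other["gid"]                # entering another group's zone
    existing = {o["gid"] for o in others} | {obj["gid"]}
    return max(existing) + 1                   # leaving: fresh, unused GID

walker = {"center": (0.0, 0.0), "speed": 5.0, "direction": 0.0, "gid": 0}
group_b = [{"center": (5.5, 0.0), "gid": 1}]
new_gid = reassign_gid(walker, group_b, radius=1.0, speed_threshold=1.0)
```

In the usage example the walker is predicted to land 0.5 units from a member of the other group, inside that member's zone, so it takes over that group's GID.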

To describe the relationships among the joints of one object for single object action recognition, and the relationships between the joints of two objects for interactive group activity recognition, the present invention extracts the distances and directions between joint configurations in the space-time dimension.

FIG. 4 is a diagram illustrating the determination of four features using joint position information.

Specifically, the spatio-temporal joint features are calculated from the skeleton positions, as indicated by reference numerals 401 to 404.

The spatial joint distance (401) can be defined as the Euclidean distance between all pairs of joints of two persons within a frame. That is, the spatial joint distance (401) captures the interaction pose as the distance between two joints, using [Equation 9].

[Equation 9]

Here, [...] and [...] are the 2D position coordinates of joints i and j of human objects [...] and [...] at time [...], corresponding to [...].

This can be measured for one person ([...]) or between persons ([...]).

The temporal joint distance is defined as the Euclidean distance between all pairs of joints of two persons in different frames. That is, based on [Equation 10], the temporal joint distance measures the distance between interacting limb pairs at times [...] and [...], corresponding to frames [...] and [...].

[Equation 10]

This can be measured for one person ([...]) or between two persons ([...]).
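Both joint-distance features reduce to pairwise Euclidean distances between two sets of 2D joints, as the following sketch shows. The function name `joint_distances` and the two-joint toy skeletons are assumptions; a real skeleton in this document would have 14 joints per person.

```python
import math

def joint_distances(joints_a, joints_b):
    """Euclidean distance for every pair (i, j): joint i of A, joint j of B."""
    return [math.dist(p, q) for p in joints_a for q in joints_b]

# Toy skeletons with two joints each (x, y), for persons x and y.
person_x_prev = [(0.0, 0.0), (0.0, 2.0)]   # person x in the earlier frame
person_x_curr = [(3.0, 4.0), (3.0, 6.0)]   # person x in the current frame
person_y_curr = [(6.0, 8.0), (6.0, 10.0)]  # person y in the current frame

# Spatial joint distance: both skeletons taken from the same frame.
spatial_inter = joint_distances(person_x_curr, person_y_curr)  # between persons
spatial_intra = joint_distances(person_x_curr, person_x_curr)  # single person

# Temporal joint distance: skeletons taken from different frames.
temporal_intra = joint_distances(person_x_prev, person_x_curr)
temporal_inter = joint_distances(person_x_prev, person_y_curr)
```

Passing the same person twice gives the single-person variant, and passing two different persons gives the between-person variant, so one helper covers all four combinations.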

The spatial joint motion (403) captures the angle between two joints in the interaction pose, and can be calculated based on [Equation 11].

[Equation 11]

This can be measured for one person ([...]) or between two persons ([...]).

The temporal joint motion (404) can be defined between all pairs of joints; it measures the angle of interacting limb pairs at [...] and [...] of the frames corresponding to [...] and [...]. For example, the angle of an interacting limb pair can be measured through [Equation 12].

[Equation 12]

The angle of an interacting limb pair can be measured both for one person ([...]) and between two persons ([...]).
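The two motion features can be sketched as pairwise angles between joints, assuming that the angle of a joint pair is the orientation of the vector from the first joint to the second. That definition, the function name `joint_angles`, and the toy coordinates are assumptions made for illustration only.

```python
import math

def joint_angles(joints_a, joints_b):
    """Orientation (radians) of the vector from each joint of A to each joint of B."""
    return [
        math.atan2(q[1] - p[1], q[0] - p[0])
        for p in joints_a
        for q in joints_b
    ]

person_x_prev = [(0.0, 0.0)]               # one joint of person x, earlier frame
person_x_curr = [(1.0, 0.0)]               # same joint, current frame
person_y_curr = [(1.0, 1.0), (0.0, 2.0)]   # two joints of person y, current frame

# Spatial joint motion: both skeletons from the same frame.
spatial_motion = joint_angles(person_x_curr, person_y_curr)

# Temporal joint motion: skeletons from different frames.
temporal_motion = joint_angles(person_x_prev, person_y_curr)
```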

도 5는 상호 작용 존 식별과 오브젝트 그룹 생성의 프로세스를 보여준다.Figure 5 shows the process of interaction zone identification and object group creation.

도 5는 오브젝트 중심 좌표를 수집한다(단계 501). 상호 오브젝트 식별하기 위해서는 오브젝트의 위치 좌표들을 입력 받는다.FIG. 5 collects the object center coordinates (step 501). In order to identify a mutual object, coordinates of an object are input.

Next, to identify interaction zones and create object groups, a single-object zone is established (step 502) and an overlap ratio is calculated (step 503).

Based on the input position coordinates, single interaction potential zones located within a predetermined range of each object are determined, and the ratio of the overlapping region is calculated for each object from those zones. For example, a single interaction potential zone can be determined from the position coordinates of the object and the radius of a circle.

The objects are then identified by comparing the calculated ratio with the threshold assigned to each group ID. For example, the ratio of the overlapping region to the single interaction potential zones is identified for each object, and that ratio is compared with the threshold used to determine the group ID of each object.

Next, the radius is taken into account (step 504).

Based on the object center coordinates, the zone is identified as a circle of the given radius.

The overlap ratio of the calculated IPZs (503) is the ratio of the overlapping region to the whole region covered by the interacting human objects.
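As a rough illustration of this ratio, the sketch below treats each zone as a circle of the same radius around an object center and divides the intersection area by the area of the union. The equal-radius assumption, the function names, and the use of the standard circle-circle intersection (lens) formula are choices made here for illustration, not details confirmed by the text.

```python
import math

def circle_overlap_area(c1, c2, r1, r2):
    """Area of the intersection of two circles; 0.0 if they do not overlap."""
    d = math.dist(c1, c2)
    if d >= r1 + r2:
        return 0.0                            # disjoint circles
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2     # one circle lies inside the other
    a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
    a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
    a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - a3

def overlap_ratio(c1, c2, radius):
    """Overlapping area divided by the whole area covered by both zones.
    Assumes radius > 0 and the same radius for both objects."""
    inter = circle_overlap_area(c1, c2, radius, radius)
    union = 2 * math.pi * radius ** 2 - inter
    return inter / union

partial = overlap_ratio((0.0, 0.0), (1.0, 0.0), 1.0)
```

The ratio is 0 for objects whose zones do not touch and 1 for objects at the same position, which makes it easy to compare against a threshold.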

If the current object stands alone with no overlapping region, it is assigned a new group identifier (GID, GroupID) (step 506).

If the current object and another object overlap only over a small area, the two objects are assigned new, different group identifiers (step 505).

If the region over which two objects currently overlap satisfies the overlap condition, they are assigned the same group identifier through step 506.

The output is the set of group identifiers (GIDs) of the objects (step 507).
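The grouping rule can be sketched as follows. The actual threshold conditions are not reproduced in this text, so the sketch assumes a single hypothetical threshold `tau`: two objects share a GID when their overlap ratio is at least `tau`, and otherwise keep separate GIDs. The union-find bookkeeping and the `ratios` dictionary layout are likewise illustrative choices.

```python
def assign_group_ids(num_objects, ratios, tau):
    """Return one GID per object.

    ratios maps an object pair (i, j) to its precomputed overlap ratio.
    tau is a hypothetical threshold: pairs with ratio >= tau are merged.
    """
    parent = list(range(num_objects))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), ratio in ratios.items():
        if ratio >= tau:
            parent[find(i)] = find(j)

    # Relabel representatives as consecutive GIDs in order of first appearance.
    gid_of_root = {}
    gids = []
    for i in range(num_objects):
        root = find(i)
        gids.append(gid_of_root.setdefault(root, len(gid_of_root)))
    return gids

# Objects 0 and 1 overlap strongly; 1 and 2 overlap only slightly; 3 stands alone.
gids = assign_group_ids(4, {(0, 1): 0.4, (1, 2): 0.05, (2, 3): 0.0}, tau=0.2)
```

With these inputs, objects 0 and 1 end up in one group while objects 2 and 3 each receive their own GID.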

For example, as shown at reference numeral 301 in FIG. 3, the object with GID = A tends to move toward the object group with GID = B.

FIG. 6 shows the process of constructing two feature data sets: an intra-object feature data set and an inter-object feature data set.

FIG. 6 details the feature extraction of step 105 in FIG. 1.

In a method of constructing a feature data set according to an embodiment, a group ID is first received (step 601).

That is, to extract the feature data sets, objects must be divided into two groups on the basis of group ID: non-interacting objects and interacting objects. Spatio-temporal joint distance and motion direction features are then extracted for the objects accordingly. To this end, at least one of the intra-object features for single-action recognition and the inter-object features for interactive activity recognition is extracted.

Next, the number of objects in each group corresponding to a group ID is checked to determine whether it is two or more (step 602).

Taking the number of objects into account, the method extracts features for at least one of the two coordinate conditions.

If step 602 determines that the group does not contain two or more objects, that is, the group contains a single object, features are extracted for the x = y coordinates (step 603).

If step 602 determines that the group contains two or more objects, features can be extracted for both coordinate conditions (step 604).

The method can then recognize the data set corresponding to the extracted features (step 605).

These features can be used to recognize a single action without examining interaction with other objects.

If the current object has the same group ID as other objects, its group consists of more than one object. Its features can be computed under both conditions, as in steps 603 and 604.

Specifically, if only one object is in the group, the features must be computed under the single-object condition.

In addition, an intra-object feature data set (step 606) and an inter-object feature data set (step 607) can be obtained from the recognized data set.
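The branching in steps 602 to 607 can be sketched as a pairing step. The sketch assumes that the intra-object case corresponds to pairing an object with itself (x = y) and the inter-object case to pairing two distinct objects of the same group (x ≠ y, an assumption, since the second condition is not reproduced in this text); `feature_pairs` is a hypothetical name.

```python
from collections import defaultdict
from itertools import combinations

def feature_pairs(gids):
    """Split objects into intra-object pairs (x, x) and inter-object pairs (x, y).

    gids[i] is the group ID of object i. Every object contributes an
    intra-object pair; only groups with two or more members contribute
    inter-object pairs.
    """
    groups = defaultdict(list)
    for obj, gid in enumerate(gids):
        groups[gid].append(obj)

    intra = [(obj, obj) for obj in range(len(gids))]
    inter = []
    for members in groups.values():
        if len(members) >= 2:                  # the check of step 602
            inter.extend(combinations(members, 2))
    return intra, inter

intra, inter = feature_pairs([0, 0, 1, 2])
```

Joint distance and joint motion features would then be computed once per pair, feeding the intra-object data set from `intra` and the inter-object data set from `inter`.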

The intra-object feature data set comprises a spatial joint distance feature subset, a temporal joint distance feature subset, a spatial joint motion feature subset, and a temporal joint motion feature subset.

Each of these four subsets is denoted by its own expression.

The feature vector extracted from a single human object can be expressed as follows.

The inter-object feature data set may likewise comprise the following components: spatial joint distance, temporal joint distance, spatial joint motion, and temporal joint motion, each denoted by its own expression.

The feature vector extracted from interacting human objects can be expressed as follows.

The two feature data sets are collected frame by frame from the input videos and arranged as two-dimensional matrices. One of the two is the intra-object feature data set and the other is the inter-object feature data set.

FIG. 7 shows the process of codebook generation and topic modeling for the two feature data sets.

FIG. 7 details the modeling of step 106 in FIG. 1.

A dual-structure model comprising the intra-object feature data set (701) and the inter-object feature data set (702) can be used to generate the probability model.

The model is developed under the "bag-of-words" assumption, that is, the assumption of the Pachinko Allocation Model.

The statistical analysis can be based on histograms of word co-occurrence. To support this model, a codebook is built by K-means clustering: the K-means clustering algorithm is applied to cluster the features in the feature data sets into codewords (step 703).
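A minimal sketch of the codebook step follows, using a plain Lloyd-style K-means written with the standard library only. The seeding strategy, iteration cap, and function names are assumptions for illustration; the text does not specify how K-means is initialized or how many codewords are used.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda c: squared_distance(point, centers[c]))

def kmeans_codebook(features, k, iterations=50, seed=0):
    """Cluster feature vectors into k codewords and return the cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(features, k)]   # seed with k distinct samples
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in features:
            clusters[nearest(point, centers)].append(point)
        new_centers = []
        for c, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                new_centers.append([sum(m[d] for m in members) / len(members)
                                    for d in range(dim)])
            else:
                new_centers.append(centers[c])              # keep the center of an empty cluster
        if new_centers == centers:                          # assignments have stabilized
            break
        centers = new_centers
    return centers

codebook = kmeans_codebook([(0, 0), (0, 1), (10, 10), (10, 11)], k=2)
```

Each returned center plays the role of one codeword; a production system would more likely rely on an established clustering library than on this hand-written loop.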

Next, the clustered intra-object features (704) and inter-object features (705) are mapped to codeword histograms of actions and activities.

The words are then encoded using a hierarchical model based on the mapped histograms (step 706).

Finally, a probability model is output from the encoded words: a probability model for actions (step 707) or a probability model for activities (step 708).

To output the probability model, the method generates it through topic modeling based on the hierarchical model.

FIG. 8 illustrates an embodiment in which one feature vector is mapped to a histogram of codewords.

It corresponds to mapping an intra-object vector to a histogram.

That is, the intra-object vector can be mapped to a histogram expressed as word counts.
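The mapping to a histogram of word counts can be sketched as nearest-codeword quantization. The sketch assumes Euclidean distance and a codebook given as a list of center vectors; `codeword_histogram` is a hypothetical name.

```python
def codeword_histogram(features, codebook):
    """Count how many feature vectors fall on each codeword.

    Each feature vector is assigned to its nearest codeword (squared
    Euclidean distance), and the histogram holds one count per codeword.
    """
    histogram = [0] * len(codebook)
    for feature in features:
        index = min(
            range(len(codebook)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(feature, codebook[c])),
        )
        histogram[index] += 1
    return histogram

hist = codeword_histogram([(0, 0), (1, 1), (9, 9)], [[0.0, 0.5], [10.0, 10.5]])
```

The resulting count vector is the bag-of-words representation that a topic model consumes.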

FIG. 9 shows the hierarchical model for a topic model with a four-level structure.

To learn and recognize on the basis of the "bag-of-words" model, the model should build on Latent Dirichlet Allocation (LDA) in a flexible and expressive form such as the Pachinko Allocation Model (Li et al. 2006).

FIG. 9 shows a four-level hierarchical structure: the bottom level consists of N action words or M interaction words, the first level consists of n₁ action sub-topics and m₁ interaction sub-topics, the second level consists of n₂ action sub-topics and m₂ interaction sub-topics, and the top level consists of a single root.

A full description of this model is given in Li (2006).
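To make the four-level structure concrete, the sketch below samples one word by walking from the root through a second-level sub-topic and a first-level sub-topic down to a word. It is a deliberate simplification: the distributions are fixed tables supplied by the caller, whereas the full Pachinko Allocation Model draws them from Dirichlet priors. All names are hypothetical.

```python
import random

def sample_word(root_dist, level2_to_level1, level1_to_word, rng):
    """Sample one word index by descending root -> level 2 -> level 1 -> word.

    root_dist[s]           weight of second-level sub-topic s under the root
    level2_to_level1[s][t] weight of first-level sub-topic t under sub-topic s
    level1_to_word[t][w]   weight of word w under first-level sub-topic t
    """
    s = rng.choices(range(len(root_dist)), weights=root_dist)[0]
    t = rng.choices(range(len(level2_to_level1[s])), weights=level2_to_level1[s])[0]
    return rng.choices(range(len(level1_to_word[t])), weights=level1_to_word[t])[0]

rng = random.Random(0)
root_dist = [0.5, 0.5]
level2_to_level1 = [[0.9, 0.1], [0.2, 0.8]]
level1_to_word = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
words = [sample_word(root_dist, level2_to_level1, level1_to_word, rng) for _ in range(20)]
```

Training the model means estimating these tables from the codeword histograms, which this sketch does not attempt.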

Consequently, the present invention reduces computational cost by omitting non-interacting objects. Because outliers are not included in the data set, the quality of the feature training data set improves, and classification accuracy improves when distinguishing single object action recognition from interactive group activity recognition.

The method according to an embodiment of the present invention may be implemented as program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may contain program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the present invention has been described with reference to limited embodiments and drawings, it is not limited to those embodiments, and a person of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims below and their equivalents.

Claims

A method of modeling interactive activity, implemented at least temporarily by a computer, the method comprising:
Receiving a data set of two-dimensional skeleton positions extracted from a video;
Calculating position coordinates of an object from the input data set;
Calculating tracking features, including a movement speed and a movement direction of the object, from the calculated position coordinates;
Determining interacting objects based on an interaction zone corresponding to the object and the calculated tracking features;
Calculating features from the skeleton data set for the determined interacting objects; and
Modeling the calculated features into topics of single human actions and interactive group activities.

The method according to claim 1,
Wherein calculating the position coordinates of the object comprises:
Detecting the position coordinates of the object from the input data set using four joints of the torso.

The method according to claim 1,
Wherein the interacting objects and the non-interacting objects are determined through the interaction potential zone and the tracking features.

The method according to claim 1,
Wherein calculating the tracking features including the movement speed and the movement direction of the object comprises:
Extracting spatio-temporal joint distances and motion directions between human objects from the skeleton position data set.

The method according to claim 1,
Wherein the modeling comprises:
Generating a probability model for the single human actions and the interactive group activities using a modeling algorithm.

A method of identifying interacting objects, implemented at least temporarily by a computer, the method comprising:
Receiving position coordinates of objects;
Determining single interaction potential zones located within a predetermined range of each object based on the input position coordinates;
Calculating a ratio of the overlapping region for each object based on the determined single interaction potential zones; and
Identifying the objects by comparing the calculated ratio with a threshold assigned to the group ID of each object.

The method according to claim 6,
Wherein determining the single interaction potential zones comprises:
Determining the single interaction potential zones based on the position coordinates of the object and the radius of a circle.

The method according to claim 6,
Wherein the ratio of the overlapping region to the single interaction potential zones is identified for each object.

The method according to claim 6,
Wherein the ratio is compared with a threshold for determining the group ID of each object.

A method of constructing a feature data set, implemented at least temporarily by a computer, the method comprising:
Receiving a group ID;
Comparing the number of objects in each group corresponding to the group ID;
Extracting, in consideration of the compared number of objects, features for at least one of the two coordinate conditions;
Recognizing a data set corresponding to the extracted features; and
Acquiring an intra-object feature data set and an inter-object feature data set based on the recognized data set.

The method according to claim 10,
Wherein extracting the features comprises:
Extracting features for the x = y coordinates if there is one object in the group.

The method according to claim 10,
Wherein extracting the features comprises:
Extracting features for both coordinate conditions if there is more than one object in the group.

The method according to claim 10,
Further comprising classifying the objects into two groups, non-interacting objects and interacting objects, in consideration of the result of comparing the number of objects.

The method according to claim 13,
Wherein classifying the objects into the two groups comprises:
Extracting spatio-temporal joint distance and motion direction features for the objects.

The method according to claim 10,
Wherein extracting the features comprises:
Extracting at least one of intra-object features for single-action recognition and inter-object features for interactive activity recognition.

A method of generating a probability model, implemented at least temporarily by a computer, the method comprising:
Receiving feature data sets;
Clustering features in the feature data sets into codewords by applying a K-means clustering algorithm;
Mapping the clustered intra-object features and inter-object features to codeword histograms of actions and activities;
Encoding the words using a hierarchical model based on the mapped histograms; and
Outputting a probability model using the encoded words.

The method according to claim 16,
Wherein outputting the probability model comprises:
Generating the probability model through topic modeling based on the hierarchical model.

A computer-readable recording medium having recorded thereon a program for carrying out the method according to any one of claims 1 to 17.

An interactive activity modeling program stored on a recording medium and executed on a computing system, the program comprising:
An instruction set for receiving a data set of two-dimensional skeleton positions extracted from a video;
An instruction set for calculating position coordinates of an object from the input data set;
An instruction set for calculating tracking features, including a movement speed and a movement direction of the object, from the calculated position coordinates;
An instruction set for determining interacting objects based on an interaction zone corresponding to the object and the calculated tracking features;
An instruction set for calculating features from the skeleton data set for the determined interacting objects; and
An instruction set for modeling the calculated features into topics of single human actions and interactive group activities.