KR102706053B1

KR102706053B1 - A variable selection device based on feature selection algorithm and a model management system

Info

Publication number: KR102706053B1
Application number: KR1020230196385A
Authority: KR
Inventors: 이창훈; 고은수; 김흥민; 김영광; 김민준
Original assignee: 오케스트로 주식회사
Priority date: 2023-12-29
Filing date: 2023-12-29
Publication date: 2024-09-12
Anticipated expiration: 2043-12-29

Abstract

A variable selection device according to one embodiment of the present invention determines a representative variable, that is, a variable representing the characteristics of a cloud server. The device may include: a selection receiving module that receives operation information, which is information generated while the cloud server operates and which describes each of a plurality of variables; a detection module that analyzes, based on the operation information and according to a predetermined judgment method, the causality between a target variable, which is a predetermined variable, and a comparison variable, which is any variable other than the target variable; and a determination module that determines, as the representative variables, the target variable and the comparison variable whose causality satisfies a predetermined causality condition.

Description

{A VARIABLE SELECTION DEVICE BASED ON FEATURE SELECTION ALGORITHM AND A MODEL MANAGEMENT SYSTEM}

The present invention relates to a variable selection device that uses a feature selection algorithm to select the variables used in a machine learning model, and to a model management system including the same.

With the introduction of the cloud, many services have come to benefit from efficient handling of network load, rapid recovery from failures, and flexible server capacity planning. At the same time, infrastructure managers must now understand and respond to not only individual servers but also the complex environment of cloud systems. Because the judgment of individual managers and rule-based processes are of limited use for this, the concept of AIOps (Artificial Intelligence for IT Operations) has emerged. AIOps applies artificial intelligence to IT operations, and it is mainly directed toward optimizing data centers efficiently, increasing customer satisfaction, and raising development productivity. From a research perspective as well, algorithms that estimate the appropriate capacity of resources and enable proactive responses to failures are regarded as key elements.

To maximize the performance of such algorithms, it is important to analyze the various kinds of data collected in a data center (metrics, logs, traces) and to predict the values they will take in the future. By predicting resource usage (metrics), the appropriate capacity of resources can be estimated for several months ahead while taking user usage patterns into account. In addition, by predicting resource usage and the patterns of logs likely to occur, failure-related patterns can be detected in advance and the root cause analyzed, shortening the resolution time. However, as environments have grown more complex with the adoption of the cloud, data pipelines have also become more sophisticated and complex, so training a machine learning model for failure detection or future workload prediction requires large amounts of data, time, and resources. As a result, although accuracy has improved, training machine learning models has caused server overload.

The present invention is intended to solve the above problem and provides a model management system including a variable selection device that selects only the key variables a model requires and a timing determination device that determines when the model is trained.

In addition, the predetermined judgment method may be a method of judging the causality between two variables using the Granger causality test.

In addition, the device may further include a search module that divides the operation information of the target variable into predetermined time ranges, analyzes it, and detects whether an event has occurred in the operation information of the target variable. The detection module may then analyze causality by comparing, according to the predetermined judgment method, the operation information of the target variable in which an event has occurred with the operation information of the comparison variable in the same time range.

In addition, the predetermined causality condition may be the condition that the number of predetermined time ranges in which causality is recognized is the largest.
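As a non-limiting illustration, the causality condition described above (choosing the comparison variable for which causality is recognized in the largest number of time ranges) could be sketched as follows. The dictionary layout, the variable names, and the handling of ties are assumptions made for this example and are not specified in the text.

```python
def select_representatives(target, causal_windows):
    """Return the target variable plus the comparison variable(s) whose
    causality with the target was recognized in the most time ranges.

    causal_windows maps each comparison variable name (hypothetical names)
    to a list of booleans, one per event time range, where True means that
    causality was recognized in that range.
    """
    counts = {name: sum(flags) for name, flags in causal_windows.items()}
    best = max(counts.values(), default=0)
    if best == 0:
        # No comparison variable showed causality in any time range.
        return [target]
    # Assumption: ties are all kept, in alphabetical order.
    winners = sorted(name for name, count in counts.items() if count == best)
    return [target] + winners


example = {
    "memory_usage": [True, True, False, True],
    "disk_read_bytes": [False, True, False, False],
    "network_in_bytes": [True, False, False, False],
}
representatives = select_representatives("cpu_usage", example)
print(representatives)
```

Here `memory_usage` is recognized in three of four time ranges, so it is selected together with the target variable.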

In addition, the device may further include a preprocessing module that preprocesses the operation information received by the selection receiving module according to a predetermined preprocessing method. The detection module analyzes the causality between the target variable and the comparison variable based on the preprocessed operation information, and the predetermined preprocessing method may be a method of removing operation information that is categorical or whose variance is at or below a predetermined criterion.

In addition, the predetermined preprocessing method may be a method of removing the operation information of a variable whose correlation coefficient with the target variable is at or above a predetermined criterion.

A variable selection method according to one embodiment of the present invention is implemented by a variable selection device and determines a representative variable, which is a variable representing the characteristics of a cloud server. The method may include: a step in which a selection receiving module receives operation information, which is information generated while the cloud server operates and which describes each of a plurality of variables, the variables including a target variable, which is a predetermined variable, and a comparison variable, which is any variable other than the target variable; a step in which a search module divides the operation information of the target variable into predetermined time ranges, analyzes it, and detects whether an event has occurred in the operation information of the target variable; a step in which a detection module analyzes causality by comparing, according to a predetermined judgment method, the operation information of the target variable in which an event has occurred with the operation information of the comparison variable in the same time range; and a step in which a determination module determines, as the representative variables, the target variable and the comparison variable whose causality satisfies a predetermined causality condition.

A model management system according to one embodiment of the present invention manages a machine learning model running on a cloud server and includes: a variable selection device that determines a representative variable, which is a variable representing the characteristics of the cloud server, based on operation information, which is information generated while the cloud server operates and which describes each of a plurality of variables; and a timing determination device that determines when the machine learning model is retrained, based on the operation information of the representative variable. The variable selection device may include a selection receiving module that receives the operation information, a detection module that analyzes, based on the operation information and according to a predetermined judgment method, the causality between a target variable, which is a predetermined variable, and a comparison variable, which is any variable other than the target variable, and a determination module that determines, as the representative variables, the target variable and the comparison variable whose causality satisfies a predetermined causality condition.

In addition, the timing determination device includes a timing receiving module that receives the operation information as a time series, a seasonal module that performs a time-series decomposition of the operation information and then calculates its seasonality strength according to a predetermined strength analysis method, a pattern module that analyzes the pattern of the operation information using a predetermined pattern analysis method, and a judgment module that determines, by a predetermined decision method chosen according to the magnitude of the seasonality strength, when the machine learning model running on the cloud server is retrained. The predetermined decision method may be a method of determining the retraining time by considering the seasonality or the pattern of the operation information.

In addition, the predetermined decision method may include a first decision method, which determines the retraining time by considering the seasonality of the operation information when its seasonality strength is at or above a first reference value, and a second decision method, which determines the retraining time by considering the pattern of the operation information when its seasonality strength is below a second reference value that is lower than the first reference value.
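As a non-limiting illustration, the choice between the first and second decision methods could be sketched as follows. The function name, the return labels, and the example reference values 0.7 and 0.3 are assumptions made for this sketch. This passage does not state what happens when the seasonality strength falls between the two reference values, so the sketch only reports that case as unspecified.

```python
def choose_decision_method(seasonality_strength, first_ref, second_ref):
    """Pick a decision method from the seasonality strength.

    first_ref and second_ref are the first and second reference values;
    the second must be lower than the first, as stated in the text.
    """
    if second_ref >= first_ref:
        raise ValueError("second_ref must be lower than first_ref")
    if seasonality_strength >= first_ref:
        return "first"  # consider the seasonality of the operation information
    if seasonality_strength < second_ref:
        return "second"  # consider the pattern of the operation information
    # Between the two reference values: not specified in this passage.
    return "unspecified"


# Hypothetical reference values chosen only for the example.
for strength in (0.85, 0.10, 0.50):
    print(strength, choose_decision_method(strength, first_ref=0.7, second_ref=0.3))
```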

A variable selection device according to the present invention and a model management system including the same can maximize the efficiency with which server resources are used.

In addition, server downtime can be prevented.

In addition, the performance of the model can be improved.

However, the effects of the present invention are not limited to those described above, and effects not mentioned here will be clearly understood from this specification and the attached drawings by a person having ordinary skill in the art to which the present invention pertains.

FIG. 1 is a schematic diagram of a model management system according to one embodiment of the present invention.
FIG. 2 is a configuration diagram of a model management system according to one embodiment of the present invention.
FIG. 3 is a configuration diagram of a variable selection device of a model management system according to one embodiment of the present invention.
FIG. 4 is a flowchart of a variable selection method performed by a variable selection device according to one embodiment of the present invention.
FIG. 5 is an operation diagram of a variable selection device according to one embodiment of the present invention.
FIG. 6 shows representative variables selected by a variable selection device according to one embodiment of the present invention.
FIG. 7 is a configuration diagram of a timing determination device of a model management system according to one embodiment of the present invention.
FIG. 8 is a flowchart of a timing determination method of a timing determination device according to one embodiment of the present invention.
FIG. 9 is an operation diagram of a timing determination device according to one embodiment of the present invention.
FIG. 10 is a graph for explaining the first decision method executed by the judgment module of the timing determination device according to one embodiment of the present invention.
FIG. 11 and FIG. 12 are drawings for explaining the third decision method executed by the judgment module of the timing determination device according to one embodiment of the present invention.

Hereinafter, specific embodiments of the present invention are described in detail with reference to the drawings. However, the spirit of the present invention is not limited to the embodiments presented. A person skilled in the art who understands the spirit of the present invention could readily propose, by adding, changing, or deleting components within the scope of the same spirit, other regressive inventions or other embodiments that fall within the scope of the spirit of the present invention, and these are also to be regarded as falling within that scope.

FIG. 1 is a schematic diagram of a model management system according to one embodiment of the present invention.

Referring to FIG. 1, a model management system (100) according to one embodiment of the present invention may be connected to a cloud server (S10) over a wired or wireless network so that the two can exchange information.

The wireless network referred to in the present invention may be a core network integrated with a wired public network, a wireless mobile communication network, the portable Internet, or the like, or may mean the worldwide open computer network architecture that provides the TCP/IP protocol and the various services in the layers above it, such as HTTP (Hyper Text Transfer Protocol), HTTPS (Hyper Text Transfer Protocol Secure), Telnet, FTP (File Transfer Protocol), DNS (Domain Name System), and SMTP (Simple Mail Transfer Protocol). It is not limited to these examples and broadly means any data communication network capable of transmitting and receiving data in various forms.

The cloud server (S10) may mean a server that provides cloud services to a client (C10).

The server referred to in the present invention may include other components for providing its server environment. The server may be any type of device.

For example, the server may be a digital device equipped with a processor and memory and having computing capability, such as a laptop computer, notebook computer, desktop computer, web pad, or mobile phone.

For example, the server may be a web server. However, the type of server is not limited to this and may be varied within the range that is obvious to a person skilled in the art.

For example, the cloud server (S10) may provide cloud services to clients by configuring a virtual machine or container for each client (C10) and providing the computation or storage that the client requests.

The model management system may be connected to one or more cloud servers.

The model management system (100) may manage and operate a machine learning model running on a cloud server.

The model management system is described in detail below.

FIG. 2 is a configuration diagram of a model management system according to one embodiment of the present invention.

Referring to FIG. 2, a model management system (100) according to one embodiment of the present invention may include a variable selection device (110) that determines the variables used to train a learning model, which is a model trained by machine learning, and a timing determination device (120) that determines when the learning model is retrained.

In addition, the model management system (100) may further include an interface device (130) that generates a user interface displaying information such as the variables or the retraining time produced by the variable selection device and the timing determination device, and transmits it to the administrator.

Here, the administrator may mean a person or company that manages and operates the cloud server. Alternatively, the administrator may mean a person or company that manages and operates the model management system.

The timing determination device may start the process of determining the retraining time (the timing determination method) once a predetermined time has passed since the learning model was trained.

For example, the predetermined time may be one month, but the present invention is not limited thereto.

As another example, the timing determination device may carry out the process of determining the retraining time (the timing determination method) when training data amounting to at least a predetermined capacity has accumulated since the learning model was trained.

For example, the predetermined capacity may be 100 TB, but the present invention is not limited thereto.

The interface device (130) may generate a user interface that displays the representative variable determined by the variable selection device (110) and the grounds used to determine it, and transmit the user interface to the administrator.

For example, the user interface may show the values of the operation information of the target variable as a time-series graph, together with the time ranges in which events occurred.

For example, the administrator may specify the target variable by entering it into the user interface.

The interface device (130) may generate a user interface that displays the retraining time determined by the timing determination device (120) and the grounds used to determine it, and transmit the user interface to the administrator.

For example, the user interface may display the seasonality strength of the operation information and which decision method was used.

The term "administrator" may include the administrator's computing device.

FIG. 3 is a configuration diagram of a variable selection device of a model management system according to one embodiment of the present invention.

Referring to FIG. 3, the variable selection device (110) of the model management system according to one embodiment of the present invention determines a representative variable, which is a variable representing the characteristics of a cloud server, and may include: a selection receiving module (111) that receives operation information, which is information generated while the cloud server operates and which describes each of a plurality of variables; a detection module (114) that analyzes, based on the operation information and according to a predetermined judgment method, the causality between a target variable, which is a predetermined variable, and a comparison variable, which is any variable other than the target variable; and a determination module (115) that determines, as the representative variables, the target variable and the comparison variable whose causality satisfies a predetermined causality condition.

In addition, the variable selection device (110) may further include a search module (113) that divides the operation information of the target variable into predetermined time ranges, analyzes it, and detects whether an event has occurred in the operation information of the target variable.

In addition, the variable selection device (110) may further include a preprocessing module (112) that preprocesses the operation information received by the selection receiving module (111) according to a predetermined preprocessing method.

The selection receiving module (111) may receive operation information, which is information generated while the cloud server operates.

For example, the operation information may be information about each of a plurality of variables.

For example, the variables may be CPU usage, memory usage, disk read volume, disk write volume, network traffic, GPU usage, GPU memory usage, and the like.

For example, there may be 89 variables in total.

However, the total number of variables is not limited to this and may be varied within the range that is obvious to a person skilled in the art.

For example, the operation information may be time-series data, that is, data whose values are assigned over time.

The preprocessing module (112) may preprocess the operation information received by the selection receiving module (111) according to a predetermined preprocessing method.

For example, the predetermined preprocessing method may include removing operation information that covers less than a predetermined length of time.

For example, because the operation information is time-series data, it covers a certain span of time. If the predetermined length of time is 150 days, operation information collected and stored over only 30 days may be removed by the predetermined preprocessing method.
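As a non-limiting illustration, the duration-based removal described above could be sketched as follows. The record layout (a list of timestamped samples per record), the record identifiers, and the function name are assumptions made for this example; only the 150-day and 30-day figures come from the text.

```python
from datetime import datetime, timedelta


def drop_short_records(records, min_duration=timedelta(days=150)):
    """Keep only records whose samples span at least min_duration.

    records maps a hypothetical record ID to a list of (timestamp, value)
    samples. 150 days is the example value given in the text.
    """
    kept = {}
    for record_id, samples in records.items():
        if not samples:
            continue  # nothing was collected, so there is nothing to analyze
        timestamps = [timestamp for timestamp, _ in samples]
        if max(timestamps) - min(timestamps) >= min_duration:
            kept[record_id] = samples
    return kept


start = datetime(2023, 1, 1)
records = {
    # Covers 200 days, so it is kept.
    "vm-a": [(start + timedelta(days=d), 40.0) for d in range(0, 201, 10)],
    # Covers only 30 days, so it is removed.
    "vm-b": [(start + timedelta(days=d), 35.0) for d in range(0, 31, 10)],
}
kept_records = drop_short_records(records)
print(sorted(kept_records))
```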

For example, the predetermined preprocessing method may further include removing operation information for which the usage rate of a specific variable is below a predetermined ratio, or for which the number of virtual machines of the cloud server or of processes running on the cloud server is below a predetermined criterion.

Specifically, the specific variable may be CPU usage, and the predetermined ratio may be 5%, but the present invention is not limited thereto.

For example, the predetermined criterion for the number of processes may be 5, but the present invention is not limited thereto.

This may be because, when CPU usage is low or few processes (programs) are running, it is difficult to infer causal relationships from the operation information.
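As a non-limiting illustration, the usage-based removal described above could be sketched as follows. The 5% ratio and the process count of 5 are the example values from the text; the function name, the record fields, and the use of a single summary CPU figure per server are assumptions made for this sketch.

```python
def keep_for_causality_analysis(cpu_usage_pct, process_count,
                                min_cpu_pct=5.0, min_processes=5):
    """Return True when the operation information is worth analyzing.

    Operation information is removed when CPU usage is below min_cpu_pct
    or when fewer than min_processes processes are running.
    """
    return cpu_usage_pct >= min_cpu_pct and process_count >= min_processes


# Hypothetical servers used only for the example.
servers = [
    {"id": "vm-a", "cpu_usage_pct": 42.0, "process_count": 18},
    {"id": "vm-b", "cpu_usage_pct": 2.5, "process_count": 30},   # CPU too low
    {"id": "vm-c", "cpu_usage_pct": 60.0, "process_count": 3},   # too few processes
]
kept_ids = [
    s["id"] for s in servers
    if keep_for_causality_analysis(s["cpu_usage_pct"], s["process_count"])
]
print(kept_ids)
```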

For example, the predetermined preprocessing method may further include removing the operation information of a variable whose correlation coefficient with the target variable is at or above a predetermined criterion.

For example, the predetermined criterion for the correlation coefficient may be 0.9.

The correlation coefficient is a number between -1 and 1, and the closer it is to 1, the stronger the positive correlation.

A variable that is highly correlated with the target variable carries almost the same meaning as the target variable, so there may be no need to analyze a separate causal relationship for it.
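As a non-limiting illustration, the correlation-based removal described above could be sketched as follows. The Pearson coefficient is assumed here because the text does not name a specific coefficient, and the series names and values are hypothetical. The sketch compares the signed coefficient with the 0.9 criterion, as the text describes; comparing its absolute value instead would also remove strongly negatively correlated variables.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_highly_correlated(target, candidates, threshold=0.9):
    """Remove candidates whose correlation with the target is >= threshold."""
    return {
        name: series
        for name, series in candidates.items()
        if pearson(target, series) < threshold
    }


cpu_usage = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {
    "cpu_user_time": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly a copy of the target
    "disk_write_bytes": [3.0, 1.0, 4.0, 1.0, 5.0],
}
remaining = drop_highly_correlated(cpu_usage, candidates)
print(sorted(remaining))
```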

For example, the predetermined preprocessing method may be a method of removing operation information that is categorical or whose variance is at or below a predetermined criterion.

Categorical information may mean information that can be divided into categories or groups.

For example, the predetermined criterion for the variance may be 0.5. However, the present invention is not limited thereto.
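As a non-limiting illustration, the removal of categorical and low-variance operation information could be sketched as follows. The 0.5 criterion is the example value from the text; treating string or boolean values as categorical, using the population variance, and the series names are assumptions made for this sketch.

```python
from statistics import pvariance


def drop_categorical_and_low_variance(series_by_variable, min_variance=0.5):
    """Remove categorical series and series with variance <= min_variance."""
    kept = {}
    for name, values in series_by_variable.items():
        if not values:
            continue  # an empty series carries no information
        # Assumption: string or boolean values mark a categorical variable.
        if any(isinstance(v, (str, bool)) for v in values):
            continue
        if pvariance(values) <= min_variance:
            continue
        kept[name] = values
    return kept


series = {
    "cpu_usage": [10.0, 20.0, 30.0, 40.0],   # variance 125, kept
    "power_state": ["up", "up", "down", "up"],  # categorical, removed
    "fan_speed_ratio": [1.0, 1.1, 0.9, 1.0],   # variance 0.005, removed
}
numeric_series = drop_categorical_and_low_variance(series)
print(sorted(numeric_series))
```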

The predetermined preprocessing method may further include correcting the time offsets between the indicator data, that is, the operation information of each variable.

For example, the lag value may be calculated as the average of the univariate autocorrelation function (ACF) values of the target variable.

However, the lag value is not limited to this and may be determined in other ways.

For example, the lag value may be determined as the predetermined time range multiplied by 3.

For example, the predetermined time range may be 5 minutes. However, the predetermined time range is not limited to this and may be varied within the range that is obvious to a person skilled in the art.

The factor of 3 may be used because the search module (113), the detection module (114), and the determination module (115) carry out a three-stage process.
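As a non-limiting illustration of the arithmetic above, a time range of 5 minutes multiplied by 3 gives a lag of 15 minutes. The conversion of that lag into a number of samples, and the 1-minute sampling interval, are assumptions added for this sketch.

```python
from datetime import timedelta


def lag_from_time_range(time_range, stages=3):
    """Lag value as the predetermined time range multiplied by the stage count."""
    return time_range * stages


time_range = timedelta(minutes=5)          # example value from the text
lag = lag_from_time_range(time_range)      # 5 minutes x 3

# Hypothetical sampling interval, used to express the lag in samples.
sampling_interval = timedelta(minutes=1)
lag_in_samples = lag // sampling_interval

print(lag, lag_in_samples)
```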

탐색모듈(113)은 미리 정해진 시간범위 마다 상기 목적변수의 상기 동작정보를 구분하고 분석하여, 상기 목적변수의 상기 동작정보에서 이벤트 발생 여부를 탐지할 수 있다.The search module (113) can detect whether an event occurs in the operation information of the target variable by distinguishing and analyzing the operation information of the target variable at each predetermined time range.

탐색모듈(113)은 미리 정해진 시간범위 마다 목적변수의 동작정보(지표데이터)를 구분할 수 있다. The exploration module (113) can distinguish the operation information (indicator data) of the target variable for each pre-determined time range.

일례로, 목적변수는 중앙처리장치(CPU) 사용률, 메모리(Memory) 사용률, 디스크 읽기 데이터양(Disk read bytes), 디스크 쓰기 데이터양(Disk write bytes), 네트워크 유입량(Network in bytes), 네트워크 유출량(Network out bytes)일 수 있다. For example, target variables can be CPU usage, memory usage, disk read bytes, disk write bytes, network in bytes, and network out bytes.

However, the present invention is not limited thereto, and the specific types of target variable may be varied within a scope that is obvious to a person skilled in the art.

The operation information of a single variable may be defined as indicator data.

The search module (113) may determine whether an event has occurred within the indicator data by using basic statistics such as the variance of the indicator data values.

일례로, 지표데이터의 전체 평균값보다 미리 정해진 시간범위 상의 지표데이터의 평균값이 지표데이터의 전체 평균값의 소정 비율의 값보다 크다면, 탐색모듈(113)은 해당 미리 정해진 시간범위에서 이벤트가 발생되었다고 판단할 수 있다. For example, if the average value of the indicator data over a predetermined time range is greater than a predetermined percentage of the overall average value of the indicator data, the search module (113) can determine that an event has occurred over the predetermined time range.

일례로, 하나의 미리 정해진 시간범위 상의 지표데이터의 분산값이 소정 분산 값 이상이라면, 탐색모듈(113)은 해당 미리 정해진 시간범위에서 이벤트가 발생되었다고 판단할 수 있다. For example, if the variance of indicator data over a predetermined time range is greater than a predetermined variance value, the search module (113) can determine that an event has occurred in the predetermined time range.
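The two event rules described above (a window mean exceeding a ratio of the overall mean, or a window variance reaching a threshold) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `mean_ratio` and `var_threshold` parameters, and the sample values are all assumptions introduced here.

```python
from statistics import mean, pvariance


def split_windows(values, window_size):
    """Split a series into consecutive, non-overlapping windows."""
    return [values[i:i + window_size] for i in range(0, len(values), window_size)]


def detect_events(values, window_size, mean_ratio=1.5, var_threshold=None):
    """Return the indices of windows in which an event is deemed to occur.

    mean_ratio and var_threshold are hypothetical tuning parameters:
    a window is flagged when its mean exceeds mean_ratio times the overall
    mean, or when its variance is at least var_threshold (if one is given).
    """
    overall_mean = mean(values)
    events = []
    for idx, window in enumerate(split_windows(values, window_size)):
        mean_rule = mean(window) > mean_ratio * overall_mean
        var_rule = var_threshold is not None and pvariance(window) >= var_threshold
        if mean_rule or var_rule:
            events.append(idx)
    return events


# Made-up CPU usage samples: three windows of five samples each.
cpu_usage = [10, 11, 9, 10, 10, 40, 42, 45, 41, 43, 10, 9, 11, 10, 10]
event_windows = detect_events(cpu_usage, window_size=5, mean_ratio=1.5)
print(event_windows)
```

Only the flagged windows would then be passed on to the causality analysis, which keeps the number of statistical tests small.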

The search module (113) may process only the preprocessed operation information.

The detection module (114) may analyze the causality between the target variable and the comparison variable based on the preprocessed operation information.

탐지모듈(114)은 이벤트가 발생된 상기 목적변수의 상기 동작정보와, 이벤트가 발생한 상기 동작정보와 동일한 시간범위에 발생한 상기 비교변수의 상기 동작정보를 상기 미리 정해진 판단방법에 따라 서로 비교하여 인과성을 분석할 수 있다. The detection module (114) can analyze causality by comparing the operation information of the target variable where the event occurred and the operation information of the comparison variable that occurred in the same time range as the operation information where the event occurred, according to the predetermined judgment method.

The predetermined judgment method may be a method of judging the causality between two variables using the Granger causality test.

그랜저 인과성 테스트 (Granger causality Test)는 시계열 데이터에서 두 변수간 인과 관계를 평가하는 통계적인 알고리즘이다. The Granger causality test is a statistical algorithm for assessing the causal relationship between two variables in time series data.

[Mathematical Formula 1]

$$y_t = a_0 + \sum_{i=1}^{p} a_i\, y_{t-i} + \varepsilon_t$$

[Mathematical Formula 2]

$$y_t = a_0 + \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{i=1}^{p} b_i\, x_{t-i} + \varepsilon_t$$

For the two formulas above, if an F test or the like shows, through its p-value, that Mathematical Formula 2 is a better regression model than Mathematical Formula 1, x can be said to Granger-cause y.
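For reference, the comparison of the two regressions is commonly carried out with the standard F statistic for nested models. The block below is a sketch under the assumption that Mathematical Formula 1 regresses y on p of its own past values and Mathematical Formula 2 additionally includes p past values of x. The symbols RSS_1, RSS_2, p, and T are notation introduced here and do not appear in the source.

```latex
% RSS_1: residual sum of squares of the restricted model (Formula 1)
% RSS_2: residual sum of squares of the unrestricted model (Formula 2)
% p: number of lagged x terms being tested, T: number of usable observations
F = \frac{(\mathrm{RSS}_1 - \mathrm{RSS}_2)/p}{\mathrm{RSS}_2/(T - 2p - 1)}
\;\sim\; F(p,\; T - 2p - 1) \quad \text{under } H_0: b_1 = \dots = b_p = 0
```

A p-value below the chosen significance level rejects the null hypothesis that the lagged x terms add nothing, which is the condition under which x is said to Granger-cause y.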

이하, 그랜저 인과성 테스트 (Granger causality Test)에 대한 구체적인 설명은 공지된 기술 범위 내에서 생략될 수 있다. Below, a detailed description of the Granger causality test may be omitted within the scope of known technology.

탐지모듈(114)은 목적변수를 구성하는 지표데이터와 비교변수를 구성하는 지표데이터들 간의 인과성을 그랜저 인과성(Granger causality) 기반으로 체크하여 p-value를 산출할 수 있다. The detection module (114) can check the causality between the indicator data constituting the target variable and the indicator data constituting the comparison variable based on Granger causality and calculate the p-value.

If the p-value (significance probability) of the Granger causality test for the indicator data of a given comparison variable within a given predetermined time range is smaller than the significance level, the indicator data of that comparison variable within that time range may be judged to be causally related to the target variable.

A comparison variable may mean any of the variables other than the target variables.

Each comparison variable may be tested for Granger causality against every target variable.

탐지모듈(114)은 p-value를 이용하여 p-value matrix를 생성할 수 있다. The detection module (114) can generate a p-value matrix using the p-value.

결정모듈(115)은 미리 정해진 인과성조건을 만족하는 인과성을 가지는 상기 비교변수와 상기 목적변수를 상기 대표변수로 결정할 수 있다. The decision module (115) can determine the comparison variable and the target variable having a causality satisfying a pre-determined causality condition as the representative variable.

미리 정해진 인과성조건은 인과성이 인정되는 상기 미리 정해진 시간범위가 가장 많을 조건일 수 있다. The pre-determined causality condition may be the condition with the largest number of pre-determined time periods over which causality is recognized.

The decision module (115) may use the p-value matrix, summing over the time ranges, to extract the variables that exhibited causality most frequently.

Specifically, for each comparison variable, the decision module (115) may calculate a calculated count, which is the number of predetermined time ranges, among the predetermined time ranges of that comparison variable, in which it shows causality with the target variable.

Among the comparison variables, the comparison variable with the highest calculated count may satisfy the predetermined causality condition.

However, the present invention is not limited thereto, and a comparison variable whose calculated count is at least a predetermined ratio of the number of event occurrences may satisfy the predetermined causality condition.

여기서, 소정비율을 80%일 수 있다. 다만, 이에 본 발명이 한정되는 것은 아니다. Here, the predetermined ratio may be 80%. However, the present invention is not limited thereto.
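The two causality conditions described above (the highest calculated count, or a calculated count of at least a given ratio of the event occurrences) can be sketched as follows. The dictionary layout of the p-value matrix, the variable names, the 0.05 significance level, and the p-values themselves are assumptions made for illustration only.

```python
def causal_window_counts(p_values, alpha=0.05):
    """Count, per comparison variable, the windows whose p-value is below alpha."""
    return {name: sum(1 for p in ps if p < alpha) for name, ps in p_values.items()}


def select_by_max_count(p_values, alpha=0.05):
    """Comparison variables with the highest (non-zero) calculated count."""
    counts = causal_window_counts(p_values, alpha)
    best = max(counts.values())
    return sorted(name for name, c in counts.items() if c == best and c > 0)


def select_by_ratio(p_values, n_events, ratio=0.8, alpha=0.05):
    """Comparison variables that are causal in at least `ratio` of the event windows."""
    if n_events <= 0:
        return []
    counts = causal_window_counts(p_values, alpha)
    return sorted(name for name, c in counts.items() if c / n_events >= ratio)


# Hypothetical p-value matrix: one row per comparison variable,
# one column per event window of the target variable.
p_value_matrix = {
    "cpu_steal": [0.01, 0.03, 0.20, 0.02, 0.04],
    "cpu_iowait": [0.04, 0.01, 0.02, 0.03, 0.01],
    "fs_usage": [0.50, 0.70, 0.04, 0.90, 0.60],
}

top = select_by_max_count(p_value_matrix)
frequent = select_by_ratio(p_value_matrix, n_events=5, ratio=0.8)
print(top, frequent)
```

The ratio rule can admit several comparison variables at once, whereas the maximum-count rule keeps only the most frequently causal ones.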

The decision module (115) may transmit a list of the representative variables to the cloud server.

Using only the operation information of the representative variables received from the decision module (115), the cloud server or a virtual machine of the cloud server may train a future prediction model (a learning model) that predicts the workload expected to occur during a future period.

For example, information on the specifications of the cloud server, the types of applications running on the cloud server, the types of programs installed on the cloud server, and the commands set on the cloud server may be set as input data, and the past operation information labeled to that input data may be used as output data, so that the future prediction model is trained through machine learning or deep learning.

In this case, only the operation information of the representative variables may be used as the operation information.

In addition, the cloud server or a virtual machine of the cloud server may train, using only the representative variables, an anomaly detection model (a learning model) that detects whether the cloud server or the virtual machine is abnormal.

For example, information on the specifications of the cloud server, the types of applications running on the cloud server, the log data and trace data generated by the cloud server, the types of programs installed on the cloud server, and the commands set on the cloud server may be set as input data, and the anomaly detection model may be trained through machine learning or deep learning using data in which that input data is labeled with whether the cloud server is normal or abnormal.

이로 인해, 클라우드 서버 또는 가상머신은 머신러닝에 사용되는 클라우드 자원을 최소화할 수 있으며, 머신러닝 모델의 성능을 더욱 높일 수 있다. This allows cloud servers or virtual machines to minimize cloud resources used for machine learning and further improve the performance of machine learning models.

이하, 변수 선택 장치(110)에 의해 구현되는 변수 선택 방법에 대해서 자세하게 서술하도록 한다. Below, the variable selection method implemented by the variable selection device (110) will be described in detail.

도 4는 본 발명의 일 실시예에 따른 변수 선택 장치(110)에 의해 진행되는 변수 선택 방법의 순서도이다. Figure 4 is a flowchart of a variable selection method performed by a variable selection device (110) according to one embodiment of the present invention.

Referring to FIG. 4, a variable selection method according to one embodiment of the present invention is implemented by the variable selection device (110) and determines a representative variable, which is a variable representing the characteristics of a cloud server. The method may include: a step in which the selection reception module (111) receives operation information, which is information generated while the cloud server operates and is the respective information on a plurality of variables (the variables comprising a target variable, which is a predetermined variable, and comparison variables, which are the variables other than the target variable); a step in which the search module (113) divides and analyzes the operation information of the target variable for each predetermined time range to detect whether an event has occurred in the operation information of the target variable; a step in which the detection module (114) analyzes causality by comparing, according to the predetermined judgment method, the operation information of the target variable in which an event occurred with the operation information of the comparison variable from the same time range; and a step in which the decision module (115) determines, as the representative variables, the target variable and the comparison variable whose causality satisfies the predetermined causality condition.

이하, 상술한 내용과 중복되는 한도에서 자세한 설명은 생략될 수 있다. Below, detailed explanations may be omitted to the extent that they overlap with the above-mentioned contents.

도 5는 본 발명의 일 실시예에 따른 변수 선택 장치의 동작도이다. FIG. 5 is an operational diagram of a variable selection device according to one embodiment of the present invention.

Referring to FIG. 5, the selection reception module (111) may receive operation information from the cloud server.

In a preprocessing step, the preprocessing module (112) may remove the indicator data of some variables from the operation information received from the selection reception module (111), according to a predetermined preprocessing method.

The preprocessed operation information may be transmitted to the search module (113).

탐색모듈(113)은 목적변수의 동작정보들을 미리 정해진 시간범위 별로 구분하고, 미리 정해진 시간범위 단위로 이벤트 발생 여부를 판단할 수 있다. The search module (113) can classify the operation information of the target variable into predetermined time ranges and determine whether an event has occurred in units of predetermined time ranges.

일례로, 도 5(a)를 참조하면, 임의의 하나의 목적변수의 동작정보는 제1-1 구간(K11), 제1-2 구간(K12), 제1-3 구간(K13), 제1-4 구간(K14) 및 제1-5 구간(K15)으로 구분될 수 있다. For example, referring to Fig. 5(a), the operation information of any one target variable can be divided into the 1-1 section (K11), the 1-2 section (K12), the 1-3 section (K13), the 1-4 section (K14), and the 1-5 section (K15).

In this case, it may be determined that an event occurred in the 1-2 section (K12) and the 1-4 section (K14).

탐지모듈(114)은 이벤트가 발생된 상기 목적변수의 상기 동작정보와, 이벤트가 발생한 상기 동작정보와 동일한 시간범위에 발생한 상기 비교변수의 상기 동작정보를 그랜저 인과성 테스트 (Granger causality Test)를 통해 p-value를 산출할 수 있다. The detection module (114) can calculate a p-value using the Granger causality test for the operation information of the target variable where the event occurred and the operation information of the comparison variable that occurred in the same time range as the operation information where the event occurred.

일례로, 도 5(b)를 참조하면, 제1 비교변수의 동작정보는 제2-1 구간(K21), 제2-2 구간(K22), 제2-3 구간(K23), 제2-4 구간(K24) 및 제2-5구간(K25)으로 구분될 수 있다. For example, referring to Fig. 5(b), the operation information of the first comparison variable can be divided into the 2-1 section (K21), the 2-2 section (K22), the 2-3 section (K23), the 2-4 section (K24), and the 2-5 section (K25).

여기서, 이벤트가 발생한 목적변수의 구간인 제1-2 구간(K12)과 제1-4 구간(K14)과 동일한 시간대의 구간인 제2-2 구간(K22)과 제2-4 구간(K24)을 탐지모듈(114)은 인과성 분석 대상으로 결정할 수 있다. Here, the detection module (114) can determine the 2-2 section (K22) and the 2-4 section (K24), which are sections of the same time zone as the 1-2 section (K12) and the 1-4 section (K14), which are sections of the target variable where the event occurred, as targets of causality analysis.

The operation information of the 2-2 section (K22) may be found to be causally related to the operation information of the 1-2 section (K12), and the operation information of the 2-4 section (K24) may be found to be causally related to the operation information of the 1-4 section (K14).

Referring to FIG. 5(c), the operation information of the second comparison variable may be divided into the 3-1 section (K31), the 3-2 section (K32), the 3-3 section (K33), the 3-4 section (K34), and the 3-5 section (K35).

The detection module (114) may select, as targets of the causality analysis, the 3-2 section (K32) and the 3-4 section (K34), which cover the same time ranges as the 1-2 section (K12) and the 1-4 section (K14) in which events occurred in the target variable.

The operation information of the 3-2 section (K32) may be found not to be causally related to the operation information of the 1-2 section (K12), and the operation information of the 3-4 section (K34) may be found not to be causally related to the operation information of the 1-4 section (K14).

결정모듈(115)은 대표변수를 클라우드 서버로 전송하여, 클라우드 서버가 머신러닝 학습에 활용하도록 할 수 있다. The decision module (115) can transmit representative variables to a cloud server so that the cloud server can use them for machine learning.

도 6은 본 발명의 일 실시예에 따른 변수 선택 장치가 선택한 대표변수의 효과를 설명하기 위한 도면이다. FIG. 6 is a drawing for explaining the effect of a representative variable selected by a variable selection device according to one embodiment of the present invention.

본 발명의 효과를 평가하기 위해서, RMSE값과 MAE값을 활용하였다. To evaluate the effectiveness of the present invention, RMSE and MAE values were used.

[Mathematical Formula 3]

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

[Mathematical Formula 4]

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
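The two error metrics named above, RMSE (root mean squared error) and MAE (mean absolute error), can be computed as in the following minimal sketch. It uses the standard definitions of both metrics; the sample values are made up for illustration and are not results from the experiment.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    absolute = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute) / len(absolute)


# Illustrative values only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 9.0, 9.0]
print(rmse(actual, predicted), mae(actual, predicted))
```

Because RMSE squares each error before averaging, it penalizes large individual errors more heavily than MAE, which is one reason to report both.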

성능 평가를 위해서, 미리 정해진 시간범위는 '1일'일 수 있다. For performance evaluation, the predefined time range can be '1 day'.

Referring to FIG. 6(a), across the error metrics and prediction models, the present invention shows superior performance in most cases.

In each prediction model, the RMSE and MAE values show the same tendency for the model trained with the representative variables. The VAR prediction model showed the lowest RMSE and MAE values among the prediction models; that is, the algorithm of the present invention was found to have the greatest effect on the VAR prediction model. For the VAR prediction model, performance improved by 67% relative to the model with the worst RMSE value. The KNN prediction model showed an 18% improvement, and the LSTM model a 17% improvement.

도 6(b)를 참조하면, 하나의 변수를 지정하여 그 변수에 대해서 예측하여 예측성능을 비교하였다. 이 성능 비교에서 오차지수는 RMSE값으로 하였으며, 실험의 효율성을 위해 예측모델은 KNN회귀로 설정하였다. Referring to Fig. 6(b), one variable was specified and predictions were made for that variable to compare prediction performance. In this performance comparison, the error index was the RMSE value, and the prediction model was set to KNN regression for the efficiency of the experiment.

실험 결과, 대부분의 변수에 대해서 본 발명이 활용되어 학습된 예측모델의 예측 성능이 우수한 것으로 조사되었다. 예를 들어 디스크 읽기 데이터양 변수의 경우 상관관계 기반으로 선택된 변수들로 학습된 예측모델의 성능이 본 발명이 활용된 예측모델보다 0.00641 더 나았지만, 본 발명의 예측모델은 다른 변수에서 최고의 성능을 보였다. 구체적으로 CPU 사용률의 경우 최악의 성능을 보인 예측모델 대비 21% 더 나은 성능을 보여주었으며, 메모리 사용률에 대해서는 45%, 디스크 쓰기 데이터양에 대해서는 15%, 네트워크 유입량에 대해서는 20%, 네트워크 유출량에 대해서는 16% 더 나은 성능을 보여주었다. As a result of the experiment, it was found that the prediction performance of the prediction model learned by utilizing the present invention was excellent for most variables. For example, in the case of the disk read data amount variable, the performance of the prediction model learned with variables selected based on correlation was 0.00641 better than the prediction model utilizing the present invention, but the prediction model of the present invention showed the best performance in other variables. Specifically, it showed 21% better performance than the prediction model with the worst performance in the case of CPU usage rate, 45% better performance in the case of memory usage rate, 15% better performance in the case of disk write data amount, 20% better performance in the case of network inflow, and 16% better performance in the case of network outflow.

In the experiments in which the present invention performed well, CPU steal and CPU iowait were selected as representative variables for the target variable CPU usage. In terms of predicting CPU usage, this is because when values such as CPU steal or CPU iowait increase, it becomes harder to run other processes on the same physical machine or to distribute CPU resources among multiple virtual machines, which affects overall CPU usage.

또한, 메모리 사용률의 목적변수에는 파일 시스템 사용률과 디스크 쓰기와 관련된 변수들이 대표변수로 선별되었다. 이는, 메모리 사용률 예측 성능 측면에서, 분석 대상 가상머신 중 많은 부분이 데이터 파이프라인과 관련이 있어, 파일 시스템 사용이나 디스크 쓰기와 같은 값이 증가하면, 대량 데이터 관련하여 데이터베이스를 저장하거나 쿼리하는 요청이 처리되어, 메모리 사용에 영향을 미치기 때문이다. In addition, variables related to file system usage and disk writing were selected as representative variables for the target variables of memory usage. This is because, in terms of memory usage prediction performance, many of the virtual machines being analyzed are related to data pipelines, so when values such as file system usage or disk writing increase, requests to store or query databases related to large amounts of data are processed, which affects memory usage.

이와 같이, 본 발명은 선택되는 대표변수들을 기초로 추가적인 클라우드 도메인 지식을 취득하는데 활용할 수도 있다. In this way, the present invention can also be utilized to acquire additional cloud domain knowledge based on selected representative variables.

Hereinafter, the timing determination device (120), which recommends when to re-learn a learning model trained on the cloud server, will be described in detail.

도 7은 본 발명의 일 실시예에 따른 모델 관리 시스템의 시기 결정 장치(120)의 구성도이다. Figure 7 is a configuration diagram of a timing determination device (120) of a model management system according to one embodiment of the present invention.

A timing determination device (120) according to one embodiment of the present invention may include a timing reception module (121) that receives the operation information as a time series, a seasonal module (122) that decomposes the operation information as a time series and then calculates the seasonality intensity of the operation information according to a predetermined intensity analysis method, a pattern module (124) that analyzes the pattern of the operation information using a predetermined pattern analysis method, and a judgment module (125) that determines, using a predetermined decision method and according to the magnitude of the seasonality intensity, when the machine learning model running on the cloud server is to be re-learned.

In addition, the timing determination device (120) may further include a preprocessing module (123) that performs a preprocessing step of reducing the dimension of the operation information, a prediction module (126) that predicts future operation information, and a storage module (127) that stores the information needed to implement the timing determination method.

The timing reception module (121) may receive operation information from the cloud server.

The operation information is information on the time-series values of a plurality of variables, and a detailed description is omitted to the extent that it overlaps with the description above.

계절모듈(122)은 클라우드 서버가 동작되면서 발생되는 정보인 동작정보를 수신할 수 있다. The seasonal module (122) can receive operation information, which is information generated when the cloud server operates.

동작정보는 클라우드 서버가 가동되면서 사용라는 물리서버(클라우드 서버)의 리소스 사용량에 대한 정보일 수 있다. The operation information can be information about the resource usage of the physical server (cloud server) while the cloud server is in operation.

For example, the operation information may include the central processing unit (CPU) usage, main memory (Memory) usage, network (Network) usage, auxiliary storage (Disk) usage, and file system (Filesystem) usage of the cloud server.

For example, the operation information may further include CPU wait time (iowait) usage.

여기서, 사용량은 클라우드 서버에 할당된 리소스 대비 실제 사용되는 자원량 혹은 클라우드 서버가 최대로 가용할 수 있는 리소스 대비 실제 사용되는 자원량을 의미할 수 있다. Here, usage can mean the amount of resources actually used compared to the resources allocated to the cloud server, or the amount of resources actually used compared to the maximum available resources of the cloud server.

For iowait, usage may mean the iowait time actually measured relative to the maximum iowait time permitted on the cloud server.

Here, each of the usage values constituting the operation information may be defined as indicator data.

The operation information may be stored in the storage module (127).

The seasonal module (122) may decompose the operation information as a time series and then calculate the seasonality intensity of the operation information according to a predetermined intensity analysis method.

The seasonal module (122) may decompose each type of operation information (CPU usage, Memory usage, Network usage, Disk read/write usage, Filesystem usage, and the iowait value) as a time series into trend, seasonality, and noise (residual).

일례로, 계절모듈(122)은 가법(additive) 또는 승법(multiplicative)를 활용하여 시계열 분해를 실행할 수 있으나, 이에 본 발명이 한정되는 것은 아니고, 시계열 분해 방법에 대한 구체적인 알고리즘에 대한 설명은 공지된 기술 범위 내에서 생략될 수 있다. For example, the seasonal module (122) can perform time series decomposition using an additive or multiplicative method, but the present invention is not limited thereto, and a description of a specific algorithm for a time series decomposition method may be omitted within the scope of known technology.
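As an illustration of the additive case, the following is a minimal sketch of a classical additive decomposition (centered moving-average trend, per-phase seasonal means, and the remainder as noise). The source does not specify which decomposition algorithm is used, so this technique, the function names, and the synthetic series are assumptions. The series should span at least two full periods.

```python
from statistics import mean


def centered_moving_average(values, period):
    """Trend estimate; entries are None where the window does not fit."""
    n = len(values)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half:t + half + 1]
        if period % 2 == 1:
            trend[t] = sum(window) / period
        else:
            # Even period: 2 x m moving average, half weight on both end points.
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def additive_decompose(values, period):
    """Split a series into trend, seasonal, and residual so that value = sum of the three."""
    trend = centered_moving_average(values, period)
    detrended = [v - tr if tr is not None else None for v, tr in zip(values, trend)]
    phase_means = []
    for phase in range(period):
        samples = [d for i, d in enumerate(detrended) if d is not None and i % period == phase]
        phase_means.append(mean(samples))
    offset = mean(phase_means)  # center the seasonal component on zero
    seasonal = [phase_means[i % period] - offset for i in range(len(values))]
    residual = [v - tr - s if tr is not None else None
                for v, tr, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating pattern of period 4.
pattern = [1.0, 3.0, -1.0, -3.0]
series = [float(t) + pattern[t % 4] for t in range(16)]
trend, seasonal, residual = additive_decompose(series, period=4)
print(seasonal[:4])
```

A multiplicative decomposition follows the same outline with division in place of subtraction, or can be obtained by applying the additive procedure to the logarithm of a strictly positive series.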

The predetermined intensity analysis method may be a method of analysis based on the seasonality of the operation information and the noise of the operation information.

Specifically, the predetermined intensity analysis method may follow Mathematical Formula 5.

[Mathematical Formula 5]

$$F_S = \max\left(0,\; 1 - \frac{\mathrm{Var}(R_t)}{\mathrm{Var}(S_t + R_t)}\right)$$

Here, $F_S$ is the seasonality intensity of the indicator data, $S_t$ is the seasonality of the indicator data, $R_t$ is the noise of the indicator data, $\mathrm{Var}(S_t + R_t)$ is the variance of the sum of the seasonality and the noise of the indicator data, and $\mathrm{Var}(R_t)$ is the variance of the noise of the indicator data.

또한, 계절성의 단위는 주기일 수 있다. Additionally, the unit of seasonality can be a cycle.

계절성의 강도는 0 내지 1 사이의 값으로 산출될 수 있다. The intensity of seasonality can be expressed as a value between 0 and 1.

The seasonal module (122) may calculate the seasonality intensity of each indicator data in the operation information and average the seasonality intensities of the indicator data to obtain the seasonality intensity of the operation information of the cloud server.
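The per-indicator intensity and its average can be sketched as follows. The formula used, max(0, 1 - Var(noise) / Var(seasonality + noise)), is the widely used seasonal-strength measure; it is assumed here because it matches the quantities the description defines (the variance of the noise and the variance of the sum of seasonality and noise) and the stated range of 0 to 1. The function names and sample components are hypothetical.

```python
from statistics import mean, pvariance


def seasonal_strength(seasonal, residual):
    """Seasonality intensity in [0, 1] from decomposed components."""
    combined = [s + r for s, r in zip(seasonal, residual)]
    var_combined = pvariance(combined)
    if var_combined == 0:
        return 0.0  # a constant series carries no seasonality
    return max(0.0, 1.0 - pvariance(residual) / var_combined)


def server_seasonal_strength(components):
    """Average the per-indicator intensities into one value for the server."""
    return mean(seasonal_strength(s, r) for s, r in components.values())


# Hypothetical (seasonal, residual) components for two indicators.
components = {
    "cpu": ([1.0, -1.0, 1.0, -1.0], [0.0, 0.0, 0.0, 0.0]),     # purely seasonal
    "memory": ([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 1.0, -1.0]),  # purely noise
}
strength = server_seasonal_strength(components)
print(strength)
```

A value near 1 means the noise is small relative to the seasonal swing, while a value near 0 means the series is dominated by noise.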

계절모듈(122)은 산출한 계절성 강도를 판단모듈(125) 및/또는 저장모듈(127)에 전송할 수 있다. The seasonal module (122) can transmit the calculated seasonal intensity to the judgment module (125) and/or the storage module (127).

저장모듈(127)은 계절모듈(122)이 산출한 계절성 강도를 저장할 수 있다. The storage module (127) can store the seasonal intensity produced by the seasonal module (122).

The preprocessing module (123) may perform preprocessing that reduces the dimension of the operation information.

For example, the preprocessing module (123) may reduce the dimension of the operation information using techniques such as AE (AutoEncoder), PCA, SVD, and NMF.

The preprocessing module (123) may reduce the dimension of each of the various indicator data constituting the operation information.

As a result, reduced information may be calculated for each of the indicator data (CPU usage, Memory usage, Network usage, Disk usage, Filesystem usage, and the iowait value).

The reduced information calculated by the preprocessing module (123) may be stored in the storage module (127) in association with the operation information from which it was derived.

전처리모듈(123)이 산출한 축소정보는 패턴모듈(124) 및/또는 저장모듈(127)로 전송될 수 있다. The reduced information produced by the preprocessing module (123) can be transmitted to the pattern module (124) and/or the storage module (127).

The pattern module (124) may analyze the pattern of the operation information using a predetermined pattern analysis method.

Specifically, the pattern module (124) may analyze, using the predetermined pattern analysis method, the pattern of the reduced information, which is the operation information with its dimension reduced.

Here, the predetermined pattern analysis method may be a method of analyzing a pattern based on the relationships among the dimension-reduced indicator data constituting the reduced information.

구체적인 일례로, 미리 정해진 패턴분석방법은 상기 축소정보를 구성하는 차원이 축소된 각각의 지표데이터들 간의 크기 비율들로서 패턴을 분석하는 방법일 수 있다. As a specific example, the predetermined pattern analysis method may be a method of analyzing patterns as size ratios between each dimensionally reduced indicator data constituting the reduced information.

구체적인 일례로, CPU 사용량이 30%, Memory 사용량이 25%, Network 사용량이 60%, Disk 사용량이 20% 및 Filesystem 사용량이 30%, iowait 사용량이 15%(미리 정해진 한계 iowait 시간 대비 동작정보 내 iowait 값)라면, 해당 동작정보의 패턴은 1: 0.833 : 2 : 0.667 : 1 : 0.5일 수 있다. As a concrete example, if CPU usage is 30%, Memory usage is 25%, Network usage is 60%, Disk usage is 20%, Filesystem usage is 30%, and iowait usage is 15% (iowait value in operation information compared to pre-determined limit iowait time), the pattern of the operation information can be 1: 0.833 : 2 : 0.667 : 1 : 0.5.
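The ratio pattern in this example can be reproduced with a few lines of code. The sketch below assumes the pattern is normalized by the first indicator (CPU usage), which is what the ratios in the example imply; the function name and dictionary keys are introduced here for illustration.

```python
def usage_pattern(usages, reference=None):
    """Express each usage as a ratio to a reference indicator, rounded to 3 places.

    `reference` defaults to the first key; the reference usage must be non-zero.
    """
    names = list(usages)
    ref_value = usages[reference if reference is not None else names[0]]
    return {name: round(usages[name] / ref_value, 3) for name in names}


# Usage percentages taken from the example in the description.
usages = {"cpu": 30, "memory": 25, "network": 60, "disk": 20, "filesystem": 30, "iowait": 15}
pattern = usage_pattern(usages)
print(" : ".join(str(v) for v in pattern.values()))
```

Because only ratios are kept, two servers with different absolute loads but the same mix of resource usage produce the same pattern.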

The pattern of the operation information calculated by the pattern module (124) may be transmitted to the storage module (127) and/or the judgment module (125).

예측모듈(126)은 클라우드 서버가 사용하는 리소스 양에 대한 미래의 최하시점과 최고시점을 예측할 수 있다. The prediction module (126) can predict the future minimum and maximum points in time for the amount of resources used by the cloud server.

Here, the lowest point may mean the point in time, within one cycle, at which the operation information (the average of the various indicator data) is at its minimum.

Likewise, the highest point may mean the point in time, within one cycle, at which the operation information (the average of the various indicator data) is at its maximum.

The prediction module (126) may collect past operation data from the storage module (127), working backward from the current point in time (the point at which whether to re-learn is judged), identify where the current point in time falls within the cycle, and then, based on the seasonality (cycle), predict when the lowest point or the highest point will next arrive.

For example, if the seasonality is one year, the lowest point falls four months after the start of the cycle, the highest point falls eight months after the start of the cycle, and the current point is two months after the start of the cycle, the prediction module (126) may predict that the lowest point will occur two months from the current point and the highest point six months from the current point.
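The arithmetic of this example amounts to a difference of offsets taken modulo the cycle length. The following is a minimal sketch under that assumption; the function name `months_until` and the use of months as the unit are illustrative only.

```python
def months_until(event_offset, current_offset, period=12):
    """Months from the current position in the cycle until the next
    occurrence of an event located at event_offset within the cycle.

    The modulo wraps around to the next cycle when the event has
    already passed in the current cycle.
    """
    return (event_offset - current_offset) % period


print(months_until(4, 2))  # lowest point: 2 months from now
print(months_until(8, 2))  # highest point: 6 months from now
```

If the current point were six months into the cycle, the same call would return 10 for the lowest point, that is, the lowest point of the next cycle.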

The prediction module (126) may transmit the prediction result to the judgment module (125).

The prediction module (126) may also predict the re-learning time of the detection model.

To this end, the amount of input data and the time taken when the detection model was trained in the past may be stored in the storage module (127).

The prediction module (126) may receive the amount of data newly input for re-learning from the cloud server, and may receive the time taken by, and the amount of data input to, past training of the learning model from the storage module (127) or the cloud server.

For example, the prediction module (126) may predict the re-learning time using a proportional relationship in which training time increases with the amount of data, and transmit the predicted re-learning time to the judgment module (125).
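One simple way to realize such a proportional relationship is to scale the new data amount by the average time per unit of data observed in past training runs. This is only a sketch: the description does not specify the exact formula, and the function name, the history format, and the sample numbers below are hypothetical.

```python
def predict_relearning_time(history, new_amount):
    """Estimate re-learning time assuming time is proportional to data amount.

    history    : list of (data_amount, elapsed_time) pairs from past training
    new_amount : amount of data to be input for re-learning
    """
    total_amount = sum(amount for amount, _ in history)
    total_time = sum(elapsed for _, elapsed in history)
    if total_amount <= 0:
        raise ValueError("history must contain a positive data amount")
    # time per unit of data, averaged over all past runs
    return new_amount * total_time / total_amount


# Hypothetical records: 100 units took 2.0 h, 300 units took 6.0 h
estimate = predict_relearning_time([(100, 2.0), (300, 6.0)], 250)
print(estimate)  # 5.0
```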

The judgment module (125) may determine when to re-train a learning model operating on the cloud server (for example, a future prediction model or an anomaly detection model) according to the seasonality intensity of the cloud server.

This is because a cloud server must judge and respond quickly, so performing machine learning while many processes are running on the cloud server may overload its resources.

Such an overload may impair the functions of the cloud server.

The predetermined decision method may be a method of determining the re-learning timing of the learning model by considering the seasonality of the operation information or the pattern of the operation information.

The predetermined decision method may include a first decision method, which determines the re-learning timing by considering the seasonality of the operation information when the seasonality intensity of the operation information is greater than or equal to a first reference value, and a second decision method, which determines the re-learning timing by considering the pattern of the operation information when the seasonality intensity is less than a second reference value that is lower than the first reference value.

In addition, the predetermined decision method may further include a third decision method, which, when the seasonality intensity of the operation information is greater than or equal to the second reference value and less than the first reference value, determines the re-learning timing by considering either the seasonality or the pattern of the operation information depending on whether a predetermined intermediate condition is satisfied.

For example, the first reference value may be 0.7 and the second reference value may be 0.4.

However, the specific values of the first reference value and the second reference value are not limited thereto and may be varied within a range obvious to a person skilled in the art.
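The selection among the three decision methods can be sketched as a threshold check on the seasonality intensity. The default values 0.7 and 0.4 are the example reference values given above; the function name and the string labels are hypothetical.

```python
def select_decision_method(seasonality_intensity, first_ref=0.7, second_ref=0.4):
    """Choose a decision method from the seasonality intensity.

    intensity >= first_ref               -> first method (seasonality)
    intensity <  second_ref              -> second method (pattern)
    second_ref <= intensity < first_ref  -> third method (intermediate condition)
    """
    if second_ref >= first_ref:
        raise ValueError("second_ref must be lower than first_ref")
    if seasonality_intensity >= first_ref:
        return "first"
    if seasonality_intensity < second_ref:
        return "second"
    return "third"


print(select_decision_method(0.85))  # first
print(select_decision_method(0.55))  # third
print(select_decision_method(0.10))  # second
```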

As a specific example, under the first decision method, if the intermediate time, which is a predetermined ratio of the time from the lowest point (the point within the cycle at which the operation information is at its minimum) to the highest point (the point within the cycle at which the operation information is at its maximum), is greater than or equal to the re-learning time of the detection model, the future lowest point may be determined as the re-learning timing of the learning model.

For example, the predetermined ratio may be 50%.

However, the specific value of the predetermined ratio is not limited thereto and may be varied within a range obvious to a person skilled in the art.

For example, if the time from the lowest point to the highest point is 48 hours, the intermediate time may be 24 hours.

In this case, re-learning of the learning model can finish before the highest point arrives, which may prevent resource overload.

As another specific example, under the first decision method, if the intermediate time is less than the re-learning time, a future point in time that lies between the highest point and the lowest point, while the operation information is on a decreasing trend, may be determined as the re-learning timing of the learning model.

This may be to prevent resource overload, because re-learning started at the lowest point would not finish before the highest point arrives.

Here, the future point in time that lies between the highest point and the lowest point while the operation information is on a decreasing trend may mean the point that precedes the lowest point by a predetermined ratio of the time between the highest point and the lowest point over which the operation information decreases.
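The two branches of the first decision method can be sketched as follows. The sketch assumes that all times are expressed in hours on a common axis and that the same predetermined ratio is used both for the intermediate time and for the offset before the lowest point; the description does not state whether the two ratios are identical. All names and sample values are hypothetical.

```python
def first_method_start(next_lowest, falling_peak, rise_duration,
                       relearn_time, ratio=0.5):
    """Return the re-learning start time under the first decision method.

    next_lowest   : time of the first future lowest point
    falling_peak  : time of the highest point that precedes next_lowest
    rise_duration : time from a lowest point to the following highest point
    relearn_time  : predicted duration of re-learning
    ratio         : predetermined ratio (50% in the example above)
    """
    intermediate_time = ratio * rise_duration
    if intermediate_time >= relearn_time:
        # Re-learning can finish before the next highest point arrives.
        return next_lowest
    # Otherwise start earlier, while the operation information is decreasing.
    return next_lowest - ratio * (next_lowest - falling_peak)


# 48 h from lowest to highest point -> intermediate time of 24 h
print(first_method_start(100, 60, 48, relearn_time=10))  # 100
print(first_method_start(100, 60, 48, relearn_time=30))  # 80.0
```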

The second decision method may determine, as the re-learning timing of the learning model, the point in time at which a pattern of operation information is observed whose similarity to a reference operation pattern (the pattern of the operation information when the learning model was trained in the past) is greater than or equal to a highest similarity criterion, which is a predetermined similarity value.

Here, the second decision method may need to use the same learning model as the one whose re-learning is being judged. For example, when the re-learning timing of the future prediction model is to be determined, it may be determined based on the reference operation pattern, which is the pattern of the operation information when the future prediction model was trained in the past.

The judgment module (125) may set the highest similarity criterion to the similarity value of the operation information pattern most similar to the reference operation pattern.

As a specific example, the judgment module (125) may set the highest similarity criterion to the similarity value of the operation information pattern that was most similar to the reference operation pattern on the cloud server during a predetermined past period (for example, one cycle) counted back from the current point.

The judgment module (125) may calculate the similarity value based on the distance between patterns; the shorter the distance, the higher the similarity value.

For example, for the patterns 1:2:3:4:5:6 and 1:3:3:4:5:7, the distance between the two patterns may be calculated as (1-1)^2+(2-3)^2+(3-3)^2+(4-4)^2+(5-5)^2+(6-7)^2, which yields 2.
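The distance in this example is a sum of squared element-wise differences (a squared Euclidean distance). A minimal sketch, with the hypothetical function name `pattern_distance`:

```python
def pattern_distance(a, b):
    """Sum of squared differences between two ratio patterns."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same number of indicators")
    return sum((x - y) ** 2 for x, y in zip(a, b))


distance = pattern_distance([1, 2, 3, 4, 5, 6], [1, 3, 3, 4, 5, 7])
print(distance)  # 2
```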

For example, similarity values corresponding to distances may be specified in the judgment module (125).

For example, a distance of 3 may correspond to a similarity value of 90.

However, the specific similarity values assigned to distances are not limited thereto and may be varied within a range obvious to a person skilled in the art.

The reference operation pattern may be generated by the preprocessing module (123) and the pattern module (124), which analyze the pattern based on the operation information collected while training takes place, and may then be stored in the storage module (127).

For example, under the second decision method, if the highest similarity value to the reference operation pattern among the operation information patterns observed during the predetermined past period is 98, the highest similarity criterion may be a similarity value of 98.
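Under the second decision method, the highest similarity criterion is the best similarity value seen in the past period, and re-learning is triggered when a newly observed pattern reaches it. The sketch below assumes the similarity values have already been computed; the function names and the sample values other than 98 are hypothetical.

```python
def highest_similarity_criterion(past_similarities):
    """Best similarity to the reference pattern seen in the past period."""
    if not past_similarities:
        raise ValueError("no past similarity values available")
    return max(past_similarities)


def is_relearning_time(current_similarity, criterion):
    """True when the current pattern is at least as similar as the criterion."""
    return current_similarity >= criterion


criterion = highest_similarity_criterion([91, 98, 87, 95])
print(criterion)                          # 98
print(is_relearning_time(96, criterion))  # False
print(is_relearning_time(98, criterion))  # True
```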

The third decision method may, when the seasonality intensity of the operation information is greater than or equal to the second reference value and less than the first reference value, determine the re-learning timing of the detection model by considering either the seasonality or the pattern of the operation information depending on whether a predetermined intermediate condition is satisfied.

Here, the predetermined intermediate condition may be that, during a predetermined past period, there exists operation information whose pattern is similar to the reference operation pattern by at least a minimum similarity criterion, which is a predetermined similarity value.

For example, the predetermined past period may be a predetermined length of time (for example, one cycle) counted back from the current point.

However, the predetermined past period is not limited thereto and may be varied within a range obvious to a person skilled in the art.

For example, the minimum similarity criterion may be a similarity value of 85.

For example, the minimum similarity criterion may be lower than the highest similarity criterion.

However, the specific value of the minimum similarity criterion is not limited thereto and may be varied within a range obvious to a person skilled in the art.

The third decision method is described in detail below.

The judgment module (125) may determine the re-learning timing of the detection model and transmit it to the cloud server.

The judgment module (125) may determine the re-learning timing of the learning model and transmit it to the interface device.

The storage module (127) may store information received from other modules in association with the operation information on which it is based.

In addition, the storage module (127) may store the operation information together with the time at which it was generated, or the time at which the seasonal module (122), the preprocessing module (123), or the storage module (127) received it.

The storage module (127) may store all information needed to implement the timing determination method.

Hereinafter, a timing determination method according to an embodiment of the present invention is described in detail.

FIG. 8 is a flowchart of a timing determination method of a timing determination device according to an embodiment of the present invention.

Referring to FIG. 8, a timing determination method according to an embodiment of the present invention may include: a step in which the timing receiving module (121) receives operation information from the cloud server; a step in which the seasonal module (122) performs time-series analysis on the operation information to calculate its seasonality intensity; a step in which the pattern module (124) analyzes the pattern of the operation information according to a predetermined pattern analysis method; and a step in which the judgment module (125) determines, by a predetermined decision method selected according to the seasonality intensity, the re-learning timing of a learning model running on the cloud server.

FIG. 9 is an operational diagram of a timing determination device according to an embodiment of the present invention.

Referring to FIG. 9, the timing receiving module (121) may receive operation information from the cloud server.

The operation information is transmitted to the seasonal module (122) and the preprocessing module (123), respectively. The seasonal module (122) may calculate the seasonality intensity of the operation information according to a predetermined intensity analysis method and transmit it to the judgment module (125), and the preprocessing module (123) may reduce the dimension of the operation information and transmit it to the pattern module (124).

The pattern module (124) may analyze the pattern of the operation information by the predetermined pattern analysis method and transmit it to the judgment module (125).

The judgment module (125) may select one of the first to third decision methods according to the seasonality intensity to determine the re-learning timing of the learning model.

The re-learning timing determined by the judgment module (125) and the grounds for that determination are transmitted to the interface device, and the interface device may generate a user interface through which the manager (M10) can review the re-learning timing and its grounds, and transmit it to the manager (M10).

FIG. 10 is a graph for explaining the first decision method executed by the judgment module of a timing determination device according to an embodiment of the present invention.

Referring to FIG. 7 and FIG. 10, the lowest point (X11) may mean the point within the cycle of the collected operation information at which the operation information is at its minimum, and the highest point (X12) may mean the point within the cycle at which the operation information is at its maximum.

The intermediate time (T11) may mean a predetermined ratio of the time from the lowest point (X11) to the highest point (X12).

If the intermediate time is greater than or equal to the re-learning time predicted by the prediction module (126), the judgment module (125) may determine the first lowest point (X14) that appears after the current point (X13) as the re-learning timing.

In contrast, if the intermediate time is less than the re-learning time predicted by the prediction module (126), the judgment module (125) may determine as the re-learning timing a point (X16) between the highest point (X15) and the lowest point (X14) of the first decreasing trend that appears after the current point (X13).

In other words, if the intermediate time is less than the re-learning time predicted by the prediction module (126), the judgment module (125) may determine as the re-learning timing the point (X16) that precedes the first future lowest point (X14) after the current point (X13) by a predetermined ratio of the time between the highest point (X15) and the lowest point (X14) of that decreasing trend.

FIG. 11 and FIG. 12 are drawings for explaining the third decision method executed by the judgment module of a timing determination device according to an embodiment of the present invention.

Referring to FIG. 8 and FIG. 11, when the predetermined intermediate condition is not satisfied, the third decision method may determine the future lowest point as the re-learning timing of the learning model by considering the seasonality of the operation information (third point-in-time judgment method).

The method of considering the seasonality of the operation information may be the same as the first decision method, and a detailed description thereof may be omitted to the extent that it overlaps with the description above.

The third decision method may determine the re-learning timing of the learning model differently depending on the number of pieces of operation information that satisfy the predetermined intermediate condition.

If only one piece of operation information satisfies the predetermined intermediate condition, the third decision method may determine the re-learning timing by the first point-in-time judgment method.

The first point-in-time judgment method may designate, as a set similarity value, the similarity value between the reference operation pattern and the pattern of the operation information satisfying the predetermined intermediate condition, and determine as the re-learning timing the point at which operation information whose pattern is similar to the reference operation pattern by the set similarity value is observed.

When the number of pieces of operation information satisfying the predetermined intermediate condition is greater than or equal to a first number and less than a second number that is larger than the first number, the third decision method may take the past operation information that satisfies the predetermined intermediate condition and is closest to a lowest point (the point at which the operation information is at its minimum within one cycle) during the predetermined past period, and determine as the re-learning timing of the detection model the point at which operation information having the same similarity value to the reference operation pattern is observed (second point-in-time judgment method).

For example, the first number may be 2 and the second number may be 5.

However, the specific values of the first number and the second number are not limited thereto and may be varied within a range obvious to a person skilled in the art.

Referring to FIG. 7 and FIG. 12(a), a total of four pieces of operation information (W11, W12, W13, W14) may satisfy the predetermined intermediate condition during the predetermined past period (A11) counted back from the current point (Q11).

In addition, the lowest point may occur twice (S11, S12) during the predetermined past period (A11).

Under the third decision method, the judgment module (125) may designate, as the set similarity value, the similarity value between the reference operation pattern and the pattern of the operation information (W12) closest to a lowest point, and determine as the re-learning timing the point at which operation information whose pattern is similar to the reference operation pattern by the set similarity value is observed.

As a specific example, if the set similarity value is 95, the judgment module (125) may determine as the re-learning timing the point at which operation information whose pattern is similar to the reference operation pattern by a similarity value of 95 is observed.

When the number of pieces of operation information satisfying the predetermined intermediate condition is greater than or equal to the second number, the third decision method may take the past operation information that satisfies the predetermined intermediate condition and is furthest from the point at which abnormal operation was detected during the predetermined past period, and determine as the re-learning timing of the detection model the point at which operation information having the same similarity value to the reference operation pattern is observed (second point-in-time judgment method).

Referring to FIG. 7 and FIG. 12(a), a total of seven pieces of operation information (W11, W12, W13, W14, W15, W16, W17) may satisfy the predetermined intermediate condition during the predetermined past period (A11) counted back from the current point (Q11).

In addition, the anomaly detection model may have detected anomalies a total of two times (F11, F12) during the predetermined past period (A11).

Under the third decision method, the judgment module (125) may designate, as the set similarity value, the similarity value between the reference operation pattern and the pattern of the operation information (W11) furthest from the points (F11, F12) at which anomalies were detected, and determine as the re-learning timing the point at which operation information whose pattern is similar to the reference operation pattern by the set similarity value is observed.

This may be because responding to anomalies should take priority.
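The branching of the third decision method by the number of matching pieces of operation information can be sketched as follows. This is an illustration under several assumptions not fixed by the description: each past observation is a hypothetical (time, similarity) pair, "closest to a lowest point" and "furthest from an anomaly" are measured by the smallest absolute time difference to any such point, and the lists of lowest points and anomaly times are non-empty when their branch is reached. The defaults 85, 2, and 5 are the example values given above.

```python
def set_similarity_value(observations, lowest_times, anomaly_times,
                         min_criterion=85, first_count=2, second_count=5):
    """Return the set similarity value, or None to fall back on seasonality.

    observations : list of (time, similarity) pairs from the past period
    """
    matches = [(t, s) for t, s in observations if s >= min_criterion]
    n = len(matches)
    if n == 0:
        return None  # intermediate condition not met: use the future lowest point
    if n < first_count:
        return matches[0][1]  # single match: use its similarity directly
    if n < second_count:
        # pick the match nearest in time to any lowest point
        chosen = min(matches,
                     key=lambda m: min(abs(m[0] - lt) for lt in lowest_times))
    else:
        # pick the match farthest in time from every detected anomaly
        chosen = max(matches,
                     key=lambda m: min(abs(m[0] - at) for at in anomaly_times))
    return chosen[1]


four = [(1, 90), (5, 95), (9, 88), (12, 86)]
print(set_similarity_value(four, lowest_times=[6, 14], anomaly_times=[]))  # 95
```

Re-learning would then be scheduled for the point at which a newly observed pattern reaches the returned similarity value.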

A detailed description thereof may be omitted to the extent that it overlaps with the description above.

Unlike the above, the learning model may be open source and supplied from an external server.

To express the technical idea of the present invention more clearly, the attached drawings simplify or omit components that are unrelated or only loosely related to that idea.

Although the configuration and features of the present invention have been described above based on embodiments according to the present invention, the present invention is not limited thereto, and it will be apparent to those skilled in the art that various changes or modifications can be made within the spirit and scope of the present invention; such changes or modifications therefore fall within the scope of the appended claims.

100 : Model management system 110 : Variable selection device
120 : Timing determination device

Claims

A variable selection device that determines a representative variable, which is a variable representing the characteristics of the cloud server, based on operation information, which is information about each of a plurality of variables generated while the cloud server is operating;
wherein the variable selection device comprises:
a selection receiving module that receives the operation information, which is information about each of a plurality of variables generated while the cloud server is operating; a preprocessing module that preprocesses the operation information received by the selection receiving module according to a predetermined preprocessing method; a detection module that analyzes, based on the preprocessed operation information and according to a predetermined judgment method, causality between a target variable, which is a predetermined variable, and a comparison variable, which is a variable other than the target variable; and a decision module that determines, as the representative variable, the comparison variable and the target variable having causality that satisfies a predetermined causality condition,
The above pre-determined preprocessing method is,
Including a method for removing the above operation information for a variable whose correlation coefficient with the above target variable is greater than a predetermined standard.
Model management system.

According to claim 1,
The above predetermined judgment method is,
A method of determining causality between two variables using the Granger causality test.
Model management system.

According to claim 1,
It further includes a search module that distinguishes and analyzes the operation information of the target variable for each predetermined time range and detects whether an event occurs in the operation information of the target variable;
The above detection module,
The operation information of the target variable in which the event occurred and the operation information of the comparison variable that occurred in the same time range are compared with each other according to the predetermined judgment method to analyze causality.
Model management system.

According to claim 3,
The above pre-determined causal conditions are,
The condition in which the above pre-determined time range in which causality is recognized is the largest,
Model management system.

According to claim 1,
The above pre-determined preprocessing method is,
A method for removing the operation information that is categorical or whose variance is below a predetermined standard,
Model management system.

According to claim 1,
The above pre-determined preprocessing method is,
A method for correcting the time error between indicator data indicating the above-mentioned operation information of each variable.
Model management system.

A model management method for managing a model implemented by a model management system and operated on a cloud server,
A step of determining a representative variable, which is a variable representing the characteristics of the cloud server, by a variable selection device;
The step in which the above representative variables are determined is:
A step of receiving, by a selection receiving module, operation information, which is information about each of a plurality of variables generated while the cloud server is operating - the variables including a target variable, which is a predetermined variable, and a comparison variable, which is a variable other than the target variable;
A step in which the operation information received by the selection receiving module is preprocessed according to a predetermined preprocessing method by the preprocessing module;
A step in which the operation information of the target variable is distinguished and analyzed by the search module for each predetermined time range, and whether an event occurs is detected from the operation information of the target variable;
A step in which the operation information of the target variable in which an event occurred and the operation information of the comparison variable that occurred in the same time range are compared with each other according to a predetermined judgment method by the detection module to analyze causality; and
A step comprising: a step in which the comparison variable and the target variable, which have a causality satisfying a pre-determined causality condition, are determined as the representative variable by a decision module;
The above pre-determined preprocessing method is,
Including a method for removing the above operation information for a variable whose correlation coefficient with the above target variable is greater than a predetermined standard.
How to manage models.
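
The correlation-based preprocessing named at the end of this claim can be sketched as follows. The claim does not name a correlation measure, so the use of the Pearson coefficient, the comparison on its absolute value, the 0.95 threshold, and the function names are all assumptions.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant series has no defined correlation; treat as 0
    return cov / (norm_x * norm_y)

def drop_highly_correlated(target, comparisons, max_corr=0.95):
    """Remove comparison variables whose |correlation| with target exceeds max_corr.

    Assumption: the absolute value is used, so strong negative
    correlation is removed as well.
    """
    return {
        name: series
        for name, series in comparisons.items()
        if abs(pearson(target, series)) <= max_corr
    }

target = [1.0, 2.0, 3.0, 4.0, 5.0]
comparisons = {
    "cpu_copy": [2.0, 4.0, 6.0, 8.0, 10.0],  # exact multiple of target: removed
    "net_in": [3.0, 1.0, 4.0, 1.0, 5.0],     # weakly correlated: kept
}
kept = drop_highly_correlated(target, comparisons)
```

A variable that is nearly a copy of the target adds little information for the later causality analysis, which is one plausible motivation for filtering it out before that step.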

In claim 1,
Further comprising a timing determination device that determines when to re-learn the machine learning model based on the operation information of the representative variable,
Model management system.

In claim 8,
The timing determination device includes:
A time receiving module that receives the operation information as a time series;
A seasonal module that performs time series decomposition of the operation information and then calculates the seasonal intensity of the operation information according to a predetermined intensity analysis method;
A pattern module that analyzes the pattern of the operation information using a predetermined pattern analysis method; and
A judgment module that determines the re-learning time of the machine learning model run on the cloud server, using a predetermined decision method according to the magnitude of the seasonal intensity,
The predetermined decision method is
A method of determining the re-learning time by considering the seasonality of the operation information or the pattern of the operation information,
Model management system.
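
The claim leaves the intensity analysis method open. As one illustration, a widely used measure of seasonal strength compares the variance of the remainder with the variance of the seasonal component plus remainder, after the series has already been decomposed into trend, seasonal, and remainder parts. The sketch below assumes that measure and assumes the decomposition has been done elsewhere; the function name `seasonal_strength` is hypothetical.

```python
from statistics import pvariance

def seasonal_strength(seasonal, remainder):
    """Seasonal strength in [0, 1]: max(0, 1 - Var(R) / Var(S + R)).

    seasonal, remainder: components from a prior time series decomposition
    (the decomposition itself is outside this sketch).
    """
    detrended = [s + r for s, r in zip(seasonal, remainder)]
    denominator = pvariance(detrended)
    if denominator == 0:
        return 0.0  # no variation left after detrending: no seasonality
    return max(0.0, 1.0 - pvariance(remainder) / denominator)

# Toy components: a strong alternating seasonal pattern with small noise.
seasonal = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
remainder = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1]
strength = seasonal_strength(seasonal, remainder)
```

Values near 1 indicate that the seasonal component dominates the detrended series, and values near 0 indicate that it is mostly noise.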

In claim 9,
The predetermined decision method includes
A first determination method of determining the re-learning time by considering the seasonality of the operation information when the seasonal intensity of the operation information is greater than or equal to a first reference value, and a second determination method of determining the re-learning time by considering the pattern of the operation information when the seasonal intensity of the operation information is less than a second reference value that is lower than the first reference value,
Model management system.
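
The two-threshold selection in this claim can be sketched as a small dispatch function. The reference values 0.6 and 0.3 and the function name `choose_relearning_basis` are hypothetical. The claim does not say what happens when the seasonal intensity falls between the two reference values, so the sketch returns `None` in that band rather than guessing.

```python
def choose_relearning_basis(seasonal_intensity, first_ref=0.6, second_ref=0.3):
    """Pick which determination method applies for a given seasonal intensity.

    Returns "seasonality" for the first determination method, "pattern" for
    the second, and None for the band the claim leaves unspecified.
    """
    if second_ref >= first_ref:
        raise ValueError("second_ref must be lower than first_ref")
    if seasonal_intensity >= first_ref:
        return "seasonality"  # first determination method
    if seasonal_intensity < second_ref:
        return "pattern"      # second determination method
    return None               # between the reference values: not covered

basis_high = choose_relearning_basis(0.8)
basis_low = choose_relearning_basis(0.1)
basis_mid = choose_relearning_basis(0.45)
```

Keeping the two reference values separate leaves a middle band in which neither method is forced, which a real system would need to handle with its own policy.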