KR102546540B1

KR102546540B1 - Method and apparatus for prediction of traffic congestion based on LSTM

Info

Publication number: KR102546540B1
Application number: KR1020210078113A
Authority: KR
Inventors: 신동훈; 권혜정; 백지원; 정경용
Original assignee: 경기대학교 산학협력단
Priority date: 2021-06-16
Filing date: 2021-06-16
Publication date: 2023-06-22
Anticipated expiration: 2041-06-16
Also published as: KR20220168409A

Abstract

The present invention relates to an LSTM-based traffic congestion prediction method and, more particularly, to an LSTM-based traffic congestion prediction method and apparatus that impute missing values using spatio-temporal features extracted from traffic flow data and predict traffic congestion with an LSTM. According to an embodiment of the present invention, the LSTM-based traffic congestion prediction apparatus can learn the time-series characteristics of traffic data by taking order and time into account. According to another embodiment of the present invention, because traffic flow in the outskirts of a city is less affected by external disturbances than traffic flow in the downtown area, fewer factors act as variables during prediction, and the apparatus can therefore provide highly accurate predictions.

Description

LSTM-based traffic congestion prediction method and apparatus {METHOD AND APPARATUS FOR PREDICTION OF TRAFFIC CONGESTION BASED ON LSTM}

The present invention relates to an LSTM-based traffic congestion prediction method and, more particularly, to an LSTM-based traffic congestion prediction method and apparatus that impute missing values using spatio-temporal features extracted from traffic flow data and predict traffic congestion with an LSTM.

Recently, smart cars have been produced in various forms based on the core technologies of the Fourth Industrial Revolution. The role of the car is changing from a simple means of transportation to a living space and to a form of infotainment that provides new convenience value. As demand for smart cars grows, collecting and processing traffic information for smooth road management becomes a very important area, so a qualitative approach, not merely a quantitative one, is required. To this end, research is being conducted on intelligent transportation systems (ITS), which apply IT to existing transportation systems.

Research on the problems caused by growing demand for road resources is a major line of study that underpins transportation welfare, which promotes user convenience. Transportation welfare comprises various factors such as operating cost, travel time, accident cost, parking cost, punctuality, and accessibility, and traffic congestion accounts for a large share of it. Accordingly, intelligent transportation systems improve the function of the road network by collecting traffic information on all road sections in real time and providing information such as congested sections, traffic volume, and traffic accident status.

In addition, ITS operates a real-time traffic information service. Because the service is used to give drivers optimal routes, prevent congestion from worsening on congested roads, and distribute the traffic volume of those roads, it emphasizes speed, and its accuracy is relatively low. To solve this problem, real-time traffic pattern prediction using deep learning models and various other prediction models has recently been studied actively, as has traffic volume prediction based on time-series data.

In addition, large amounts of data are being generated owing to the Internet of Things and sensors, increased tracking and collection of customer data by companies, growth in unstructured data due to the spread of social network services, and advances in storage media together with falling prices. Research on big data, artificial intelligence, machine learning, and deep learning, representative Fourth Industrial Revolution technologies in which data plays a key role, is becoming active. In such research, model performance is determined by the quantity and quality of the data. The quantity problem is being resolved, but data quality suffers because values are often missing or abnormal values are stored, for various reasons, when data are collected in the real world. This leads to difficulty in data analysis, unbalanced data structures, and reduced predictive performance of models. Therefore, algorithms for imputing missing values and prediction techniques using statistical methods are needed.

1. Korean Patent Publication No. 10-2013-0158919, “Traffic flow prediction system using space-time probability model” (published June 30, 2015)

The present invention provides an LSTM-based traffic congestion prediction method and apparatus that remove outliers from traffic data through missing-value imputation using spatio-temporal feature extraction, and that predict continuous, pattern-preserving traffic data through an LSTM model suited to the structure of time-series data.

According to one aspect of the present invention, an LSTM-based traffic congestion prediction apparatus is provided.

An LSTM-based traffic congestion prediction apparatus according to an embodiment of the present invention may include a data collection unit that collects traffic data, a data correction unit that corrects the collected traffic data, a preprocessing unit that runs an adversarial autoencoder (AAE)-based missing-value imputation model on the corrected traffic data, and a prediction unit that predicts traffic congestion from the preprocessed traffic data using a time-series deep learning model (LSTM).

According to one aspect of the present invention, an LSTM-based traffic congestion prediction method and a computer program that executes it are provided.

An LSTM-based traffic congestion prediction method according to an embodiment of the present invention may include collecting traffic data, correcting the collected traffic data, preprocessing the corrected traffic data according to an adversarial autoencoder (AAE)-based missing-value imputation model, and predicting traffic congestion from the preprocessed traffic data through a time-series deep learning model.

According to an embodiment of the present invention, the LSTM-based traffic congestion prediction apparatus can learn the time-series characteristics of traffic data by taking order and time into account.

According to another embodiment of the present invention, because traffic flow in the outskirts of a city is less affected by external disturbances than traffic flow in the downtown area, fewer factors act as variables during prediction, and the LSTM-based traffic congestion prediction apparatus can therefore provide highly accurate predictions.

The effects of the present invention are not limited to those described above and should be understood to include all effects that can be inferred from the configuration of the invention set out in the description or the claims.

FIG. 1 is a diagram for explaining the process of an intelligent transportation system according to an embodiment of the present invention.
FIG. 2 is a diagram illustrating an LSTM-based traffic congestion prediction apparatus according to an embodiment of the present invention.
FIGS. 3 and 4 are diagrams for explaining outlier removal in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.
FIGS. 5 and 6 are diagrams for explaining missing-value correction by the spatial trend method in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.
FIG. 7 is a diagram for explaining missing-value correction by the time trend method in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.
FIG. 8 is a diagram for explaining the adversarial autoencoder (AAE)-based missing-data imputation model in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.
FIG. 9 is a diagram for explaining the time-series deep learning model in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.
FIGS. 10 and 11 are diagrams for explaining the LSTM-based traffic congestion prediction apparatus according to an embodiment of the present invention.
FIG. 12 is a flowchart illustrating the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

Hereinafter, the present invention will be described with reference to the accompanying drawings. The present invention may, however, be embodied in many different forms and is therefore not limited to the embodiments described herein. To describe the present invention clearly, parts irrelevant to the description are omitted from the drawings, and similar parts are given similar reference numerals throughout the specification.

Throughout the specification, when a part is said to be "connected (joined, contacted, coupled)" to another part, this includes not only being "directly connected" but also being "indirectly connected" with another member in between. In addition, when a part is said to "include" a component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

The terms used in this specification are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" or "have" are intended to specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining the process of an intelligent transportation system (ITS) according to an embodiment of the present invention.

Referring to FIG. 1, an intelligent transportation system (ITS) is a transportation system that improves the efficiency and stability of traffic by making the operation and management of the transportation system scientific and automated through technology convergence.

Examples of intelligent transportation systems (ITS) include bus stop arrival information systems, signal control systems that respond to traffic volume at intersections, real-time traffic information in navigation systems, and Hi-Pass.

An intelligent transportation system (ITS) may collect traffic information from traffic information collection devices installed on roads or roadsides. The traffic information collection devices may include CCTV, vehicle detection, GPS, and weather information systems. Information collected by the traffic information collection devices may be transmitted to a traffic data management center through wired or wireless communication.

The traffic data management center secures, analyzes, and processes the collected information for use in traffic facilities. The analyzed and processed data may be provided for traffic management. The traffic data management center can control traffic conditions and deliver information to users. An intelligent transportation system (ITS) can collect information by applying advanced technologies such as electronic control and communication to means of transportation and transportation facilities, and can provide traffic information and services.

Depending on the embodiment, the traffic data management center may provide information to a variable message sign (VMS) through the Internet. The traffic data management center may also provide information to traffic signal controllers and mobile devices. In addition, traffic management may be performed through operation and logistics management.

FIG. 2 is a diagram illustrating an LSTM-based traffic congestion prediction apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the LSTM-based traffic congestion prediction apparatus 100 may include a data collection unit 110, a data correction unit 120, a preprocessing unit 130, and a prediction unit 140.

The data collection unit 110 may collect data for providing traffic conditions to users. The collected data may include node/link data and traffic data provided by an intelligent transportation system (ITS).

The node/link data may include data on road sections and on the points at which roads connect.

Traffic data may be collected through traffic information collection equipment on roads or roadsides. Traffic data may contain missing values and outliers: information may fail to be collected because of collection equipment errors or because of shadow areas that arise when no vehicles travel on a road, and outliers may be produced by speeding.

Depending on the embodiment, traffic data are collected every 5 minutes, and the collected data may contain interval missing, in which no data at all are collected for a given 5-minute interval; location missing, in which data are not collected for some roads; and days missing, in which an entire day is not collected.

Depending on the embodiment, traffic data may contain outliers, which are not missing values but values that deviate from the average traffic flow of a road by being excessively small or large. That is, traffic speed data contain both outliers, which distort the flow of the average traffic speed, and missing data that were not collected.

Depending on the embodiment, traffic data should be collected every 5 minutes. Temporal missing occurs when no data at all are collected for a 5-minute interval, and spatial missing occurs when data are collected for the interval but not for some roads.
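The distinction between temporal and spatial missing can be illustrated with a small sketch. It assumes a link-time matrix stored as nested Python lists in which `None` marks an uncollected value; the function names and the sample speeds are hypothetical and are not taken from the patent.

```python
from typing import Optional

# Hypothetical link-time matrix: one row per link, one column per
# 5-minute slot. None marks a value that was not collected.
Matrix = list[list[Optional[float]]]


def temporal_missing_slots(speeds: Matrix) -> list[int]:
    """Return slots in which no link reported a value."""
    n_slots = len(speeds[0])
    return [t for t in range(n_slots) if all(row[t] is None for row in speeds)]


def spatial_missing_cells(speeds: Matrix) -> list[tuple[int, int]]:
    """Return (link, slot) gaps in slots where other links did report."""
    temporal = set(temporal_missing_slots(speeds))
    return [
        (link, t)
        for link, row in enumerate(speeds)
        for t, value in enumerate(row)
        if value is None and t not in temporal
    ]


speeds = [
    [52.0, None, 50.0, 49.0],
    [48.0, None, None, 47.0],
    [61.0, None, 60.0, 58.0],
]
print(temporal_missing_slots(speeds))  # [1]
print(spatial_missing_cells(speeds))   # [(1, 2)]
```

Slot 1 is missing for every link, so it counts as temporal missing, whereas the gap at link 1, slot 2 is spatial because the other links reported in that slot.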

The data correction unit 120 may process outliers and missing values and thereby perform data preprocessing for extracting the feature values used to predict traffic congestion with a deep learning model. To this end, the data correction unit 120 may remove outliers through data filtering. The data correction unit 120 may also correct missing values in traffic data, which have both time-series and spatial characteristics.

Depending on the embodiment, when the deep learning model is built from the preprocessed data, the data correction unit 120 may use, as input data, the speed 5 minutes earlier, the average speed 5 minutes earlier, the current speed, and the adjacent upstream speed, and may use, as output data, the speed 5 minutes later.

When correcting missing data, the preprocessing unit 130 may run an adversarial autoencoder (AAE)-based missing-data imputation model that uses a time convolution layer and a graph convolution layer.

The adversarial autoencoder (AAE) may be a model that combines the advantages of the variational autoencoder (VAE), which can generate data through an encoder-decoder structure, and the generative adversarial network (GAN), which includes a generator and a discriminator.

The prediction unit 140 may predict traffic congestion from the preprocessed data through a time-series deep learning model. A long short-term memory (LSTM) network may be used as the time-series deep learning model.

FIGS. 3 and 4 are diagrams for explaining outlier removal in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

To correct outliers and missing values in the traffic data, the data correction unit 120 may first remove the outliers. Outlier removal methods may include the median absolute deviation, the trimmed mean, and the Winsorized mean. The data correction unit 120 may use or combine the methods appropriate to the characteristics of each road and of the traffic data.

To remove outliers, the data correction unit 120 may apply the median absolute deviation method using the algorithm shown in FIG. 3. The median absolute deviation is a kind of absolute deviation and is a method of detecting abnormally large or small values using the median of the collected data. The data correction unit 120 may treat the values detected in the collected data by the median absolute deviation method as outliers and remove them.

In FIG. 4, the traffic data take the form of a link-time matrix that contains outliers and each kind of missing value. The vertical axis of the traffic data matrix represents the link ID (LINK_ID), and the horizontal axis represents time (TIME).

In the traffic data matrix, the values at (t2,L4), (t3,L1), and (t3,L5) are 74, 80, and 77, respectively, and the data correction unit 120 may use the median to determine that these values are abnormally large and are therefore outliers. The data correction unit 120 may then remove these outlier values from the traffic data matrix by replacing them with NA.
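A minimal sketch of median-absolute-deviation filtering follows. The algorithm of FIG. 3 is not reproduced in the text, so the cut-off of 3 and the scaling constant 1.4826 are common conventions assumed here, and the function name and sample speeds are hypothetical.

```python
from statistics import median
from typing import Optional


def remove_outliers_mad(
    values: list[Optional[float]], threshold: float = 3.0
) -> list[Optional[float]]:
    """Replace values far from the median with None (NA)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    center = median(observed)
    mad = median(abs(v - center) for v in observed)
    if mad == 0:
        return list(values)  # degenerate spread: skip to avoid dividing by zero
    scale = 1.4826 * mad  # assumed constant; makes MAD comparable to a std. dev.
    return [
        None if v is not None and abs(v - center) / scale > threshold else v
        for v in values
    ]


# Hypothetical speeds of one link over seven 5-minute slots.
link_speeds = [31.0, 30.0, 33.0, 74.0, 32.0, 29.0, 31.0]
print(remove_outliers_mad(link_speeds))
# [31.0, 30.0, 33.0, None, 32.0, 29.0, 31.0]
```

Because the median and the MAD are barely affected by the extreme value itself, the reading of 74 is flagged while ordinary fluctuations around 31 are kept.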

FIGS. 5 and 6 are diagrams for explaining missing-value correction by the spatial trend method in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

The data correction unit 120 may correct missing values using the spatial trend method according to the algorithm shown in FIG. 5. The data correction unit 120 may correct missing values on the premise that sections have similar traffic patterns, that is, that upstream traffic flow directly affects downstream traffic flow. When a traffic detector malfunctions and data are lost, the data correction unit 120 may correct the gap with the average of the data collected by nearby traffic detectors on adjacent links.

Referring to FIG. 6, when the data values at (t2,L4), (t3,L1), (t3,L4), (t3,L5), (t4,L3), (t4,Ln), (tn,L3), and (tn,Ln) in the traffic data matrix are NA because of spatial missing, the data correction unit 120 may correct them with the average of the data collected by adjacent traffic detectors.
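The spatial trend correction can be sketched as follows. The sketch assumes that links are stored in road order, so that rows `link - 1` and `link + 1` are the adjacent detectors; this ordering, the function name, and the sample values are illustrative assumptions and do not reproduce the algorithm of FIG. 5.

```python
from typing import Optional

Matrix = list[list[Optional[float]]]


def impute_spatial(speeds: Matrix) -> Matrix:
    """Fill each gap with the mean of the adjacent links in the same slot."""
    filled = [row[:] for row in speeds]
    for link, row in enumerate(speeds):
        for t, value in enumerate(row):
            if value is not None:
                continue
            # Only originally observed neighbours are used, never filled ones.
            neighbours = [
                speeds[j][t]
                for j in (link - 1, link + 1)
                if 0 <= j < len(speeds) and speeds[j][t] is not None
            ]
            if neighbours:
                filled[link][t] = sum(neighbours) / len(neighbours)
    return filled


matrix = [
    [50.0, 52.0],
    [None, 48.0],
    [46.0, None],
]
print(impute_spatial(matrix))
# [[50.0, 52.0], [48.0, 48.0], [46.0, 48.0]]
```

A gap whose neighbours are also missing is left as `None`, so a later step such as the time trend correction can handle it.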

FIG. 7 is a diagram for explaining missing-value correction by the time trend method in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

The data correction unit 120 may correct missing values using the time trend method. Through the time trend method, the data correction unit 120 may estimate a missing value by averaging the n data points that precede the time at which the missing value was detected.

The data correction unit 120 may estimate the missing value through a correction equation that uses the time trend, as in Equation 1 below.

[Equation 1]

$$F_t = \frac{1}{n}\sum_{k=1}^{n} A_{t-k}$$

Here, $F_t$ is the estimate of the missing data for the current period $t$, $A_{t-k}$ is the detected data of period $t-k$, and $n$ is the number of past detection periods used.

Referring to FIG. 7, when the data at time t1 are missing from the traffic data matrix because of temporal missing, the data correction unit 120 may correct the missing values based on the data of previous time points.
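The time trend correction, which averages the n periods preceding a gap, can be sketched as below. The function name, the default n = 3, and the choice to let earlier estimates feed later ones when gaps are consecutive are assumptions made for illustration.

```python
from typing import Optional


def impute_temporal(series: list[Optional[float]], n: int = 3) -> list[Optional[float]]:
    """Fill each gap with the mean of up to n preceding values."""
    filled = list(series)
    for t, value in enumerate(filled):
        if value is None:
            # Earlier estimates are reused when gaps are consecutive (assumption).
            window = [x for x in filled[max(0, t - n):t] if x is not None]
            if window:
                filled[t] = sum(window) / len(window)
    return filled


print(impute_temporal([40.0, 42.0, 44.0, None, 46.0]))
# [40.0, 42.0, 44.0, 42.0, 46.0]
```

A gap at the very start of the series has no preceding values and therefore remains `None`.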

FIG. 8 is a diagram for explaining the adversarial autoencoder (AAE)-based missing-data imputation model in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

The structure of the adversarial autoencoder (AAE) model may be used to impute missing traffic data. To learn the spatio-temporal features of the traffic data, the preprocessing unit 130 may add a time convolution layer and a graph convolution layer to the encoder and the decoder, and may perform the computation through a block that stacks these layers to extract the features of each and then merges the features.

The adversarial autoencoder (AAE) may compute the Hadamard product of a mask vector, random noise (z), and the vector of data with missing values (x). The computed value (X) may be fed to the encoder as input data to model the parameters of the latent distribution. The decoder may draw samples from the latent space to predict the missing values. The discriminator may be trained to distinguish the imputed data from the observed values.
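The text does not give the exact formula that combines the mask, the noise, and the data. One common construction in adversarial imputation models, assumed here purely for illustration, keeps observed entries and fills missing entries with noise: X = m ⊙ x + (1 − m) ⊙ z, where ⊙ is the element-wise (Hadamard) product. The function name and values are hypothetical.

```python
from typing import Optional


def mask_and_fill(
    x: list[Optional[float]], noise: list[float]
) -> tuple[list[float], list[float]]:
    """Return the mask m and the encoder input X = m*x + (1 - m)*z."""
    mask = [0.0 if v is None else 1.0 for v in x]
    data = [0.0 if v is None else v for v in x]  # placeholder 0 where missing
    combined = [m * d + (1.0 - m) * z for m, d, z in zip(mask, data, noise)]
    return mask, combined


mask, encoder_input = mask_and_fill([50.0, None, 47.0], [0.1, 0.2, 0.3])
print(mask)           # [1.0, 0.0, 1.0]
print(encoder_input)  # [50.0, 0.2, 47.0]
```

In practice the noise would be drawn at random for each training step; fixed values are used here so that the result is reproducible.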

That is, in the preprocessing unit 130, the encoder of the variational autoencoder (VAE) acts as the generator: it receives data x and samples a latent vector z, and the decoder of the generator reconstructs the data from it. To produce a continuous latent vector, the adversarial autoencoder (AAE) first defines a prior distribution p(z) and is trained using a discriminator, which receives the latent vector z as input and compares it with the prior distribution p(z) as training proceeds.

FIG. 9 is a diagram for explaining the time-series deep learning model in the LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

Referring to FIG. 9, an LSTM cell consists of a memory cell and gates. Input information is stored in the memory cell, and the gates may be used to control the information stored in the cell.

Here, the gates of the LSTM cell may include an input gate, a forget gate, and an output gate. The input gate controls the input data received at the current time t, the output gate controls the output data emitted at the current time t, and the forget gate controls whether data is retained at the current time t. Accordingly, the LSTM-based traffic congestion prediction method may predict traffic congestion with a time-series-based deep learning model that controls input data through the input gate, output data through the output gate, and data retention through the forget gate.
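The role of the three gates can be illustrated with one step of a standard LSTM cell. The patent does not state the gate equations, so the sketch below uses the textbook formulation; the weight layout (`W`, `U`, `b` stacked in the order input, forget, output, candidate) and the function names are assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    Shapes (assumed): W is (4H, D), U is (4H, H), b is (4H,),
    stacked in the order input gate, forget gate, output gate, candidate.
    """
    H = h_prev.shape[0]
    a = W @ x_t + U @ h_prev + b
    i = sigmoid(a[0:H])            # input gate: how much new input to admit
    f = sigmoid(a[H:2 * H])        # forget gate: how much old state to retain
    o = sigmoid(a[2 * H:3 * H])    # output gate: how much state to emit
    g = np.tanh(a[3 * H:4 * H])    # candidate cell content
    c_t = f * c_prev + i * g       # updated memory cell
    h_t = o * np.tanh(c_t)         # output at time t
    return h_t, c_t
```

With all weights set to zero, each gate evaluates to 0.5, so the cell keeps half of its previous memory and emits half of its squashed state, which makes the gating effect easy to check by hand.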

In the LSTM model, LSTM#1 may produce an output based on the data of the first time (t-|p|+2). LSTM#2 may produce an output based on the data of the second time (t-1). LSTM#3 may produce the predicted output by applying an attention mechanism to the data of the third time (t) and the outputs of the preceding LSTMs.
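How an attention mechanism can weight the outputs of the earlier LSTMs is sketched below. The patent does not specify the form of attention, so this example assumes plain dot-product attention with a softmax; `attend`, `query`, and `prev_outputs` are hypothetical names, with `query` standing for a representation of the data at time t.

```python
import numpy as np

def attend(query, prev_outputs):
    """Dot-product attention over earlier LSTM outputs (assumed form).

    query: (H,) representation of the current time step.
    prev_outputs: (N, H) outputs of the preceding LSTMs.
    Returns the weighted context vector and the attention weights.
    """
    scores = prev_outputs @ query              # similarity of each output to the query
    weights = np.exp(scores - scores.max())    # numerically stable softmax
    weights /= weights.sum()
    context = weights @ prev_outputs           # weighted sum of earlier outputs
    return context, weights

prev = np.array([[5.0, 0.0], [0.0, 5.0]])      # stand-ins for the two earlier outputs
context, weights = attend(np.array([1.0, 0.0]), prev)
```

In this arrangement the earlier output that is most similar to the current representation receives the largest weight, and the context vector would be passed on to the layer that produces the prediction.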

Depending on the embodiment, the input data of the LSTM model may be the average of the speeds 10 minutes and 5 minutes earlier, the current speed, and the speed of the adjacent upstream section. The output data may be the speed 5 minutes later.
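The input and output layout described above can be sketched as a small sample builder. This example assumes that speeds are recorded every 5 minutes and reads the first input as the mean of the speeds 10 and 5 minutes earlier; the function name `make_samples` and the array names are illustrative.

```python
import numpy as np

def make_samples(speed, upstream):
    """Build (input, target) pairs from 5-minute speed series (assumed interval).

    speed: speeds of the target section, one value per 5 minutes.
    upstream: speeds of the adjacent upstream section, same sampling.
    """
    speed = np.asarray(speed, dtype=float)
    upstream = np.asarray(upstream, dtype=float)
    X, y = [], []
    for t in range(2, len(speed) - 1):
        past_avg = (speed[t - 2] + speed[t - 1]) / 2.0  # 10 and 5 minutes earlier
        X.append([past_avg, speed[t], upstream[t]])     # model inputs at time t
        y.append(speed[t + 1])                          # speed 5 minutes later
    return np.array(X), np.array(y)

X, y = make_samples([10, 20, 30, 40, 50], [1, 2, 3, 4, 5])
```

Each row of `X` holds the three inputs for one time step, and the matching entry of `y` is the speed one interval later.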

FIGS. 10 and 11 are diagrams for explaining an LSTM-based traffic congestion prediction apparatus according to an embodiment of the present invention.

As shown in FIG. 10, the prediction apparatus may perform preprocessing for LSTM-based traffic congestion prediction.

To this end, the LSTM-based traffic congestion prediction apparatus 100 may display the speed data for every section of the prediction range in a table. Once a link ID (LINK_ID) has been entered and at least one preprocessing step has been selected from outlier removal and missing-value imputation, the LSTM-based traffic congestion prediction apparatus 100 may perform the preprocessing when the start button is operated.

In addition, the LSTM-based traffic congestion prediction apparatus 100 may display the data of a preprocessed section as a graph, and the preprocessed data may be saved through a save button.

Referring to FIG. 11, the LSTM-based traffic congestion prediction apparatus 100 may predict traffic congestion based on the preprocessed data.

In the LSTM-based traffic congestion prediction apparatus 100, a section may be selected in the section selection window at the upper left. For the selected section, the LSTM-based traffic congestion prediction apparatus 100 may display a description of the section (LINK_ID, LINK_NAME, Velocity) in the LINK overview window and the per-date speed collection status of the section in the Data collection window.

When the 5-minute-ahead prediction is selected in the Status window and the Predict button is operated, the LSTM-based traffic congestion prediction apparatus 100 may display the overall congestion result for the section, with congestion shown as smooth (green), congested (orange), or very congested (red). The thresholds may be set to 30 km/h or more for smooth, 15 km/h to 30 km/h for congested, and less than 15 km/h for very congested. In addition, numerical information for sections where congestion is expected may be displayed in a table.
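The threshold rule above maps a predicted speed to a congestion level and display color. A minimal sketch follows; the function name `congestion_level` is illustrative, and a speed of exactly 30 km/h is treated as smooth because the source defines smooth as 30 km/h or more.

```python
def congestion_level(speed_kmh):
    """Map a predicted speed in km/h to (level, color) using the stated thresholds."""
    if speed_kmh >= 30:
        return "smooth", "green"
    if speed_kmh >= 15:
        return "congested", "orange"
    return "very congested", "red"

# Example: classify a few predicted speeds.
levels = [congestion_level(v) for v in (42.0, 22.5, 9.0)]
```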

FIG. 12 is a flowchart illustrating an LSTM-based traffic congestion prediction method according to an embodiment of the present invention.

Referring to FIG. 12, in step S10, the LSTM-based traffic congestion prediction apparatus 100 may collect traffic data.

In step S20, the LSTM-based traffic congestion prediction apparatus 100 may remove outliers from the collected traffic data.

In step S30, the LSTM-based traffic congestion prediction apparatus 100 may correct missing values of the traffic data using at least one of a spatial trend utilization method and a temporal trend utilization method.

In step S40, the LSTM-based traffic congestion prediction apparatus 100 may run the adversarial autoencoder (AAE)-based missing-value imputation model on the corrected traffic data. The LSTM-based traffic congestion prediction apparatus 100 may apply both the spatial trend utilization method, which uses data from sections with similar traffic patterns, and the temporal trend utilization method, which corrects the current value from past data, so that missing values in the data are corrected in a complementary way from the separate viewpoints of space and time.
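The complementary use of the spatial and temporal trend methods described above can be sketched as follows. The patent does not give the correction formulas, so this example assumes the simplest reading: the temporal estimate is the mean of the same time slot on past days, the spatial estimate is the value of a section with a similar pattern, and the two are averaged. All names (`impute_missing`, `history`, `similar`) are hypothetical.

```python
import numpy as np

def impute_missing(current, history, similar):
    """Fill NaN entries of `current` from temporal and spatial estimates.

    current: (T,) speeds of the target section, NaN where missing.
    history: (D, T) speeds of the same section on D past days.
    similar: (T,) speeds of a section with a similar traffic pattern.
    """
    current = np.asarray(current, dtype=float)
    temporal = np.nanmean(np.asarray(history, dtype=float), axis=0)  # past data
    spatial = np.asarray(similar, dtype=float)                       # similar section
    estimate = np.nanmean(np.vstack([temporal, spatial]), axis=0)    # combine both views
    return np.where(np.isnan(current), estimate, current)

filled = impute_missing(
    [50.0, np.nan, 40.0],
    [[48.0, 30.0, 42.0], [52.0, 34.0, 38.0]],
    [49.0, 36.0, 41.0],
)
```

Observed values are left untouched; only the missing slot is replaced, here by the average of the two estimates.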

In step S50, the LSTM-based traffic congestion prediction apparatus 100 may predict traffic congestion from the preprocessed traffic data using a time-series-based deep learning model (LSTM; Long Short-Term Memory).

The LSTM-based traffic congestion prediction method described above may be implemented as computer-readable code on a computer-readable medium. The computer-readable recording medium may be, for example, a removable recording medium (CD, DVD, Blu-ray disc, USB storage device, removable hard disk) or a fixed recording medium (ROM, RAM, hard disk built into a computer). The computer program recorded on the computer-readable recording medium may be transmitted to another computing device over a network such as the Internet and installed on that device, and may thereby be used on that device.

Although all the components of the embodiments of the present invention have been described above as being combined into one or operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the object of the present invention, one or more of the components may be selectively combined and operated.

Although operations are shown in a particular order in the drawings, this should not be understood to mean that the operations must be performed in the particular order shown or in sequential order, or that all of the illustrated operations must be performed, to obtain a desired result. In certain circumstances, multitasking and parallel processing may be advantageous. Moreover, the separation of the various components in the embodiments described above should not be understood as being required, and it should be understood that the described program components and systems may generally be integrated together into a single software product or packaged into multiple software products.

The present invention has been described above with a focus on its embodiments. Those of ordinary skill in the art to which the present invention pertains will understand that the present invention may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in a descriptive sense rather than a limiting one. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the equivalent scope should be construed as being included in the present invention.

100: LSTM-based traffic congestion prediction apparatus
110: data collection unit
120: data correction unit
130: preprocessing unit
140: prediction unit

Claims

An LSTM-based traffic congestion prediction apparatus comprising:
a data collection unit that collects traffic data;
a data correction unit that corrects the collected traffic data;
a preprocessing unit that runs a missing-value imputation model on the corrected traffic data; and
a prediction unit that predicts traffic congestion from the preprocessed traffic data using a time-series-based deep learning model (LSTM),
wherein the data correction unit
removes outliers from the collected traffic data through data filtering,
corrects missing values of the traffic data using a spatial trend utilization method that uses data of sections with similar traffic patterns, and
corrects missing values of the traffic data using a temporal trend utilization method that corrects the current value from past data,
wherein the missing-value imputation model is
an adversarial autoencoder-based missing-data imputation model,
in which a temporal convolution layer and a graph-based convolution layer are added to the encoder and the decoder, the layers are stacked to extract their respective features, and the features are combined to perform the operation, and
in which the Hadamard product of a mask vector, random noise, and a vector of data containing missing values is input to the encoder to model the parameters of the latent distribution, the decoder draws samples from the latent space to predict the missing values, and the discriminator is trained by distinguishing imputed data from observed values.

(Deleted)

The LSTM-based traffic congestion prediction apparatus according to claim 1,
wherein the time-series-based deep learning model
includes an input gate that controls input data,
an output gate that controls output data, and
a forget gate that controls whether data is retained,
produces respective outputs based on the data of a first time and the data of a second time, and
produces a predicted output by applying an attention mechanism to the data of a third time and said outputs.

An LSTM-based traffic congestion prediction method comprising:
collecting traffic data;
performing data correction on the collected traffic data;
performing data preprocessing according to a missing-value imputation model on the corrected traffic data; and
predicting traffic congestion from the preprocessed traffic data through a time-series-based deep learning model,
wherein performing the data correction comprises
removing outliers from the traffic data through data filtering;
correcting missing values of the traffic data using a temporal trend utilization method that corrects the current value from past data; and
correcting missing values of the traffic data using a spatial trend utilization method that uses data of sections with similar traffic patterns,
wherein the missing-value imputation model is
an adversarial autoencoder-based missing-data imputation model,
in which a temporal convolution layer and a graph-based convolution layer are added to the encoder and the decoder, the layers are stacked to extract their respective features, and the features are combined to perform the operation, and
in which the Hadamard product of a mask vector, random noise, and a vector of data containing missing values is input to the encoder to model the parameters of the latent distribution, the decoder draws samples from the latent space to predict the missing values, and the discriminator is trained by distinguishing imputed data from observed values.

(Deleted)

The LSTM-based traffic congestion prediction method according to claim 6,
wherein predicting traffic congestion through the time-series-based deep learning model comprises
controlling input data through an input gate;
controlling output data through an output gate; and
controlling whether data is retained through a forget gate,
wherein respective outputs are produced based on the data of a first time and the data of a second time, and
a predicted output is produced by applying an attention mechanism to the data of a third time and said outputs.

A computer program recorded on a computer-readable recording medium for executing the LSTM-based traffic congestion prediction method according to any one of claims 6 and 10.