CSR: Cascade Conditional Variational Auto Encoder with Socially-aware Regression for Pedestrian Trajectory Prediction

Authors:

Hao Zhou,

Dongchun Ren,

Xu Yang,

Mingyu Fan,

Hai Huang

Volume 133, Issue C

https://doi.org/10.1016/j.patcog.2022.109030

Published: 01 January 2023

Highlights

• The proposed trajectory prediction method consists of a cascaded CVAE module and a socially aware regression module.

• The cascaded CVAE module decouples and balances the loss function with respect to time steps and minimizes the loss at every time step independently.

• The socially aware regression module corrects the predictions by checking the compatibility between the interaction coding and the crude predicted trajectories.

Abstract

Pedestrian trajectory prediction is a key technology in many real-world applications such as video surveillance, social robot navigation, and autonomous driving, and significant progress has been made on this research topic. However, previous studies have two limitations. First, the losses of the last time steps are weighted more heavily than those of the beginning time steps in the objective function at the learning stage, causing the prediction errors generated at the beginning to accumulate into large errors at the last time steps at the inference stage. Second, the prediction results of multiple pedestrians in the prediction horizon might be socially incompatible with the interactions modeled by past trajectories. To overcome these limitations, this work proposes a novel trajectory prediction method called CSR, which consists of a cascaded conditional variational autoencoder (CVAE) module and a socially-aware regression module. The CVAE module estimates the future trajectories in a cascaded sequential manner. Specifically, each CVAE takes the concatenation of the past trajectories and the location points predicted so far as its input and predicts the location at the next time step. The socially-aware regression module generates offsets from the estimated future trajectories to produce the corrected predictions, which are more reasonable and accurate than the estimated trajectories. Experimental results demonstrate that the proposed method exhibits significant improvements over state-of-the-art methods of approximately 38.0% on the Stanford Drone Dataset (SDD) and 22.2% on the ETH/UCY dataset. The code is available at https://github.com/zhouhao94/CSR.
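The cascaded data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the linked repository is the authoritative source). The names `step_predictor`, `cascaded_predict`, and `apply_offsets` are hypothetical, and each stage is a constant-velocity stand-in perturbed by a latent sample rather than a trained CVAE. The sketch only shows how each stage conditions on the past trajectory plus all points predicted so far, and how regressed offsets would be added to the crude predictions.

```python
import numpy as np


def step_predictor(context, z):
    """Hypothetical stand-in for one CVAE stage in the cascade.

    context: (N, t, 2) past positions plus the points predicted so far.
    z:       (N, 2) latent sample; here it only perturbs a
             constant-velocity guess (an assumption, not the paper's model).
    Returns the next position of each pedestrian, shape (N, 2).
    """
    velocity = context[:, -1] - context[:, -2]
    return context[:, -1] + velocity + z


def cascaded_predict(past, horizon, rng, noise_scale=0.05):
    """Roll out `horizon` steps, one stage per future time step."""
    # One predictor per step; a real cascade would hold separate trained models.
    predictors = [step_predictor] * horizon
    context = past.copy()
    for predict in predictors:
        z = noise_scale * rng.standard_normal((past.shape[0], 2))
        next_pos = predict(context, z)
        # Append the new point so the next stage conditions on it as well.
        context = np.concatenate([context, next_pos[:, None, :]], axis=1)
    return context[:, past.shape[1]:]  # crude predictions, (N, horizon, 2)


def apply_offsets(crude, offsets):
    """Corrected prediction = crude prediction + regressed offsets.

    In the paper the offsets come from the socially-aware regression
    module; here they are simply passed in by the caller.
    """
    return crude + offsets


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three pedestrians observed for 8 steps, walking along the x axis.
    past = np.zeros((3, 8, 2))
    past[:, :, 0] = np.arange(8)
    crude = cascaded_predict(past, horizon=12, rng=rng)
    corrected = apply_offsets(crude, np.zeros_like(crude))
    print(crude.shape, corrected.shape)
```

Because every stage is a separate predictor, a loss can in principle be attached to each stage's output on its own, which is the decoupling the highlights refer to. How those per-step losses are weighted and trained is specified in the paper, not in this sketch.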

