research-article

Deep Association: End-to-end Graph-Based Learning for Multiple Object Tracking with Conv-Graph Neural Network

Authors:

Yueqing Zhuang,

Xiaodong Xie

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval

Pages 253 - 261

https://doi.org/10.1145/3323873.3325010

Published: 05 June 2019

Abstract

Multiple Object Tracking (MOT) has a wide range of applications in surveillance retrieval and autonomous driving. Most existing methods extract features with deep learning and then associate detections through hand-crafted optimization over a bipartite graph or network flow. In this paper, we propose an efficient end-to-end model, Deep Association Network (DAN), that learns from graph-based training data constructed from the spatial-temporal interactions of objects. DAN combines a Convolutional Neural Network (CNN), a Motion Encoder (ME), and a Graph Neural Network (GNN). The CNNs extract appearance features from bounding-box images and the Motion Encoders extract motion features from positions; the GNN then optimizes the graph structure to associate the same object across frames. In addition, we present a novel end-to-end training strategy for DAN. Experimental results on MOT16 and DukeMTMCT show that DAN is comparable to state-of-the-art methods without using extra datasets.
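
For orientation, the hand-crafted bipartite association that the abstract contrasts with DAN can be sketched as follows. This is a minimal illustration, not the authors' method: the affinity formula, the weights `w_app` and `motion_scale`, the threshold `min_affinity`, and the detection layout (a dict with `feat` and `box` keys) are all assumptions introduced here, and the exhaustive search is only practical for a handful of detections.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def center(box):
    """Center of an (x, y, w, h) bounding box."""
    x, y, w, h = box
    return x + w / 2.0, y + h / 2.0


def affinity(det_a, det_b, w_app=0.5, motion_scale=50.0):
    """Hand-crafted affinity: weighted appearance and motion terms (assumed form)."""
    app = cosine(det_a["feat"], det_b["feat"])
    (ax, ay), (bx, by) = center(det_a["box"]), center(det_b["box"])
    mot = math.exp(-math.hypot(ax - bx, ay - by) / motion_scale)
    return w_app * app + (1.0 - w_app) * mot


def associate(prev, curr, min_affinity=0.3):
    """Best one-to-one matching between two frames by exhaustive search."""
    swap = len(prev) > len(curr)
    small, large = (curr, prev) if swap else (prev, curr)
    best_score, best_perm = -math.inf, ()
    # Try every injective assignment of the smaller set into the larger one.
    for perm in permutations(range(len(large)), len(small)):
        score = sum(affinity(small[i], large[j]) for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_perm = score, perm
    pairs = []
    for i, j in enumerate(best_perm):
        p, c = (j, i) if swap else (i, j)
        # Drop weak matches so unmatched detections can start new tracks.
        if affinity(prev[p], curr[c]) >= min_affinity:
            pairs.append((p, c))
    return sorted(pairs)


# Hypothetical detections: appearance feature vector plus (x, y, w, h) box.
prev = [
    {"feat": [1.0, 0.0], "box": (10, 10, 20, 40)},
    {"feat": [0.0, 1.0], "box": (100, 10, 20, 40)},
]
curr = [
    {"feat": [0.1, 0.9], "box": (104, 12, 20, 40)},
    {"feat": [0.9, 0.1], "box": (13, 11, 20, 40)},
    {"feat": [0.7, 0.7], "box": (300, 200, 20, 40)},
]
print(associate(prev, curr))  # [(0, 1), (1, 0)]
```

In this baseline the affinity function and the matching step are fixed by hand; per the abstract, DAN instead learns appearance features, motion features, and the association end to end.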

Index Terms

Deep Association: End-to-end Graph-Based Learning for Multiple Object Tracking with Conv-Graph Neural Network
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Tracking

Information & Contributors

Information

Published In

ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval

June 2019

427 pages

ISBN:9781450367653

DOI:10.1145/3323873

General Chairs:
Abdulmotaleb El Saddik (University of Ottawa, Canada),
Alberto Del Bimbo (University of Florence, Italy),
Zhongfei Zhang (Binghamton University, State University of New York, USA)

Program Chairs:
Alexander Hauptmann (Carnegie Mellon University, USA),
K. Selcuk Candan (Arizona State University, USA),
Marco Bertini (University of Florence, Italy),
Lexing Xie (Australian National University, Australia),
Xiao-Yong Wei (Sichuan University, China)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICMR '19

Sponsor:

SIGMM

ICMR '19: International Conference on Multimedia Retrieval

June 10 - 13, 2019

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 254 of 830 submissions, 31%

Bibliometrics & Citations

Article Metrics

Total Citations: 30
Total Downloads: 1,278
Downloads (Last 12 months): 77
Downloads (Last 6 weeks): 13

Reflects downloads up to 05 Mar 2025

Citations

Cited By

Zhang Y, Zheng L, Huang Q. (2025). Multi-object tracking based on graph neural networks. Multimedia Systems 31:1. Online publication date: 1-Feb-2025.
https://dl.acm.org/doi/10.1007/s00530-025-01679-8

He J, Huang Z, Wang N, Zhang Z. (2024). Learnable Graph Matching: A Practical Paradigm for Data Association. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:7 (4880-4895). Online publication date: Jul-2024.
https://doi.org/10.1109/TPAMI.2024.3362401

Nguyen T, Nguyen H, Sartipi M, Fisichella M. (2024). Multi-Vehicle Multi-Camera Tracking With Graph-Based Tracklet Features. IEEE Transactions on Multimedia 26 (972-983). Online publication date: 2024.
https://doi.org/10.1109/TMM.2023.3274369

Zhang W, Zhang X, Xu X, Xu Y, Shao Z, Shi J, Wei S, Zeng T. (2024). GNN-JFL: Graph Neural Network for Video SAR Shadow Tracking With Joint Motion-Appearance Feature Learning. IEEE Transactions on Geoscience and Remote Sensing 62 (1-17). Online publication date: 2024.
https://doi.org/10.1109/TGRS.2024.3383870

Bolshakov V. (2024). Multi-Agent Reinforcement Learning as Interaction Model for Online Multi-Object Tracking. 2024 6th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE) (1-6). Online publication date: 29-Feb-2024.
https://doi.org/10.1109/REEPE60449.2024.10479814

Peng K, Dong T, Zhang W, Zhang J. (2024). City Traffic Aware Multi-Target Tracking Prediction with Multi-Camera. 2024 IEEE International Conference on Image Processing Challenges and Workshops (ICIPCW) (4224-4230). Online publication date: 27-Oct-2024.
https://doi.org/10.1109/ICIPCW64161.2024.10769126

Zhang Y, Huang Q, Zheng L. (2024). Multiple object tracking based on appearance and motion graph convolutional neural networks with an explainer. Neural Computing and Applications 36:22 (13799-13814). Online publication date: 1-Aug-2024.
https://dl.acm.org/doi/10.1007/s00521-024-09773-0

Nahata D, Othman K. (2023). Exploring the challenges and opportunities of image processing and sensor fusion in autonomous vehicles: A comprehensive review. AIMS Electronics and Electrical Engineering 7:4 (271-321). Online publication date: 2023.
https://doi.org/10.3934/electreng.2023016

Li T, Sun J, Liu Y, Zhang X, Zhu D, Guo Z, Geng L. (2023). ESMO: Joint Frame Scheduling and Model Caching for Edge Video Analytics. IEEE Transactions on Parallel and Distributed Systems 34:8 (2295-2310). Online publication date: Aug-2023.
https://doi.org/10.1109/TPDS.2023.3281598

Zhang X, Ling Y, Yang Y, Chu C, Zhou Z. (2023). Center-point-pair detection and context-aware re-identification for end-to-end multi-object tracking. Neurocomputing 524 (17-30). Online publication date: Mar-2023.
https://doi.org/10.1016/j.neucom.2022.11.094
