DOI: 10.1145/3323873.3325010
research-article

Deep Association: End-to-end Graph-Based Learning for Multiple Object Tracking with Conv-Graph Neural Network

Published: 05 June 2019

Abstract

Multiple Object Tracking (MOT) has a wide range of applications in surveillance retrieval and autonomous driving. Most existing methods focus on extracting features with deep learning and on hand-crafted optimization of a bipartite graph or network flow. In this paper, we propose an efficient end-to-end model, Deep Association Network (DAN), that learns from graph-based training data constructed from the spatial-temporal interactions of objects. DAN combines a Convolutional Neural Network (CNN), a Motion Encoder (ME), and a Graph Neural Network (GNN). The CNNs and Motion Encoders extract appearance features from bounding-box images and motion features from positions, respectively, and the GNN then optimizes the graph structure to associate the same object across frames. In addition, we present a novel end-to-end training strategy for the Deep Association Network. Our experimental results on MOT16 and DukeMTMCT demonstrate that DAN is effective and comparable to state-of-the-art methods without using extra datasets.
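For contrast with the learned association described in the abstract, the sketch below illustrates the kind of hand-crafted bipartite association step that existing methods rely on. It is not DAN and not the authors' code: the dictionary layout of a detection, the `motion_weight` value, and all function names are assumptions made for illustration, and the exhaustive search stands in for a proper assignment solver.

```python
from itertools import permutations
from math import sqrt


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def center(box):
    """Center (cx, cy) of a box given as (x, y, width, height)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def edge_cost(det_a, det_b, motion_weight=0.01):
    """Hand-crafted edge cost: appearance distance plus weighted center displacement.

    motion_weight is an arbitrary illustrative value, not taken from the paper.
    """
    (ax, ay), (bx, by) = center(det_a["box"]), center(det_b["box"])
    displacement = sqrt((ax - bx) ** 2 + (ay - by) ** 2)
    return cosine_distance(det_a["feat"], det_b["feat"]) + motion_weight * displacement


def associate(prev_dets, curr_dets):
    """Minimum-cost one-to-one matching between two frames by exhaustive search.

    Returns a list of (prev_index, curr_index) pairs. Only practical for a
    handful of detections, since it enumerates every possible assignment.
    """
    if len(prev_dets) > len(curr_dets):
        raise ValueError("this sketch assumes len(prev_dets) <= len(curr_dets)")
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(curr_dets)), len(prev_dets)):
        cost = sum(edge_cost(prev_dets[i], curr_dets[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return list(enumerate(best_perm))


if __name__ == "__main__":
    # Two toy detections per frame; features are made-up 2-D appearance vectors.
    prev = [
        {"box": (10, 10, 20, 40), "feat": (1.0, 0.0)},
        {"box": (100, 10, 20, 40), "feat": (0.0, 1.0)},
    ]
    curr = [
        {"box": (102, 11, 20, 40), "feat": (0.1, 0.9)},
        {"box": (12, 10, 20, 40), "feat": (0.9, 0.1)},
    ]
    print(associate(prev, curr))  # prints [(0, 1), (1, 0)]
```

In this baseline the cost function is fixed by hand and the matching is solved separately from feature extraction. The abstract's claim is that DAN instead learns the association end to end, with the GNN operating on the graph built from appearance and motion features.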


    Information & Contributors

    Information

    Published In

    ICMR '19: Proceedings of the 2019 on International Conference on Multimedia Retrieval
    June 2019
    427 pages
    ISBN: 9781450367653
    DOI: 10.1145/3323873
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. computer vision
    2. deep association
    3. deep learning
    4. graph neural network
    5. multiple object tracking
    6. surveillance retrieval

    Qualifiers

    • Research-article

    Conference

    ICMR '19

    Acceptance Rates

    Overall Acceptance Rate 254 of 830 submissions, 31%

    Article Metrics

    • Downloads (Last 12 months): 81
    • Downloads (Last 6 weeks): 6
    Reflects downloads up to 27 Dec 2024

    Cited By

    • (2024) Learnable Graph Matching: A Practical Paradigm for Data Association. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:7 (4880-4895). DOI: 10.1109/TPAMI.2024.3362401. Online publication date: Jul-2024
    • (2024) Multi-Vehicle Multi-Camera Tracking With Graph-Based Tracklet Features. IEEE Transactions on Multimedia 26 (972-983). DOI: 10.1109/TMM.2023.3274369. Online publication date: 2024
    • (2024) GNN-JFL: Graph Neural Network for Video SAR Shadow Tracking With Joint Motion-Appearance Feature Learning. IEEE Transactions on Geoscience and Remote Sensing 62 (1-17). DOI: 10.1109/TGRS.2024.3383870. Online publication date: 2024
    • (2024) Multi-Agent Reinforcement Learning as Interaction Model for Online Multi-Object Tracking. 2024 6th International Youth Conference on Radio Electronics, Electrical and Power Engineering (REEPE) (1-6). DOI: 10.1109/REEPE60449.2024.10479814. Online publication date: 29-Feb-2024
    • (2024) City Traffic Aware Multi-Target Tracking Prediction with Multi-Camera. 2024 IEEE International Conference on Image Processing Challenges and Workshops (ICIPCW) (4224-4230). DOI: 10.1109/ICIPCW64161.2024.10769126. Online publication date: 27-Oct-2024
    • (2024) Multiple object tracking based on appearance and motion graph convolutional neural networks with an explainer. Neural Computing and Applications 36:22 (13799-13814). DOI: 10.1007/s00521-024-09773-0. Online publication date: 1-Aug-2024
    • (2023) Exploring the challenges and opportunities of image processing and sensor fusion in autonomous vehicles: A comprehensive review. AIMS Electronics and Electrical Engineering 7:4 (271-321). DOI: 10.3934/electreng.2023016. Online publication date: 2023
    • (2023) ESMO: Joint Frame Scheduling and Model Caching for Edge Video Analytics. IEEE Transactions on Parallel and Distributed Systems 34:8 (2295-2310). DOI: 10.1109/TPDS.2023.3281598. Online publication date: Aug-2023
    • (2023) Center-point-pair detection and context-aware re-identification for end-to-end multi-object tracking. Neurocomputing 524 (17-30). DOI: 10.1016/j.neucom.2022.11.094. Online publication date: Mar-2023
    • (2023) A systematic survey on recent deep learning-based approaches to multi-object tracking. Multimedia Tools and Applications 83:12 (36203-36259). DOI: 10.1007/s11042-023-16910-9. Online publication date: 26-Sep-2023
