research-article

Online Correction of Camera Poses for the Surround-view System: A Sparse Direct Approach

Authors:

Yicong Zhou

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 18, Issue 4

Article No.: 106, Pages 1 - 24

https://doi.org/10.1145/3505252

Published: 04 March 2022

Abstract

The surround-view module is an indispensable component of a modern advanced driving assistance system. Once the intrinsics and extrinsics of the surround-view cameras have been accurately calibrated, a top-down surround view can be generated from the raw fisheye images. However, the poses of these cameras may change over time. How to correct the camera poses of a surround-view system online, without re-calibration, is still an open issue. To address this problem, we introduce the sparse direct framework and propose a novel optimization scheme with a cascade structure. The scheme comprises two levels of optimization, each with its own photometric-error-based model. The first-level optimization uses the ground model, so called because its photometric errors are measured on the ground plane. The second-level optimization uses the ground-camera model, in which photometric errors are computed on the imaging planes. With these models, the pose correction task is formulated as a nonlinear least-squares problem that minimizes photometric errors in the overlapping regions of adjacent bird’s-eye-view images. Cascading the two levels of optimization achieves an appropriate balance between speed and accuracy. Experiments show that our method effectively eliminates the misalignment caused by moderate camera pose changes in the surround-view system. Source code and test cases are available online at https://cslinzhang.github.io/CamPoseCorrection/.
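
To make the idea of minimizing photometric error over sparse points concrete, the following is a minimal toy sketch and not the authors' method. It assumes a smooth synthetic texture with an analytic gradient, replaces the camera pose with a simple 2D offset, and uses plain Gauss-Newton; the function names `intensity`, `gradient`, and `correct_offset` are hypothetical and chosen only for illustration.

```python
import numpy as np

def intensity(p):
    """Smooth synthetic 'ground texture' (an assumption); p is an (N, 2) array."""
    x, y = p[:, 0], p[:, 1]
    return np.sin(0.5 * x) + np.cos(0.3 * y) + 0.5 * np.sin(0.2 * x + 0.4 * y)

def gradient(p):
    """Analytic image gradient of `intensity`, returned as an (N, 2) array."""
    x, y = p[:, 0], p[:, 1]
    gx = 0.5 * np.cos(0.5 * x) + 0.1 * np.cos(0.2 * x + 0.4 * y)
    gy = -0.3 * np.sin(0.3 * y) + 0.2 * np.cos(0.2 * x + 0.4 * y)
    return np.stack([gx, gy], axis=1)

def correct_offset(points, reference, t0, iters=20):
    """Gauss-Newton on sum_i (I(p_i + t) - reference_i)^2 over sparse points."""
    t = np.asarray(t0, dtype=float)
    for _ in range(iters):
        r = intensity(points + t) - reference       # photometric residuals
        J = gradient(points + t)                    # d r_i / d t, shape (N, 2)
        step = np.linalg.solve(J.T @ J, -J.T @ r)   # normal equations
        t = t + step
        if np.linalg.norm(step) < 1e-10:
            break
    return t

# Sparse sample points standing in for pixels in an overlapping region.
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 10.0, size=(200, 2))

# Hypothetical misalignment to recover: the reference intensities are
# observed at the shifted positions.
true_t = np.array([0.3, -0.2])
reference = intensity(points + true_t)

estimate = correct_offset(points, reference, t0=[0.0, 0.0])
```

In the paper's setting the unknowns would be camera poses rather than a 2D offset, and the residuals would be measured on the ground plane or on the imaging planes, as the abstract describes. The toy keeps only the structure of the problem: sparse samples, photometric residuals, and an iterative least-squares update.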

Index Terms

Online Correction of Camera Poses for the Surround-view System: A Sparse Direct Approach
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision tasks
        Scene understanding
      2. Image and video acquisition
        Camera calibration

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 18, Issue 4

November 2022

497 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3514185

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 March 2022

Accepted: 01 December 2021

Revised: 01 October 2021

Received: 01 May 2021

Published in TOMM Volume 18, Issue 4

Qualifiers

Research-article
Refereed

Funding Sources

National Natural Science Foundation of China
Natural Science Foundation of Shanghai
Shanghai Science and Technology Innovation Plan
Dawn Program of Shanghai Municipal Education Commission
Shanghai Municipal Science and Technology Major Project
Fundamental Research Funds for the Central Universities
