Abstract
When a precise 3D reconstruction of an object or person is attempted, one typically starts from a multi-view setup with cameras spread around the investigation area. The matching joints are then triangulated to retrieve their 3D coordinates. However, calibrating such a setup typically requires dedicated equipment and elaborate test procedures. In this paper, we demonstrate a calibration method based only on the detection of one or more people walking through the field of view. In effect, this allows the calibration to happen simultaneously with the measurements being taken, which is practical in uncontrolled environments. We also show that this calibration procedure is more accurate than a typical incremental calibration procedure using a chessboard. Conceptually, the novelty we propose is to drive the calibration with semantic information (e.g. the position of the left shoulder) rather than appearance-based information, since semantic information is far less viewpoint-dependent. Note that while we use human pose keypoints here, car keypoints could serve the same role in larger outdoor scenes.
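To make the idea concrete, the minimal Python/OpenCV sketch below shows how matched 2D joint detections (e.g. the left shoulder of a walking person seen by two cameras over many frames) can take the place of chessboard corners: the relative camera pose is estimated with the five-point algorithm inside RANSAC and the joints are then triangulated. This is an illustration only, not the pipeline evaluated in the paper; the function name, the assumption of shared known intrinsics K and the RANSAC threshold are illustrative choices.

```python
import cv2
import numpy as np


def calibrate_pair_from_joints(joints_cam1, joints_cam2, K):
    """Relative pose of camera 2 w.r.t. camera 1 from matched 2D joint
    detections (N x 2 pixel coordinates of the same physical joints,
    e.g. left shoulders collected over many frames), given a shared
    intrinsic matrix K (3 x 3)."""
    pts1 = np.ascontiguousarray(joints_cam1, dtype=np.float64)
    pts2 = np.ascontiguousarray(joints_cam2, dtype=np.float64)

    # Five-point algorithm inside RANSAC; the loose threshold tolerates
    # the pixel noise of joint detectors (an illustrative value).
    E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                      prob=0.999, threshold=3.0)

    # Decompose E into rotation R and unit-norm translation t.
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

    # Triangulate the joints (3D positions are defined up to global scale).
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    joints_3d = (pts4d[:3] / pts4d[3]).T
    return R, t, joints_3d
```

With more than two cameras, such pairwise estimates would typically only initialize an incremental structure-from-motion reconstruction followed by bundle adjustment; the thresholds discussed in the notes below relate to that stage.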
Notes
Available in the OpenCV library as the function cv2.solvePnP.
Downloaded from https://www.epfl.ch/labs/cvlab/data/data-pom-index-php
Some thresholds were increased tenfold (i.e. five_point_algo_threshold, triangulation_threshold, resection_threshold and bundle_outlier_fixed_threshold) to account for joint-detection error, and retriangulation_ratio and bundle_new_points_ratio were set to 0.8 (a sketch applying these overrides follows these notes). No extensive tuning was needed for these parameters.
Performed using DLT from [21].
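The parameter names in the note above appear to match those of an OpenSfM-style incremental structure-from-motion configuration. Assuming such a pipeline, a minimal sketch of the overrides could look as follows; the baseline values shown are illustrative placeholders rather than the library's actual defaults.

```python
# Illustrative sketch of the threshold overrides described in the notes,
# assuming an OpenSfM-style configuration dictionary. Baseline values are
# placeholders, not the library's actual defaults.
baseline_config = {
    "five_point_algo_threshold": 0.004,       # placeholder
    "triangulation_threshold": 0.006,         # placeholder
    "resection_threshold": 0.004,             # placeholder
    "bundle_outlier_fixed_threshold": 0.006,  # placeholder
    "retriangulation_ratio": 1.2,             # placeholder
    "bundle_new_points_ratio": 1.2,           # placeholder
}


def relax_for_joint_detections(config):
    """Relax the four reprojection-style thresholds tenfold (joint detectors
    are noisier than corner features) and set both ratios to 0.8, as stated
    in the note above."""
    relaxed = dict(config)
    for key in ("five_point_algo_threshold", "triangulation_threshold",
                "resection_threshold", "bundle_outlier_fixed_threshold"):
        relaxed[key] = 10 * relaxed[key]
    relaxed["retriangulation_ratio"] = 0.8
    relaxed["bundle_new_points_ratio"] = 0.8
    return relaxed


config = relax_for_joint_detections(baseline_config)
```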
References
Claeys, A., Hoedt, S., Domken, C., Aghezzaf, E., Claeys, D., Cottyn, J.: Methodology to integrate ergonomics information in contextualized digital work instructions. In: 9th CIRP Conference on Assembly Technology and Systems, Procedia CIRP, vol. 106, pp. 168–173 (2022)
Tripicchio, P., D’Avella, S., Camacho-Gonzalez, G., Landolfi, L., Baris, G., Avizzano, C.A., Filippeschi, A.: Multi-camera extrinsic calibration for real-time tracking in large outdoor environments. J. Sens. Actuator Netw. 11, 40 (2022)
Lepetit, V., Moreno-Noguer, F., Fua, P.: EPnP: an accurate O(n) solution to the PnP problem. Int. J. Comput. Vision 81, 155–166 (2009)
Cefalu, A., Haala, N., Fritsch, D.: Structureless bundle adjustment with self-calibration using accumulated constraints. In: ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. III-3 (2016)
Agarwal, S., Furukawa, Y., Snavely, N., Simon, I., Curless, B., Seitz, S., Szeliski, R.: Building Rome in a day. In: ICCV (2009)
Svoboda, T., Martinec, D., Pajdla, T.: A convenient multicamera self-calibration for virtual environments. Presence 14(4), 407–422 (2005)
Lowe, D.G.: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 60(2), 91–110 (2004)
Xie, T., Dai, K., Wang, K., Li, R., Zhao, L.: DeepMatcher: a deep transformer-based network for robust and accurate local feature matching. arXiv:2301.02993 (2023)
Fleuret, F., Berclaz, J., Lengagne, R., Fua, P.: Multi-camera people tracking with a probabilistic occupancy map. IEEE Trans Pattern Anal Machine Intell 30(2), 267–282 (2008). https://doi.org/10.1109/TPAMI.2007.1174
Puwein, J., Ballan, L., Ziegler, R., Pollefeys, M.: Joint camera pose estimation and 3D human pose estimation in a multi-camera setup. In: Proceedings of the Asian Conference on Computer Vision, pp. 473–487. Springer (2014)
Takahashi, K., Mikami, D., Isogawa, M., Kimata, H.: Human pose as calibration pattern: 3D human pose estimation with multiple unsynchronized and uncalibrated cameras. In: Conference on Computer Vision and Pattern Recognition Workshops (CVPR) (2018)
Xu, Y., Li, Y.J., Weng, X., Kitani, K.: Wide-baseline multi-camera calibration using person re-identification. In: Conference on Computer Vision and Pattern Recognition Workshops (CVPR) (2021)
Kendall, A., Grimes, M., Cipolla, R.: PoseNet: a convolutional network for real-time 6-DOF camera relocalization. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2938–2946 (2015)
Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: HigherHRNet: scale-aware representation learning for bottom-up human pose estimation. arXiv:1908.10357 (2019)
Ester, M., Kriegel, H.P., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, ISBN 1-57735-004-9, pp. 226–231 (1996)
Dehaeck, S., Domken, C., Bey-Temsamani, A., Abedrabbo, G.: A strong geometric baseline for cross-view matching of multi-person 3D pose estimation from multi-view images. In: Image Analysis and Processing – ICIAP 2022, ISBN 978-3-031-06430-2, pp. 77–88 (2022)
Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24(6), 381–395 (1981)
Dong, J., Jiang, W., Huang, Q., Bao, H., Zhou, X.: Fast and robust multi-person 3D pose estimation from multiple views. In: Conference on Computer Vision and Pattern Recognition Workshops (CVPR) (2019)
Tanke, J., Gall, J.: Iterative greedy matching for 3D human pose tracking from multiple views. In: German Conference on Pattern Recognition (2019)
Gendreau, M., Potvin, J.: Handbook of Metaheuristics. Springer, ISBN 978-3-319-91085-7 (2019)
Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision. Cambridge University Press, Cambridge, England (2004)
Song, X., Wang, P., Zhou, D., Zhu, R., Guan, C., Dai, Y., Su, H., Li, H., Yang, R.: ApolloCar3D: a large 3D car instance understanding benchmark for autonomous driving. arXiv:1811.12222 (2018)
Acknowledgements
We would like to thank E. Kikken for the many interesting discussions and E. Hage for constructing the Unity dataset. This research received funding from the Flanders Make ‘2018-134-Ergo-Eyehand-CONV-ICON’ project.
About this article
Cite this article
Dehaeck, S., Domken, C., Bey-Temsamani, A. et al. Wide-baseline multi-camera calibration from a room filled with people. Machine Vision and Applications 34, 45 (2023). https://doi.org/10.1007/s00138-023-01395-1