Monocular Absolute 3D Human Pose Estimation with an Uncalibrated Fixed Camera

  • Conference paper
Frontiers of Computer Vision (IW-FCV 2024)

Abstract

In this paper, we propose a method for absolute 3D human pose estimation (HPE) with an uncalibrated monocular camera. When workers’ movements are analyzed with an existing uncalibrated camera, previous methods cannot estimate absolute human pose because the camera parameters are unknown. Our proposed method overcomes this limitation by determining the position and scale of humans based on the pose of surrounding objects. Specifically, we estimate the intrinsic and extrinsic parameters of the camera through user-guided manual manipulation. Subsequently, the estimated human pose is transformed from local coordinates to global coordinates for each frame. This absolute coordinate representation allows for real-time prediction of human movements relative to objects. To assess the efficacy of our method, we conducted three kinds of experiments. A user study revealed that the proposed user-guided method achieves accurate estimation of camera parameters. Quantitative evaluation using a public dataset demonstrated that our method can predict human pose with practical accuracy, providing a benchmark for future enhancements. Qualitative evaluation with a unique dataset showed that our method could easily generate digital twin representations across diverse environments and camera positions.
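
The abstract does not spell out the local-to-global transformation, so the following is only a minimal sketch of what such a step could look like, not the authors' implementation. It assumes the pose is given as root-relative joints in the camera frame, that the root position in camera coordinates and a metric scale factor are already known, and that the extrinsics follow the common convention x_cam = R x_world + t. The function name `local_to_global` and all parameter names are hypothetical.

```python
import numpy as np

def local_to_global(joints_local, root_cam, R, t, scale=1.0):
    """Map root-relative joints (camera frame) to world coordinates.

    Assumed convention (not taken from the paper): x_cam = R @ x_world + t.
    joints_local: (J, 3) joint offsets relative to the root joint.
    root_cam:     (3,) root joint position in camera coordinates.
    R, t:         (3, 3) rotation and (3,) translation of the extrinsics.
    scale:        factor converting the local pose to metric units.
    """
    joints_local = np.asarray(joints_local, dtype=float)
    root_cam = np.asarray(root_cam, dtype=float)
    # Place the scaled, root-relative pose at the root position (camera frame).
    joints_cam = scale * joints_local + root_cam
    # Invert the rigid transform: x_world = R^T (x_cam - t).
    # For row vectors, right-multiplying by R applies R^T.
    return (joints_cam - np.asarray(t, dtype=float)) @ np.asarray(R, dtype=float)

# Toy example: camera rotated 90 degrees about z and offset by 5 along z.
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([0.0, 0.0, 5.0])
joints_world = local_to_global([[0, 0, 0], [0, 1, 0]], [1, 0, 5], R, t)
print(joints_world)
```

Applying this per frame with fixed R and t reflects the fixed-camera setting: the camera parameters are estimated once, and only the pose and root position change over time.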

Author information

Corresponding author

Correspondence to Atsunori Moteki.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Moteki, A., Hirai, Y., Suzuki, G., Saito, H. (2024). Monocular Absolute 3D Human Pose Estimation with an Uncalibrated Fixed Camera. In: Irie, G., Shin, C., Shibata, T., Nakamura, K. (eds) Frontiers of Computer Vision. IW-FCV 2024. Communications in Computer and Information Science, vol 2143. Springer, Singapore. https://doi.org/10.1007/978-981-97-4249-3_5

  • DOI: https://doi.org/10.1007/978-981-97-4249-3_5

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-4248-6

  • Online ISBN: 978-981-97-4249-3

  • eBook Packages: Computer Science (R0)