DOI: 10.1145/3385959.3418452
research-article
Open access

BodySLAM: Opportunistic User Digitization in Multi-User AR/VR Experiences

Published: 30 October 2020

Abstract

Today’s augmented and virtual reality (AR/VR) systems do not provide body, hand, or mouth tracking without special worn sensors or external infrastructure. At the same time, AR/VR systems are increasingly used in co-located, multi-user experiences, which opens the possibility of opportunistically capturing other users. This is the core idea behind BodySLAM, which uses the disparate camera views of co-located users to digitize the body, hands, and mouth of other people, and then relays that information back to the respective users. If a user is seen by two or more people, 3D pose can be estimated via stereo reconstruction. Our system also maps the arrangement of users in real-world coordinates. Our approach requires no additional hardware or sensors beyond what is already found in commercial AR/VR devices, such as Microsoft HoloLens or Oculus Quest.
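The stereo reconstruction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each observing headset has a known 3x4 camera projection matrix and reports the 2D pixel location of the same body keypoint; the function names, intrinsics, and camera placement below are hypothetical, and the linear (DLT) triangulation shown is a standard textbook method, not necessarily the one used in BodySLAM.

```python
import numpy as np


def triangulate_point(projections, pixels):
    """Linear (DLT) triangulation of one 3D point seen in two or more views.

    projections: list of 3x4 camera projection matrices (assumed known).
    pixels: list of (u, v) observations of the same keypoint, one per view.
    """
    if len(projections) < 2 or len(projections) != len(pixels):
        raise ValueError("need matching projections and pixels from >= 2 views")
    rows = []
    for P, (u, v) in zip(projections, pixels):
        P = np.asarray(P, dtype=float)
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, point):
    """Project a 3D point to pixel coordinates with projection matrix P."""
    x = np.asarray(P, dtype=float) @ np.append(point, 1.0)
    return x[:2] / x[2]


# Hypothetical setup: two headsets with identical intrinsics, both looking
# along +z, with headset B standing 0.5 m to the right of headset A.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P_a = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_b = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# A made-up wrist keypoint of a third user, 2 m in front of headset A.
wrist = np.array([0.2, -0.1, 2.0])
observations = [project(P_a, wrist), project(P_b, wrist)]
estimate = triangulate_point([P_a, P_b], observations)
```

With noise-free observations the estimate recovers the keypoint up to floating-point error. With real detections, the same least-squares formulation accepts more than two views, which matches the "two or more people" condition in the abstract.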

Supplementary Material

MP4 File (a16-ahuja-supplement.mp4)

Published In

SUI '20: Proceedings of the 2020 ACM Symposium on Spatial User Interaction
October 2020
188 pages
ISBN: 9781450379434
DOI: 10.1145/3385959
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Augmented Reality
  2. Body Pose
  3. Facial Expression
  4. Hand Gestures
  5. Mixed Reality
  6. Motion Capture
  7. Virtual Reality

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SUI '20: Symposium on Spatial User Interaction
October 30 - November 1, 2020
Virtual Event, Canada

Acceptance Rates

Overall Acceptance Rate 86 of 279 submissions, 31%

Article Metrics

  • Downloads (Last 12 months): 215
  • Downloads (Last 6 weeks): 15
Reflects downloads up to 23 Dec 2024

Citations

Cited By

  • (2024) XDTK: A Cross-Device Toolkit for Input & Interaction in XR. 2024 IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (VRW) (467-470). DOI: 10.1109/VRW62533.2024.00092. Online publication date: 16-Mar-2024
  • (2023) V-Light: Leveraging Edge Computing For The Design of Mobile Augmented Reality Games. Proceedings of the 18th International Conference on the Foundations of Digital Games (1-10). DOI: 10.1145/3582437.3582456. Online publication date: 12-Apr-2023
  • (2022) Scalable Extended Reality: A Future Research Agenda. Big Data and Cognitive Computing 6:1 (12). DOI: 10.3390/bdcc6010012. Online publication date: 26-Jan-2022
  • (2022) Toward practical and high-fidelity user digitization in extended reality environments. XRDS: Crossroads, The ACM Magazine for Students 29:1 (5-7). DOI: 10.1145/3558185. Online publication date: 6-Oct-2022
  • (2022) SEAR: Scaling Experiences in Multi-user Augmented Reality. IEEE Transactions on Visualization and Computer Graphics 28:5 (1982-1992). DOI: 10.1109/TVCG.2022.3150467. Online publication date: May-2022
  • (2021) Portable 3D Human Pose Estimation for Human-Human Interaction using a Chest-Mounted Fisheye Camera. Proceedings of the Augmented Humans International Conference 2021 (116-120). DOI: 10.1145/3458709.3458986. Online publication date: 22-Feb-2021
  • (2021) Pose-on-the-Go: Approximating User Pose with Smartphone Sensor Fusion and Inverse Kinematics. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (1-12). DOI: 10.1145/3411764.3445582. Online publication date: 6-May-2021
