Abstract
In this paper, we propose a video conferencing system that presents the correct gaze direction of a remote user by switching among images obtained from multiple cameras embedded in a screen, according to the local user’s position. The proposed method reproduces a situation in which the remote user appears to share the same space as the local user. The position at which the remote user is displayed on the screen is determined so that the positional relationship between the two users is reproduced. The system selects the embedded camera whose viewing direction toward the remote user is closest to the local user’s viewing direction toward the remote user’s image on the screen. A quantitative evaluation confirmed that, compared with using a single camera, switching among the cameras according to the local user’s position improved the accuracy of gaze estimation.
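The camera-selection criterion stated in the abstract can be sketched as a simple nearest-direction search. The following is a minimal illustration, not the authors' implementation: it assumes all positions are expressed in one shared coordinate frame with the screen in the plane z = 0, the remote user at z > 0, and the local user's position mapped to z < 0 (i.e., to the virtual viewpoint behind the remote screen); the function and parameter names are hypothetical.

```python
import numpy as np

def unit(v):
    """Return v scaled to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

def select_camera(camera_positions, remote_user_pos, local_user_pos, image_pos):
    """Return the index of the embedded camera whose viewing direction
    toward the remote user is closest to the local user's viewing
    direction toward the remote user's image on the screen.

    Closeness is measured by the dot product of the two unit direction
    vectors (equivalently, the smallest angle between them)."""
    # Local user's line of sight: from the (mapped) local viewpoint
    # through the displayed image on the screen.
    view_dir = unit(np.asarray(image_pos, dtype=float)
                    - np.asarray(local_user_pos, dtype=float))
    best_index, best_dot = 0, -np.inf
    for i, cam in enumerate(camera_positions):
        # Direction from this embedded camera to the remote user.
        cam_dir = unit(np.asarray(remote_user_pos, dtype=float)
                       - np.asarray(cam, dtype=float))
        d = float(np.dot(cam_dir, view_dir))
        if d > best_dot:
            best_index, best_dot = i, d
    return best_index
```

For example, with three cameras embedded along the screen at x = -0.5, 0, 0.5, a local user directly in front of the image is served by the center camera, while a user viewing the image from the left is served by the left camera.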
Change history
20 August 2021
A Correction to this paper has been published: https://doi.org/10.1007/s11042-021-11428-4
Acknowledgements
This research is partially supported by the Center of Innovation Program from Japan Science and Technology Agency.
The original online version of this article was revised: the affiliation details for Kazuki Kobayashi and Takashi Komuro were incorrect.
Cite this article
Kobayashi, K., Komuro, T., Kagawa, K. et al. Transmission of correct gaze direction in video conferencing using screen-embedded cameras. Multimed Tools Appl 80, 31509–31526 (2021). https://doi.org/10.1007/s11042-020-09758-w