research-article

Look at Me! Correcting Eye Gaze in Live Video Communication

Authors:

Chin-Laung Lei,

Kuan-Ta Chen

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 15, Issue 2

Article No.: 38, Pages 1 - 21

https://doi.org/10.1145/3311784

Published: 05 June 2019

Abstract

Although live video communication is widely used, it is generally less engaging than face-to-face communication because of limitations on social, emotional, and haptic feedback. Missing eye contact is one such problem, caused by the physical deviation between the screen and the camera on a device. Manipulating video frames to correct eye gaze is a solution to this problem. In this article, we introduce a system to rotate the eyeball of a local participant before the video frame is sent to the remote side. It adopts a warping-based convolutional neural network to relocate pixels in eye regions. To improve visual quality, we minimize the L2 distance between the ground-truth and warped eyes. We also present several newly designed loss functions to help network training. These new loss functions are designed to preserve the shape of eye structures and minimize color changes around the periphery of eye regions. To evaluate the presented network and loss functions, we objectively and subjectively compared results generated by our system with those of the state-of-the-art method, DeepWarp, on two datasets. The experimental results demonstrated the effectiveness of our system. In addition, we showed that our system can perform eye-gaze correction in real time on a consumer-level laptop. Because of the quality and efficiency of the system, gaze correction by postprocessing through this system is a feasible solution to the problem of missing eye contact in video communication.
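The abstract describes two ingredients: relocating pixels in the eye region by warping, and an L2 distance between the warped and ground-truth eyes. The following is a minimal NumPy sketch of those two ideas only, not the authors' network. In the paper a convolutional network predicts the warping field; here the field is supplied by hand. The names `warp_bilinear` and `l2_loss`, the (dx, dy) flow layout, the border clamping, and the per-pixel averaging are all assumptions made for illustration.

```python
import numpy as np


def warp_bilinear(image, flow):
    """Resample `image` (H, W, C, float) at positions shifted by `flow` (H, W, 2).

    Assumed layout: flow[..., 0] is the horizontal offset dx and
    flow[..., 1] the vertical offset dy, both in pixels, so that
    output[y, x] is the image sampled at (y + dy, x + dx).
    """
    h, w = image.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Source coordinates, clamped to the image border (an assumption).
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional parts act as bilinear interpolation weights.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]

    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom


def l2_loss(warped, target):
    """Mean over pixels of the squared L2 distance between color vectors."""
    return float(np.mean(np.sum((warped - target) ** 2, axis=-1)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    eye_patch = rng.random((32, 48, 3))      # stand-in for a cropped eye region
    flow = np.zeros((32, 48, 2))
    flow[..., 1] = -1.5                      # sample 1.5 pixels above each output pixel
    warped = warp_bilinear(eye_patch, flow)
    print("loss against the unwarped patch:", l2_loss(warped, eye_patch))
```

Because bilinear sampling is differentiable with respect to the flow, a loss of this kind can in principle be backpropagated to whatever model predicts the field; the sketch leaves that model out.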

Index Terms

Look at Me! Correcting Eye Gaze in Live Video Communication
1. Computing methodologies

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 15, Issue 2

May 2019

375 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3339884

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 June 2019

Accepted: 01 February 2019

Revised: 01 January 2019

Received: 01 September 2018

Published in TOMM Volume 15, Issue 2

Qualifiers

Research-article
Research
Refereed

Cited By

Liu F, Li K, Zhong Z, Jia W, Hu B, Yang X, Wang M, Guo D. (2024). Depth Matters: Spatial Proximity-Based Gaze Cone Generation for Gaze Following in Wild. ACM Transactions on Multimedia Computing, Communications, and Applications 20:11 (1-24). DOI: 10.1145/3689643. Online publication date: 26-Aug-2024.
https://dl.acm.org/doi/10.1145/3689643
Schuessler M, Hormann L, Dachselt R, Blake A, Rother C. (2024). Gazing Heads: Investigating Gaze Perception in Video-Mediated Communication. ACM Transactions on Computer-Human Interaction 31:3 (1-31). DOI: 10.1145/3660343. Online publication date: 11-Jun-2024.
https://dl.acm.org/doi/10.1145/3660343
Bisogni C, Nappi M, Tortora G, Del Bimbo A. (2024). Gaze analysis. Image and Vision Computing 144:C. DOI: 10.1016/j.imavis.2024.104961. Online publication date: 1-Apr-2024.
https://dl.acm.org/doi/10.1016/j.imavis.2024.104961
Xian T, Du P, Liao C. (2023). Theory and Data-Driven Competence Evaluation with Multimodal Machine Learning—A Chinese Competence Evaluation Multimodal Dataset. Applied Sciences 13:13 (7761). DOI: 10.3390/app13137761. Online publication date: 30-Jun-2023.
https://doi.org/10.3390/app13137761
Simmons D, Gravina N, Sleiman A, Kronfli F. (2023). Using Web-Based Behavioral Skills Training to Teach Online Interview Skills to College Students. Journal of Organizational Behavior Management 44:2 (88-112). DOI: 10.1080/01608061.2023.2219466. Online publication date: Jun-2023.
https://doi.org/10.1080/01608061.2023.2219466
Izumi K, Suzuki S, Niwa R, Shinoda A, Iijima R, Hyakuta R, Ochiai Y. (2023). A Preliminary Study on Eye Contact Framework Toward Improving Gaze Awareness in Video Conferences. Human-Computer Interaction (484-498). DOI: 10.1007/978-3-031-35596-7_31. Online publication date: 23-Jul-2023.
https://dl.acm.org/doi/10.1007/978-3-031-35596-7_31
Han D, Heshmat Y, Geiskkovitch D, Tan Z, Neustaedter C. (2022). A Scenario-Based Study of Doctors and Patients on Video Conferencing Appointments from Home. ACM Transactions on Computer-Human Interaction 29:5 (1-35). DOI: 10.1145/3514234. Online publication date: 20-Oct-2022.
https://dl.acm.org/doi/10.1145/3514234
Zhang Y, Yang J, Liu Z, Wang R, Chen G, Tong X, Guo B. (2022). VirtualCube: An Immersive 3D Video Communication System. IEEE Transactions on Visualization and Computer Graphics 28:5 (2146-2156). DOI: 10.1109/TVCG.2022.3150512. Online publication date: May-2022.
https://doi.org/10.1109/TVCG.2022.3150512
Helou S, El Helou E, Evans N, Shigematsu T, El Helou J, Kaneko M, Kiyono K. (2022). Physician eye contact in telemedicine video consultations: A cross-cultural experiment. International Journal of Medical Informatics 165 (104825). DOI: 10.1016/j.ijmedinf.2022.104825. Online publication date: Sep-2022.
https://doi.org/10.1016/j.ijmedinf.2022.104825
Guo Y, Zhang J, Chen Y, Cai H, Huang Z, Deng B. (2021). Real-time face view correction for front-facing cameras. Computational Visual Media 7:4 (437-452). DOI: 10.1007/s41095-021-0215-y. Online publication date: 27-Apr-2021.
https://doi.org/10.1007/s41095-021-0215-y
