Abstract
Across photography, marketing, and website design, the ability to direct the viewer's attention is a powerful tool. Motivated by professional workflows, we introduce an automatic method for making an image region more attention-capturing via subtle image edits that maintain realism and fidelity to the original. From an input image and a user-provided mask, our GazeShiftNet model predicts a distinct set of global parametric transformations to be applied separately to the foreground and background image regions. Quantitative and qualitative experiments demonstrate improvements over the prior state of the art. In contrast to existing attention-shifting algorithms, our global parametric approach better preserves image semantics and avoids typical generative artifacts. Because the edits are parametric, inference runs at interactive rates on images of any size, and the method generalizes easily to videos. Extensions of our model allow for multi-style edits and the ability to both increase and attenuate attention in an image region. Furthermore, users can customize the edited images by dialing the edits up or down via interpolations in parameter space. This paper presents a practical tool that can simplify future image editing pipelines.
Work done while Youssef was interning at Adobe Research.
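To make the pipeline sketched in the abstract concrete, below is a minimal, hypothetical illustration of applying global parametric transformations separately to the foreground and background selected by a soft mask, with a strength knob that interpolates the parameters toward an identity edit. The operators (exposure, saturation, contrast) and all function names are illustrative assumptions, not the actual GazeShiftNet parameter set or implementation.

```python
import numpy as np

def apply_params(img, exposure=0.0, saturation=1.0, contrast=1.0):
    """Apply simple global parametric edits to an RGB image in [0, 1].

    These three operators are illustrative assumptions; the paper's actual
    transformation set is not specified in the abstract.
    """
    out = np.clip(img * (2.0 ** exposure), 0.0, 1.0)           # exposure shift
    gray = out.mean(axis=-1, keepdims=True)
    out = np.clip(gray + saturation * (out - gray), 0.0, 1.0)  # saturation scale
    return np.clip(0.5 + contrast * (out - 0.5), 0.0, 1.0)     # contrast scale

def edit_image(img, mask, fg_params, bg_params, strength=1.0):
    """Blend separately edited foreground/background regions using a soft mask.

    `strength` linearly interpolates each parameter set toward the identity
    edit, mimicking "dialing the edits up or down" in parameter space.
    """
    identity = {"exposure": 0.0, "saturation": 1.0, "contrast": 1.0}
    lerp = lambda p: {k: (1 - strength) * identity[k] + strength * v
                      for k, v in p.items()}
    fg = apply_params(img, **lerp(fg_params))
    bg = apply_params(img, **lerp(bg_params))
    m = mask[..., None]                 # (H, W) soft mask -> (H, W, 1)
    return m * fg + (1 - m) * bg

# Example: brighten/saturate the masked region, mute the background slightly.
img = np.random.rand(256, 256, 3)
mask = np.zeros((256, 256)); mask[64:192, 64:192] = 1.0
edited = edit_image(img, mask,
                    fg_params={"exposure": 0.4, "saturation": 1.3, "contrast": 1.1},
                    bg_params={"exposure": -0.2, "saturation": 0.8, "contrast": 0.95},
                    strength=0.75)
```

In the actual model the foreground and background parameter sets are predicted by a network from the image and mask; the point of the sketch is only that the edit itself is a cheap, resolution-independent blend of two globally transformed copies of the input.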
Notes
1. Because the provided code could not reproduce the high-quality results presented in their paper, for a favorable comparison we used images directly from their project page: https://webee.technion.ac.il/labs/cgm/Computer-Graphics-Multimedia/Software/saliencyManipulation/.
2. Relative saliency increases can grow large when the corresponding instance has an average initial saliency value near zero (illustrated in the sketch below).
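As a concrete illustration of this caveat, here is a minimal sketch, assuming the relative increase is computed as the saliency gain divided by the initial saliency (the note does not spell out the exact formula):

```python
def relative_saliency_increase(before, after, eps=1e-8):
    """Relative increase = (after - before) / before; blows up as `before` -> 0."""
    return (after - before) / max(before, eps)

# The same absolute gain of 0.15 reads very differently depending on the baseline:
print(relative_saliency_increase(0.30, 0.45))    # 0.5   -> a +50% increase
print(relative_saliency_increase(0.001, 0.151))  # 150.0 -> a +15000% increase
```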
Acknowledgements
Y. A. Mejjati thanks the Marie Sklodowska-Curie grant No. 665992 and the Centre for Doctoral Training in Digital Entertainment (CDE), EP/L016540/1. K. I. Kim thanks the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant (No. 20200013360011001, Artificial Intelligence Graduate School support (UNIST)) funded by the Korea government (MSIT).