DOI: 10.1145/3640543.3645156
research-article
Open access

Sample, Nudge and Rank: Exploiting Interpretable GAN Controls for Exploratory Search

Published: 05 April 2024

Abstract

Exploratory search is characterized by open-ended search tasks and uncertainty with respect to the clarity of users’ information needs. In the context of image retrieval, generative adversarial networks (GANs) present numerous opportunities for satisfying the information needs of users engaged in exploratory search compared to a collection of images. In this article, we present a novel approach for performing exploratory search on a GAN’s image space using interpretable GAN controls that can be summarized as sample, nudge, and rank. At each search iteration, we sample images from the GAN’s latent space. We implement faceted search by nudging the sampled images towards regions of the latent space containing the attributes associated with selected facets. Lastly, we rank the nudged images using reinforcement learning with relevance feedback. We present a comprehensive evaluation of the proposed approach, incorporating results from simulations and a user study. In simulation, we show that our approach efficiently adapts to user preferences, while preserving a high level of image diversity. In the user study (N=30), a majority of participants (23/30) preferred our system to the baseline. Concordant with simulation results, users reported both higher perceived search efficiency and image diversity compared to the baseline. Indeed, due to the baseline system’s dependence on a warm-start procedure, users of our system examined significantly fewer images while achieving task outcomes of similar subjective quality.
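
The sample, nudge, and rank loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model, so everything here is an assumption made for illustration. The sketch works on latent vectors only (no GAN generator is called), treats a facet as a single hypothetical direction vector in latent space, and ranks with linear Thompson sampling (Bayesian linear regression on the latent vector, with a made-up reward of 1.0 for a liked image). The names sample, nudge, LinearThompson, facet, and strength are all hypothetical.

```python
import numpy as np


def sample(rng, n, dim):
    """Sample step: draw n latent vectors from a standard normal prior."""
    return rng.standard_normal((n, dim))


def nudge(z, directions, strength=2.0):
    """Nudge step: shift every latent vector along the selected facet directions.

    `directions` is a list of latent-space vectors, one per selected facet
    (assumed here; the abstract does not say how they are obtained).
    """
    if len(directions) == 0:
        return z
    return z + strength * np.sum(directions, axis=0)


class LinearThompson:
    """Rank step: Bayesian linear regression with Thompson sampling (an assumption)."""

    def __init__(self, dim, noise=1.0):
        self.precision = np.eye(dim)  # posterior precision, starts at the identity prior
        self.b = np.zeros(dim)        # accumulated reward-weighted features
        self.noise = noise

    def update(self, x, reward):
        """Fold one piece of relevance feedback into the posterior."""
        self.precision += np.outer(x, x) / self.noise
        self.b += reward * x / self.noise

    def rank(self, rng, candidates):
        """Sample one weight vector from the posterior and sort candidates by score."""
        cov = np.linalg.inv(self.precision)
        cov = (cov + cov.T) / 2.0     # symmetrize against round-off
        mean = cov @ self.b
        w = rng.multivariate_normal(mean, cov)
        return np.argsort(-(candidates @ w))


# One search iteration with a hypothetical facet along the first latent axis.
rng = np.random.default_rng(0)
dim = 8
facet = np.zeros(dim)
facet[0] = 1.0
model = LinearThompson(dim)

z = nudge(sample(rng, 16, dim), [facet])   # sample, then nudge
order = model.rank(rng, z)                 # rank the nudged candidates
model.update(z[order[0]], reward=1.0)      # pretend the user liked the top image
```

In a real system each nudged latent vector would be passed through the GAN generator to produce the image shown to the user; that step is omitted here, as is any choice of image features for the ranking model.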

    Information

    Published In

    IUI '24: Proceedings of the 29th International Conference on Intelligent User Interfaces
    March 2024
    955 pages
    ISBN: 9798400705083
    DOI: 10.1145/3640543
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. GANs
    2. Thompson sampling
    3. contextual bandits
    4. exploratory search
    5. image retrieval

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    IUI '24
    Acceptance Rates

    Overall Acceptance Rate 746 of 2,811 submissions, 27%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 246
    • Downloads (Last 12 months): 246
    • Downloads (Last 6 weeks): 48
    Reflects downloads up to 11 Dec 2024
