
research-article

Open access

The Infinite Index: Information Retrieval on Generative Text-To-Image Models

Authors:

Niklas Deckers,

Johannes Kiesel,

Gianluca Pandolfo,

Christopher Schröder,

Martin Potthast

CHIIR '23: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval

Pages 172 - 186

https://doi.org/10.1145/3576840.3578327

Published: 20 March 2023


Abstract

Conditional generative models such as DALL-E and Stable Diffusion generate images based on a user-defined text, the prompt. Finding and refining prompts that produce a desired image has become the art of prompt engineering. Generative models do not provide a built-in retrieval model for a user’s information need expressed through prompts. In light of an extensive literature review, we reframe prompt engineering for generative models as interactive text-based retrieval on a novel kind of “infinite index”. We apply these insights for the first time in a case study on image generation for game design with an expert. Finally, we envision how active learning may help to guide the retrieval of generated images.

Supplementary Material

PDF File (case-study-report-frame.pdf)

Case Study Report

156.99 MB

References

[1]

Daria Alexander, Wojciech Kusa, and Arjen P. de Vries. 2022. ORCAS-I: Queries Annotated with Intent using Weak Supervision. In SIGIR ’22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, Enrique Amigó, Pablo Castells, Julio Gonzalo, Ben Carterette, J. Shane Culpepper, and Gabriella Kazai (Eds.). ACM, 3057–3066. https://doi.org/10.1145/3477495.3531737

Digital Library

[2]

James D. Anderson. 1997. Guidelines for Indexesand RelatedInformationRetrieval Devices.

[3]

Javed A. Aslam and Emine Yilmaz. 2007. Inferring document relevance from incomplete information. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007, Mário J. Silva, Alberto H. F. Laender, Ricardo A. Baeza-Yates, Deborah L. McGuinness, Bjørn Olstad, Øystein Haug Olsen, and André O. Falcão (Eds.). ACM, 633–642.

[4]

Yogesh Balaji, Seungjun Nah, Xun Huang, Arash Vahdat, Jiaming Song, Karsten Kreis, Miika Aittala, Timo Aila, Samuli Laine, Bryan Catanzaro, Tero Karras, and Ming-Yu Liu. 2022. eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers. CoRR abs/2211.01324(2022). https://doi.org/10.48550/arXiv.2211.01324 arXiv:2211.01324

[5]

Yaniv Bernstein and Justin Zobel. 2005. Redundant documents and search effectiveness. In Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31 - November 5, 2005, Otthein Herzog, Hans-Jörg Schek, Norbert Fuhr, Abdur Chowdhury, and Wilfried Teiken (Eds.). ACM, 736–743. https://doi.org/10.1145/1099554.1099733

Digital Library

[6]

Michele Bevilacqua, Giuseppe Ottaviano, Patrick Lewis, Wen-tau Yih, Sebastian Riedel, and Fabio Petroni. 2022. Autoregressive Search Engines: Generating Substrings as Document Identifiers. CoRR abs/2204.10628(2022). https://doi.org/10.48550/arXiv.2204.10628 arXiv:2204.10628

[7]

Sumit Bhatia, Debapriyo Majumdar, and Prasenjit Mitra. 2011. Query suggestions in the absence of query logs. In Proceeding of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2011, Beijing, China, July 25-29, 2011, Wei-Ying Ma, Jian-Yun Nie, Ricardo Baeza-Yates, Tat-Seng Chua, and W. Bruce Croft (Eds.). ACM, 795–804. https://doi.org/10.1145/2009916.2010023

Digital Library

[8]

Jorge Luis Borges. 1939. La Bibliotheca Total (The Total Library). Buenos Aires. https://www.gwern.net/docs/borges/1939-borges-thetotallibrary.pdf

[9]

Jorge Luis Borges. 1941. La Bibliotheca de Babel (The Library of Babel). https://maskofreason.files.wordpress.com/2011/02/the-library-of-babel-by-jorge-luis-borges.pdf

[10]

Andrew Brock, Jeff Donahue, and Karen Simonyan. 2019. Large Scale GAN Training for High Fidelity Natural Image Synthesis. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net. https://openreview.net/forum?id=B1xsqj09Fm

[11]

Andrei Z. Broder. 2002. A taxonomy of web search. SIGIR Forum 36, 2 (2002), 3–10. https://doi.org/10.1145/792550.792552

Digital Library

[12]

Fei Cai and Maarten de Rijke. 2016. A Survey of Query Auto Completion in Information Retrieval. Found. Trends Inf. Retr. 10, 4 (2016), 273–363. https://doi.org/10.1561/1500000055

Digital Library

[13]

Huiwen Chang, Han Zhang, Jarred Barber, AJ Maschinot, Jose Lezama, Lu Jiang, Ming-Hsuan Yang, Kevin Murphy, William T Freeman, Michael Rubinstein, 2023. Muse: Text-To-Image Generation via Masked Generative Transformers. arXiv preprint arXiv:2301.00704(2023).

[14]

Catherine Chavula, Yujin Choi, and Soo Young Rieh. 2022. Understanding Creative Thinking Processes in Searching for New Ideas. In ACM SIGIR Conference on Human Information Interaction and Retrieval (Regensburg, Germany). ACM, New York, NY, USA, 321–326. https://doi.org/10.1145/3498366.3505783

Digital Library

[15]

Hyerim Cho, Minh TN Pham, Katherine N. Leonard, and Alex C. Urban. 2021. A systematic literature review on image information needs and behaviors. Journal of Documentation 78, 2 (2021), 207–227.

[16]

Youngok Choi. 2013. Analysis of image search queries on the web: Query modification patterns and semantic attributes. Journal of the American Society for Information Science and Technology 64, 7 (2013), 1423–1441.

[17]

Cyril W. Cleverdon. 1967. The Cranfield tests on index language devices. In Aslib proceedings. MCB UP Ltd. (Reprinted in Readings in Information Retrieval, Karen Sparck-Jones and Peter Willett, editors, Morgan Kaufmann, 1997), 173–192.

[18]

Cyril W. Cleverdon. 1991. The Significance of the Cranfield Tests on Index Languages. In Proceedings of the 14th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. Chicago, Illinois, USA, October 13-16, 1991 (Special Issue of the SIGIR Forum), Abraham Bookstein, Yves Chiaramella, Gerard Salton, and Vijay V. Raghavan (Eds.). ACM, 3–12.

Digital Library

[19]

Silviu Cucerzan and Eric Brill. 2004. Spelling Correction as an Iterative Process that Exploits the CollectiveKnowledge of Web Users. In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, EMNLP 2004, A meeting of SIGDAT, a Special Interest Group of the ACL, held in conjunction with ACL 2004, 25-26 July 2004, Barcelona, Spain. ACL, 293–300. https://aclanthology.org/W04-3238/

[20]

Van Dang and W. Bruce Croft. 2010. Query reformulation using anchor text. In Proceedings of the Third International Conference on Web Search and Web Data Mining, WSDM 2010, New York, NY, USA, February 4-6, 2010, Brian D. Davison, Torsten Suel, Nick Craswell, and Bing Liu (Eds.). ACM, 41–50. https://doi.org/10.1145/1718487.1718493

Digital Library

[21]

Prafulla Dhariwal and Alexander Nichol. 2021. Diffusion models beat gans on image synthesis. Advances in Neural Information Processing Systems 34 (2021), 8780–8794.

[22]

Alan Feuer, Stefan Savev, and Javed A. Aslam. 2007. Evaluation of phrasal query suggestions. In Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, CIKM 2007, Lisbon, Portugal, November 6-10, 2007, Mário J. Silva, Alberto H. F. Laender, Ricardo A. Baeza-Yates, Deborah L. McGuinness, Bjørn Olstad, Øystein Haug Olsen, and André O. Falcão (Eds.). ACM, 841–848. https://doi.org/10.1145/1321440.1321556

Digital Library

[23]

Maik Fröbe, Janek Bevendorff, Jan Heinrich Reimer, Martin Potthast, and Matthias Hagen. 2020. Sampling Bias Due to Near-Duplicates in Learning to Rank. In 43rd International ACM Conference on Research and Development in Information Retrieval (SIGIR 2020). ACM, 1997–2000. https://doi.org/10.1145/3397271.3401212

Digital Library

[24]

Maik Fröbe, Jan Philipp Bittner, Martin Potthast, and Matthias Hagen. 2020. The Effect of Content-Equivalent Near-Duplicates on the Evaluation of Search Engines. In Advances in Information Retrieval. 42nd European Conference on IR Research (ECIR 2020)(Lecture Notes in Computer Science, Vol. 12036), Joemon M. Jose, Emine Yilmaz, João Magalhães, Pablo Castells, Nicola Ferro, Mário J. Silva, and Flávio Martins (Eds.). Springer, Berlin Heidelberg New York, 12–19. https://doi.org/10.1007/978-3-030-45442-5_2

Digital Library

[25]

Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. 2022. An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion. CoRR abs/2208.01618(2022). https://doi.org/10.48550/arXiv.2208.01618 arXiv:2208.01618

[26]

Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, and Daniel Cohen-Or. 2022. Imagen Video: High Definition Video Generation with Diffusion Models. CoRR abs/2208.01618(2022). https://doi.org/10.48550/arXiv.2208.01618 arXiv:2208.01618

[27]

Gregory Gay, Sonia Haiduc, Andrian Marcus, and Tim Menzies. 2009. On the use of relevance feedback in IR-based concept location. In 25th IEEE International Conference on Software Maintenance (ICSM’09). IEEE Computer Society, 351–360. https://doi.org/10.1109/ICSM.2009.5306315

[28]

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative adversarial networks. Commun. ACM 63, 11 (2020), 139–144.

Digital Library

[29]

William S. Hemmig. 2008. The information-seeking behavior of visual artists: a literature review. Journal of Documentation 64, 3 (2008), 343–362. https://doi.org/10.1108/00220410810867579

[30]

Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey A. Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, and Tim Salimans. 2022. Imagen Video: High Definition Video Generation with Diffusion Models. CoRR abs/2210.02303(2022). https://doi.org/10.48550/arXiv.2210.02303 arXiv:2210.02303

[31]

Vera Hollink, Theodora Tsikrika, and Arjen P. de Vries. 2011. Semantic search log analysis: A method and a study on professional image search. J. Assoc. Inf. Sci. Technol. 62 (2011), 691–713.

Digital Library

[32]

Chien-Kang Huang, Lee-Feng Chien, and Yen-Jen Oyang. 2003. Relevant term suggestion in interactive web search based on contextual information in query session logs. J. Assoc. Inf. Sci. Technol. 54, 7 (2003), 638–649. https://doi.org/10.1002/asi.10256

Digital Library

[33]

Sethurathienam Iyer, Shubham Chaturvedi, and Tirtharaj Dash. 2017. Image Captioning-Based Image Search Engine: An Alternative to Retrieval by Metadata. In Soft Computing for Problem Solving (SocProS’17)(Advances in Intelligent Systems and Computing, Vol. 817), Jagdish Chand Bansal, Kedar Nath Das, Atulya Nagar, Kusum Deep, and Akshay Kumar Ojha (Eds.). Springer, 181–191. https://doi.org/10.1007/978-981-13-1595-4_14

[34]

Nasreen Abdul Jaleel, James Allan, W. Bruce Croft, Fernando Diaz, Leah S. Larkey, Xiaoyan Li, Mark D. Smucker, and Courtney Wade. 2004. UMass at TREC 2004: Novelty and HARD. In Proceedings of the Thirteenth Text REtrieval Conference, TREC 2004, Gaithersburg, Maryland, USA, November 16-19, 2004(NIST Special Publication, Vol. 500-261), Ellen M. Voorhees and Lori P. Buckland (Eds.). National Institute of Standards and Technology (NIST). http://trec.nist.gov/pubs/trec13/papers/umass.novelty.hard.pdf

[35]

Bernard Jansen, D. Booth, and A. Spink. 2009. Patterns of Query Reformulation During Web Searching. J. Assoc. Inf. Sci. Technol. 60, 7 (2009), 1358–1371. https://doi.org/10.1002/asi.21071

[36]

Bernard Jansen, Amanda Spink, and Sherry Koshman. 2007. Web searcher interaction with the Dogpile.com metasearch engine. J. Assoc. Inf. Sci. Technol. 58, 5 (2007), 744–755. https://doi.org/10.1002/asi.20555

[37]

Bernard Jansen, Amanda Spink, and Jan Pedersen. 2005. A temporal comparison of AltaVista Web searching. J. Assoc. Inf. Sci. Technol. 56, 6 (2005), 559–570. https://doi.org/10.1002/asi.20145

[38]

Kalervo Järvelin and Jaana Kekäläinen. 2002. Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20, 4 (2002), 422–446.

Digital Library

[39]

Thorsten Joachims and Filip Radlinski. 2007. Search Engines that Learn from Implicit Feedback. Computer 40, 8 (2007), 34–40. https://doi.org/10.1109/MC.2007.289

Digital Library

[40]

Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, and Michal Irani. 2012. Imagic: Text-Based Real Image Editing with Diffusion Models. CoRR abs/2210.09276(2012). arXiv:2210.09276http://arxiv.org/abs/2210.09276

[41]

Aditya Khosla, Tinghui Zhou, Tomasz Malisiewicz, Alexei A. Efros, and Antonio Torralba. 2012. Undoing the Damage of Dataset Bias. In Computer Vision - ECCV 2012 - 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part I(Lecture Notes in Computer Science, Vol. 7572), Andrew W. Fitzgibbon, Svetlana Lazebnik, Pietro Perona, Yoichi Sato, and Cordelia Schmid (Eds.). Springer, 158–171. https://doi.org/10.1007/978-3-642-33718-5_12

Digital Library

[42]

Hannah Rose Kirk, Yennie Jun, Filippo Volpin, Haider Iqbal, Elias Benussi, Frederic Dreyer, Aleksandar Shtedritski, and Yuki Asano. 2021. Bias Out-of-the-Box: An Empirical Analysis of Intersectional Occupational Biases in Popular Generative Language Models. In Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P.S. Liang, and J. Wortman Vaughan (Eds.). Vol. 34. Curran Associates, Inc., 2611–2624. https://proceedings.neurips.cc/paper/2021/file/1531beb762df4029513ebf9295e0d34f-Paper.pdf

[43]

Sarah Kreps, R. Miles McCain, and Miles Brundage. 2022. All the News That’s Fit to Fabricate: AI-Generated Text as a Tool of Media Misinformation. Journal of Experimental Political Science 9, 1 (2022), 104–117. https://doi.org/10.1017/XPS.2020.37

[44]

Carol Collier Kuhlthau. 1993. A Principle of Uncertainty for Information seeking. J. Documentation 49, 4 (1993), 339–355. https://doi.org/10.1108/eb026918

[45]

Kurd Laßwitz. 1897. Bis zum Nullpunkt des Seins und andere Science-Fiction-Erzählungen (Kapitel 10: Die Universalbibliothek. Schlesische Zeitung; Neuauflage auf Projekt Gutenberg 2017. https://www.projekt-gutenberg.org/lasswitz/nullpunk/titlepage.html Erschienen zwischen 1871 und 1908.

[46]

Jooyoung Lee, Thai Le, Jinghui Chen, and Dongwon Lee. 2022. Do Language Models Plagiarize?CoRR abs/2203.07618(2022). https://doi.org/10.48550/arXiv.2203.07618 arXiv:2203.07618

[47]

Lo Lee, Melissa G. Ocepek, Stephann Makri, George Buchanan, and Dana McKay. 2019. Getting creative in everyday life: Investigating arts and crafts hobbyists’ information behavior. Proceedings of the Association for Information Science and Technology 56, 1 (2019), 703–705. https://doi.org/10.1002/pra2.141 arXiv:https://asistdl.onlinelibrary.wiley.com/doi/pdf/10.1002/pra2.141

[48]

Richard Lemarchand. 2021. A Playful Production Process: For Game Designers (and Everyone). MIT Press, Cambridge, MA.

[49]

David D. Lewis and William A. Gale. 1994. A Sequential Algorithm for Training Text Classifiers. In Proceedings of the 17th Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, W. Bruce Croftand C. J. van Rijsbergen (Eds.). Springer, ACM/Springer, 3–12. https://doi.org/10.1007/978-1-4471-2099-5_1

[50]

Xiaoqing Li, Jiansheng Yang, and Jinwen Ma. 2021. Recent developments of content-based image retrieval (CBIR). Neurocomputing 452(2021), 675–689.

[51]

Yuan Li, Yinglong Zhang, and Robert Capra. 2022. Analyzing Information Resources That Support the Creative Process. In Proceedings of the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR’22)(Regensburg, Germany). ACM, 180–190. https://doi.org/10.1145/3498366.3505817

Digital Library

[52]

Wen-Cheng Lin, Yih-Chen Chang, and Hsin-Hsi Chen. 2004. From Text to Image: Generating Visual Query for Image Retrieval. In Multilingual Information Access for Text, Speech and Images, 5th Workshop of the Cross-Language Evaluation Forum, CLEF 2004, Bath, UK, September 15-17, 2004, Revised Selected Papers(Lecture Notes in Computer Science, Vol. 3491), Carol Peters, Paul D. Clough, Julio Gonzalo, Gareth J. F. Jones, Michael Kluck, and Bernardo Magnini (Eds.). Springer, 664–675. https://doi.org/10.1007/11519645_65

Digital Library

[53]

Vivian Liu and Lydia B. Chilton. 2022. Design Guidelines for Prompt Engineering Text-to-Image Generative Models. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans, LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA, Article 384, 23 pages. https://doi.org/10.1145/3491102.3501825

Digital Library

[54]

Vivian Liu, Han Qiao, and Lydia Chilton. 2022. Opal: Multimodal Image Generation for News Illustration. arXiv preprint arXiv:2204.09007(2022).

[55]

Yang Liu, Alan Medlar, and Dorota Glowacka. 2022. ROGUE: A System for Exploratory Search of GANs. In SIGIR ’22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11 - 15, 2022, Enrique Amigó, Pablo Castells, Julio Gonzalo, Ben Carterette, J. Shane Culpepper, and Gabriella Kazai (Eds.). ACM, 3278–3282. https://doi.org/10.1145/3477495.3531675

Digital Library

[56]

Xiaolu Lu, Alistair Moffat, and J. Shane Culpepper. 2016. The effect of pooling and evaluation depth on IR metrics. Inf. Retr. J. 19, 4 (2016), 416–445.

Digital Library

[57]

Piotr Mirowski, Kory W. Mathewson, Jaylen Pittman, and Richard Evans. 2022. Co-Writing Screenplays and Theatre Scripts with Language Models: An Evaluation by Industry Professionals. arXiv preprint arXiv:2209.14958(2022).

[58]

Bhaskar Mitra and Nick Craswell. 2018. An Introduction to Neural Information Retrieval. Found. Trends Inf. Retr. 13, 1 (2018), 1–126. https://doi.org/10.1561/1500000061

Digital Library

[59]

Alistair Moffat and Justin Zobel. 2008. Rank-biased precision for measurement of retrieval effectiveness. ACM Trans. Inf. Syst. 27, 1 (2008), 2:1–2:27.

Digital Library

[60]

Tri Nguyen, Mir Rosenberg, Xia Song, Jianfeng Gao, Saurabh Tiwary, Rangan Majumder, and Li Deng. 2016. MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. In Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, December 9, 2016(CEUR Workshop Proceedings, Vol. 1773), Tarek Richard Besold, Antoine Bordes, Artur S. d’Avila Garcez, and Greg Wayne (Eds.). CEUR-WS.org. http://ceur-ws.org/Vol-1773/CoCoNIPS_2016_paper9.pdf

[61]

Rodrigo Frassetto Nogueira, Wei Yang, Jimmy Lin, and Kyunghyun Cho. 2019. Document Expansion by Query Prediction. CoRR abs/1904.08375(2019). arXiv:1904.08375http://arxiv.org/abs/1904.08375

[62]

OpenAI. 2022. DALL·E: Creating Images from Text. https://openai.com/blog/dall-e/.

[63]

Jonas Oppenlaender. 2022. Prompt Engineering for Text-Based Generative Art. arXiv preprint arXiv:2204.13988(2022).

[64]

Srishti Palani, Zijian Ding, Stephen MacNeil, and Steven P. Dow. 2021. The "Active Search" Hypothesis: How Search Strategies Relate to Creative Learning. In Proceedings of the 2021 Conference on Human Information Interaction and Retrieval (Canberra ACT, Australia). ACM, New York, NY, USA, 325–329. https://doi.org/10.1145/3406522.3446046

Digital Library

[65]

Ben Poole, Ajay Jain, Jonathan T. Barron, and Ben Mildenhall. 2022. DreamFusion: Text-to-3D using 2D Diffusion. CoRR abs/2209.14988(2022). https://doi.org/10.48550/arXiv.2209.14988 arXiv:2209.14988

[66]

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event(Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8748–8763. http://proceedings.mlr.press/v139/radford21a.html

[67]

Aditya Ramesh, Prafulla Dhariwal, Alex Nichol, Casey Chu, and Mark Chen. 2022. Hierarchical Text-Conditional Image Generation with CLIP Latents. arXiv preprint arXiv:2204.06125(2022).

[68]

Aditya Ramesh, Mikhail Pavlov, Gabriel Goh, Scott Gray, Chelsea Voss, Alec Radford, Mark Chen, and Ilya Sutskever. 2021. Zero-Shot Text-to-Image Generation. In Proceedings of the 38th International Conference on Machine Learning(Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 8821–8831. https://proceedings.mlr.press/v139/ramesh21a.html

[69]

Ali Razavi, Aaron Van den Oord, and Oriol Vinyals. 2019. Generating diverse high-fidelity images with vq-vae-2. Advances in neural information processing systems 32 (2019).

[70]

Navid Rekabsaz, Oleg Lesota, Markus Schedl, Jon Brassey, and Carsten Eickhoff. 2021. TripClick: The Log Files of a Large Health Web Search Engine. In SIGIR ’21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event, Canada, July 11-15, 2021, Fernando Diaz, Chirag Shah, Torsten Suel, Pablo Castells, Rosie Jones, and Tetsuya Sakai (Eds.). ACM, 2507–2513. https://doi.org/10.1145/3404835.3463242

Digital Library

[71]

Laria Reynolds and Kyle McDonell. 2021. Prompt Programming for Large Language Models: Beyond the Few-Shot Paradigm. In CHI ’21: CHI Conference on Human Factors in Computing Systems, Virtual Event / Yokohama Japan, May 8-13, 2021, Extended Abstracts, Yoshifumi Kitamura, Aaron Quigley, Katherine Isbister, and Takeo Igarashi (Eds.). ACM, 314:1–314:7. https://doi.org/10.1145/3411763.3451760

Digital Library

[72]

Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-resolution image synthesis with latent diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 10684–10695.

[73]

Ian Ruthven. 2008. Interactive information retrieval. Annu. Rev. Inf. Sci. Technol. 42, 1 (2008), 43–91. https://doi.org/10.1002/aris.2008.1440420109

[74]

Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, and Mohammad Norouzi. 2022. Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding. arXiv preprint arXiv:2205.11487(2022).

[75]

Tetsuya Sakai. 2007. Alternatives to Bpref. In SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, July 23-27, 2007, Wessel Kraaij, Arjen P. de Vries, Charles L. A. Clarke, Norbert Fuhr, and Noriko Kando (Eds.). ACM, 71–78.

Digital Library

[76]

Tetsuya Sakai. 2008. Comparing metrics across TREC and NTCIR: The robustness to system bias. In Proceedings of the 17th ACM Conference on Information and Knowledge Management, CIKM 2008, Napa Valley, California, USA, October 26-30, 2008, James G. Shanahan, Sihem Amer-Yahia, Ioana Manolescu, Yi Zhang, David A. Evans, Aleksander Kolcz, Key-Sun Choi, and Abdur Chowdhury (Eds.). ACM, 581–590.

Digital Library

[77]

Rob Salkowitz. 2022. Midjourney Founder David Holz On The Impact Of AI On Art, Imagination And The Creative Economy. Forbes (Sept. 2022). https://www.forbes.com/sites/robsalkowitz/2022/09/16/midjourney-founder-david-holz-on-the-impact-of-ai-on-art-imagination-and-the-creative-economy/

[78]

R. Keith Sawyer. 2012. Explaining creativity: The science of human innovation. Oxford University Press, New York, NY, US.

[79]

Greg Schohn and David Cohn. 2000. Less is More: Active Learning with Support Vector Machines. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29 - July 2, 2000, Pat Langley (Ed.). Morgan Kaufmann, 839–846.

[80]

Christopher Schröder, Andreas Niekler, and Martin Potthast. 2022. Revisiting Uncertainty-based Query Strategies for Active Learning with Transformers. In Findings of the Association for Computational Linguistics: ACL 2022, Dublin, Ireland, May 22-27, 2022, Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (Eds.). Association for Computational Linguistics, 2194–2203. https://doi.org/10.18653/v1/2022.findings-acl.172

[81]

Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, and Aran Komatsuzaki. 2021. LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs. CoRR abs/2111.02114(2021). arXiv:2111.02114https://arxiv.org/abs/2111.02114

[82]

Tal Schuster, Roei Schuster, Darsh J. Shah, and Regina Barzilay. 2020. The Limitations of Stylometry for Detecting Machine-Generated Fake News. Comput. Linguist. 46, 2 (June 2020), 499–510. https://doi.org/10.1162/coli_a_00380

Digital Library

[83]

Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. Deep unsupervised learning using nonequilibrium thermodynamics. In International Conference on Machine Learning. PMLR, 2256–2265.

[84]

Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, and Tom Goldstein. 2022. Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models. CoRR abs/2212.03860(2022). https://doi.org/10.48550/arXiv.2212.03860 arXiv:2212.03860

[85]

Statista Inc.2022. Digital Media Report - Video Games. https://www.statista.com/study/39310/video-games/.

[86]

Yi Tay, Vinh Q. Tran, Mostafa Dehghani, Jianmo Ni, Dara Bahri, Harsh Mehta, Zhen Qin, Kai Hui, Zhe Zhao, Jai Prakash Gupta, Tal Schuster, William W. Cohen, and Donald Metzler. 2022. Transformer Memory as a Differentiable Search Index. CoRR abs/2202.06991(2022). arXiv:2202.06991https://arxiv.org/abs/2202.06991

[87]

Nicola Tonellotto. 2022. Lecture Notes on Neural Information Retrieval. CoRR abs/2207.13443(2022). https://doi.org/10.48550/arXiv.2207.13443 arXiv:2207.13443

[88]

Simon Tong and Edward Y. Chang. 2001. Support vector machine active learning for image retrieval. In Proceedings of the 9th ACM International Conference on Multimedia 2001, Ottawa, Ontario, Canada, September 30 - October 5, 2001, Nicolas D. Georganas and Radu Popescu-Zeletin (Eds.). ACM, 107–118. https://doi.org/10.1145/500141.500159

Digital Library

[89]

Antti Ukkonen, Pyry Joona, and Tuukka Ruotsalo. 2020. Generating Images Instead of Retrieving Them: Relevance Feedback on Generative Adversarial Networks. In Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, SIGIR 2020, Virtual Event, China, July 25-30, 2020, Jimmy X. Huang, Yi Chang, Xueqi Cheng, Jaap Kamps, Vanessa Murdock, Ji-Rong Wen, and Yiqun Liu (Eds.). ACM, 1329–1338. https://doi.org/10.1145/3397271.3401129

Digital Library

[90]

Salahuddin Unar, Xingyuan Wang, Chuan Zhang, and Chunpeng Wang. 2019. Detected text-based image retrieval approach for textual images. IET Image Process. 13, 3 (2019), 515–521. https://doi.org/10.1049/iet-ipr.2018.5277

[91]

Ellen M. Voorhees. 2001. The Philosophy of Information Retrieval Evaluation. In Evaluation of Cross-Language Information Retrieval Systems, Second Workshop of the Cross-Language Evaluation Forum, CLEF 2001, Darmstadt, Germany, September 3-4, 2001, Revised Papers(Lecture Notes in Computer Science, Vol. 2406), Carol Peters, Martin Braschler, Julio Gonzalo, and Michael Kluck (Eds.). Springer, 355–370.

[92]

Ellen M. Voorhees. 2019. The Evolution of Cranfield. In Information Retrieval Evaluation in a Changing World - Lessons Learned from 20 Years of CLEF, Nicola Ferro and Carol Peters (Eds.). The Information Retrieval Series, Vol. 41. Springer, 45–69.

[93]

Ellen M. Voorhees, Ian Soboroff, and Jimmy Lin. 2022. Can Old TREC Collections Reliably Evaluate Modern Neural Retrieval Models?CoRR abs/2201.11086(2022). arXiv:2201.11086

[94]

Xintao Wang, Yu Li, Honglun Zhang, and Ying Shan. 2021. Towards Real-World Blind Face Restoration With Generative Facial Prior. In IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2021. Computer Vision Foundation / IEEE, 9168–9178. https://doi.org/10.1109/CVPR46437.2021.00905

[95]

Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Hao Sun, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Allen Sun, Weiwei Deng, Qi Zhang, and Mao Yang. 2022. A Neural Corpus Indexer for Document Retrieval. CoRR abs/2206.02743(2022). https://doi.org/10.48550/arXiv.2206.02743 arXiv:2206.02743

[96]

Ryen W. White, Mikhail Bilenko, and Silviu Cucerzan. 2007. Studying the use of popular destinations to enhance web search interaction. In SIGIR 2007: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands, July 23-27, 2007, Wessel Kraaij, Arjen P. de Vries, Charles L. A. Clarke, Norbert Fuhr, and Noriko Kando (Eds.). ACM, 159–166. https://doi.org/10.1145/1277741.1277771

Digital Library

[97]

Ryen W. White, Ian Ruthven, and Joemon M. Jose. 2005. A study of factors affecting the utility of implicit relevance feedback. In SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Salvador, Brazil, August 15-19, 2005, Ricardo A. Baeza-Yates, Nivio Ziviani, Gary Marchionini, Alistair Moffat, and John Tait (Eds.). ACM, 35–42. https://doi.org/10.1145/1076034.1076044

Digital Library

[98]

Wikipedia contributors. 2022. Infinite monkey theorem — Wikipedia, The Free Encyclopedia. https://en.wikipedia.org/w/index.php?title=Infinite_monkey_theorem&oldid=1122059899 [Online; accessed 10-January-2023].

[99]

Zuobing Xu, Ram Akella, and Yi Zhang. 2007. Incorporating Diversity and Density in Active Learning for Relevance Feedback. In Advances in Information Retrieval, 29th European Conference on IR Research, ECIR 2007, Rome, Italy, April 2-5, 2007, Proceedings(Lecture Notes in Computer Science, Vol. 4425), Giambattista Amati, Claudio Carpineto, and Giovanni Romano (Eds.). Springer, 246–257. https://doi.org/10.1007/978-3-540-71496-5_24

[100]

Christoph Zauner. 2010. Implementation and benchmarking of perceptual image hash functions. Master’s thesis. Upper Austria University of Applied Sciences, Hagenberg Campus.

[101]

Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, and Yejin Choi. 2019. Defending Against Neural Fake News. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.). Vol. 32. Curran Associates, Inc.https://proceedings.neurips.cc/paper/2019/file/3e9f0fc9b2f89e043bc6233994dfcf76-Paper.pdf

[102]

Cha Zhang and Tsuhan Chen. 2002. An active learning framework for content-based information retrieval. IEEE Transactions on Multimedia 4, 2 (2002), 260–268. https://doi.org/10.1109/TMM.2002.1017738

Digital Library

[103]

Yinglong Zhang and Robert Capra. 2019. Understanding How People Use Search to Support Their Everyday Creative Tasks. In Proceedings of the 2019 Conference on Human Information Interaction and Retrieval(Glasgow, Scotland UK). ACM, New York, NY, USA, 153–162. https://doi.org/10.1145/3295750.3298936

Digital Library

[104]

Yinglong Zhang, Rob Capra, and Yuan Li. 2020. An In-Situ Study of Information Needs in Design-Related Creative Projects. In Proceedings of the 2020 Conference on Human Information Interaction and Retrieval(Vancouver BC, Canada). ACM, New York, NY, USA, 113–123. https://doi.org/10.1145/3343413.3377973

[105]

Shengyao Zhuang, Houxing Ren, Linjun Shou, Jian Pei, Ming Gong, Guido Zuccon, and Daxin Jiang. 2022. Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation. CoRR abs/2206.10128 (2022). https://doi.org/10.48550/arXiv.2206.10128 arXiv:2206.10128

Index Terms

The Infinite Index: Information Retrieval on Generative Text-To-Image Models
1. Information systems
  1. Information retrieval

Information & Contributors

Information

Published In

CHIIR '23: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval

March 2023

520 pages

ISBN: 9798400700354

DOI: 10.1145/3576840

Editors:
Jacek Gwizdka, School of Information, The University of Texas at Austin, Texas, USA
Soo Young Rieh, School of Information, The University of Texas at Austin, Texas, USA

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 March 2023

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Bundesministerium für Bildung und Forschung

Conference

CHIIR '23

CHIIR '23: ACM SIGIR Conference on Human Information Interaction and Retrieval

March 19 - 23, 2023

Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 55 of 163 submissions, 34%

Bibliometrics

Total Citations: 10
Total Downloads: 1,783
Downloads (Last 12 months): 841
Downloads (Last 6 weeks): 96

Reflects downloads up to 18 Jan 2025

Citations

Cited By

Sharma K, Sherk T, Patel V, Tran M, Nguyen T. (2025). GAIA: A Benchmark of Analyzing User Rankings for Synthesized Images. Advances in Visual Computing, 451-463. Online publication date: 22-Jan-2025. https://doi.org/10.1007/978-3-031-77392-1_34

Martín Prada J. (2024). La creación artística visual frente a los retos de la inteligencia artificial. Automatización creativa y cuestionamientos éticos. Eikon / Imago 13, e90081. Online publication date: 21-Mar-2024. https://doi.org/10.5209/eiko.90081

Chaudhary M, Tiwari S, Gangele A, Gaur L. (2024). From Code to Conscience. Responsible Implementations of Generative AI for Multidisciplinary Use, 165-188. Online publication date: 20-Sep-2024. https://doi.org/10.4018/979-8-3693-9173-0.ch006

Ling L, Chen X, Wen R, Li T, LC R. (2024). Sketchar: Supporting Character Design and Illustration Prototyping Using Generative AI. Proceedings of the ACM on Human-Computer Interaction 8, CHI PLAY, 1-28. Online publication date: 15-Oct-2024. https://dl.acm.org/doi/10.1145/3677102

Arabzadeh N, Diaz F, He J, Sakai T, Ishita E, Ohshima H, Hasibi F, Mao J, Jose J. (2024). Offline Evaluation of Set-Based Text-to-Image Generation. Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, 42-53. Online publication date: 8-Dec-2024. https://dl.acm.org/doi/10.1145/3673791.3698424

Peng X, Koch J, Mackay W. (2024). DesignPrompt: Using Multimodal Interaction for Design Exploration with Generative AI. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 804-818. Online publication date: 1-Jul-2024. https://dl.acm.org/doi/10.1145/3643834.3661588

Misu M, Lopes C, Ma I, Noble J. (2024). Towards AI-Assisted Synthesis of Verified Dafny Methods. Proceedings of the ACM on Software Engineering 1, FSE, 812-835. Online publication date: 12-Jul-2024. https://dl.acm.org/doi/10.1145/3643763

Gienapp L, Scells H, Deckers N, Bevendorff J, Wang S, Kiesel J, Syed S, Fröbe M, Zuccon G, Stein B, Hagen M, Potthast M, Hui Yang G, Wang H, Han S, Hauff C, Zuccon G, Zhang Y. (2024). Evaluating Generative Ad Hoc Information Retrieval. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 1916-1929. Online publication date: 10-Jul-2024. https://dl.acm.org/doi/10.1145/3626772.3657849

Oppenlaender J, Linder R, Silvennoinen J. (2024). Prompting AI Art: An Investigation into the Creative Skill of Prompt Engineering. International Journal of Human–Computer Interaction, 1-23. Online publication date: 28-Nov-2024. https://doi.org/10.1080/10447318.2024.2431761

Oppenlaender J. (2024). The Cultivated Practices of Text-to-Image Generation. Humane Autonomous Technology, 325-349. Online publication date: 22-Oct-2024. https://doi.org/10.1007/978-3-031-66528-8_14
