research-article

Artificial intelligence snapchat: Visual conversation agent

Published: 01 July 2020

Abstract

Visual conversation is a dialog in which the parties exchange visual information. The key novelty presented in this paper is an artificial intelligence-driven method for automating visual conversation. We present a state-of-the-art Artificial Intelligence Snapchat Visual Conversation Agent (AISVCA). AISVCA uses the proposed method to caption a received image and to generate an appropriate visual response. These functions are achieved by combining a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM) neural network, and Latent Semantic Indexing (LSI). The CNN and LSTM generate image captions, and LSI assesses the semantic similarity between captions generated from a personalized image dataset and captions extracted from the received image. We show that AISVCA, using the proposed method, can generate a visual response that is practically indistinguishable from a human visual response. To evaluate the approach, we measured the accuracy of the system and conducted a user study of communication quality, in which we analyzed the source credibility and interpersonal attraction of AISVCA. The user study found no significant differences in communication quality between a visual conversation with AISVCA and a visual conversation with a human agent.
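The abstract states only that LSI scores the semantic similarity between captions of a personalized image dataset and the caption of the received image; it gives no implementation details. The following is a minimal sketch of that one step, assuming plain bag-of-words counts, a truncated SVD computed with NumPy, and cosine similarity in the latent space. All function names, the example captions, and the choice of `k` are hypothetical and are not taken from the paper.

```python
import numpy as np


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; the paper does not specify one).
    return text.lower().split()


def build_term_doc_matrix(captions):
    # Rows are vocabulary terms, columns are captions, entries are raw counts.
    vocab = sorted({t for c in captions for t in tokenize(c)})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(captions)))
    for j, caption in enumerate(captions):
        for t in tokenize(caption):
            A[index[t], j] += 1.0
    return A, index


def lsi_fit(A, k):
    # Truncated SVD: A is approximated by U_k diag(s_k) V_k^T.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-12)))  # never keep zero singular values
    return U[:, :k], s[:k], Vt[:k, :].T  # each row of V_k is one caption


def fold_in(text, index, U_k, s_k):
    # Project a new caption into the latent space: q_k = diag(s_k)^-1 U_k^T q.
    q = np.zeros(U_k.shape[0])
    for t in tokenize(text):
        if t in index:  # terms unseen in the dataset are ignored
            q[index[t]] += 1.0
    return (q @ U_k) / s_k


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def best_response(received_caption, dataset_captions, k=2):
    # Return the index of the dataset caption closest to the received caption,
    # together with all similarity scores.
    A, index = build_term_doc_matrix(dataset_captions)
    U_k, s_k, V_k = lsi_fit(A, k)
    q_k = fold_in(received_caption, index, U_k, s_k)
    scores = [cosine(q_k, V_k[j]) for j in range(len(dataset_captions))]
    return int(np.argmax(scores)), scores


if __name__ == "__main__":
    # Hypothetical captions standing in for a personalized image dataset.
    dataset = [
        "a dog playing in the park",
        "a plate of pasta on a table",
        "a dog running on the beach",
        "a cup of coffee on a table",
    ]
    best, scores = best_response("a dog in the park", dataset, k=2)
    print(best, [round(s, 3) for s in scores])
```

In a pipeline like the one the abstract describes, the returned index would select the dataset image whose caption is closest to the received caption. How AISVCA actually selects or post-processes its response is not stated in the abstract.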

Published In

Applied Intelligence, Volume 50, Issue 7
Jul 2020
313 pages

Publisher

Kluwer Academic Publishers, United States

Author Tags

1. AISVCA
2. Visual conversation
3. Neural networks
4. Chat platforms

Qualifiers

• Research-article
