research-article

Artificial intelligence snapchat: Visual conversation agent

Authors:

Adrian David Cheok,

Kirthana Govindarajoo,

Nurizzaty Salehuddin,

Somaiyeh Vedadi

Applied Intelligence, Volume 50, Issue 7

Pages 2040 - 2049

https://doi.org/10.1007/s10489-019-01621-2

Published: 01 July 2020

Abstract

Visual conversation is a dialog in which the parties exchange visual information. The key novelty of this paper is an artificial intelligence-driven method for automating visual conversation. We present a state-of-the-art Artificial Intelligence Snapchat Visual Conversation Agent (AISVCA). AISVCA uses the proposed method to caption a received image and to generate an appropriate visual response. These functions combine a Convolutional Neural Network (CNN), a Long Short-Term Memory neural network (LSTM), and Latent Semantic Indexing (LSI). The CNN and LSTM generate image captions, and LSI assesses the semantic similarity between captions generated from a personalized image dataset and the caption extracted from the received image. We show that AISVCA, using the proposed method, can generate a visual response that is practically indistinguishable from a human visual response. To evaluate the approach, we measured the accuracy of the system and conducted a user study of communication quality, in which we analyzed the source credibility and interpersonal attraction of AISVCA. The user study found no significant differences in communication quality between a visual conversation with AISVCA and one with a human agent.
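The LSI matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the captions are hard-coded strings standing in for the output of a CNN-LSTM captioner, and the function names (`lsi_fit`, `lsi_query`, `best_response`), the tiny stopword list, and the choice of raw term counts with `k = 2` latent dimensions are all assumptions made for illustration. The sketch builds a term-by-caption matrix, truncates its SVD, folds the received caption into the latent space, and ranks the personal images by cosine similarity.

```python
import numpy as np

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "on", "in", "of", "is", "and"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]

def term_vector(text, vocab):
    """Raw term-count vector of `text` over a fixed vocabulary."""
    v = np.zeros(len(vocab))
    for w in tokenize(text):
        if w in vocab:
            v[vocab[w]] += 1.0
    return v

def lsi_fit(captions, k=2):
    """Truncated SVD of the term-by-caption matrix (the core of LSI)."""
    words = sorted({w for c in captions for w in tokenize(c)})
    vocab = {w: i for i, w in enumerate(words)}
    A = np.stack([term_vector(c, vocab) for c in captions], axis=1)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, int(np.sum(s > 1e-10)))  # skip (near-)zero singular values
    # Rows of Vt[:k].T are the captions' coordinates in the latent space.
    return vocab, U[:, :k], s[:k], Vt[:k, :].T

def lsi_query(text, vocab, U_k, s_k):
    """Fold a new caption into the latent space: q^T U_k S_k^{-1}."""
    return (term_vector(text, vocab) @ U_k) / s_k

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_response(received_caption, personal_captions, k=2):
    """Return the index of the most similar personal caption and all scores."""
    vocab, U_k, s_k, doc_vecs = lsi_fit(personal_captions, k)
    q = lsi_query(received_caption, vocab, U_k, s_k)
    scores = [cosine(q, d) for d in doc_vecs]
    return int(np.argmax(scores)), scores

if __name__ == "__main__":
    # Captions of the user's own images (hypothetical captioner output).
    personal = [
        "a dog running on the beach",
        "a dog playing in the park",
        "a pasta dinner on the table",
        "a pizza dinner on a plate",
    ]
    idx, scores = best_response("a dog on the beach", personal, k=2)
    print("reply with image", idx, "scores:", np.round(scores, 3))
```

With only two latent dimensions the two dog captions collapse onto the same topic direction, so they score alike and the food captions score near zero; a larger `k` keeps more word-level detail and separates captions within a topic.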

Information

Published In

Applied Intelligence Volume 50, Issue 7

Jul 2020

313 pages

ISSN:0924-669X

© Springer Science+Business Media, LLC, part of Springer Nature 2020.

Publisher

Kluwer Academic Publishers

United States
