DOI: 10.1145/3430984.3431065
Extended abstract

Model Agnostic Information Biasing for VQA

Published: 02 January 2021

Abstract

Visual question answering (VQA) involves generating information-rich features from given images and from questions based on them. We explore inducing biases and structuring multimodal latent spaces through fusion loss regularization. Our loss-based strategy aims to make the multimodal representation of the student branch (Image+Question) resemble that of the teacher branch (Image+Answer), which is built with the same model. Our main contribution is a model-agnostic approach based on creating a homogeneous multimodal latent space for the image-with-question and image-with-answer representations. To the best of our knowledge, this is the only work exploring latent space fusion through regularization for VQA.
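The student/teacher regularization described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion operator, the distance measure, or the loss weighting, so everything below is an assumption made for illustration: `fuse` (an element-wise product), `fusion_loss` (a mean squared distance), `total_loss`, and the `weight` parameter are hypothetical names and choices, not the authors' implementation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def fuse(image_feat: Vector, text_feat: Vector) -> List[float]:
    """Hypothetical fusion step: element-wise product of two equal-length vectors.

    The same function is used for both branches, mirroring the abstract's
    statement that both representations are made with the same model.
    """
    if len(image_feat) != len(text_feat):
        raise ValueError("feature vectors must have the same length")
    return [i * t for i, t in zip(image_feat, text_feat)]


def fusion_loss(student: Vector, teacher: Vector) -> float:
    """Assumed regularizer: mean squared distance between the two representations."""
    if len(student) != len(teacher):
        raise ValueError("representations must have the same length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def total_loss(
    task_loss: float,
    image_feat: Vector,
    question_feat: Vector,
    answer_feat: Vector,
    weight: float = 0.1,  # assumed regularization strength
) -> float:
    """Add the fusion regularizer to a task loss computed elsewhere."""
    student = fuse(image_feat, question_feat)  # Image+Question branch
    teacher = fuse(image_feat, answer_feat)    # Image+Answer branch
    return task_loss + weight * fusion_loss(student, teacher)


if __name__ == "__main__":
    image = [1.0, 2.0]
    question = [0.5, 0.5]
    answer = [0.5, 1.0]
    print(total_loss(1.0, image, question, answer))
```

Because the regularizer only compares two fused representations, a term of this kind could in principle be attached to any VQA model that exposes such a representation, which is one way to read the "model agnostic" claim.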



Published In

CODS-COMAD '21: Proceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD)
January 2021
453 pages
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Model Agnostic
  2. Space Fusion
  3. Visual Question Answering

Qualifiers

  • Extended-abstract
  • Research
  • Refereed limited

Conference

CODS COMAD 2021: 8th ACM IKDD CODS and 26th COMAD
January 2 - 4, 2021
Bangalore, India

Acceptance Rates

Overall Acceptance Rate 197 of 680 submissions, 29%
