- Mumbai, Maharashtra, India
- @itsmariodias
- in/mario-dias
VQA
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Deep Modular Co-Attention Networks for Visual Question Answering
Unofficial TensorFlow implementation of "Bottom-up and Top-down attention for VQA" (TF v. 1.13)
An efficient PyTorch implementation of the winning entry of the 2017 VQA Challenge.
A reimplementation of the hierarchical co-attention network
Train a deeper LSTM and normalized CNN Visual Question Answering model. The current code achieves 58.16 on OpenEnded and 63.09 on Multiple-Choice on test-standard.
A simple Flask app that generates an answer given an image and a natural language question about the image. The app uses a deep learning model, trained with TensorFlow, behind the scenes.
[Reimplementation of Antol et al. 2015] Keras-based LSTM/CNN models for Visual Question Answering
PyTorch bottom-up attention with Detectron2
Rich Visual Knowledge-based Augmentation Network for Visual Question Answering