INSPIR (Image-Text Synthesis Pipeline for Intelligent Retrieval and Generation) is a framework for image captioning and retrieval built on an ensemble of state-of-the-art models. It generates candidate captions with BLIP (Bootstrapping Language-Image Pre-training), ViT-GPT2 (a Vision Transformer encoder paired with a GPT-2 decoder), and GIT (Generative Image-to-Text), and uses CLIP (Contrastive Language-Image Pre-training) to rank the candidates by relevance to the image. The top-ranked captions are then passed to Llama 3.1, which produces creative outputs tailored to applications such as social media captions and image notes.

INSPIR also strengthens image retrieval: uploaded images are annotated automatically, and users can run text-based searches over those annotations for efficient access to relevant visual content. By aligning both modalities in a unified semantic space through contrastive learning, this work aims to push image captioning beyond conventional classification tasks, toward models that generalize across the interaction of language and vision. Minimal sketches of each stage follow below.
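The candidate-generation stage runs each captioner independently on the same image and collects one caption per model. Below is a minimal sketch, assuming the public Hugging Face checkpoints listed in `CAPTIONERS` (the text above does not pin exact checkpoints); `generate_candidates` is an illustrative helper rather than INSPIR's actual API.

```python
# Candidate caption generation with the BLIP / ViT-GPT2 / GIT ensemble.
# Checkpoint names are assumptions; swap in whichever variants you use.
from PIL import Image
from transformers import pipeline

CAPTIONERS = {
    "blip": "Salesforce/blip-image-captioning-base",
    "vit-gpt2": "nlpconnect/vit-gpt2-image-captioning",
    "git": "microsoft/git-base-coco",
}

def generate_candidates(image_path: str) -> list[str]:
    """Run every captioning model once and collect its caption."""
    image = Image.open(image_path).convert("RGB")
    captions = []
    for name, checkpoint in CAPTIONERS.items():
        captioner = pipeline("image-to-text", model=checkpoint)
        result = captioner(image)  # [{"generated_text": "..."}]
        captions.append(result[0]["generated_text"])
    return captions
```

In practice the three pipelines would be loaded once and reused across images; they are constructed inside the loop here only to keep the sketch short.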
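The ranking stage scores every candidate caption against the image with CLIP and sorts by similarity. A minimal sketch, assuming the `openai/clip-vit-base-patch32` checkpoint:

```python
# CLIP-based relevance ranking of candidate captions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rank_captions(image: Image.Image, captions: list[str]) -> list[tuple[str, float]]:
    """Return (caption, score) pairs sorted from most to least relevant."""
    inputs = clip_processor(text=captions, images=image,
                            return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = clip_model(**inputs)
    # logits_per_image has shape (1, num_captions): image-text similarity.
    scores = outputs.logits_per_image.softmax(dim=-1).squeeze(0)
    return sorted(zip(captions, scores.tolist()),
                  key=lambda pair: pair[1], reverse=True)
```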
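The top-ranked caption then seeds Llama 3.1 for the creative-output stage. The sketch below assumes the gated `meta-llama/Llama-3.1-8B-Instruct` checkpoint and an illustrative social-media prompt; the model size and prompting used by INSPIR are not specified in the text above.

```python
# Turning the top-ranked caption into a social media caption with Llama 3.1.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumption: requires gated access
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto")

best_caption = "a dog running on a beach at sunset"  # e.g. top CLIP-ranked caption

messages = [
    {"role": "system", "content": "You write short, catchy social media captions."},
    {"role": "user", "content": f"Write an Instagram caption for this image: {best_caption}"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=80,
                        do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```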
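For the retrieval side, one straightforward realization is to embed uploaded images with CLIP and match text queries by cosine similarity; the annotation-based search described above could equally match queries against the stored captions. Helper names below are illustrative.

```python
# Text-based image search over CLIP embeddings (illustrative helpers).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_images(paths: list[str]) -> torch.Tensor:
    """Encode uploaded images into unit-normalized CLIP embeddings."""
    images = [Image.open(p).convert("RGB") for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def search(query: str, paths: list[str], index: torch.Tensor, k: int = 5):
    """Return the k images whose embeddings best match the text query."""
    inputs = processor(text=[query], return_tensors="pt", padding=True)
    with torch.no_grad():
        q = model.get_text_features(**inputs)
    q = q / q.norm(dim=-1, keepdim=True)
    sims = (q @ index.T).squeeze(0)  # cosine similarity per image
    top = sims.topk(min(k, len(paths)))
    return [(paths[i], sims[i].item()) for i in top.indices.tolist()]

# Usage:
# paths = ["photos/a.jpg", "photos/b.jpg"]
# index = embed_images(paths)
# print(search("a dog on the beach", paths, index))
```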
About
A repository where machine learning concepts are implemented from scratch.