-
DAPO Public
Forked from BytedTsinghua-SIA/DAPO
An Open-source RL System from ByteDance Seed and Tsinghua AIR
Updated Mar 20, 2025 -
3AM Public
Official code and data of "3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset"
-
VGen Public
Forked from ali-vilab/VGen
Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models
Python Updated Jun 21, 2024 -
groundingLMM Public
Forked from mbzuai-oryx/groundingLMM
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
Python Updated Jun 2, 2024 -
mm-cot Public
Forked from amazon-science/mm-cot
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
Python Apache License 2.0 Updated Aug 17, 2023 -
LLaVA Public
Forked from haotian-liu/LLaVA
Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities.
Python Apache License 2.0 Updated May 15, 2023 -
stanford_alpaca Public
Forked from tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Python Apache License 2.0 Updated Apr 4, 2023 -
VL-T5 Public
Forked from j-min/VL-T5
PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)
Python MIT License Updated Mar 20, 2023 -
fairseq_mmt Public
Forked from libeineu/fairseq_mmt
This code repository is for the accepted ACL2022 paper "On Vision Features in Multimodal Machine Translation". We provide the details and scripts for the proposed probing tasks. We hope the code co…
Python MIT License Updated Feb 27, 2023 -
THUMT Public
Forked from THUNLP-MT/THUMT
An open-source neural machine translation toolkit developed by Tsinghua Natural Language Processing Group
Python BSD 3-Clause "New" or "Revised" License Updated Aug 29, 2022 -
valhalla-nmt Public
Forked from JerryYLi/valhalla-nmt
Code repository for CVPR 2022 paper "VALHALLA: Visual Hallucination for Machine Translation"
Python MIT License Updated Jun 6, 2022 -
vision_transformer Public
Forked from google-research/vision_transformer
Jupyter Notebook Apache License 2.0 Updated Mar 15, 2022 -
CLIP_prefix_caption Public
Forked from rmokady/CLIP_prefix_caption
Simple image captioning model
Jupyter Notebook MIT License Updated Feb 15, 2022 -
VTLM Public
Forked from ImperialNLP/VTLM
Cross-lingual Visual Pre-training for Multimodal Machine Translation
Python Other Updated Dec 28, 2021 -
ViLT Public
Forked from dandelin/ViLT
Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Python Updated Dec 19, 2021 -
Style-AttnGAN Public
Forked from sidward14/Style-AttnGAN
Improves Text to Image synthesis from AttnGAN by integrating the scale-specific control from StyleGAN; can optionally use GPT-2 as text encoder
Python Other Updated Nov 5, 2021 -
proxynca_pp Public
Forked from euwern/proxynca_pp
The implementation of ProxyNCA++.
Python MIT License Updated Oct 8, 2021 -
ImageCaptioning.pytorch Public
Forked from ruotianluo/ImageCaptioning.pytorch
I decide to sync up this repo and self-critical.pytorch. (The old master is in old master branch for archive)
Python MIT License Updated Sep 28, 2021 -
progressive_growing_of_gans Public
Forked from tkarras/progressive_growing_of_gans
Progressive Growing of GANs for Improved Quality, Stability, and Variation
Python Other Updated Sep 27, 2021 -
GLM (General Language Model)
Python MIT License Updated Sep 6, 2021 -
x-transformers Public
Forked from lucidrains/x-transformers
A simple but complete full-attention transformer with a set of promising experimental features from various papers
Python MIT License Updated Sep 3, 2021 -
genforce Public
Forked from genforce/genforce
GenForce: an efficient PyTorch library for deep generative modeling (StyleGANv1v2, PGGAN, etc)
Python MIT License Updated Aug 28, 2021 -
TediGAN Public
Forked from IIGROUP/TediGAN
[CVPR 2021] Pytorch implementation for TediGAN: Text-Guided Diverse Face Image Generation and Manipulation.
Python MIT License Updated Aug 28, 2021 -
taming-transformers Public
Forked from CompVis/taming-transformers
Taming Transformers for High-Resolution Image Synthesis
Jupyter Notebook MIT License Updated Aug 27, 2021 -
TTUR Public
Forked from bioinf-jku/TTUR
Two time-scale update rule for training GANs
Jupyter Notebook Apache License 2.0 Updated Aug 22, 2021 -
Vision-Language-Transformer Public
Forked from henghuiding/Vision-Language-Transformer
Vision-Language Transformer and Query Generation for Referring Segmentation (ICCV 2021)
Python MIT License Updated Aug 17, 2021