VanderHua / Starred · GitHub

Starred repositories

[ACM MM 2024] See or Guess: Counterfactually Regularized Image Captioning

Python 14 Updated Feb 17, 2025

Grok open release

Python 50,280 8,354 Updated Aug 30, 2024

MiniSora: A community that aims to explore the implementation path and future development direction of Sora.

Python 1,265 150 Updated Feb 18, 2025

A curated list of recent diffusion models for video generation, editing, and various other applications.

4,432 259 Updated May 17, 2025

CHAIR metric is a rule-based metric for evaluating object hallucination in caption generation.

Python 29 Updated Nov 8, 2023

[ICLR'23] DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Python 779 95 Updated Mar 1, 2024

Code for the paper "LLark: A Multimodal Instruction-Following Language Model for Music" by Josh Gardner, Simon Durand, Daniel Stoller, and Rachel Bittner.

Jupyter Notebook 343 27 Updated May 30, 2024

Official implementation for "Break-A-Scene: Extracting Multiple Concepts from a Single Image" [SIGGRAPH Asia 2023]

Python 518 25 Updated Jan 14, 2024

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Python 3,946 331 Updated Jan 17, 2025

[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World

Python 484 17 Updated Aug 9, 2024

Acceptance rates for the major AI conferences

Jupyter Notebook 4,485 309 Updated Jan 24, 2025

🚀🎬 ShortGPT - Experimental AI framework for YouTube Shorts / TikTok channel automation

Python 6,502 867 Updated Feb 10, 2025

(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.

2,339 201 Updated May 6, 2025

🤖 An automated machine learning framework for audio, text, image, video, or .CSV files (50+ featurizers and 15+ model trainers). Python 3.6 required.

Python 141 35 Updated Apr 2, 2025

LP-MusicCaps: LLM-Based Pseudo Music Captioning [ISMIR23]

Python 322 38 Updated Apr 8, 2024

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Python 256 25 Updated Mar 20, 2024

Official Pytorch Implementation of Our CVPR2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"

Python 176 6 Updated Jul 23, 2023

WavJourney: Compositional Audio Creation with LLMs

Python 536 43 Updated Sep 28, 2023

Emu Series: Generative Multimodal Models from BAAI

Python 1,720 85 Updated Sep 27, 2024

✨✨Latest Advances on Multimodal Large Language Models

15,240 984 Updated May 15, 2025

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,998 2,328 Updated Mar 13, 2025

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,653 236 Updated Dec 9, 2024

Image inpainting tool powered by SOTA AI models. Remove any unwanted object, defect, or person from your pictures, or erase and replace (powered by Stable Diffusion) anything in your pictures.

Python 21,254 2,159 Updated Apr 29, 2025

Open-source and strong foundation image recognition models.

Jupyter Notebook 3,239 301 Updated Feb 18, 2025

🐙 Guides, papers, lectures, notebooks and resources for prompt engineering

MDX 56,100 5,541 Updated May 16, 2025

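ChatGPT's explosive popularity marked a key step toward AGI. This project aims to collect open-source alternatives to ChatGPT, including large text models and large multimodal models, as a convenience for everyone.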

2,037 201 Updated Aug 14, 2023

Tracking and collecting papers/projects/others related to Segment Anything.

1,609 134 Updated Mar 13, 2025

Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).

Python 2,240 141 Updated Jun 7, 2023

ChatGPT, GenerativeAI and LLMs Timeline

951 58 Updated May 19, 2024