Stars
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
Finetune Qwen3, Llama 4, TTS, DeepSeek-R1 & Gemma 3 LLMs 2x faster with 70% less memory! 🦥
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
Open3D: A Modern Library for 3D Data Processing
This repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
🦜🔗 Build context-aware reasoning applications
PyTorch pipeline for 3D image domain translation using Cycle Generative Adversarial Networks, without paired examples.
🧑‍🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Unsupervised Learning for Image Registration
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source …
Deep learning for image processing, including classification and object detection.
Tensors and dynamic neural networks in Python with strong GPU acceleration
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Image-to-Image Translation in PyTorch
MMGeneration is a powerful toolkit for generative models, based on PyTorch and MMCV.
OpenMMLab's next-generation platform for general 3D object detection.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite