Stars
Qwen3 is the large language model series developed by the Qwen team, Alibaba Cloud.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
GLM-4 series: Open Multilingual Multimodal Chat LMs
GPT4V-level open-source multi-modal model based on Llama3-8B
WeChatCV / opencv_3rdparty
Forked from opencv/opencv_3rdparty
OpenCV - 3rdparty
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
A curated list of awesome papers on dataset distillation and related applications.
Integrates deep learning models for image classification | A project for learning, comparing, and heavily modifying backbones
PyTorch implementation of image classification models for CIFAR-10/CIFAR-100/MNIST/FashionMNIST/Kuzushiji-MNIST/ImageNet
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Stable Diffusion web UI
RepVGG TensorRT INT8 quantization; measured inference takes under 1 ms per frame!
RepVGG: Making VGG-style ConvNets Great Again
Datasets, Transforms and Models specific to Computer Vision
label-smooth, amsoftmax, partial-fc, focal-loss, triplet-loss, lovasz-softmax. Maybe useful
Download images from Google, Bing, and Baidu.
OpenMMLab Detection Toolbox and Benchmark
Semantic Segmentation in PyTorch. Networks include: FCN, FCN_ResNet, SegNet, UNet, BiSeNet, BiSeNetV2, PSPNet, DeepLabv3_plus, HRNet, DDRNet
Semantic Segmentation on PyTorch (includes FCN, PSPNet, Deeplabv3, Deeplabv3+, DANet, DenseASPP, BiSeNet, EncNet, DUNet, ICNet, ENet, OCNet, CCNet, PSANet, CGNet, ESPNet, LEDNet, DFANet)
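This repository collects the source code of the author's award-winning solutions from major data science competitions, along with original baselines for some newer competitions. It mainly covers Kaggle, Alibaba Tianchi, the Huawei Cloud Competition campus track, Baidu AI Studio, the Heywhale community, DataFountain, and others.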
Official PyTorch Implementation of the Paper "CRNet: Classification and Regression Neural Network for Facial Beauty Prediction" (Pacific-Rim Conference on Multimedia (PCM) 2018)
An (unofficial) implementation of Focal Loss, as described in the RetinaNet paper, generalized to the multi-class case.