Stars
[CVPR 2025 Oral] OverLoCK: An Overview-first-Look-Closely-next ConvNet with Context-Mixing Dynamic Kernels
[TNNLS 2025] TransXNet: Learning Both Global and Local Dynamics with a Dual Dynamic Token Mixer for Visual Recognition
[NeurIPS 2022] HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
[ECCV2024] Official implementation of the paper "DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs".
[NeurIPS2023] Lightweight Vision Transformer with Bidirectional Interaction
The official code of "Rethinking Local Perception in Lightweight Vision Transformer"
[CVPR 2023] Official code release of our paper "BiFormer: Vision Transformer with Bi-Level Routing Attention"
[CVPR2023] "SMPConv: Self-moving Point Representations for Continuous Convolution"
[NeurIPS 2022] Official code for "Focal Modulation Networks"
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
[CADL'22, ECCVW] Official repository of the paper "EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications".
The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
[CVPR2025] Breaking the Low-Rank Dilemma of Linear Attention
Official repository for "Self-Distilled Vision Transformer for Domain Generalization" (ACCV-2022 ORAL)
[ACCV 2024] Official code for "DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention"
[ICML 2023] Official PyTorch implementation of Global Context Vision Transformers
[ICCV2023] This is an official implementation for "Scale-Aware Modulation Meet Transformer".
(CVPR2024) RMT: Retentive Networks Meet Vision Transformer
Code for our CVPR'25 paper - "DiffLocks: Generating 3D Hair from a Single Image using Diffusion Models"
Janus-Series: Unified Multimodal Understanding and Generation Models
A collection of resources on personalized image generation.
Dataset generation scripts for the SynthForge dataset
Code for SynthForge: Synthesizing High-Quality Face Dataset with Controllable 3D Generative Models
FSvFM: A Generalizable Face Security vision Foundation Model via Self-Supervised Facial Representation Learning (CVPR25)