Wuhan University
Stars
A list of tools, papers and code related to Fake Audio Detection.
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
Awesome Knowledge-Distillation: a categorized collection of knowledge distillation papers (2014-2021).
This repository contains a collection of resources and papers on Detecting Multimedia Generated by Large AI Models
Research progress on speech deepfake detection: relevant datasets and publicly available code aggregated from the survey literature.
Code repository for the paper "Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models".
The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition".
A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).
A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
An introductory PyTorch Lightning tutorial in Chinese; please credit the source when reposting. (Originally written for fun; it is recommended to work through the MNIST example before getting started.)
[TIP 2022] End-to-end Temporal Action Detection with Transformer
A large-scale Chinese data benchmark for face video anti-forgery identification.
PyTorch implementations of Grad-CAM and Grad-CAM++ that visualize the Class Activation Map (CAM) of any classification network, including custom networks; CAM visualization is also implemented for the Faster R-CNN and RetinaNet object detectors. Trying it out, starring, and reporting issues are welcome.
Open source implementation of "Vision Transformers Need Registers"
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
Vision Transformers are Parameter-Efficient Audio-Visual Learners
Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
A self-supervised learning framework for audio-visual speech
[CVPR 2024] Official implementation of the paper "TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression"
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
[ACM MM'23] UMMAFormer: A Universal Multimodal-adaptive Transformer Framework For Temporal Forgery Localization
[CVIU, DICTA Award] Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization