zxd3099 (Xiaodong Zhu) / Starred · GitHub
  • Wuhan University

A list of tools, papers and code related to Fake Audio Detection.

115 4 Updated Jun 10, 2025

Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller

Jupyter Notebook 4,396 805 Updated Apr 24, 2023

Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).

2,600 338 Updated May 30, 2023

This repository contains a collection of resources and papers on Detecting Multimedia Generated by Large AI Models

92 5 Updated May 7, 2025

Research progress on speech deepfake detection: relevant datasets aggregated from the review literature and publicly available code

201 13 Updated Jun 4, 2025

Code repo for our paper "Audio-Visual Representation Learning via Knowledge Distillation from Speech Foundation Models"

Python 7 Updated Feb 14, 2025

The official PyTorch implementation of the Interspeech 2024 paper "Reshape Dimensions Network for Speaker Recognition"

Python 163 9 Updated Nov 14, 2024

A simple tool to update bib entries with their official information (e.g., DBLP or the ACL anthology).

Python 2,850 164 Updated May 22, 2025

A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper.

Python 233 41 Updated Feb 15, 2024

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

Python 1,365 127 Updated Apr 24, 2024

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

Python 384 31 Updated Sep 11, 2023

A curated list of research based on CLIP.

222 14 Updated Nov 17, 2024

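An introductory Chinese-language tutorial for PyTorch Lightning; please credit the source when reposting. (Originally written for fun; it is best to read through the MNIST example before getting started.)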

Jupyter Notebook 217 19 Updated Dec 6, 2020

[TIP 2022] End-to-end Temporal Action Detection with Transformer

Python 153 12 Updated Feb 19, 2023

Large-Scale Chinese Data Benchmark for Face Video Anti-Forgery Identification

Python 13 2 Updated Feb 26, 2025

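PyTorch implementation of Grad-CAM and Grad-CAM++ that can visualize Class Activation Maps (CAM) for any classification network, including custom networks; it also implements CAM visualization for two object detection networks, Faster R-CNN and RetinaNet. Feel free to try it, follow the project, and report issues...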

Python 752 171 Updated Jan 13, 2021

Explainability for Vision Transformers

Python 965 103 Updated Mar 12, 2022

Open source implementation of "Vision Transformers Need Registers"

Python 179 15 Updated Apr 6, 2025

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python 6,891 953 Updated Jul 3, 2024

Official codebase for "Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling".

Python 31 2 Updated Aug 2, 2024

Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".

Python 259 25 Updated Mar 20, 2024

Vision Transformers are Parameter-Efficient Audio-Visual Learners

Python 99 7 Updated Aug 11, 2023

Official implementation of RAVEn (ICLR 2023) and BRAVEn (ICASSP 2024)

Python 65 4 Updated Feb 27, 2025

A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)

Python 51 4 Updated Apr 17, 2024

A self-supervised learning framework for audio-visual speech

Python 906 141 Updated Dec 7, 2023

[CVPR 2024] Official implementation of the paper "TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression"

Python 24 1 Updated Jun 26, 2024

[ECCV2024] Video Foundation Models & Data for Multimodal Understanding

Python 1,907 112 Updated May 25, 2025

[ACM MM'23] UMMAFormer: A Universal Multimodal-adaptive Transformer Framework For Temporal Forgery Localization

Python 64 2 Updated Nov 12, 2024

[CVIU, DICTA Award] Glitch in the Matrix: A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization

Python 89 18 Updated Dec 22, 2024