CQUPT (Chongqing University of Posts and Telecommunications)
- Chongqing
Stars
Official Implementation of "VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning".
Official code for "Learning Prompt-Enhanced Context Features for Weakly-Supervised Video Anomaly Detection" (IEEE-TIP)
[ACM MM 2022] Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection
Implementation for paper "Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Model"
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
[CVPR 2024] Official repository of the paper "Uncovering What, Why and How: A Comprehensive Benchmark for Causation Understanding of Video Anomaly"
Official implementation for paper TEVAD: Improved video anomaly detection with captions
Official project page of the paper "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges" (Accepted by CVPR 2024)
Extract frames from videos in Python using OpenCV.
🔥 🔥 🔥 [NeurIPS 2024] Official Implementation of Hawk: Learning to Understand Open-World Video Anomalies
[ICME 2025] Official implementation of "GlanceVAD: Exploring Glance Supervision for Label-efficient Video Anomaly Detection"
Official implementation of "Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM"
Official implementation for ECCV paper "Towards Open Set Video Anomaly Detection"
Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
Official code for AAAI2023 paper "Dual Memory Units with Uncertainty Regulation for Weakly Supervised Video Anomaly Detection"
"Real-world Anomaly Detection in Surveillance Videos" (CVPR 2018), with the UCF-Crime dataset
My Semantic_Segmentation_Study, based on Cityscapes
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image