- Tokyo, Japan (UTC +09:00)
- in/phuongdnm
Starred repositories
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal domains, for both inference and training.
Fine-tune LLMs for free with 100+ Notebooks on Google Colab, Kaggle, and more.
Efficient vision foundation models for high-resolution generation and perception.
Error correction back-end for speaker diarization
Repository for "LLM-based speaker diarization correction: A generalizable approach" paper
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Official PyTorch implementation of DistiLLM: Towards Streamlined Distillation for Large Language Models (ICML 2024)
[ECCV2024 Oral] Official implementation of the paper "Relation DETR: Exploring Explicit Position Relation Prior for Object Detection"
Integrate the DeepSeek API into popular software
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (V…
Open-source industrial-grade ASR models supporting Mandarin, Chinese dialects and English, achieving a new SOTA on public Mandarin ASR benchmarks, while also offering outstanding singing lyrics rec…
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
An AI-powered research assistant that performs iterative, deep research on any topic by combining search engines, web scraping, and large language models. The goal of this repo is to provide the si…
A fast tool to convert any website into LLM-ready markdown data. Built by https://supermemory.ai
A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's CoT reasoning traces with Anthropic Claude models.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement"
Sky-T1: Train your own O1 preview model within $450
Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.