Stars
A model-driven approach to building AI agents in just a few lines of code.
A Unified Toolkit for Deep Learning Based Document Image Analysis
Concats a list of videos together using ffmpeg with sexy OpenGL transitions.
Finetuning and inference tools for the CogView4 and CogVideoX model series.
Fast and Accurate ML in 3 Lines of Code
Official project page of the paper "Towards Surveillance Video-and-Language Understanding: New Dataset, Baselines, and Challenges" (Accepted by CVPR 2024)
Chronos: Pretrained Models for Probabilistic Time Series Forecasting
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
This repository contains examples for customers to get started using the Amazon Bedrock Service. This contains examples for all available foundational models.
Solve Visual Understanding with Reinforced VLMs
A Flexible Framework for Comprehensive Multimodal Model Evaluation
The official implementation of "NAS-BNN: Neural Architecture Search for Binary Neural Networks"
Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…
Get your documents ready for gen AI
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Supercharge Your LLM Application Evaluations 🚀
LVBench: An Extreme Long Video Understanding Benchmark
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance
This guide provides instructions for creating and managing a SageMaker Hyperpod cluster, and training the AnimateAnyone algorithm on SageMaker Hyperpod
"Effective Whole-body Pose Estimation with Two-stages Distillation" (ICCV 2023, CV4Metaverse Workshop)