lpcinelli (Lucas Cinelli) / Starred · GitHub

VLM-driven tool that processes surveillance videos, extracts frames, and generates insightful annotations using a fine-tuned Florence-2 Vision-Language Model. Includes a Gradio-based interface for …

Python 116 14 Updated Jun 6, 2025

Surveillance Perspective Human Action Recognition Dataset: 7759 Videos from 14 Action Classes, aggregated from multiple sources, all cropped spatio-temporally and filmed from a surveillance-camera …

Python 103 20 Updated Apr 2, 2025

Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.

Python 41,085 3,267 Updated Jun 25, 2025

Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜

Jupyter Notebook 1,503 119 Updated Jun 2, 2025

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Python 66 1 Updated Jan 13, 2025

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 2,371 155 Updated Mar 3, 2025

VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

Python 489 49 Updated Apr 23, 2025

✨✨Latest Advances on Multimodal Large Language Models

15,637 1,015 Updated Jun 19, 2025

VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

Python 3,348 268 Updated Jun 19, 2025

Processing Video POC with Multimodal LLMs

Python 9 2 Updated May 12, 2025

Analyze videos using LLMs, Computer Vision and Automatic Speech Recognition

Python 897 119 Updated Apr 23, 2025

Python Computer Vision & Video Analytics Framework With Batteries Included

Python 650 59 Updated Jun 22, 2025

⛹️ Pytorch ReID: A tiny, friendly, strong PyTorch implementation of a person re-id / vehicle re-id baseline. Tutorial 👉 https://github.com/layumi/Person_reID_baseline_pytorch/tree/master/tutorial

Python 4,293 1,021 Updated May 7, 2025

A fast multimodal LLM for real-time voice

Python 4,056 318 Updated Feb 14, 2025

Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS media server and media proxy that lets you read, publish, proxy, record, and play back video and audio streams.

Go 14,885 1,819 Updated Jun 25, 2025

A lightweight web application for remotely viewing images from a remote computer through a web browser. 🖼️

Python 7 Updated Apr 8, 2025

Implementation of the paper "Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces"

Jupyter Notebook 4 Updated Jan 17, 2025

The Amazon Kinesis Video Streams WebRTC SDK lets developers install and customize real-time communication between devices and enables secure streaming of video and audio to Kinesis Video Streams.

C 1,120 346 Updated Jun 19, 2025

Python 179 12 Updated Oct 14, 2024

superglue automates workflows from natural language. Agents use it to build deterministic workflows across apps, APIs and databases. Humans use it to automate complex workflows with just one prompt.

TypeScript 1,798 84 Updated Jun 25, 2025

[NeurIPS 2023] Global Structure-Aware Diffusion Process for Low-Light Image Enhancement

Python 143 10 Updated Aug 6, 2024

DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding

Python 1,096 44 Updated Jun 20, 2025

🤖 Autonomous agent framework for Elixir. Built for distributed, autonomous behavior and dynamic workflows.

Elixir 505 24 Updated Jun 22, 2025

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

Python 10,259 702 Updated Jun 25, 2025

Implementation of the paper "Follow the Rules: Reasoning for Video Anomaly Detection with Large Language Model"

Python 79 10 Updated Dec 16, 2024

Python 8 Updated Nov 15, 2024

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.

Python 18,345 1,661 Updated Apr 10, 2025

Ultimate camera streaming application with support for RTSP, RTMP, HTTP-FLV, WebRTC, MSE, HLS, MP4, MJPEG, HomeKit, FFmpeg, etc.

Go 9,191 678 Updated Jun 12, 2025

1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.

Python 221 9 Updated Aug 23, 2024