University of Science and Technology of China
https://www.ustc.edu.cn/
Agent
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
🦜🔗 Build context-aware reasoning applications
Towards Large Multimodal Models as Visual Foundation Agents
[NeurIPS 2023] Reflexion: Language Agents with Verbal Reinforcement Learning
Agent framework and applications built upon Qwen>=3.0, featuring Function Calling, MCP, Code Interpreter, RAG, Chrome extension, etc.
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
AUITestAgent is the first automatic, natural language-driven GUI testing tool for mobile apps, capable of fully automating the entire process of GUI interaction and function verification.
LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning
A simple screen parsing tool towards pure vision based GUI agent
Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Code and implementations for the paper "AgentGym: Evolving Large Language Model-based Agents across Diverse Environments" by Zhiheng Xi et al.
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
pix2tex: Using a ViT to convert images of equations into LaTeX code.
Eko (Eko Keeps Operating) - Build Production-ready Agentic Workflow with Natural Language - eko.fellou.ai
[ACL 2025] Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
[CVPR 2025] Magma: A Foundation Model for Multimodal AI Agents
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
A lightweight, powerful framework for multi-agent workflows
All-in-one Web Agent framework for post-training. Start building with a few clicks!
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)