- Shatin, NT, Hong Kong
- www.zhihan-jiang.com
- @zhjiang22
Starred repositories
[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?
[ACL2024] Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios
CAShift: Benchmarking Log-Based Cloud Attack Detection under Normality Shift (FSE 2025)
The official Python SDK for Model Context Protocol servers and clients (see the server sketch after this list)
Model Context Protocol Servers
A datacenter simulator covering power, cooling, servers, and other components
The official Python SDK for Codellm-Devkit
This repository contains the manifest set used to build a prototype system of TraceZip, assembled from four parts.
A lightweight, powerful framework for multi-agent workflows (see the agent sketch after this list)
📄 Paper reading notes (Distributed Systems, Virtualization, Machine Learning)
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Code for the embedding and reranker models, as well as for evaluation, from the paper "Stack Trace Deduplication: Faster, More Accurately, and in More Realistic Scenarios".
Your 24/7 On-Call AI Agent - Solve Alerts Faster with Automatic Correlations, Investigations, and More
An LLM-Based Diagnosis System (https://arxiv.org/pdf/2312.01454.pdf)
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
This repository contains all the code used for the experimental analysis of the paper "The Importance of Workload Choice in Evaluating LLM Inference Systems".
PyTorch implementation of paper "Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline".
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable
From Task-based to Instruction-based Automated Log Analysis
Source Code for ISSRE-24 paper "Demystifying and Extracting Fault-indicating Information from Logs for Failure Diagnosis".
SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 16+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface. (See the launch sketch after this list.)
FastAPI framework, high performance, easy to learn, fast to code, ready for production (see the app sketch after this list)
Disaggregated serving system for Large Language Models (LLMs).
Predict the performance of LLM inference services
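
A minimal sketch of a tool server built with the official MCP Python SDK, following the `FastMCP` helper from the SDK's README; the server name and the `add` tool are illustrative, and import paths may vary between SDK versions.

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper.
# The server name and the `add` tool are illustrative examples.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")  # hypothetical server name

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default so an MCP client can connect
```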
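For the multi-agent workflow framework, a minimal single-agent run following its README-style `Agent`/`Runner` API; the agent name and instructions here are illustrative, and the API surface may change between releases.

```python
# Sketch of a single-agent run; the name and instructions are placeholders.
from agents import Agent, Runner

agent = Agent(
    name="assistant",
    instructions="Answer concisely.",
)

# Run the agent synchronously on one input and print its final answer.
result = Runner.run_sync(agent, "What is observability in one sentence?")
print(result.final_output)
```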
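For SkyPilot, a sketch of launching a GPU job through its documented Python API (`sky.Task`, `sky.Resources`, `sky.launch`); the setup/run commands, accelerator type, and cluster name are placeholders.

```python
# Sketch of a SkyPilot launch; commands, accelerator, and cluster name
# are placeholders, and the launch call's return value varies by version.
import sky

task = sky.Task(
    setup="pip install -r requirements.txt",
    run="python train.py",
)
task.set_resources(sky.Resources(accelerators="A100:1"))

sky.launch(task, cluster_name="demo-cluster")
```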
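And for FastAPI, a minimal app runnable with `uvicorn main:app`; the route and parameters are illustrative.

```python
# Minimal FastAPI app; run with `uvicorn main:app --reload`.
from fastapi import FastAPI

app = FastAPI()

@app.get("/items/{item_id}")
def read_item(item_id: int, q: str | None = None):
    # Path and query parameters are parsed and validated from the type hints.
    return {"item_id": item_id, "q": q}
```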