Insights: Lightning-AI/litgpt
Overview
- 11 Merged pull requests
- 0 Open pull requests
- 2 Closed issues
- 0 New issues
11 Pull requests merged by 5 people
-
Transformers version bump
#2029 merged
May 23, 2025 -
Update spacing in README.md
#2058 merged
May 23, 2025 -
ci: use Thunder dev images for testing
#2054 merged
May 23, 2025 -
req: pin bitsandbytes>=0.45.2,<0.45.5
#2057 merged
May 22, 2025 -
Remove litserve version constraint
#2055 merged
May 22, 2025 -
ci: extend testing with ubuntu-24.04
#2056 merged
May 21, 2025 -
simplify the GPU testing flow
#2053 merged
May 21, 2025 -
Xfail Thunder integration test due to Dynamo bug
#2050 merged
May 20, 2025 -
tests: mark test_evaluate_script as flaky
#2049 merged
May 20, 2025 -
bump: update Torch to resolve failing test
#2052 merged
May 20, 2025 -
fix: Pretraining text files with recent litdata versions
#2048 merged
May 19, 2025
2 Issues closed by 2 people
-
litserve version constraint
#2045 closed
May 22, 2025 -
Mismatch in LLaMAMoE litgpt and hf implementation for Mixtral
#2013 closed
May 20, 2025
15 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[WIP] Simplified preparation of pretraining datasets
#1057 commented on
May 23, 2025 • 0 new comments -
Add LongLora for both full and lora fine-tuning
#1350 commented on
May 23, 2025 • 0 new comments -
WIP: TensorParallel with new strategy
#1421 commented on
May 23, 2025 • 0 new comments -
OpenCoder series
#1880 commented on
May 23, 2025 • 0 new comments -
OLMo 2
#1897 commented on
May 23, 2025 • 0 new comments -
Raise error if disk is full before downloading weights
#1903 commented on
May 23, 2025 • 0 new comments -
qwen2.5 long context
#1933 commented on
May 23, 2025 • 0 new comments -
Support for KV caching and batched inference
#1934 commented on
May 23, 2025 • 0 new comments -
Add Multi-head Latent Attention (DeepSeekv2)
#1945 commented on
May 23, 2025 • 0 new comments -
wandb logger args
#1973 commented on
May 24, 2025 • 0 new comments -
(WIP) DeepseekV3 (and Multi-Head Latent Attention)
#2012 commented on
May 23, 2025 • 0 new comments -
LLaMAMoE fixes
#2014 commented on
May 23, 2025 • 0 new comments -
Qwen3 Dense
#2044 commented on
May 24, 2025 • 0 new comments -
Qwen3 MoE Preliminary: add intermediate_size argument to MLP modules
#2046 commented on
May 23, 2025 • 0 new comments -
phi-4 reasoning models
#2047 commented on
May 23, 2025 • 0 new comments