minLLM is a minimal transformer-based language model implemented in PyTorch and in Keras (with a PyTorch backend). It features causal multi-head attention, feed-forward layers, and basic word-level tokenization, and it trains on the Tiny Shakespeare dataset.

To try it out, clone the repository (`git clone https://github.com/gustavz/minLLM.git`), install the dependencies (`pip install -r requirements.txt`), and run `python min_llm_keras.py`. The script downloads the data, trains the model with Weights & Biases integration and checkpointing, and generates sample text.
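As a copy-pasteable sequence (the `cd minLLM` step is assumed here; the other commands come straight from the steps above):

```bash
git clone https://github.com/gustavz/minLLM.git
cd minLLM
pip install -r requirements.txt
python min_llm_keras.py
```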
All model and training hyperparameters (e.g., `MAX_SEQ_LEN`, `EMBED_DIM`, `NUM_HEADS`, `NUM_LAYERS`, `BATCH_SIZE`, `EPOCHS`, `TEMPERATURE`) are configurable at the top of the script.
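For orientation, the configuration block at the top of `min_llm_keras.py` looks roughly like this; the constant names are the ones listed above, but the values shown are illustrative placeholders rather than the script's actual defaults:

```python
# Model and training hyperparameters (values are illustrative, not the repo's defaults).
MAX_SEQ_LEN = 128   # context window, in tokens
EMBED_DIM = 256     # width of token embeddings and transformer layers
NUM_HEADS = 8       # attention heads per layer (must divide EMBED_DIM)
NUM_LAYERS = 4      # number of stacked transformer blocks
BATCH_SIZE = 64     # sequences per training step
EPOCHS = 10         # passes over the Tiny Shakespeare corpus
TEMPERATURE = 0.8   # sampling temperature for text generation
```

Lower `TEMPERATURE` values make generation greedier and more repetitive; higher values make it more varied but less coherent.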
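Tokenization is deliberately basic: word-level, as noted in the description. A minimal sketch of that approach, assuming a locally downloaded copy of the corpus (the filename here is hypothetical, not the repo's actual code):

```python
# Word-level tokenization sketch (illustrative): split the corpus on
# whitespace and map each distinct word to an integer id.
text = open("tiny_shakespeare.txt").read()  # hypothetical filename
words = text.split()
vocab = sorted(set(words))
word_to_id = {word: i for i, word in enumerate(vocab)}
id_to_word = {i: word for word, i in word_to_id.items()}

def encode(s: str) -> list[int]:
    return [word_to_id[w] for w in s.split()]

def decode(ids: list[int]) -> str:
    return " ".join(id_to_word[i] for i in ids)
```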
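The model itself follows the standard decoder-only pattern named in the description: causal multi-head attention followed by a feed-forward layer, stacked `NUM_LAYERS` times. A minimal sketch of one such block in plain PyTorch (again illustrative, not the repository's actual classes):

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """One decoder block: causal multi-head self-attention + feed-forward.
    Illustrative sketch only, not minLLM's actual implementation."""

    def __init__(self, embed_dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)
        self.ff = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim),
            nn.ReLU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n = x.size(1)
        # Causal mask: True entries are blocked, so each position can only
        # attend to itself and earlier positions.
        mask = torch.triu(torch.ones(n, n, dtype=torch.bool, device=x.device), diagonal=1)
        attn_out, _ = self.attn(x, x, x, attn_mask=mask, need_weights=False)
        x = self.norm1(x + attn_out)
        x = self.norm2(x + self.ff(x))
        return x
```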
Licensed under the MIT License.