Welcome to myLLM! This project is an educational initiative to build a large language model (LLM) from scratch, exploring various transformer techniques along the way. The goal is to gain a deep understanding of how modern language models work and to experiment with different approaches within the transformer architecture.
- Introduction
- Features
- Installation
- Usage
- Project Structure
- Transformer Techniques
- Contributing
- License
myLLM is designed for educational purposes to understand and implement the core components of language models based on transformer architectures. The project covers:
- Tokenization
- Embeddings
- Encoder-Decoder models
- Attention mechanisms
- Training and fine-tuning techniques
- Custom Tokenizer: Build and train your own tokenizer (a minimal sketch follows this list).
- Embedding Layer: Learn how to create and use embedding layers.
- Transformer Architecture: Implement encoder and decoder components.
- Attention Mechanisms: Explore different attention mechanisms, including self-attention and cross-attention.
- Training Pipeline: Set up a training pipeline for your model.
- Fine-Tuning: Techniques to fine-tune the model on specific tasks.
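The Custom Tokenizer feature is easiest to see with a toy example. The sketch below is not the project's `CustomTokenizer` (which trains from a dataset path and may use subword units); it is a minimal character-level tokenizer that only illustrates the train/encode/decode cycle every tokenizer implements.

```python
# Illustrative sketch only: a character-level tokenizer, not the repository's CustomTokenizer.
class CharTokenizer:
    def train(self, text):
        # Assign one integer id per distinct character in the training text.
        self.stoi = {ch: i for i, ch in enumerate(sorted(set(text)))}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    def encode(self, text):
        return [self.stoi[ch] for ch in text]

    def decode(self, ids):
        return ''.join(self.itos[i] for i in ids)

tokenizer = CharTokenizer()
tokenizer.train("hello world")
print(tokenizer.encode("hello"))                     # [3, 2, 4, 4, 5]
print(tokenizer.decode(tokenizer.encode("hello")))   # hello
```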
To get started with myLLM, you need to have Python 3.8 or higher installed. Follow the steps below to set up the project:
- Clone the repository:

  ```bash
  git clone https://github.com/ArshiaZr/myLLM.git
  cd myLLM
  ```
- Create a virtual environment and activate it:

  ```bash
  python -m venv cuda
  source cuda/bin/activate  # On Windows, use `cuda\Scripts\activate`
  ```
- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```
To train and test your LLM, follow these steps (a conceptual sketch of the underlying training and generation loops follows the list):
- Tokenization: Tokenize your dataset using the custom tokenizer.

  ```python
  from tokenizer import CustomTokenizer

  tokenizer = CustomTokenizer()
  tokenizer.train('path_to_dataset')
  ```
- Model Training: Train the transformer model.

  ```python
  from models.GPT import GPTLanguageModel

  model = GPTLanguageModel(vocab_size=vocab_size, device=device)
  model._train(epochs=200, learning_rate=3e-4, eval_iters=100)
  ```
- Inference: Use the trained model for inference.

  ```python
  from utils.helpers import load_model

  model = load_model('path_to_trained_model')
  result = model.generate('your_encoded_input_text')
  print(result)
  ```
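For intuition about what the `_train` and `generate` calls above do internally, the sketch below shows a generic GPT-style training step (next-token prediction with cross-entropy and AdamW) and a greedy autoregressive generation loop in PyTorch. The toy model, dummy data, and names such as `block_size` are assumptions for illustration; the actual `GPTLanguageModel` implementation may handle batching, evaluation (`eval_iters`), and sampling differently.

```python
# Illustrative sketch only: a generic training step and greedy generation loop.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyLM(nn.Module):
    """Toy stand-in for a language model: embedding -> linear head over the vocabulary."""
    def __init__(self, vocab_size, n_embd=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx):                       # idx: (batch, time)
        return self.head(self.embed(idx))         # logits: (batch, time, vocab)

vocab_size, block_size = 100, 16
model = ToyLM(vocab_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# One training step: predict the next token with cross-entropy loss.
x = torch.randint(vocab_size, (8, block_size))    # dummy batch of token ids
y = torch.roll(x, shifts=-1, dims=1)              # targets: inputs shifted left by one
loss = F.cross_entropy(model(x).view(-1, vocab_size), y.view(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Greedy autoregressive generation: repeatedly append the most likely next token.
@torch.no_grad()
def generate(model, idx, max_new_tokens=20):
    for _ in range(max_new_tokens):
        logits = model(idx[:, -block_size:])      # crop context to the block size
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
        idx = torch.cat([idx, next_id], dim=1)
    return idx

print(generate(model, torch.zeros((1, 1), dtype=torch.long)))
```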
The project is organized as follows:
```text
myLLM/
├── data/
├── models/
│   ├── transformer.py
│   └── ...
├── tokenization/
│   ├── tokenizer.py
│   └── ...
├── utils/
│   ├── helpers.py
│   └── ...
├── README.md
└── requirements.txt
```
This project explores various techniques in transformer models, including the following (a minimal sketch combining them follows the list):
- Positional Encoding: Adding positional information to the embeddings.
- Multi-Head Attention: Implementing and understanding multi-head attention mechanisms.
- Layer Normalization: Using layer normalization to stabilize training.
- Feed-Forward Networks: Incorporating feed-forward neural networks within the transformer blocks.
- Residual Connections: Implementing residual connections to improve gradient flow.
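The sketch below shows one common way these pieces fit together in a pre-norm, decoder-style block. It uses PyTorch's built-in `nn.MultiheadAttention` and fixed sinusoidal positional encodings for brevity; the blocks implemented in `models/` may differ in layout and hyperparameters.

```python
# Illustrative sketch only: positional encoding + a pre-norm transformer block.
import math
import torch
import torch.nn as nn

class SinusoidalPositionalEncoding(nn.Module):
    """Adds fixed sine/cosine positional information to token embeddings."""
    def __init__(self, n_embd, max_len=1024):
        super().__init__()
        pos = torch.arange(max_len).unsqueeze(1)
        div = torch.exp(torch.arange(0, n_embd, 2) * (-math.log(10000.0) / n_embd))
        pe = torch.zeros(max_len, n_embd)
        pe[:, 0::2] = torch.sin(pos * div)
        pe[:, 1::2] = torch.cos(pos * div)
        self.register_buffer('pe', pe)

    def forward(self, x):                          # x: (batch, time, n_embd)
        return x + self.pe[: x.size(1)]

class TransformerBlock(nn.Module):
    """Multi-head self-attention and a feed-forward network, each with layer norm and a residual."""
    def __init__(self, n_embd, n_head):
        super().__init__()
        self.ln1 = nn.LayerNorm(n_embd)
        self.attn = nn.MultiheadAttention(n_embd, n_head, batch_first=True)
        self.ln2 = nn.LayerNorm(n_embd)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.GELU(), nn.Linear(4 * n_embd, n_embd)
        )

    def forward(self, x):
        # Causal mask: position t may only attend to positions <= t.
        T = x.size(1)
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + attn_out                           # residual connection around attention
        x = x + self.ffwd(self.ln2(x))             # residual connection around the FFN
        return x

# Quick shape check on random embeddings.
x = SinusoidalPositionalEncoding(64)(torch.randn(2, 10, 64))
print(TransformerBlock(64, n_head=4)(x).shape)     # torch.Size([2, 10, 64])
```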
Contributions are welcome! If you have suggestions or improvements, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License. See the LICENSE file for details.