This is the official repository for LegoGPT, the first approach for generating physically stable LEGO brick models from text prompts.
Generating Physically Stable and Buildable LEGO® Designs from Text
Ava Pun*,
Kangle Deng*,
Ruixuan Liu*,
Deva Ramanan,
Changliu Liu,
Jun-Yan Zhu
Carnegie Mellon University
- Llama-3.2-1B-Instruct: LegoGPT is fine-tuned from meta-llama/Llama-3.2-1B-Instruct, a gated model. Request access to the model here, then generate a Hugging Face user access token and set it as an environment variable:
  export HF_TOKEN=<your_token>
  The model will be automatically downloaded upon running the code.
- Gurobi: Stability analysis requires a Gurobi licence. Academics may request a free licence from the Gurobi website here. After obtaining the licence, place it in your home directory or another recommended location.
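Before running anything, it can help to confirm that both prerequisites are in place. The following is a minimal sketch, not part of this repo: the function name check_prerequisites is hypothetical, and it assumes the Gurobi licence is either stored as gurobi.lic in your home directory or pointed to by the GRB_LICENSE_FILE environment variable. Gurobi also searches other locations that this sketch does not cover.

```python
import os
from pathlib import Path


def check_prerequisites(env=None, home=None):
    """Return a list of problems found; an empty list means both checks passed."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []

    # The access token is expected in the HF_TOKEN environment variable.
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set")

    # Assumption: the licence is pointed to by GRB_LICENSE_FILE or is
    # stored as gurobi.lic directly in the home directory.
    candidates = [home / "gurobi.lic"]
    if env.get("GRB_LICENSE_FILE"):
        candidates.insert(0, Path(env["GRB_LICENSE_FILE"]))
    if not any(path.is_file() for path in candidates):
        problems.append("no Gurobi licence file found")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

The env and home parameters exist only so the check can be exercised without touching the real environment.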
This repo uses the Python project manager uv. To install this repo as a standalone project, first install all prerequisites. Then,
- Clone the repo:
  git clone "https://github.com/AvaLovelace1/LegoGPT.git" && cd LegoGPT
- (Optional, required for running the infer script and texturing) Follow these instructions to install ImportLDraw, required for rendering LEGO visualizations:
  - Download Git LFS, then run
    git lfs install
  - Install the ImportLDraw submodule with
    git submodule update --init
  - Download the LDraw parts library and extract it in your home directory:
    (cd ~ && wget https://library.ldraw.org/library/updates/complete.zip && unzip complete.zip)
    - If you wish to put the LDraw parts library in a different directory, set the environment variable LDRAW_LIBRARY_PATH to the path of the ldraw directory:
      export LDRAW_LIBRARY_PATH=path/to/ldraw
- Finally, install uv, and run
  uv sync
  to create a Python virtual environment with all dependencies installed. Python dependencies are defined in pyproject.toml.
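The LDraw library lookup described above can be sketched as follows. This is an illustration only, not the repo's actual code: the function names are hypothetical, and it assumes the default location is an ldraw directory directly under your home directory, as produced by the unzip command above.

```python
import os
from pathlib import Path


def resolve_ldraw_dir(env=None, home=None):
    """Return LDRAW_LIBRARY_PATH if it is set, else <home>/ldraw (assumed default)."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    override = env.get("LDRAW_LIBRARY_PATH")
    return Path(override).expanduser() if override else home / "ldraw"


def looks_like_ldraw_library(path):
    """Rough sanity check: the LDraw parts library contains a 'parts' subdirectory."""
    return (Path(path) / "parts").is_dir()
```

Calling looks_like_ldraw_library(resolve_ldraw_dir()) before rendering gives an early hint if the library was extracted somewhere unexpected.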
To install this repo as a package in your own Python project, first install all prerequisites. Then, run
uv add "https://github.com/AvaLovelace1/LegoGPT.git"
if using uv, or
pip install "https://github.com/AvaLovelace1/LegoGPT.git"
if using pip.
You can run inference with the fine-tuned LegoGPT model using:
uv run infer
This script starts an interactive session where you can input a prompt and get a response from the model. The model weights will automatically be downloaded from Hugging Face; they can be found here.
If you wish to run inference with a different set of model weights, specify them using the --model_name_or_path option. See uv run infer -h for a full list of options.
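If you launch the infer script from your own tooling, the command line can be assembled programmatically. The sketch below is a convenience wrapper under stated assumptions: build_infer_command and run_infer are hypothetical helper names, and only the --model_name_or_path option mentioned above is used.

```python
import subprocess


def build_infer_command(model_name_or_path=None):
    """Build the argument list for 'uv run infer', optionally overriding the weights."""
    command = ["uv", "run", "infer"]
    if model_name_or_path is not None:
        command += ["--model_name_or_path", str(model_name_or_path)]
    return command


def run_infer(model_name_or_path=None):
    """Start the interactive session; stdin and stdout are inherited from the caller."""
    return subprocess.run(build_infer_command(model_name_or_path), check=True)
```

For example, run_infer("path/to/finetuned_model") would start the session with weights from that placeholder path, while run_infer() uses the default weights.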
Here is an example interaction using the infer script:
> uv run infer
Enter a prompt, or <Return> to exit: Table featuring a flat rectangular surface over four evenly spaced legs.
Enter a filename to save the output image (default=output.png): output.png
Enter a generation seed (default=42): 42
Generating...
Set parameter Username
Academic license - for non-commercial use only - expires 2026-02-19
--------------------
Finished generating in 63.53s.
Total # bricks: 59
Total # brick rejections: 98
Brick rejection reasons: {'collision': 5, 'already_rejected': 93}
Total # regenerations: 4
Saved results to /home/apun/LegoGPT/output.txt, /home/apun/LegoGPT/output.ldr, and /home/apun/LegoGPT/output.png
--------------------
Enter another prompt, or <Return> to exit:
Three output files are created: output.png, output.txt, and output.ldr.
output.png contains a rendered image of the generated LEGO structure.
output.txt contains the LEGO structure in brick-by-brick text format, where each line of the form hxw (x,y,z) represents a LEGO brick of height h and width w at position (x,y,z):
1x2 (16,18,0)
1x2 (16,13,0)
2x2 (0,18,0)
2x2 (0,13,0)
1x2 (16,18,1)
[...]
And finally, output.ldr contains the LEGO structure in LDraw format, which can be opened with any LDraw-compatible software.
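The brick-by-brick text format in output.txt is simple enough to read back into Python for your own analysis. The following is a minimal sketch, not the repo's own parser: Brick, parse_bricks, and bricks_per_layer are hypothetical names, and treating z as the vertical layer index is an assumption based on the sample lines above.

```python
import re
from typing import NamedTuple


class Brick(NamedTuple):
    h: int
    w: int
    x: int
    y: int
    z: int


# One brick per line, e.g. "2x2 (0,18,0)".
_BRICK_LINE = re.compile(r"^(\d+)x(\d+) \((\d+),(\d+),(\d+)\)$")


def parse_bricks(text):
    """Parse 'hxw (x,y,z)' lines into Brick tuples, skipping blank lines."""
    bricks = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        match = _BRICK_LINE.match(line)
        if match is None:
            raise ValueError(f"line {line_no}: not in 'hxw (x,y,z)' format: {line!r}")
        bricks.append(Brick(*map(int, match.groups())))
    return bricks


def bricks_per_layer(bricks):
    """Count bricks per z value (assumed here to be the vertical layer index)."""
    counts = {}
    for brick in bricks:
        counts[brick.z] = counts.get(brick.z, 0) + 1
    return counts
```

Applied to the five sample lines shown above, parse_bricks returns five bricks, four with z equal to 0 and one with z equal to 1.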
The subdirectory src/texture contains the code for generating the UV texture or per-brick color given a LEGO design.
To run texturing, cd into src/texture and follow the instructions in the README.md file there.
LegoGPT was created by fine-tuning Llama-3.2-1B-Instruct on the custom LEGO dataset StableText2Lego, converted into instructional format. We used Hugging Face TRL with Accelerate for fine-tuning.
To replicate the fine-tuning process, first install additional Python dependencies with
uv sync --extra finetuning
Then, follow these instructions:
- Prepare the LEGO dataset for fine-tuning with
  uv run prepare_finetuning_dataset --input_path AvaLovelace/StableText2Lego --output_path [FINETUNING_DATASET_PATH]
  This converts the dataset into the instructional format required for fine-tuning Llama.
  - If you wish to run fine-tuning with your own LEGO dataset, replace AvaLovelace/StableText2Lego with the path to your dataset. This dataset should have the fields "captions" and "lego". The "lego" field should contain a LEGO structure in the text format described in the paper, and the "captions" field should contain a list of one or more descriptions of the LEGO structure.
- Download the pretrained Llama-3.2-1B-Instruct model to some directory [PRETRAINED_DIR]. IMPORTANT: Replace the config.json, special_tokens_map.json, and tokenizer_config.json files with the ones in the finetuning_config_files directory. This specifies the pad_token to be different from the eos_token, fixing a fine-tuning issue where the model will not learn to output EOS tokens properly.
- Initialize the Accelerate config file with
  uv run accelerate config
- Run fine-tuning with
  uv run ./scripts/finetune.zsh [PRETRAINED_DIR] [OUTPUT_DIR] [RUN_NAME] [FINETUNING_DATASET_PATH]
  The fine-tuned model will be saved to [OUTPUT_DIR]/[RUN_NAME].
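The README does not spell out why a shared pad_token and eos_token prevents the model from learning EOS. One common mechanism, assumed here for illustration, is that training pipelines exclude pad positions from the loss by setting their labels to -100; if padding reuses the EOS id, the genuine EOS is excluded too. The sketch below shows this in plain Python with hypothetical token ids; mask_padding is an illustrative helper, not a function from this repo or from any library.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_padding(input_ids, pad_token_id):
    """Copy input_ids into labels, hiding every pad position from the loss."""
    return [IGNORE_INDEX if token == pad_token_id else token for token in input_ids]


EOS = 2           # hypothetical token ids, for illustration only
SEPARATE_PAD = 0
sequence = [11, 12, 13, EOS]

# pad_token == eos_token: padding reuses the EOS id, so the real EOS is masked too.
shared = mask_padding(sequence + [EOS, EOS], pad_token_id=EOS)

# Distinct pad_token: only the padding is masked and EOS stays a training target.
distinct = mask_padding(sequence + [SEPARATE_PAD, SEPARATE_PAD], pad_token_id=SEPARATE_PAD)

print(shared)    # [11, 12, 13, -100, -100, -100]
print(distinct)  # [11, 12, 13, 2, -100, -100]
```

In the first case the model never receives a training signal for emitting EOS, which is consistent with the issue that the replaced config files are described as fixing.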
The LegoGPT model, StableText2Lego dataset, and majority of the LegoGPT code are licensed under the MIT License. The following submodules may have different licenses:
- ImportLDraw: For visualizing LEGO structures, we used ImportLDraw, available under the LICENSE.
- FlashTex: For LEGO texturing and coloring, we used FlashTex, available under the LICENCE.
If you find this repository useful for your research, please cite the following work.
@article{pun2025legogpt,
title = {Generating Physically Stable and Buildable LEGO Designs from Text},
author = {Pun, Ava and Deng, Kangle and Liu, Ruixuan and Ramanan, Deva and Liu, Changliu and Zhu, Jun-Yan},
journal = {arXiv preprint arXiv:2505.05469},
year = {2025}
}
We thank Minchen Li, Ken Goldberg, Nupur Kumari, Ruihan Gao, and Yihao Shi for their discussions and help.
We also thank Jiaoyang Li, Philip Huang, and Shobhit Aggarwal for developing the bimanual robotic system.
This work is partly supported by the Packard Foundation, Cisco Research Grant, and Amazon Faculty Award. This work is
also in part supported by the Manufacturing Futures Institute, Carnegie Mellon University, through a grant from the
Richard King Mellon Foundation. KD is supported by the Microsoft Research PhD Fellowship.
Our codebase is built upon several amazing repos: Hugging Face TRL, Accelerate, ImportLDraw.