NVIDIA Cosmos is a developer-first world foundation model platform designed to help Physical AI developers build their systems better and faster. Cosmos contains:
- Pre-trained models (available via Hugging Face) under the NVIDIA Open Model License, which allows commercial use of the models for free.
- Training scripts under the Apache 2 License for post-training the models for various downstream Physical AI applications.
Cosmos-Reason1 is a suite of models, ontologies, and benchmarks developed with the goal of enabling multimodal LLMs to generate physically grounded responses. We release two multimodal LLMs, Cosmos-Reason1-8B and Cosmos-Reason1-56B, which are trained in four stages: vision pre-training, general SFT, Physical AI SFT, and Physical AI reinforcement learning. We also define ontologies for physical common sense and embodied reasoning, and build benchmarks to evaluate the Physical AI reasoning capabilities of multimodal LLMs.
- Coming Soon!
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
NVIDIA Cosmos source code is released under the Apache 2 License.
NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com.