GitHub - facebookresearch/locate-3d: Open source repo for Locate 3D Model, 3D-JEPA and Locate 3D Dataset

Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D

Official codebase for the Locate 3D models, the 3D-JEPA encoders, and the Locate 3D Dataset.

Meta AI Research, FAIR

[Paper] [Demo]

Locate 3D

Locate 3D is a model for localizing objects in 3D scenes from referring expressions like “the small coffee table between the sofa and the lamp.” Locate 3D sets a new state-of-the-art on standard referential grounding benchmarks and showcases robust generalization capabilities. Notably, Locate 3D operates directly on sensor observation streams (posed RGB-D frames), enabling real-world deployment on robots and AR devices.
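To make "posed RGB-D frames" concrete, here is a minimal sketch of lifting a single depth pixel into a world-space 3D point using a pinhole camera model and a camera-to-world pose. It is illustrative only and is not taken from this repository: the function name `unproject_pixel`, the intrinsics parameters, and the row-major 4x4 pose convention are all assumptions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def unproject_pixel(
    u: float,
    v: float,
    depth: float,
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    cam_to_world: Sequence[Sequence[float]],
) -> Point:
    """Lift one depth pixel to a world-space point (hypothetical helper).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), depth measured along the camera z-axis, and a
    row-major 4x4 camera-to-world transform.
    """
    # Back-project the pixel into the camera frame.
    x_cam = (u - cx) * depth / fx
    y_cam = (v - cy) * depth / fy
    p_cam = (x_cam, y_cam, depth, 1.0)

    # Apply the first three rows of the pose to get world coordinates.
    x, y, z = (
        sum(cam_to_world[row][col] * p_cam[col] for col in range(4))
        for row in range(3)
    )
    return (x, y, z)


# Example pose: identity rotation, camera shifted 1 unit along world x.
POSE = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
point = unproject_pixel(420, 240, 2.0, 500.0, 500.0, 320.0, 240.0, POSE)
```

Repeating this for every valid depth pixel in every frame yields a point cloud in a shared world frame, which is the kind of input the following section refers to.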

3D-JEPA

3D-JEPA, a novel self-supervised learning (SSL) algorithm applicable to sensor point clouds, is key to Locate 3D. It takes as input a 3D point cloud featurized using 2D foundation models (CLIP, DINO). It then uses masked prediction in latent space as a pretext task to learn contextualized point cloud features without labels. Once trained, the 3D-JEPA encoder is finetuned alongside a language-conditioned decoder to jointly predict 3D masks and bounding boxes.
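The idea of masked prediction in latent space can be sketched with a toy example: hide some points, encode the visible ones, predict the latents of the hidden ones, and measure the error between latents rather than between raw inputs. This is not the 3D-JEPA architecture. The linear `encode` function, the mean-pooling predictor, and the reuse of a single encoder for both context and targets are simplifying assumptions made purely for illustration.

```python
import random
from typing import List, Sequence

Vector = List[float]


def encode(feature: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Toy linear encoder (assumption): latent = weights @ feature."""
    return [sum(w * x for w, x in zip(row, feature)) for row in weights]


def masked_latent_loss(
    features: Sequence[Sequence[float]],
    weights: Sequence[Sequence[float]],
    mask_ratio: float = 0.5,
    seed: int = 0,
) -> float:
    """Mean squared error between predicted and target latents of masked points."""
    rng = random.Random(seed)
    n = len(features)
    # Keep at least one masked point and at least one visible point.
    n_masked = max(1, min(n - 1, round(n * mask_ratio)))
    masked = set(rng.sample(range(n), n_masked))

    # Target latents come from the masked points themselves.
    targets = [encode(features[i], weights) for i in sorted(masked)]
    # Context latents come only from the visible points.
    context = [encode(features[i], weights) for i in range(n) if i not in masked]

    # Toy predictor (assumption): the mean of the context latents.
    dim = len(context[0])
    prediction = [sum(z[d] for z in context) / len(context) for d in range(dim)]

    # The loss is computed in latent space, not on the raw input features.
    errors = [(p - t) ** 2 for target in targets for p, t in zip(prediction, target)]
    return sum(errors) / len(errors)


IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
uniform_scene = [[1.0, 2.0]] * 4
varied_scene = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
loss_uniform = masked_latent_loss(uniform_scene, IDENTITY)
loss_varied = masked_latent_loss(varied_scene, IDENTITY)
```

When every point carries the same feature, the context fully determines the masked latents and the loss is zero. When the features differ, the loss is positive, and that gap is the signal a trainable encoder and predictor would minimize.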

Locate 3D Dataset

Additionally, we introduce the Locate 3D Dataset, a new dataset for 3D referential grounding that spans multiple capture setups and contains over 130K annotations. It enables a systematic study of generalization capabilities as well as the training of a stronger model.

Model Zoo

Model         Parameters   Link
Locate 3D     600M         Link
Locate 3D+    600M         Link
3D-JEPA       300M         Link

Code Structure

.
├── examples                  # example notebooks for running the different models
├── models                    # model classes for creating Locate 3D and 3D-JEPA
│   ├── encoder               # model for creating the 3D-JEPA encoder
│   └── locate-3d             # model for creating the Locate 3D class
├── locate3d_data             # folder containing the Locate 3D data
│   ├── datasets              # datasets, data loaders, ...

License

Data

The data is licensed under CC-BY-NC 4.0; however, a portion of the data is an output from Llama 3.2 and is subject to the Llama 3.2 license (link). Use of the data to train, fine tune, or otherwise improve an AI model, which is distributed or made available, shall also include "Llama" at the beginning of any such AI model name. Third party content pulled from other locations is subject to its own licenses, and you may have other legal obligations or restrictions that govern your use of that content.

Code

The majority of locate-3d is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: Pointcept is licensed under the MIT license.

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and citing it:

@article{arnaudmcvay2025locate3d,
  title={Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D},
  author={Sergio Arnaud* and Paul McVay* and Ada Martin* and Arjun Majumdar and Krishna Murthy Jatavallabhula and
Phillip Thomas and Ruslan Partsey and Daniel Dugas and Abha Gejji and Alexander Sax and Vincent-Pierre Berges and
Mikael Henaff and Ayush Jain and Ang Cao and Ishita Prasad and Mrinal Kalakrishnan and Michael Rabbat and Nicolas
Ballas and Mido Assran and Oleksandr Maksymets and Aravind Rajeswaran and Franziska Meier},
  journal={arXiv},
  year={2025},
  url={https://ai.meta.com/research/publications/locate-3d-real-world-object-localization-via-self-supervised-learning-in-3d}
}
