Lu Chen, Zongtao He, Liuyi Wang, Chengju Liu, Qijun Chen (Accepted by IEEE Robotics and Automation Letters)
We introduce a temporal scene-object graph (TSOG) to construct an informative and efficient ego-centric visual representation. First, we develop a holistic object feature descriptor (HOFD) to fully describe object features from different aspects, facilitating the learning of relationships between observed and unseen objects. Next, we propose a scene-object graph (SOG) to simultaneously learn local and global correlations between objects and agent observations, granting the agent a more comprehensive and flexible scene-understanding ability and enabling more efficient target association and search. Finally, we introduce a temporal graph aggregation (TGA) module to dynamically aggregate memory information across consecutive time steps. TGA offers the agent a dynamic perspective on historical steps, aiding navigation towards the target over longer trajectories. Extensive experiments in AI2-THOR demonstrate our method's effectiveness and efficiency for ObjectNav tasks in unseen environments.
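To give a rough intuition for the temporal aggregation idea, here is a minimal NumPy sketch of attention-weighted pooling over a fixed-length memory of past step embeddings (matching the `--gat-memory-len 25` flag used below). This is our own simplified illustration, not the TGA module from the paper, which operates on graph-structured features:

```python
import numpy as np

def temporal_aggregate(memory, query, memory_len=25):
    """Attention-weighted pooling over the most recent `memory_len` step embeddings.

    A simplified stand-in for temporal memory aggregation: the real TGA module
    in this repo works on graph features, not plain vectors.
    """
    mem = np.asarray(memory)[-memory_len:]   # (T, d): truncated memory window
    scores = mem @ query                     # (T,): dot-product relevance to the current step
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                 # softmax over time steps
    return weights @ mem                     # (d,): single aggregated memory vector

rng = np.random.default_rng(0)
memory = [rng.standard_normal(8) for _ in range(40)]  # 40 past steps, d = 8
query = rng.standard_normal(8)                        # current observation embedding
agg = temporal_aggregate(memory, query)
```

Only the last 25 steps contribute, so the memory cost stays constant regardless of trajectory length.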
- Clone the repository and move into the top-level directory:

```shell
git clone https://github.com/izilu/RAL-TSOG
cd RAL-TSOG
```
- Create a conda environment and install the dependencies:

```shell
pip install -r requirements.txt
```
- Download the dataset (the same offline data used by ECCV-VN), which is discretized from the AI2-THOR simulator. The expected layout is:
```
data/
└── Scene_Data/
    ├── FloorPlan1/
    │   ├── resnet18_featuremap.hdf5
    │   ├── graph.json
    │   ├── visible_object_map_1.5.json
    │   ├── det_feature_categories.hdf5
    │   ├── grid.json
    │   └── optimal_action.json
    ├── FloorPlan2/
    └── ...
```
Train TSOG:

```shell
python main.py --title TSOG --model TSOG --workers 36 --gpu-ids 0 1 2 --max-ep 3000000 --gat-memory-len 25 --save-model-dir trained_models/TSOG --log-dir runs/TSOG --results-json eval_best_results/TSOG/TSOG.json --test-after-train
```
The above command launches a full evaluation at the end of training, testing all checkpoints and reporting the best results.
Continue training TSOG:

```shell
python main.py --title TSOG --model TSOG --workers 36 --gpu-ids 0 1 2 --max-ep 3000000 --gat-memory-len 25 --save-model-dir trained_models/TSOG --log-dir runs/TSOG --continue-training trained_models/TSOG/<latest_model.dat> --results-json eval_best_results/TSOG/TSOG.json --test-after-train
```
Evaluate TSOG:

```shell
python full_eval.py --title TSOG --model TSOG --results-json eval_best_results/TSOG/TSOG.json --gpu-ids 0 --workers 6 --gat-memory-len 25 --save-model-dir trained_models/TSOG --log-dir runs/TSOG --visualize-file-name analysis-TSOG.json
```
Evaluate the pretrained model:

```shell
python full_eval.py --title TSOG --model TSOG --results-json eval_best_results/TSOG/TSOG.json --gpu-ids 0 --workers 6 --gat-memory-len 25 --save-model-dir trained_models/pretrained_model --log-dir runs/TSOG --visualize-file-name analysis-TSOG.json
```
If you find this project useful in your research, please consider citing:
```bibtex
@article{10933547,
  author   = {Chen, Lu and He, Zongtao and Wang, Liuyi and Liu, Chengju and Chen, Qijun},
  journal  = {IEEE Robotics and Automation Letters},
  title    = {Temporal Scene-Object Graph Learning for Object Navigation},
  year     = {2025},
  month    = may,
  volume   = {10},
  number   = {5},
  pages    = {4914-4921},
  keywords = {Navigation;Correlation;Visualization;Semantics;Feature extraction;Training;Artificial intelligence;Aggregates;Trajectory;Reinforcement learning;Vision-based navigation;reinforcement learning;representation learning;autonomous agents},
  doi      = {10.1109/LRA.2025.3553055},
}
```