Mid-Level-Planner

Intent-based planner that assists the high-level planner.

Environment: PyBullet

Use environment.yml to create the required environment

Version 1: Destination Prediction (1-step MDP/Bandit Problem) with DQN

Sample 16 actions for each state
1. Of these 16 actions, 1 is the pixel with the highest Q value
2. The remaining 15 are sampled randomly
3. Propagate the loss for these 16 pixels only, and assign 0 loss to all remaining pixels
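The three steps above can be sketched with NumPy as follows. This is only an illustration, not the repository's code: the function names, the single-channel 224 x 224 Q map, the squared-error loss, and sampling the 15 random pixels without replacement are all assumptions.

```python
import numpy as np

def sample_pixel_mask(q_map, n_actions=16, rng=None):
    """Boolean mask selecting the argmax pixel plus (n_actions - 1) random other pixels."""
    rng = np.random.default_rng() if rng is None else rng
    flat_q = q_map.ravel()
    best = int(np.argmax(flat_q))
    # Candidates for the random picks: every pixel except the greedy one.
    others = np.delete(np.arange(flat_q.size), best)
    random_picks = rng.choice(others, size=n_actions - 1, replace=False)
    mask = np.zeros(flat_q.size, dtype=bool)
    mask[best] = True
    mask[random_picks] = True
    return mask.reshape(q_map.shape)

def masked_squared_error(q_map, target_map, mask):
    """Squared error averaged over the masked pixels; unmasked pixels contribute 0."""
    per_pixel = (q_map - target_map) ** 2
    return float((per_pixel * mask).sum() / mask.sum())

# Random stand-ins for a predicted Q map and its training target (assumed shapes).
rng = np.random.default_rng(0)
q_map = rng.random((224, 224))
target_map = rng.random((224, 224))
mask = sample_pixel_mask(q_map, rng=rng)
loss = masked_squared_error(q_map, target_map, mask)
```

Multiplying the per-pixel error by the mask is what assigns 0 loss to the pixels that were not sampled.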

Sample only 1 action per state
1. With some probability, it is a random action
2. Otherwise, it is the action with the highest Q value
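This single-action scheme matches what is usually called epsilon-greedy selection. A minimal sketch, assuming the Q values come as a 2D per-pixel map; the function name and the `epsilon` parameter are hypothetical and not taken from the repository:

```python
import numpy as np

def epsilon_greedy_pixel(q_map, epsilon, rng=None):
    """Pick a random pixel with probability epsilon, else the highest-Q pixel."""
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() < epsilon:
        # Explore: uniform random pixel index.
        flat_index = int(rng.integers(q_map.size))
    else:
        # Exploit: pixel with the highest Q value.
        flat_index = int(np.argmax(q_map))
    row, col = np.unravel_index(flat_index, q_map.shape)
    return int(row), int(col)
```

With `epsilon=0.0` the function always returns the greedy pixel, and with `epsilon=1.0` it always explores.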

Observation: RGB-Height map (224 x 224 x 4)
Action space: 16 discrete actions (one Q value per push of a fixed distance in each of 16 fixed directions)
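To make the 16-direction action space concrete, the sketch below maps an action index to a planar push offset. It assumes the directions are evenly spaced around the circle and uses a made-up push distance; neither detail is stated in this README, and `push_offset` is a hypothetical helper.

```python
import math

NUM_DIRECTIONS = 16
PUSH_DISTANCE = 0.05  # hypothetical push length in metres, not from the repository

def push_offset(action_index, distance=PUSH_DISTANCE):
    """Return the (dx, dy) offset for one of the 16 push directions."""
    if not 0 <= action_index < NUM_DIRECTIONS:
        raise ValueError("action_index must be in [0, 16)")
    # Assumption: directions are evenly spaced, 22.5 degrees apart.
    angle = 2.0 * math.pi * action_index / NUM_DIRECTIONS
    return distance * math.cos(angle), distance * math.sin(angle)
```

The action taken would then be the index of the largest of the 16 Q values, and `push_offset` turns that index into a displacement for the pusher.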
