Intent-based planner assisting the high-level planner
Environment: PyBullet
Use environment.yml to create the required environment
- Observation: RGB-Height map (224×224×4)
- Action space: Pixel-wise Q value (224×224)
- Sample 16 actions for each state
  - Of these 16 actions, 1 is the pixel with the highest Q value
  - The remaining 15 are sampled at random
  - Propagate the loss for these 16 pixels only, and assign 0 loss to all remaining pixels
- Sample only 1 action per state
  - With some probability, it is a random action
  - Otherwise, it is the action with the highest Q value
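
The two sampling schemes above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's training code: the squared-error loss, the choice to draw the 15 random pixels without repeats, the `epsilon` parameter, and all function names are assumptions introduced here, and the random `q_flat` list stands in for the network's flattened 224×224 output.

```python
import random

H, W = 224, 224   # pixel-wise Q map size, from the action space above
NUM_SAMPLES = 16  # actions per state in the first scheme


def argmax(values):
    """Index of the largest entry in a flat list."""
    return max(range(len(values)), key=values.__getitem__)


def sample_16_pixels(q_flat, rng):
    """Scheme 1: the greedy pixel plus 15 random pixels (distinct by assumption)."""
    greedy = argmax(q_flat)
    others = [i for i in range(len(q_flat)) if i != greedy]
    return [greedy] + rng.sample(others, NUM_SAMPLES - 1)


def masked_squared_loss(q_flat, target_flat, pixels):
    """Squared error summed over the sampled pixels; other pixels contribute 0."""
    return sum((q_flat[i] - target_flat[i]) ** 2 for i in pixels)


def sample_1_pixel(q_flat, epsilon, rng):
    """Scheme 2: a random pixel with probability epsilon, else the greedy pixel."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_flat))
    return argmax(q_flat)


rng = random.Random(0)
q_flat = [rng.random() for _ in range(H * W)]  # stand-in for the network output
target_flat = [0.0] * (H * W)                  # stand-in for the Q targets

pixels = sample_16_pixels(q_flat, rng)
loss = masked_squared_loss(q_flat, target_flat, pixels)
row, col = divmod(sample_1_pixel(q_flat, epsilon=0.0, rng=rng), W)
```

Summing the loss over the sampled indices only is equivalent to multiplying a full 224×224 loss map by a mask that is 1 at those 16 pixels and 0 elsewhere. `divmod` converts a flat index back to a (row, column) pixel.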
- Observation: RGB-Height map (224×224×4)
- Action space: 16 (Q values for pushing by a fixed distance in 16 possible fixed directions)
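
As a rough sketch of how a 16-way discrete action could map to a push, the snippet below converts an action index into a planar displacement. It assumes the 16 directions are evenly spaced over a full circle (22.5° apart) and uses a placeholder push length. Neither detail is stated above, and the names `PUSH_DISTANCE`, `push_displacement`, and `best_push` are hypothetical.

```python
import math

NUM_DIRECTIONS = 16
PUSH_DISTANCE = 0.1  # hypothetical fixed push length; the real value is not given


def push_displacement(action_index):
    """Planar (dx, dy) for one of 16 directions, assumed evenly spaced."""
    angle = 2.0 * math.pi * action_index / NUM_DIRECTIONS
    return (PUSH_DISTANCE * math.cos(angle), PUSH_DISTANCE * math.sin(angle))


def best_push(q_values):
    """Pick the direction with the highest Q value and return its displacement."""
    best = max(range(NUM_DIRECTIONS), key=q_values.__getitem__)
    return best, push_displacement(best)


q_values = [0.0] * NUM_DIRECTIONS
q_values[4] = 1.0  # make direction 4 (90 degrees) the greedy choice
index, (dx, dy) = best_push(q_values)
```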