Bangkok School of AI - Reinforcement Learning Workshop

How to Use Notebooks

Each notebook contains the content and code-along for each session. We recommend running the notebooks from Google Colaboratory for minimal setup requirements. Edit the Fill in The Code sections for the coding assignments and compare your approach with ours in solutions.

Session 1 Escaping GridWorld with Simple RL Agents

Markov Decision Processes / Discrete States and Actions

  • What is Reinforcement Learning: Pavlov's kitties
  • How Useful is Reinforcement Learning: games, robotics, ad bidding, stock trading, etc.
  • Why is Reinforcement Learning Different: the level of workflow automation across classes of machine learning algorithms
    • Use cases for reinforcement learning
  • Reinforcement Learning Framework and Markov Decision Processes
  • GridWorld example to explain:
    • Problems: Markov decision processes, states, actions, and rewards
    • Solutions: policies, state values, (state-)action values, discount factor, optimality equations
  • Words of Caution: a few reasons Deep Reinforcement Learning Doesn't Work Yet
  • Challenges:
    • Read up on Bellman's equations and find out where they were hidden in our workshop today.
    • What are your ideas about how we can find the optimal policy?
    • Play around with Gridworld. Tweak these variables and see what happens to state and action values:
      • Expand the grid and/or add some more traps
      • Wind probability
      • Move rewards
      • Discount factor
      • Epsilon and how to decay it (or not)
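
The Bellman optimality backup at the heart of the GridWorld exercise can be demonstrated with a few lines of value iteration. A minimal sketch on a made-up 3x3 deterministic grid (our own toy with a goal, a trap, and no wind, not the workshop's GridWorld notebook):

```python
# Value iteration on a 3x3 deterministic GridWorld (illustrative toy, not
# the workshop's notebook code). Goal and trap are terminal states.
GAMMA = 0.9
ROWS, COLS = 3, 3
GOAL, TRAP = (2, 2), (1, 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Deterministic transition: move, staying put at walls."""
    r = min(max(state[0] + action[0], 0), ROWS - 1)
    c = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (r, c)
    reward = 1.0 if nxt == GOAL else (-1.0 if nxt == TRAP else 0.0)
    return nxt, reward

V = {(r, c): 0.0 for r in range(ROWS) for c in range(COLS)}
for _ in range(100):                      # sweep until values stop changing
    for s in V:
        if s in (GOAL, TRAP):
            continue                      # terminals keep value 0
        # Bellman optimality backup: V(s) = max_a [r + gamma * V(s')]
        V[s] = max(reward + GAMMA * V[nxt]
                   for nxt, reward in (step(s, a) for a in ACTIONS))
```

Because the grid is deterministic, the optimal value of a state converges to gamma raised to the number of remaining steps minus one, e.g. V at the far corner (0, 0) becomes 0.9³.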

Session 2 Monte Carlo Methods

Discrete States and Actions

  • Blackjack-v0 environment, human play and computer play
  • Optimal Strategy for Blackjack
  • What is Monte Carlo Method
  • Monte Carlo Prediction
  • Monte Carlo Control: All-visit, First-visit, and GLIE
  • Challenges:
    • What are some other ways of solving reinforcement learning problems? How are they better or worse than Monte Carlo methods e.g. performance, data requirements, etc.?
    • Solve at least one of the following OpenAI gym environments with discrete states and actions:
      • FrozenLake-v0
      • Taxi-v2
      • Blackjack-v0
      • Any other environments with discrete states and actions at OpenAI Gym
    • Check session2b.ipynb if you are interested in using the Monte Carlo method to solve GridWorld. This will give you more insight into the difference between all-visit and first-visit Monte Carlo.
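
First-visit Monte Carlo prediction can be shown on a toy even smaller than Blackjack. A sketch on a hypothetical 3-state corridor under a uniform random policy (our own example, not the session notebook); the averaged first-visit returns approach the true state values, here V(0) ≈ 0.65 and V(1) ≈ 0.79:

```python
import random

# First-visit Monte Carlo prediction on a 3-state corridor (illustrative
# toy). States 0 and 1 are non-terminal; reaching state 2 ends the episode
# with reward +1. The policy steps left or right uniformly at random.
GAMMA = 0.9

def run_episode(rng):
    """Return the list of (state, reward) pairs for one episode."""
    s, trajectory = 0, []
    while s != 2:
        a = rng.choice([-1, 1])
        nxt = max(0, s + a)              # wall on the left
        r = 1.0 if nxt == 2 else 0.0
        trajectory.append((s, r))
        s = nxt
    return trajectory

rng = random.Random(0)
returns = {0: [], 1: []}
for _ in range(5000):
    episode = run_episode(rng)
    # compute discounted returns backwards from the end of the episode
    Gs, G = [0.0] * len(episode), 0.0
    for t in range(len(episode) - 1, -1, -1):
        G = episode[t][1] + GAMMA * G
        Gs[t] = G
    # first-visit: record the return only at a state's first occurrence
    seen = set()
    for t, (s, _) in enumerate(episode):
        if s not in seen:
            seen.add(s)
            returns[s].append(Gs[t])

V = {s: sum(g) / len(g) for s, g in returns.items()}
```

Recording the return at every occurrence instead of only the first turns this into all-visit Monte Carlo; both estimators converge to the same values here.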

Session 3 Temporal Difference Learning

Discrete States and Actions

  • OpenAI Gym toy environment to explain temporal difference learning: SARSA, Q-learning, Expected SARSA
  • Homework: solve an environment with discrete states and actions such as:
    • FrozenLake-v0
    • Taxi-v2
    • Blackjack-v0
  • Take-home Challenges: Solve an environment with continuous states: discretization, tile codings, etc. such as
    • Acrobot-v1
    • MountainCar-v0
    • CartPole-v0
    • LunarLander-v2
  • Points to consider:
    • What are the state space, action space, and rewards of the environment?
    • What algorithms did you use to solve the environment and why?
    • How many episodes did you solve it in? Can you improve the performance? (Tweaking discount factor, learning rate, using Monte Carlo instead of TD)
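
As a taste of the homework, tabular Q-learning solves a corridor-shaped stand-in for FrozenLake in a few dozen lines. Everything below (environment, constants) is our own illustrative setup, not gym code:

```python
import random

# Tabular Q-learning on a 5-state corridor (illustrative sketch).
N, GOAL = 5, 4                  # states 0..4; reaching state 4 pays +1 and ends
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
rng = random.Random(1)
Q = {(s, a): 0.0 for s in range(N) for a in (-1, 1)}

def choose(s):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPS:
        return rng.choice([-1, 1])
    best = max(Q[(s, a)] for a in (-1, 1))
    return rng.choice([a for a in (-1, 1) if Q[(s, a)] == best])

for _ in range(2000):
    s = 0
    while s != GOAL:
        a = choose(s)
        nxt = min(max(s + a, 0), N - 1)
        r = 1.0 if nxt == GOAL else 0.0
        # Q-learning update: Q(s,a) += alpha * (r + gamma*max_a' Q(s',a') - Q(s,a))
        boot = 0.0 if nxt == GOAL else max(Q[(nxt, b)] for b in (-1, 1))
        Q[(s, a)] += ALPHA * (r + GAMMA * boot - Q[(s, a)])
        s = nxt
```

With a deterministic environment, the greedy policy converges to "always move right", and Q(s, +1) approaches gamma raised to the remaining distance minus one. Replacing the max in the target with the Q-value of the action actually chosen turns this into SARSA.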

Session 3.5 Neural Networks in Pytorch

  • Tensor operations
  • Feedforward
  • Activation functions
  • Losses
  • Backpropagation
  • Why is deeper usually better? Spiral example
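
What PyTorch's autograd automates can be written out by hand for the smallest possible case: one sigmoid neuron trained with binary cross-entropy on a made-up four-point dataset. The forward pass, loss gradient, and update below are exactly the steps the session covers:

```python
import math

# Feedforward, loss, and backprop by hand for a single sigmoid neuron
# (illustrative sketch of what torch.autograd does for you).
# Labels depend only on the first input, so the neuron can separate them.
data = [((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 1), ((1.0, 1.0), 1)]
w, b = [0.0, 0.0], 0.0
LR = 0.5

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for epoch in range(200):
    grad_w, grad_b = [0.0, 0.0], 0.0
    for (x1, x2), y in data:
        # feedforward: linear layer followed by sigmoid activation
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)
        # backprop: d(BCE)/dz simplifies to (p - y) for sigmoid + BCE
        dz = p - y
        grad_w[0] += dz * x1
        grad_w[1] += dz * x2
        grad_b += dz
    # gradient descent step on the averaged gradients
    n = len(data)
    w[0] -= LR * grad_w[0] / n
    w[1] -= LR * grad_w[1] / n
    b -= LR * grad_b / n

preds = [sigmoid(w[0] * x1 + w[1] * x2 + b) for (x1, x2), _ in data]
```

A deeper network repeats exactly this pattern layer by layer via the chain rule, which is what makes the spiral example tractable.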

Session 4 Deep Q-learning

Continuous States and Discrete Actions

  • Some approaches to continuous states: discretization, tile coding, other encoding, linear approximations
  • Vanilla DQN: experience replay and target functions
  • Take-home Challenges: Work on an Atari game and detail the process of hyperparameter tuning
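
The experience replay half of vanilla DQN is just a bounded buffer sampled uniformly. A minimal sketch (class and field names are our own, not from any library):

```python
import random
from collections import deque

# A minimal experience-replay buffer of the kind a vanilla DQN uses.
class ReplayBuffer:
    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)   # oldest transitions fall off
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Uniform random minibatch; sampling breaks the correlation
        between consecutive transitions, stabilizing Q-network training."""
        return self.rng.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100, seed=0)
for t in range(150):                           # overflow: only last 100 kept
    buf.push(t, 0, 0.0, t + 1, False)
batch = buf.sample(8)
```

The other half, the target network, is simply a periodically-updated copy of the Q-network used to compute the bootstrap targets.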

Session 4.5 Rainbow

Continuous States and Discrete Actions

  • Rainbow
    • Vanilla DQN (experience replay + target network)
    • Double DQN
    • Prioritized experience replay
    • Dueling networks
    • Multi-step learning
    • Distributional RL
    • Noisy networks
  • Take-home Challenges: Implement Rainbow and compare it to your last project
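
Of the Rainbow components, multi-step learning is the easiest to isolate: the TD target accumulates n discounted rewards before bootstrapping, instead of just one. A sketch with made-up numbers:

```python
# Multi-step (n-step) returns, one of the Rainbow components.
def n_step_return(rewards, bootstrap_value, gamma, n):
    """G = r_0 + gamma*r_1 + ... + gamma^(n-1)*r_{n-1} + gamma^n * V."""
    g = 0.0
    for k in range(n):
        g += (gamma ** k) * rewards[k]
    return g + (gamma ** n) * bootstrap_value

# 1-step target vs 3-step target for the same trajectory
rewards = [1.0, 0.0, 2.0, 0.0]
one_step = n_step_return(rewards, bootstrap_value=5.0, gamma=0.9, n=1)
three_step = n_step_return(rewards, bootstrap_value=5.0, gamma=0.9, n=3)
```

Larger n propagates reward information faster at the cost of higher variance; Rainbow typically combines this with the other components listed above.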

Session 5 Policy Gradients

Continuous States and Actions

  • Policy gradient methods: REINFORCE, A2C, A3C, DDPG
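
REINFORCE fits in a page when the policy is a two-arm softmax and the gradient of log π is taken by hand. This toy bandit is our own example (real implementations rely on autograd); arm 1 always pays +1, so the policy should learn to prefer it:

```python
import math
import random

# REINFORCE with a softmax policy over two bandit arms (illustrative toy).
rng = random.Random(0)
theta = [0.0, 0.0]            # one preference per arm
LR = 0.1

def softmax(prefs):
    m = max(prefs)            # subtract max for numerical stability
    e = [math.exp(p - m) for p in prefs]
    z = sum(e)
    return [x / z for x in e]

for _ in range(500):
    probs = softmax(theta)
    a = 0 if rng.random() < probs[0] else 1
    reward = 1.0 if a == 1 else 0.0
    # policy gradient: d(log pi(a)) / d(theta_i) = 1[i == a] - pi(i)
    for i in range(2):
        grad_log = (1.0 if i == a else 0.0) - probs[i]
        theta[i] += LR * reward * grad_log

probs = softmax(theta)
```

A2C/A3C reduce the variance of this update by subtracting a learned baseline (the critic's value estimate) from the reward.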

Session 6 Multi-agent Learning

  • Monte Carlo tree search
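
The selection step of Monte Carlo tree search is usually the UCT rule: descend to the child maximizing mean value plus an exploration bonus. A sketch (the statistics are made up to show the bonus changing the choice):

```python
import math

# UCT selection, the rule at the heart of Monte Carlo tree search.
def uct_select(children, parent_visits, c=1.4):
    """children: list of (total_value, visit_count); returns chosen index."""
    best, best_score = 0, float("-inf")
    for i, (value, visits) in enumerate(children):
        if visits == 0:
            return i                     # always try unvisited children first
        # mean value plus exploration bonus that shrinks with visits
        score = value / visits + c * math.sqrt(math.log(parent_visits) / visits)
        if score > best_score:
            best, best_score = i, score
    return best

# a well-explored strong child vs. a barely-tried weak one
children = [(9.0, 20), (0.5, 2)]
choice = uct_select(children, parent_visits=22)
```

With c = 0 the rule is purely greedy and picks the first child (mean 0.45 vs 0.25); with the default bonus it picks the under-explored second child instead.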

Other Topics

  • Explore vs exploit: epsilon-greedy, UCB, Thompson sampling
  • Reward function setting
  • Hackathon nights to play Blackjack, Poker, Pommerman, boardgames and self-driving cars
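
Of the exploration strategies listed, UCB can be sketched in a few lines on a two-armed Bernoulli bandit (the payout probabilities are made up): its exploration bonus shrinks for well-tried arms, so pulls concentrate on the better arm without a tuned epsilon.

```python
import math
import random

# UCB1 on a two-armed Bernoulli bandit (illustrative toy).
rng = random.Random(0)
P = [0.3, 0.7]                      # true payout probabilities; arm 1 is better

def pull(a):
    return 1.0 if rng.random() < P[a] else 0.0

def run_ucb(steps):
    counts, values, total = [0, 0], [0.0, 0.0], 0.0
    for t in range(1, steps + 1):
        # UCB1 score: empirical mean plus a bonus that shrinks with visits
        scores = [values[a] + math.sqrt(2 * math.log(t) / counts[a])
                  if counts[a] else float("inf") for a in range(2)]
        a = scores.index(max(scores))
        r = pull(a)
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]   # incremental mean
        total += r
    return counts, total

counts, total = run_ucb(2000)
```

Epsilon-greedy keeps exploring the bad arm at a fixed rate forever, whereas UCB's pulls of the worse arm grow only logarithmically with time.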

Readings

Environments

  • OpenAI Gym - a toolkit for developing and comparing reinforcement learning algorithms
  • Unity ML-Agent Toolkit - an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents
  • Holodeck - a high-fidelity simulator for reinforcement learning built on top of Unreal Engine 4
  • AirSim - a simulator for drones, cars and more, built on Unreal Engine
  • Carla - an open-source simulator for autonomous driving research
  • Pommerman - a clone of Bomberman built for AI research
  • MetaCar - a reinforcement learning environment for self-driving cars in the browser
  • Boardgame.io - a boardgame environment

Agents

  • Unity ML-Agent Toolkit - an open-source Unity plugin that enables games and simulations to serve as environments for training intelligent agents
  • SLM Labs - a modular deep reinforcement learning framework in PyTorch
  • Dopamine - a research framework for fast prototyping of reinforcement learning algorithms
  • TRFL - a library built on top of TensorFlow that exposes several useful building blocks for implementing Reinforcement Learning agents
