README.md (+13 -12)
@@ -1,5 +1,5 @@
-# Q-Learning - Demo Notebook
-This repository contains a short Demo Notebook on how to implement a Reinforcement Learning agent, which learns to solve an OpenAI Gym environment.
+# Q-Learning - Jupyter Notebook
+This repository contains a Jupyter Notebook with an implementation of a Q-Learning Agent, which learns to solve the n-Chain OpenAI Gym environment.

This notebook is inspired by the following notebook: [Deep Reinforcement Learning Course Notebook](https://github.com/simoninithomas/Deep_reinforcement_learning_Course/blob/master/Q%20learning/Taxi-v2/Q%20Learning%20with%20OpenAI%20Taxi-v2%20video%20version.ipynb)
@@ -10,17 +10,15 @@ Download the repository:
Run the Jupyter Notebook:
`q_learning_notebook.ipynb`

-## Introduction to Reinforcement Learning
+## Description of the Q-Learning Algorithm
-The notebook contains a Q-Learning algorithm implementation and a training loop to solve the N-Chain OpenAI Gym environment.
+The notebook contains a Q-Learning algorithm implementation and a training loop to solve the n-Chain OpenAI Gym environment. The image below describes the Q-Learning Algorithm (an off-policy Temporal-Difference control algorithm):
-
-
-Q-Learning Algorithm: [Image](http://incompleteideas.net/book/the-book-2nd.html) taken from **Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, Second edition, 2014/2015, page 158**
+Q-Learning Algorithm - [Image](http://incompleteideas.net/book/the-book-2nd.html) taken from **Richard S. Sutton and Andrew G. Barto, Reinforcement Learning: An Introduction, Second edition, 2014/2015, page 158**
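The update rule behind the Q-Learning algorithm referenced above can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the helper names `epsilon_greedy` and `q_learning_update`, the list-of-lists Q-table, and the default values for `alpha` (learning rate) and `gamma` (discount factor) are assumptions made for this example.

```python
import random


def epsilon_greedy(q_table, state, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    n_actions = len(q_table[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    row = q_table[state]
    return row.index(max(row))  # ties resolve to the lowest action index


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.95):
    """One tabular Q-learning step (hypothetical helper).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # The target uses the best next action, regardless of which action the
    # behaviour policy will actually take next (this is what makes it off-policy).
    td_target = reward + gamma * max(q_table[next_state])
    td_error = td_target - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


# Tiny example: 2 states, 2 actions; only state 1 has non-zero estimates.
q = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_learning_update(q, state=0, action=1, reward=1.0,
                              next_state=1, alpha=0.5, gamma=0.9)
```

In the example, the target is `1.0 + 0.9 * 2.0`, so the estimate for state 0 and action 1 moves halfway from 0 towards that target.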
The n-Chain environment is taken from the OpenAI Gym module (official documentation: [n-Chain](https://gym.openai.com/envs/NChain-v0/)).

The image below shows an example of a 5-Chain (n = 5) environment with 5 states. "a" stands for action and "r" for the reward ([Image Source](https://adventuresinmachinelearning.com/reinforcement-learning-tutorial-python-keras/)).
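To make the chain dynamics concrete, here is a small stand-alone sketch of such an environment. It is not the Gym implementation: the class name `ChainEnv`, the two-value return of `step`, the reward values (`small=2`, `large=10`), and the `slip` probability are assumptions chosen for illustration, so check them against the figure and the Gym source before relying on them.

```python
import random


class ChainEnv:
    """Hypothetical stand-in for an n-Chain environment (not the Gym class).

    Action 0 moves one state forward along the chain; action 1 jumps back to
    state 0. Reward values and slip probability are assumed defaults.
    """

    def __init__(self, n=5, slip=0.2, small=2, large=10, seed=None):
        self.n = n
        self.slip = slip
        self.small = small
        self.large = large
        self.rng = random.Random(seed)
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # With probability `slip` the opposite action is executed.
        if self.rng.random() < self.slip:
            action = 1 - action
        if action == 1:
            # Backward: return to the start and collect the small reward.
            reward = self.small
            self.state = 0
        elif self.state < self.n - 1:
            # Forward: advance one state, no reward yet.
            reward = 0
            self.state += 1
        else:
            # Forward at the end of the chain: stay and collect the large reward.
            reward = self.large
        return self.state, reward


env = ChainEnv(n=5, slip=0.0)  # slip=0.0 makes the walk deterministic
env.reset()
trajectory = [env.step(0) for _ in range(5)]  # five forward moves
trajectory.append(env.step(1))                # one backward move
```

With slipping disabled, the agent only earns the large reward after walking the whole chain, which is why a purely greedy agent can get stuck collecting the small reward at state 0.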