• Status – shows the status of the current process.
• Locally stored model – stores the trained model at the end of each iteration so that it is not lost even if the page is refreshed.
Step 1: Initialize and create the model by providing the hidden layer size. Delete the stored model (if present) before creating a new model.
Step 2: Set the training parameter values:
Number of iterations – Defines the number of training cycles.
Games per iteration – Defines how many games you want the pole to play in each iteration.
Max steps per game – Defines the maximum number of steps the pole can take in one game. The game ends either when the pole loses (fails to balance or moves out of the frame) or when the maximum number of steps is reached.
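Reward discount rate – Defines how much future rewards count relative to immediate ones. Its value lies between 0 and 1. For example, with 0.95 a reward received one step later counts for 95% of its value, a reward two steps later for about 90% (0.95 × 0.95), and so on. A value near 0 makes the agent care only about the immediate reward, while a value near 1 makes it weigh long-term rewards almost as heavily as immediate ones.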
Learning rate – Determines to what extent newly acquired information overrides old information. A value of 0 makes the agent learn nothing (it relies exclusively on prior knowledge), while a value of 1 makes the agent consider only the most recent information (ignoring prior knowledge).
Render during training – Checkbox. Tick it if you want to see the simulation of the pole during training.
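To make the reward discount rate concrete, here is a minimal Python sketch of how discounted returns are commonly computed for one game. It is an illustration, not the demo's own code: the function name `discount_rewards` and the assumption that every surviving step earns a reward of 1 are hypothetical.

```python
def discount_rewards(rewards, discount_rate):
    """Return the discounted return for every step of one game.

    Hypothetical helper for illustration; not taken from the demo.
    """
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future rewards.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount_rate * running
        discounted[t] = running
    return discounted


# Assumption: three surviving steps, each worth a reward of 1.
returns = discount_rewards([1.0, 1.0, 1.0], 0.95)
print(returns)
```

With a discount rate of 0.95, the first step is credited with roughly 1 + 0.95 + 0.95 × 0.95 = 2.8525, while the last step is credited only with its own reward of 1.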
Then press the train button. The training will start and a graph will be plotted. Once the training is done (the status area shows this), you can move on to the next step.
*You can also stop the training midway by using the stop button (the train button changes to a stop button once pressed).
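The way the three counting parameters fit together can be sketched as nested loops. The Python below is a toy stand-in under stated assumptions, not the demo's implementation: `play_game`, `train`, `should_stop`, and the fixed `fall_probability` are hypothetical names, and the random "fall" replaces the real pole physics and model.

```python
import random


def play_game(max_steps, rng, fall_probability=0.05):
    """Toy game: count the steps survived before a random 'fall' (assumption)."""
    steps = 0
    for _ in range(max_steps):
        if rng.random() < fall_probability:
            break  # the pole lost before reaching the step limit
        steps += 1
    return steps


def train(num_iterations, games_per_iteration, max_steps_per_game,
          should_stop=lambda: False, seed=0):
    """Run the nested training loops and return the mean steps per iteration."""
    rng = random.Random(seed)
    mean_steps_history = []
    for _ in range(num_iterations):
        if should_stop():  # plays the role of the stop button
            break
        steps = [play_game(max_steps_per_game, rng)
                 for _ in range(games_per_iteration)]
        # A real trainer would update and store the model at this point.
        mean_steps_history.append(sum(steps) / len(steps))
    return mean_steps_history


history = train(num_iterations=3, games_per_iteration=5, max_steps_per_game=20)
print(history)
```

Each iteration plays the configured number of games, each game is capped at the maximum step count, and the stop check runs once per iteration.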
Step 3: Once the model is trained, test it by pressing the test button (next to the train button). The simulation of the pole runs so you can watch how the trained model performs, and the status area at the top shows the number of steps the pole survived.