Content of the material
- Agent Memory
- How To Play Pong On Terminal Details
- To run
- Reinforcement Learning Overview
- Preprocessing Frames
- Plotting the Results with TensorBoard and FloydHub
- ‘pong’ is not recognized
- Solution 1
- Solution 2
- Community QA
- Playing field too large
- Solution 2
Agent Memory
Before I get into implementing agent memory, I want to explain what inspired researchers to use it in the first place.
When I first approached this project, I tried to train the agent using the data generated in real time. So at timestep t, I would use the (S,A,R,S’) data from timestep (t-1) to train my agent.
The best way I can explain the problem with this is by going back to my early days in college. In my sophomore year, I had statics theory shoved down my throat for a year straight (I pursued mechanical engineering). Then in my junior year, I did nothing with statics and instead had machine design shoved down my throat. Come senior year, I’d forgotten everything there was to forget about statics.
In our environment, there might be 100 frames of the ball just moving to the left side of the screen. If we repeatedly train our agent on the exact same situation, it will eventually overfit to that situation and won’t generalize to the entire game, just like how I forgot about statics while studying machine design.
So, we store all our experiences in memory, then randomly sample from the whole list of memories. This way the agent learns from all its experiences while training, and not just its current situation (to beat this dead horse a little more, imagine if you couldn't learn from the past, and instead you could only learn from the exact present).
Alright, now for the implementation. For the memory, I make a separate class with 4 separate deques (a deque with a maximum length behaves like a first-in, first-out list of fixed size). The deques contain the frames, actions, rewards, and done flags (which tell us if this was a terminal state). I also add a function that allows us to add to these deques.
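A minimal sketch of such a memory class follows. The class name, method names, and argument names are assumptions for illustration, and the `sample_indices` helper is an addition that shows the random sampling described above; it is not necessarily how the original code draws its batches.

```python
import random
from collections import deque


class Memory:
    """Fixed-size experience memory backed by four deques (hypothetical names)."""

    def __init__(self, max_len):
        self.max_len = max_len
        # With maxlen set, appending to a full deque drops the oldest item.
        self.frames = deque(maxlen=max_len)
        self.actions = deque(maxlen=max_len)
        self.rewards = deque(maxlen=max_len)
        self.done_flags = deque(maxlen=max_len)

    def add_experience(self, frame, reward, action, done):
        """Append one timestep of experience to all four deques."""
        self.frames.append(frame)
        self.actions.append(action)
        self.rewards.append(reward)
        self.done_flags.append(done)

    def sample_indices(self, batch_size):
        """Assumed helper: pick random positions so training sees old and new data."""
        return random.sample(range(len(self.frames)), batch_size)


memory = Memory(max_len=3)
for t in range(5):
    # Integers stand in for real frames here.
    memory.add_experience(frame=t, reward=0, action=0, done=False)

print(list(memory.frames))  # only the three most recent frames remain
```

Keeping the four deques the same length means one index refers to the same timestep in each of them, which is what makes index-based sampling work.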
How To Play Pong On Terminal Details
To play the game, just compile it with gcc and link to the pthread library:
and then run the program to play:
If you can score over 100 you get a special prize 🙂
Reinforcement Learning Overview
Time for a few quick definitions. In Reinforcement Learning, an agent perceives its environment through observations and rewards, and acts upon it through actions.
The agent learns which actions maximize the reward, given what it has learned from the environment.
More precisely, in our Pong case:
- The agent is the Pong AI model we’re training.
- The action is the output of our model: it tells whether the paddle should go up or down.
- The environment is everything that determines the state of the game.
- The observation is what the agent sees from the environment (here, the frames).
- The reward is what our agent gains after taking an action in the environment (here, -1 after missing the ball, +1 if the opponent misses the ball).
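The roles listed above can be sketched as a simple interaction loop. Everything in this example is a hypothetical toy: `ToyPongEnv` is a three-row stand-in and not the real Pong environment, its reward rule is a simplification, and `simple_policy` is a hand-written rule in place of a trained model.

```python
import random

UP, DOWN = 0, 1  # the two actions the model can output


class ToyPongEnv:
    """Hypothetical 3-row stand-in for Pong, for illustration only."""

    def __init__(self, max_steps=10, seed=0):
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        self.steps = 0
        self.paddle_row = 1
        self.ball_row = self.rng.randrange(3)
        return (self.paddle_row, self.ball_row)  # first observation

    def step(self, action):
        # The action moves the paddle one row, clipped to the field.
        move = -1 if action == UP else 1
        self.paddle_row = min(2, max(0, self.paddle_row + move))
        # Toy reward: +1 if the paddle meets the ball, -1 if it misses.
        reward = 1 if self.paddle_row == self.ball_row else -1
        self.steps += 1
        done = self.steps >= self.max_steps
        self.ball_row = self.rng.randrange(3)  # next incoming ball
        return (self.paddle_row, self.ball_row), reward, done


def simple_policy(observation):
    """Hand-written agent: move toward the ball's row."""
    paddle_row, ball_row = observation
    return UP if ball_row < paddle_row else DOWN


def run_episode(env, policy):
    """Observe, act, collect the reward, and repeat until the episode ends."""
    observation = env.reset()
    total_reward, done = 0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


print(run_episode(ToyPongEnv(), simple_policy))
```

In the real project the policy is the model being trained, and the observation is a stack of preprocessed frames rather than a pair of row numbers.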
Preprocessing Frames
The frames received from OpenAI Gym are much larger, and at a much higher resolution, than we need. See below:
Firstly, we crop the image so that only the important area is displayed. Next, we convert the image to grayscale. Finally, we resize the frame using cv2 with nearest-neighbor interpolation, then convert the image datatype to np.uint8. See the code below for implementation…
The resulting image looks like this:
Plotting the Results with TensorBoard and FloydHub
Go back to your FloydHub Workspace and open the notebook train-with-log.ipynb. Then run the notebook (click on the Run tab at the top of the workspace, then on Run all cells). Now click on the TensorBoard link (inside the blue dashboard bar at the bottom of the screen).
After you click the link, FloydHub automatically detects all the TensorBoard log directories. You should see a window that looks like this (click on Wall to see the following acc and loss graph):
train-with-log.ipynb is just the old train.ipynb with additional lines of code to log some variables and metrics.
With Keras, it involves creating a callback object:
And passing the tbCallBack object when training:
I also used an additional function, tflog, to easily keep track of variables not related to Keras, like the running_reward (the smoothed reward). After having imported the file containing the function, calling it is as simple as:
‘pong’ is not recognized
Make sure you installed pong globally. You do this by passing the -g flag to the npm install command; -g is what makes the install global.
If it is installed globally, make sure that npm is in your PATH. For instructions on how to do that, see this Stack Overflow post.
Playing field too large
Run pong with a smaller field using the command