This environment is part of the butterfly environments. Please read that page first for general information.
|Action Values||[0, 1]|
|Observation Shape||(280, 480, 3)|
|Observation Values||[0, 255]|
|State Shape||(560, 960, 3)|
|State Values||(0, 255)|
|Average Total Reward||-92.9|
Cooperative pong is a game of simple pong, where the objective is to keep the ball in play for the longest time. The game is over when the ball goes out of bounds from either the left or right edge of the screen. There are two agents (paddles), one that moves along the left edge and the other that moves along the right edge of the screen. All collisions of the ball are elastic. The ball always starts moving in a random direction from the center of the screen with each reset. To make learning a little more challenging, the right paddle is shaped like a tiered cake by default. The observation space of each agent is its own half of the screen. There are two possible actions for the agents (move up/down). For each frame in which the ball stays within bounds, both agents receive a combined reward of 100 / max_cycles (default 0.11). Otherwise, each agent receives a reward of -100 and the game ends.
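The reward rule above can be sketched in a few lines of arithmetic. This is only an illustration, not the environment's own code: the helper names `per_frame_reward` and `episode_return` are hypothetical, and the sketch assumes the per-frame amount is credited once for every frame survived and that the -100 penalty replaces it on the frame where the ball is lost.

```python
def per_frame_reward(max_cycles: int = 900) -> float:
    """Reward credited for each frame the ball stays in bounds."""
    if max_cycles <= 0:
        raise ValueError("max_cycles must be positive")
    return 100 / max_cycles


def episode_return(frames_survived: int, max_cycles: int = 900,
                   ball_lost: bool = True) -> float:
    """Hypothetical helper: total reward over one episode.

    Assumes the per-frame reward accrues for every surviving frame and
    a single -100 penalty is applied if the ball leaves the screen.
    """
    frames = min(frames_survived, max_cycles)  # episodes stop at max_cycles
    total = frames * per_frame_reward(max_cycles)
    if ball_lost:
        total -= 100
    return total


if __name__ == "__main__":
    print(round(per_frame_reward(), 2))
    print(round(episode_return(64), 1))
```

Under this reading, an episode that loses the ball after 64 frames returns about -92.9, and an episode that survives all 900 default frames returns about 100.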
Move the left paddle using the ‘W’ and ‘S’ keys. Move the right paddle using ‘UP’ and ‘DOWN’ arrow keys.
cooperative_pong_v3.env(ball_speed=9, left_paddle_speed=12, right_paddle_speed=12, cake_paddle=True, max_cycles=900, bounce_randomness=False)
ball_speed: Speed of ball (in pixels)
left_paddle_speed: Speed of left paddle (in pixels)
right_paddle_speed: Speed of right paddle (in pixels)
cake_paddle: If True, the right paddle takes the shape of a 4-tiered wedding cake
max_cycles: Done is set to True for all agents after this number of frames (steps through all agents) has elapsed.
bounce_randomness: If True, each collision of the ball with the paddles adds a small random angle to the direction of the ball, with the speed of the ball remaining unchanged.
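A minimal sketch of overriding these arguments follows. The defaults are copied from the signature above; `build_kwargs` and `make_env` are hypothetical helper names introduced here for illustration, and the import path `pettingzoo.butterfly.cooperative_pong_v3` is assumed to match the installed PettingZoo version.

```python
# Defaults copied from the documented signature.
DEFAULT_KWARGS = {
    "ball_speed": 9,
    "left_paddle_speed": 12,
    "right_paddle_speed": 12,
    "cake_paddle": True,
    "max_cycles": 900,
    "bounce_randomness": False,
}


def build_kwargs(**overrides):
    """Merge overrides into the documented defaults, rejecting unknown names."""
    unknown = set(overrides) - set(DEFAULT_KWARGS)
    if unknown:
        raise TypeError(f"unknown arguments: {sorted(unknown)}")
    return {**DEFAULT_KWARGS, **overrides}


def make_env(**overrides):
    """Create the environment with the merged arguments."""
    # Imported lazily so build_kwargs works without PettingZoo installed.
    from pettingzoo.butterfly import cooperative_pong_v3
    return cooperative_pong_v3.env(**build_kwargs(**overrides))
```

For example, `make_env(cake_paddle=False, bounce_randomness=True)` would request a plain right paddle with randomized bounces while leaving the other arguments at their defaults.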
dones were computed (1.3.1)