Go


This environment is part of the classic environments. Please read that page first for general information.

Name                 Value
Actions              Discrete
Agents               2
Parallel API         False
Manual Control       No
Action Shape         Discrete(170)
Action Values        Discrete(170)
Observation Shape    (13, 13, 3)
Observation Values   [0, 1]
Import               from pettingzoo.classic import go_v1
Agents               agents= ['black_0', 'white_0']

Agent Environment Cycle


Go

Go is a board game with 2 players, black and white. The black player starts by placing a black stone at an empty board intersection. The white player follows by placing a stone of their own, aiming to either surround more territory than their opponent or capture the opponent’s stones. The game ends if both players sequentially decide to pass.

Our implementation is a wrapper for MiniGo.

Arguments

Go takes two optional arguments that define the board size (int) and komi compensation points (float). The default values for the board size and komi are 19 and 7.5, respectively.

go_v1.env(board_size=13, komi=7.5)

go_v1.env()  # with default values
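
A minimal construction sketch (assuming only the env() signature shown above; other AEC API details such as attribute names may vary between PettingZoo versions):

from pettingzoo.classic import go_v1

# Build a smaller 9x9 board with a custom komi and reset it.
env = go_v1.env(board_size=9, komi=5.5)
env.reset()

print(env.agents)           # ['black_0', 'white_0']
print(env.agent_selection)  # 'black_0' -- black moves first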

Observation Space

The observation shape is a function of the board size N and has a shape of (N, N, 3). The first plane, (:, :, 0), represents the stones on the board for the current player, while the second plane, (:, :, 1), encodes the stones of the opponent. The third plane, (:, :, 2), is all 1s if the current player is black_0, or all 0s if the current player is white_0. The state of the board is represented with the top left corner as (0, 0). For example, a (9, 9) board is

   0 1 2 3 4 5 6 7 8
 0 . . . . . . . . .  0
 1 . . . . . . . . .  1
 2 . . . . . . . . .  2
 3 . . . . . . . . .  3
 4 . . . . . . . . .  4
 5 . . . . . . . . .  5
 6 . . . . . . . . .  6
 7 . . . . . . . . .  7
 8 . . . . . . . . .  8
   0 1 2 3 4 5 6 7 8
Plane  Description
0      Current player's stones (0: no stone, 1: stone)
1      Opponent player's stones (0: no stone, 1: stone)
2      Player (0: white, 1: black)
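
As a sketch of how these planes can be read (method names follow the standard AEC API and may differ slightly between PettingZoo versions):

import numpy as np
from pettingzoo.classic import go_v1

env = go_v1.env(board_size=9)
env.reset()

agent = env.agent_selection       # 'black_0' acts first
obs = env.observe(agent)          # shape (9, 9, 3)

my_stones = obs[:, :, 0]          # 1 where the current player has a stone
opp_stones = obs[:, :, 1]         # 1 where the opponent has a stone
plays_black = bool(obs[0, 0, 2])  # plane 2 is all 1s for black_0, all 0s for white_0

print(agent, "plays black:", plays_black)
print("stones on the board:", int(my_stones.sum() + opp_stones.sum()))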

When rendering, the board uses the GTP coordinate system.

Action Space

Similar to the observation space, the action space is dependent on the board size N.

Action ID           Description
0 to N-1            Place a stone on the 1st row of the board.
                    (0: (0,0), 1: (0,1), …, N-1: (0,N-1))
N to 2N-1           Place a stone on the 2nd row of the board.
                    (N: (1,0), N+1: (1,1), …, 2N-1: (1,N-1))
…
N^2-N to N^2-1      Place a stone on the Nth row of the board.
                    (N^2-N: (N-1,0), N^2-N+1: (N-1,1), …, N^2-1: (N-1,N-1))
N^2                 Pass

For example, you would use action 4 to place a stone on the board at the (0,4) location, or action N^2 to pass. You can transform a non-pass action a back into its 2D (x,y) coordinate by computing (a//N, a%N). The total action space is Discrete(N^2 + 1).
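
The helpers below are a small illustration of this mapping (the function names are hypothetical and not part of the environment API):

# Convert between action IDs and (row, col) coordinates on an N x N board.
def action_to_coord(action, board_size):
    if action == board_size * board_size:
        return None  # the last action ID is the pass move
    return divmod(action, board_size)  # (action // N, action % N)

def coord_to_action(row, col, board_size):
    return row * board_size + col

assert action_to_coord(4, 9) == (0, 4)
assert coord_to_action(0, 4, 9) == 4
assert action_to_coord(81, 9) is None  # 9*9 = 81 is the pass action on a 9x9 board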

Rewards

Winner    Loser
+1        -1

The legal moves available for each agent, found in env.infos[agent]['legal_moves'], are updated after each step. Taking an illegal move ends the game with a reward of -1 for the illegally moving agent and a reward of 0 for all other agents.
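
The loop below sketches random self-play that only selects legal moves, assuming the env exposes dones and rewards dicts alongside the infos dict described above (attribute names may differ between PettingZoo versions):

import random
from pettingzoo.classic import go_v1

env = go_v1.env(board_size=9)
env.reset()

while True:
    agent = env.agent_selection
    if env.dones[agent]:
        break  # the game has ended
    legal_moves = env.infos[agent]['legal_moves']
    env.step(random.choice(legal_moves))

print(env.rewards)  # e.g. {'black_0': 1, 'white_0': -1}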