This environment is part of the classic environments. Please read that page first for general information.
Import | from pettingzoo.classic import texas_holdem_v4 |
Actions | Discrete |
Parallel API | Yes |
Manual Control | No |
Agents | agents= ['player_0', 'player_1'] |
Agents | 2 |
Action Shape | Discrete(4) |
Action Values | Discrete(4) |
Observation Shape | (72,) |
Observation Values | [0, 1] |
texas_holdem_v4.env(num_players=2)
`num_players`: Sets the number of players in the game. Minimum is 2.
The observation is a dictionary containing an `'obs'` element, which is the usual RL observation described below, and an `'action_mask'` element, which holds the legal moves, described in the Legal Actions Mask section.
The main observation space is a vector of 72 binary values (0 or 1). The first 52 entries encode the current player's hand plus any community cards, and the remaining 20 entries encode the chips raised in each of the four betting rounds, as follows:
Index | Description |
---|---|
0 - 12 | Spades (0: A, 1: 2, …, 12: K) |
13 - 25 | Hearts (13: A, 14: 2, …, 25: K) |
26 - 38 | Diamonds (26: A, 27: 2, …, 38: K) |
39 - 51 | Clubs (39: A, 40: 2, …, 51: K) |
52 - 56 | Chips raised in Round 1 (52: 0, 53: 1, …, 56: 4) |
57 - 61 | Chips raised in Round 2 (57: 0, 58: 1, …, 61: 4) |
62 - 66 | Chips raised in Round 3 (62: 0, 63: 1, …, 66: 4) |
67 - 71 | Chips raised in Round 4 (67: 0, 68: 1, …, 71: 4) |
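
The index layout in the table above can be illustrated with a small decoder. This is only a sketch: `decode_observation` is a hypothetical helper, not part of PettingZoo, and it assumes that each five-entry "chips raised" block is one-hot, so that the position of the set entry gives the chip count for that round.

```python
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]


def decode_observation(obs):
    """Decode a 72-entry observation vector (hypothetical helper).

    Returns the visible cards and, per betting round, the chips raised.
    """
    obs = [int(x) for x in obs]  # also accepts array-like inputs
    if len(obs) != 72:
        raise ValueError("expected a vector of 72 entries")

    # Entries 0-51: index = 13 * suit + rank, suits ordered as in the table.
    cards = [f"{RANKS[i % 13]} of {SUITS[i // 13]}" for i in range(52) if obs[i]]

    # Entries 52-71: four blocks of five entries, one block per round.
    chips = []
    for rnd in range(4):
        block = obs[52 + 5 * rnd : 57 + 5 * rnd]
        # Assumption: the block is one-hot; None if no entry is set.
        chips.append(block.index(1) if 1 in block else None)
    return cards, chips


# Hand-built example vector, not real environment output.
obs = [0] * 72
obs[0] = 1       # Ace of Spades
obs[25] = 1      # King of Hearts
obs[52 + 2] = 1  # Round 1: 2 chips raised
obs[57] = 1      # Round 2: 0 chips raised
obs[62] = 1      # Round 3: 0 chips raised
obs[67] = 1      # Round 4: 0 chips raised

cards, chips = decode_observation(obs)
print(cards)
print(chips)
```

In practice the input would be the `'obs'` element of the observation dictionary rather than a hand-built list.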
The legal moves available to the current agent are found in the `action_mask` element of the dictionary observation. The `action_mask` is a binary vector in which each index indicates whether the corresponding action is legal. The `action_mask` will be all zeros for any agent except the one whose turn it is. Taking an illegal move ends the game with a reward of -1 for the agent that made the illegal move and a reward of 0 for all other agents.
Action ID | Action |
---|---|
0 | Call |
1 | Raise |
2 | Fold |
3 | Check |
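
Because an illegal move ends the game with a penalty, a policy should pick only from the actions the mask allows. The following sketch shows one way to do that in plain Python. The helper names `legal_actions` and `sample_legal_action` are hypothetical, and the mask is a hand-built example rather than real environment output.

```python
import random

# Action IDs as listed in the table above.
ACTION_NAMES = {0: "Call", 1: "Raise", 2: "Fold", 3: "Check"}


def legal_actions(action_mask):
    """Return the IDs of all actions whose mask entry is set."""
    return [i for i, legal in enumerate(action_mask) if legal]


def sample_legal_action(action_mask, rng=random):
    """Pick a uniformly random legal action, or None if the mask is all zeros."""
    legal = legal_actions(action_mask)
    if not legal:
        return None  # all-zero mask: it is not this agent's turn
    return rng.choice(legal)


# Example mask in which Check is not allowed.
mask = [1, 1, 1, 0]
action = sample_legal_action(mask, random.Random(0))
print(action, ACTION_NAMES[action])
```

With a real environment, the mask would come from the `'action_mask'` element of the observation dictionary.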
Winner | Loser |
---|---|
+raised chips/2 | -raised chips/2 |
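
As a worked instance of the reward table above, the sketch below computes the payoffs for a two-player game. The function `two_player_payoffs` is a hypothetical illustration, and it assumes that "raised chips" is a single total supplied by the caller; it does not reproduce how the environment tracks chips internally.

```python
def two_player_payoffs(raised_chips, winner, loser):
    """Return rewards following the table: winner +raised/2, loser -raised/2."""
    half = raised_chips / 2
    return {winner: half, loser: -half}


# Example with an assumed total of 4 raised chips.
rewards = two_player_payoffs(4, winner="player_0", loser="player_1")
print(rewards)
```

Under this rule the two rewards always sum to zero.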