This environment is part of the classic environments. Please read that page first for general information.
|Observation Values||[0, 1]|
Leduc Hold’em is a variation of Limit Texas Hold’em with 2 players, 2 rounds and a deck of six cards (Jack, Queen, and King in 2 suits). At the beginning of the game, each player receives one card and, after betting, one public card is revealed. Another round of betting follows. At the end, the player with the best hand wins and receives a reward (+1) and the loser receives -1. At any time, any player can fold.
Our implementation wraps RLCard and you can refer to its documentation for additional details. Please cite their work if you use this game in research.
The observation is a dictionary which contains an 'obs' element, which is the usual RL observation described below, and an 'action_mask' element, which holds the legal moves, described in the Legal Actions Mask section.
As described by RLCard, the first 3 entries of the main observation space correspond to the player’s hand (J, Q, and K) and the next 3 represent the public card. Indexes 6 to 20 and 21 to 35 encode the number of chips of the current player and the opponent, respectively.
|0 - 2||Current Player’s Hand|
|3 - 5||Community Cards|
|6 - 20||Current Player’s Chips|
|21 - 35||Opponent’s Chips|
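The index layout in the table above can be illustrated with a small decoder. This is a minimal sketch, not part of the environment's API: the function name `decode_observation` is hypothetical, and it assumes that each segment is one-hot, that the position of the 1 inside a chip segment equals the chip count, and that the community card segment is all zeros until the public card is revealed.

```python
# Card order J, Q, K follows the description above.
RANKS = ("J", "Q", "K")


def decode_observation(obs):
    """Split a 36-entry observation vector into readable fields (hypothetical helper)."""
    if len(obs) != 36:
        raise ValueError("expected 36 entries, got %d" % len(obs))

    def hot_index(segment):
        # Position of the first 1 in the segment, or None if the segment is all zeros.
        ones = [i for i, value in enumerate(segment) if value == 1]
        return ones[0] if ones else None

    hand = hot_index(obs[0:3])      # indexes 0 - 2
    public = hot_index(obs[3:6])    # indexes 3 - 5
    return {
        "hand": RANKS[hand] if hand is not None else None,
        "public_card": RANKS[public] if public is not None else None,
        # Assumption: offset within the segment is the number of chips.
        "my_chips": hot_index(obs[6:21]),         # indexes 6 - 20
        "opponent_chips": hot_index(obs[21:36]),  # indexes 21 - 35
    }


# Hand-built example vector: holding a Queen, no public card yet,
# 2 chips for the current player and 1 chip for the opponent.
example = [0] * 36
example[1] = 1
example[6 + 2] = 1
example[21 + 1] = 1
print(decode_observation(example))
# {'hand': 'Q', 'public_card': None, 'my_chips': 2, 'opponent_chips': 1}
```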
The legal moves available to the current agent are found in the action_mask element of the dictionary observation. The action_mask is a binary vector where each index of the vector represents whether the action is legal or not. The action_mask will be all zeros for any agent except the one whose turn it is. Taking an illegal move ends the game with a reward of -1 for the illegally moving agent and a reward of 0 for all other agents.
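Because an illegal move ends the game with a penalty, a policy will typically restrict its choice to the indexes where the mask is 1. The following is a minimal sketch of that idea; `sample_legal_action` is a hypothetical helper, and the 4-entry masks are illustrative values rather than output from the environment.

```python
import random


def sample_legal_action(action_mask, rng=random):
    """Return a uniformly random legal action index, or None if no action is legal."""
    legal = [index for index, flag in enumerate(action_mask) if flag == 1]
    if not legal:
        # An all-zero mask means it is not this agent's turn.
        return None
    return rng.choice(legal)


# Illustrative mask: only actions 1 and 3 are legal.
action = sample_legal_action([0, 1, 0, 1])
print(action in (1, 3))

# Illustrative all-zero mask for an agent that is not acting.
print(sample_legal_action([0, 0, 0, 0]))
```

A learned policy could apply the same filter by ignoring the scores of masked-out actions before picking the best one.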
|Winner||Loser|
|+raised chips / 2||-raised chips / 2|
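The reward row above is simple arithmetic, sketched below. The function name `payoffs` is hypothetical, and the sketch assumes that "raised chips" can be treated as a single number supplied by the caller; how that number is obtained from the game state is not covered here.

```python
def payoffs(raised_chips):
    """Return (winner_reward, loser_reward) for a given number of raised chips."""
    half = raised_chips / 2
    return half, -half


winner_reward, loser_reward = payoffs(4)
print(winner_reward, loser_reward)  # 2.0 -2.0
```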