This environment is part of the MPE environments. Please read that page first for general information.
Average Total Reward: -80.9
This environment is similar to simple_reference, except that one agent is the ‘speaker’ (gray), which can speak but cannot move, while the other agent is the ‘listener’, which cannot speak but must navigate to the correct landmark.
Speaker observation space:
Listener observation space:
[self_vel, all_landmark_rel_positions, communication]
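The listener observation is a flat vector in the order listed above. The following is a minimal sketch of splitting such a vector into named parts. It assumes 2-D velocities and positions, and it takes the number of landmarks and the communication width as arguments because this page does not state them. The helper `split_listener_obs` and its parameters are hypothetical and not part of the environment's API.

```python
def split_listener_obs(obs, n_landmarks, comm_dim, dim=2):
    """Split a flat listener observation into named components.

    Assumed layout: [self_vel, all_landmark_rel_positions, communication].
    n_landmarks, comm_dim and dim are assumptions supplied by the caller.
    """
    obs = list(obs)
    expected = dim + n_landmarks * dim + comm_dim
    if len(obs) != expected:
        raise ValueError(f"expected {expected} values, got {len(obs)}")

    # The first `dim` entries are the listener's own velocity.
    self_vel = obs[:dim]

    # Next come the landmark positions relative to the listener, one pair each.
    landmarks = [
        obs[dim + i * dim: dim + (i + 1) * dim] for i in range(n_landmarks)
    ]

    # The remaining entries are the message received from the speaker.
    communication = obs[dim + n_landmarks * dim:]

    return {
        "self_vel": self_vel,
        "landmark_rel_positions": landmarks,
        "communication": communication,
    }


# Illustrative values only, not real environment output.
example = [0.1, -0.2, 1.0, 1.5, -0.5, 0.25, 0.0, -1.0, 0.0, 1.0, 0.0]
parts = split_listener_obs(example, n_landmarks=3, comm_dim=3)
```

Passing the sizes explicitly keeps the sketch usable if the real observation has a different number of landmarks or communication channels; a length mismatch raises an error instead of silently mis-slicing.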
Speaker action space:
[say_0, say_1, say_2, say_3, say_4, say_5, say_6, say_7, say_8, say_9]
Listener action space:
[no_action, move_left, move_right, move_down, move_up]
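For logging or debugging, it can help to turn integer actions back into the names listed above. The sketch below assumes that actions are discrete indices starting at 0 and following the listed order; that ordering is an assumption drawn from the lists on this page, and the names `ListenerAction`, `SPEAKER_ACTIONS` and `describe_action` are hypothetical helpers.

```python
from enum import IntEnum


class ListenerAction(IntEnum):
    # Assumed indices: same order as the listener action list above.
    NO_ACTION = 0
    MOVE_LEFT = 1
    MOVE_RIGHT = 2
    MOVE_DOWN = 3
    MOVE_UP = 4


# Assumed indices: say_0 is action 0, ..., say_9 is action 9.
SPEAKER_ACTIONS = [f"say_{i}" for i in range(10)]


def describe_action(role, action):
    """Return the readable name of a discrete action for the given role."""
    if role == "speaker":
        return SPEAKER_ACTIONS[action]
    if role == "listener":
        return ListenerAction(action).name.lower()
    raise ValueError(f"unknown role: {role!r}")
```

For example, `describe_action("listener", 4)` returns `"move_up"` and `describe_action("speaker", 3)` returns `"say_3"` under these assumptions.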
max_cycles: number of frames (a step for each agent) until game terminates
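As a rough illustration of the parenthetical above, the toy loop below counts individual agent steps under the reading that each frame contains one step per agent. It does not call the environment; the agent names and the helper `enumerate_agent_steps` are hypothetical.

```python
# Assumed agent names, for illustration only.
AGENTS = ["speaker", "listener"]


def enumerate_agent_steps(max_cycles, agents=AGENTS):
    """List (frame, agent) pairs, one step per agent in every frame."""
    steps = []
    for frame in range(max_cycles):
        for agent in agents:
            steps.append((frame, agent))
    return steps


steps = enumerate_agent_steps(max_cycles=25)
```

Under this reading, a run with `max_cycles=25` and two agents consists of 50 individual agent steps.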