This environment is part of the MPE environments. Please read that page first for general information.
| | |
|:--|:--|
| Action Values | `Discrete(5)` / `Box(0.0, 1.0, (5,))` |
In this environment a single agent sees a landmark position and is rewarded based on how close it gets to the landmark (Euclidean distance). This is not a multi-agent environment and is primarily intended for debugging purposes.
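The distance-based reward described above can be sketched as a small helper. This is a hypothetical illustration of the idea, not the actual scenario code (which may, for example, use squared distance instead):

```python
import math

def landmark_reward(agent_pos, landmark_pos):
    """Sketch: reward is the negative Euclidean distance between the
    agent and the landmark, so reward increases (toward 0) as the
    agent approaches the landmark. Hypothetical helper for illustration."""
    return -math.dist(agent_pos, landmark_pos)

# An agent sitting on the landmark gets the maximum reward of 0.
print(landmark_reward((0.0, 0.0), (0.0, 0.0)))  # → 0.0
print(landmark_reward((3.0, 4.0), (0.0, 0.0)))  # → -5.0
```

Under this sketch, moving closer to the landmark strictly increases the reward, which is what makes the environment useful as a single-agent sanity check.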
`max_cycles`: number of frames (a step for each agent) until the game terminates

`continuous_actions`: whether agent action spaces are discrete (default) or continuous