This environment is part of the MPE environments. Please read that page first for general information.
In this environment, a single agent sees a landmark's position and is rewarded based on how close it gets to the landmark (Euclidean distance). This is not a multi-agent environment; it is intended primarily for debugging.
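
As a rough illustration of the distance-based reward described above, the sketch below computes a reward that grows as the agent approaches the landmark. The function name `distance_reward` and the exact form (negative Euclidean distance) are assumptions made for illustration; the environment itself may square or scale this value.

```python
import numpy as np


def distance_reward(agent_pos, landmark_pos):
    """Hypothetical reward: negative Euclidean distance to the landmark.

    This is an illustrative assumption, not the environment's own code.
    """
    agent_pos = np.asarray(agent_pos, dtype=float)
    landmark_pos = np.asarray(landmark_pos, dtype=float)
    # The reward is zero on the landmark and more negative farther away.
    return -float(np.linalg.norm(agent_pos - landmark_pos))


if __name__ == "__main__":
    print(distance_reward([0.0, 0.0], [3.0, 4.0]))
```

Under this assumed form, the best achievable reward is zero, reached when the agent sits exactly on the landmark.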
max_cycles: number of frames (a step for each agent) until the game terminates
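
To make the role of max_cycles concrete, here is a minimal self-contained sketch of an episode that stops after a fixed number of frames. It does not use the environment's API: the `rollout` function, the greedy "move toward the landmark" policy, the start and landmark coordinates, and `step_size` are all hypothetical stand-ins chosen for illustration.

```python
import math


def rollout(max_cycles, start=(0.0, 0.0), landmark=(1.0, 1.0), step_size=0.1):
    """Toy episode: step toward the landmark until max_cycles frames elapse.

    Everything here is an illustrative assumption, not the real environment.
    """
    x, y = start
    rewards = []
    for _ in range(max_cycles):  # the episode ends after max_cycles frames
        dx, dy = landmark[0] - x, landmark[1] - y
        dist = math.hypot(dx, dy)
        if dist > 0:
            # Move at most step_size along the straight line to the landmark.
            move = min(step_size, dist)
            x += move * dx / dist
            y += move * dy / dist
        # Assumed reward: negative Euclidean distance after the move.
        rewards.append(-math.hypot(landmark[0] - x, landmark[1] - y))
    return rewards


if __name__ == "__main__":
    episode = rollout(max_cycles=10)
    print(len(episode), episode[0], episode[-1])
```

In this sketch the episode length is always exactly max_cycles, regardless of whether the agent has reached the landmark.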