MPE environments


Simple Adversary

Simple Crypto

Simple Push

Simple Reference

Simple Speaker Listener

Simple Spread

Simple Tag

Simple World Comm


The unique dependencies for this set of environments can be installed via:

pip install "pettingzoo[mpe]"

Multi Particle Environments (MPE) are a set of communication-oriented environments in which particle agents can (sometimes) move, communicate, see each other, push each other around, and interact with fixed landmarks.

These environments come from OpenAI’s MPE codebase, with several minor fixes, mostly making the action space discrete, making the rewards consistent, and cleaning up the observation space of certain environments.

Types of Environments

The Simple Adversary, Simple Crypto, Simple Push, Simple Tag, and Simple World Comm environments are adversarial (a “good” agent being rewarded means an “adversary” agent is punished and vice versa, though not always in a perfectly zero-sum manner). In most of these environments, there are “good” agents rendered in green and an “adversary” team rendered in red.

The Simple Reference, Simple Speaker Listener, and Simple Spread environments are cooperative in nature (agents must work together to achieve their goals, and receive a mixture of rewards based on their own success and the success of the other agents).
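The mixture of individual and shared reward in the cooperative environments can be pictured as a weighted average. The sketch below is only an illustration of that idea. The function name `mixed_reward` and the weight `local_ratio` are assumptions introduced here, and the actual reward definitions are given in each environment's documentation.

```python
def mixed_reward(own_reward, team_reward, local_ratio=0.5):
    """Blend an agent's own reward with a reward shared by the whole team.

    local_ratio is a hypothetical weight in [0, 1]:
    1.0 uses only the agent's own reward, 0.0 only the shared one.
    """
    if not 0.0 <= local_ratio <= 1.0:
        raise ValueError("local_ratio must lie in [0, 1]")
    return local_ratio * own_reward + (1.0 - local_ratio) * team_reward


# The agent itself did fairly well (-1.0) but the team did poorly (-5.0).
print(mixed_reward(-1.0, -5.0, local_ratio=0.5))  # prints -3.0
```

With a weight of 0.5, an agent is penalised for the team's failure even when its own part went well, which is what pushes agents toward cooperating.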

Key Concepts


Termination

The game terminates after the number of cycles specified by the max_cycles environment argument has been executed. The default for all environments is 25 cycles, as in the original OpenAI source code.
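As a rough picture of what the cycle limit means, the sketch below counts agent steps in a toy loop. It does not use the library itself. `run_episode` is a hypothetical helper, and it assumes that one cycle means one action from every agent.

```python
def run_episode(agent_names, max_cycles=25):
    """Count agent steps until the cycle limit is reached.

    Assumption: a cycle consists of one action from every agent.
    """
    steps_taken = 0
    for _ in range(max_cycles):
        for _ in agent_names:
            # A real environment would apply the agent's action here.
            steps_taken += 1
    return steps_taken


# 3 agents acting for the default 25 cycles.
print(run_episode(["agent_0", "agent_1", "agent_2"]))  # prints 75
```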

Observation Space

The observation space of an agent is a vector generally composed of the agent’s position and velocity, other agents’ relative positions and velocities, landmarks’ relative positions, landmarks’ and agents’ types, and communications received from other agents. The exact layout is detailed in each environment’s documentation.

If an agent cannot see or observe the communication of a second agent, then the second agent is not included in the first’s observation space, resulting in varying observation space sizes in certain environments.
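To make the varying vector size concrete, here is a minimal sketch of how such an observation could be assembled. It is not the library's code. The function `build_observation`, the dictionary keys, and the ordering of the components are all assumptions chosen for illustration, and it covers only some of the components listed above.

```python
def build_observation(own_vel, own_pos, landmark_positions, other_agents):
    """Concatenate one agent's observation into a flat list of floats.

    other_agents is a list of dicts with hypothetical keys:
    "pos" (x, y), "visible" (bool), and "message" (list of floats, or
    None when the observer cannot hear that agent).
    """
    obs = list(own_vel) + list(own_pos)
    # Landmarks are reported relative to the observing agent.
    for lx, ly in landmark_positions:
        obs += [lx - own_pos[0], ly - own_pos[1]]
    # Agents that cannot be seen contribute nothing.
    for other in other_agents:
        if other["visible"]:
            ox, oy = other["pos"]
            obs += [ox - own_pos[0], oy - own_pos[1]]
    # Messages that cannot be heard contribute nothing either.
    for other in other_agents:
        if other["message"] is not None:
            obs += list(other["message"])
    return obs


landmarks = [(1.0, 1.0)]
seen_and_heard = {"pos": (0.5, 0.0), "visible": True, "message": [1.0, 0.0]}
hidden = {"pos": (-0.5, 0.0), "visible": False, "message": None}

partial = build_observation((0.0, 0.0), (0.0, 0.0), landmarks, [seen_and_heard, hidden])
full = build_observation((0.0, 0.0), (0.0, 0.0), landmarks, [seen_and_heard, seen_and_heard])
print(len(partial), len(full))  # prints 10 14
```

The observer that cannot perceive the second agent ends up with a shorter vector, which is why observation sizes can differ between agents in the same environment.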

Action Space

The action space is discrete and represents the combinations of movements and communications an agent can perform. Agents that can move choose one of the 4 cardinal directions or do nothing. Agents that can communicate choose one of between 2 and 10 communication options (the number depends on the environment), and the chosen message is broadcast to all agents that can hear it.
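Because each discrete action stands for one combination of a movement and a message, the size of the action space is the product of the two sets of options. The sketch below illustrates that arithmetic. The helper names, the order of the moves, and the way a flat index is split into its two parts are assumptions made for this example, not a description of the library's internals.

```python
# Five movement options: do nothing plus the 4 cardinal directions.
# The ordering of this list is an assumption.
MOVES = ["no_action", "left", "right", "down", "up"]


def action_space_size(can_move, n_messages):
    """Number of discrete actions for an agent with the given abilities."""
    size = 1
    if can_move:
        size *= len(MOVES)
    if n_messages > 0:
        size *= n_messages
    return size


def decode_action(index, can_move, n_messages):
    """Split a flat action index into (move, message).

    None marks a part that the agent does not have.
    """
    if not 0 <= index < action_space_size(can_move, n_messages):
        raise ValueError("action index out of range")
    move = None
    message = None
    if can_move:
        move = MOVES[index % len(MOVES)]
        index //= len(MOVES)
    if n_messages > 0:
        message = index
    return move, message


print(action_space_size(True, 10))   # prints 50
print(decode_action(7, True, 10))    # prints ('right', 1)
```

An agent that can move and has 10 communication options therefore has 5 x 10 = 50 actions, while one that can only move has 5 and one that can only speak has as many actions as it has messages.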


Rendering

Rendering displays the scene in a window that automatically grows if agents wander beyond its border. Communication is rendered at the bottom of the scene. The render() method also returns the pixel map of the rendered area.
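One simple way to get a view that grows with the agents is to size it from the entity furthest from the centre. The sketch below shows that idea under stated assumptions: `view_half_width` is a hypothetical helper, the view is taken to be a square centred on the origin, and the minimum half-width of 1.0 is an arbitrary choice. The renderer's actual scaling rule may differ.

```python
def view_half_width(positions, minimum=1.0):
    """Half-width of a square view, centred on the origin, that contains every point."""
    furthest = max((max(abs(x), abs(y)) for x, y in positions), default=0.0)
    # Never shrink below the minimum, but grow when something wanders further out.
    return max(minimum, furthest)


print(view_half_width([(0.2, -0.5), (0.9, 0.1)]))  # prints 1.0
print(view_half_width([(0.2, -0.5), (2.5, 0.1)]))  # prints 2.5
```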


Citation

The MPE environments were originally described in the following work:

@article{mordatch2017emergence,
  title={Emergence of Grounded Compositional Language in Multi-Agent Populations},
  author={Mordatch, Igor and Abbeel, Pieter},
  journal={arXiv preprint arXiv:1703.04908},
  year={2017}
}

However, they were first released as part of this work:

@article{lowe2017multi,
  title={Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments},
  author={Lowe, Ryan and Wu, Yi and Tamar, Aviv and Harb, Jean and Abbeel, Pieter and Mordatch, Igor},
  journal={Neural Information Processing Systems (NIPS)},
  year={2017}
}

Please cite one or both of these if you use these environments in your research.