- Desired Features: simulation with a depth limit, simulation with a cost limit, and simulation until the reward function can be calculated (i.e. simulation to completion).

- Action Sequence Distribution:

- Currently, the probability distribution over the action sequences being communicated is not calculated as described in the paper. Instead, each sequence's probability is simply set proportional to its local reward.
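
The three stopping conditions listed under Desired Features could share a single rollout loop. The following is a minimal sketch, not the project's implementation: `rollout`, `legal_actions`, `step`, `is_terminal`, `max_depth`, and `max_cost` are hypothetical names, and it assumes `step` returns the next state together with a non-negative cost for the action taken.

```python
import random
from typing import Any, Callable, Optional, Sequence, Tuple


def rollout(
    state: Any,
    legal_actions: Callable[[Any], Sequence[Any]],
    step: Callable[[Any, Any], Tuple[Any, float]],
    is_terminal: Callable[[Any], bool],
    max_depth: Optional[int] = None,
    max_cost: Optional[float] = None,
    rng: Optional[random.Random] = None,
) -> Tuple[Any, int, float]:
    """Random rollout that stops at the first of: terminal state,
    depth limit, or cost limit. Returns (final_state, depth, total_cost).

    All names here are illustrative assumptions, not an existing API.
    """
    rng = rng or random.Random()
    depth = 0
    total_cost = 0.0
    while not is_terminal(state):
        # Depth limit: stop once max_depth actions have been applied.
        if max_depth is not None and depth >= max_depth:
            break
        actions = legal_actions(state)
        if not actions:
            break  # dead end: nothing left to simulate
        action = rng.choice(actions)
        next_state, cost = step(state, action)
        # Cost limit: do not apply an action that would exceed the budget.
        if max_cost is not None and total_cost + cost > max_cost:
            break
        state = next_state
        total_cost += cost
        depth += 1
    return state, depth, total_cost
```

Leaving both `max_depth` and `max_cost` as `None` gives simulation to completion, i.e. the loop runs until `is_terminal` reports that the reward function can be evaluated.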
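
For reference, the current behaviour noted under Action Sequence Distribution amounts to normalising the local rewards. The sketch below illustrates that simplification only, not the update described in the paper. The function name `proportional_distribution` and the dictionary layout are hypothetical, and it assumes local rewards are non-negative.

```python
from typing import Dict, Hashable


def proportional_distribution(rewards: Dict[Hashable, float]) -> Dict[Hashable, float]:
    """Map each action sequence to a probability proportional to its local reward.

    `rewards` maps an action sequence (e.g. a tuple of actions) to its
    local reward. Assumes rewards are non-negative (an assumption of this sketch).
    """
    if not rewards:
        return {}
    if any(r < 0 for r in rewards.values()):
        raise ValueError("this sketch assumes non-negative local rewards")
    total = sum(rewards.values())
    if total == 0:
        # No sequence has any reward yet: fall back to a uniform distribution.
        uniform = 1.0 / len(rewards)
        return {seq: uniform for seq in rewards}
    return {seq: r / total for seq, r in rewards.items()}
```

For example, `proportional_distribution({("a",): 1.0, ("b",): 3.0})` assigns the second sequence three times the probability of the first.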