Commit d76cd129 authored by Fernando Arbeiza, committed by GitHub

Effectively apply weights from the replay buffer

It seems that the weights retrieved from the replay buffer are not applied when training the model. Is there any reason for that or am I missing something?

In any case, I have added a parameter so that they can be used, in case it is useful.
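
For context, the new flag is only meaningful when prioritized replay is enabled. A hypothetical call might look like the sketch below; apart from the two `prioritized_*` keyword arguments visible in the diff, the imports, `q_func`, and the environment name are assumptions about the usual baselines `deepq` API, not anything introduced by this commit.

import gym
from baselines import deepq

# Hypothetical usage sketch -- only prioritized_replay and
# prioritized_importance_sampling come from this change; the other
# arguments follow the customary baselines deepq API and may differ here.
env = gym.make("CartPole-v0")
model = deepq.models.mlp([64])

act = deepq.learn(
    env,
    q_func=model,
    prioritized_replay=True,               # sample transitions by TD-error priority
    prioritized_importance_sampling=True,  # new flag: actually pass the IS weights to train()
)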
parent 0778e9f1
@@ -89,6 +89,7 @@ def learn(env,
           gamma=1.0,
           target_network_update_freq=500,
           prioritized_replay=False,
+          prioritized_importance_sampling=False,
           prioritized_replay_alpha=0.6,
           prioritized_replay_beta0=0.4,
           prioritized_replay_beta_iters=None,
@@ -232,7 +233,10 @@ def learn(env,
             else:
                 obses_t, actions, rewards, obses_tp1, dones = replay_buffer.sample(batch_size)
                 weights, batch_idxes = np.ones_like(rewards), None
-            td_errors = train(obses_t, actions, rewards, obses_tp1, dones, np.ones_like(rewards))
+            if prioritized_importance_sampling:
+                td_errors = train(obses_t, actions, rewards, obses_tp1, dones, weights)
+            else:
+                td_errors = train(obses_t, actions, rewards, obses_tp1, dones, np.ones_like(rewards))
             if prioritized_replay:
                 new_priorities = np.abs(td_errors) + prioritized_replay_eps
                 replay_buffer.update_priorities(batch_idxes, new_priorities)
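
Why the weights matter: with prioritized replay, transitions are sampled non-uniformly, and the importance-sampling weights w_i ∝ (N · P(i))^(-beta) (annealed from prioritized_replay_beta0 over prioritized_replay_beta_iters) are meant to scale each sample's contribution to the loss so the gradient estimate stays unbiased. Passing np.ones_like(rewards), as the removed line did, silently drops that correction. Below is a minimal numpy sketch of the intended effect; the function name is illustrative and is not the train() op built by baselines.

import numpy as np

# Illustrative only: shows how per-sample importance weights typically enter
# a TD loss; not the actual graph that baselines' build_train constructs.
def weighted_td_loss(td_errors, weights, delta=1.0):
    """Mean Huber loss over the batch, each sample scaled by its IS weight.

    With weights of all ones (what the old code effectively passed), this
    collapses to a plain unweighted mean, i.e. no importance-sampling
    correction at all.
    """
    abs_err = np.abs(td_errors)
    huber = np.where(abs_err <= delta,
                     0.5 * td_errors ** 2,
                     delta * (abs_err - 0.5 * delta))
    return np.mean(weights * huber)

Down-weighting transitions that are drawn more often than uniform sampling would draw them keeps the expected gradient close to what uniform replay would produce, which is the point of the beta-annealed correction in prioritized experience replay.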