[rllib] Add vf clipping param to fix pendulum example by ericl · Pull Request #2921 · ray-project/ray

ericl · 2018-09-19T22:20:36Z

What do these changes do?

The vf clip param is sensitive to the scale of the rewards. This broke the pendulum tuned example when clipping was fixed. cc @eugenevinitsky
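To illustrate the sensitivity described above, here is a minimal sketch of the commonly used PPO-style clipped value loss for a single sample. It is not necessarily the exact code in ppo.py; the function name `clipped_vf_loss` and the numbers below are illustrative assumptions, with the return scale loosely borrowed from the Pendulum rewards mentioned in this thread.

```python
def clipped_vf_loss(value, old_value, target, vf_clip_param):
    """PPO-style clipped value loss for one sample (illustrative sketch)."""
    # Plain squared error between the new prediction and the return target.
    loss_unclipped = (value - target) ** 2
    # Limit how far the new prediction may move from the old prediction.
    delta = max(-vf_clip_param, min(vf_clip_param, value - old_value))
    clipped_value = old_value + delta
    loss_clipped = (clipped_value - target) ** 2
    # Pessimistic choice: keep the larger of the two losses.
    return max(loss_unclipped, loss_clipped)


# Hypothetical numbers: old prediction 0, return target -900.
# With a small clip, moving the prediction toward the target no longer
# changes the loss once it is more than vf_clip_param from the old value.
small_a = clipped_vf_loss(-5, 0, -900, vf_clip_param=1.0)
small_b = clipped_vf_loss(-50, 0, -900, vf_clip_param=1.0)

# With a clip wider than the return scale, the better prediction is rewarded.
wide = clipped_vf_loss(-50, 0, -900, vf_clip_param=1000.0)
```

In this sketch `small_a` and `small_b` are identical, so the value function gets no signal for improving beyond the clip range, whereas `wide` is lower than both. That is one way a clip value chosen for small rewards can stall learning when returns are in the hundreds.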

Related issue number

#2233

richardliaw · 2018-09-19T22:39:07Z

Any plots?

ericl · 2018-09-19T22:47:05Z

You get about the same performance as before: -900 within about 100k, and -140 by 300k ish.

AmplabJenkins · 2018-09-19T23:54:50Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8301/

AmplabJenkins · 2018-09-20T00:31:19Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8303/

eugenevinitsky · 2018-09-21T23:44:43Z

This is probably why PPO hasn't been working for us; our rewards are on the wrong scale too. Thanks for this fix!

ericl · 2018-09-22T00:23:16Z

I'm wondering if we should disable VF clipping by default (i.e. set it to 9999); it seems easy to run into issues.

ericl · 2018-09-22T00:26:12Z

Changed it to 10.0 by default.
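As a rough sketch of what the two options discussed here could look like in a PPO config dict: the key name `vf_clip_param` is an assumption (the thread only refers to it as the "vf clip param"), and the dict names are hypothetical. The values 10.0 and 9999 are the ones mentioned in this thread.

```python
# Assumed key name; the thread only calls it the "vf clip param".
ppo_config = {
    "vf_clip_param": 10.0,  # default chosen in this PR, per the comment above
}

# Effectively disable value function clipping by using a clip range far
# larger than any value prediction change you expect (9999 as suggested above).
no_clip_config = dict(ppo_config, vf_clip_param=9999)
```

Users whose returns are much larger than the default may want to raise the value accordingly rather than rely on it.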

eugenevinitsky · 2018-09-22T01:08:10Z

I think that's pretty compelling; it's kind of a hidden failure mode. I'm not sure there's a consensus on how to appropriately scale your rewards, so I could easily imagine users using relatively large rewards.

AmplabJenkins · 2018-09-22T02:10:35Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8318/

richardliaw

I would be ok with turning off clipping by default, but 10.0 is fine too.

add vf clip

b87677b

ericl assigned richardliaw Sep 19, 2018

fix test

ab180a8

Update ppo.py

fdeb2c9

richardliaw approved these changes Sep 23, 2018

ericl merged commit 8331d1e into ray-project:master Sep 23, 2018

ericl merged 3 commits into ray-project:master from ericl:vfclip

ericl commented Sep 19, 2018

richardliaw commented Sep 19, 2018

ericl commented Sep 19, 2018 via email

AmplabJenkins commented Sep 19, 2018

AmplabJenkins commented Sep 20, 2018

eugenevinitsky commented Sep 21, 2018

ericl commented Sep 22, 2018

ericl commented Sep 22, 2018

eugenevinitsky commented Sep 22, 2018

AmplabJenkins commented Sep 22, 2018

richardliaw left a comment

Labels

4 participants

Uh oh!

Conversation

ericl commented Sep 19, 2018

What do these changes do?

Related issue number

richardliaw commented Sep 19, 2018

ericl commented Sep 19, 2018 via email

AmplabJenkins commented Sep 19, 2018

AmplabJenkins commented Sep 20, 2018

eugenevinitsky commented Sep 21, 2018

ericl commented Sep 22, 2018

ericl commented Sep 22, 2018

eugenevinitsky commented Sep 22, 2018

AmplabJenkins commented Sep 22, 2018

richardliaw left a comment

Choose a reason for hiding this comment

Labels

4 participants