Skip to content

[rllib] Add vf clipping param to fix pendulum example#2921

Merged
ericl merged 3 commits into
ray-project:masterfrom
ericl:vfclip
Sep 23, 2018
Merged

[rllib] Add vf clipping param to fix pendulum example#2921
ericl merged 3 commits into
ray-project:masterfrom
ericl:vfclip

Conversation

@ericl

@ericl ericl commented Sep 19, 2018

Copy link
Copy Markdown
Contributor

What do these changes do?

The vf clip param is sensitive to the scale of the rewards. This broke the pendulum tuned example when clipping was fixed. cc @eugenevinitsky

Related issue number

#2233

@richardliaw

Copy link
Copy Markdown
Contributor

Any plots?

@ericl

ericl commented Sep 19, 2018 via email

Copy link
Copy Markdown
Contributor Author
@AmplabJenkins

Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8301/
Test PASSed.

@AmplabJenkins

Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8303/
Test PASSed.

@eugenevinitsky

Copy link
Copy Markdown
Contributor

this is actually probably why PPO hasn't been working for us; our rewards are on the wrong scale too. Thanks for this fix!

@ericl

ericl commented Sep 22, 2018

Copy link
Copy Markdown
Contributor Author

I'm wondering if we should disable VF clipping by default (i.e. set to 9999), it seems like it is easy to run into issue.

@ericl

ericl commented Sep 22, 2018

Copy link
Copy Markdown
Contributor Author

Changed it to 10.0 by default.

@eugenevinitsky

Copy link
Copy Markdown
Contributor

I think that's pretty compelling; it's kind of a hidden failure mode. I'm not sure there's a consensus on how to appropriately scale your rewards, so I could easily imagine users using relatively large reward

@AmplabJenkins

Copy link
Copy Markdown

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/8318/
Test PASSed.

@richardliaw richardliaw left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be ok with turning off clipping by default; but 10.0 is fine too.

@ericl ericl merged commit 8331d1e into ray-project:master Sep 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

4 participants