get action in sac_continuous_action.py #428

zichunxx · 2023-11-10T08:32:03Z

Problem Description

Hi! Thanks for this clean script, it helped me understand SAC.

I have a question about the implementation of SAC's get action function, specifically the following code snippet:

cleanrl/cleanrl/sac_continuous_action.py

Lines 139 to 141 in 2d660b6

    # Enforcing Action Bound
    log_prob -= torch.log(self.action_scale * (1 - y_t.pow(2)) + 1e-6)
    log_prob = log_prob.sum(1, keepdim=True)

What is the purpose of this? Thanks!

Checklist

I have installed dependencies via poetry install (see CleanRL's installation guideline).
I have checked that there is no similar issue in the repo.
I have checked the documentation site and found no relevant information in GitHub issues.


Howuhh · 2023-11-10T12:18:20Z

In SAC we usually use a Normal distribution coupled with tanh to bound the action space. However, after this transformation the actual distribution is no longer a plain Normal, so we cannot use its log_prob directly to get the log-probabilities of actions. This formula accounts for the transformation and gives the correct log-probabilities for the TanhNormal distribution. See Appendix C in the original paper: https://arxiv.org/pdf/1801.01290.pdf
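To make the correction above concrete, here is a minimal sketch in plain Python (standard library only, one action dimension). It is not the CleanRL code: the helper names `normal_log_prob`, `squashed_log_prob` and `integrate_density` are hypothetical, and the values for the mean, standard deviation and `action_scale` are arbitrary. It only checks numerically that subtracting `log(action_scale * (1 - tanh(x)^2))` from the Normal log-density gives a density over the bounded action that integrates to roughly 1.

```python
import math


def normal_log_prob(x, mean, std):
    """Log-density of a 1-D Normal(mean, std) at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std) - 0.5 * math.log(2 * math.pi)


def squashed_log_prob(x, mean, std, action_scale=1.0, eps=1e-6):
    """Log-density of a = action_scale * tanh(x), evaluated via the pre-tanh sample x.

    Mirrors the structure of the snippet in the question:
    log_prob -= log(action_scale * (1 - y^2) + eps), with y = tanh(x).
    """
    y = math.tanh(x)
    return normal_log_prob(x, mean, std) - math.log(action_scale * (1 - y ** 2) + eps)


def integrate_density(mean, std, action_scale=2.0, n=20000):
    """Midpoint-rule integral of the squashed density over (-action_scale, action_scale)."""
    width = 2 * action_scale / n
    total = 0.0
    for i in range(n):
        a = -action_scale + (i + 0.5) * width  # bounded action
        x = math.atanh(a / action_scale)       # invert the tanh squashing
        # eps=0.0 here so the check tests the exact change-of-variables formula
        total += math.exp(squashed_log_prob(x, mean, std, action_scale, eps=0.0)) * width
    return total


if __name__ == "__main__":
    print(integrate_density(mean=0.3, std=0.5))  # should be close to 1
```

Without the subtracted term, the same integral would use the Normal density of `x` as if it were the density of the action, and it would in general not come out to 1.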

zichunxx · 2023-11-11T01:20:52Z

Thanks for your generous help @Howuhh. Is the 1e-6 there to keep the logarithm from going to negative infinity?
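A small sketch of what the question is getting at, again in plain Python rather than PyTorch and with a hypothetical helper name `jacobian_log_term`: when tanh saturates to exactly 1.0 in floating point, `1 - y^2` becomes 0, and the logarithm is only defined if something like the 1e-6 term is added. Note that `math.log(0.0)` raises `ValueError`, whereas `torch.log` of a zero tensor returns `-inf`, so the failure looks different in the two settings.

```python
import math


def jacobian_log_term(x, action_scale=1.0, eps=1e-6):
    """log(action_scale * (1 - tanh(x)^2) + eps), the term subtracted from log_prob."""
    y = math.tanh(x)
    return math.log(action_scale * (1 - y ** 2) + eps)


# For a large pre-tanh value, tanh rounds to exactly 1.0, so 1 - y^2 == 0.0.
saturated = jacobian_log_term(30.0)  # finite, equal to log(1e-6)

try:
    jacobian_log_term(30.0, eps=0.0)  # log(0.0) is undefined
    raised = False
except ValueError:
    raised = True

if __name__ == "__main__":
    print(saturated, raised)
```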
