[rllib]State input's shape about custom env based on gym's env #10024

Closed
lsylusiyao opened this issue Aug 10, 2020 · 5 comments
Labels
enhancement Request for new feature and/or capability P3 Issue moderate in impact or severity question Just a question :)

Comments

lsylusiyao commented Aug 10, 2020

What is your question?

I've created a custom env, which basically looks like this.

import gym
import numpy as np


class CustomEnv(gym.Env):  # class name is illustrative
    def __init__(self):
        self.action_space = gym.spaces.Discrete(9)
        self.observation_space = gym.spaces.Tuple((
            gym.spaces.MultiBinary((5, 4)),
            gym.spaces.Box(
                low=np.zeros(2, dtype=np.float32),
                high=np.array([100, 100], dtype=np.float32),
            ),
        ))

    def reset(self):
        a = np.zeros((5, 4), dtype=np.int32)
        b = np.array([100, 100], dtype=np.float32)
        return a, b

    def step(self, action):
        # I've debugged and it seems this function is never entered, so I
        # believe the two functions above should be enough.
        # ... (environment logic omitted)
        # `states` has the same structure as (a, b) in `reset`
        return states, reward, done, {}

The problem is that when I write reset like this, line 375 in modelv2.py, _unpack_obs(obs, obs_space.original_space, tensorlib=tensorlib), raises: reshape(): argument 'shape' must be tuple of ints, but found element of type tuple at pos 2. I later found that I need to add a new axis to each ndarray, so I added a = a[np.newaxis, :] and b = b[np.newaxis, :] in reset.
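One possible reading of the reshape error, sketched in plain Python. It assumes (not confirmed in this thread) that the installed gym version builds the MultiBinary shape by wrapping its argument in a 1-tuple, and that the unpacking code builds a batched reshape target from that shape. The helpers `naive_shape` and `batched_target` are hypothetical stand-ins, not gym or RLlib functions.

```python
def naive_shape(n):
    # Assumed behaviour: wrap the argument in a 1-tuple without
    # checking whether it is already a tuple of dimensions.
    return (n,)


def batched_target(space_shape):
    # A batched reshape target: a leading -1 followed by the space's shape.
    return [-1] + list(space_shape)


flat_target = batched_target(naive_shape(20))        # int argument
nested_target = batched_target(naive_shape((5, 4)))  # tuple argument

print(flat_target)    # [-1, 20]
print(nested_target)  # [-1, (5, 4)]

# The second element of the nested target is a tuple rather than an int,
# which is the kind of value a reshape call rejects.
print(all(isinstance(d, int) for d in nested_target))  # False
```

Under that assumption, the tuple sitting in the second position would match the "found element of type tuple at pos 2" wording of the error.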

However, after doing this, it fails again at line 60 in preprocessors.py, if not self._obs_space.contains(observation), with the error "Observation outside expected value range" (I'm sure it's not caused by the values in the Box). I stepped into self._obs_space.contains(observation) in tuple.py and found that the problem occurs on line 28, space.contains(part) for (space,part) in zip(self.spaces,x), which returns [True, False].
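A minimal sketch of why the second entry may come back False once the extra axis is added. It assumes the Box membership test compares the observation's shape with the space's shape before checking bounds. `box_like_contains` is a simplified, hypothetical stand-in written with NumPy only, not gym's own implementation.

```python
import numpy as np


def box_like_contains(x, low, high):
    # Simplified stand-in: the shape must match exactly, then every
    # element must lie within [low, high].
    x = np.asarray(x)
    return (
        x.shape == low.shape
        and bool(np.all(x >= low))
        and bool(np.all(x <= high))
    )


low = np.zeros(2, dtype=np.float32)
high = np.array([100, 100], dtype=np.float32)
b = np.array([100, 100], dtype=np.float32)

print(box_like_contains(b, low, high))                 # True: shape (2,)
print(box_like_contains(b[np.newaxis, :], low, high))  # False: shape (1, 2)
```

If this matches the installed version, the values are within range but the shape (1, 2) no longer equals the declared shape (2,), so the observation is rejected even though no element is out of bounds.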

I'm new to gym and rllib, and I'm a little confused about the input. Are there any suggestions? Thanks.

Ray version and other system information (Python version, TensorFlow version, OS):
Ray: 0.8.6; Python 3.8.5; PyTorch 1.6.0; Windows 10 2004

@lsylusiyao lsylusiyao added the question Just a question :) label Aug 10, 2020
@lsylusiyao lsylusiyao changed the title [rllib]State input about custom env based on gym's env [rllib]State input's shape about custom env based on gym's env Aug 10, 2020
@sven1977 sven1977 self-assigned this Aug 10, 2020
@sven1977
Contributor

Hey @lsylusiyao thanks for filing this. Yeah, looks like a bug or at least something we don't support yet (MultiBinary space). We should have a Bernoulli distribution or MultiBernoulli for these cases. ...

@sven1977 sven1977 added enhancement Request for new feature and/or capability P3 Issue moderate in impact or severity rllib labels Aug 10, 2020
@lsylusiyao
Author

However, MultiBinary seems to be sufficient for my problem, and the output of sample looks right to me. The problem could still be in the dimensions or in the Box.

@lsylusiyao
Author

Oh, and today I tried switching from Tuple to Dict, and the same problem occurred in the same way at the same place...

@sven1977
Contributor

Yes, that could be. I'm currently adding a better test case for complex action spaces (including MultiBinary components). Some special combinations do fail currently (even w/o MultiBinary). ...

@lsylusiyao
Author

Thanks to @panda361, I finally solved this problem. It seems that the bug is in Gym rather than in Ray, so I made a pull request for it:
#2023 for Gym
Also, thanks to @sven1977 for helping me.


2 participants