
python - OpenAI Gym observation space shape problem

Reposted. Author: 行者123. Updated: 2023-12-02 22:46:12

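I'm trying to solve the game of Yatzee once and for all using reinforcement learning. Sadly, when I checked my gym environment for compatibility with Stable Baselines, it complained about the shape of my observation space. So I put print statements in the constructor, right after the spaces are created, to tell me the shape of the observation space.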


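import gym
import numpy as np


class YatzeeEnv:
    # Game-state buffer with 19 integer entries.
    game_state = np.zeros(19, np.int32)

    def __init__(self):
        self.action_space = gym.spaces.Discrete(19)
        # Intended as an observation vector of 19 integers.
        self.observation_space = gym.spaces.MultiDiscrete(19)

        # game_state_adresses, reroll() and reroll_state belong to the
        # asker's class but are not shown in the question.
        for x in self.game_state_adresses:
            self.game_state[x] = -1
        self.reroll()
        self.game_state[self.reroll_state] = 0
        print("np array shape:", self.game_state.shape)
        print("Observation space shape:", self.observation_space.shape)


a = YatzeeEnv()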

Unfortunately, the output is:

np array shape: (19,)
Observation space shape: ()

Why is that? I thought gym.spaces.MultiDiscrete(19) defines the observation space as an int array with 19 values.

Best answer

From the documentation...

This represents the cartesian product of arbitrary :class:`Discrete` spaces.
It is useful to represent game controllers or keyboards where each key can be represented as a discrete action space.
Note:
Some environment wrappers assume a value of 0 always represents the NOOP action.
e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:
1. Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4
2. Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
3. Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
It can be initialized as ``MultiDiscrete([ 5, 2, 2 ])`` such that a sample might be ``array([3, 1, 0])``.
Although this feature is rarely used, :class:`MultiDiscrete` spaces may also have several axes
if ``nvec`` has several axes:
Example::
>> d = MultiDiscrete(np.array([[1, 2], [3, 4]]))
>> d.sample()
array([[0, 0],
       [2, 3]])
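To make the "cartesian product" wording in the docstring concrete, here is a minimal sketch that uses only NumPy. The helper `sample_multi_discrete` is hypothetical and is not gym's own sampling code; it only assumes that each component i takes an integer value from 0 to nvec[i] - 1, as the controller example describes.

```python
import numpy as np

# nvec lists how many values each component can take, as in MultiDiscrete([5, 2, 2]).
nvec = np.array([5, 2, 2])


def sample_multi_discrete(nvec, rng):
    """Hypothetical helper: draw component i uniformly from 0 .. nvec[i] - 1."""
    # `high` is exclusive and may be an array, so the result has nvec's shape.
    return rng.integers(low=0, high=nvec)


rng = np.random.default_rng(0)
sample = sample_multi_discrete(nvec, rng)

print(sample.shape)         # (3,), the same shape as nvec
print(int(np.prod(nvec)))   # 20 distinct points in the product space
```

The same helper works when nvec has several axes, such as the 2x2 array in the docstring example, because the sample always inherits nvec's shape.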

If you only have a single action space, you don't need MultiDiscrete. Alternatively, use MultiDiscrete([19]).
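The empty shape in the question can be reproduced without gym, on the assumption that MultiDiscrete stores nvec as a NumPy array and reports that array's shape as its own. The helper `space_shape` below is hypothetical, and the value 6 per entry is made up purely for illustration; the real per-entry counts depend on the game state.

```python
import numpy as np


def space_shape(nvec):
    """Hypothetical helper: the shape a MultiDiscrete-style space would
    report, assuming it is simply the shape of nvec as a NumPy array."""
    return np.asarray(nvec).shape


print(space_shape(19))        # ()     scalar nvec, as in the question
print(space_shape([19]))      # (1,)   one component with 19 possible values
print(space_shape([6] * 19))  # (19,)  19 components (6 values each is made up)
```

Under that assumption, MultiDiscrete([19]) describes a single integer with 19 possible values, while an observation made of 19 integers needs an nvec with 19 entries, one count per entry.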

Regarding python - OpenAI Gym observation space shape problem, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73990048/
