
python - AttributeError: 'Tensor' object has no attribute 'append'

Reposted · Author: 行者123 · Updated: 2023-12-05 05:11:48

I can't figure out why this code doesn't work. When I try to put the reward into a list, I get an error message telling me the dimensions are wrong. I'm not sure what to do about it.

I am implementing a deep Q network for reinforcement learning. r is a 2-D numpy array of 1 divided by the distance between stops, so that closer stops yield a higher reward.
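The reward matrix described above can be sketched as follows. This is a minimal illustration assuming each stop has 2-D coordinates (the coords array here is made up; the question does not show how r is built):

```python
import numpy as np

# Hypothetical stop coordinates (not from the question).
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Pairwise Euclidean distances between all stops.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)

# r[i][j] = 1 / distance(i, j); a stop's distance to itself is 0,
# so guard against division by zero and leave those entries at 0.
with np.errstate(divide='ignore'):
    r = np.where(dist > 0, 1.0 / dist, 0.0)

# Closer stops get a larger reward: dist(0,1)=5 -> 0.2, dist(0,2)=10 -> 0.1
```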

No matter what I do, I can't get the rewards to work properly. I'm new to TensorFlow, so this may just come down to my inexperience with things like TensorFlow placeholders and feed dicts.

Thanks in advance for your help.

observations = tf.placeholder('float32', shape=[None, num_stops])

# game states: r[stop], r[next_stop], r[third_stop]

actions = tf.placeholder('int32', shape=[None])

rewards = tf.placeholder('float32', shape=[None])  # +1, -1 with discounts

Y = tf.layers.dense(observations, 200, activation=tf.nn.relu)
Ylogits = tf.layers.dense(Y, num_stops)

sample_op = tf.random.categorical(logits=Ylogits, num_samples=1)

cross_entropies = tf.losses.softmax_cross_entropy(onehot_labels=tf.one_hot(actions, num_stops), logits=Ylogits)

loss = tf.reduce_sum(rewards * cross_entropies)

optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001, decay=.99)
train_op = optimizer.minimize(loss)




visited_stops = []
steps = 0

with tf.Session() as sess:

    sess.run(tf.global_variables_initializer())

    # Start at a random stop, initialize done to false
    current_stop = random.randint(0, len(r) - 1)
    done = False

    # reset everything
    while not done:  # play a game in x steps

        observations_list = []
        actions_list = []
        rewards_list = []

        # List all stops and their scores
        observation = r[current_stop]

        # Add the stop to the list of visited stops if it isn't
        # already there
        if current_stop not in visited_stops:
            visited_stops.append(current_stop)

        # decide where to go
        action = sess.run(sample_op, feed_dict={observations: [observation]})

        # play it, output next state, reward if we got a point, and whether the game is over
        # game_state, reward, done, info = pong_sim.step(action)
        new_stop = int(action)

        reward = r[current_stop][action]

        if len(visited_stops) == num_stops:
            done = True

        if steps >= BATCH_SIZE:
            done = True

        steps += 1

        observations_list.append(observation)
        actions_list.append(action)
        rewards.append(reward)

        # rewards_list = np.reshape(rewards, [-1, 25])
        current_stop = new_stop

    # processed_rewards = discount_rewards(rewards, args.gamma)
    # processed_rewards = normalize_rewards(rewards, args.gamma)

    print(rewards)
    sess.run(train_op, feed_dict={observations: [observations_list],
                                  actions: [actions_list],
                                  rewards: [rewards_list]})

Best Answer

rewards.append(reward) is what causes the error. That is because your rewards variable is a tensor, as you defined it in rewards = tf.placeholder('float32', shape=[None]), and you cannot append values to a tensor like that. You probably meant to call rewards_list.append(reward).
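The distinction can be shown without TensorFlow at all: per-step rewards should accumulate in an ordinary Python list (which has an append method), and only the finished list is later fed into the rewards placeholder via feed_dict. A minimal sketch with dummy reward values:

```python
# Collect step rewards in a regular Python list, not in the placeholder tensor.
rewards_list = []                      # ordinary list -> has .append

for step_reward in [0.5, 0.2, 1.0]:    # dummy per-step rewards
    rewards_list.append(step_reward)   # correct: append to the list

# rewards_list would then be passed as the value for the `rewards`
# placeholder in feed_dict; the placeholder itself is never mutated.
```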

Also, you are initializing the variables

observations_list = []
actions_list = []
rewards_list = []

inside the loop, so on every iteration the old values are overwritten by empty lists. You probably want to move these three lines above the while not done: line.
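The corrected structure can be sketched like this, with a toy stand-in for the episode loop (the dummy observation/action/reward values are placeholders, not from the question):

```python
# Initialize the trajectory buffers ONCE, before the episode loop,
# so they accumulate across all steps of the episode.
observations_list = []
actions_list = []
rewards_list = []

done = False
step = 0
while not done:                        # toy stand-in for `while not done:`
    observations_list.append([step])   # dummy observation
    actions_list.append(step % 2)      # dummy action
    rewards_list.append(1.0)           # dummy reward
    step += 1
    done = step >= 3                   # pretend the episode ends after 3 steps

# All three steps are kept; nothing is overwritten mid-episode.
```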

Regarding python - AttributeError: 'Tensor' object has no attribute 'append', we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55153915/

Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号