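Problem description
I get the following error from the code below: TypeError: unhashable type: 'list'. The traceback points specifically at this line:
disc_rewards: discounted_normalized_rewards
Here is the code: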
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholders and sessions)

with tf.name_scope('input_'):
    inputs=tf.placeholder(tf.float32,[None,4],name='inputs')
    actions=tf.placeholder(tf.int32,[None,action_size],name='actions')
    disc_rewards=tf.placeholder(tf.float32,[None,],name='disc_rewards')
fc1=tf.keras.layers.Dense(64,activation='relu')(inputs)
fc2=tf.keras.layers.Dense(32,activation='relu')(fc1)
fc3=tf.keras.layers.Dense(16,activation='relu')(fc2)
fc4=tf.keras.layers.Dense(action_size,activation='softmax')(fc3)
out=tf.nn.softmax(fc4)
loss=tf.nn.softmax_cross_entropy_with_logits_v2(labels=actions,logits=out)
weighted_loss=loss*disc_rewards
final_loss=tf.reduce_mean(weighted_loss)
trainer=tf.train.AdamOptimizer(0.01).minimize(final_loss)
The session is then run like this:
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for episode in range(max_episodes):
        total_reward_ep=0
        state=env.reset()
        while True:
            env.render()
            action_prob=sess.run(out,feed_dict={inputs:state.reshape([1,4])})
            action = np.random.choice(range(action_prob.shape[1]),p=action_prob.ravel())
            next_state,reward,done,_=env.step(action)
            action_=np.zeros(action_size)
            action_[action]=1
            states.append(state)
            rewards.append(reward)
            actions.append(action_)
            state=next_state
            if done:
                break
        show_video()
        discounted_normalized_rewards=disc_norm(rewards)
        loss_,_ = sess.run([final_loss,trainer],feed_dict={inputs: np.vstack(np.array(states)),
                                                            actions: np.vstack(np.array(actions)),
                                                            disc_rewards: discounted_normalized_rewards
                                                            })
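The helper disc_norm is not shown in the question. As a rough illustration only, a helper with that name in a policy-gradient script usually computes discounted returns and then normalizes them. The sketch below assumes a discount factor of 0.99 and a non-empty reward list. The function name disc_norm_sketch and the gamma parameter are hypothetical, and it uses only the standard library so it runs without NumPy.

```python
import math

def disc_norm_sketch(rewards, gamma=0.99):
    """Hypothetical stand-in for the question's disc_norm helper (assumed behavior)."""
    discounted = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step accumulates the discounted future return.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        discounted[t] = running
    # Normalize to zero mean and unit variance; the epsilon avoids division by zero.
    mean = sum(discounted) / len(discounted)
    variance = sum((d - mean) ** 2 for d in discounted) / len(discounted)
    std = math.sqrt(variance)
    return [(d - mean) / (std + 1e-8) for d in discounted]

normalized = disc_norm_sketch([1.0, 1.0, 1.0])
print(len(normalized))  # 3
```

The result has one value per time step, which matches a one-dimensional disc_rewards placeholder.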
Error:
     31 loss_,_ = sess.run([final_loss,trainer],feed_dict={inputs: np.vstack(np.array(states)),
     32 actions: np.vstack(np.array(actions)),
---> 33 disc_rewards: discounted_normalized_rewards
     34 })
     35 total_reward_ep=np.sum(rewards)
TypeError: unhashable type: 'list'
I have tried many things but have not been able to resolve this. Please let me know what mistake I am making. Thanks!
Solution
No working solution to this problem has been found yet.
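One plausible cause, offered as an unconfirmed guess: the name actions is used twice. The graph code binds actions to a placeholder, but the training loop calls actions.append(action_), which a placeholder does not support. That suggests actions was rebound to a Python list somewhere not shown. In that case the feed_dict key actions is a list, and a list cannot be a dictionary key. The sketch below reproduces the effect without TensorFlow; FakePlaceholder is a hypothetical stand-in for tf.placeholder.

```python
class FakePlaceholder:
    """Hypothetical stand-in for a tf.placeholder; instances are hashable by default."""
    def __init__(self, name):
        self.name = name

# Graph-building step: `actions` names the placeholder.
actions = FakePlaceholder("actions")
feed_ok = {actions: [[0.0, 1.0]]}  # works: the key is a hashable object

# Later, an episode buffer reuses the same name (assumed; not shown in the question).
actions = []
actions.append([0.0, 1.0])

try:
    feed_bad = {actions: actions}  # the key is now a list
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # unhashable type: 'list'
```

If this is what happened, giving the buffers their own names (for example episode_states, episode_actions, episode_rewards) would keep actions bound to the placeholder. The traceback arrow on the disc_rewards line need not contradict this, since for a statement spanning several lines the reported line may not be the one holding the offending key.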