Graph disconnected: cannot obtain value for tensor KerasTensor in the inference model, although the original model fits successfully

Problem description

This is basically the same question as before, but I now have a new attention layer. Instead of adding attention manually, I am using the attention layer that Keras provides, yet I still get the same error. I think the different layer warrants a separate question; apologies if that is not the case.

I created a Seq2Seq model with self-attention in Keras for text summarization. The model fits successfully; however, when I use the same model to generate predictions, I get ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 300), dtype=tf.float32, name='input_1'), name='input_1', description="created by layer 'input_1'") at layer "embedding". The following previous layers were accessed without issue: []
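For context, Keras raises "Graph disconnected" whenever a Model's outputs depend on a tensor that is not reachable from the Model's declared inputs. A minimal sketch, purely illustrative and unrelated to the model below:

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

a = Input(shape=(4,), name="a")
b = Input(shape=(4,), name="b")
h = Dense(8)(b)  # h is computed from b, not from a

# Raises ValueError: Graph disconnected: cannot obtain value for
# tensor ... name='b' ..., because b is not listed as a model input.
broken = Model(a, h)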

Here is my model:

# Imports assumed by this snippet
from tensorflow.keras.layers import (Input, Embedding, LSTM, Bidirectional,
                                     Concatenate, Attention, TimeDistributed, Dense)
from tensorflow.keras.models import Model

# Encoder
encoder_inputs = Input(shape=(max_text_len,))

# Embedding layer
enc_emb = Embedding(x_voc, embedding_dim, trainable=True)(encoder_inputs)

# Encoder LSTM 1
encoder_lstm1 = Bidirectional(LSTM(latent_dim, return_sequences=True,
                                   return_state=True, dropout=0.4,
                                   recurrent_dropout=0.4))
(encoder_output1, forward_h1, forward_c1, backward_h1, backward_c1) = encoder_lstm1(enc_emb)

# Encoder LSTM 2 (return_sequences/return_state are required for the
# five-way unpacking below and to feed the next LSTM a sequence)
encoder_lstm2 = Bidirectional(LSTM(latent_dim, return_sequences=True,
                                   return_state=True, recurrent_dropout=0.4))
(encoder_output2, forward_h2, forward_c2, backward_h2, backward_c2) = encoder_lstm2(encoder_output1)

# Encoder LSTM 3
encoder_lstm3 = Bidirectional(LSTM(latent_dim, return_sequences=True,
                                   return_state=True, recurrent_dropout=0.4))
(encoder_outputs, forward_h, forward_c, backward_h, backward_c) = encoder_lstm3(encoder_output2)

state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])

# Set up the decoder, using the encoder states as the initial state
decoder_inputs = Input(shape=(None,))

# Embedding layer
dec_emb_layer = Embedding(y_voc, embedding_dim, trainable=True)
dec_emb = dec_emb_layer(decoder_inputs)


# Decoder LSTM (return_sequences/return_state are required for the
# three-way unpacking and for the attention layer, which expects sequences)
decoder_lstm = LSTM(latent_dim * 2, return_sequences=True,
                    return_state=True, recurrent_dropout=0.2)
(decoder_outputs, decoder_fwd_state, decoder_back_state) = \
    decoder_lstm(dec_emb, initial_state=[state_h, state_c])

# attention = dot([decoder_outputs, encoder_outputs], axes=[2, 2])
# attention = Activation('softmax')(attention)
# context = dot([attention, encoder_outputs], axes=[2, 1])
# decoder_outputs = Concatenate()([context, decoder_outputs])

attention = Attention(causal=True)([encoder_outputs, decoder_outputs])

# Dense layer
decoder_dense = TimeDistributed(Dense(y_voc, activation='softmax'))
decoder_outputs = decoder_dense(attention)

# Define the model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
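For completeness, the model is then compiled and fitted in the usual way. A minimal sketch; the optimizer, epoch count, batch size, and the padded arrays x_tr, y_tr, x_val, y_val are hypothetical placeholders:

model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy')

# Teacher forcing: the decoder sees the summary up to step t-1 and is
# trained to predict the token at step t (all arrays are hypothetical).
model.fit(
    [x_tr, y_tr[:, :-1]],
    y_tr.reshape(y_tr.shape[0], y_tr.shape[1], 1)[:, 1:],
    epochs=10,
    batch_size=128,
    validation_data=(
        [x_val, y_val[:, :-1]],
        y_val.reshape(y_val.shape[0], y_val.shape[1], 1)[:, 1:],
    ),
)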

And this is how I try to generate predictions:

from tensorflow.keras.models import load_model

model = load_model("model_self_att.h5")
encoder_inputs = model.input[0]  # input_1

encoder_outputs, forward_h, forward_c, backward_h, backward_c = model.layers[5].output  # Bi-lstm2

state_h_enc = Concatenate()([forward_h, backward_h])
state_c_enc = Concatenate()([forward_c, backward_c])

encoder_states = [state_h_enc, state_c_enc]
encoder_model = Model(encoder_inputs, encoder_states)

decoder_inputs = model.input[1]  # input_2
decoder_state_input_h = Input(shape=(latent_dim * 2,), name="input_3")
decoder_state_input_c = Input(shape=(latent_dim * 2,), name="input_4")
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_embedding = model.layers[6](decoder_inputs)
decoder_lstm = model.layers[9]
decoder_outputs, state_h_dec, state_c_dec = decoder_lstm(decoder_embedding, initial_state=decoder_states_inputs)
decoder_states = [state_h_dec, state_c_dec]

attention = model.layers[-2]([decoder_outputs, encoder_outputs])

decoder_dense = model.layers[-1]
decoder_outputs = decoder_dense(attention)
decoder_model = Model(
    [decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states
)
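For context, the point of splitting the trained network into encoder_model and decoder_model is the usual token-by-token inference loop, roughly as in the hypothetical sketch below (target_word_index, reverse_target_word_index, max_summary_len, and the start/end tokens are assumptions; the loop never actually runs here, because building decoder_model already raises the error):

import numpy as np

def decode_sequence(input_seq):
    # Encode the source text once to get the initial decoder states.
    h, c = encoder_model.predict(input_seq)

    # Start decoding from the (assumed) start-of-sequence token.
    target_seq = np.zeros((1, 1))
    target_seq[0, 0] = target_word_index['sostok']

    decoded = []
    while True:
        output_tokens, h, c = decoder_model.predict([target_seq, h, c])
        token_index = int(np.argmax(output_tokens[0, -1, :]))
        word = reverse_target_word_index[token_index]
        # Stop at the (assumed) end token or at the length cap.
        if word == 'eostok' or len(decoded) >= max_summary_len:
            break
        decoded.append(word)
        # Feed the predicted token back in as the next decoder input.
        target_seq = np.zeros((1, 1))
        target_seq[0, 0] = token_index
    return ' '.join(decoded)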

If I remove the attention layer, everything works fine, so I know the attention layer is what is causing this error :(
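The mechanics are visible in the decoder_model definition above: model.layers[-2] (the attention layer) is called on encoder_outputs, a tensor taken from the loaded training graph, and that tensor traces back to input_1. Since input_1 is not among decoder_model's inputs, Keras cannot evaluate the graph, which is exactly what the error message says. A common pattern for attention-based seq2seq inference, sketched here as an assumption rather than as the author's code, is to give the decoder model its own Input placeholder for the encoder's sequence output:

# Fresh placeholder for the encoder's per-timestep output; the final
# Bidirectional(LSTM(latent_dim)) makes the feature size latent_dim * 2.
decoder_hidden_state_input = Input(shape=(max_text_len, latent_dim * 2))

attention = model.layers[-2]([decoder_outputs, decoder_hidden_state_input])
decoder_outputs = model.layers[-1](attention)

decoder_model = Model(
    [decoder_inputs, decoder_hidden_state_input] + decoder_states_inputs,
    [decoder_outputs] + decoder_states,
)

With this pattern the encoder model also has to return the sequence output, e.g. Model(encoder_inputs, [encoder_outputs] + encoder_states), and the decoding loop passes that array to decoder_model alongside the two states.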
