Problem Description
I am trying to implement an attention mechanism on the In-Short dataset from Kaggle, but I am stuck on the input tensors of the decoder module. I use GloVe for the word embeddings and have created two embedding matrices, one for the headlines and one for the news summaries.
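For context, an embedding matrix like the two mentioned above is typically built by mapping a tokenizer's `word_index` onto the GloVe vector file. A minimal sketch (the file path, function name, and dictionary shape here are my assumptions, not taken from the question):

```python
import numpy as np

def build_embedding_matrix(word_index, glove_path, embedding_dim=300):
    """Map each word in a Keras tokenizer's word_index to its GloVe vector.

    Words absent from the GloVe file keep an all-zero row.
    """
    vectors = {}
    with open(glove_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype="float32")
    # Row 0 is reserved for the padding index, hence len(word_index) + 1 rows.
    matrix = np.zeros((len(word_index) + 1, embedding_dim))
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix
```

The resulting array is what gets passed as `weights=[embedding_matrix]` to an `Embedding` layer whose `input_dim` equals the vocabulary size.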
The link to the dataset is: Click Here
The code is as follows:
print(max_len_news,max_len_headline)
K.clear_session()
embedding_dim = 300 #Size of word embeddings.
latent_dim = 500
encoder_input = Input(shape=(max_len_news,))
encoder_emb = Embedding(news_vocab,embedding_dim,weights=[embedding_matrix],trainable=True)(encoder_input) #Embedding Layer
#Three-stacked LSTM layers for encoder. Return_state returns the activation state vectors,a(t) and c(t),return_sequences return the output of the neurons y(t).
#With layers stacked one above the other,y(t) of previous layer becomes x(t) of next layer.
encoder_lstm1 = Bidirectional(LSTM(latent_dim,return_sequences=True,return_state=True,dropout=0.3,recurrent_dropout=0.2))
y_1,a_1,c_1,a_b1,c_b1 = encoder_lstm1(encoder_emb)
encoder_lstm2 = Bidirectional(LSTM(latent_dim,return_sequences=True,return_state=True,recurrent_dropout=0.2))
y_2,a_2,c_2,a_b2,c_b2 = encoder_lstm2(y_1)
encoder_lstm3 = Bidirectional(LSTM(latent_dim,return_sequences=True,return_state=True,recurrent_dropout=0.2))
encoder_output,a_enc,c_enc,a_b3,c_b3 = encoder_lstm3(y_2)
states_a=Concatenate(axis=1)([a_enc,a_b3])
states_c=Concatenate(axis=1)([c_enc,c_b3])
print(states_a.shape)
#Single LSTM layer for decoder followed by Dense softmax layer to predict the next word in summary.
decoder_input = Input(shape=(None,))
decoder_emb = Embedding(headline_vocab,embedding_dim,weights=[embedding_matrix1],trainable=True)(decoder_input)
decoder_lstm = LSTM(latent_dim,return_sequences=True,return_state=True,recurrent_dropout=0.2)
decoder_output,decoder_fwd,decoder_back = decoder_lstm(decoder_emb,initial_state=[states_a,states_c]) #Final output states of encoder last layer are fed into decoder.
#Attention Layer
attn_layer = AttentionLayer(name='attention_layer')
attn_out,attn_states = attn_layer([encoder_output,decoder_output])
decoder_concat_input = Concatenate(axis=-1,name='concat_layer')([decoder_output,attn_out])
decoder_dense = TimeDistributed(Dense(headline_vocab,activation='softmax'))
decoder_output = decoder_dense(decoder_concat_input)
model = Model([encoder_input,decoder_input],decoder_output)
model.summary()
The error message I get is as follows:
53 14
(None,2000)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-30-0e600101b74e> in <module>()
30
31 decoder_lstm =(LSTM(latent_dim,recurrent_dropout=0.2))
---> 32 decoder_output,decoder_fwd,decoder_back = decoder_lstm(decoder_emb,initial_state=([states_a,states_c])) #Final output states of encoder last layer are fed into decoder.
33
34 #Attention Layer
7 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/recurrent.py in _validate_state_spec(cell_state_sizes,init_state_specs)
633 cell_state_spec.shape[1:]).is_compatible_with(
634 tensor_shape.TensorShape(cell_state_size)):
--> 635 raise validation_error
636
637 @doc_controls.do_not_doc_inheritable
ValueError: An `initial_state` was passed that is not compatible with `cell.state_size`. Received `state_spec`=ListWrapper([InputSpec(shape=(None,1000),ndim=2),InputSpec(shape=(None,1000),ndim=2)]); however `cell.state_size` is [500,500]
Can anyone please help me figure out how to fix this?
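For anyone hitting the same traceback: the message itself names the mismatch. Concatenating the forward and backward states of a `Bidirectional(LSTM(500))` yields 1000-dimensional tensors, but `LSTM(latent_dim)` in the decoder expects initial states of size 500. One possible fix (a sketch with toy dimensions, not the asker's confirmed solution) is to give the decoder `2 * latent_dim` units so its state size matches the concatenated encoder states:

```python
import tensorflow as tf
from tensorflow.keras.layers import Input, LSTM, Bidirectional, Concatenate

latent_dim = 4  # toy size; the question uses 500

# Encoder: a bidirectional LSTM returns the output plus (h, c) per direction.
enc_in = Input(shape=(6, 8))
_, a_f, c_f, a_b, c_b = Bidirectional(LSTM(latent_dim, return_state=True))(enc_in)
state_a = Concatenate()([a_f, a_b])  # shape (None, 2 * latent_dim)
state_c = Concatenate()([c_f, c_b])

# Decoder: its unit count must equal the size of the initial states,
# i.e. 2 * latent_dim. Using LSTM(latent_dim) here reproduces the
# ValueError from the question.
dec_in = Input(shape=(None, 8))
dec_out, _, _ = LSTM(2 * latent_dim, return_sequences=True, return_state=True)(
    dec_in, initial_state=[state_a, state_c])

model = tf.keras.Model([enc_in, dec_in], dec_out)
```

Carried over to the posted code, that means building the decoder as `LSTM(latent_dim * 2, ...)`; alternatively, `states_a` and `states_c` could each be projected down to `latent_dim` with a `Dense` layer before being passed as `initial_state`.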