Assertion failed: [Condition x == y did not hold element-wise:]

Problem description

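I built a BiLSTM model with an attention layer for a sentence classification task, but training fails with an "assertion failed" error caused by a mismatch in the number of parameters. The attention layer code is shown first, followed by the error.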

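from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class attention(Layer):

    def __init__(self, return_sequences=True):
        super(attention, self).__init__()
        self.return_sequences = return_sequences

    def build(self, input_shape):
        # One score weight per feature and one bias per timestep.
        self.W = self.add_weight(name="att_weight", shape=(input_shape[-1], 1), initializer="normal")
        self.b = self.add_weight(name="att_bias", shape=(input_shape[1], 1), initializer="zeros")
        super(attention, self).build(input_shape)

    def call(self, x):
        # Score every timestep, then normalise the scores along the time axis.
        e = K.tanh(K.dot(x, self.W) + self.b)
        a = K.softmax(e, axis=1)
        output = x * a

        if self.return_sequences:
            return output  # 3D: (batch, steps, features)

        return K.sum(output, axis=1)  # 2D: (batch, features)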

When I train the model that contains the attention layer, I get the following assertion error:

Epoch 1/10
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-45-ac310033130c> in <module>()
      1 #Early stopping,Adam,dropout = 0.3,0.5,0.5
      2 #history = model.fit(sequences_matrix,Y_train,batch_size=256,epochs=5,validation_split=0.1,callbacks=[EarlyStopping(monitor='val_loss',min_delta=0.0001)])
----> 3 history = model.fit(sequences_matrix,batch_size=32,epochs=10,validation_split=0.1)

8 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name,num_outputs,inputs,attrs,ctx,name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle,device_name,op_name,
---> 60                                         inputs,num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

InvalidArgumentError:  assertion failed: [Condition x == y did not hold element-wise:] [x (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/Shape_1:0) = ] [32 1] [y (sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/strided_slice:0) = ] [32 758]
     [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/assert_equal_1/Assert/Assert (defined at <ipython-input-45-ac310033130c>:3) ]] [Op:__inference_train_function_19854]

Function call stack:
train_function

My model is:

model = Sequential()
model.add(Embedding(max_words, 768, input_length=max_len, weights=[embedding]))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(SpatialDropout1D(0.1))
model.add(Conv1D(16, kernel_size=11, activation='relu'))
model.add(Bidirectional(LSTM(16, return_sequences=True)))
model.add(attention(return_sequences=True))
model.add(BatchNormalization())
model.add(Activation('tanh'))
model.add(Dropout(0.2))
model.add(Dense(2, activation='softmax', use_bias=True, kernel_regularizer=regularizers.l1_l2(l1=1e-5, l2=1e-4), bias_regularizer=regularizers.l2(1e-4), activity_regularizer=regularizers.l2(1e-5)))
model.summary()

The shape of Y_train is:

max_words = 48369
max_len = 768
tok = Tokenizer(num_words = max_words)
tok.fit_on_texts(X_train)
sequences = tok.texts_to_sequences(X_train)
sequences_matrix = sequence.pad_sequences(sequences,maxlen = max_len)
Y_train = np.array(Y_train)
Y_test = np.array(Y_test)

print(Y_train.shape)

(43532, 1)

Solution

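Your targets are 2D, with shape (batch, 1), so the model output must be 2D as well. Set return_sequences=False in the final attention layer so that it sums over the time axis and returns a 2D tensor instead of a 3D sequence.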
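
To see where the [32 758] in the error message comes from, the shapes can be traced by hand. The following is only a NumPy sketch of the attention layer's call() from the question, not the Keras layer itself. The helper names conv1d_valid_length and attention_pool are hypothetical, the input is random data, and the Conv1D layer is assumed to use the Keras default padding ('valid') and stride (1).

```python
import numpy as np

def conv1d_valid_length(steps, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding (assumed default)."""
    return (steps - kernel_size) // stride + 1

def attention_pool(x, W, b, return_sequences):
    """NumPy restatement of the question's call(); hypothetical helper."""
    e = np.tanh(x @ W + b)                                  # (batch, steps, 1)
    e = e - e.max(axis=1, keepdims=True)                    # numerical stability only
    a = np.exp(e) / np.exp(e).sum(axis=1, keepdims=True)    # softmax over the time axis
    output = x * a                                          # (batch, steps, features)
    return output if return_sequences else output.sum(axis=1)

batch = 32                                  # batch_size used in model.fit
features = 32                               # 2 directions x 16 LSTM units
steps = conv1d_valid_length(768, 11)        # max_len=768, kernel_size=11

rng = np.random.default_rng(0)
x = rng.normal(size=(batch, steps, features))   # stand-in for the BiLSTM output
W = rng.normal(size=(features, 1))
b = np.zeros((steps, 1))

seq_out = attention_pool(x, W, b, return_sequences=True)
pooled_out = attention_pool(x, W, b, return_sequences=False)

print(steps)             # 758
print(seq_out.shape)     # (32, 758, 32)
print(pooled_out.shape)  # (32, 32)
```

With return_sequences=True the final Dense(2) layer is applied per timestep and yields (32, 758, 2), which the loss cannot match against labels of shape (32, 1). With return_sequences=False the Dense layer receives (32, 32) and yields (32, 2), one prediction per sentence. In the model from the question this corresponds to model.add(attention(return_sequences=False)).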
