ValueError: logits and labels must have the same shape ((None, 1) vs (None, 10000)) when trying to classify IMDB reviews

Problem description

I am trying to do binary classification of IMDB movie reviews with Keras. Here is the code I am using:

from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Dense(16,activation="relu",input_shape=(10000,)))
model.add(layers.Dense(16,activation="relu"))
model.add(layers.Dense(1,activation="sigmoid"))

model.compile(optimizer="rmsprop",loss="binary_crossentropy",metrics=["acc"])

history = model.fit(partial_x_train,partial_y_train,epochs=20,batch_size=512,validation_data = (x_val,y_val))

The shapes of the input tensors are as follows.

print(partial_x_train.shape) --> (15000,10000)
print(partial_y_train.shape)--> (15000,10000)
print(x_val.shape) --> (10000,10000)
print(y_val.shape) --> (10000,10000)

But when I run the program above, I get the following error:

ValueError: in user code:
ValueError: logits and labels must have the same shape ((None,1) vs (None,10000))

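I have searched through many SO questions but cannot figure out what I am doing wrong. Can someone help me avoid this error and get the model to train?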

Solution

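As the ValueError says, you are trying to compute the loss between tensors of shape (None, 1) and (None, 10000): the model outputs one probability per review, but your labels have 10000 values per review. For binary classification the labels should have shape (n,) or (n, 1). The cause would be easier to pinpoint if you posted or referenced how you built the IMDB training set. Try the IMDB dataset built into keras: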

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, models

max_features = 20000  # Only consider the top 20k words
maxlen = 200  # Only consider the first 200 words of each movie review

(x_train,y_train),(x_val,y_val) = keras.datasets.imdb.load_data(
    num_words=max_features
)

print(len(x_train),"Training sequences")
print(len(x_val),"Validation sequences")

x_train = keras.preprocessing.sequence.pad_sequences(x_train,maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val,maxlen=maxlen)

x_train.shape,y_train.shape
# ((25000,200),(25000,))
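
Before fitting, a quick check along these lines can catch the mismatch from the question early. This is only a minimal sketch: `check_binary_labels` is a hypothetical helper, not part of Keras, and it assumes the inputs and labels are NumPy arrays (or array-like).

```python
import numpy as np

def check_binary_labels(x, y):
    """Raise ValueError unless y holds one label per sample in x (hypothetical helper)."""
    y = np.asarray(y)
    n = len(x)
    if y.shape[0] != n:
        raise ValueError(f"{n} samples but {y.shape[0]} labels")
    # A Dense(1, activation="sigmoid") output has shape (None, 1), so the labels
    # must be (n,) or (n, 1), not (n, 10000) as in the question.
    if y.ndim > 2 or (y.ndim == 2 and y.shape[1] != 1):
        raise ValueError(f"expected labels of shape (n,) or (n, 1), got {y.shape}")
    return True
```

Calling `check_binary_labels(x_train, y_train)` on the arrays above should pass, while labels of shape (15000, 10000) would raise.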

Using your model:

model = models.Sequential()
model.add(layers.Dense(16,activation="relu",input_shape=(maxlen,)))
model.add(layers.Dense(16,activation="relu"))
model.add(layers.Dense(1,activation="sigmoid"))

model.compile(optimizer="rmsprop",loss="binary_crossentropy",metrics=["acc"])
model.fit(x_train,y_train,batch_size=32,epochs=2,validation_data=(x_val,y_val))
Epoch 1/2
782/782 [==============================] - 5s 4ms/step - loss: 164.2350 - acc: 0.5018 - val_loss: 1.0527 - val_acc: 0.5000
Epoch 2/2
782/782 [==============================] - 3s 4ms/step - loss: 1.0677 - acc: 0.4978 - val_loss: 0.7446 - val_acc: 0.5000
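
If you would rather keep the original `input_shape=(10000,)`, the reviews can be multi-hot encoded while the labels stay a plain 1-D vector. The question does not show its preprocessing, so this is a guess at the cause: labels of shape (15000, 10000) suggest they went through the same encoding as the reviews. The sketch below uses made-up toy data, and `vectorize_sequences` is an illustrative helper name; with the real dataset you would load it with `num_words=10000` so that every index fits.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    """Multi-hot encode lists of word indices into an (n, dimension) float matrix."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for i, sequence in enumerate(sequences):
        results[i, sequence] = 1.0  # mark every word index that occurs in review i
    return results

# Toy stand-ins for the IMDB reviews and labels (made up for illustration).
train_data = [[1, 5, 9], [2, 5], [0, 9999]]
train_labels = [1, 0, 1]

x_train = vectorize_sequences(train_data)              # encode the reviews only
y_train = np.asarray(train_labels).astype("float32")   # labels: one value per review
print(x_train.shape, y_train.shape)  # (3, 10000) (3,)
```

Multi-hot input of this kind is also usually easier for Dense layers to learn from than raw padded word indices, which may be why the accuracy in the run above stays near 0.5.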