Problem description
I am trying to do binary classification of IMDB movie reviews with Keras. Here is the code I am using.
from keras import models
from keras import layers
model = models.Sequential()
model.add(layers.Dense(16,activation="relu",input_shape=(10000,)))
model.add(layers.Dense(16,activation="relu"))
model.add(layers.Dense(1,activation="sigmoid"))
model.compile(optimizer="rmsprop",loss="binary_crossentropy",metrics=["acc"])
history = model.fit(partial_x_train,partial_y_train,epochs=20,batch_size=512,validation_data = (x_val,y_val))
The shapes of the input tensors are as follows.
print(partial_x_train.shape) --> (15000,10000)
print(partial_y_train.shape) --> (15000,10000)
print(x_val.shape) --> (10000,10000)
print(y_val.shape) --> (10000,10000)
But when I run the program above, I get the following error.
ValueError: in user code:
ValueError: logits and labels must have the same shape ((None,1) vs (None,10000))
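I have searched through many SO questions but cannot figure out what I am doing wrong. Can someone help me avoid this error and get the model to train?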
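Solution
As the ValueError states, the loss is being computed between tensors of shape (None,1) and (None,10000): the model outputs one value per review, but your labels have 10000 columns. partial_y_train and y_val should hold a single 0/1 label per review, i.e. have shapes (15000,) and (10000,). It would be clearer if you posted how you prepared your IMDB training set. Try the built-in IMDB dataset from keras.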
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers
max_features = 20000 # Only consider the top 20k words
maxlen = 200 # Keep at most 200 words per review (pad_sequences truncates from the front by default, so the last 200 are kept)
(x_train,y_train),(x_val,y_val) = keras.datasets.imdb.load_data(
num_words=max_features
)
print(len(x_train),"Training sequences")
print(len(x_val),"Validation sequences")
x_train = keras.preprocessing.sequence.pad_sequences(x_train,maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val,maxlen=maxlen)
x_train.shape,y_train.shape
# ((25000,200),(25000,))
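Note the label shape printed above, (25000,): one 0/1 value per review, which is what a single sigmoid unit with binary_crossentropy expects. Below is a minimal sketch of a check you could run on your own arrays before calling fit. check_binary_labels is a hypothetical helper written for illustration, not a Keras function, and the arrays are small stand-ins rather than the real IMDB data.

```python
import numpy as np

def check_binary_labels(x, y):
    """Return y as a float32 vector of shape (n_samples,), or raise ValueError.

    Hypothetical helper for illustration; not part of Keras.
    """
    y = np.asarray(y, dtype="float32")
    # A column vector of shape (n, 1) is flattened to (n,)
    if y.ndim == 2 and y.shape[1] == 1:
        y = y.reshape(-1)
    if y.ndim != 1 or y.shape[0] != len(x):
        raise ValueError(f"expected labels of shape ({len(x)},), got {y.shape}")
    return y

# Small stand-in arrays (assumed data, not the real IMDB set)
x = np.zeros((4, 10), dtype="float32")
good_labels = [0, 1, 1, 0]
bad_labels = np.zeros((4, 10), dtype="float32")  # same mistake as in the question

y = check_binary_labels(x, good_labels)
print(y.shape)  # (4,)
```

Passing bad_labels to the same function raises a ValueError, which points at the label array before Keras ever builds the loss.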
Using your model:
model = keras.Sequential()
model.add(layers.Dense(16,activation="relu",input_shape=(maxlen,)))
model.add(layers.Dense(16,activation="relu"))
model.add(layers.Dense(1,activation="sigmoid"))
model.compile(optimizer="rmsprop",loss="binary_crossentropy",metrics=["acc"])
model.fit(x_train,y_train,batch_size=32,epochs=2,validation_data=(x_val,y_val))
Epoch 1/2
782/782 [==============================] - 5s 4ms/step - loss: 164.2350 - acc: 0.5018 - val_loss: 1.0527 - val_acc: 0.5000
Epoch 2/2
782/782 [==============================] - 3s 4ms/step - loss: 1.0677 - acc: 0.4978 - val_loss: 0.7446 - val_acc: 0.5000
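The shape error is gone, but accuracy stays near 0.50, which is chance level for two balanced classes. A likely reason is that the Dense layers receive raw word indices, which have no numeric meaning. The input_shape=(10000,) in your original model suggests you meant to multi-hot encode each review instead. Here is a minimal sketch of that encoding, assuming the reviews are lists of integer word indices as returned by load_data with num_words=10000. vectorize_sequences is a name chosen here for illustration, and the sample reviews are made up.

```python
import numpy as np

def vectorize_sequences(sequences, dimension=10000):
    """Multi-hot encode lists of word indices into a (n_samples, dimension) matrix."""
    results = np.zeros((len(sequences), dimension), dtype="float32")
    for i, sequence in enumerate(sequences):
        # Set 1.0 at every word index that occurs in review i
        results[i, sequence] = 1.0
    return results

# Made-up reviews as lists of word indices, with one label per review
sample_reviews = [[1, 5, 5, 9], [2, 3]]
sample_labels = [1, 0]

x_sample = vectorize_sequences(sample_reviews, dimension=10)
y_sample = np.asarray(sample_labels).astype("float32")

print(x_sample.shape)  # (2, 10)
print(y_sample.shape)  # (2,)
```

Only the reviews are encoded this way. The labels stay a one-dimensional array, so the model's (None,1) output and the labels agree.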