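Problem description
I'm having trouble getting a custom metrics callback to work with TensorFlow. I've created a minimal working example below to help with troubleshooting. I am running: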
Windows 10
Python 3.6
scikit-learn==0.23.2
pandas==0.25.3
numpy==1.18.5
tensorflow==2.3.0
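Using the breast cancer binary classification dataset, I'm trying to call the custom metric posted as a solution here, but I run into the error shown in the traceback below, probably because I'm not using it correctly.
This code...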
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score,recall_score,f1_score
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import Callback
# Get binary classification dataset
data = load_breast_cancer(as_frame=True)
print(data)
df = data['data']
df['target'] = data['target']
# Train Test split
train,test = train_test_split(df,test_size = 0.10,shuffle=False)
# Define features and labels
x_train = train.iloc[:,:-1]
y_train = train.iloc[:,-1]
x_test = test.iloc[:,:-1]
y_test = test.iloc[:,-1]
# https://github.com/keras-team/keras/issues/10472#issuecomment-472543538
class Metrics(Callback):
    def __init__(self,val_data,batch_size=20):
        super().__init__()
        self.validation_data = val_data
        self.batch_size = batch_size
    def on_train_begin(self,logs={}):
        # print(self.validation_data)
        self.val_f1s = []
        self.val_recalls = []
        self.val_precisions = []
    def on_epoch_end(self,epoch,logs={}):
        batches = len(self.validation_data)
        total = batches * self.batch_size
        val_pred = np.zeros((total,1))
        val_true = np.zeros((total))
        for batch in range(batches):
            xVal,yVal = next(self.validation_data)
            val_pred[batch * self.batch_size : (batch+1) * self.batch_size] = np.asarray(self.model.predict(xVal)).round()
            val_true[batch * self.batch_size : (batch+1) * self.batch_size] = yVal
        val_pred = np.squeeze(val_pred)
        _val_f1 = f1_score(val_true,val_pred)
        _val_precision = precision_score(val_true,val_pred)
        _val_recall = recall_score(val_true,val_pred)
        self.val_f1s.append(_val_f1)
        self.val_recalls.append(_val_recall)
        self.val_precisions.append(_val_precision)
        return
# Define a function that creates a basic model
def make_deep_learning_classifier():
    model = Sequential()
    model.add(Dense(64,activation='relu',input_dim=x_train.shape[1],kernel_initializer='normal'))
    model.add(Dense(32,kernel_initializer='normal'))
    model.add(Dense(1,kernel_initializer='normal',activation='sigmoid'))
    model.compile(loss='binary_crossentropy',optimizer=Adam(),metrics=['accuracy'])
    return model
# Get our model
model = make_deep_learning_classifier()
print(model.summary())
# Define some params
batch_size = 32
# Call our custom callback
callback = [Metrics(val_data=[x_test,y_test],batch_size=batch_size)] # < Issue here?
# Start training
model.fit(x_train,y_train,epochs=1000,batch_size=batch_size,verbose=1,callbacks=callback,validation_data=(x_test,y_test))
print(Metrics.val_precisions) # < Issue here?
...produces this traceback...
File "test.py",line 91,in <module>
model.fit(x_train,y_test))
File "C:\Users\...\AppData\Local\Programs\Python\python36\lib\site-packages\tensorflow\python\keras\engine\training.py",line 108,in _method_wrapper
return method(self,*args,**kwargs)
File "C:\Users\...\AppData\Local\Programs\Python\python36\lib\site-packages\tensorflow\python\keras\engine\training.py",line 1137,in fit
callbacks.on_epoch_end(epoch,epoch_logs)
File "C:\Users\...\AppData\Local\Programs\Python\python36\lib\site-packages\tensorflow\python\keras\callbacks.py",line 416,in on_epoch_end
callback.on_epoch_end(epoch,numpy_logs)
File "test.py",line 54,in on_epoch_end
xVal,yVal = next(self.validation_data)
TypeError: 'list' object is not an iterator
When I change val_data=[x_test,y_test] to val_data=(x_test,y_test) in the callback variable, I get...
TypeError: 'tuple' object is not an iterator
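The user who posted this callback solution mentioned something about generators, but I'm not sure how they work. I'm just trying to define my own custom metric for TensorFlow / Keras. I won't be using this exact callback, but once I have it running I can modify it. I'm only providing an example that appeared to work in that GitHub post, and I'm hoping someone can point out what I'm doing wrong.
Thanks!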
Update
Following the solution below, I tried calling iter() on val_data to get a proper iterator:
iter_val_data = iter(self.validation_data)
for batch in range(batches):
    xVal,yVal = next(iter_val_data)
But then I got a "too many values to unpack" error, so I changed it to:
iter_val_data = iter(self.validation_data)
for batch in range(batches):
    xVal = next(iter_val_data)
    yVal = next(iter_val_data)
Then I got this error:
Traceback (most recent call last):
File "test.py",line 89,line 53,in on_epoch_end
val_pred[batch * self.batch_size : (batch+1) * self.batch_size] = np.asarray(self.model.predict(xVal)).round()
ValueError: Could not broadcast input array from shape (57,1) into shape (32,1)
Any ideas here? If you can, please try running the code in the same environment as above. Thanks!
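A note on the shapes in this traceback: the test split has 57 rows (10% of the 569-sample dataset, rounded up), so next(iter_val_data) presumably returned the whole x_test rather than one batch of 32, and the 57 predictions could not fit into a 32-row slice. The callback seems to expect something that hands out (x_batch, y_batch) pairs, which is what a generator can do. Below is a minimal sketch of such a generator, using toy lists in place of the real data; the helper name iter_batches is hypothetical and not part of Keras or the GitHub post.

```python
def iter_batches(x, y, batch_size):
    """Yield (x_batch, y_batch) pairs; the last pair may be shorter."""
    # A function containing `yield` is a generator function: calling it
    # returns an iterator that resumes here each time next() is called.
    for start in range(0, len(x), batch_size):
        stop = start + batch_size
        yield x[start:stop], y[start:stop]


# Toy stand-ins for x_test / y_test (hypothetical data): 57 rows.
x_val = [[float(i)] for i in range(57)]
y_val = [i % 2 for i in range(57)]

# Each step now yields a pair, so `xb, yb = ...` unpacks cleanly.
batch_sizes = [len(xb) for xb, yb in iter_batches(x_val, y_val, 32)]
print(batch_sizes)
```

Because the last batch is shorter than batch_size, result arrays sized as batches * batch_size would not match the data. Sizing them by len(x), or collecting per-batch results in a list and joining them afterwards, would likely avoid the broadcast error. The slicing above works for lists and NumPy arrays; pandas objects may need .iloc or .to_numpy() first.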
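Solution
As you can see here, and as the error message says, next() has to be used with an iterator. You are calling next() on a list; how is next() supposed to know which element comes next? For that you need an iterator, which keeps track of that state. So this should solve your problem: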
iter_val_data = iter(self.validation_data)
for batch in range(batches):
    xVal,yVal = next(iter_val_data)
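To make the difference between a list and an iterator concrete, here is a small sketch using toy lists as stand-ins for x_test and y_test (hypothetical data, not the real dataset). It also shows what iterating over a two-element list like [x_test, y_test] actually yields.

```python
# Toy stand-ins for x_test and y_test (hypothetical data).
x_val = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_val = [0, 1, 0]
val_data = [x_val, y_val]

# A list is iterable, but it is not itself an iterator, so next() rejects it.
try:
    next(val_data)
except TypeError as err:
    message = str(err)
print(message)

# iter() returns an iterator object that remembers its position.
it = iter(val_data)
first = next(it)   # the whole x_val, not a single batch
second = next(it)  # the whole y_val

# After two items the iterator is exhausted and raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True
```

Note that the iterator only walks over the two elements of the list, so each next() call returns an entire array rather than an (x, y) batch. If the callback is given [x_test, y_test], unpacking a single next() result into xVal, yVal will therefore probably not behave as the original generator-based code intended.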