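Problem description
I'm new to ML and have been working through the Udacity ML project. However, I've run into an error that I'm struggling to resolve. The code looks fine, but I can't seem to iterate over the data. I know it has something to do with the recent changes to StratifiedShuffleSplit. The code is below.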
def Stratified_Shuffle_Split(X,y,num_test):
    sss = StratifiedShuffleSplit(y,1,test_size=num_test,random_state = None)
    for train,test in sss:
        X_train,X_test = X.iloc[train],X.iloc[test]
        y_train,y_test = y.iloc[train],y.iloc[test]
    return X_train,X_test,y_train,y_test
# First,decide how many training vs test samples you want
num_all = student_data.shape[0] # same as len(student_data)
num_train = round(num_all*0.75) # about 75% of the data
num_test = num_all - num_train
#print(num_test)
y = student_data['passed'] # identify target variable
X_train,X_test,y_train,y_test = Stratified_Shuffle_Split(X_all,y,num_test)
print("Training Set: {0:.2f} Samples".format(X_train.shape[0]))
print("Testing Set: {0:.2f} Samples".format(X_test.shape[0]))
This is the error I get:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-2147158fcaf2> in <module>
     13
     14 y = student_data['passed'] # identify target variable
---> 15 X_train,X_test,y_train,y_test = Stratified_Shuffle_Split(X_all,y,num_test)
     16
     17 print("Training Set: {0:.2f} Samples".format(X_train.shape[0]))

<ipython-input-20-2147158fcaf2> in Stratified_Shuffle_Split(X,y,num_test)
      1 def Stratified_Shuffle_Split(X,y,num_test):
      2     sss = StratifiedShuffleSplit(y,1,test_size=num_test,random_state = None)
----> 3     for train,test in sss:
      4         X_train,X_test = X.iloc[train],X.iloc[test]
      5         y_train,y_test = y.iloc[train],y.iloc[test]

TypeError: 'StratifiedShuffleSplit' object is not iterable
Solution
According to the documentation, you need to call .split() on the StratifiedShuffleSplit object. .split() generates the indices used for slicing. So this part could be:
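from sklearn.model_selection import StratifiedShuffleSplit

def Stratified_Shuffle_Split(X, y, num_test):
    # The model_selection splitter takes n_splits, not y; y is passed to split() instead
    sss = StratifiedShuffleSplit(n_splits=1, test_size=num_test, random_state=None)
    # split() yields positional (train, test) index arrays, one pair per split
    for train, test in sss.split(X, y):
        X_train, X_test = X.iloc[train], X.iloc[test]
        y_train, y_test = y.iloc[train], y.iloc[test]
    return X_train, X_test, y_train, y_test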
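To see the .split() pattern run end to end, here is a minimal sketch on made-up data. The toy DataFrame, the "feature" column, and the 12/8 label mix are assumptions for illustration only and are not the Udacity student data; the splitter is built with keyword arguments as in sklearn.model_selection.

```python
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical stand-in for the student data: 20 rows, 12 "yes" and 8 "no" labels.
X = pd.DataFrame({"feature": range(20)})
y = pd.Series(["yes"] * 12 + ["no"] * 8, name="passed")

# The splitter is configured without any data; an integer test_size is an absolute count.
sss = StratifiedShuffleSplit(n_splits=1, test_size=5, random_state=0)

# split() yields (train_indices, test_indices) pairs of positional indices.
for train_idx, test_idx in sss.split(X, y):
    X_train, X_test = X.iloc[train_idx], X.iloc[test_idx]
    y_train, y_test = y.iloc[train_idx], y.iloc[test_idx]

print("Training set: {} samples".format(X_train.shape[0]))
print("Testing set: {} samples".format(X_test.shape[0]))
```

Iterating over sss itself would raise the same TypeError as in the question, because only the generator returned by split() is iterable.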
I'm also not sure you need to define a new function at all: StratifiedShuffleSplit is already a ready-made class, and the for loop you already have does what you want.
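As a rough sketch of skipping the wrapper function, the single split can be pulled straight from the generator with next(). The arrays below are made-up placeholders rather than the real data, and the fractional test_size of 0.25 is an assumption chosen to mirror the 75/25 split in the question.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical data: 20 samples with 2 features each, and binary labels (12 ones, 8 zeros).
X = np.arange(40).reshape(20, 2)
y = np.array([1] * 12 + [0] * 8)

# A float test_size is treated as a fraction of the samples.
sss = StratifiedShuffleSplit(n_splits=1, test_size=0.25, random_state=42)

# With n_splits=1 there is exactly one (train, test) pair, so next() is enough.
train_idx, test_idx = next(sss.split(X, y))
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]

print(X_train.shape, X_test.shape)
```

With a pandas DataFrame or Series in place of the NumPy arrays, the same indices would be applied through .iloc, as in the function above.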