Problem description
Here is my code, followed by its output:
print(len(image_dataset.data))
print(len(phylum_target))
X_train,X_test,y_train,y_test = train_test_split(image_dataset.data,phylum_target,test_size=0.2,random_state=109)
5000
5000
Traceback (most recent call last):
File "Image_SVM_run_only.py",line 298,in <module>
X_train_temp,X_test_temp,y_train_temp,y_test_temp = train_test_split(image_dataset.data,random_state=109)
File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/model_selection/_split.py",line 2127,in train_test_split
arrays = indexable(*arrays)
File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py",line 293,in indexable
check_consistent_length(*result)
File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py",line 257,in check_consistent_length
" samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [4999,5000]
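Even though the training data and the test data have the same length, I still get this error. Please help me T.T
Solution
This is the minimal reproducible example I could put together from the information you gave, and it works fine: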
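import numpy as np
from sklearn.model_selection import train_test_split

# Dummy data with matching sample counts: 5000 rows in both X and y
X = np.zeros((5000, 49152))
y = np.zeros((5000, 1))

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=109)
# With test_size=0.2 this gives 4000 training rows and 1000 test rows
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)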
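Since the example above runs cleanly, the mismatch probably comes from somewhere other than the snippet in the question. The traceback points at line 298, where the call assigns to X_train_temp and friends rather than the X_train shown in the question, and it reports the lengths [4999, 5000]. One thing worth trying is to print the length of every input immediately before that exact call. The sketch below is only an illustration: report_lengths is a hypothetical helper (not part of scikit-learn) that loosely mirrors the check done by check_consistent_length in the traceback, and the two lists are stand-ins for image_dataset.data and phylum_target.

```python
def report_lengths(**named_inputs):
    """Return {name: len(value)} and raise ValueError if the lengths differ.

    Hypothetical helper for debugging; it only needs len() to work on each input,
    so it accepts lists as well as NumPy arrays.
    """
    lengths = {name: len(value) for name, value in named_inputs.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Inconsistent numbers of samples: {lengths}")
    return lengths


# Stand-in data (assumption): one input is a sample short, as in the traceback.
data = [[0.0, 0.0]] * 4999
target = [0] * 5000

try:
    report_lengths(data=data, target=target)
except ValueError as err:
    message = str(err)
    print(message)
```

Calling the helper with the real variables right above line 298 would show which input is one sample short at that point in the script, which may differ from what the earlier print statements saw.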