ValueError: Found input variables with inconsistent numbers of samples: [218, 30]

Problem description

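I am training a linear SVC for face recognition on an 870x22 dataset. I have 30 frames for each of 29 different people, and I use 22 simple pixel values from each image as the features that identify the face. When I call train_test_split(), it gives me an X_test of shape 218x22 and a y_test of length 218. Once I have trained the classifier and try to run it on the images of a new face (a 30x22 matrix), it gives me this error: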

ValueError: Found input variables with inconsistent numbers of samples: [218,30]

Here is the code:

import numpy as np
import sklearn
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score, f1_score

img_amount = 30
# 29 people, 30 frames each: labels 1..29, each repeated img_amount times
target = np.repeat(np.arange(1, 30), img_amount)
# dataset is the array from the dataset link below; keep the 22 pixel features
dataset = dataset[:, 0:22]

svc_1 = SVC(kernel='linear', C=0.00005)
X_train, X_test, y_train, y_test = train_test_split(dataset, target, test_size=0.25, random_state=0)

def train(clf, X_train, X_test, y_train, y_test):
    # Fit on the training split, then report scores on both splits
    clf.fit(X_train, y_train)
    print("Accuracy on training set:")
    print(clf.score(X_train, y_train))
    print("Accuracy on testing set:")
    print(clf.score(X_test, y_test))

    y_pred = clf.predict(X_test)

    print("Classification Report:")
    print(metrics.classification_report(y_test, y_pred))
    print("Confusion Matrix:")
    print(metrics.confusion_matrix(y_test, y_pred))

train(svc_1, X_train, X_test, y_train, y_test)

print("Classification Report:")
print(metrics.classification_report(y_test, new_face_img))

To avoid cluttering the question, I uploaded the new_face_img matrix to Pastebin: https://pastebin.com/uRbvv5jD

Dataset link: Dataset

Both are plain arrays and can be assigned directly to their variables.

The error is raised when I try to evaluate the predictions for the new sample:

predictions = svc_1.predict(new_face_img)
print("Classification Report:")
print(metrics.classification_report(y_test, predictions))  # <-- this line raises the error

predictions = svc_1.predict(michael_ocluded_array) 
expected=np.ones(len(michael_ocluded_array))
print ("Confusion Matrix:")
print (metrics.confusion_matrix(expected,predictions))

Confusion Matrix:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
      1 predictions = svc_1.predict(michael_ocluded_array)
      2 print ("Confusion Matrix:")
----> 3 print (metrics.classification_report(y_test,predictions))

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     70                           FutureWarning)
     71         kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 72         return f(**kwargs)
     73     return inner_f
     74

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\metrics\_classification.py in classification_report(y_true, y_pred, labels, target_names, sample_weight, digits, output_dict, zero_division)
   1927     """
   1928
-> 1929     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
   1930
   1931     labels_given = True

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\metrics\_classification.py in _check_targets(y_true, y_pred)
     79     y_pred : array or indicator matrix
     80     """
---> 81     check_consistent_length(y_true, y_pred)
     82     type_true = type_of_target(y_true)
     83     type_pred = type_of_target(y_pred)

C:\ProgramData\Miniconda3\lib\site-packages\sklearn\utils\validation.py in check_consistent_length(*arrays)
    253     uniques = np.unique(lengths)
    254     if len(uniques) > 1:
--> 255         raise ValueError("Found input variables with inconsistent numbers of"
    256                          " samples: %r" % [int(l) for l in lengths])
    257

ValueError: Found input variables with inconsistent numbers of samples: [218, 30]

Solution

Here is the problem:

predictions = svc_1.predict(new_face_img)
print("Confusion Matrix:")
print(metrics.confusion_matrix(y_test, predictions))

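You are predicting on new_face_img but then scoring those predictions against y_test, the labels of the test split. The predictions have 30 entries and y_test has 218, so the metric functions refuse to compare them. Compare the predictions with labels for the same 30 rows instead: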
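The length check that produces this message can be reproduced on its own. The following is a minimal sketch, assuming zero-filled stand-in arrays of the two lengths from the question rather than the real labels and predictions; it calls scikit-learn's `check_consistent_length` utility, which is the function shown at the bottom of the traceback.

```python
import numpy as np
from sklearn.utils import check_consistent_length

# Stand-in arrays (assumption): only their lengths matter for this check
y_test = np.zeros(218)       # length of the test labels
predictions = np.zeros(30)   # length of the predictions for the new face

try:
    check_consistent_length(y_test, predictions)
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Any metric that takes `y_true` and `y_pred` runs this kind of check first, so both arguments have to describe the same rows.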

predictions = svc_1.predict(new_face_img)
# Replace this with the labels you actually expect; it must have shape (30,),
# one label per row of new_face_img
expected = np.ones(len(new_face_img))
print("Confusion Matrix:")
print(metrics.confusion_matrix(expected, predictions))
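To see the shapes line up end to end, here is a rough sketch on synthetic data. Everything about the data is an assumption made for illustration: the random features, the default `C`, and the choice that the new face belongs to person 1. Only the sizes (29 people, 30 frames, 22 features) come from the question.

```python
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_people, img_amount, n_features = 29, 30, 22

# Labels 1..29, each repeated 30 times (870 rows in total)
target = np.repeat(np.arange(1, n_people + 1), img_amount)
# Synthetic features (assumption): each person's rows cluster around their label value
dataset = rng.normal(loc=target[:, None], scale=0.1, size=(target.size, n_features))

X_train, X_test, y_train, y_test = train_test_split(
    dataset, target, test_size=0.25, random_state=0)
clf = SVC(kernel="linear").fit(X_train, y_train)

# 30 synthetic frames of a "new" face, assumed here to be person 1
new_face_img = rng.normal(loc=1.0, scale=0.1, size=(img_amount, n_features))
predictions = clf.predict(new_face_img)
expected = np.ones(len(new_face_img), dtype=int)  # one expected label per new row

print(predictions.shape, expected.shape, y_test.shape)
cm = metrics.confusion_matrix(expected, predictions)
print(cm)
```

The confusion matrix is built from `expected` and `predictions`, which both have 30 entries; `y_test` has 218 entries and plays no part in scoring the new face.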

Edit: to validate on the test split of the dataset instead:

predictions = svc_1.predict(X_test)
print("Confusion Matrix:")
print(metrics.confusion_matrix(y_test, predictions))
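One way to avoid mixing up the two evaluations is to always pass the features and their own labels together. The sketch below uses a hypothetical helper named `evaluate` (not part of scikit-learn) and a small synthetic dataset with 3 made-up classes, so the numbers are not those of the question.

```python
import numpy as np
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def evaluate(clf, X, y_true):
    """Hypothetical helper: predict on X and compare with the labels of those same rows."""
    y_pred = clf.predict(X)
    if len(y_pred) != len(y_true):
        raise ValueError(f"{len(y_true)} labels but {len(y_pred)} predictions")
    return metrics.confusion_matrix(y_true, y_pred)

rng = np.random.default_rng(0)
target = np.repeat(np.arange(1, 4), 30)  # 3 made-up classes, 30 rows each
dataset = rng.normal(loc=target[:, None], scale=0.1, size=(target.size, 22))

X_train, X_test, y_train, y_test = train_test_split(
    dataset, target, test_size=0.25, random_state=0)
svc = SVC(kernel="linear").fit(X_train, y_train)

# Test-split features are scored against test-split labels
cm_test = evaluate(svc, X_test, y_test)
print(cm_test)
```

Calling `evaluate(svc, new_rows, y_test)` with rows from a different source would fail early with a message that names both lengths.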