Using a custom metric in KNeighborsClassifier, I keep getting "TypeError: only integer scalar arrays can be converted to a scalar index"

Question

I am using a custom metric with scikit-learn's KNeighborsClassifier. Here is my code:

def chi_squared(x,y):
    return np.divide(np.square(np.subtract(x,y)),np.sum(x,y))

The function above implements the chi-squared distance. I used NumPy functions because, according to the scikit-learn docs, a metric function takes two one-dimensional NumPy arrays.

I pass the chi_squared function as the metric argument to KNeighborsClassifier():

knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)

However, I keep getting the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-29-d2a365ebb538> in <module>
      4 
      5 knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)
----> 6 knn.fit(X_train,Y_train)
      7 predictions = knn.predict(X_test)
      8 print(accuracy_score(Y_test,predictions))

~/.local/lib/python3.8/site-packages/sklearn/neighbors/_classification.py in fit(self,X,y)
    177             The fitted k-nearest neighbors classifier.
    178         """
--> 179         return self._fit(X,y)
    180 
    181     def predict(self,X):

~/.local/lib/python3.8/site-packages/sklearn/neighbors/_base.py in _fit(self,X,y)
    497 
    498         if self._fit_method == 'ball_tree':
--> 499             self._tree = BallTree(X,self.leaf_size,
    500                                   metric=self.effective_metric_,
    501                                   **self.effective_metric_params_)

sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.__init__()

sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree._recursive_build()

sklearn/neighbors/_ball_tree.pyx in sklearn.neighbors._ball_tree.init_node()

sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.rdist()

sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.DistanceMetric.rdist()

sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance.dist()

sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance._dist()

<ipython-input-29-d2a365ebb538> in chi_squared(x,y)
      1 def chi_squared(x,y):
----> 2     return np.divide(np.square(np.subtract(x,y)),np.sum(x,y))
      3 
      4 
      5 knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)

<__array_function__ internals> in sum(*args,**kwargs)

~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in sum(a,axis,dtype,out,keepdims,initial,where)
   2239         return res
   2240 
-> 2241     return _wrapreduction(a,np.add,'sum',keepdims=keepdims,
   2242                           initial=initial,where=where)
   2243 

~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj,ufunc,method,**kwargs)
     85                 return reduction(axis=axis,out=out,**passkwargs)
     86 
---> 87     return ufunc.reduce(obj,**passkwargs)
     88 
     89 

TypeError: only integer scalar arrays can be converted to a scalar index

   

Answer

I can reproduce your error message with:

In [173]: x=np.arange(3); y=np.array([2,3,4])
In [174]: np.sum(x,y)
Traceback (most recent call last):
  File "<ipython-input-174-1a1a267ebd82>",line 1,in <module>
    np.sum(x,y)
  File "<__array_function__ internals>",line 5,in sum
  File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py",line 2247,in sum
    return _wrapreduction(a,np.add,'sum',axis,dtype,out,keepdims=keepdims,
  File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py",line 87,in _wrapreduction
    return ufunc.reduce(obj,**passkwargs)
TypeError: only integer scalar arrays can be converted to a scalar index

Correct use of np.sum (the second positional argument is axis, not a second array to add):

In [175]: np.sum(x)
Out[175]: 3
In [177]: np.sum(np.arange(6).reshape(2,3),axis=0)
Out[177]: array([3,5,7])
In [178]: np.sum(np.arange(6).reshape(2,3),axis=1)
Out[178]: array([ 3, 12])

(Re)read the np.sum docs if necessary!
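To make the cause explicit, here is a minimal sketch of what happens when an array lands in the axis slot of np.sum. It reuses the x and y from the reproduction above; the variable names error_message, total and elementwise are only illustrative.

```python
import numpy as np

x = np.arange(3)          # array([0, 1, 2])
y = np.array([2, 3, 4])

# np.sum(a, axis=None, ...): the second positional argument is `axis`,
# so np.sum(x, y) asks NumPy to treat the array y as an axis and fails.
try:
    np.sum(x, y)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

total = np.sum(x)            # reduces x to a single number
elementwise = np.add(x, y)   # adds x and y element by element

print(error_message)
print(total, elementwise)
```

np.sum reduces one array along an axis, while np.add combines two arrays element by element, which is what the denominator of the distance needs.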

Use np.add instead of np.sum:

In [179]: np.add(x,y)
Out[179]: array([2,4,6])
In [180]: x+y
Out[180]: array([2,4,6])

The following two expressions should be equivalent:

np.divide(np.square(np.subtract(x,y)),np.add(x,y))

(x-y)**2/(x+y)
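One thing the expressions above do not cover: they return an array with one value per feature, whereas the scikit-learn docs describe a callable metric as returning a single distance value. A possible sketch, assuming the per-feature terms are meant to be summed, is shown here. The eps parameter is a hypothetical guard against division by zero and is not part of the original question, and some definitions of the chi-squared distance include an extra factor of 0.5 that is left out to stay close to the formula in the question.

```python
import numpy as np

def chi_squared(x, y, eps=1e-10):
    """Chi-squared distance between two 1-D arrays, returned as one float.

    `eps` is a hypothetical safeguard (not in the original question) that
    avoids dividing by zero when x + y is 0 for some feature.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Per-feature terms (x - y)^2 / (x + y), then summed to a scalar.
    return float(np.sum((x - y) ** 2 / (x + y + eps)))

x = np.arange(3)
y = np.array([2, 3, 4])
d = chi_squared(x, y)   # 4/2 + 4/4 + 4/6
print(d)
```

A function of this shape can be passed as metric=chi_squared. Note that the scikit-learn docs require a user-defined function used with BallTree to be a true metric; if that is in doubt for this distance, algorithm='brute' avoids the requirement.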