Problem description
I am using a custom metric with scikit-learn's KNeighborsClassifier. Here is my code:
def chi_squared(x,y):
    return np.divide(np.square(np.subtract(x,y)),np.sum(x,y))
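The function above implements the chi-squared distance function. I used NumPy functions because, according to the scikit-learn docs, the metric function takes two one-dimensional NumPy arrays.
I passed the chi_squared function as an argument to KNeighborsClassifier().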
knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)
However, I keep getting the following error:
TypeError Traceback (most recent call last)
<ipython-input-29-d2a365ebb538> in <module>
4
5 knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)
----> 6 knn.fit(X_train,Y_train)
7 predictions = knn.predict(X_test)
8 print(accuracy_score(Y_test,predictions))
~/.local/lib/python3.8/site-packages/sklearn/neighbors/_classification.py in fit(self,X,y)
177 The fitted k-nearest neighbors classifier.
178 """
--> 179 return self._fit(X,y)
180
181 def predict(self,X):
~/.local/lib/python3.8/site-packages/sklearn/neighbors/_base.py in _fit(self,X,y)
497
498 if self._fit_method == 'ball_tree':
--> 499 self._tree = BallTree(X,self.leaf_size,
500 metric=self.effective_metric_,
501 **self.effective_metric_params_)
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.__init__()
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree._recursive_build()
sklearn/neighbors/_ball_tree.pyx in sklearn.neighbors._ball_tree.init_node()
sklearn/neighbors/_binary_tree.pxi in sklearn.neighbors._ball_tree.BinaryTree.rdist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.DistanceMetric.rdist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance.dist()
sklearn/neighbors/_dist_metrics.pyx in sklearn.neighbors._dist_metrics.PyFuncDistance._dist()
<ipython-input-29-d2a365ebb538> in chi_squared(x,y)
1 def chi_squared(x,y):
----> 2 return np.divide(np.square(np.subtract(x,y)),np.sum(x,y))
3
4
5 knn = KNeighborsClassifier(algorithm='ball_tree',metric=chi_squared)
<__array_function__ internals> in sum(*args,**kwargs)
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in sum(a,axis,dtype,out,keepdims,initial,where)
2239 return res
2240
-> 2241 return _wrapreduction(a,np.add,'sum',axis,dtype,out,keepdims=keepdims,
2242 initial=initial,where=where)
2243
~/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj,ufunc,method,**kwargs)
85 return reduction(axis=axis,out=out,**passkwargs)
86
---> 87 return ufunc.reduce(obj,**passkwargs)
88
89
TypeError: only integer scalar arrays can be converted to a scalar index
Solution
I can reproduce your error message as follows:
In [173]: x=np.arange(3); y=np.array([2,3,4])
In [174]: np.sum(x,y)
Traceback (most recent call last):
File "<ipython-input-174-1a1a267ebd82>",line 1,in <module>
np.sum(x,y)
File "<__array_function__ internals>",line 5,in sum
File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py",line 2247,in sum
return _wrapreduction(a,np.add,'sum',axis,dtype,out,keepdims=keepdims,
File "/usr/local/lib/python3.8/dist-packages/numpy/core/fromnumeric.py",line 87,in _wrapreduction
return ufunc.reduce(obj,**passkwargs)
TypeError: only integer scalar arrays can be converted to a scalar index
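The reason for the failure is that the second positional parameter of np.sum is axis, so np.sum(x,y) tries to treat the array y as an axis specification. A minimal sketch of this, reusing the x and y from the session above (the variable names raised, per_element and total are only for illustration):

```python
import numpy as np

x = np.arange(3)          # array([0, 1, 2])
y = np.array([2, 3, 4])

# np.sum(a, axis, ...): the second positional argument is `axis`,
# so passing the array y there is what triggers the TypeError.
try:
    np.sum(x, y)
    raised = False
except TypeError:
    raised = True

# To sum the two arrays element by element with np.sum, stack them
# first and reduce along axis 0.
per_element = np.sum([x, y], axis=0)

# Reducing a single array with no axis gives one scalar.
total = np.sum(x)
```

Here per_element holds the element-wise sums of x and y, which is what the denominator of the distance needs.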
Correct use of np.sum:
In [175]: np.sum(x)
Out[175]: 3
In [177]: np.sum(np.arange(6).reshape(2,3),axis=0)
Out[177]: array([3,5,7])
In [178]: np.sum(np.arange(6).reshape(2,3),axis=1)
Out[178]: array([ 3, 12])
(Re)read the np.sum documentation if necessary!
Use np.add instead of np.sum:
In [179]: np.add(x,y)
Out[179]: array([2,4,6])
In [180]: x+y
Out[180]: array([2,4,6])
The following should be equivalent:
np.divide(np.square(np.subtract(x,y)),np.add(x,y))
(x-y)**2/(x+y)
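Note that both expressions return an array of per-element terms, while scikit-learn expects a callable metric to return a single value. A possible sketch of a complete metric follows, assuming the distance is the sum of those terms (some definitions add a factor of 1/2). The name chi_squared_distance and the eps guard against zero denominators are my own additions, not part of the original code:

```python
import numpy as np

def chi_squared_distance(x, y, eps=1e-12):
    """Sum of (x - y)**2 / (x + y) over all elements (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    numerator = np.square(x - y)
    denominator = x + y
    # Assumption: elements where x + y is (almost) zero contribute 0
    # instead of causing a division by zero.
    terms = np.divide(
        numerator,
        denominator,
        out=np.zeros_like(numerator),
        where=np.abs(denominator) > eps,
    )
    # Reduce to one scalar, as a distance between two vectors should be.
    return float(np.sum(terms))

x = np.arange(3)
y = np.array([2, 3, 4])
d = chi_squared_distance(x, y)       # 4/2 + 4/4 + 4/6
d_same = chi_squared_distance(x, x)  # identical vectors
```

A function like this could then be passed as metric=chi_squared_distance to KNeighborsClassifier. Whether algorithm='ball_tree' suits this particular distance is a separate question that I have not checked here.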