Testing for false positives and false negatives with numpy

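Question

I am trying to figure out how to calculate false positives and false negatives using numpy.

I am able to calculate the accurate and inaccurate prediction rates like this:

In the following examples, y_prediction is a 2D array of the predictions made on the dataset, consisting of 1s and 0s. truth_labels is the 1D array of class labels associated with the feature vectors (the 2D array).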

accurate_prediction_rate = np.count_nonzero(y_prediction == truth_labels)/truth_labels.shape[0]
inaccurate_prediction_rate = np.count_nonzero(y_prediction != truth_labels)/truth_labels.shape[0]

I then tried to calculate the false positives like this (positives in my dataset are denoted by 1):

false_positives = np.count_nonzero((y_prediction != truth_labels)/truth_labels.shape[0] & predictions == 1)

But that returns a TypeError. I am new to numpy, so I am not familiar with all the available methods. Is there a numpy method better suited to what I am trying to do?

Answer

You can do this with np.logical_and() and np.sum(). The snippet below also shows how to count true positives and true negatives.

negative = 0.0
positive = 1.0

tp = np.sum(np.logical_and(y_prediction == positive,truth_labels == positive))
tn = np.sum(np.logical_and(y_prediction == negative,truth_labels == negative))
fp = np.sum(np.logical_and(y_prediction == positive,truth_labels == negative))
fn = np.sum(np.logical_and(y_prediction == negative,truth_labels == positive))
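
To make the four counts concrete, here is a minimal runnable sketch of the same approach. The sample arrays are made up for illustration and are assumed to be 1D with the same shape; they are not the asker's data.

```python
import numpy as np

negative = 0.0
positive = 1.0

# Hypothetical sample data: one prediction and one true label per sample.
y_prediction = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=float)
truth_labels = np.array([1, 0, 0, 1, 1, 0, 1, 0], dtype=float)

# Each comparison yields a boolean mask; logical_and combines the masks
# element-wise, and np.sum counts the True entries.
tp = np.sum(np.logical_and(y_prediction == positive, truth_labels == positive))
tn = np.sum(np.logical_and(y_prediction == negative, truth_labels == negative))
fp = np.sum(np.logical_and(y_prediction == positive, truth_labels == negative))
fn = np.sum(np.logical_and(y_prediction == negative, truth_labels == positive))

print(tp, tn, fp, fn)  # 3 3 1 1
```

The four counts always add up to the number of samples, which is a quick sanity check.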

Based on your question, I tried to put together a minimal example.

import numpy as np

y_pred = np.array([[0, 1], [1, 0], [0, 1]])  # predictions, one row per sample
y_class = np.array([1, 0, 0])  # actual class, one label per sample

y_pred_class = np.argmax(y_pred, axis=1)  # extract the predicted class from each row
false_positive = np.sum((y_pred_class == 1) & (y_class == 0))
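
As for the TypeError in the question: in Python, `&` binds more tightly than `==` and `!=`, and `/` binds more tightly than `&`. The original expression therefore tries to apply `&` to a float array (the result of the division) and an integer array, which numpy does not support. Wrapping each comparison in parentheses and dividing only after counting avoids this. The sketch below uses made-up 1D data and assumes that `predictions` in the question was meant to be `y_prediction`.

```python
import numpy as np

# Hypothetical sample data, not taken from the question.
y_prediction = np.array([1, 0, 1, 1, 0])
truth_labels = np.array([1, 0, 0, 1, 1])

# Parenthesize every comparison so that & combines two boolean masks.
false_positives = np.count_nonzero((y_prediction != truth_labels) & (y_prediction == 1))
false_negatives = np.count_nonzero((y_prediction != truth_labels) & (y_prediction == 0))

# Divide after counting. This is the fraction of all samples that are
# false positives, mirroring the rates computed in the question.
false_positive_fraction = false_positives / truth_labels.shape[0]
```

Note that this fraction is taken over all samples, which differs from the conventional false positive rate FP / (FP + TN).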
