Problem description
ValueError: Only ('multilabel-indicator', 'continuous-multioutput', 'multiclass-multioutput') formats are supported. Got multiclass instead
Here is my code:
# Declare classifier,fit on data and make predictions
from sklearn.ensemble import RandomForestClassifier
rnd_forest = RandomForestClassifier()
rnd_forest.fit(X_train_tr,y_train)
y_pred_prob = rnd_forest.predict_proba(X_train_tr)
# Calculate ndcg score
from sklearn.metrics import ndcg_score
# This is where I get an error
ndcg_score(y_train,y_pred_prob,k=5)
Here is what my targets and predicted probabilities look like:
# True labels of the first two samples
y_train[:2]
> array([7,7])
# Predicted probabilities for the first two observations
y_pred_prob[:2]
> array([[0.,0.,1.,0.],[0.,0.]])
I tried reshaping y_train into a 2D array, but that did not work. Can anyone tell me how to fix this error?
Solution
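Suppose y_train contains N observations. ndcg_score expects the true labels in the same shape as the scores, so you have to convert y_train into a matrix with N rows and 12 columns, one column per class, to match y_pred_prob. The code below assumes the labels are integers from 0 to 11 that line up with the columns of y_pred_prob.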
import numpy as np
# Create an ndarray of size (N,12) filled with zeros
y_train_matrix = np.zeros(shape=(y_pred_prob.shape[0],y_pred_prob.shape[1]))
# Write a 1 on each row's corresponding category
y_train_matrix[np.arange(y_pred_prob.shape[0]),y_train] = 1
# You now have this ndarray
y_train_matrix
array([[0.,0.,1.,0.],[0.,0.]])
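If you want to confirm that the conversion changed the target format that triggered the ValueError, one option is to ask scikit-learn how it classifies each array. This is only a sketch: `y_labels` and `n_classes` are made-up stand-ins for your y_train and the 12 classes, not your actual data.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical labels for illustration: integer class indices in 0..11
y_labels = np.array([7, 7, 2, 11])
n_classes = 12

# One-hot encode: one row per observation, one column per class
y_matrix = np.zeros((y_labels.shape[0], n_classes))
y_matrix[np.arange(y_labels.shape[0]), y_labels] = 1

label_type = type_of_target(y_labels)    # format of the raw 1D labels
matrix_type = type_of_target(y_matrix)   # format after one-hot encoding
print(label_type)   # multiclass
print(matrix_type)  # multilabel-indicator
```

The 1D label vector is reported as multiclass, which is the format the error message rejects, while the one-hot matrix is reported as multilabel-indicator, one of the supported formats.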
Now you can compute the score:
ndcg_score(y_train_matrix,y_pred_prob)
1.0
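If your labels are not guaranteed to be 0..11 in the same order as the columns of predict_proba, a possibly safer variant is to build the indicator matrix with label_binarize and pass the fitted model's classes_ attribute, so that column j of the true labels and column j of the probabilities refer to the same class. The sketch below uses synthetic data from make_classification as a stand-in for your data; the names `clf`, `proba`, and `y_true`, the 4 classes, and the label shift are assumptions made only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import ndcg_score
from sklearn.preprocessing import label_binarize

# Synthetic stand-in for the real training data (4 classes)
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10,
    n_classes=4, random_state=0,
)
y = y + 5  # shift labels to 5..8 so they are NOT valid column indices

clf = RandomForestClassifier(random_state=0).fit(X, y)
proba = clf.predict_proba(X)  # columns follow the order of clf.classes_

# One indicator column per class, in the same order as clf.classes_
y_true = label_binarize(y, classes=clf.classes_)

# k limits the ranking to the top-k scored classes per sample
score = ndcg_score(y_true, proba, k=3)
print(y_true.shape, proba.shape)
print(score)
```

Note that label_binarize returns a single column when there are only two classes, so this approach as written is meant for three or more classes. Also keep in mind that scoring on the training data, as in the question, will tend to give an optimistic value.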