Yellowbrick.model_selection works for classification but not for regression

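Problem description

I have a DataFrame df that contains Spotify data features. When I run the model with RandomForestClassifier, I get the feature importance plot, but when I run RandomForestRegressor, I get only a single bar plotted against popularity. Can anyone help?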

#include <cs50.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>

// Points assigned to each letter of the alphabet
int POINTS[] = {1, 3, 3, 2, 1, 4, 2, 4, 1, 8, 5, 1, 3, 1, 1, 3, 10, 1, 1, 1, 1, 4, 4, 8, 4, 10};

int compute_score(string word);

int main(void)
{
    // Get input words from both players
    string word1 = get_string("Player 1: ");
    string word2 = get_string("Player 2: ");

    // score both words
    int score1 = compute_score(word1);
    int score2 = compute_score(word2);

    // Print the winner
    if (score1 > score2)
    {
        printf("The winner is player 1!\n");
    }
    else if (score2 > score1)
    {
        printf("The winner is player 2!\n");
    }
    else
    {
        printf("That's a tie!\n");
    }

}

// Sum the point values of the letters in word, ignoring non-letters
int compute_score(string word)
{
    int total_points = 0;

    for (int i = 0, n = strlen(word); i < n; i++)
    {
        if (isupper(word[i]))
        {
            total_points = total_points + POINTS[word[i] - 'A'];
        }
        else if (islower(word[i]))
        {
            total_points = total_points + POINTS[word[i] - 'a'];
        }
    }

    // Return only after every character has been scored
    return total_points;
}

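Solution

I repeated the experiment above with the Spotify dataset, and I was able to use RandomForestRegressor with Yellowbrick's FeatureImportances visualizer (see the figure below). I suggest updating Yellowbrick to the latest version, which was released recently, on February 9:

pip install -U yellowbrick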

from yellowbrick.model_selection import FeatureImportances
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Load the Spotify data set
df = pd.read_csv('data.csv.zip')

# Keep the feature columns plus the target column
df = df[['acousticness', 'danceability', 'duration_ms', 'energy', 'explicit',
         'instrumentalness', 'liveness', 'loudness', 'popularity',
         'speechiness', 'tempo']]

# Features are every column except the target, popularity
X = df.drop('popularity', axis=1)
y = df.popularity

# Hold out 10% of the rows (the visualizer below is fitted on the full X, y)
train_X, test_X, train_y, test_y = train_test_split(X, y, test_size=0.1, random_state=38)

# Uncomment the classifier line to compare against the regressor
# model = RandomForestClassifier(n_estimators=10)
model = RandomForestRegressor(n_estimators=10)

# Fit the visualizer and draw one bar per feature
viz = FeatureImportances(model)
viz.fit(X, y)
viz.show()

[Image: FeatureImportances plot produced with RandomForestRegressor]
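
Since the fix suggested here is an upgrade, it may help to confirm which Yellowbrick version is actually installed in the environment that runs the code. The following is a minimal sketch using only the standard library (Python 3.8+). The helper name installed_version is my own and is not part of Yellowbrick.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; not part of Yellowbrick.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("yellowbrick")
    if found is None:
        print("yellowbrick is not installed; run: pip install -U yellowbrick")
    else:
        print(f"yellowbrick {found} is installed")
```

If the version printed is older than expected, the upgrade probably went into a different Python environment from the one running the notebook or script.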