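Operations when creating bins

Problem description

I am working with the IBM Attrition Dataset. After creating monthly income bins, I cannot compute the attrition percentage within each bin (df['% AttritionCluster']). The code is as follows: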

# Create bins
bins = [1000,2000,3000,4000,5000,6000,7000,8000,9000,10000,20000]

# Create labels for bins
label = ['1000-2000','2001-3000','3001-4000','4001-5000','5001-6000','6001-7000','7001-8000','8001-9000','9001-10000','10000+']

df['MonthlyIncomeBins'] = pd.cut(df['MonthlyIncome'],bins,labels=label)

# Create Dataframe
summary = df.groupby("MonthlyIncomeBins")

# Create new columns with data
index = df.index
df['TotEmployees'] = index.value_counts()
df['% AttritionCluster'] = (df['Attrition'] / df['TotEmployees']) * 100

df['% TotalAttrition'] = (df['Attrition'] / df['Attrition'].sum()) * 100

summary = summary[['TotEmployees','Attrition','% AttritionCluster','% TotalAttrition']]

summary.sum()

This is the output:

Output

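Inside the formula, the code reads df['TotEmployees'] as 1, but when I compute it on its own it gives the correct number of employees in each bin. Can you help me? I have tried everything except code that works :)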

Solution

No effective solution to this problem has been found yet.
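That said, the symptom has a likely explanation. df.index.value_counts() counts how often each row label occurs, and in a default index every label occurs exactly once, so TotEmployees is 1 on every row. The percentages are also computed row by row on df rather than per bin. The following is a minimal sketch of one possible fix, not a verified answer: it aggregates per bin first and only then derives the percentages. The toy data is made up for illustration, and it assumes Attrition is already encoded as 1/0 (if the column holds "Yes"/"No", it would need to be mapped to 1/0 first).

```python
import pandas as pd

# Toy stand-in for the IBM attrition data (values invented for illustration).
# Assumption: Attrition is already encoded as 1 (left) / 0 (stayed).
df = pd.DataFrame({
    "MonthlyIncome": [1500, 1800, 2500, 2600, 2900, 4500, 12000, 15000],
    "Attrition":     [1,    0,    1,    1,    0,    0,    0,     1],
})

# Same bin edges and labels as in the question
bins = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000]
labels = ['1000-2000', '2001-3000', '3001-4000', '4001-5000', '5001-6000',
          '6001-7000', '7001-8000', '8001-9000', '9001-10000', '10000+']

df["MonthlyIncomeBins"] = pd.cut(df["MonthlyIncome"], bins=bins, labels=labels)

# Aggregate per bin: number of employees and number who left
summary = (
    df.groupby("MonthlyIncomeBins", observed=False)
      .agg(TotEmployees=("Attrition", "size"), Attrition=("Attrition", "sum"))
)

# Derive the percentages from the per-bin totals (empty bins yield NaN)
summary["% AttritionCluster"] = summary["Attrition"] / summary["TotEmployees"] * 100
summary["% TotalAttrition"] = summary["Attrition"] / summary["Attrition"].sum() * 100

print(summary)
```

With this layout, each row of summary is one income bin, so TotEmployees is the bin size rather than a per-row count of 1.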
