问题描述
import pandas as pd
cat_df = pd.DataFrame({'item':['bed','lamp','candle','chair','bed','lamp'],'location' :['home','home','home' ],'status' :['new','used','new','new' ]})
cat_df = cat_df.astype('category')
print(cat_df.dtypes)
cat_df.describe().transpose()
# item bed 2 28.57
# item lamp 2 28.57
# item candle 2 28.57
# item chair 1 14.28
# location home 7 100
# status new 4 57.14
# status used 3 42.85
解决方法
使用DataFrame.melt
和GroupBy.size
进行计数,然后除以sum
除以新列除以GroupBy.transform
:
df1 = cat_df.melt(value_name='v',var_name='c')
df = df1.groupby(['c','v']).size().reset_index(name='count')
df['perc'] = df['count'].div(df.groupby('c')['count'].transform('sum')).mul(100)
print (df)
c v count perc
0 item bed 2 28.571429
1 item candle 2 28.571429
2 item chair 1 14.285714
3 item lamp 2 28.571429
4 location home 7 100.000000
5 status new 4 57.142857
6 status used 3 42.857143