问题描述
我正在尝试按“plant_name”和“month”的 groupby 计算每个时间序列或数据值的权重,但我看不到解决方案。我有看起来像这样的数据 -
plant_name month adjusted_wspd
0 ARIZONA I 1 7.62
1 ARIZONA I 2 7.37
2 ARIZONA I 3 6.72
3 ARIZONA I 4 6.67
4 ARIZONA I 5 7.07
5 ARIZONA I 6 7.53
6 ARIZONA I 7 8.09
7 ARIZONA I 8 8.83
8 ARIZONA I 9 9.41
9 ARIZONA I 10 9.44
10 ARIZONA I 11 8.92
11 ARIZONA I 12 8.37
12 CAETITE I 1 7.10
13 CAETITE I 2 7.27
14 CAETITE I 3 7.26
我曾尝试编写一个函数并对定义的函数使用“apply”,但我得到了一个不正确的解决方案,看起来像这样,因为该函数可能正在错误地进行数学运算:
ARIZONA I 1 1.0
2 1.0
3 1.0
4 1.0
5 1.0
6 1.0
7 1.0
8 1.0
9 1.0
10 1.0
11 1.0
12 1.0
CAETITE I 1 1.0
2 1.0
3 1.0
我的代码如下:
def my_agg(x):
names = {'Weigths_WSPD': (x['adjusted_wspd'] )/x['adjusted_wspd'].sum()}
return pd.Series(names,index=['Weights_WSPD'])
dfmm.groupby(["plant_name","month"]).apply(my_agg)
我的回答应该像这样以百分比 (%) 表示:
ARIZONA I 1 7.93
2 7.67
3 7.00
4 6.95
5 7.36
6 7.84
7 8.42
8 9.19
9 9.80
10 9.83
11 9.29
12 8.72
CAETITE I 1 33.82
2 33.61
3 33.56
谢谢!
解决方法
让我们尝试对 plant_name
求和,而不是对 plant_name
和 month
求和:
df['Weigths_WSPD'] = (
df['adjusted_wspd'] /
df.groupby('plant_name')['adjusted_wspd'].transform('sum') *
100
).round(2)
plant_name month adjusted_wspd Weigths_WSPD
0 ARIZONA I 1 7.62 7.93
1 ARIZONA I 2 7.37 7.67
2 ARIZONA I 3 6.72 7.00
3 ARIZONA I 4 6.67 6.95
4 ARIZONA I 5 7.07 7.36
5 ARIZONA I 6 7.53 7.84
6 ARIZONA I 7 8.09 8.42
7 ARIZONA I 8 8.83 9.19
8 ARIZONA I 9 9.41 9.80
9 ARIZONA I 10 9.44 9.83
10 ARIZONA I 11 8.92 9.29
11 ARIZONA I 12 8.37 8.72
12 CAETITE I 1 7.10 32.82
13 CAETITE I 2 7.27 33.61
14 CAETITE I 3 7.26 33.56
如果需要 Series
:
s = df.set_index(['plant_name','month'])
s = (
s['adjusted_wspd'] /
s.groupby(level=0)['adjusted_wspd'].transform('sum') *
100
).round(2).rename('Weigths_WSPD')
plant_name month
ARIZONA I 1 7.93
2 7.67
3 7.00
4 6.95
5 7.36
6 7.84
7 8.42
8 9.19
9 9.80
10 9.83
11 9.29
12 8.72
CAETITE I 1 32.82
2 33.61
3 33.56
Name: Weigths_WSPD,dtype: float64
DataFrame 构造函数:
df = pd.DataFrame({
'plant_name': {0: 'ARIZONA I',1: 'ARIZONA I',2: 'ARIZONA I',3: 'ARIZONA I',4: 'ARIZONA I',5: 'ARIZONA I',6: 'ARIZONA I',7: 'ARIZONA I',8: 'ARIZONA I',9: 'ARIZONA I',10: 'ARIZONA I',11: 'ARIZONA I',12: 'CAETITE I',13: 'CAETITE I',14: 'CAETITE I'},'month': {0: 1,1: 2,2: 3,3: 4,4: 5,5: 6,6: 7,7: 8,8: 9,9: 10,10: 11,11: 12,12: 1,13: 2,14: 3},'adjusted_wspd': {0: 7.62,1: 7.37,2: 6.72,3: 6.67,4: 7.07,5: 7.53,6: 8.09,7: 8.83,8: 9.41,9: 9.44,10: 8.92,11: 8.37,12: 7.1,13: 7.27,14: 7.26}
})