Replace inf values in a column with the maximum value of the column's group

Problem description

I have a DataFrame like the one shown below:

    ID   sales
0   c1   100.0
1   c1    25.0
2   c1    60.0
3   c1    inf
4   c2    40.0
5   c2    inf
6   c3    50.0
7   c3    inf
8   c3    80.0

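I want to replace each 'inf' in the sales column with the maximum value of its group, where the groups are defined by the ID column.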

So the output should look like this:

    ID  sales
0   c1  100.0
1   c1   25.0
2   c1   60.0
3   c1  100.0
4   c2   40.0
5   c2   40.0
6   c3   50.0
7   c3   80.0
8   c3   80.0

What is the best way to do this?

Thanks.

Solution

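import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID': ['c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c3', 'c3', 'c3'],
    'sales': [100.0, 25.0, 60.0, np.inf, 40.0, np.inf, 50.0, np.inf, 80.0],
})

# Keep only the rows whose sales value is not inf
max_df = df[df['sales'] != np.inf]
# Group the remaining (finite) rows by ID
for sales_id, id_df in max_df.groupby('ID'):
    # In the original df, set the inf rows of this ID to the max of the finite values.
    # Note: an ID whose rows are all inf never appears in max_df, so it stays inf.
    df.loc[(df['sales'] == np.inf) & (df['ID'] == sales_id), 'sales'] = id_df['sales'].max()

print(df)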
#    ID  sales
# 0  c1  100.0
# 1  c1   25.0
# 2  c1   60.0
# 3  c1  100.0
# 4  c2   40.0
# 5  c2   40.0
# 6  c3   50.0
# 7  c3   80.0
# 8  c3   80.0
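
The loop above works, but the same result can probably be had without iterating over groups. The following is a minimal sketch of an alternative (not the original answer's method), assuming the sample data from the question. It hides the inf values as NaN, computes each row's group maximum with groupby(...).transform('max'), and then writes that maximum only into the positions that held inf. The names finite_sales and group_max are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'ID': ['c1', 'c1', 'c1', 'c1', 'c2', 'c2', 'c3', 'c3', 'c3'],
    'sales': [100.0, 25.0, 60.0, np.inf, 40.0, np.inf, 50.0, np.inf, 80.0],
})

# Boolean mask of the rows to be replaced
is_inf = df['sales'] == np.inf

# Hide inf as NaN so it is skipped when the per-group maximum is computed
finite_sales = df['sales'].mask(is_inf)

# transform('max') returns a Series aligned with df: each row gets its group's maximum
group_max = finite_sales.groupby(df['ID']).transform('max')

# Replace only the inf positions; every other value is left as it was
df['sales'] = df['sales'].mask(is_inf, group_max)

print(df)
```

One difference from the loop: if every value for an ID is inf, this sketch turns those rows into NaN rather than leaving them as inf, because that group has no finite maximum.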

inf pandas python replace