For example, I have:
import numpy as np
import pandas as pd
df = pd.DataFrame({0: [420, np.nan, 455, np.nan, np.nan, np.nan]})
df
       0
0  420.0
1    NaN
2  455.0
3    NaN
4    NaN
5    NaN
Then, using:
df[0].isnull().astype(int)
0 0
1 1
2 0
3 1
4 1
5 1
Name: 0, dtype: int64
With that, I get:
df[0].ffill() - df[0].isnull().astype(int)
0 420.0
1 419.0
2 455.0
3 454.0
4 454.0
5 454.0
Name: 0, dtype: float64
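What I actually want is a counter of 0, 1, 0, 1, 2, 3 (one that keeps increasing across consecutive NaN values and resets at each real value), so that the final result is:
df[0] = 420, 419, 455, 454, 453, 452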
Solution:
Use groupby and cumcount:
df[0].ffill() - df.groupby(df[0].notna().cumsum()).cumcount()
0 420.0
1 419.0
2 455.0
3 454.0
4 453.0
5 452.0
dtype: float64
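The one-liner above can also be wrapped in a small helper. This is only a sketch under my own assumptions: the function name `ffill_countdown` is hypothetical (it is not a pandas function), and it operates on a single Series rather than on the whole DataFrame.

```python
import numpy as np
import pandas as pd


def ffill_countdown(s: pd.Series) -> pd.Series:
    """Forward-fill s, subtracting 1 for each consecutive NaN since the last real value.

    Hypothetical helper name, used here for illustration only.
    """
    # Every non-null value starts a new group; the NaNs after it share its label.
    groups = s.notna().cumsum()
    # Position of each row inside its group: 0 for the real value, 1, 2, ... for the NaNs.
    steps = s.groupby(groups).cumcount()
    return s.ffill() - steps


s = pd.Series([420, np.nan, 455, np.nan, np.nan, np.nan])
result = ffill_countdown(s)
print(result.tolist())  # [420.0, 419.0, 455.0, 454.0, 453.0, 452.0]
```

Grouping the Series itself instead of the DataFrame gives the same counter here, because cumcount only depends on the group labels.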
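Details
Define the groups: notna() is True at every real value, so its cumulative sum increases by one at each real value and stays constant over the NaN rows that follow it.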
df[0].notna().cumsum()
0 1
1 1
2 2
3 2
4 2
5 2
Name: 0, dtype: int64
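One case the example data does not cover is a Series that starts with NaN. As far as I can tell, those leading rows get group label 0, and since there is nothing to forward-fill from, they stay NaN in the result. A quick check on made-up data (the values below are my own, not from the question):

```python
import numpy as np
import pandas as pd

# Made-up data with two leading NaNs, for illustration only.
s = pd.Series([np.nan, np.nan, 10, np.nan, 7, np.nan, np.nan])

# Rows before the first real value fall into group 0.
groups = s.notna().cumsum()
print(groups.tolist())  # [0, 0, 1, 1, 2, 2, 2]

# ffill() has nothing to propagate into the leading rows, so they remain NaN.
result = s.ffill() - s.groupby(groups).cumcount()
print(result.tolist())  # [nan, nan, 10.0, 9.0, 7.0, 6.0, 5.0]
```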
Use cumcount within each group to number the rows, starting from 0:
df.groupby(df[0].notna().cumsum()).cumcount()
0 0
1 1
2 0
3 1
4 2
5 3
dtype: int64
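To see how the pieces fit together, it may help to put the intermediate Series side by side. The column names `value`, `group`, `offset` and `result` below are labels I picked for this sketch; they are not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({0: [420, np.nan, 455, np.nan, np.nan, np.nan]})

group = df[0].notna().cumsum()           # group label per row
offset = df[0].groupby(group).cumcount() # 0, 1, 2, ... within each group

# Illustrative column names, chosen for readability only.
steps = pd.DataFrame({
    "value": df[0],
    "group": group,
    "offset": offset,
    "result": df[0].ffill() - offset,
})
print(steps)
```

Each row of `result` is the last real value minus the number of rows since that value, which is exactly the 0, 1, 0, 1, 2, 3 counter asked for in the question.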