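Problem description
I want to know how to carry forward the most recent value that matched some condition (here df['condition']) onto every subsequent row. I know how to do this with a for loop, but I understand that loops should be avoided when using pandas.
An example follows. The column df['desired_result'] represents what I want to achieve.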
import pandas as pd
import numpy as np
dates = pd.date_range('1/1/2000',periods=10)
values = np.arange(10.0,20.0,1.0)
data = {'date': dates,'value': values}
df = pd.DataFrame.from_dict(data)
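# one flag per row (10 rows); values match the output shown in the solution
df['condition'] = [False,False,True,True,False,True,False,False,True,False]
df_valid = df[df['condition']]
# last value seen where condition was True; NaN before the first match
df['desired_result'] = [np.nan,np.nan,12,13,13,15,15,15,18,18]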
Solution
# Use Series.where with the condition and assign the result to a new column.
# Where condition is True the value is kept; everywhere else it becomes NaN (the default).
# Then chain ffill to forward-fill those NaN values with the last kept value.
df['r'] = df['value'].where(df['condition']).ffill()
        date  value  condition  desired_result     r
0 2000-01-01   10.0      False             NaN   NaN
1 2000-01-02   11.0      False             NaN   NaN
2 2000-01-03   12.0       True            12.0  12.0
3 2000-01-04   13.0       True            13.0  13.0
4 2000-01-05   14.0      False            13.0  13.0
5 2000-01-06   15.0       True            15.0  15.0
6 2000-01-07   16.0      False            15.0  15.0
7 2000-01-08   17.0      False            15.0  15.0
8 2000-01-09   18.0       True            18.0  18.0
9 2000-01-10   19.0      False            18.0  18.0
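If you want to convince yourself that where plus ffill does the same thing as the for loop mentioned in the question, a minimal sketch like the one below compares the two. The names last, looped, vectorized and same are only illustrative, and the condition values are assumed to be the ones shown in the output table above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "value": np.arange(10.0, 20.0, 1.0),
    # assumed flags, copied from the output table above
    "condition": [False, False, True, True, False,
                  True, False, False, True, False],
})

# Loop baseline: remember the last value whose condition was True.
last = np.nan
looped = []
for value, flag in zip(df["value"], df["condition"]):
    if flag:
        last = value
    looped.append(last)

# Vectorized version: mask non-matching rows to NaN, then forward-fill.
vectorized = df["value"].where(df["condition"]).ffill()

# equal_nan=True treats the leading NaN rows as matching each other.
same = np.allclose(vectorized.to_numpy(), np.array(looped), equal_nan=True)
print(same)
```

One thing to keep in mind: ffill cannot tell a masked row from a row whose value was already NaN, so if value itself contains NaN on a row where condition is True, that row is filled from the previous match instead of staying NaN.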