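How to carry forward the previous result that meets a condition in pandas

Problem description

I would like to know how to keep the most recent value that matched some condition (df['condition']) and repeat it in every subsequent row. I know how to do this with a for loop, but I understand that loops should be avoided when using pandas.

Below is an example. The column df['desired_result'] represents what I want to achieve.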

import pandas as pd
import numpy as np

dates = pd.date_range('1/1/2000',periods=10)
values = np.arange(10.0,20.0,1.0)
data = {'date': dates,'value': values}
df = pd.DataFrame.from_dict(data)

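# One flag per row (10 rows); values match the output table in the answer below
df['condition'] = [False,False,True,True,False,True,False,False,True,False]
df_valid = df[df['condition']]
# Last value seen where condition was True, carried forward to later rows
df['desired_result'] = [np.nan,np.nan,12,13,13,15,15,15,18,18]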

Solution

# Use Series.where on the value column, driven by the condition column,
# and assign the result to a new column.
# Where condition is True the value is kept; everywhere else it becomes NaN.
# ffill() then forward-fills each NaN with the last kept value.

df['r'] = df['value'].where(df['condition'], np.nan).ffill()

        date  value  condition  desired_result     r
0 2000-01-01   10.0      False             NaN   NaN
1 2000-01-02   11.0      False             NaN   NaN
2 2000-01-03   12.0       True            12.0  12.0
3 2000-01-04   13.0       True            13.0  13.0
4 2000-01-05   14.0      False            13.0  13.0
5 2000-01-06   15.0       True            15.0  15.0
6 2000-01-07   16.0      False            15.0  15.0
7 2000-01-08   17.0      False            15.0  15.0
8 2000-01-09   18.0       True            18.0  18.0
9 2000-01-10   19.0      False            18.0  18.0
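
To check that the where/ffill approach really matches what a for loop would produce, here is a minimal self-contained sketch. The helper name last_valid_loop is hypothetical and exists only for comparison; the condition values are the ones shown in the table above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=10),
    "value": np.arange(10.0, 20.0, 1.0),
    "condition": [False, False, True, True, False,
                  True, False, False, True, False],
})

# Vectorized: keep values where condition is True, then forward-fill the gaps.
vectorized = df["value"].where(df["condition"]).ffill()


def last_valid_loop(values, flags):
    """Loop baseline (hypothetical helper): remember the last flagged value."""
    out = []
    last = np.nan  # nothing has matched yet
    for v, f in zip(values, flags):
        if f:
            last = v
        out.append(last)
    return pd.Series(out, index=values.index)


looped = last_valid_loop(df["value"], df["condition"])

# Compare element-wise, treating NaN in the same position as equal.
same = np.allclose(vectorized.to_numpy(), looped.to_numpy(), equal_nan=True)
print(same)
```

Rows before the first True stay NaN in both versions, because there is no earlier matching value to carry forward.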

dataframe pandas python vectorization