Problem description
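I have a DataFrame that is the result of joining three DataFrames. The column suffixes indicate which DataFrame each variable came from, for example DAY_OF_WEEK_summer1, DAY_OF_WEEK_summer2, and DAY_OF_WEEK_summer3. For any given row, a value can exist in only one of these three columns, and I want to fill the NaN values in DAY_OF_WEEK_summer1 with the value from the summer2 or summer3 column. In total there are 11 attributes whose NaN values I need to fill this way.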
Here is a sample DataFrame:
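import numpy as np
import pandas as pd

df = pd.DataFrame({
    'DAY_OF_WEEK_summer1': [np.nan, 'WKDY', 'SAT', np.nan, np.nan],
    'DAY_OF_WEEK_summer2': [np.nan, np.nan, np.nan, 'WKDY', 'WKDY'],
    'DAY_OF_WEEK_summer3': ['SAT', np.nan, np.nan, np.nan, np.nan],
    'ROUTE_summer1': [np.nan, 5, 6, np.nan, np.nan],
    'ROUTE_summer2': [np.nan, np.nan, np.nan, 10, 10],
    'ROUTE_summer3': [1, np.nan, np.nan, np.nan, np.nan]
})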
I would like the result to look like this:
DAY_OF_WEEK_summer1 | DAY_OF_WEEK_summer2 | DAY_OF_WEEK_summer3 | ROUTE_summer1 | ROUTE_summer2 | ROUTE_summer3
--------------------+---------------------+---------------------+---------------+---------------+--------------
SAT                 | NaN                 | SAT                 | 1             | NaN           | 1
WKDY                | NaN                 | NaN                 | 5             | NaN           | NaN
SAT                 | NaN                 | NaN                 | 6             | NaN           | NaN
WKDY                | WKDY                | NaN                 | 10            | 10            | NaN
WKDY                | WKDY                | NaN                 | 10            | 10            | NaN
Solution
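import numpy as np

# Where summer1 is NaN, take the value from summer2; otherwise keep summer1.
df['DAY_OF_WEEK_summer1'] = np.where(df['DAY_OF_WEEK_summer1'].isnull(), df['DAY_OF_WEEK_summer2'], df['DAY_OF_WEEK_summer1'])
# Anything still missing is then filled from summer3.
df['DAY_OF_WEEK_summer1'] = np.where(df['DAY_OF_WEEK_summer1'].isnull(), df['DAY_OF_WEEK_summer3'], df['DAY_OF_WEEK_summer1'])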
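Since the question mentions 11 attributes, repeating the two lines above for each one gets tedious. One possible way to generalize is a loop over the attribute prefixes that chains pandas' `Series.fillna`. This is only a sketch: the `attributes` list is an assumption (only DAY_OF_WEEK and ROUTE appear in the question), and it swaps `np.where` for `fillna`, which should give the same result here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'DAY_OF_WEEK_summer1': [np.nan, 'WKDY', 'SAT', np.nan, np.nan],
    'DAY_OF_WEEK_summer2': [np.nan, np.nan, np.nan, 'WKDY', 'WKDY'],
    'DAY_OF_WEEK_summer3': ['SAT', np.nan, np.nan, np.nan, np.nan],
    'ROUTE_summer1': [np.nan, 5, 6, np.nan, np.nan],
    'ROUTE_summer2': [np.nan, np.nan, np.nan, 10, 10],
    'ROUTE_summer3': [1, np.nan, np.nan, np.nan, np.nan],
})

# Assumed list of attribute prefixes; extend it with the remaining attributes.
attributes = ['DAY_OF_WEEK', 'ROUTE']

for attr in attributes:
    # Fill NaN in the summer1 column from summer2 first, then from summer3.
    df[f'{attr}_summer1'] = (
        df[f'{attr}_summer1']
        .fillna(df[f'{attr}_summer2'])
        .fillna(df[f'{attr}_summer3'])
    )

print(df)
```

The summer2 and summer3 columns are left untouched, matching the desired output shown in the question.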