Problem description
I am working with the following DataFrame:
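import numpy as np
import pandas as pd

nan = np.nan
df1 = pd.DataFrame([
    [1, nan, nan, nan, nan, nan], [0.5, 2, nan, nan, nan, nan],
    [nan, 1.5, 3, nan, nan, nan], [nan, nan, 2.5, 4, nan, nan],
    [nan, nan, nan, 3.5, 5, 5.5], [nan, nan, nan, nan, 6.2, 6],
], columns=['AA', 'BB', 'CC', 'DD', 'EE', 'FF'])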
As output, I get:
DataFrame1_______
    AA   BB   CC   DD   EE   FF
0  1.0  NaN  NaN  NaN  NaN  NaN
1  0.5  2.0  NaN  NaN  NaN  NaN
2  NaN  1.5  3.0  NaN  NaN  NaN
3  NaN  NaN  2.5  4.0  NaN  NaN
4  NaN  NaN  NaN  3.5  5.0  5.5
5  NaN  NaN  NaN  NaN  6.2  6.0
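I would like to know whether there is a way to convert this DataFrame into one without NaN values, for example:
new_DataFrame1______
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0
Basically, I want to move every non-NaN value up to the top of its column, starting at index = 0.
Thanks in advance.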
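Solution
Use the justify function from the Stack Overflow answer linked below to push the values to the top of each column, then remove the rows that contain only missing values with DataFrame.dropna: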
# justify is defined in https://stackoverflow.com/a/44559180/2901002
df = pd.DataFrame(justify(df1.to_numpy(), invalid_val=np.nan, axis=0),
                  columns=df1.columns).dropna(how='all')
print(df)
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0
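Note that justify is not part of pandas or NumPy, so it has to be defined before the call above will run. The following is only a minimal sketch of such a helper, written along the lines of the linked answer but not guaranteed to be identical to it. It assumes a 2D float array; the side parameter and the small demo array are illustrative assumptions.

```python
import numpy as np
import pandas as pd


def justify(a, invalid_val=np.nan, axis=0, side='up'):
    """Push the valid entries of a 2D array to one side along `axis`.

    Sketch only: assumes `a` is a 2D float array. `side` may be
    'up'/'left' (towards index 0) or 'down'/'right'.
    """
    # True where the entry is a real value rather than the filler.
    if isinstance(invalid_val, float) and np.isnan(invalid_val):
        mask = ~np.isnan(a)
    else:
        mask = a != invalid_val
    # Sorting booleans puts False first; flipping puts True first.
    justified_mask = np.sort(mask, axis=axis)
    if side in ('up', 'left'):
        justified_mask = np.flip(justified_mask, axis=axis)
    out = np.full(a.shape, invalid_val, dtype=float)
    if axis == 1:
        out[justified_mask] = a[mask]
    else:
        # Use transposed views so values are copied column by column.
        out.T[justified_mask.T] = a.T[mask.T]
    return out


# Small demo array (hypothetical data) with gaps in both columns.
demo = np.array([[1.0, np.nan],
                 [np.nan, 2.0],
                 [3.0, 4.0]])
result = justify(demo, invalid_val=np.nan, axis=0)
print(result)
```

Within each column the values keep their original relative order; only the NaN entries are moved to the bottom, which is why the all-NaN rows can be dropped afterwards.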
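Another solution is to drop the NaN values from each column separately, reset the index, and concatenate the columns again:
df = pd.concat([df1[c].dropna().reset_index(drop=True) for c in df1.columns], axis=1)
print(df)
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0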
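The pd.concat variant does not require every column to hold the same number of non-NaN values: shorter columns are aligned on the new index and padded with NaN at the bottom. A small sketch on a hypothetical two-column frame (the names df_uneven and compact are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 'AA' has two valid values, 'BB' has only one.
df_uneven = pd.DataFrame({'AA': [1.0, np.nan, 0.5],
                          'BB': [np.nan, 2.0, np.nan]})

# Drop NaN per column, renumber from 0, then glue the columns back together.
compact = pd.concat(
    [df_uneven[c].dropna().reset_index(drop=True) for c in df_uneven.columns],
    axis=1,
)
print(compact)
```

Here 'BB' ends up with its single value in row 0 and NaN in row 1, so a trailing NaN can remain when the counts differ.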
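You can also use stack and groupby in a dict comprehension. The explicit dropna() is harmless and guards against pandas versions in which stack keeps NaN values:
print(pd.DataFrame({col: vals.tolist() for col, vals in df1.stack().dropna().groupby(level=1)}))
    AA   BB   CC   DD   EE   FF
0  1.0  2.0  3.0  4.0  5.0  5.5
1  0.5  1.5  2.5  3.5  6.2  6.0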
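To see why this works, it can help to look at the intermediate result: stack turns the frame into a long Series indexed by (row label, column label), and groupby(level=1) then collects the values of each column. Building the DataFrame from plain lists only works when every column has the same number of non-NaN values; otherwise the constructor raises a ValueError. One possible workaround, sketched here on a hypothetical frame, is to wrap each group in a pd.Series with a fresh index so that shorter columns are padded with NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which the columns hold different numbers of values.
df_uneven = pd.DataFrame({'AA': [1.0, np.nan, 0.5],
                          'BB': [np.nan, 2.0, np.nan]})

# Long format: one entry per (row label, column label) pair, NaN removed.
stacked = df_uneven.stack().dropna()

# Level 1 of the MultiIndex holds the column labels, so each group is one column.
compact = pd.DataFrame(
    {col: pd.Series(vals.to_numpy()) for col, vals in stacked.groupby(level=1)}
)
print(compact)
```

Note that groupby sorts the group keys by default, so the columns come out in sorted label order, which matches the original order for AA to FF.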