Suppose I have this table:

Type | Killed | Survived
Dog  | 5      | 2
Dog  | 3      | 4
Cat  | 1      | 7
Dog  | NaN    | 3
cow  | NaN    | 2
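One of the rows with [Type] = Dog is missing its [Killed] value.
I want to impute the mean into [Killed] for the rows where [Type] = Dog.
My code is as follows: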
> Find the mean
    df[df['Type'] == 'Dog'].mean().round()
This gives me the mean (about 2.25).
> Impute the mean (this is where the problem starts)
    df.loc[(df['Type'] == 'Dog') & (df['Killed'])].fillna(2.25, inplace=True)
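The code runs, but the value is not imputed; the NaN values are still there.
My question is: how do I impute the mean into [Killed] based on [Type] = Dog?

Solution

This works for me: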
    # .ix has been removed from pandas; .loc does the same label-based selection here
    df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(2.25)
    print(df)

      Type  Killed  Survived
    0  Dog    5.00         2
    1  Dog    3.00         4
    2  Cat    1.00         7
    3  Dog    2.25         3
    4  cow     NaN         2
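As for why the attempt in the question leaves the NaN in place: selecting rows with a boolean mask gives back a new object, so `fillna(..., inplace=True)` fills that temporary object and `df` itself is never touched. The sketch below rebuilds the sample table (the construction of `df` and the names `mask`, `dog_mean` and `still_missing` are my own, for illustration only) and contrasts the two approaches.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})

mask = df["Type"] == "Dog"
dog_mean = df.loc[mask, "Killed"].mean()  # mean of 5 and 3; NaN is skipped

# Fills a temporary copy of the selected rows; df is left unchanged.
df.loc[mask, ["Killed"]].fillna(dog_mean, inplace=True)
still_missing = df.loc[3, "Killed"]  # still NaN

# Assigning the filled values back through .loc changes df itself.
df.loc[mask, "Killed"] = df.loc[mask, "Killed"].fillna(dog_mean)
print(df)
```

Only the Dog row is filled; the cow row keeps its NaN because the mask excludes it.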
If you need to fillna with a Series, because the mean covers 2 columns (Killed and Survived):
    # numeric_only=True skips the string column Type (required in recent pandas versions)
    m = df[df['Type'] == 'Dog'].mean(numeric_only=True).round()
    print(m)

    Killed      4.0
    Survived    3.0
    dtype: float64

    df.loc[df['Type'] == 'Dog'] = df.loc[df['Type'] == 'Dog'].fillna(m)
    print(df)

      Type  Killed  Survived
    0  Dog     5.0         2
    1  Dog     3.0         4
    2  Cat     1.0         7
    3  Dog     4.0         3
    4  cow     NaN         2
If you need fillna only in the Killed column:
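    # if you don't need rounding, omit it
    m = round(df.loc[df['Type'] == 'Dog', 'Killed'].mean())
    print(m)

    4

    # assign the result back; fillna alone returns a new Series and leaves df unchanged
    df.loc[df['Type'] == 'Dog', 'Killed'] = df.loc[df['Type'] == 'Dog', 'Killed'].fillna(m)
    print(df)

      Type  Killed  Survived
    0  Dog     5.0         2
    1  Dog     3.0         4
    2  Cat     1.0         7
    3  Dog     4.0         3
    4  cow     NaN         2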
You can reuse the selection like this:
    filtered = df.loc[df['Type'] == 'Dog', 'Killed']
    print(filtered)

    0    5.0
    1    3.0
    3    NaN
    Name: Killed, dtype: float64

    df.loc[df['Type'] == 'Dog', 'Killed'] = filtered.fillna(filtered.mean())
    print(df)

      Type  Killed  Survived
    0  Dog     5.0         2
    1  Dog     3.0         4
    2  Cat     1.0         7
    3  Dog     4.0         3
    4  cow     NaN         2
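Putting the last variant together as a self-contained script may make it easier to try out. This is only a sketch: the DataFrame is rebuilt from the table in the question, and `fill_with_group_mean` is a hypothetical helper name I made up, not a pandas function.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Type": ["Dog", "Dog", "Cat", "Dog", "cow"],
    "Killed": [5, 3, 1, np.nan, np.nan],
    "Survived": [2, 4, 7, 3, 2],
})


def fill_with_group_mean(frame, type_value, column):
    """Hypothetical helper: fill NaN in `column` with the mean of the
    rows whose Type equals `type_value`. Modifies `frame` in place."""
    mask = frame["Type"] == type_value
    selected = frame.loc[mask, column]
    # Series.mean() skips NaN by default, so only known values are averaged.
    frame.loc[mask, column] = selected.fillna(selected.mean())
    return frame


fill_with_group_mean(df, "Dog", "Killed")
print(df)
```

Calling it with `"Dog"` and `"Killed"` fills row 3 with the Dog mean and leaves the cow row as NaN, matching the output above.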