My pandas DataFrame:

    dframe = pd.DataFrame({"A":list("abcde"),"B":list("aabbc"),"C":[1,2,3,4,5]},index=[10,11,12,13,14])

        A  B  C
    10  a  a  1
    11  b  a  2
    12  c  b  3
    13  d  b  4
    14  e  c  5

My desired output:

        A  B  C     a     b     c
    10  a  a  1     1  None  None
    11  b  a  2     2  None  None
    12  c  b  3  None     3  None
    13  d  b  4  None     4  None
    14  e  c  5  None  None     5

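The idea is to create new columns based on the values in column "B", then copy the corresponding value from column "C" into the newly created column for each row.

Here is my code:
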
    lis = sorted(list(dframe.B.unique()))

    #creating empty columns
    for items in lis:
        dframe[items] = None

    #here copy and pasting
    for items in range(0,len(dframe)):
        slot = dframe.B.iloc[items]
        dframe[slot][items] = dframe.C.iloc[items]

I end up with this error:

    SettingWithCopyWarning:
    A value is trying to be set on a copy of a slice from a DataFrame

    See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
      app.launch_new_instance()

This code works fine in Python 2.7 but not in 3.x. Where am I going wrong?

Solution

Start with

    import numpy as np
    to_be_appended = pd.get_dummies(dframe.B).replace(0,np.nan).mul(dframe.C,axis=0)

Then finish with

    dframe = pd.concat([dframe,to_be_appended],axis=1)

The result looks like this:

    print(dframe)

        A  B  C    a    b    c
    10  a  a  1  1.0  NaN  NaN
    11  b  a  2  2.0  NaN  NaN
    12  c  b  3  NaN  3.0  NaN
    13  d  b  4  NaN  4.0  NaN
    14  e  c  5  NaN  NaN  5.0

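For anyone who wants to see the intermediate steps, here is a self-contained sketch of the same approach, split into named stages. The names `dummies`, `masked` and `result` are illustrative and not part of the answer. The `dtype=float` argument is an assumption added here: recent pandas releases return boolean dummy columns by default, and with those `replace(0, np.nan)` may no longer swap out the zeros, so asking for floats up front should keep the behaviour shown above.

```python
import numpy as np
import pandas as pd

dframe = pd.DataFrame(
    {"A": list("abcde"), "B": list("aabbc"), "C": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# Stage 1: one-hot encode column B, giving one column per distinct value.
# dtype=float is an assumption made for this sketch (see the note above).
dummies = pd.get_dummies(dframe["B"], dtype=float)

# Stage 2: replace the 0.0 entries with NaN so they stay empty after multiplying.
masked = dummies.replace(0, np.nan)

# Stage 3: multiply every row by that row's value of C (aligned on the index).
to_be_appended = masked.mul(dframe["C"], axis=0)

# Stage 4: attach the new columns next to the original ones.
result = pd.concat([dframe, to_be_appended], axis=1)
print(result)
```

Because the new columns contain NaN, they come out as floats rather than the integers and `None` values in the desired output.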
Note for searching: this combines one-hot encoding with broadcast multiplication.
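As for the loop in the question: the warning text points at chained indexing, which is what `dframe[slot][items] = ...` does (select a column first, then assign into whatever that selection returned). If you prefer to keep the loop, a minimal sketch that assigns with a single `.loc` call could look like the following. It iterates over index labels rather than positions, which is a change made for this sketch, and the variable names `name` and `label` are illustrative.

```python
import pandas as pd

dframe = pd.DataFrame(
    {"A": list("abcde"), "B": list("aabbc"), "C": [1, 2, 3, 4, 5]},
    index=[10, 11, 12, 13, 14],
)

# Create one empty column per distinct value in B.
for name in sorted(dframe["B"].unique()):
    dframe[name] = None

# Write each C value into the column named by B, addressing the cell
# with one .loc[row_label, column_label] call instead of chained indexing.
for label in dframe.index:
    slot = dframe.loc[label, "B"]
    dframe.loc[label, slot] = dframe.loc[label, "C"]

print(dframe)
```

This keeps the untouched cells as `None`, which is closer to the desired output in the question, though the vectorized version above will usually be the better choice for larger frames.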