as a column

Question

What is the difference between passing a list and passing a pd.Series to create a new DataFrame column? For example, from trial and error I have noticed:

import random
import pandas as pd

# df is an existing DataFrame.
# (1d) We can also give it a Series, which is quite similar to giving it a list
df['cost1'] = pd.Series([random.choice([1.99, 2.99, 3.99]) for i in range(len(df))])
df['cost2'] =           [random.choice([1.99, 3.99]) for i in range(len(df))]
df['cost3'] = pd.Series([1, 2, 3])  # <== rows without a matching value are filled with `NaN`
df['cost4'] =           [1, 3]      # <== raises ValueError because the length does not match
df

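Are there any other ways in which passing a pd.Series differs from passing a standard Python list? Can a DataFrame column take any Python iterable, or are there restrictions on what can be passed to it? Finally, is using pd.Series the "correct" way to add a column, or can it be used interchangeably with other types?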

Answer

Assigning a list to a DataFrame column requires the list to have the same length as the DataFrame.
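As a minimal sketch of the length requirement (the DataFrame and the column names `item`, `cost`, and `bad` are made up for illustration and are not from the question), a list of matching length is assigned by position, while a shorter list is rejected:

```python
import pandas as pd

# Illustrative three-row DataFrame (hypothetical data).
df = pd.DataFrame({"item": ["a", "b", "c"]})

# A list with one value per row is assigned by position.
df["cost"] = [1.99, 2.99, 3.99]

# A list of a different length is rejected with a ValueError.
try:
    df["bad"] = [1, 3]
    error_raised = False
except ValueError:
    error_raised = True

print(error_raised)
print(df)
```

The failed assignment leaves the DataFrame unchanged, so no partial `bad` column appears.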

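When you assign a pd.Series, pandas uses the Series index as the key to match against the original DataFrame index, then fills each row with the Series value that has the same index label. Rows with no matching label get NaN.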
import pandas as pd

df = pd.DataFrame([1, 2, 3], index=[9, 8, 7])
df['New'] = pd.Series([1, 3])
# The Series gets the default RangeIndex, which runs from 0 to n-1.
# The DataFrame index (9, 8, 7) matches none of those labels, so every row is NaN.
df
Out[88]: 
   0  New
9  1  NaN
8  2  NaN
7  3  NaN

Different length, with a matching index

df['New'] = pd.Series([1, 2], index=[9, 8])
df
Out[90]: 
   0  New
9  1  1.0
8  2  2.0
7  3  NaN
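If the goal is to assign Series values by position rather than by label, one option is to drop the index before assigning. The sketch below is not part of the original answer. It assumes a Series `s` of the same length as the DataFrame and uses `Series.to_numpy()`. The column names `aligned` and `positional` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([1, 2, 3], index=[9, 8, 7])
s = pd.Series([10, 20, 30])  # default index: 0, 1, 2

# Label-aligned assignment: no label in s matches 9, 8, or 7,
# so the whole column is NaN.
df["aligned"] = s

# Positional assignment: converting to a NumPy array discards the index,
# so the values are placed row by row, like a list.
df["positional"] = s.to_numpy()

print(df)
```

Because the array carries no index, it is subject to the same length requirement as a list.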