Problem description
This works as expected:
a,b = (0,) * 2
print('Before')
print(a,b)
a = 1
print('\nAfter')
print(a,b)
# Before
# 0 0
#
# After
# 1 0
It does not behave the same way in pandas: here sb is just a reference to the same Series as sa:
import numpy as np
import pandas as pd

sa,sb = (pd.Series(np.zeros(2)),) * 2
print('Before')
for s in (sa,sb):
    print(s)
sa[0] = 1
print('\nAfter')
for s in (sa,sb):
    print(s)
# Before
# 0 0.0
# 1 0.0
# dtype: float64
# 0 0.0
# 1 0.0
# dtype: float64
# After
# 0 1.0
# 1 0.0
# dtype: float64
# 0 1.0 <- Note: sb[0] has also changed
# 1 0.0
# dtype: float64
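Is this the expected behavior, and is it documented? It seems to violate the principle of least surprise.
What is the most convenient workaround? Obviously this works: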
sa = pd.Series(np.zeros(2))
sb = sa.copy() # note deep=True by default
But that is a bit verbose, because I need to generate several Series.
This does not work:
sa,sb = (pd.Series(np.zeros(2)).copy(),) * 2
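Solution
As noted in the comments above, pandas Series are mutable, so writing to one through any name also changes what every other reference to the same object sees. The same result can be seen with a dict: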
da,db = ({0:0,1:0},)*2
print('Before')
for d in (da,db):
    print(d)
da[0]=1
print('\nAfter')
for d in (da,db):
    print(d)
# Before
# {0: 0, 1: 0}
# {0: 0, 1: 0}
# After
# {0: 1, 1: 0}
# {0: 1, 1: 0} <- note: db has also changed
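Integers are immutable, so a = 1 does not modify the shared object; it rebinds the name a to a different object, which breaks the shared reference. This can be seen from the object addresses: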
a,b = (0,) * 2
print('Before')
print(hex(id(a)),hex(id(b)))
a = 1
print('\nAfter')
print(hex(id(a)),hex(id(b)))
# Before
# 0x10ab9fef0 0x10ab9fef0
# After
# 0x10ab9ff10 0x10ab9fef0 <- first address changed
Compare:
da,db = ({0:0,1:0},)*2
print('Before')
for d in (da,db):
    print(hex(id(d)))
da[0]=1
print('\nAfter')
for d in (da,db):
    print(hex(id(d)))
# Before
# 0x137b9c320
# 0x137b9c320
# After
# 0x137b9c320 <- first address unchanged
# 0x137b9c320
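The same contrast can be checked without reading addresses, using the is operator. This is only a small sketch built from the examples above; the names same_before, same_after and shared are introduced here for illustration.

```python
# Rebinding a name vs. mutating a shared object, checked with `is`.
a, b = (0,) * 2
same_before = a is b   # both names refer to the same int object
a = 1                  # rebinds the name a; the int 0 itself is untouched
same_after = a is b    # a now refers to a different object than b

da, db = ({0: 0, 1: 0},) * 2
da[0] = 1              # mutates the single shared dict in place
shared = da is db      # still one object under two names

print(same_before, same_after, shared, db)
# True False True {0: 1, 1: 0}
```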
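As Henry Yik said, assigning from a list comprehension is a way to get independent objects (unlike the * operator, which repeats a reference to the same object), because the expression is evaluated again on every iteration.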
da,db = [pd.Series(np.zeros(2)) for _ in range(2)]
print('Before')
for d in (da,db):
    print(hex(id(d)))
# Before
# 0x137b9fad0 <- now it's different to begin with
# 0x137b93fd0
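To confirm that the Series built this way no longer affect each other, here is a short sketch. It also shows a variant that copies an existing Series once per iteration; the name template is hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

# One fresh Series is constructed per iteration, so no object is shared.
sa, sb = [pd.Series(np.zeros(2)) for _ in range(2)]
sa[0] = 1
print(sa is sb)   # False
print(sb[0])      # 0.0, sb is unaffected by the write to sa

# Variant: start from an existing Series; .copy() runs once per iteration.
template = pd.Series(np.zeros(2))  # hypothetical name, for illustration
sc, sd = (template.copy() for _ in range(2))
sc[0] = 1
print(sd[0], template[0])  # neither sd nor template sees the write to sc
```

By contrast, in (pd.Series(np.zeros(2)).copy(),) * 2 the expression inside the tuple is evaluated only once, and * then repeats a reference to that single copy, which is why that attempt in the question still ends up with two names for one Series.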