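Having trouble using NumPy and/or Pandas to work with a 2D list:
>Get the sums of all unique combinations of elements, without picking from the same row twice (the array below should give 81 combinations).
>Print the row and column of each element in the combination.
For example: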
arr = [[1, 2, 4], [10, 3, 8], [16, 12, 13], [14, 4, 20]]
(1,3,12,20), Sum = 36 and (row, col) = [(0,0),(1,1),(2,1),(3,2)]
(4,10,16,20), Sum = 50 and (row, col) =[(0,2),(1,0),(2,0),(3,2)]
Solution:
Approach that creates all such combinations and sums them: here is a vectorized approach using itertools.product and array indexing:
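import numpy as np
from itertools import product

a = np.asarray(arr)  # Convert to array for ease of use and indexing
m, n = a.shape
# One column index per row, for every combination: shape (n**m, m)
combs = np.array(list(product(range(n), repeat=m)))
# Pick a[row, col] for each row of each combination, then sum over the rows
out = a[np.arange(m)[:, None], combs.T].sum(0)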
Sample run:
In [296]: arr = [[1, 2, 4], [10, 3, 8], [16, 12, 13], [14, 4, 20]]
In [297]: a = np.asarray(arr)
...: m,n = a.shape
...: combs = np.array(list(product(range(n), repeat=m)))
...: out = a[np.arange(m)[:,None],combs.T].sum(0)
...:
In [298]: out
Out[298]:
array([41, 31, 47, 37, 27, 43, 38, 28, 44, 34, 24, 40, 30, 20, 36, 31, 21,
37, 39, 29, 45, 35, 25, 41, 36, 26, 42, 42, 32, 48, 38, 28, 44, 39,
29, 45, 35, 25, 41, 31, 21, 37, 32, 22, 38, 40, 30, 46, 36, 26, 42,
37, 27, 43, 44, 34, 50, 40, 30, 46, 41, 31, 47, 37, 27, 43, 33, 23,
39, 34, 24, 40, 42, 32, 48, 38, 28, 44, 39, 29, 45])
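The question also asks for the row and column of each element, which the run above does not print. A minimal sketch of one way to recover them: row k of combs already holds the column chosen in each row, so pairing it with the row numbers gives the (row, col) list. The helper name describe is hypothetical, introduced only for this illustration.

```python
from itertools import product

import numpy as np

arr = [[1, 2, 4], [10, 3, 8], [16, 12, 13], [14, 4, 20]]
a = np.asarray(arr)
m, n = a.shape
rows = np.arange(m)

# Same construction as in the answer: one column index per row
combs = np.array(list(product(range(n), repeat=m)))
out = a[rows[:, None], combs.T].sum(0)


def describe(k):
    """Return (values, sum, [(row, col), ...]) for combination number k.

    Hypothetical helper, not part of the original answer.
    """
    cols = combs[k]
    values = tuple(int(v) for v in a[rows, cols])
    coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
    return values, int(out[k]), coords


# Combination 14 uses columns (0, 1, 1, 2), the first example in the question
values, total, coords = describe(14)
print(values, "Sum =", total, "and (row, col) =", coords)
```

Looping k over range(len(out)) prints all 81 combinations in the same order as out.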
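Memory-efficient approach: here is one that does not create all those combinations and instead sums on the fly with broadcasting. The idea is heavily inspired by this other post: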
a = np.asarray(arr)
m,n = a.shape
out = a[0]
for i in range(1,m):
    out = out[...,None] + a[i]  # Add a new trailing axis for each row
out.shape = out.size # Flatten
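Because this approach never builds the table of column indices, the (row, col) positions have to be recovered from the flat index instead. A minimal sketch, assuming the flattened result keeps the C ordering of the (n, n, ..., n) array built by the loop: np.unravel_index turns a flat index back into one column per row. The helper name columns_for is hypothetical, and out.ravel() is used here in place of assigning to out.shape.

```python
import numpy as np

arr = [[1, 2, 4], [10, 3, 8], [16, 12, 13], [14, 4, 20]]
a = np.asarray(arr)
m, n = a.shape

# Broadcasted running sum, as in the answer
out = a[0]
for i in range(1, m):
    out = out[..., None] + a[i]
out = out.ravel()  # Flatten in C order


def columns_for(k):
    """Column chosen in each row for flat index k (hypothetical helper)."""
    return tuple(int(c) for c in np.unravel_index(k, (n,) * m))


# Example: locate the elements behind the largest sum
k = int(out.argmax())
cols = columns_for(k)
coords = list(zip(range(m), cols))
print(int(out[k]), coords)
```

Only the indices that are actually inspected get decoded, so the memory saving of the broadcasted sum is preserved.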