Is there an alternative vectorized way to write the to_array function?

Problem description

Suppose we have a ragged nested sequence like the following:

import numpy as np
x = np.ones((10,20))
y = np.zeros((10,20))
a = [[0,x],[y,1]]

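We want to create a full numpy array from it, broadcasting the ragged subsequences where necessary so that they match the largest shape of any other subsequence, in this case (10,20). As a first attempt we might try np.array(a), which produces the warning: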

VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray

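Changing this to np.array(a,dtype=object) does give us an array. However, it is an array of objects rather than floats, and it keeps the ragged subsequences instead of broadcasting them as desired. To work around this, I wrote a new function to_array that takes a (possibly ragged, nested) sequence and a shape, and returns a full numpy array of that shape: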

    def to_array(a,shape):
        a = np.array(a,dtype=object)
        b = np.empty(shape)
        for index in np.ndindex(a.shape):
            b[index] = a[index]
        return b
    
    b = np.array(a,dtype=object)
    c = to_array(a,(2,2,10,20))
    
    print(b.shape,b.dtype) # prints (2, 2) object
    print(c.shape,c.dtype) # prints (2, 2, 10, 20) float64

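Note that c, not b, is the desired result. However, to_array relies on a for loop over np.ndindex, and Python for loops are slow for large arrays.

Is there an alternative, vectorized way to write the to_array function?

Solution

Given the target shape, a few iterations don't seem too expensive (A here is assumed to be the object array np.array(a,dtype=object)):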

In [35]: C = np.empty((A.shape+x.shape),x.dtype)                                                    
In [36]: for idx in np.ndindex(A.shape): 
    ...:     C[idx] = A[idx] 
    ...:

Alternatively, you could replace the 0 and 1 with appropriate (10,20) arrays. Here you have already created them as y and x:

In [37]: D = np.array([[y,x],[y,x]])                                                                 
In [38]: np.allclose(C,D)                                                                            
Out[38]: True
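The idea of swapping each scalar for a full-size array can be automated with np.broadcast_to. The following is only a sketch under some assumptions: the name to_array_broadcast is hypothetical, and the outer structure is assumed to convert cleanly to an object array whose shape is the leading part of the target shape. It still loops in Python over the outer elements (four here), but each inner (10,20) block is filled by compiled code.

```python
import numpy as np

def to_array_broadcast(a, shape):
    """Hypothetical helper: expand a ragged nested sequence to a full float array."""
    obj = np.array(a, dtype=object)      # outer structure, e.g. shape (2, 2)
    inner = shape[obj.ndim:]             # remaining dimensions, e.g. (10, 20)
    # broadcast_to gives a read-only view of each element at the inner shape;
    # np.stack then copies the views into a single numeric array.
    parts = [np.broadcast_to(item, inner) for item in obj.ravel()]
    return np.stack(parts).reshape(shape).astype(float)

x = np.ones((10, 20))
y = np.zeros((10, 20))
a = [[0, x], [y, 1]]

c = to_array_broadcast(a, (2, 2, 10, 20))
print(c.shape, c.dtype)
```

Whether this beats the np.ndindex loop depends on the sizes involved; with only a handful of outer elements the difference is likely small.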

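In general, a few iterations on a complex task are fine. Keep in mind that (many) operations on object dtype arrays are actually slower than the equivalent operations on lists. What is relatively fast is compiled, whole-array operations on numeric arrays. That's not your case.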

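But

C[0,0,:,:] = 0

uses broadcasting: all (10,20) values of C[0,0] are filled with the scalar 0.

C[0,1,:,:] = x

is a different kind of broadcasting, where the shape of the RHS already matches the LHS. It's unreasonable to expect numpy to handle both cases with a single broadcasting operation.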
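The two kinds of assignment can be tried side by side. This is a small illustration only; the array C is rebuilt here with the shape used in the question so that the snippet runs on its own.

```python
import numpy as np

x = np.ones((10, 20))
C = np.empty((2, 2, 10, 20))

# Scalar on the right: the single value 0 is stretched over the (10, 20) block.
C[0, 0, :, :] = 0

# Array on the right: its shape (10, 20) already matches the target block.
C[0, 1, :, :] = x

print(C[0, 0].shape, C[0, 1].shape)
```

Each statement is one broadcasting assignment into one block, which is why a mix of scalars and arrays ends up needing a separate assignment per element.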
