How do I vstack or concatenate matrices of different shapes?

Problem description

How can I vstack two matrices in a situation like the one below?

import numpy as np 

a = np.array([[3,3,3],[3,3,3],[3,3,3]])
b = np.array([[2,2],[2,2],[2,2]])

a = np.vstack([a,b])

Output:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 2

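The output I want looks like this:

a = array([[[3,3,3],[3,3,3],[3,3,3]],[[2,2],[2,2],[2,2]]])

My goal is then to loop over the contents of the stacked matrices, index each matrix, and call a function on a specific row: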

for matrix in a:
   row = matrix[1]
   print(row)

Output: 
[3,3,3]
[2,2]

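Solution

Be careful with those "NumPy is faster" claims. If you already have arrays and make full use of array methods, NumPy is indeed faster. But if you start from lists, or have to use Python-level iteration (as you do in Pack...), the NumPy version may well be slower.

Timing just the Pack step: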

In [12]: timeit Pack_Matrices_with_NaN([a,b,c],5)
221 µs ± 9.02 µs per loop (mean ± std. dev. of 7 runs,1000 loops each)

Compare that with using a simple list comprehension to fetch the row at index 1 of each array:

In [13]: [row[1] for row in [a,b,c]]
Out[13]: [array([3., 3., 3.]), array([2., 2.]), array([4., 4., 4., 4.])]
In [14]: timeit [row[1] for row in [a,b,c]]
808 ns ± 2.17 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

200 µs versus less than 1 µs!
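As a minimal sketch of the list-based approach being timed here: the matrices stay in an ordinary Python list, so no padding or stacking is needed. The example data is an assumption (three matrices with 3 rows each and 3, 2, and 4 columns), chosen to match the shapes used elsewhere in this thread.

```python
import numpy as np

# Assumed example data: same number of rows, different numbers of columns.
a = np.full((3, 3), 3.0)
b = np.full((3, 2), 2.0)
c = np.full((3, 4), 4.0)

# A plain list can hold arrays of different shapes as-is.
matrices = [a, b, c]

# Index each matrix and collect the row at index 1.
rows = [matrix[1] for matrix in matrices]

for row in rows:
    print(row)
```

Any per-row function can be applied inside the comprehension or the loop; the arrays themselves are never copied or reshaped.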

And timing your Unpack:

In [21]: [Unpack_Matrix_with_NaN(packed_matrices.reshape(3,3,5),i)[1,:] for i in range(3)]
    ...: 
Out[21]: [array([3., 3., 3.]), array([2., 2.]), array([4., 4., 4., 4.])]
In [22]: timeit [Unpack_Matrix_with_NaN(packed_matrices.reshape(3,3,5),i)[1,:] for i in range(3)]
199 µs ± 10.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

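I was able to solve this using only NumPy. Since NumPy is said to be much faster than Python's list functions (https://towardsdatascience.com/how-fast-numpy-really-is-e9111df44347), I want to share my answer because it may be useful to others.

I started by adding np.nan so that both arrays have the same shape.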

import numpy as np 

a = np.array([[3,3,3],[3,3,3],[3,3,3]]).astype(float)
b = np.array([[2,2],[2,2],[2,2]]).astype(float)

# Extend each vector in array with Nan to reach same shape
b = np.insert(b,2,np.nan,axis=1)

# Now vstack the arrays 
a = np.vstack([[a],[b]])
print(a)

Output: 
[[[ 3.  3.  3.]
  [ 3.  3.  3.]
  [ 3.  3.  3.]]

 [[ 2.  2. nan]
  [ 2.  2. nan]
  [ 2.  2. nan]]]

Then I wrote a function that unpacks each array in a and removes the nan values.

def Unpack_Matrix_with_NaN(Matrix_with_nan,matrix_of_interest):
    for first_row in Matrix_with_nan[matrix_of_interest,:1]:
        # find shape of matrix row without nan 
        first_row_without_nan = first_row[~np.isnan(first_row)]
        shape = first_row_without_nan.shape[0]
        matrix_without_nan = np.arange(shape)
        for row in Matrix_with_nan[matrix_of_interest]:
            row_without_nan = row[~np.isnan(row)]
            matrix_without_nan = np.vstack([matrix_without_nan,row_without_nan])
        # Remove vector specifying shape 
        matrix_without_nan = matrix_without_nan[1:]
        return matrix_without_nan

Then I can loop over the matrices, find the row I want, and print it.

Matrix_with_nan = a

for matrix in range(len(Matrix_with_nan)):
    matrix_of_interest = Unpack_Matrix_with_NaN(a,matrix)
    row = matrix_of_interest[1]
    print(row)

Output: 
[3. 3. 3.]
[2. 2.]
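The unpacking step can also be sketched without a Python loop over rows, using a boolean column mask. This is an illustrative alternative, not the function used in this answer: `unpack_without_nan` is a hypothetical name, and the sketch assumes the padding consists of columns that are entirely NaN and that the real data contains no NaN.

```python
import numpy as np

def unpack_without_nan(packed, index):
    """Return matrix `index` from a NaN-padded 3-D stack, without padding.

    Hypothetical helper: assumes padding columns are entirely NaN and
    that the real data itself contains no NaN values.
    """
    matrix = packed[index]
    # Keep every column that is not made up solely of NaN.
    keep = ~np.isnan(matrix).all(axis=0)
    return matrix[:, keep]

# Assumed example data: a 3x3 matrix and a 3x2 matrix padded to 3x3.
a = np.full((3, 3), 3.0)
b = np.hstack([np.full((3, 2), 2.0), np.full((3, 1), np.nan)])
stacked = np.stack([a, b])

for i in range(len(stacked)):
    print(unpack_without_nan(stacked, i)[1])
```

Because the mask is computed per column, this only works when every row of a matrix has the same number of real values, which is the case for rectangular matrices padded on the right.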

I also made a function that packs the matrices when more than one nan has to be added to each row:

import numpy as np 

a = np.array([[3,3,3],[3,3,3],[3,3,3]]).astype(float)
b = np.array([[2,2],[2,2],[2,2]]).astype(float)
c = np.array([[4,4,4,4],[4,4,4,4],[4,4,4,4]]).astype(float)

# Extend each vector in array with Nan to reach same shape
def Pack_Matrices_with_NaN(List_of_matrices,Matrix_size):
    Matrix_with_nan = np.arange(Matrix_size)
    for array in List_of_matrices:
        start_position = len(array[0])
        for x in range(start_position,Matrix_size):
            array = np.insert(array,(x),np.nan,axis=1)
        Matrix_with_nan = np.vstack([Matrix_with_nan,array])
    Matrix_with_nan = Matrix_with_nan[1:]
    return Matrix_with_nan

arrays = [a,b,c]
packed_matrices = Pack_Matrices_with_NaN(arrays,5)
print(packed_matrices) 

Output:
[[ 3.  3.  3. nan nan]
 [ 3.  3.  3. nan nan]
 [ 3.  3.  3. nan nan]
 [ 2.  2. nan nan nan]
 [ 2.  2. nan nan nan]
 [ 2.  2. nan nan nan]
 [ 4.  4.  4.  4. nan]
 [ 4.  4.  4.  4. nan]
 [ 4.  4.  4.  4. nan]]
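For comparison, the same padding can be sketched by preallocating a NaN-filled array once and copying each matrix into it, which avoids the repeated np.insert and np.vstack calls. `pack_with_nan` is a hypothetical name, and the sketch assumes every matrix has the same number of rows and no more than `width` columns. Unlike the function above it returns a 3-D array, one slice per matrix, so no reshape is needed before indexing.

```python
import numpy as np

def pack_with_nan(matrices, width):
    """Pack 2-D arrays into one (n, rows, width) array padded with NaN.

    Hypothetical helper: assumes all matrices share the same number of
    rows and have at most `width` columns.
    """
    n_rows = matrices[0].shape[0]
    # Start from an all-NaN block, then overwrite the left part of each slice.
    packed = np.full((len(matrices), n_rows, width), np.nan)
    for i, m in enumerate(matrices):
        packed[i, :, :m.shape[1]] = m
    return packed

# Assumed example data matching the shapes used in this thread.
a = np.full((3, 3), 3.0)
b = np.full((3, 2), 2.0)
c = np.full((3, 4), 4.0)

packed = pack_with_nan([a, b, c], 5)
print(packed.shape)
print(packed[1])
```

Calling `packed.reshape(-1, 5)` gives the flat 2-D layout if that form is preferred.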
