Stacking Numpy arrays of images with different sizes

Problem description

I am using OpenCV to build a set of images for analysis in TensorFlow.

I wrote the following function:

def files_to_img_array(path,files_list):
    '''
    Reads a list of image files and creates a Numpy array.
    '''
    # Instantiate arrays
    files = [path+file for file in files_list]
    img_array = np.zeros(72000000) # for flattened 4000x6000 images
    image_names = []

    for file in tqdm.tqdm(files_list):
        full_file = path+file
        image_names.append(file.split('.')[0])
        img = cv2.imread(full_file,1)
        print(img.shape)
        img = img.flatten()
        
        img_array = np.vstack([img_array,img])
    img_array = img_array[1:] # remove instantiating zeroes
    return img_array

The problem is that the images are not all the same size:

  0%|                                     | 0/10 [00:00<?, ?it/s](4000, 6000, 3)
 10%|███████▊                    | 1/10 [00:00<00:03, 2.64it/s](4000, 6000, 3)
 20%|███████████████▌            | 2/10 [00:00<00:03, 2.51it/s](2848, 4288, 3)
Traceback (most recent call last):
...
ValueError: all the input array dimensions for the concatenation axis
must match exactly, but along dimension 1, the array at index 0 has
size 72000000 and the array at index 1 has size 36636672

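From both a programming and an image-processing standpoint, I'm really not sure how to approach this. Does anyone have suggestions for how to pad these differently sized images, or is there something in OpenCV that can handle this? (I'm also happy to use PIL; I'm not married to OpenCV.)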
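For reference, the two sizes in the error message are the flattened pixel counts (height x width x channels) of the image shapes printed above. A minimal sketch that reproduces the mismatch; the small 4x6 and 2x4 arrays are illustrative stand-ins, not the real images:

```python
import numpy as np

# Flattened length = height * width * channels
print(4000 * 6000 * 3)   # 72000000
print(2848 * 4288 * 3)   # 36636672

# Small stand-ins for two images of different sizes (hypothetical shapes)
big = np.zeros((4, 6, 3), dtype=np.uint8).flatten()    # length 72
small = np.zeros((2, 4, 3), dtype=np.uint8).flatten()  # length 24

try:
    np.vstack([big, small])  # rows of different length cannot be stacked
    stacked = True
except ValueError as err:
    stacked = False
    print("vstack failed:", err)
```

Any fix therefore has to bring every image to the same shape, by padding or by resizing, before stacking.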

Solution

Here is how to vertically stack images of arbitrary size in Python/OpenCV, using transparent padding.

Input images:

import cv2
import numpy as np

# load images
img1 = cv2.imread("lena.jpg")
w1 = img1.shape[1]

img2 = cv2.imread("barn.jpg")
w2 = img2.shape[1]

img3 = cv2.imread("monet2.jpg")
w3 = img3.shape[1]

# get maximum width
ww = max(w1,w2,w3)

# pad images with transparency in width
img1 = cv2.cvtColor(img1,cv2.COLOR_BGR2BGRA)
img2 = cv2.cvtColor(img2,cv2.COLOR_BGR2BGRA)
img3 = cv2.cvtColor(img3,cv2.COLOR_BGR2BGRA)
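# arguments are (src, top, bottom, left, right): pad only on the right side
img1 = cv2.copyMakeBorder(img1, 0, 0, 0, ww-w1, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))
img2 = cv2.copyMakeBorder(img2, 0, 0, 0, ww-w2, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))
img3 = cv2.copyMakeBorder(img3, 0, 0, 0, ww-w3, borderType=cv2.BORDER_CONSTANT, value=(0,0,0,0))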

# stack images vertically
result = cv2.vconcat([img1,img2,img3])

# write result to disk
cv2.imwrite("image_stack.png",result)

cv2.imshow("RESULT",result)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:
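The same padding idea extends to any number of images. The sketch below is not the code from the answer: it swaps cv2.copyMakeBorder and cv2.vconcat for NumPy's np.pad and np.concatenate so that it runs without image files, and it assumes 3-channel uint8 arrays. The function name pad_and_stack is hypothetical.

```python
import numpy as np

def pad_and_stack(images):
    """Right-pad 3-channel images to a common width with transparent
    pixels, then stack them vertically into one 4-channel image."""
    max_width = max(img.shape[1] for img in images)
    padded = []
    for img in images:
        h, w = img.shape[:2]
        # Add a fully opaque alpha channel (BGR -> BGRA)
        alpha = np.full((h, w, 1), 255, dtype=img.dtype)
        bgra = np.concatenate([img, alpha], axis=2)
        # Pad only on the right; padded pixels are (0, 0, 0, 0), i.e. transparent
        bgra = np.pad(bgra, ((0, 0), (0, max_width - w), (0, 0)),
                      mode="constant", constant_values=0)
        padded.append(bgra)
    return np.concatenate(padded, axis=0)

# Synthetic stand-ins for images of different sizes
imgs = [np.full((2, 3, 3), 10, np.uint8), np.full((4, 5, 3), 20, np.uint8)]
result = pad_and_stack(imgs)
print(result.shape)  # (6, 5, 4)
```

The output is one tall image, not an array with one row per image.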

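Thanks to @hpaulj for the comment on the question, which led to my investigation and to this answer.

The following code relies on Keras and, underneath it, PIL: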

import concurrent.futures

import numpy as np
import PIL  # used by load_img at runtime
import tensorflow
from tensorflow.keras.preprocessing.image import load_img, img_to_array

def keras_pipeline(file):
    TARGET_SIZE = (100,150)
    img = load_img(file,target_size=TARGET_SIZE)
    img_array = img_to_array(img)
    return img_array
 
def files_to_array(path,files_list):
    files = [path+file for file in files_list]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        img_map = executor.map(keras_pipeline,files)
    return img_map

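keras_pipeline() defines the transformation pipeline for a single image: load_img() resizes it to TARGET_SIZE, given as (height, width), so every resulting array has the same shape. files_to_array() maps that pipeline over every image and returns a generator. The generator can then be turned into a Numpy array with np.hstack():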

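img_map = files_to_array(path, files_list)
existing_array = next(img_map)  # start from the first image
for img in img_map:
    existing_array = np.hstack([existing_array, img])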
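Note that np.hstack joins image arrays along their second axis (width), so the loop above builds one wide image. If the downstream model expects a batch of shape (N, height, width, channels), which is an assumption since the original does not say, np.stack is one option. A sketch with synthetic arrays standing in for the output of keras_pipeline():

```python
import numpy as np

# Stand-in for the iterator returned by files_to_array():
# three arrays of shape (100, 150, 3), as TARGET_SIZE = (100, 150) would give
img_map = (np.full((100, 150, 3), i, dtype=np.float32) for i in range(3))
images = list(img_map)  # consume the iterator once

wide = np.hstack(images)   # concatenates along axis 1 (width)
batch = np.stack(images)   # adds a new leading axis, one entry per image

print(wide.shape)   # (100, 450, 3)
print(batch.shape)  # (3, 100, 150, 3)
```

Collecting the arrays in a list and combining them once also avoids re-copying the growing array on every loop iteration.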
