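Python code for stacking images runs very slowly, looking for suggestions to speed it up

Problem description

I have written some code that reads the RGB values of every pixel from about 150 images (1000 px by 720 px, already cropped and resized).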

import os
from PIL import Image
print("STACKING IMAGES...")
os.chdir('cropped')
images=os.listdir() #list all images present in directory
print("GETTING IMAGES...")
channelR=[]
channelG=[]
channelB=[]
print("GETTING PIXEL INFORMATION...")  #runs reasonably fast
for image in images:  #loop through each image to extract RGB channels as separate lists
    with Image.open(image) as img:
        if image==images[0]:
            imgSize=img.size
        channelR.append(list(img.getdata(0)))
        channelG.append(list(img.getdata(1)))
        channelB.append(list(img.getdata(2)))
print("PIXEL INFORMATION COLLECTED.")
print("AVERAGING IN CHANNEL RED.") #average for each pixel in each channel
avgR=[round(sum(x)/len(channelR)) for x in zip(*channelR)] #unzip each pixel from all ~250 images,average it,store in list,starts to slow
print("AVERAGING IN CHANNEL GREEN.")
avgG=[round(sum(x)/len(channelG)) for x in zip(*channelG)] #slower
print("AVERAGING IN CHANNEL BLUE.")
avgB=[round(sum(x)/len(channelB)) for x in zip(*channelB)] #progressively slower
print("MERGING DATA ACROSS THREE CHANNELS.")
mergedData=[(x) for x in zip(avgR,avgG,avgB)]  #merge averaged colour channels pixel by pixel,doesn't seem to end,takes eternity
print("GENERATING IMAGE.")
stacked=Image.new('RGB',(imgSize)) #create image
stacked.putdata(mergedData) #generate image
stacked.show()
os.chdir('..')
stacked.save('stacked.tif','TIFF') #save file
print("FINISHED STACKING !")

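On a modestly specced computer (Core2Duo, 4GB RAM, Linux Mint OS), it took close to an hour to finish averaging the three channels, and a further hour merging the averaged pixels (that step never finished; I aborted the process). I have read that list comprehensions are slow and that zip() takes up too much memory, but changing those parts only introduced more bugs. I have even read that splitting the program into functions might speed it up.

For comparable performance figures, I kindly ask anyone answering to run the code on the images at https://github.com/rlvaugh/Impractical_Python_Projects/tree/master/Chapter_15/video_frames

Any help with speeding up the program would be greatly appreciated. Is there any chance it would become drastically faster on a more powerful system?

Thank you in advance for any help.

Solution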

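Appending to lists is slow. So is running several list comprehensions for something that could be done in a single loop. You can also use numpy arrays, which speed things up with SIMD operations, instead of iterating over a list.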
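
To make the comparison concrete, here is a minimal sketch on synthetic data (not the real frames). It averages one colour channel twice: once with the list comprehension pattern from the question, and once with a single numpy call. The array shape, the seed, and the variable names are assumptions chosen for illustration only.

```python
import numpy as np

# Synthetic stand-in for one colour channel: 5 "images" of 12 pixels each.
rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(5, 12), dtype=np.uint8)

# Pure-Python version, same pattern as the question's list comprehension.
as_lists = channel.tolist()
avg_lists = [round(sum(px) / len(as_lists)) for px in zip(*as_lists)]

# Vectorised version: mean over the image axis, then round to integers.
avg_numpy = np.rint(channel.mean(axis=0)).astype(np.uint8)

print(avg_lists == avg_numpy.tolist())
```

Both versions should agree on the result; the numpy one does the per-pixel loop in compiled code rather than in the Python interpreter.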

Here is some sample code for a few images. You can extend it as needed.

import os
import numpy as np
import PIL.Image # "import PIL" alone does not load the Image submodule

os.chdir('cropped')

imgfiles = ['MVI_6450 001.jpg','MVI_6450 002.jpg','MVI_6450 003.jpg','MVI_6450 004.jpg']

allimgs = None

for imgnum,imgfile in enumerate(imgfiles):
    img = PIL.Image.open(imgfile)
    imgdata = np.array(img.getdata()) # Nx3 array. columns: R,G,B channels
    
    if allimgs is None:
        allshape = list(imgdata.shape) # Size of one image
        allshape.append(len(imgfiles)) # Append number of images
        # allshape is now [num_pixels,num_channels,num_images]
        # so making an array of this shape will allow us to store all images
        # Axis 0: pixels. Axis 1: channels. Axis 2: images
        allimgs = np.zeros(allshape) 
    
    allimgs[:,:,imgnum] = imgdata # Set the imgnum'th image data
    

# Get the mean along the last axis 
#     average same pixel across all images for each channel
imgavg = np.mean(allimgs,axis=-1) 

# normalize so that max value is 255
# Also convert to uint8
imgavg = np.uint8(imgavg / np.max(imgavg) * 255)

imgavg_tuple = tuple(map(tuple,imgavg))

stacked = PIL.Image.new("RGB",img.size)
stacked.putdata(imgavg_tuple)
stacked.show()

os.chdir('..')
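
The sample above keeps every image in one float64 array. At the question's scale (about 150 images of 1000 x 720 pixels, 3 channels) that is roughly 2.6 GB, which is tight on a 4GB machine. A possible alternative, sketched below under assumptions, is to keep only a running sum and divide at the end. The function name average_frames and the synthetic frames are hypothetical, and unlike the sample above this sketch rounds the plain mean instead of rescaling the maximum to 255.

```python
import numpy as np
from PIL import Image


def average_frames(frames):
    """Average an iterable of HxWx3 uint8 arrays using a running sum."""
    total = None
    count = 0
    for frame in frames:
        arr = np.asarray(frame, dtype=np.float64)
        if total is None:
            total = np.zeros_like(arr)  # one accumulator, same shape as a frame
        total += arr
        count += 1
    if count == 0:
        raise ValueError("no frames given")
    return np.rint(total / count).astype(np.uint8)


# Synthetic stand-ins for real frames: 2 rows x 3 columns, RGB.
frames = [np.full((2, 3, 3), value, dtype=np.uint8) for value in (10, 20, 60)]
avg = average_frames(frames)

# Build the output image directly from the array instead of a list of tuples.
stacked = Image.fromarray(avg)
```

With real files, something like average_frames(Image.open(p).convert("RGB") for p in paths) should work, where paths is your own list of file names.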

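Note: we create one numpy array up front to hold all the images, rather than appending as each image is loaded, because appending to numpy arrays is a bad idea, as Jacob mentions in a comment below. Appending to a numpy array actually creates a new array and then copies the contents of both arrays into it, so building the array that way is an O(n^2) operation.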
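
As a small illustration of that note, the sketch below builds the same array in two ways on made-up data. The sizes and variable names are assumptions for illustration only. np.append returns a new array on every call, while the preallocated version writes into memory that was reserved once.

```python
import numpy as np

n_images, n_values = 4, 6
data = [np.full(n_values, i, dtype=np.uint8) for i in range(n_images)]

# Growing by appending: every call allocates a new array and copies the old one.
grown = np.empty((0, n_values), dtype=np.uint8)
for row in data:
    bigger = np.append(grown, row[np.newaxis, :], axis=0)
    assert bigger is not grown  # a fresh array, not an in-place extension
    grown = bigger

# Preallocating: one allocation, then each row is written in place.
prealloc = np.zeros((n_images, n_values), dtype=np.uint8)
for i, row in enumerate(data):
    prealloc[i] = row

print(np.array_equal(grown, prealloc))
```

The results are identical; only the amount of copying differs, and it grows with every image added in the appending version.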