Haskell中更快的SumSquareDifference

问题描述

我正在Haskell中实现二进制图像的分形图像压缩算法。为此,我必须在给定的范围块(子图像)中找到所谓的域池中最接近的图像,即图像列表列表。我正在通过计算两个像素值的和平方差来比较图像。

我使用Haskell图像处理(HIP)库来读取和写入图像。

compress :: Image VS X Bit -> Int -> [(Int,Int)]
compress img blockSize = zip dIndices tIndices
    where rImg = img
          dImg = downsample2 rImg
          rBlocks = (toBlocks rImg blockSize) :: [Image VS X Bit]
          dBlocks = (toBlocks dImg blockSize) :: [Image VS X Bit]
          dPool = (createDPool dBlocks) :: [[Image VS X Bit]]
          distanceLists = map (\x -> (map.map) (distance x) dPool) rBlocks
          dIndices = map (fst . getMinIndices) distanceLists
          tIndices = map (snd . getMinIndices) distanceLists


distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . toLists

toLists :: MArray arr cs e => Image arr cs e -> [[Pixel cs e]]
toLists img = [[index img (i,j) | j <- [0..cols img -1]] | i <- [0.. rows img -1]]

extractBitOfPixel :: Pixel X Bit -> Bit
extractBitOfPixel (PixelX b) = b

sumSquareDifference :: [Int] -> [Int] -> Int
sumSquareDifference a b = sum $ zipWith (\x y -> (x-y)^2) a b

此代码的性能确实很差。尽管使用-O2进行编译,但以块大小2压缩256x256图像仍需要5分钟左右。分析表明,大多数运行时都用在函数distance中,尤其是在 sumSquareDifference中,但也花费在toListstoBinList中:

       binaryCompressionSimple +RTS -p -RTS

    total time  =     1430.89 secs   (1430893 ticks @ 1000 us,1 processor)
    total alloc = 609,573,757,744 bytes  (excludes profiling overheads)

COST CENTRE               MODULE    SRC                                        %time %alloc

sumSquareDifference       Main      binaryCompressionSimple.hs:87:1-63          30.9   28.3
toLists                   Main      binaryCompressionSimple.hs:66:1-90          20.3   47.0
distance.toBinList        Main      binaryCompressionSimple.hs:74:11-79         10.9   15.1
main                      Main      binaryCompressionSimple.hs:(14,1)-(24,21)    7.3    0.0
compress                  Main      binaryCompressionSimple.hs:(28,1)-(36,60)    6.9    0.0
distance                  Main      binaryCompressionSimple.hs:(71,1)-(74,79)    5.7    0.9
compress.distanceLists.\  Main      binaryCompressionSimple.hs:34:38-65          5.2    4.4
compress.distanceLists    Main      binaryCompressionSimple.hs:34:11-74          2.8    0.0
main.\                    Main      binaryCompressionSimple.hs:20:72-128         2.7    0.0
getMinIndices.getMinIndex Main      binaryCompressionSimple.hs:116:11-53         2.7    1.8
sumSquareDifference.\     Main      binaryCompressionSimple.hs:87:52-58          2.7    2.5

是否可以提高性能?

块大小为2表示将16384个范围块与域池的131072个图像进行比较,因此sumSquareDifference将被调用(16384 * 131072 =)2147483648次,并每次计算长度为4的两个列表的和平方差。我意识到这是一个很大的数目,但是代码是否不应该更快(对列表进行延迟评估)?这是Haskell问题还是算法问题?

编辑:

使用以下命令,我至少可以将性能提高三分之一:

distance :: Image VS X Bit -> Image VS X Bit-> Int
distance x y
     | x == y = 0
     | otherwise = sumSquareDifference (toBinList x) (toBinList y)
    where toBinList = map (toNum . extractBitOfPixel) . concat . inlinedToLists

编辑2:

我可以通过使用函数dPool创建genDistanceList来极大地提高性能,一旦找到两个相同的图像,该计算就会停止:

genDistanceList :: [[Image VS X Bit]] -> Image VS X Bit -> [[Int]]
genDistanceList dPool rBlock = nestedTakeWhileInclusive (/= 0) $ (map.map) (distance rBlock) dPool

解决方法

暂无找到可以解决该程序问题的有效方法,小编努力寻找整理中!

如果你已经找到好的解决方法,欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@)