What exactly should be done at the maximum code table size, and then on reading a CLEAR code, in the GIF LZW decompression algorithm?

Problem description

I have been working on learning how to encode and decode GIF image files. Here is my test image:

Here is the code, with the problematic part replaced by a return so that it stops before things go wrong:

from enum import Enum

class Codes(Enum):
    CLEAR = 0
    EOI = 1

def decode(code_bytes: bytes):
    """Return decoded gif from incoming bytes,currently only works with entire bytes object"""
    lzw_min,leng,*_ = code_bytes
    total = 2 + leng

    # Skip lzw_min and sub-block size indicator bytes
    skips = {0,1,total}
    while leng != 1:
        leng = code_bytes[total] + 1
        total += leng
        skips |= {total}

    def _stream(skips):
        """Return least significant bit still available from current byte"""
        for i,byte in enumerate(code_bytes):
            if i in skips:
                continue
            for _ in range(8):
                yield byte & 1 and 1
                byte >>= 1

    code_stream = _stream(skips)

    def get_code(bits,x=0,s=0):
        """Retrieve bits and assemble in proper order"""
        for n in range(s,bits):
            x |= next(code_stream) << n
        return x

    code_table = {
            **{i: [i] for i in range(2 ** lzw_min)},2 ** lzw_min: [Codes.CLEAR],2 ** lzw_min + 1: [Codes.EOI]
    }
    bits = lzw_min + 1
    assert Codes.CLEAR in code_table[get_code(bits)]  # First code is always CLEAR
    table_index = init_table = len(code_table) - 1
    last = get_code(bits)
    code = get_code(bits)
    yield last
    while Codes.EOI not in code_table.get(code,[]):

        if code <= table_index:
            for i in code_table[code]:
                yield i
            table_index += 1
            code_table[table_index] = code_table[last] + [code_table[code][0]]

        else:
            k = code_table[last][0]
            for i in code_table[last] + [k]:
                yield i
            table_index += 1
            code_table[table_index] = code_table[last] + [k]

        if table_index == 2**bits-1:
            bits += 1

        # Problem replaced here
        if bits == 13:
            return

        last,code = code,get_code(bits)

The second frame of the GIF, as output by the code above:

[image]

Currently, the output for the same input with the problematic code included:

[image: test image output, only the top half]

The problem is here:

        if bits == 13:
            last,get_code(bits)
            bits = lzw_min + 1
            table_index = init_table
            assert Codes.CLEAR in code_table[code]

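While reading the image, this code executes when table_index reaches one below the maximum table size, so that bits is incremented past the maximum code length of 12. It clearly does not work properly.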

After the maximum bit size is reached, the next code read is a CLEAR code, as expected, and no AssertionError is raised.
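A minimal sketch of the code-width rule at this boundary, for reference. The GIF89a specification limits codes to 12 bits, so the width stops growing once the table holds index 4095 and the next code (typically the CLEAR) is still read with 12 bits, not 13. The names `MAX_BITS` and `next_code_width` are illustrative assumptions and do not appear in the code above.

```python
MAX_BITS = 12  # GIF89a: codes never exceed 12 bits (table indices 0..4095)


def next_code_width(bits: int, table_index: int) -> int:
    """Return the width to use for the next code read.

    table_index is the highest index currently assigned in the code table,
    matching how the variable is used in the question's decoder.
    """
    if table_index == (1 << bits) - 1 and bits < MAX_BITS:
        return bits + 1  # the next new entry would not fit in the current width
    return bits          # either no growth is needed, or the 12-bit cap applies


print(next_code_width(9, 511))    # 10
print(next_code_width(12, 4095))  # 12: the table is full, the width stays put
```

Reading one bit too many at this point can still yield the CLEAR value if the extra bit happens to be 0, but every code after it is then misaligned by one bit.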

As I understand it, a CLEAR code means resetting the bit length to lzw_min + 1 and reinitializing the code table. I believe that is what my code does.
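A hedged sketch of what a full reset on CLEAR could look like. The function name `reset_on_clear` and the use of `None` for `last` are assumptions made for illustration. Besides the table and the bit length, the previous code is discarded too: the first code read after a CLEAR is output directly and becomes `last`, and no table entry is added for it.

```python
def reset_on_clear(lzw_min: int):
    """Return a fresh (code_table, bits, table_index, last) tuple."""
    clear = 1 << lzw_min
    # Rebuild the table from scratch so no stale entries above EOI survive.
    code_table = {i: [i] for i in range(clear)}
    code_table[clear] = []      # CLEAR: occupies a slot, produces no pixels
    code_table[clear + 1] = []  # EOI: occupies a slot, produces no pixels
    bits = lzw_min + 1          # back to the initial code width
    table_index = clear + 1     # highest assigned index is EOI again
    last = None                 # no previous code until the next one is read
    return code_table, bits, table_index, last


code_table, bits, table_index, last = reset_on_clear(8)
print(bits, table_index, last)  # 9 257 None
```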

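Originally, code_table was a list that was reset after a CLEAR and the bit-length reset, but codes far larger than the current table index kept coming out of the stream. Making the code table a dict allows access to entries that have not yet been overwritten after reinitialization, but that only produces noise. Evidently, some of the noise eventually registers as EOI and parsing ends.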

What is wrong with what this algorithm does after it encounters a maximum-size code followed by a CLEAR?
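For comparison, here is a minimal decoder sketch that handles both situations. It is not the code from the question: the name `lzw_decode` is hypothetical, and it assumes the sub-blocks have already been concatenated into `data` with the size bytes and the leading minimum-code-size byte stripped. The code width is capped at 12 bits, the table stops growing at 4096 entries, and a CLEAR resets the table, the width, and the previous code.

```python
from typing import Iterator, List, Optional

MAX_BITS = 12  # GIF89a caps LZW codes at 12 bits


def lzw_decode(data: bytes, lzw_min: int) -> Iterator[int]:
    """Yield palette indices from GIF LZW data (sub-blocks already joined)."""
    clear = 1 << lzw_min
    eoi = clear + 1
    pos = 0  # current bit position in data

    def read(bits: int) -> int:
        """Read one code, least significant bit first."""
        nonlocal pos
        value = 0
        for n in range(bits):
            byte_index, bit_index = divmod(pos, 8)
            if byte_index >= len(data):
                return eoi  # assumption: treat truncated input as end of data
            value |= ((data[byte_index] >> bit_index) & 1) << n
            pos += 1
        return value

    def fresh_table() -> List[List[int]]:
        # Roots, plus two empty placeholders for the CLEAR and EOI slots.
        return [[i] for i in range(clear)] + [[], []]

    table = fresh_table()
    bits = lzw_min + 1
    last: Optional[int] = None

    while True:
        code = read(bits)
        if code == eoi:
            return
        if code == clear:
            table, bits, last = fresh_table(), lzw_min + 1, None
            continue
        if code > len(table) or (last is None and code >= clear):
            raise ValueError("corrupt LZW stream")
        if last is None:
            entry = table[code]  # first code after CLEAR: no new table entry
        else:
            if code < len(table):
                entry = table[code]
            else:  # code == len(table): the "not yet in table" case
                entry = table[last] + [table[last][0]]
            if len(table) < (1 << MAX_BITS):  # a full table simply stops growing
                table.append(table[last] + [entry[0]])
                if len(table) == (1 << bits) and bits < MAX_BITS:
                    bits += 1
        yield from entry
        last = code


# Codes 4 (CLEAR), 1, 6, 6, 2, 5 (EOI) packed LSB-first with lzw_min = 2:
print(list(lzw_decode(bytes([0x8C, 0x2D, 0x05]), 2)))  # [1, 1, 1, 1, 1, 2]
```

Because a full table just stops accepting entries while codes keep being read at 12 bits, this structure also tolerates streams in which the encoder defers the CLEAR.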


To reproduce my problem (requires Pillow):

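Here is the second frame of the test GIF as a bytes literal that you can use as input to the function, and the frame's palette as a list literal for building the image: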

Bytes: (removed from pastebin)

Palette:

[(230,226,225),(54,28,25),(99,25,28),(117,22,(65,39,33),(79,45,38),(92,36),(81,39),48,40),(88,50,43),(100,42,(104,60,50),(111,66,55),(119,68,57),(138,11,23),(134,14,24),(139,(151,9,(148,13,26),(156,12,(132,18,27),(141,29),(149,19,30),(166,7,(173,(164,(172,(184,4,(181,(190,6,(180,10,(191,16,(142,35),23,(146,27,37),(147,30,41),(189,(169,20,(188,26,35,45),(155,36,42),41,51),(153,43,53),63,56),51,60),40,52),54),(167,53,(192,(193,(194,(196,(195,44),31,48),(199,34,(197,(201,(198,58),(200,(202,(131,75,63),(175,67,61),(159,55,64),(157,59,67),(161,68),(170,58,69),(185,52,56,70),(204,75),(208,62,81,71),76),(171,69,(183,70,73),86,(165,83),76,84),85),90,(168,84,91),99,82),(177,104,86),(182,106,88),97),92,99),102,108),100,107),109,114),(186,118,123),72),(205,65,78),(206,77,74,87),(207,73,90),85,(215,91,(209,94),(218,92),114,93),(203,(212,112,95),100),(210,(211,101),94,105),(213,98,109),119,122,121,(217,125,102),105,(214,113),117),119),(216,111,120),116,115,124),(225,103),(221,130,(226,133,110),(228,135,112),(230,136,(232,138,(224,140,121),129),124,130),(219,123,132),137),143,146),149,152),160,162),132,141),131,139),141,145),(222,139,150,154),153,135),142,149),144,151),148,155),159),169,156),158,161),(227,156,167,169),162,165),166,170,173),175,177),174,176),184,185),179,180),182,184),187,188),161,166),(229,165,170),174),(231,173,178),178,181),183,(233,(234,186,189),191,192),193),(235,190,203,202),195,196),205,204),201,201),208,207),210,210),212,212),216,215),218,217),(236,197),199,200),(237,203),(238,207,208),(240,213,213),221,220),220,219),(241,211,215,216),(242,219,224,223),(239,224),229,228),232,231),234,232),(244,228,227),(245,237,235),(248,239,237),(246,241,239),(247,242,240),(250,248,246),(0,0),0)]

Here is the code needed to produce the same output as mine using those literals:

from PIL import Image
w,h = 380,407
gifo = Image.new('RGB',(w,h))
gif = gifo.load()
pix = decode(FRAME_2)  # bytes literal as input
palette = PALETTE      # palette literal as input
try:
    for y in range(h):
        for x in range(w):
            t = next(pix)
            while t in {Codes.EOI, Codes.CLEAR}:  # Only needed to handle buggy noise with codes mixed into it
                t = next(pix)
            gif[x,y] = palette[t]
except StopIteration:
    ...
gifo.show()

Resources I have used for this project:

http://giflib.sourceforge.net/whatsinagif/lzw_image_data.html

https://www.daubnet.com/en/file-format-gif

https://www.w3.org/Graphics/GIF/spec-gif89a.txt

https://www.eecis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html

On deferred CLEAR codes (this GIF does not contain any, so I have not included logic for them in this question):

https://halicery.com/Image%20Decoders/GIF/GIF-LZW%20Deferred%20Clear%20Code.html
