Zip iterators asserting equal length in Python

Answer

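I can think of a simpler solution: use itertools.zip_longest() and raise an exception if the sentinel value used to pad out the shorter iterables is present in the tuple produced: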

from itertools import zip_longest

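def zip_equal(*iterables):
    # A unique object that cannot appear in any of the input iterables.
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        # Compare by identity: "sentinel in combo" would also call ==,
        # which some element types (e.g. NumPy arrays) do not answer with a bool.
        if any(item is sentinel for item in combo):
            raise ValueError('Iterables have different lengths')
        yield combo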
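
A short usage sketch may help show when the error surfaces. The function is restated so the snippet runs on its own, with the sentinel compared by identity; the variable names (equal, seen, raised) are illustrative only. Because zip_equal is a generator, the ValueError appears only once iteration reaches the point of mismatch, after the matching pairs have already been yielded.

```python
from itertools import zip_longest

def zip_equal(*iterables):
    sentinel = object()
    for combo in zip_longest(*iterables, fillvalue=sentinel):
        if any(item is sentinel for item in combo):
            raise ValueError('Iterables have different lengths')
        yield combo

# Equal lengths: behaves like zip(), and works with generators too.
equal = list(zip_equal('abc', (n for n in range(3))))

# Unequal lengths: the pairs before the mismatch are still produced.
seen = []
try:
    for pair in zip_equal('abc', [0, 1]):
        seen.append(pair)
    raised = False
except ValueError:
    raised = True
```

Callers that must not act on partial results could collect everything with list() first and only then process the pairs.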

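Unfortunately, we can't use zip() with yield from to avoid a Python-code loop with a test on every iteration. Once the shortest iterator runs out, zip() has already advanced all the preceding iterators, so the evidence is swallowed if they contained only one extra item.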
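
The swallowing effect can be seen in a minimal sketch. The names first, second, pairs, and leftover are illustrative only. The longer iterator comes first, so zip() pulls its third item before discovering that the second iterator is empty.

```python
first = iter([1, 2, 3])   # one extra item
second = iter([10, 20])

pairs = list(zip(first, second))

# zip() already consumed the 3 from `first` before `second` ran out,
# so nothing is left to prove that `first` was longer.
leftover = list(first)
```

Here pairs holds the two complete tuples and leftover is empty, which is why a check performed after zip() finishes cannot reliably detect the mismatch.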

Question

I am looking for a nice way to zip several iterables that raises an exception if the iterables do not all have the same length.

If the iterables are lists, or otherwise support len(), this solution is clean and easy:

def zip_equal(it1,it2):
    if len(it1) != len(it2):
        raise ValueError("Lengths of iterables are different")
    return zip(it1,it2)

However, if it1 and it2 are generators, the previous function fails because the length is not defined: TypeError: object of type 'generator' has no len()
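
The failure is easy to reproduce in isolation. This is only a sketch; numbers and error_message are illustrative names.

```python
numbers = (n for n in range(3))  # a generator, not a sequence

try:
    len(numbers)
    error_message = None
except TypeError as exc:
    # Generators produce items lazily and do not implement __len__.
    error_message = str(exc)
```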

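I imagine that the itertools module offers a simple way to implement this, but so far I have not been able to find it. I have come up with this home-made solution: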

def zip_equal(it1,it2):
    exhausted = False
    while True:
        try:
            el1 = next(it1)
            if exhausted: # in a previous iteration it2 was exhausted but it1 still has elements
                raise ValueError("it1 and it2 have different lengths")
        except StopIteration:
            exhausted = True
            # it2 must be exhausted too.
        try:
            el2 = next(it2)
            # here it2 is not exhausted.
            if exhausted:  # it1 was exhausted => raise
                raise ValueError("it1 and it2 have different lengths")
        except StopIteration:
            # here it2 is exhausted
            if not exhausted:
                # but it1 was not exhausted => raise
                raise ValueError("it1 and it2 have different lengths")
            exhausted = True
        if not exhausted:
            yield (el1,el2)
        else:
            return

This solution can be tested with the following code:

# Run each case on its own: the first two are expected to raise ValueError.
it1 = (x for x in ['a', 'b', 'c'])  # it1 has length 3
it2 = (x for x in [0, 1, 2, 3])     # it2 has length 4
list(zip_equal(it1, it2))           # len(it1) < len(it2) => raise
it1 = (x for x in ['a', 'b', 'c'])  # fresh generators, since the previous call consumed them
it2 = (x for x in [0, 1, 2, 3])     # it2 has length 4
list(zip_equal(it2, it1))           # len(it2) > len(it1) => raise
it1 = (x for x in ['a', 'b', 'c', 'd'])  # it1 has length 4
it2 = (x for x in [0, 1, 2, 3])          # it2 has length 4
list(zip_equal(it1, it2))                # like zip (or izip in python2)

Am I overlooking any alternative solution? Is there a simpler implementation of my zip_equal function?
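
One further option, for readers on Python 3.10 or later: the built-in zip() accepts a strict=True keyword argument and raises ValueError when the iterables have different lengths. The sketch below is guarded by a version check, since older interpreters reject the keyword; matched and mismatch_detected are illustrative names.

```python
import sys

# zip(strict=True) only exists on Python 3.10 and later.
if sys.version_info >= (3, 10):
    # Equal lengths: behaves like plain zip(), generators included.
    matched = list(zip('abc', (n for n in range(3)), strict=True))

    # Unequal lengths: ValueError is raised while the zip is consumed.
    try:
        list(zip('abc', (n for n in range(4)), strict=True))
        mismatch_detected = False
    except ValueError:
        mismatch_detected = True
```

As with the generator-based approaches, the error is raised lazily, so pairs produced before the mismatch may already have been processed.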