Python itertools.groupby() in a recursive function

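Problem description

I am trying to iterate over the groups from itertools.groupby inside a recursive function, in order to build a nested dictionary from a nested list.

Input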

example = [['a', [], 'b', (), 1, None],
           ['a', [], 'c', (), 0, None],
           ['a', [], 2, None],
           ['a', [], 3, None]]

Expected output

output = {'a': [{'b': (1, None)}, {'c': (1, None)}, ...
                ]
          }

The code I am trying

from itertools import chain,groupby

def group_key(lst,level=0):
    return lst[level]

def build_dict(data=None,grouper=None):
    if grouper is None:
        gen = groupby(data,key=group_key)
    else:
        if any(isinstance(i,list) for i in grouper):
            level_down = [l[1:] for l in grouper]
            gen = groupby(level_down,key=group_key)
        else:
            return grouper

    for char,group in gen:
        group_lst = list(group)

        if isinstance(char,str):
            value = {char: build_dict(grouper=group_lst)}
        elif char == ():
            value = tuple(build_dict(grouper=group_lst))
        elif char == []:
            value = [build_dict(grouper=group_lst)]
        else:
            value = chain.from_iterable(group_lst)
        
        return value

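When I run the code, I only get the first group from the for char,group in gen: loop. Somehow the function does not continue with the other groups. I am not very good with recursive functions, so maybe I am missing something there. This is what the code produces: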

In: build_dict(example)
Out: {'a': [{'b': (1, None)}]}
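
One thing worth noting about the code above: the return value statement is indented inside the for loop, so build_dict returns during the first iteration and the remaining groups are never visited. The following is a minimal sketch of that effect on its own, not a fix for build_dict. The names rows, first_group_only and all_groups are hypothetical and exist only for this illustration.

```python
from itertools import groupby

# Hypothetical sample data: rows that are already sorted by their first item.
rows = [['a', 1], ['a', 2], ['b', 3]]

def first_group_only(data):
    for key, group in groupby(data, key=lambda row: row[0]):
        # A return inside the loop exits during the first iteration,
        # so only the first group is ever seen.
        return {key: list(group)}

def all_groups(data):
    result = {}
    for key, group in groupby(data, key=lambda row: row[0]):
        # Accumulate every group, then return once the loop has finished.
        result[key] = list(group)
    return result

print(first_group_only(rows))  # {'a': [['a', 1], ['a', 2]]}
print(all_groups(rows))        # {'a': [['a', 1], ['a', 2]], 'b': [['b', 3]]}
```

Collecting into an accumulator and returning after the loop is the usual way to keep every group.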

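Solution

The structure is a bit inconsistent: at the top level it presents the dictionary content as a list of [key, collection, values...] lists, but it specifies sub-dictionaries without the enclosing list of lists. The data structure can still be built recursively, although that inconsistency has to be worked around.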

def buildData(content, asValues=False):
    if not asValues:
        result = dict()  # assumes a list of [key, model, values...] rows
        for k, model, *values in content:
            # model is the empty [] or () that fixes the container type;
            # start from a fresh empty copy so the input rows are not mutated
            result.setdefault(k, type(model)())
            result[k] += type(model)(buildData(values, True))
        return result
    if len(content) > 2 \
    and isinstance(content[0], str) and isinstance(content[1], (tuple, list)):
        return [buildData([content])]  # adapts to match top level structure
    if content:  # everything else produces a list of data items
        return content[:1] + buildData(content[1:], True)
    return []  # until data exhausted

Output:

example = [['a', [], 'b', (), 1, None],
           ['a', [], 'c', (), 0, None],
           ['a', [], 2, None],
           ['a', [], 3, None]]
d = buildData(example)

print(d)

{'a': [{'b': (1, None)}, {'c': (0, None)}, 2, None, 3, None]}

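Restructure

This is not a problem for itertools.groupby. The logic you use to "group" the elements is unique, and I would not expect to find a built-in function that meets your exact needs. Below I start with restructure, which takes each element of example and produces output similar to what you already have -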

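def restructure(t):
  # Walk the row from right to left, accumulating the result in the tuple r.
  def loop(t, r):
    if not t:
      return r[0]                           # row consumed: unwrap the result
    if t[-1] == ():
      return loop(t[0:-1], tuple(r))        # () marker: values become a tuple
    elif t[-1] == []:
      return loop(t[0:-1], list(r))         # [] marker: values become a list
    elif isinstance(t[-1], str):
      return loop(t[0:-1], ({t[-1]: r},))   # string: key for what follows it
    else:
      return loop(t[0:-1], (t[-1], *r))     # plain value: prepend it
  return loop(t[0:-1], (t[-1],))            # seed with the last item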

for e in example:
  print(restructure(e))

{'a': [{'b': (1, None)}]}
{'a': [{'c': (0, None)}]}
{'a': [2, None]}
{'a': [3, None]}
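
For readers who find the right-to-left recursion hard to follow, here is a loop-based sketch of the same idea. It is an illustration under the assumption that rows have the shape shown in example, not the answer's own code, and restructure_iter is a hypothetical name. One difference: an empty row raises ValueError here rather than IndexError.

```python
def restructure_iter(row):
    # Split off the last item; an empty row raises ValueError here.
    *rest, last = row
    acc = (last,)                      # accumulator, seeded with the last item
    for item in reversed(rest):        # walk the row from right to left
        if item == ():
            acc = tuple(acc)           # () marker: values become a tuple
        elif item == []:
            acc = list(acc)            # [] marker: values become a list
        elif isinstance(item, str):
            acc = ({item: acc},)       # string: key for what follows it
        else:
            acc = (item, *acc)         # plain value: prepend it
    return acc[0]                      # unwrap the single top-level result

print(restructure_iter(['a', [], 'b', (), 1, None]))  # {'a': [{'b': (1, None)}]}
print(restructure_iter(['a', [], 2, None]))           # {'a': [2, None]}
```

The loop makes it explicit that each item only ever transforms the accumulator built from the items to its right.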

Merge

With each element restructured, we now define a way to merge the restructured elements -

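def merge(r, t):
  if isinstance(r, dict) and isinstance(t, dict):
    for (k, v) in t.items():
      # merge values of shared keys; keys only present in t are copied over
      r[k] = merge(r[k], v) if k in r else v
    return r
  elif isinstance(r, tuple) and isinstance(t, tuple):
    return r + t
  elif isinstance(r, list) and isinstance(t, list):
    return r + t
  else:
    return t  # mismatched types: the right-hand value wins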

a = restructure(example[0])
b = restructure(example[1])

print(merge(a,b))
{'a': [{'b': (1, None)}, {'c': (0, None)}]}
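
Note that merge updates its first dictionary argument in place and returns it. That is harmless here because restructure creates fresh objects, but if the inputs must stay untouched, a non-mutating variant could look like the sketch below. The name merge_pure is hypothetical, and treating keys missing from r as plain copies is an assumption of this sketch.

```python
def merge_pure(r, t):
    if isinstance(r, dict) and isinstance(t, dict):
        out = dict(r)                  # shallow copy so r itself is not modified
        for k, v in t.items():
            # Assumption: keys that only exist in t are simply copied over.
            out[k] = merge_pure(out[k], v) if k in out else v
        return out
    if isinstance(r, (list, tuple)) and type(r) is type(t):
        return r + t                   # concatenation builds a new sequence
    return t                           # mismatched types: right-hand value wins

a = {'a': [{'b': (1, None)}]}
b = {'a': [{'c': (0, None)}]}

print(merge_pure(a, b))  # {'a': [{'b': (1, None)}, {'c': (0, None)}]}
print(a)                 # {'a': [{'b': (1, None)}]}
```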

Build

Finally, build ties everything together -

def build(t):
  if not t:
    return None
  elif len(t) == 1:
    return restructure(t[0])
  else:
    return merge(restructure(t[0]),build(t[1:]))

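example = \
  [ ['a', [], 'b', (), 1, None]
  , ['a', [], 'c', (), 0, None]
  , ['a', [], 2, None]
  , ['a', [], 3, None]
  ]

print(build(example))

{'a': [{'b': (1, None)}, {'c': (0, None)}, 2, None, 3, None]}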

Above, build is effectively the same as functools.reduce combined with map -

from functools import reduce

def build(t):
  if not t:
    return None
  else:
    return reduce(merge,map(restructure,t))

print(build(example))

{'a': [{'b': (1, None)}, {'c': (0, None)}, 2, None, 3, None]}
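
The equivalence can be checked on a smaller scale. In the sketch below, wrap and join are hypothetical stand-ins for restructure and merge. The recursive version folds from the right and reduce folds from the left, so the two agree only because the combining step (concatenation here) is associative. The empty-input guard matters because reduce raises TypeError on an empty iterable when no initial value is given.

```python
from functools import reduce

def wrap(x):
    # Hypothetical stand-in for restructure: turn one item into a list.
    return [x]

def join(r, t):
    # Hypothetical stand-in for merge: concatenate two lists.
    return r + t

def build_recursive(items):
    if not items:
        return None
    elif len(items) == 1:
        return wrap(items[0])
    else:
        # Right fold: combine the head with the result for the tail.
        return join(wrap(items[0]), build_recursive(items[1:]))

def build_reduce(items):
    if not items:
        return None  # guard: reduce on an empty iterable raises TypeError
    # Left fold over the mapped items.
    return reduce(join, map(wrap, items))

print(build_recursive([1, 2, 3]))  # [1, 2, 3]
print(build_reduce([1, 2, 3]))     # [1, 2, 3]
print(build_reduce([]))            # None
```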

Caveat

This answer does nothing to guard against invalid input. You are responsible for verifying that the input is valid -

restructure([])                     # IndexError
restructure([[],"a"])              # a
restructure(["a","b",()]) # {'a': ({'b': ((),)},)}
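
If some validation is wanted, one possible pre-check is sketched below. The rules are assumptions inferred from the shape of example, not something the answer specifies: a row is a non-empty list that starts with a string key, every string is directly followed by an empty [] or () marker, and every marker is followed by at least one more item. The names is_marker and check_row are hypothetical.

```python
def is_marker(x):
    # An empty list or empty tuple marks the container type (assumed rule).
    return isinstance(x, (list, tuple)) and len(x) == 0

def check_row(row):
    if not isinstance(row, list) or not row:
        raise ValueError("row must be a non-empty list")
    if not isinstance(row[0], str):
        raise ValueError("row must start with a string key")
    for i, item in enumerate(row):
        has_next = i + 1 < len(row)
        if isinstance(item, str) and not (has_next and is_marker(row[i + 1])):
            raise ValueError(f"key {item!r} must be followed by [] or ()")
        if is_marker(item) and not has_next:
            raise ValueError("container marker must be followed by a value")
    return row  # returned unchanged so the check can be chained

# The three invalid inputs listed above are all rejected by this check.
for bad in ([], [[], "a"], ["a", "b", ()]):
    try:
        check_row(bad)
    except ValueError as err:
        print("rejected:", err)
```

Calling check_row on each row before restructure would turn silent misbehaviour into an explicit error, at the cost of committing to one definition of "valid".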

itertools nested-lists python recursion
