How to chain, then "unchain", a nested list?

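Problem description

I have a nested list that I need to chain, run metrics on, and then "unchain" back into its original nested format. Here is some example data to illustrate: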

from itertools import chain

nested_list = [['x','xx','xxx'],['yy','yyy','y','yyyy'],['zz','z']]
chained_list = list(chain(*nested_list))
print("chained_list: \n",chained_list)
metrics_list = [str(chained_list[x]) +'_score' \
    for x in range(len(chained_list))]
print("metrics_list: \n",metrics_list) 
zipped_scores = list(zip(chained_list,metrics_list))
print("zipped_scores: \n",zipped_scores)

unchain_function = '????'

chained_list: 
 ['x','xx','xxx','yy','yyy','y','yyyy','zz','z']
metrics_list: 
 ['x_score','xx_score','xxx_score','yy_score','yyy_score','y_score','yyyy_score','zz_score','z_score']
zipped_scores: 
 [('x','x_score'),('xx','xx_score'),('xxx','xxx_score'),('yy','yy_score'),('yyy','yyy_score'),('y','y_score'),('yyyy','yyyy_score'),('zz','zz_score'),('z','z_score')]

Is there a Python function, or a Pythonic way to write "unchain_function", that produces this desired output?

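[
    [
        ('x','x_score'),('xx','xx_score'),('xxx','xxx_score')
    ],[
        ('yy','yy_score'),('yyy','yyy_score'),('y','y_score'),('yyyy','yyyy_score')
    ],[
        ('zz','zz_score'),('z','z_score')
    ]
]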

(Background: this is for running metrics on lists with more than 100,000 elements.)

Solutions

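I don't know how Pythonic this is, but it should work. Long story short, we use a Wrapper class to turn immutable primitives (which cannot be changed without being replaced) into mutable variables (so we can hold several references to the same variable, each organized differently).

We create an identical nested list, except that each value is a Wrapper around the corresponding value in the original list. We then apply the same chaining transformation to flatten the list of wrappers. The changes from the processed chained list are copied onto the chained wrappers, and then read back through the nested wrapper list and unwrapped.

I think an explicit, simple class named Wrapper is easier to understand, but you could do essentially the same thing by holding each value in a singleton list instead of a Wrapper instance.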

from itertools import chain

nested_list = [['x','xx','xxx'],['yy','yyy','y','yyyy'],['zz','z']]
chained_list = list(chain(*nested_list))

metrics_list = [str(chained_list[x]) +'_score' for x in range(len(chained_list))]
zipped_scores = list(zip(chained_list,metrics_list))

# create a simple Wrapper class,so we can essentially have a mutable primitive.
# We can put the Wrapper into two different lists,and modify its value without
# overwriting it.
class Wrapper:
    def __init__(self,value):
        self.value = value

# create a 'duplicate list' of the nested and chained lists,respectively,
# such that each element of these lists is a Wrapper of the corresponding
# element in the above lists
nested_wrappers = [[Wrapper(elem) for elem in sublist] for sublist in nested_list]
chained_wrappers = list(chain(*nested_wrappers))

# now we have two references to the same MUTABLE Wrapper for each element of 
# the original lists - one nested,and one chained. If we change a property
# of the chained Wrapper,the change will reflect on the corresponding nested
# Wrapper. Copy the changes from the zipped scores onto the chained wrappers
for score,wrapper in zip(zipped_scores,chained_wrappers):
    wrapper.value = score

# then extract the values in the unchained list of the same wrappers,thus
# preserving both the changes and the original nested organization
unchained_list = [[wrapper.value for wrapper in sublist] for sublist in nested_wrappers]

This ends with unchained_list equal to the following:

[[('x','x_score'),('xx','xx_score'),('xxx','xxx_score')],[('yy','yy_score'),('yyy','yyy_score'),('y','y_score'),('yyyy','yyyy_score')],[('zz','zz_score'),('z','z_score')]]
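The singleton-list variant mentioned above can be sketched as follows. This is only an illustration under assumptions: the names `nested_boxes` and `chained_boxes` are hypothetical, and the `'_score'` suffix stands in for whatever metric is really computed on the flat list.

```python
from itertools import chain

nested_list = [['x', 'xx', 'xxx'], ['yy', 'yyy', 'y', 'yyyy'], ['zz', 'z']]

# Box every value in a one-element list; the box plays the role of Wrapper.
nested_boxes = [[[elem] for elem in sublist] for sublist in nested_list]
# Flatten one level: the flat list holds the very same box objects.
chained_boxes = list(chain.from_iterable(nested_boxes))

# Placeholder metric computed on the flat data (assumption for illustration).
zipped_scores = [(box[0], box[0] + '_score') for box in chained_boxes]

# Mutate each box in place, so the nested view sees the new value too.
for score, box in zip(zipped_scores, chained_boxes):
    box[0] = score

# Unbox through the nested structure to recover the original shape.
unchained_list = [[box[0] for box in sublist] for sublist in nested_boxes]
print(unchained_list)
```

The design trade-off is the same as with Wrapper: one extra small object per element, in exchange for not having to track sublist boundaries separately.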

I think you simply want to group the data by some condition, namely the first letter of the first item in each tuple.

Given

Your flattened, zipped data:

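data = [
    ('x','x_score'),('xx','xx_score'),('xxx','xxx_score'),
    ('yy','yy_score'),('yyy','yyy_score'),('y','y_score'),('yyyy','yyyy_score'),
    ('zz','zz_score'),('z','z_score')
]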

Code

import itertools

[list(g) for _,g in itertools.groupby(data,key=lambda x: x[0][0])]

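Output

[[('x','x_score'),('xx','xx_score'),('xxx','xxx_score')],[('yy','yy_score'),('yyy','yyy_score'),('y','y_score'),('yyyy','yyyy_score')],[('zz','zz_score'),('z','z_score')]]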

See also

This post on how this tool works
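Grouping by first letter works for the sample data because every sublist happens to share a distinct leading character. When that cannot be assumed, one possible alternative is to remember the sublist lengths and slice the flat result back into shape. The sketch below is not taken from the answers here; `unchain` is a hypothetical helper name, and the `'_score'` suffix is again a placeholder metric.

```python
from itertools import chain, islice

def unchain(flat, template):
    """Split `flat` into sublists with the same lengths as those in `template`."""
    it = iter(flat)
    # islice consumes exactly len(sublist) items from the shared iterator.
    return [list(islice(it, len(sublist))) for sublist in template]

nested_list = [['x', 'xx', 'xxx'], ['yy', 'yyy', 'y', 'yyyy'], ['zz', 'z']]
chained_list = list(chain.from_iterable(nested_list))

# Placeholder metric computed on the flat list (assumption for illustration).
zipped_scores = [(item, item + '_score') for item in chained_list]

unchained = unchain(zipped_scores, nested_list)
print(unchained)
```

This makes a single pass over the flat list, which may matter for the 100,000+ element case mentioned in the question.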

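You are making the algorithm more complicated than it needs to be; a few simple steps, shown below, are enough:

First, create an empty nested list of the required size: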

formatted_list = [[] for _ in range(3)]

Then simply iterate over the list and format each item accordingly:

# fill each sublist of formatted_list from the matching sublist of nested_list
for K in range(0,3):
    for i in nested_list[K]:
        formatted_list[K].append(i + '_score')

print(formatted_list)
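The same per-sublist idea can be written without hard-coding the number of sublists. This is a minimal sketch, assuming the metric can be computed for each item independently; it pairs each item with its score to match the tuples requested in the question, and `'_score'` remains a placeholder.

```python
nested_list = [['x', 'xx', 'xxx'], ['yy', 'yyy', 'y', 'yyyy'], ['zz', 'z']]

# Build the result in the original nested shape, one sublist at a time.
formatted_list = [
    [(item, item + '_score') for item in sublist]
    for sublist in nested_list
]
print(formatted_list)
```

If the metric has to be run on the flat list as a whole, this shortcut does not apply and the structure has to be restored afterwards instead.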
