Is there a way to do this in a list comprehension?

Problem description

I am trying to do this with a list comprehension. I am using a subset of Python 2.7 that does not allow the built-in functions any and all.

string_list1 = ['James Dean','Mr. James Dean','Jon Sparrow','Timothy Hook','Captain Jon Sparrow']
string_list2 = []

# Get elements that are a substring of other elements
for str1 in string_list1:
    for str2 in string_list1:
        if str1 in str2 and str1 != str2:
            string_list2.append(str1)
print('Substrings: ',string_list2)

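# Remove an element if another element is contained within it.
# Iterate over a copy (string_list1[:]) so that removing items
# does not cause the loop to skip elements.
for str2 in string_list2:
    for str1 in string_list1[:]:
        if str2 in str1 and str2 != str1:
            string_list1.remove(str1)
print('Desired: ',string_list1) # elements that do not contain another element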

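The result should be ['James Dean','Jon Sparrow','Timothy Hook']: basically the elements that are substrings of other elements, plus the elements that neither contain nor are contained in another element.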

Solution

You can apply the same algorithm in a list comprehension like this:

lst = ['James Dean','Mr. James Dean','Jon Sparrow','Timothy Hook','Captain Jon Sparrow']

res = [primitive for primitive in lst 
          if primitive not in (superstr for superstr in lst
              if [substr for substr in lst if substr in superstr and substr != superstr]
          )
      ]

print(res)

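However, the interpreter will not notice that the inner expression (superstr ...) only needs to be evaluated once; it re-evaluates it on every iteration of the outer loop. So I would rather do it in two steps: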

lst = ['James Dean','Mr. James Dean','Jon Sparrow','Timothy Hook','Captain Jon Sparrow']

exclude = [superstr for superstr in lst
              if [substr for substr in lst if substr in superstr and substr != superstr]
          ]
res = [primitive for primitive in lst if primitive not in exclude]

print(res)
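
If a single comprehension is still wanted, the exclusion test can also be written directly on each element, without building the intermediate list of superstrings. This is only a sketch of an alternative, not part of the approach above; the names s and t are illustrative. It relies on an empty list being falsy, so "not [...]" plays the role of "not any(...)" in an interpreter where any is unavailable.

```python
lst = ['James Dean', 'Mr. James Dean', 'Jon Sparrow', 'Timothy Hook', 'Captain Jon Sparrow']

# Keep an element only if no *other* element of lst occurs inside it.
# The inner list is empty (falsy) when nothing is contained in s,
# so "not [...]" stands in for "not any(...)".
res = [s for s in lst
       if not [t for t in lst if t != s and t in s]]

print(res)
```

Like the versions above, this still compares every pair of elements, so it is quadratic in the length of the list; it just avoids the extra membership test against the exclude list.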