Problem description
I am trying to filter out words that are 1 or 2 characters long from each string in a list. Here is my sample data:
l = ['new vaccine tech long term testing','concerned past negative effects vaccines flu shot b','standard initial screening tb never tb chest x ray']
I tried writing the logic below, but the output is a flat list of words rather than a list of sentences:
cleaner = [ ''.join(word) for each in l for word in each.split() if len(word) > 2 ]
cleaner
['new','vaccine','tech','long','term','testing','concerned','past','negative','effects','vaccines','flu','shot','standard','initial','screening','never','chest','ray']
How can I get the output to look like this?
output = ['new vaccine tech long term testing','concerned past negative effects vaccines flu shot','standard initial screening never chest ray']
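Solution
You need a nested list comprehension rather than a single one: the outer level iterates over sentences and the inner level over words. A single comprehension with two for clauses emits one element per word, which is why the result is flattened.
You also need to join with a space rather than an empty string, so that the words in each sentence stay separated by a space.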
output = [' '.join([word for word in sentence.split() if len(word) > 2]) for sentence in l]
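To make the nesting easier to see, here is a rough sketch of the same logic written as explicit loops. The names sentences and kept are only illustrative and are not from the question; the behaviour is intended to match the one-liner above.

```python
# Sample data from the question (renamed to `sentences` for readability).
sentences = [
    'new vaccine tech long term testing',
    'concerned past negative effects vaccines flu shot b',
    'standard initial screening tb never tb chest x ray',
]

output = []
for sentence in sentences:           # outer level: one result per sentence
    kept = []
    for word in sentence.split():    # inner level: words of this sentence only
        if len(word) > 2:            # drop 1- and 2-character words
            kept.append(word)
    output.append(' '.join(kept))    # rejoin with a space, once per sentence

print(output)
```

Because the join happens once per sentence, after the inner loop finishes, the result has as many elements as the input list.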
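Alternatively, you can use filter to keep only the words longer than a given length in each string, then join them back together: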
l = ['new vaccine tech long term testing','concerned past negative effects vaccines flu shot b','standard initial screening tb never tb chest x ray']
res = [" ".join(filter(lambda x: len(x) > 2, each.split(' '))) for each in l]
print(res)
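If the length threshold needs to vary, the same idea could be wrapped in a small helper. This is only a sketch: drop_short_words and its min_len parameter are hypothetical names, not part of either answer, and a generator expression stands in for filter with a lambda.

```python
def drop_short_words(sentences, min_len=3):
    """Return the sentences with words shorter than min_len removed.

    Hypothetical helper for illustration; min_len=3 matches the
    `len(word) > 2` condition used in the answers.
    """
    return [
        ' '.join(word for word in sentence.split() if len(word) >= min_len)
        for sentence in sentences
    ]


l = [
    'new vaccine tech long term testing',
    'concerned past negative effects vaccines flu shot b',
    'standard initial screening tb never tb chest x ray',
]
res = drop_short_words(l)
print(res)
```

Calling split() with no argument splits on any run of whitespace, so repeated spaces do not leave empty strings in the word list.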