Cannot manipulate a string after retrieving it from a list

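Problem description

I want to delete the last statement in a rule used for parsing. Statements are enclosed in @ characters, and the rule itself is enclosed in pattern tags.

All I need to do is remove the last rule statement.

My current idea for achieving this is:

Open the rule file and save each line as an element of a list.
Select the line that contains the correct rule-id, then save the rule pattern as a new string.
Reverse the saved rule pattern.
Remove the last rule statement.
Reverse the rule pattern back.
Add the trailing pattern tag.

So the input looks like: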

<pattern>@this is a statement@ @this is also a statement@</pattern>

The output would look like:

<pattern>@this is a statement@ </pattern>

My current attempt looks like this:

import re

with open(rules) as f:
    lines = f.readlines()

string = ""
for line in lines:
    if ruleid in line:
        position = lines.index(line)
        # The rule pattern is two lines below the line holding the rule-id,
        # hence position + 2.
        string = lines[position + 2]

def reversed_string(a_string):
    # Reverse the string.
    return a_string[::-1]

def remove_at(x):
    # Remove everything up to and including the first @ character.
    return re.sub('^.*?@', '', x)

print(reversed_string(remove_at(remove_at(reversed_string(string)))))

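This reverses the string, but it does not remove the last rule statement once the string is reversed.

Running only the reversed_string() function reverses the string successfully, but passing the same string through remove_at() does not work at all.

However, if you create the input string by hand (with the same rule pattern) and skip opening the file and fetching the rule pattern, the trailing rule statement is removed successfully.

The working code is: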

import re

string = '<pattern>@this is a statement@ @this is also a statement@</pattern>'

def reversed_string(a_string):
    # Reverse the string.
    return a_string[::-1]

def remove_at(x):
    # Remove everything up to and including the first @ character.
    return re.sub('^.*?@', '', x)

print(reversed_string(remove_at(remove_at(reversed_string(string)))))

Also, how can I add the pattern tag back once the removal is done?

Solution

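The lines you are reading probably end with \n, which is why the substitution does not work: by default . does not match a newline, so once the string is reversed the leading \n stops ^.*?@ from matching at all. There are several ways to read a file without the trailing newlines.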
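
To make the effect of the newline concrete, here is a minimal sketch that reuses the remove_at() function from the question. The sample text 'first@ second@' is made up for illustration and is not taken from the rule file.

```python
import re

def remove_at(x):
    # Same pattern as in the question: drop everything up to the first @.
    return re.sub('^.*?@', '', x)

# A line as returned by readlines(), reversed: the newline is now in front.
reversed_line = 'first@ second@\n'[::-1]

# '.' cannot match the leading '\n', so the pattern fails and nothing changes.
unchanged = remove_at(reversed_line)

# With the newline stripped first, the same call removes text up to the first @.
stripped = remove_at('first@ second@'[::-1])

print(repr(unchanged))
print(repr(stripped))
```

The first call returns its input untouched, which matches the behaviour described in the question.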

Among those options, you can remove the \n with rstrip(), like this:

string = lines[position + 2].rstrip("\n")
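
Another option, shown here only as a sketch, is to read the whole file and split it with str.splitlines(), which drops the line endings for every line at once. The temporary file and its two lines are hypothetical and exist only to keep the example self-contained.

```python
import os
import tempfile

# Hypothetical two-line file, written only so the example can run on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

with open(path) as f:
    with_newlines = f.readlines()             # each element keeps its '\n'

with open(path) as f:
    without_newlines = f.read().splitlines()  # line endings are dropped

os.remove(path)

print(with_newlines)
print(without_newlines)
```

With splitlines() there is no need to call rstrip() on each selected line afterwards.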

Now, about the substitution, I think you can simplify it with the following regular expression:

@[^@]+@(?!.*@)

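It consists of the following parts:

@[^@]+@ matches an @, then one or more characters that are not @, then another @.
(?!.*@) is a negative lookahead: it asserts that the match is not followed by zero or more other characters and then another @, in other words that no @ appears later on the line.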
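
The role of each part can be checked with a short sketch that runs the expression with and without the lookahead on the sample input from the question. The variable names are chosen for illustration.

```python
import re

text = '<pattern>@this is a statement@ @this is also a statement@</pattern>'

# Without the lookahead, the first @...@ pair is found first.
first = re.search(r'@[^@]+@', text).group()

# With the lookahead, only a pair with no later @ on the line can match.
last = re.search(r'@[^@]+@(?!.*@)', text).group()

print(first)
print(last)
```

Note that "@ @" between the two statements is also a candidate for @[^@]+@, but the lookahead rejects it because another @ follows.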


This expression should match only the last statement, so you do not need to reverse the string:

re.sub("@[^@]+@(?!.*@)","",string)
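
Putting the two fixes together, a possible end-to-end version might look like the sketch below. The helper name remove_last_statement, the offset parameter, and the three-line rule layout are assumptions made for illustration, since the real rule file is not shown in the question. It uses enumerate() rather than lines.index(line), which avoids picking the wrong position if the same line appears twice.

```python
import re

# Matches the last @...@ statement on a line (no later @ may follow).
LAST_STATEMENT = re.compile(r'@[^@]+@(?!.*@)')

def remove_last_statement(lines, ruleid, offset=2):
    """Return the pattern line for ruleid with its last statement removed."""
    for position, line in enumerate(lines):
        if ruleid in line:
            # The pattern is assumed to sit `offset` lines below the rule-id.
            pattern = lines[position + offset].rstrip('\n')
            return LAST_STATEMENT.sub('', pattern)
    return None  # no line mentions ruleid

# Hypothetical file contents, standing in for f.readlines() on the rule file.
lines = [
    '<rule id="r1">\n',
    '<name>example</name>\n',
    '<pattern>@this is a statement@ @this is also a statement@</pattern>\n',
]
result = remove_last_statement(lines, 'r1')
print(result)
```

Because the substitution only touches the last statement, the surrounding pattern tags stay in place and the closing tag does not have to be added back afterwards.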

parsing python regex