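Notepad++ regex replace selects all text, but works in RegExr

Problem description

I am trying to replace all spaces in a log file with commas (to convert it to CSV format). However, some log entries contain spaces that I do not want replaced. These entries are enclosed in quotes. I looked at several examples and came up with the following expression, which appears to work on RegExr.com and regex101.com.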

[\s](?=(?:"[^"]*"|[^"])*$)

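However, when I use that expression for find/replace in Notepad++, it works correctly until it reaches the first quoted value that contains a space, and then it selects the entire contents of the file.

Sample log file entry: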

date=2020-08-24 time=07:35:15 idseq=216296511061885345 itime="2020-08-24 07:35:15" euid=3 epid=4107 dsteuid=3 dstepid=101 type="utm" subtype="webfilter" level="notice" action="passthrough" msg="URL belongs to an allowed category in policy"

Desired result:

date=2020-08-24,time=07:35:15,idseq=216296511061885345,itime="2020-08-24 07:35:15",euid=3,epid=4107,dsteuid=3,dstepid=101,type="utm",subtype="webfilter",level="notice",action="passthrough",msg="URL belongs to an allowed category in policy"

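Edit: After more testing, it appears that the replacement works when there is only a single line. However, if you have multiple lines, it replaces all of the lines with the replacement character (a comma in my case).

Solution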
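
One property of the original expression can be reproduced outside Notepad++: neither `\s` nor `[^"]` stops at a line break, so the lookahead checks every candidate space against the rest of the whole input rather than the rest of its line. The sketch below shows this with Python's `re` module and a two-line sample invented for illustration. It is not a statement about how Notepad++'s engine behaves internally, and the per-line variant is only an assumption about what the pattern was meant to do.

```python
import re

# Two short entries in the same key=value style as the sample log line
# (made-up data, not taken from the original log).
text = 'a=1 b="x y"\nc=2 d="p q"'

# The expression from the question: \s and [^"] both match line breaks,
# so the lookahead runs to the end of the whole input.
original = re.compile(r'\s(?=(?:"[^"]*"|[^"])*$)')
joined = original.sub(',', text)

# Per-line variant: only spaces and tabs are replaced, and the lookahead
# is confined to the current line by excluding \r and \n.
per_line = re.compile(r'[ \t](?=(?:"[^"\r\n]*"|[^"\r\n])*$)', re.MULTILINE)
fixed = per_line.sub(',', text)

print(joined)  # a=1,b="x y",c=2,d="p q"   (the line break became a comma)
print(fixed)   # line structure preserved
```

In this sketch the unrestricted pattern merges both entries into one line, while the per-line variant keeps each entry on its own line.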

It is long-winded, but if you have a known list of field names, you can simply use them as the replacement keys:

's/ (time|idseq|itime|euid|epid|dsteuid|dstepid|type|subtype|level|action|msg)=/,$1=/g'

The first field (date) is left out of the list because it should not be prefixed with a comma.
Be sure to match the space before each label and the = after it. This reduces the risk of matching the same substring inside a quoted value such as the msg field, although it does not rule it out.

Python example

>>> import re
>>> source = '''date=2020-08-24 time=07:35:15 idseq=216296511061885345 itime="2020-08-24 07:35:15" euid=3 epid=4107 dsteuid=3 dstepid=101 type="utm" subtype="webfilter" level="notice" action="passthrough" msg="URL belongs to an allowed category in policy"'''
>>> regex = ''' (time|idseq|itime|euid|epid|dsteuid|dstepid|type|subtype|level|action|msg)='''
>>> print(re.sub(regex, r",\1=", source))  # raw string so the \1 backreference is not treated as an escape
date=2020-08-24,time=07:35:15,idseq=216296511061885345,itime="2020-08-24 07:35:15",euid=3,epid=4107,dsteuid=3,dstepid=101,type="utm",subtype="webfilter",level="notice",action="passthrough",msg="URL belongs to an allowed category in policy"

You may come across values whose contents break even a very careful regular expression!

Also note that, for CSV, you may want to replace the field names entirely.
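
If the goal is a CSV whose field names live in a header row instead of in every cell, one option is to parse the key=value pairs rather than substitute separators. This is a minimal sketch, assuming every entry follows the `key=value` or `key="quoted value"` layout of the sample line. `PAIR` and `log_to_csv` are hypothetical names introduced here for illustration.

```python
import csv
import io
import re

# A key, an equals sign, then either a quoted value or a run of non-space characters.
PAIR = re.compile(r'(\w+)=("[^"]*"|\S+)')

def log_to_csv(lines):
    """Convert key=value log lines to CSV text with a header row."""
    # One dict per non-empty line; surrounding quotes are stripped from values.
    rows = [
        {key: value.strip('"') for key, value in PAIR.findall(line)}
        for line in lines
        if line.strip()
    ]
    # Collect field names in order of first appearance.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # fields missing from a row are written as empty cells
    return buffer.getvalue()

print(log_to_csv(['a=1 b="x y"', 'a=2 b="z"']))
```

Because the `csv` module does the writing, values that contain commas or quotes are quoted for you, which plain substitution does not handle.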

Ctrl + H
Find what: "[^"\r\n]+"(*SKIP)(*FAIL)|\h+
Replace with: ,
Check Wrap around
Check Regular expression
Replace all

Explanation:

"[^"\r\n]+"     # everything between quotes (on a single line)
(*SKIP)(*FAIL)  # skip what was just matched and fail the match
|               # OR
\h+             # 1 or more horizontal spaces
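
Python's built-in `re` module does not support the `(*SKIP)(*FAIL)` verbs, but a similar effect can be sketched by matching either a quoted string or a run of spaces and deciding in a replacement function which one to keep. `TOKEN` and `spaces_to_commas` are hypothetical names, and `[ \t]` is used as an approximation of `\h`, which also covers other horizontal whitespace characters.

```python
import re

# Either a quoted string on one line (kept as-is) or a run of spaces/tabs.
TOKEN = re.compile(r'"[^"\r\n]+"|[ \t]+')

def spaces_to_commas(text):
    """Replace runs of spaces outside double quotes with a single comma."""
    def replace(match):
        token = match.group(0)
        # Quoted strings are returned unchanged; whitespace runs become commas.
        return token if token.startswith('"') else ','
    return TOKEN.sub(replace, text)

print(spaces_to_commas('a=1 b="x y"\nc=2 d="p q"'))
```

Line breaks are never matched by either alternative, so each log entry stays on its own line.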

