Unexpected behavior of sed with a repeated capture group

Problem description

I am trying to extract the file path from the following string:

"# configuration file /etc/Nginx/conf.d/default.conf"

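by passing it to sed:

sed -n 's,\(# configuration file \)\(\/[a-zA-Z_.]\+\)\+,\2,p'

I expected /etc/Nginx/conf.d/default.conf to be captured by \2, but surprisingly only the /default.conf part is returned. From this I gather that the reference is refilled each time the next match of /[a-zA-Z_.]\+ is consumed. Wouldn't it be more logical for each successive match to go to the next reference, so that /default.conf would be returned in \4?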

/[a-zA-Z_.]\+ >>>

\(/etc\)\(/Nginx\)\(/conf.d\)\(/default.conf\)
   \1        \2        \3           \4

Solution

This might work for you (GNU sed):

sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\2,p' file

This captures the whole file path: \2 is the outer group that wraps the entire repetition.

sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\1,p' file

This captures the start of the comment.

sed -nE 's/(# configuration file )((\/[a-zA-Z_.]+)+)/\3/p' file

This captures the last component of the file path.
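
As a quick check, the three substitutions can be run against the sample line from the question. This is only a sketch: the variable names (line, full_path, prefix, last_part) are made up for illustration, the surrounding quotes of the sample are omitted, and the input is fed through printf rather than read from a file.

```shell
# Sample input line from the question (surrounding quotes omitted).
line='# configuration file /etc/Nginx/conf.d/default.conf'

# \2 is the outer group around the whole repetition: the full path.
full_path=$(printf '%s\n' "$line" |
  sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\2,p')

# \1 is the fixed comment prefix (note its trailing space).
prefix=$(printf '%s\n' "$line" |
  sed -nE 's,(# configuration file )((/[a-zA-Z_.]+)+),\1,p')

# \3 is the inner, repeated group: only its last repetition is kept.
last_part=$(printf '%s\n' "$line" |
  sed -nE 's/(# configuration file )((\/[a-zA-Z_.]+)+)/\3/p')

printf 'full_path=[%s]\n' "$full_path"   # [/etc/Nginx/conf.d/default.conf]
printf 'prefix=[%s]\n' "$prefix"         # [# configuration file ]
printf 'last_part=[%s]\n' "$last_part"   # [/default.conf]
```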

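When a capture group is qualified by a repetition operator (*, ?, + or {...}), it does not spawn a new back-reference for each repetition. Groups are numbered by the position of their opening parenthesis in the pattern, so the group keeps only the text of its last repetition (see the third solution).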
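
The rule can be seen in a minimal case, and it also suggests what would be needed to get the \1 to \4 layout sketched in the question: one explicit group per component. The following is an illustrative sketch, not part of the original answer; it assumes the path always has exactly four components, and the names seg, last_char and components are hypothetical.

```shell
# Minimal case: (.)+ repeats three times over "abc",
# but \1 holds only the last repetition.
last_char=$(printf 'abc\n' | sed -E 's/(.)+/\1/')
printf 'last_char=%s\n' "$last_char"     # c

# One explicit group per path component (assumes exactly four components).
line='# configuration file /etc/Nginx/conf.d/default.conf'
seg='(/[a-zA-Z_.]+)'
components=$(printf '%s\n' "$line" |
  sed -nE "s,# configuration file ${seg}${seg}${seg}${seg},\1|\2|\3|\4,p")
printf 'components=%s\n' "$components"   # /etc|/Nginx|/conf.d|/default.conf
```

For paths of unknown depth this approach does not scale, which is why the outer group in the first solution is the more practical way to get the whole path.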
