Why do Perl regex capture groups behave differently in print and in arithmetic operations?

Problem description

In Perl (v5.30.0), a regex match used as an argument to print() evaluates to its captures:

# Simplified example; the real case has more text, and the capture covers only part of it.

echo $'1\n2\n3' | perl -ne 'print /(.)/'
# 123

This is great for text extraction. I would like to use the same convenience for arithmetic, but it does not work as expected:

# Attempt to compute a sum of the int value of the captures
#
echo $'1\n2\n3' | perl -ne '$tot += /(.)/; END { print $tot }'
# 3

# Attempt to print twice the int value of each capture
#
echo $'1\n2\n3' | perl -ne 'print(/(.)/ * 2)'
# 222

Using the capture variable works:

echo $'1\n2\n3' | perl -ne 'if (/(.)/) { $tot += $1 }; END { print $tot }'
# 6

However, I am curious why this happens, and whether there is any way to use the earlier, more compact form to do arithmetic on the captured value.

Solution

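That is because m// returns 1 on success in scalar context (see perlop). print supplies list context to its arguments, so there the match returns its captures. The += and * operators impose scalar context, so the match returns 1: that is why the sum comes out as 3 (1 + 1 + 1) and the doubling prints 222 (1 * 2 for each line). You can force list context with a list slice to get the captured part: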

echo $'1\n2\n3' | perl -ne '$tot += (/(.)/)[0]; END { print $tot }'

If each line starts with the number you want, you can skip the match and sum the input directly using $_:

echo $'1\n2\n3' | perl -ne '$tot += $_; END { print $tot . "\n" }'

6

Alternatively, you can use the -a (autosplit) option to split the input into fields:

echo $'1\n2\n3' | perl -ane '$tot += $F[0]; END { print $tot . "\n" }'

6
