Why does an expression like `(!"foo" .)*` generate arrays of `[undefined, char]` values in PEG.js?

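Problem description

I'm still pretty new to PEG.js, and I'm guessing this is just a beginner's misunderstanding.

I'm trying to parse something like this: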

definitions
    some text

if
    some additonal text
    to parse here    

then
    still more text will
    go here

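I can get a grammar that reads the three sections correctly (they will be parsed further later, of course), but it produces the text in an odd format. For instance, "some text" above turns into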

[
  [undefined,"s"],[undefined,"o"],[undefined,"m"],[undefined,"e"],[undefined," "],
  [undefined,"t"],[undefined,"e"],[undefined,"x"],[undefined,"t"]
]

I can easily convert this to a plain string, but I'd like to know what I'm doing that gives it this awful format. This is my grammar so far:

{
  const combine = (xs) => xs .map (x => x[1]) .join('')
}

MainObject
  = _ defs:DefSection _ condition:CondSection _ consequent: ConsequentSection
    {return {defs,condition,consequent}}

DefSection = _ "definitions"i _ defs:(!"\nif" .)+
  {return defs}

CondSection = _ "if"i _ cond:(!"\nthen" .)+
  {return combine (cond)}

ConsequentSection = _ "then"i _ cons:.*
  {return cons .join ('')} 

_ "whitespace"
  = [ \t\n\r]*

I can fix it by replacing {return defs} with {return combine(defs)}, as in the other sections.
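To make that concrete, here is a minimal sketch in plain JavaScript of what combine does to such a value. The `pairs` array is a hand-built stand-in for the parser output, not something produced by PEG.js here:

```javascript
// The helper from the grammar's initializer block.
const combine = (xs) => xs.map(x => x[1]).join('');

// Hand-built stand-in for what (!"\nif" .)+ yields on "some text":
// one [lookaheadResult, character] pair per matched character.
const pairs = [...'some text'].map(ch => [undefined, ch]);

// combine keeps only the second element of each pair and joins them.
console.log(combine(pairs)); // some text
```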

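My main question is simply: why does it generate that output, and is there a simpler way to fix it?

More generally, since I'm still pretty new to PEG.js, I would love to know whether there is a better way to write this grammar. Expressions like (!"\nif" .)+ seem fairly sketchy.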

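Solution

A negative lookahead such as !Rule always returns undefined and consumes no input; it fails if Rule matches.
The dot . always matches exactly one character (any character).
A sequence Rule1 Rule2 ... creates a list containing the result of each rule.
A repetition Rule+ or Rule* matches Rule as many times as possible and creates a list of the results. (+ fails if the first attempt to match Rule fails.)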
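The four rules above can be sketched with a few hand-written parser functions. This is only an illustration in plain JavaScript, not how PEG.js is implemented; the names `lit`, `not`, `any`, `seq` and `plus` are hypothetical helpers invented for this example:

```javascript
// Each parser takes (input, pos) and returns { value, pos } on success, null on failure.
// All helper names are hypothetical; none of them are PEG.js APIs.

// Literal "text": succeeds when the input at pos starts with text.
const lit = (text) => (input, pos) =>
  input.startsWith(text, pos) ? { value: text, pos: pos + text.length } : null;

// Negative lookahead !p: consumes nothing, yields undefined, fails if p matches.
const not = (p) => (input, pos) =>
  p(input, pos) === null ? { value: undefined, pos } : null;

// Dot: matches exactly one character, fails at end of input.
const any = (input, pos) =>
  pos < input.length ? { value: input[pos], pos: pos + 1 } : null;

// Sequence p1 p2 ...: a list with one result per element.
const seq = (...ps) => (input, pos) => {
  const values = [];
  for (const p of ps) {
    const r = p(input, pos);
    if (r === null) return null;
    values.push(r.value);
    pos = r.pos;
  }
  return { value: values, pos };
};

// Repetition p+: match p as often as possible, fail if it never matches.
const plus = (p) => (input, pos) => {
  const values = [];
  for (let r = p(input, pos); r !== null; r = p(input, pos)) {
    values.push(r.value);
    pos = r.pos;
  }
  return values.length > 0 ? { value: values, pos } : null;
};

// Equivalent of (!"\nif" .)+
const untilIf = plus(seq(not(lit('\nif')), any));
const result = untilIf('some\nif', 0);
console.log(result.value); // four [undefined, char] pairs, for "s", "o", "m", "e"
```

Every pass through the repetition runs the two-element sequence once, so every character ends up wrapped in a pair whose first slot is the lookahead's undefined.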

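Your result is

[ // start of the list produced by (!"\nif" .)+
  [ // first match of (!"\nif" .)
    undefined, // result of the first !"\nif"
    "s" // result of the first .
  ],
  [undefined,"o"], // second match of (!"\nif" .)
  [undefined,"m"],[undefined,"e"],[undefined," "],[undefined,"t"],[undefined,"e"],[undefined,"x"],[undefined,"t"]
] // this list is (!"\nif" .)+, that is, all the matches of (!"\nif" .)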

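What you seem to want is the text itself. For that you can use the $ operator: $Rule returns the input text that Rule matched instead of the output Rule produced.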

MainObject
  = _ defs:DefSection _ condition:CondSection _ consequent: ConsequentSection
    {return {defs,condition,consequent}}

DefSection = _ "definitions"i _ defs:$(!"\nif" .)+
  {return defs.trim()}

CondSection = _ "if"i _ cond:$(!"\nthen" .)+
  {return cond.trim()}

ConsequentSection = _ "then"i _ cons:$(.*)
  {return cons.trim()} 

_ "whitespace"
  = [ \t\n\r]*

This would produce

{
   "defs": "some text",
   "condition": "some additonal text\n    to parse here",
   "consequent": "still more text will\n    go here"
}
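As a rough sketch of what the $ operator does conceptually (again plain JavaScript, not PEG.js internals), the wrapper below runs a matcher, throws away its structured value, and returns the slice of input it consumed. `untilStop` and `text` are hypothetical names made up for this illustration:

```javascript
// Hypothetical stand-in for (!stop .)+ : returns { value, pos } or null.
const untilStop = (stop) => (input, pos) => {
  const pairs = [];
  while (pos < input.length && !input.startsWith(stop, pos)) {
    pairs.push([undefined, input[pos]]);
    pos += 1;
  }
  return pairs.length > 0 ? { value: pairs, pos } : null;
};

// Sketch of $p: run p, ignore its value, return the matched input text instead.
const text = (p) => (input, start) => {
  const r = p(input, start);
  return r === null ? null : { value: input.slice(start, r.pos), pos: r.pos };
};

const input = 'some text\n\nif\n    more';
const matched = text(untilStop('\nif'))(input, 0);

// The match still includes the newline before "\nif", hence the trim().
console.log(JSON.stringify(matched.value)); // "some text\n"
console.log(matched.value.trim());          // some text
```

This also suggests why the grammar above calls trim() in each action: the lookahead only stops at the delimiter, so any whitespace before it is part of the matched text.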
