perl – How do I make Marpa's sequence rules greedy?

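I am working on a Marpa::R2 grammar that groups items in a text. Each group may contain only items of a certain kind, but groups are not explicitly delimited. This causes problems, because x...x (where . stands for an item that can be part of a group) can be grouped as x(...)x, x(..)(.)x, x(.)(..)x, or x(.)(.)(.)x. In other words, the grammar is highly ambiguous.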
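To make the size of the problem concrete, here is a minimal pure-Perl sketch that enumerates every way to split a run of adjacent items into consecutive, non-empty groups. The helpers groupings and render are hypothetical names introduced only for this illustration; they are not part of Marpa::R2.

```perl
use strict;
use warnings;

# Hypothetical helper: return every way to split @items into
# consecutive, non-empty groups. Each result is an array ref of groups.
sub groupings {
    my @items = @_;
    return ( [] ) unless @items;    # nothing left: exactly one (empty) grouping
    my @result;
    for my $len ( 1 .. @items ) {
        my @head = @items[ 0 .. $len - 1 ];
        my @tail = @items[ $len .. $#items ];
        push @result, [ [@head], @$_ ] for groupings(@tail);
    }
    return @result;
}

# Hypothetical helper: render one grouping in the x(...)x notation.
sub render {
    my ($grouping) = @_;
    return 'x' . join( '', map { join( '', '(', @$_, ')' ) } @$grouping ) . 'x';
}

my @rendered = map { render($_) } groupings(qw(. . .));
print "$_\n" for @rendered;
print scalar(@rendered), " groupings\n";
```

A run of n adjacent items has 2^(n-1) such groupings, so the number of parses doubles with each additional item.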

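How can I remove this ambiguity if I only want the x(...)x parse, that is, if I want to force the + quantifier to behave only "greedily" (as it does in Perl regular expressions)?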
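For comparison, this is the regex behaviour being referred to, shown as a small sketch. The pattern and the variable names are illustrative assumptions; only the input string matches the test case in this question.

```perl
use strict;
use warnings;

my $input = "foo [a] [b] bar";

# Greedy quantifier: * takes as many further bracketed items as it can.
my ($greedy) = $input =~ /(\[\w+\](?:\s*\[\w+\])*)/;

# Non-greedy quantifier: *? stops as soon as the overall match succeeds.
my ($lazy) = $input =~ /(\[\w+\](?:\s*\[\w+\])*?)/;

print "greedy: '$greedy'\n";
print "lazy:   '$lazy'\n";
```

A regex engine commits to a single match, so greediness picks one result. A Marpa sequence rule instead keeps every possible split in the parse forest, which is why all groupings show up as separate parses.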

In the grammar below, I tried adding rank adverbs to the sequence rules in order to prioritize Group over Sequence, but that does not seem to work.

Below is a test case that exercises this behaviour.

use strict;
use warnings;

use Marpa::R2;
use Test::More;

my $grammar_source = <<'END_GRAMMAR';
inaccessible is fatal by default
:discard ~ space
:start ::= Sequence

Sequence
    ::= SequenceItem+  action => ::array
SequenceItem
    ::= WORD    action => ::first
    |   Group   action => ::first
Group
    ::= GroupItem+  action => [name,values]
GroupItem
    ::= ('[') Sequence (']')  action => ::first

WORD    ~ [a-z]+
space   ~ [\s]+
END_GRAMMAR

my $input = "foo [a] [b] bar";

diag "perl $^V";
diag "Marpa::R2 " . Marpa::R2->VERSION;

my $grammar = Marpa::R2::Scanless::G->new({ source => \$grammar_source });
my $recce = Marpa::R2::Scanless::R->new({ grammar => $grammar });

$recce->read(\$input);

my $parse_count = 0;
while (my $value = $recce->value) {
    is_deeply $$value,['foo',[Group => ['a'],['b']],'bar'],'expected structure'
        or diag explain $$value;
    $parse_count++;
}
is $parse_count,1,'expected number of parses';

done_testing;

Output of the test case (FAIL):

# perl v5.18.2
# Marpa::R2 2.09
ok 1 - expected structure
not ok 2 - expected structure
#   Failed test 'expected structure'
#   at - line 38.
#     Structures begin differing at:
#          $got->[1][2] = Does not exist
#     $expected->[1][2] = ARRAY(0x981bd68)
# [
#   'foo',
#   [
#     'Group',
#     [
#       'a'
#     ]
#   ],
#   [
#     ${\$VAR1->[1][0]},
#     [
#       'b'
#     ]
#   ],
#   'bar'
# ]
not ok 3 - expected number of parses
#   Failed test 'expected number of parses'
#   at - line 41.
#          got: '2'
#     expected: '1'
1..3
# Looks like you failed 2 tests of 3.

Solution

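Sequence rules are designed for the non-tricky cases. When things get tricky, a sequence rule can always be rewritten as BNF rules, and that is what I suggest here. The following makes your test work: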

use strict;
use warnings;

use Marpa::R2;
use Test::More;

my $grammar_source = <<'END_GRAMMAR';
inaccessible is fatal by default
:discard ~ space

# Three cases
# 1.) Just one group.
# 2.) Group followed by alternating words and groups.
# 3.) Alternating words and groups, starting with words.
Sequence ::= Group action => ::first
Sequence ::= Group Subsequence action => [values]
Sequence ::= Subsequence action => ::first

Subsequence ::= Words action => ::first

# "action => [values]" makes the test work unchanged.
# The action for the next rule probably should be
# action => [name,values] in order to handle the general case.
Subsequence ::= Subsequence Group Words action => [values]

Words ::= WORD+ action => ::first
Group
    ::= GroupItem+  action => [name,values]
GroupItem
    ::= ('[') Sequence (']')  action => [value]

WORD    ~ [a-z]+
space   ~ [\s]+
END_GRAMMAR

my $input = "foo [a] [b] bar";

diag "perl $^V";
diag "Marpa::R2 " . Marpa::R2->VERSION;

my $grammar = Marpa::R2::Scanless::G->new( { source  => \$grammar_source } );
my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

$recce->read( \$input );

my $parse_count = 0;
while ( my $value = $recce->value ) {
    is_deeply $$value, [ 'foo', [ Group => ['a'], ['b'] ], 'bar' ], 'expected structure'
        or diag explain $$value;
    $parse_count++;
} ## end while ( my $value = $recce->value )
is $parse_count, 1, 'expected number of parses';

done_testing;
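
The rewrite works because, in the BNF version, two Group nodes can never be adjacent: a Subsequence only ever places a Group between Words, so a run of bracketed items has to be absorbed by a single GroupItem+. As a rough sketch, the two grammars can be compared by counting parses. The helper count_parses and the three-group input are assumptions made for this illustration; the grammar text is taken from the question and the answer above.

```perl
use strict;
use warnings;
use Marpa::R2;

# Hypothetical helper: count every parse a grammar yields for one input.
sub count_parses {
    my ( $source, $input ) = @_;
    my $grammar = Marpa::R2::Scanless::G->new( { source  => \$source } );
    my $recce   = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
    $recce->read( \$input );
    my $count = 0;
    $count++ while $recce->value;    # value() returns undef once parses run out
    return $count;
}

# Lexical rules shared by both grammars.
my $lexer = <<'END_LEXER';
:discard ~ space
WORD    ~ [a-z]+
space   ~ [\s]+
END_LEXER

# Sequence-rule grammar from the question.
my $sequence_rules = <<'END_GRAMMAR' . $lexer;
:start ::= Sequence
Sequence     ::= SequenceItem+  action => ::array
SequenceItem ::= WORD           action => ::first
               | Group          action => ::first
Group        ::= GroupItem+     action => [name,values]
GroupItem    ::= ('[') Sequence (']')  action => ::first
END_GRAMMAR

# BNF rewrite from the answer.
my $bnf_rules = <<'END_GRAMMAR' . $lexer;
Sequence    ::= Group              action => ::first
Sequence    ::= Group Subsequence  action => [values]
Sequence    ::= Subsequence        action => ::first
Subsequence ::= Words              action => ::first
Subsequence ::= Subsequence Group Words  action => [values]
Words       ::= WORD+              action => ::first
Group       ::= GroupItem+         action => [name,values]
GroupItem   ::= ('[') Sequence (']')  action => [value]
END_GRAMMAR

# Assumed input: three adjacent groups instead of the two in the test case.
my $input = "foo [a] [b] [c] bar";

my $ambiguous_count   = count_parses( $sequence_rules, $input );
my $unambiguous_count = count_parses( $bnf_rules,      $input );

print "sequence rules: $ambiguous_count parse(s)\n";
print "BNF rewrite:    $unambiguous_count parse(s)\n";
```

If the reasoning above holds, the sequence-rule grammar should report one parse per grouping of the three adjacent items, while the BNF version should report a single parse.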
