ANTLR 3 grammar produces a parse error when it encounters the pound character

Question

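ANTLR 3 produces an error when it encounters the French pound character ("£"), which is equivalent to the English hash character ("#"), even though the Unicode values of the three special characters # @ $ are specified in the lexer/parser rules.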

FYI: as far as I understand, the Unicode value of the pound character (French) = the Unicode value of the hash character (English).

Lexer/parser rules:

grammar SimpleCalc;

options
{
  k        = 8;
  language = Java;
  //filter   = true;
}
 
tokens {
    PLUS    = '+' ;
    MINUS   = '-' ;
    MULT    = '*' ;
    DIV = '/' ;
}
 
/*------------------------------------------------------------------
 * PARSER RULES
 *------------------------------------------------------------------*/
 
expr    : n1=NUMBER ( exp = ( PLUS | MINUS )  n2=NUMBER )* 
{
  if ($exp.text.equals("+"))
   System.out.println("Plus Result = " + $n1.text + $n2.text);
  else
   System.out.println("Minus Result = " + $n1.text + $n2.text);
}
;
 
/*------------------------------------------------------------------
 * LEXER RULES
 *------------------------------------------------------------------*/
 
NUMBER  : (DIGIT)+ ;
 
WHITESPACE : ( '\t' | ' ' | '\r' | '\n'| '\u000C' )+    { $channel = HIDDEN; } ;
 
fragment DIGIT  : '0'..'9' | '£' | ('\u0040' | '\u0023' | '\u0024');

The text file is also read as UTF-8:

    public static void main(String[] args) throws Exception
    {
        try
        {
            args = new String[1];
            args[0] = new String("antlr_test.txt");
            SimpleCalcLexer lex = new SimpleCalcLexer(new ANTLRFileStream(args[0],"UTF-8"));
            CommonTokenStream tokens = new CommonTokenStream(lex);
            
            SimpleCalcParser parser = new SimpleCalcParser(tokens);
            
            parser.expr();
            //System.out.println(tokens);
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

The input file contains only one line:

 £3 + 4£
 

The errors are:

antlr_test.txt line 1:1 no viable alternative at character '£'
antlr_test.txt line 1:7 no viable alternative at character '£'

What is wrong with my approach? Or am I missing something?

Answer

I cannot reproduce what you describe. When I test your grammar without modifications, I get a NumberFormatException, which is expected because Integer.parseInt("£3") cannot succeed.
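To illustrate why the exception is expected, here is a minimal standalone sketch. The class and method names (ParseIntDemo, failsToParse) are made up for this example and are not part of the grammar or the generated parser. The pound sign is written as the escape \u00A3 so the result does not depend on how the source file is encoded.

```java
public class ParseIntDemo {
    // Returns true when Integer.parseInt rejects the given text.
    public static boolean failsToParse(String text) {
        try {
            Integer.parseInt(text);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "\u00A33" is the pound sign followed by the digit 3, like the first NUMBER token.
        System.out.println(failsToParse("\u00A33")); // prints true
        System.out.println(failsToParse("3"));       // prints false
    }
}
```

Because the DIGIT rule accepts '£', '@', '#' and '$', a NUMBER token can contain characters that Integer.parseInt will never accept.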

When I change your embedded code to this:

{
  if ($exp.text.equals("+"))
   System.out.println("Result = " + (Integer.parseInt($n1.text.replaceAll("\\D","")) + Integer.parseInt($n2.text.replaceAll("\\D",""))));
  else
   System.out.println("Result = " + (Integer.parseInt($n1.text.replaceAll("\\D","")) - Integer.parseInt($n2.text.replaceAll("\\D",""))));
}

and regenerate the lexer and parser classes (which you may not have done), then rerun the driver code, I get the following output:

Result = 7
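The arithmetic in the modified action can be checked outside ANTLR with a small sketch. The names StripDemo and digitsOnly are hypothetical, and the two strings stand in for what $n1.text and $n2.text would hold for the sample input.

```java
public class StripDemo {
    // Removes every non-digit character, then parses what is left.
    // Throws NumberFormatException if no digits remain (for example, for "\u00A3" alone).
    public static int digitsOnly(String tokenText) {
        return Integer.parseInt(tokenText.replaceAll("\\D", ""));
    }

    public static void main(String[] args) {
        String n1 = "\u00A33"; // pound sign followed by 3
        String n2 = "4\u00A3"; // 4 followed by pound sign
        System.out.println("Result = " + (digitsOnly(n1) + digitsOnly(n2))); // prints Result = 7
    }
}
```

Note that a token made up only of special characters would leave an empty string after stripping, so real code would probably need to handle that case.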

EDIT

Perhaps the problem is the literal pound sign in the grammar? What if you try:

fragment DIGIT  : '0'..'9' | '\u00A3' | ('\u0040' | '\u0023' | '\u0024');

instead of:

fragment DIGIT  : '0'..'9' | '£' | ('\u0040' | '\u0023' | '\u0024');
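
One possible reason the escape could help (an assumption; nothing in this thread confirms it) is an encoding mismatch on the grammar file itself. In Unicode the pound sign is U+00A3 and the hash sign is U+0023, so they are different characters, and '£' takes two bytes in UTF-8. If the grammar is saved as UTF-8 but read with a single-byte encoding such as ISO-8859-1, the literal '£' turns into two characters and the lexer rule no longer matches a real pound sign in the input. The sketch below, with the made-up names EncodingMismatchDemo and decodeUtf8BytesAsLatin1, simulates that misreading:

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    // Encodes the text as UTF-8, then decodes those bytes as ISO-8859-1,
    // which is what happens when a UTF-8 file is read with the wrong charset.
    public static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String misread = decodeUtf8BytesAsLatin1("\u00A3");
        System.out.println(misread.length()); // prints 2
        for (char c : misread.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c); // prints U+00C2, then U+00A3
        }
    }
}
```

The escape '\u00A3' consists only of ASCII characters, so it reads the same under either encoding.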