
exception - ANTLR v3 NoViableAltException does not appear

Reposted. Author: 行者123. Updated: 2023-12-02 02:24:27

Consider the following excerpt from my grammar:

definition
: '(' 'define'
( '(' variable def_formals ')' body ')'
| variable expression ')'
)
;

def_formals
: variable* ('.' variable)?
;

body
: ((definition)=> definition)* expression+
;

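variable is an identifier, and expression is some Scheme expression (such as a literal or a lambda expression). The complete grammar can be found in some of my other questions.

So I was testing the whole thing and ran into a problem with NoViableAltException.

So far, everything that should work does work. For example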

(define x 5)

is recognized.

Now I am testing input that the parser should not recognize.

For example

(define x 5))

reports the extra ')' at the end of the line.

But when I leave something out, as in

(define x)

(define)

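the parser does not complain at all. When I check in the interpreter, the NoViableAltException shows up correctly. But I don't know how to make this error surface in an external program (such as a Java test class).

I tried to make the parser bail out at the first syntax error it sees, as described in Terence Parr's book (page 252), but that did not help either. I also tried something like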

private List<String> errors = new LinkedList<String>();

public void displayRecognitionError(String[] tokenNames,
                                    RecognitionException e) {
    String hdr = getErrorHeader(e);
    String msg = getErrorMessage(e, tokenNames);
    errors.add(hdr + " " + msg);
}

public List<String> getErrors() {
    return errors;
}

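But that method does not return anything when called.

So how can I get ANTLR to show me these errors when they are obviously being thrown internally?

Edit: here is the whole grammar: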

grammar R5RS;

options {
language = Java;
output=AST;
}

@header{
package r5rsgrammar;
import r5rsgrammar.scope.*;
import java.util.LinkedList;
}

@lexer::header{
package r5rsgrammar;
import r5rsgrammar.scope.*;
import java.util.LinkedList;
}

@members{

// variable which is used to distinguish between top-level and inner definitions
private boolean topLevel;


// the toplevel scope of a file, whose parent is null
private IScope scope;

@Override
public void emitErrorMessage(String message) {
throw new RuntimeException(message);
}
}

// PROGRAMS AND DEFINITIONS

parse
@init{
this.topLevel = true;
this.scope = new Scope();
}
: command_or_definition* EOF
;

command_or_definition
: (syntax_definition)=> syntax_definition
| (definition)=> definition
| ('('BEGIN command_or_definition)=>
'('BEGIN
{ this.topLevel = false;
this.scope = this.scope.push();
}
command_or_definition+
{ this.scope = this.scope.pop();
this.topLevel = true;
}')'
| command
;

command
: expression
;

definition
: '(' DEFINE ( '(' var=variable
{ this.topLevel = false;
this.scope.bind($var.text);
this.scope = this.scope.push();
}
def_formals ')' body
{ this.topLevel = true;
this.scope = this.scope.pop();
}')'
| var=variable
{ this.topLevel = false;
this.scope.bind($var.text);
this.scope = this.scope.push();
}
expression
{ this.topLevel = true;
this.scope = this.scope.pop();
}')'
)
| '(' BEGIN
{this.scope = this.scope.push();}
definition*
{this.scope = this.scope.pop();}')'
;

def_formals
: vars+=variable* ('.' vars+=variable)?
{for (int i = 0; i \less $vars.size(); i++){
String name = ((CommonTree)$vars.get(i)).getText();
this.scope.bind(name);
}
}
;


syntax_definition
: '(' DEFINE_SYNTAX var=variable
{ this.scope.bind($var.text);
this.scope = this.scope.push();}
transformer_spec
{this.scope = this.scope.pop();}')'
;

// EXPRESSIONS

expression
: (variable)=> var=variable
{
if(!this.scope.isBound($var.text))
System.err.println($var.text + " not bound");
}
| (literal)=> literal
| (lambda_expression)=> lambda_expression
| (conditional)=> conditional
| (assignment)=> assignment
| (derived_expression)=> derived_expression
| (procedure_call)=> procedure_call
| (macro_use)=> macro_use
| macro_block
;

keyword
: identifier
;

literal
: quotation
| self_evaluating
;

self_evaluating
: bool
| number
| CHARACTER
| STRING
;

quotation
: '\'' datum
| '(' QUOTE datum ')'
;

lambda_expression
: '(' LAMBDA {this.scope = this.scope.push();}
formals body
{this.scope = this.scope.pop();}')'
;

formals
: '(' (vars+=variable+ ('.' vars+=variable )?)? ')'
{for (int i = 0; i \less $vars.size(); i++){
String name = ((CommonTree)$vars.get(i)).getText();
this.scope.bind(name);
}
}
| var=variable
{this.scope.bind($var.text);}
;

body
: ((definition)=> definition)* sequence
;

sequence
: expression+
;


conditional
: '(' IF test consequent alternate? ')'
;

test
: expression
;
consequent
: expression
;
alternate
: expression
;

assignment
: '(' SET_BANG variable expression ')'
;


derived_expression
: quasiquotation
| '(' ( COND ( '(' ELSE sequence ')'
| cond_clause+ ('(' ELSE sequence ')')?
)
| CASE expression ( case_clause+ ('(' ELSE sequence ')')?
| '(' ELSE sequence ')'
)
| AND test*
| OR test*
| LET variable? '(' {this.scope = this.scope.push();}
binding_spec[false] ')' body
{this.scope = this.scope.pop();}
| LET_STAR '(' {this.scope = this.scope.push();}
binding_spec[true] ')' body
{this.scope = this.scope.pop();}
| LETREC '(' {this.scope = this.scope.push();}
binding_spec[true] ')' body
{this.scope = this.scope.pop();}
| BEGIN sequence
| DO '(' iteration_spec* ')' '(' test do_result? ')' command*
| DELAY expression
)
')'

;

cond_clause
: '(' test (sequence | FOLLOWS recipient)? ')'
;

recipient
: expression
;

case_clause
: '(' '(' datum* ')' sequence ')'
;

binding_spec[boolean sequential]
: {sequential}? // let* or letrec: bind the var immediately
('(' var=variable
{this.scope.bind($var.text);}
expression ')')*

| {!sequential}? // normal let: bind all vars at the end
('(' vars+=variable expression ')')*
{for (int i = 0; i \less $vars.size(); i++){
String name = ((CommonTree)$vars.get(i)).getText();
this.scope.bind(name);
}
}
;

iteration_spec
: '(' variable init step ')'
;

init
: expression
;

step
: expression
;

do_result
: sequence
;

procedure_call
: '(' operator operand* ')'
;

operator
: expression
;

operand
: expression
;

macro_use
: '(' keyword datum* ')'
;

macro_block
: '(' (LET_SYNTAX | LETREC_SYNTAX) '(' syntax_spec*')' body ')'
;

syntax_spec
: '(' keyword transformer_spec')'
;


// TRANSFORMERS

transformer_spec
: '(' SYNTAX_RULES '(' identifier* ')' syntax_rule* ')'
;

syntax_rule
: '(' pattern template ')'
;

pattern
: pattern_identifier
| '(' (pattern+ ('.' pattern)?)? ')'
| '#(' (pattern+ ELLIPSIS?)? ')'
| pattern_datum
;

pattern_datum
: bool
| number
| CHARACTER
| STRING
;

template
: pattern_identifier
| '(' (template_element+ ('.' template)?)? ')'
| '#('template_element* ')'
| template_datum
;

template_element
: template ELLIPSIS?
;

template_datum
: pattern_datum
;

pattern_identifier
: syntactic_keyword
| VARIABLE
;

// external representations
// a Datum is what the _read_ procedure successfully parses.
// Note that any string that parses as an expression will also parse as a datum.
datum
: simple_datum
| compound_datum
;

simple_datum
: bool
| number
| CHARACTER
| STRING
| identifier
;

compound_datum
: list
| vector
;

list
: '(' (datum+ ( '.' datum)?)? ')'
| abbreviation
;

abbreviation
: abbrev_prefix datum
;

abbrev_prefix
: ('\'' | '`' | ',' | ',@')
;

vector
: '#(' datum* ')'
;

// QUASIQUOTATIONS
// CONTEXT-SENSITIVE

quasiquotation
: quasiquotation_D[1]
;

quasiquotation_D[int d]
: '`' qq_template[d]
| '(' QUASIQUOTE qq_template[d] ')'
;

qq_template[int d]
: (expression)=> expression
| ('(' UNQUOTE)=> unquotation[d]
| simple_datum
| vectorQQ_template[d]
| listQQ_template[d]
;

vectorQQ_template[int d]
: '#(' qq_template_or_slice[d]* ')'
;

listQQ_template[int d]
: '\'' qq_template[d]
| ('(' QUASIQUOTE)=> quasiquotation_D[d+1]
| '(' (qq_template_or_slice[d]+ ('.' qq_template[d])?)? ')'
;

unquotation[int d]
: ',' qq_template[d-1]
| '(' UNQUOTE qq_template[d-1] ')'
;

qq_template_or_slice[int d]
: ('(' UNQUOTE_SPLICING)=> splicing_unquotation[d]
| qq_template[d]
;

splicing_unquotation[int d]
: ',@' qq_template[d-1]
| '(' UNQUOTE_SPLICING qq_template[d-1] ')'
;



// values

bool: TRUE | FALSE;
number: NUM_2 | NUM_8 | NUM_10 | NUM_16;
identifier: syntactic_keyword | variable;
variable : VARIABLE | ELLIPSIS;

// KEYWORDS

syntactic_keyword
: expression_keyword
| ELSE
| FOLLOWS
| DEFINE
| UNQUOTE
| UNQUOTE_SPLICING;
expression_keyword
: QUOTE
| LAMBDA
| IF
| SET_BANG
| BEGIN
| COND
| AND
| OR
| CASE
| LET
| LET_STAR
| LETREC
| DO
| DELAY
| QUASIQUOTE;

// syntactic keywords
ELSE : 'else';
FOLLOWS : '=>';
DEFINE : 'define';
UNQUOTE : 'unquote';
UNQUOTE_SPLICING : 'unquote-splicing';

// expression keywords
QUOTE : 'QUOTE';
LAMBDA : 'lambda';
IF : 'if';
SET_BANG : 'set!';
BEGIN : 'begin';
COND : 'cond';
AND : 'and';
OR : 'or';
CASE : 'case';
LET : 'let';
LET_STAR : 'let*';
LETREC : 'letrec';
DO : 'do';
DELAY : 'delay';
QUASIQUOTE : 'quasiquote';

// macro keywords
LETREC_SYNTAX : 'letrec-syntax';
LET_SYNTAX : 'let-syntax';
SYNTAX_RULES : 'syntax_rules';
DEFINE_SYNTAX : 'define-syntax';

ELLIPSIS : '...';

//RESERVED_CHAR : '{'| '}' | '[' | ']' | '|';

STRING : '"' STRING_ELEMENT* '"';

TRUE : '#' ('T' | 't');
FALSE : '#' ('f' | 'F');

CHARACTER : '#\\' (~(' ' | '\n') | CHARACTER_NAME);

VARIABLE : INITIAL SUBSEQUENT* | PECULIAR_IDENTIFIER;

// space and comments are ignored
SPACE : (' ' | '\n' | '\t' | '\r') {$channel = HIDDEN;};
COMMENT : ';' ~('\r' | '\n')* {$channel = HIDDEN;};


fragment INITIAL : LETTER | SPECIAL_INITIAL;
fragment LETTER : 'a'..'z' | 'A'..'Z';
fragment SPECIAL_INITIAL : '!' | '$' | '%' | '&' | '*' | '/' | ':' | '\less' | '=' | '>' | '?' | '^' | '_' | '~';
fragment SUBSEQUENT : INITIAL | DIGIT | SPECIAL_SUBSEQUENT;
fragment SPECIAL_SUBSEQUENT : '+' | '-' | '.' | '@';
fragment PECULIAR_IDENTIFIER : '+' | '-';
fragment STRING_ELEMENT : ~('"' | '\\') | '\\' ('"' | '\\');
fragment CHARACTER_NAME : 'space' | 'newline';



// NUMBERS

fragment SUFFIX : EXPONENT_MARKER SIGN? DIGIT+;
fragment EXPONENT_MARKER : 'e' | 'E' | 's' | 'S' | 'f' | 'F' | 'd' | 'D' | 'l' |'L';
fragment SIGN : '+' | '-';
fragment EXACTNESS : '#' ('i' | 'I' | 'e' | 'E');
fragment IMAGINARY : 'i' | 'I';
fragment DIGIT : '0'..'9';

// BINARY NUMBERS

NUM_2 : PREFIX_2 COMPLEX_2;

fragment COMPLEX_2
: REAL_2 ('@' REAL_2)?
| REAL_2? ('+' | '-') UREAL_2? IMAGINARY
;
fragment REAL_2 : SIGN? UREAL_2;
fragment UREAL_2 : UINTEGER_2 ('/' UINTEGER_2)?;
fragment UINTEGER_2 : DIGIT_2+ '#'*;

fragment PREFIX_2
: RADIX_2 EXACTNESS? // #b #i
| EXACTNESS RADIX_2 // #i #b
;

fragment RADIX_2 : '#' ('b' | 'B');
fragment DIGIT_2 : '0' | '1';

// OCTAL NUMBERS

NUM_8 : PREFIX_8 COMPLEX_8;

fragment COMPLEX_8
: REAL_8 ('@' REAL_8)?
| REAL_8? ('+' | '-') UREAL_8? IMAGINARY
;

fragment REAL_8 : SIGN? UREAL_8;

fragment UREAL_8
: UINTEGER_8 ('/' UINTEGER_8)?;

fragment UINTEGER_8 : DIGIT_8+ '#'*;

fragment PREFIX_8
: RADIX_8 EXACTNESS? // #o #i
| EXACTNESS RADIX_8; // #i #o

fragment RADIX_8 : '#' ('o' | 'O');
fragment DIGIT_8 : '0' .. '7';

// DECIMAL NUMBERS

NUM_10 : PREFIX_10? COMPLEX_10;

fragment COMPLEX_10
: REAL_10 ('@' REAL_10)?
| REAL_10? ('+' | '-') UREAL_10? IMAGINARY
;

fragment REAL_10 : SIGN? UREAL_10;
fragment UREAL_10 : UINTEGER_10 ('/' UINTEGER_10)? | DECIMAL_10;
fragment UINTEGER_10 : DIGIT+ '#'*;

fragment DECIMAL_10
: UINTEGER_10 SUFFIX
| '.' DIGIT+ '#'* SUFFIX?
| DIGIT+ '.' DIGIT* '#'* SUFFIX?
| DIGIT+ '#'+ '.' '#'* SUFFIX?;

fragment PREFIX_10
: RADIX_10 EXACTNESS? // #d #i
| EXACTNESS RADIX_10; // #i #d

fragment RADIX_10 : '#' ('d' | 'D');

// HEXADECIMAL NUMBERS

NUM_16 : PREFIX_16 COMPLEX_16;

fragment COMPLEX_16
: REAL_16 ('@' REAL_16)?
| REAL_16? ('+' | '-') UREAL_16? IMAGINARY
;

fragment REAL_16 : SIGN? UREAL_16;

fragment UREAL_16
: UINTEGER_16 ('/' UINTEGER_16)?;

fragment UINTEGER_16 : DIGIT_16+ '#'*;

fragment PREFIX_16
: RADIX_16 EXACTNESS? // #x #i
| EXACTNESS RADIX_16; // #i #x

fragment RADIX_16 : '#' ('x' | 'X');
fragment DIGIT_16 : DIGIT | 'a'.. 'f' | 'A' .. 'F';

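(I had to replace "<" with "\less" to get the formatting to work.)

Edit: the solution to this problem turned out to be much simpler: (define x) is, surprisingly, valid in R5RS (see the last comment).

Best answer

There are many ways to improve error reporting. A quick fix is to override emitErrorMessage(String message) in the parser class and simply throw an exception with the supplied message: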

grammar T;

@members {
@Override
public void emitErrorMessage(String message) {
throw new RuntimeException(message);
}
}

definition
: '(' 'define' ( '(' variable def_formals ')' body ')'
| variable expression ')'
)
;

def_formals
: variable* ('.' variable)?
;

body
: ((definition)=> definition)* expression+
;

expression
: INT
;

variable
: ID
;

ID : 'a'..'z'+;
INT : '0'..'9';
SPACE : ' ' {skip();};
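The override in the grammar above can be read as swapping one hook in the error-reporting chain. The sketch below is only a stand-in written without the ANTLR runtime: MiniRecognizer, CollectingRecognizer, ThrowingRecognizer and parseDefinition are hypothetical names, and the chain reportError -> displayRecognitionError -> emitErrorMessage is an assumption about how the ANTLR 3 base recognizer routes messages. It contrasts collecting messages in a list with throwing at the first one.

```java
import java.util.ArrayList;
import java.util.List;

class ErrorHookDemo {
    public static void main(String[] args) {
        CollectingRecognizer collecting = new CollectingRecognizer();
        collecting.parseDefinition();
        System.out.println("collected -> " + collecting.getErrors());

        try {
            new ThrowingRecognizer().parseDefinition();
        } catch (RuntimeException e) {
            System.out.println("exception -> " + e.getMessage());
        }
    }
}

// Hypothetical stand-in for a recognizer base class. This is NOT the ANTLR runtime;
// it only mimics the assumed chain reportError -> displayRecognitionError -> emitErrorMessage.
class MiniRecognizer {
    // Simulates parsing "(define)": the error is reported, then the method returns normally.
    void parseDefinition() {
        reportError("line 1:7", "no viable alternative at input ')'");
    }

    void reportError(String header, String message) {
        displayRecognitionError(header, message);
    }

    void displayRecognitionError(String header, String message) {
        emitErrorMessage(header + " " + message);
    }

    // Default behaviour: print and carry on, so the caller never sees an exception.
    void emitErrorMessage(String message) {
        System.err.println(message);
    }
}

// Variant 1: keep parsing, but remember every message for later inspection.
class CollectingRecognizer extends MiniRecognizer {
    private final List<String> errors = new ArrayList<>();

    @Override
    void emitErrorMessage(String message) {
        errors.add(message);
    }

    List<String> getErrors() {
        return errors;
    }
}

// Variant 2: abort at the first message, as the override in grammar T does.
class ThrowingRecognizer extends MiniRecognizer {
    @Override
    void emitErrorMessage(String message) {
        throw new RuntimeException(message);
    }
}
```

Throwing is the simpler choice for a test class that only needs pass or fail; collecting is the better fit when every error of one run should be listed.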

You can test the grammar with this class:

import org.antlr.runtime.*;

public class Main {
    public static void main(String[] args) {
        String[] tests = {
            "(define x 5)",
            "(define x 5))",
            "(define x)",
            "(define)"
        };
        for (String input : tests) {
            TLexer lexer = new TLexer(new ANTLRStringStream(input));
            TParser parser = new TParser(new CommonTokenStream(lexer));
            System.out.println("\nParsing : " + input);
            try {
                parser.definition();
            } catch (Exception e) {
                System.out.println(" exception -> " + e.getMessage());
            }
        }
    }
}

After running the class above, you will see the following:

bart@hades:~/Programming/ANTLR/Demos/T$ java -cp antlr-3.3.jar org.antlr.Tool T.g
bart@hades:~/Programming/ANTLR/Demos/T$ javac -cp antlr-3.3.jar *.java
bart@hades:~/Programming/ANTLR/Demos/T$ java -cp .:antlr-3.3.jar Main

Parsing : (define x 5)

Parsing : (define x 5))

Parsing : (define x)
exception -> line 1:9 missing INT at ')'

Parsing : (define)
exception -> line 1:7 no viable alternative at input ')'

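As you can see, the input (define x 5)) did not produce an exception! That is because the lexer has no problem with it (they are all valid tokens) and the parser is simply instructed to consume the definition rule: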

definition
: '(' 'define' ( '(' variable def_formals ')' body ')'
| variable expression ')'
)
;

which it does. If you want an error for the dangling ')', you can add an EOF token at the end of the rule:

definition
: '(' 'define' ( '(' variable def_formals ')' body ')'
| variable expression ')'
)
EOF
;
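Why the trailing ')' goes unnoticed without EOF can be illustrated with a small hand-written parser. This is a sketch under stated assumptions, not ANTLR-generated code: EofDemo, tokenize, definition, definitionThenEof and expect are hypothetical names, only the second alternative of the rule is modelled, and the error texts are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in, NOT ANTLR-generated code. It mimics only the second
// alternative of the rule: '(' 'define' variable expression ')'.
class EofDemo {
    public static void main(String[] args) {
        List<String> tokens = tokenize("(define x 5))");
        System.out.println("rule consumed " + definition(tokens) + " of " + tokens.size() + " tokens");
        try {
            definitionThenEof(tokens);
        } catch (IllegalStateException e) {
            System.out.println("exception -> " + e.getMessage());
        }
    }

    // Splits the input into parentheses and whitespace-separated words.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        String spaced = input.replace("(", " ( ").replace(")", " ) ");
        for (String t : spaced.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Matches the rule and returns how many tokens it consumed; later tokens are never looked at.
    static int definition(List<String> tokens) {
        int pos = 0;
        pos = expect(tokens, pos, "\\(");
        pos = expect(tokens, pos, "define");
        pos = expect(tokens, pos, "[a-z]+"); // variable (ID)
        pos = expect(tokens, pos, "[0-9]");  // expression (INT, one digit as in grammar T)
        pos = expect(tokens, pos, "\\)");
        return pos;
    }

    // Same rule followed by an end-of-input check, comparable to appending EOF to the rule.
    static int definitionThenEof(List<String> tokens) {
        int pos = definition(tokens);
        if (pos != tokens.size()) {
            throw new IllegalStateException(
                "extraneous input '" + tokens.get(pos) + "', expecting end of input");
        }
        return pos;
    }

    // Consumes one token if it matches the regular expression, otherwise fails.
    private static int expect(List<String> tokens, int pos, String regex) {
        if (pos >= tokens.size() || !tokens.get(pos).matches(regex)) {
            String found = pos < tokens.size() ? "'" + tokens.get(pos) + "'" : "end of input";
            throw new IllegalStateException("unexpected " + found + " at token " + pos);
        }
        return pos + 1;
    }
}
```

definition succeeds on (define x 5)) because it stops after the first ')' and never inspects what follows; only the variant with the end-of-input check rejects the extra token.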

Regarding exception - ANTLR v3 NoViableAltException does not appear, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6627508/
