java - ANTLR mismatched input error

Reposted. Author: 行者123. Updated: 2023-11-30 03:49:04

I am writing a parser for my own language. I am trying to parse this phrase:

Number a is 10;

which is basically equivalent to int a = 10;

It should match the variable_def rule. When I run it, I get these errors:

line 1:0 extraneous input 'Number' expecting {<EOF>, 'while', ';', 'if', 'function', TYPE, 'global', 'room', ID}
line 1:9 mismatched input 'is' expecting '('

Here is my grammar:

grammar Script;

@header {
package script;
}

// PARSER

program
:
block EOF
;

block
:
(
statement
| functionDecl
)*
;

statement
:
(variable_def
| functionCall
| ifStatement
| forStatement
| whileStatement) ';'
;

whileStatement
:
'while' '(' expression ')' '{' (statement)* '}'
;

forStatement
:
;

ifStatement
:
'if' '(' expression ')' '{' statement* '}'
(
(
'else' '{' statement* '}'
)
|
(
'else' ifStatement
)
)?
;

functionDecl
:
'function' ID
(
'('
(
TYPE ID
)?
(
',' TYPE ID
)* ')'
)?
(
'returns' RETURN_TYPE
)? '{' statement* '}'
;

functionCall
:
ID '(' exprList? ')'
;

exprList
:
expression
(
',' expression
)*
;

variable_def
:

TYPE assignment
| GLOBAL variable_def
| ROOM variable_def
;

expression
:
'-' expression # unaryMinusExpression
| '!' expression # notExpression
| expression '^' expression # powerExpression
| expression '*' expression # multiplyExpression
| expression '/' expression # divideExpression
| expression '%' expression # modulusExpression
| expression '+' expression # addExpression
| expression '-' expression # subtractExpression
| expression '>=' expression # gtEqExpression
| expression '<=' expression # ltEqExpression
| expression '>' expression # gtExpression
| expression '<' expression # ltExpression
| expression '==' expression # eqExpression
| expression '!=' expression # notEqExpression
| expression '&&' expression # andExpression
| expression '||' expression # orExpression
| expression IN expression # inExpression
| NUMBER # numberExpression
| BOOLEAN # boolExpression
| functionCall # functionCallExpression
| '(' expression ')' # expressionExpression
;

assignment
:
ID ASSIGN expression
;

// LEXER

RETURN_TYPE
:
TYPE
| 'Nothing'
;

TYPE
:
'Number'
| 'String'
| 'Anything'
| 'Boolean'
| 'Growable'? 'List' 'of' TYPE
;

GLOBAL
:
'global'
;

ROOM
:
'room'
;

ASSIGN
:
'is'
(
'a'
| 'an'
| 'the'
)?
;

EQUAL
:
'is'?
(
'equal'
(
's'
| 'to'
)?
| 'equivalent' 'to'?
| 'the'? 'same' 'as'?
)
;

IN
:
'in'
;

BOOLEAN
:
'true'
| 'false'
;

NUMBER
:
'-'? INT '.' INT EXP? // 1.35, 1.35E-9, 0.3, -4.5

| '-'? '.' INT EXP? // -.35, .35e5

| '-'? INT EXP // 1e10 -3e4

| '-'? INT // -3, 45

;

fragment
EXP
:
[Ee] [+\-]? INT
;

fragment
INT
:
'0'
| [1-9] [0-9]*
;

STRING
:
'"'
(
' ' .. '~'
)* '"'
;

ID
:
(
'a' .. 'z'
| 'A' .. 'Z'
| '_'
)
(
'a' .. 'z'
| 'A' .. 'Z'
| '0' .. '9'
| '_'
)*
;

fragment
JAVADOC_COMMENT
:
'/*' .*? '*/'
;

fragment
LINE_COMMENT
:
(
'//'
| '#'
) ~( '\r' | '\n' )*
;

COMMENT
:
(
LINE_COMMENT
| JAVADOC_COMMENT
) -> skip
;

WS
:
[ \t\n\r]+ -> skip
;

How do I fix this error?

Best answer

The main cause is that, in your current grammar, a TYPE token is never created: RETURN_TYPE matches everything TYPE matches and is defined before TYPE, so it takes precedence.
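
The tie-break described above can be sketched in plain Java. This is a toy model, not ANTLR's implementation: it assumes the usual lexer behavior (the longest match wins, and on a tie the rule defined first wins), uses `java.util.regex` patterns as stand-ins for the lexer rules, and leaves out the `List of` alternative of TYPE. The class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Toy model of how a lexer picks one rule for the next token. */
final class LexerPriorityDemo {

    /** Returns the name of the rule that wins at the start of the input, or null. */
    static String winningRule(Map<String, String> rules, String input) {
        String winner = null;
        int longest = 0;
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            Matcher m = Pattern.compile(rule.getValue()).matcher(input);
            // lookingAt() anchors the match at the start of the input.
            if (m.lookingAt() && m.end() > longest) {
                // Only a strictly longer match replaces the current winner,
                // so on a tie the rule defined earlier is kept.
                longest = m.end();
                winner = rule.getKey();
            }
        }
        return winner;
    }

    /** Rules in the order used in the question (simplified regex stand-ins). */
    static Map<String, String> questionOrder() {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("RETURN_TYPE", "Number|String|Anything|Boolean|Nothing");
        rules.put("TYPE", "Number|String|Anything|Boolean");
        rules.put("ID", "[a-zA-Z_][a-zA-Z_0-9]*");
        return rules;
    }

    public static void main(String[] args) {
        System.out.println(winningRule(questionOrder(), "Number"));  // RETURN_TYPE
        System.out.println(winningRule(questionOrder(), "Numbers")); // ID
    }
}
```

In this model "Number" is matched equally well by all three rules, so the first one, RETURN_TYPE, is chosen and the parser never sees a TYPE token, which is consistent with the "extraneous input 'Number'" message in the question.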

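Furthermore, you are doing too much in the lexer. As soon as you start gluing words together inside lexer rules, that is a sign you should turn those rules into parser rules instead.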

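The lexer can skip whitespace, but that only takes effect between tokens, that is, in parser rules; inside a single lexer rule nothing is skipped. Take your ASSIGN rule as an example: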

ASSIGN
: 'is' ( 'a' | 'an' | 'the' )?
;
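
A rough way to check which strings that rule accepts as a single token is to use a Java regex as a stand-in for it. This is an approximation for illustration only, not ANTLR itself; the class and method names are made up.

```java
import java.util.regex.Pattern;

/** Regex stand-in for: ASSIGN : 'is' ( 'a' | 'an' | 'the' )? ; */
final class GluedWordsDemo {

    // Adjacent literals in a lexer rule are matched back to back,
    // with no whitespace allowed in between.
    static final Pattern ASSIGN = Pattern.compile("is(?:a|an|the)?");

    /** True if the whole text would be covered by one ASSIGN token. */
    static boolean isSingleAssignToken(String text) {
        return ASSIGN.matcher(text).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"is", "isa", "isan", "isthe", "is a", "is the"};
        for (String text : samples) {
            System.out.println("\"" + text + "\" -> " + isSingleAssignToken(text));
        }
    }
}
```

The first four samples print true and the two samples containing a space print false.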

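This rule will not match the string "is a" (with a space between "is" and "a") as one token. Besides a bare "is", it matches only "isa", "isan" and "isthe". The solution: turn it into a parser rule: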

assign
: 'is' ( 'a' | 'an' | 'the' )?
;

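This is equivalent to defining the following lexer rules explicitly (the literals in the parser rule then refer to these tokens):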

assign
: 'is' ( 'a' | 'an' | 'the' )?
;

IS : 'is';
A : 'a';
AN : 'an';
THE : 'the';

...

ID : [a-zA-Z_] [a-zA-Z_0-9]*;

As a result, the inputs 'is', 'a', 'an' and 'the' will never be tokenized as ID. So the following source, which should be a valid assignment, will fail:

Number a is 42;

because 'a' is tokenized as an A token, not as an ID.

To fix this, you can add the following parser rule:

id
: ( ID | A | AN | IS | THE | ... )
;

and use that rule in your other parser rules instead of ID.
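
The effect of such an `id` rule can be sketched at the parser level in plain Java. This is only an illustration, not generated ANTLR code: the token type names IS and SEMI are hypothetical labels for the 'is' and ';' literals, the set of identifier-like token types is an assumed subset, and the check only covers the fixed shape `TYPE name IS NUMBER SEMI`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy check of "TYPE <name> IS NUMBER SEMI" over a list of token types. */
final class IdRuleDemo {

    // Token types a hypothetical `id` parser rule would accept as a name.
    static final Set<String> ID_LIKE =
            new HashSet<>(Arrays.asList("ID", "A", "AN", "THE", "OF", "TO", "AS"));

    /**
     * With useIdRule == false the name must be exactly an ID token;
     * with useIdRule == true any token type in ID_LIKE is accepted.
     */
    static boolean parsesAsVariableDef(List<String> tokenTypes, boolean useIdRule) {
        if (tokenTypes.size() != 5) {
            return false;
        }
        String name = tokenTypes.get(1);
        boolean nameOk = useIdRule ? ID_LIKE.contains(name) : name.equals("ID");
        return tokenTypes.get(0).equals("TYPE")
                && nameOk
                && tokenTypes.get(2).equals("IS")
                && tokenTypes.get(3).equals("NUMBER")
                && tokenTypes.get(4).equals("SEMI");
    }

    public static void main(String[] args) {
        // Token types for "Number a is 42;" once the A rule is defined before ID.
        List<String> tokens = Arrays.asList("TYPE", "A", "IS", "NUMBER", "SEMI");
        System.out.println(parsesAsVariableDef(tokens, false)); // false
        System.out.println(parsesAsVariableDef(tokens, true));  // true
    }
}
```

The keyword tokens stay available for phrases like `is a`, while the `id` alternative list decides which of them may also serve as names.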

A quick demo looks like this:

grammar Script;

// PARSER

program
: block EOF
;

block
: ( statement | functionDecl )*
;

statement
: ( variable_def
| functionCall
| ifStatement
| forStatement
| whileStatement
)
';'
;

whileStatement
: 'while' '(' expression ')' '{' statement* '}'
;

forStatement
:
;

ifStatement
: 'if' '(' expression ')' '{' statement* '}'
( ( 'else' '{' statement* '}' ) | ( 'else' ifStatement ) )?
;

functionDecl
: 'function' id ( '(' ( type id )? ( ',' type id )* ')' )?
( 'returns' return_type )? '{' statement* '}'
;

functionCall
: id '(' exprList? ')'
;

exprList
: expression ( ',' expression )*
;

variable_def
: type assignment
| GLOBAL variable_def
| ROOM variable_def
;

expression
: '-' expression # unaryMinusExpression
| '!' expression # notExpression
| expression '^' expression # powerExpression
| expression '*' expression # multiplyExpression
| expression '/' expression # divideExpression
| expression '%' expression # modulusExpression
| expression '+' expression # addExpression
| expression '-' expression # subtractExpression
| expression '>=' expression # gtEqExpression
| expression '<=' expression # ltEqExpression
| expression '>' expression # gtExpression
| expression '<' expression # ltExpression
| expression '==' expression # eqExpression
| expression '!=' expression # notEqExpression
| expression '&&' expression # andExpression
| expression '||' expression # orExpression
| expression IN expression # inExpression
| NUMBER # numberExpression
| BOOLEAN # boolExpression
| functionCall # functionCallExpression
| '(' expression ')' # expressionExpression
;

assignment
: id assign expression
;

return_type
: type
| 'Nothing'
;

type
: TYPE
| 'Growable'? 'List' OF TYPE
;

assign
: 'is' ( A | AN | THE )?
;

equal
: 'is'? ( EQUAL ( S
| TO
)?
| EQUIVALENT TO?
| THE? SAME AS?
)
;

id
: ( ID | OF | A | AN | EQUAL | S | EQUIVALENT | TO | THE | SAME | AS )
;

// LEXER

// Some keywords you might want to match as identifiers too:
OF : 'of';
A : 'a';
AN : 'an';
EQUAL : 'equal';
S : 's';
EQUIVALENT : 'equivalent';
TO : 'to';
THE : 'the';
SAME : 'same';
AS : 'as';

COMMENT
: ( LINE_COMMENT | JAVADOC_COMMENT ) -> skip
;

WS
: [ \t\n\r]+ -> skip
;

TYPE
: 'Number'
| 'String'
| 'Anything'
| 'Boolean'
;

GLOBAL
: 'global'
;

ROOM
: 'room'
;

IN
: 'in'
;

BOOLEAN
: 'true'
| 'false'
;

NUMBER
: '-'? INT '.' INT EXP? // 1.35, 1.35E-9, 0.3, -4.5
| '-'? '.' INT EXP? // -.35, .35e5
| '-'? INT EXP // 1e10 -3e4
| '-'? INT // -3, 45
;

STRING
: '"' .*? '"'
;

ID
: [a-zA-Z_] [a-zA-Z_0-9]*
;

fragment EXP
: [Ee] [+\-]? INT
;

fragment INT
: '0'
| [1-9] [0-9]*
;

fragment JAVADOC_COMMENT
: '/*' .*? '*/'
;

fragment LINE_COMMENT
: ( '//' | '#' ) ~( '\r' | '\n' )*
;

Regarding java - ANTLR mismatched input error, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24875836/
