
ANTLR parser grammar -> tree grammar

Reposted. Author: 行者123. Updated: 2023-12-04 05:50:13
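The final assignment in our compiler theory course is to create a compiler for a small subset of Java (not MiniJava). Our professor let us use any tools we wish, and after a lot of research I settled on ANTLR. I managed to get the scanner and parser up and running, with the parser outputting an AST. I am now stuck trying to get a tree grammar file to compile. As I understand it, the basic idea is to copy the grammar rules from the parser and eliminate most of the code, leaving the rewrite rules, but it doesn't seem to want to compile (offendingToken error). Am I on the right track? Am I missing something trivial?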

Tree grammar:

tree grammar J0_SemanticAnalysis;

options {
language = Java;
tokenVocab = J0_Parser;
ASTLabelType = CommonTree;
}

@header
{
package ritterre.a4;
import java.util.Map;
import java.util.HashMap;
}

@members
{

}

walk
: compilationunit
;

compilationunit
: ^(UNIT importdeclaration* classdeclaration*)
;

importdeclaration
: ^(IMP_DEC IDENTIFIER+)
;

classdeclaration
: ^(CLASS IDENTIFIER ^(EXTENDS IDENTIFIER)? fielddeclaration* methoddeclaration*)
;

fielddeclaration
: ^(FIELD_DEC IDENTIFIER type visibility? STATIC?)
;

methoddeclaration
: ^(METHOD_DEC IDENTIFIER type visibility? STATIC? ^(PARAMS parameter+)? body)
;

visibility
: PRIVATE
| PUBLIC
;

parameter
: ^(PARAM IDENTIFIER type)
;

body
: ^(BODY ^(DECLARATIONS localdeclaration*) ^(STATEMENTS statement*))
;

localdeclaration
: ^(DECLARATION type IDENTIFIER)
;

statement
: assignment
| ifstatement
| whilestatement
| returnstatement
| callstatement
| printstatement
| block
;

assignment
: ^(ASSIGN IDENTIFIER+ expression? expression)
;

ifstatement
: ^(IF relation statement ^(ELSE statement)?)
;

whilestatement
: ^(WHILE relation statement)
;

returnstatement
: ^(RETURN expression?)
;

callstatement
: ^(CALL IDENTIFIER+ expression+)
;

printstatement
: ^(PRINT expression)
;

block
: ^(STATEMENTS statement*)
;

relation
// : expression (LTHAN | GTHAN | EQEQ | NEQ)^ expression
: ^(LTHAN expression expression)
| ^(GTHAN expression expression)
| ^(EQEQ expression expression)
| ^(NEQ expression expression)
;

expression
// : (PLUS | MINUS)? term ((PLUS | MINUS)^ term)*
: ^(PLUS term term)
| ^(MINUS term term)
;

term
// : factor ((MULT | DIV)^ factor)*
: ^(MULT factor factor)
| ^(DIV factor factor)
;

factor
: NUMBER
| IDENTIFIER (DOT IDENTIFIER | LBRAC expression RBRAC)?
| NULL
| NEW IDENTIFIER LPAREN RPAREN
| NEW (INT | IDENTIFIER) (LBRAC RBRAC)?
;

type
: (INT | IDENTIFIER) (LBRAC RBRAC)?
| VOID
;

Parser grammar:

parser grammar J0_Parser;

options
{
output = AST; // Output an AST
tokenVocab = J0_Scanner; // Pull Tokens from Scanner
//greedy = true; // forcing this throughout?! success!!
//cannot force greedy true throughout. bad things happen and the parser doesnt build
}

tokens
{
UNIT;
IMP_DEC;
FIELD_DEC;
METHOD_DEC;
PARAMS;
PARAM;
BODY;
DECLARATIONS;
STATEMENTS;
DECLARATION;
ASSIGN;
CALL;
}

@header { package ritterre.a4; }

// J0 - Extended Specification - EBNF
parse
: compilationunit EOF -> compilationunit
;

compilationunit
: importdeclaration* classdeclaration*
-> ^(UNIT importdeclaration* classdeclaration*)
;

importdeclaration
: IMPORT IDENTIFIER (DOT IDENTIFIER)* SCOLON
-> ^(IMP_DEC IDENTIFIER+)
;

classdeclaration
: (PUBLIC)? CLASS n=IDENTIFIER (EXTENDS e=IDENTIFIER)? LBRAK (fielddeclaration|methoddeclaration)* RBRAK
-> ^(CLASS $n ^(EXTENDS $e)? fielddeclaration* methoddeclaration*)
;

fielddeclaration
: visibility? STATIC? type IDENTIFIER SCOLON
-> ^(FIELD_DEC IDENTIFIER type visibility? STATIC?)
;

methoddeclaration
: visibility? STATIC? type IDENTIFIER LPAREN (parameter (COMMA parameter)*)? RPAREN body
-> ^(METHOD_DEC IDENTIFIER type visibility? STATIC? ^(PARAMS parameter+)? body)
;

visibility
: PRIVATE
| PUBLIC
;

parameter
: type IDENTIFIER
-> ^(PARAM IDENTIFIER type)
;

body
: LBRAK localdeclaration* statement* RBRAK
-> ^(BODY ^(DECLARATIONS localdeclaration*) ^(STATEMENTS statement*))
;

localdeclaration
: type IDENTIFIER SCOLON
-> ^(DECLARATION type IDENTIFIER)
;

statement
: assignment
| ifstatement
| whilestatement
| returnstatement
| callstatement
| printstatement
| block
;

assignment
: IDENTIFIER (DOT IDENTIFIER | LBRAC a=expression RBRAC)? EQ b=expression SCOLON
-> ^(ASSIGN IDENTIFIER+ $a? $b)
;

ifstatement
: IF LPAREN relation RPAREN statement (options {greedy=true;} : ELSE statement)?
-> ^(IF relation statement ^(ELSE statement)?)
;

whilestatement
: WHILE LPAREN relation RPAREN statement
-> ^(WHILE relation statement)
;

returnstatement
: RETURN expression? SCOLON
-> ^(RETURN expression?)
;

callstatement
: IDENTIFIER (DOT IDENTIFIER)? LPAREN (expression (COMMA expression)*)? RPAREN SCOLON
-> ^(CALL IDENTIFIER+ expression+)
;

printstatement
: PRINT LPAREN expression RPAREN SCOLON
-> ^(PRINT expression)
;

block
: LBRAK statement* RBRAK
-> ^(STATEMENTS statement*)
;

relation
: expression (LTHAN | GTHAN | EQEQ | NEQ)^ expression
;

expression
: (PLUS | MINUS)? term ((PLUS | MINUS)^ term)*
;

term
: factor ((MULT | DIV)^ factor)*
;

factor
: NUMBER
| IDENTIFIER (DOT IDENTIFIER | LBRAC expression RBRAC)?
| NULL
| NEW IDENTIFIER LPAREN RPAREN
| NEW (INT | IDENTIFIER) (LBRAC RBRAC)?
;

type
: (INT | IDENTIFIER) (LBRAC RBRAC)?
| VOID
;

Best answer

The problem is that in your tree grammar you do the following (three times, I believe):

classdeclaration
: ^(CLASS ... ^(EXTENDS IDENTIFIER)? ... )
;

The ^(EXTENDS IDENTIFIER)? part is wrong: you need to wrap the tree in parentheses and only then make it optional:

classdeclaration
: ^(CLASS ... (^(EXTENDS IDENTIFIER))? ... )
;
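
Applied to the tree grammar from the question, the same parenthesization would be needed in the three rules that use an optional subtree. The following is a sketch assembled from the question's own rules, with nothing else changed; it has not been run through ANTLR here:

```antlr
// Optional subtrees wrapped in parentheses before applying '?'.
classdeclaration
  : ^(CLASS IDENTIFIER (^(EXTENDS IDENTIFIER))? fielddeclaration* methoddeclaration*)
  ;

methoddeclaration
  : ^(METHOD_DEC IDENTIFIER type visibility? STATIC? (^(PARAMS parameter+))? body)
  ;

ifstatement
  : ^(IF relation statement (^(ELSE statement))?)
  ;
```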

But it would be too easy if that were all, wouldn't it? :)

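Once you fix the above, ANTLR will complain that the tree grammar is ambiguous when it tries to generate a tree walker from it. ANTLR will report the following: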

error(211): J0_SemanticAnalysis.g:61:26: [fatal] rule assignment has non-LL(*) decision due to recursive rule invocations reachable from alts 1,2. Resolve by left-factoring or using syntactic predicates or using backtrack=true option.

It is complaining about the assignment rule:

assignment
: ^(ASSIGN IDENTIFIER+ expression? expression)
;

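Because ANTLR is an LL parser generator[1], it processes tokens from left to right. The optional expression in IDENTIFIER+ expression? expression therefore makes the grammar ambiguous: when the walker reaches the first expression, it cannot tell whether it is the optional one or the mandatory one without looking past it, and since expression is recursive, no fixed amount of lookahead is enough. Fix this by moving the ? to the last expression:
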
assignment
: ^(ASSIGN IDENTIFIER+ expression expression?)
;
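
After this change, the position of an expression alone no longer says whether it is the index or the assigned value. One possible way to keep them apart is to label the two references. The labels first and second below are hypothetical names that do not appear in the original grammar; the sketch relies on the parser rewrite ^(ASSIGN IDENTIFIER+ $a? $b), where $a is the optional index expression and $b is the assigned value:

```antlr
// If 'second' is present: 'first' is the index expression ($a in the parser rule)
//                         and 'second' is the assigned value ($b).
// If 'second' is absent:  'first' is the assigned value ($b).
assignment
  : ^(ASSIGN IDENTIFIER+ first=expression second=expression?)
  ;
```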

[1] Don't let the last two letters of the name ANTLR mislead you: they stand for Language Recognition, not for the class of parser it generates!

Regarding "ANTLR parser grammar -> tree grammar", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/10153091/
