java - ANTLRWorks grammar parser

Reposted. Author: 行者123. Updated: 2023-12-02 00:32:31

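I created a simple grammar in ANTLRWorks. I then generated the code and got two files: grammarLexer.java and grammarParser.java. My goal is to map my grammar to the Java language. What should I do next to achieve that?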

Here is my grammar:

`grammar grammar;

prog : ((FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | VARIABLE) | FUNCTION_DEC)+;

FOR        :     WS* 'for' WS+ VARIABLE WS+ DIGIT+ WS+ DIGIT+ WS* ENTER  ( FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | INC_DEC )* WS* 'end' WS* ENTER;
WHILE : WS* 'while' WS+ (VARIABLE | DIGIT+) WS* EQ_OPERATOR WS* (VARIABLE | DIGIT+) WS* ENTER (FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | (WS* INC_DEC))* WS* 'end' WS* ENTER;
IF : WS* 'if' WS+ ( FUNCTION | VARIABLE | DIGIT+) WS* EQ_OPERATOR WS* (VARIABLE | DIGIT+) WS* ENTER (FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | INC_DEC)* ( WS* 'else' ENTER (FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | (WS* INC_DEC))*)? WS* 'end' WS* ENTER;

CHAR : ('a'..'z'|'A'..'Z')+;
EQ_OPERATOR : ('<' | '>' | '==' | '>=' | '<=' | '!=');
DIGIT : '0'..'9'+;
ENTER : '\n';
WS : ' ' | '\t';

PRINT_TEMPLATE : WS+ (('"' (CHAR | DIGIT | WS)* '"') | VARIABLE | DIGIT+ | FUNCTION | INC_DEC);
PRINT : WS* 'print' PRINT_TEMPLATE (',' PRINT_TEMPLATE)* WS* ENTER;

VARIABLE : CHAR(CHAR|DIGIT)*;
FUN_TEMPLATE : WS* (VARIABLE | DIGIT+ | '"' (CHAR | DIGIT | WS)* '"');
FUNCTION : VARIABLE '(' (FUN_TEMPLATE (WS* ',' FUN_TEMPLATE)*)? ')' WS* ENTER*;

DECLARATION : WS* VARIABLE WS* ('=' WS* (DIGIT+ | '"' (CHAR | DIGIT | WS)* '"' | VARIABLE)) WS* ENTER;
FUNCTION_DEC : WS*'def' WS* FUNCTION ( FOR | WHILE | IF | PRINT | DECLARATION | ENTER | (WS* FUNCTION) | INC_DEC )* WS* 'end' WS* ENTER*;

INC_DEC : VARIABLE ('--' | '++') WS* ENTER*;`

Here is my parser main class: `
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonToken;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.Parser;

public class Main {
    public static void main(String[] args) throws Exception {
        // the input source
        String source =
            "for i 1 3\n " +
            "printHi()\n " +
            "end\n " +
            "if fun(y, z) == 0\n " +
            "end\n ";
        // create an instance of the lexer
        grammarLexer lexer = new grammarLexer(new ANTLRStringStream(source));

        // wrap a token-stream around the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);

        // traverse the tokens and print them to see if the correct tokens are created
        int n = 1;
        for (Object o : tokens.getTokens()) {
            CommonToken token = (CommonToken) o;
            System.out.println("token(" + n + ") = " + token.getText().replace("\n", "\\n"));
            n++;
        }
        grammarParser parser = new grammarParser(tokens);
        parser.file();
    }
}
`

Best answer

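As I already mentioned in the comments: you are overusing lexer rules, and that is a mistake. Think of lexer rules as the basic building blocks of your language, much like the elements you would use to describe water in chemistry. You would not describe water like this: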

WATER
: 'HHO'
;

that is, as a single element. Water should instead be described as three separate elements:

water
: Hydrogen Hydrogen Oxygen
;

Hydrogen : 'H';
Oxygen : 'O';

where Hydrogen and Oxygen are the basic building blocks (lexer rules) and water is the compound (parser rule).

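A good rule of thumb: if you find yourself writing lexer rules that are composed of several other lexer rules, chances are something is fishy in your grammar. Of course, this is not always the case.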
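To make the "building blocks" idea concrete without ANTLR, here is a minimal hand-written tokenizer sketch in plain Java. The class `TinyLexer`, the `TYPE:text` output format and the `OP` token type are hypothetical names chosen for illustration; ANTLR generates its own lexer and does not expose anything like this. The point is only that each token is small and stands on its own, and that whole statements such as a `for` loop are left to the parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-written lexer, for illustration only.
// With ANTLR you would not write this; the tool generates the lexer from the grammar.
class TinyLexer {
    // Each named group is one small building block; none is composed of the others.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<WS>\\s+)|(?<INT>[0-9]+)|(?<ID>[a-zA-Z]+)|(?<STR>'[^']*')|(?<OP>==|!=|[(),])");

    // Splits the source into "TYPE:text" strings, skipping whitespace.
    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at index " + pos);
            }
            pos = m.end();
            if (m.group("WS") != null) {
                continue; // whitespace separates tokens but is not passed on
            }
            tokens.add(typeOf(m) + ":" + m.group());
        }
        return tokens;
    }

    private static String typeOf(Matcher m) {
        if (m.group("INT") != null) return "INT";
        if (m.group("STR") != null) return "STR";
        if (m.group("OP") != null) return "OP";
        String text = m.group("ID");
        // Keywords look like identifiers, so they are classified after matching.
        if (text.equals("for") || text.equals("end") || text.equals("if")) {
            return text.toUpperCase(Locale.ROOT);
        }
        return "ID";
    }

    public static void main(String[] args) {
        System.out.println(tokenize("for i 1 3"));
        System.out.println(tokenize("if fun(y, z) == 0"));
    }
}
```

For the input `for i 1 3` this yields four independent tokens (`FOR:for`, `ID:i`, `INT:1`, `INT:3`). Deciding that those four tokens followed by a block and `end` form a loop is the job of a parser rule.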

Say you want to parse the following input:

for i 1 3
print(i)
end

if fun(y, z) == 0
print('foo')
end

The grammar could then look like this:

grammar T;

options {
output=AST;
}

tokens {
BLOCK;
CALL;
PARAMS;
}

// parser rules
parse
: block EOF!
;

block
: stat* -> ^(BLOCK stat*)
;

stat
: for_stat
| if_stat
| func_call
;

for_stat
: FOR^ ID expr expr block END!
;

if_stat
: IF^ expr block END!
;

expr
: eq_expr
;

eq_expr
: atom (('==' | '!=')^ atom)*
;

atom
: func_call
| INT
| ID
| STR
;

func_call
: ID '(' params ')' -> ^(CALL ID params)
;

params
: (expr (',' expr)*)? -> ^(PARAMS expr*)
;

// lexer rules
FOR : 'for';
END : 'end';
IF : 'if';
ID : ('a'..'z' | 'A'..'Z')+;
INT : '0'..'9'+;
STR : '\'' ~('\'')* '\'';
SP : (' ' | '\t' | '\r' | '\n')+ {skip();};

If you now run this test class:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
    public static void main(String[] args) throws Exception {
        String src =
            "for i 1 3 \n" +
            " print(i) \n" +
            "end \n" +
            " \n" +
            "if fun(y, z) == 0 \n" +
            " print('foo') \n" +
            "end \n";
        // lex and parse the source with the classes generated from grammar T
        TLexer lexer = new TLexer(new ANTLRStringStream(src));
        TParser parser = new TParser(new CommonTokenStream(lexer));
        // the start rule returns the root of the AST
        CommonTree tree = (CommonTree)parser.parse().getTree();
        // print the AST in DOT format
        DOTTreeGenerator gen = new DOTTreeGenerator();
        StringTemplate st = gen.toDOT(tree);
        System.out.println(st);
    }
}

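you will see some output printed to the console (a DOT description of the tree), which corresponds to the following AST:

[image: diagram of the resulting AST]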
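The question's goal was a mapping to Java. Once an AST is available, one possible approach is to walk it and emit Java source text. The following is a minimal sketch under several assumptions, not part of the original answer: `Node` is a hypothetical stand-in for ANTLR's `CommonTree`, the tree is built by hand in the shape the rewrite rules above would produce for `for i 1 3 print(i) end`, and the chosen semantics (an inclusive upper bound for `for`, calls emitted as plain Java method calls) are guesses, since the source does not define what the language means.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical AST-to-Java emitter; it does not use any ANTLR classes.
class JavaEmitter {
    // Hypothetical stand-in for ANTLR's CommonTree: a token text plus child nodes.
    static class Node {
        final String text;
        final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Emits Java source for a statement node (BLOCK, for, if or CALL).
    // A real walker would switch on the token type rather than on the text.
    static String stat(Node n, String indent) {
        switch (n.text) {
            case "BLOCK":
                return n.children.stream().map(c -> stat(c, indent)).collect(Collectors.joining());
            case "for": {
                // ^(FOR ID expr expr block); the inclusive upper bound is an assumption
                String v = n.children.get(0).text;
                return indent + "for (int " + v + " = " + expr(n.children.get(1)) + "; "
                        + v + " <= " + expr(n.children.get(2)) + "; " + v + "++) {\n"
                        + stat(n.children.get(3), indent + "    ")
                        + indent + "}\n";
            }
            case "if":
                // ^(IF expr block)
                return indent + "if (" + expr(n.children.get(0)) + ") {\n"
                        + stat(n.children.get(1), indent + "    ")
                        + indent + "}\n";
            case "CALL":
                return indent + expr(n) + ";\n";
            default:
                throw new IllegalArgumentException("not a statement: " + n.text);
        }
    }

    // Emits Java source for an expression node.
    static String expr(Node n) {
        switch (n.text) {
            case "CALL": {
                // ^(CALL ID ^(PARAMS expr*))
                String args = n.children.get(1).children.stream()
                        .map(JavaEmitter::expr).collect(Collectors.joining(", "));
                return n.children.get(0).text + "(" + args + ")";
            }
            case "==":
            case "!=":
                return expr(n.children.get(0)) + " " + n.text + " " + expr(n.children.get(1));
            default:
                if (n.text.startsWith("'")) {
                    // STR leaf: turn 'foo' into a Java string literal "foo"
                    return "\"" + n.text.substring(1, n.text.length() - 1) + "\"";
                }
                return n.text; // ID or INT leaf
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("for", new Node("i"), new Node("1"), new Node("3"),
                new Node("BLOCK",
                        new Node("CALL", new Node("print"), new Node("PARAMS", new Node("i")))));
        System.out.print(stat(tree, ""));
    }
}
```

With ANTLR 3 the same walk could be written against `CommonTree`, or expressed as a tree grammar; the hand-built `Node` is used here only to keep the sketch free of generated classes.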

Regarding java - ANTLRWorks grammar parser, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/8659200/
