
java - ANTLR - concatenating tokens for output

Reposted. Author: 行者123. Updated: 2023-12-01 12:45:23

Using ANTLR3, I want to parse strings such as:

  • name IS NOT empty AND age NOT IN (14, 15)

For input like this, I want to get the following AST:

n0 [label="QUERY"];
n1 [label="AND"];
n2 [label="IS NOT"];
n3 [label="name"];
n4 [label="empty"];
n5 [label="NOT IN"];
n6 [label="age"];
n7 [label="14"];
n8 [label="15"];

n0 -> n1 // "QUERY" -> "AND"
n1 -> n2 // "AND" -> "IS NOT"
n2 -> n3 // "IS NOT" -> "name"
n2 -> n4 // "IS NOT" -> "empty"
n1 -> n5 // "AND" -> "NOT IN"
n5 -> n6 // "NOT IN" -> "age"
n5 -> n7 // "NOT IN" -> "14"
n5 -> n8 // "NOT IN" -> "15"
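
For reference, the DOT listing above corresponds to a pre-order numbering of the desired tree. The following is a minimal sketch in plain Java that builds that tree by hand and prints the same declarations and edges. `AstDot`, `Node`, `sample` and `toDot` are hypothetical names made up for this illustration; they are not part of the ANTLR runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not an ANTLR class.
final class AstDot {
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // The tree the question asks for, built by hand.
    static Node sample() {
        return new Node("QUERY",
            new Node("AND",
                new Node("IS NOT", new Node("name"), new Node("empty")),
                new Node("NOT IN", new Node("age"), new Node("14"), new Node("15"))));
    }

    // Emits node declarations, a blank line, then edges.
    static String toDot(Node root) {
        StringBuilder decls = new StringBuilder();
        StringBuilder edges = new StringBuilder();
        walk(root, new int[] {0}, decls, edges);
        return decls + "\n" + edges;
    }

    // Pre-order walk: a node takes the next free id before its children do.
    private static void walk(Node n, int[] counter, StringBuilder decls, StringBuilder edges) {
        int id = counter[0]++;
        decls.append("n").append(id).append(" [label=\"").append(n.label).append("\"];\n");
        for (Node c : n.children) {
            int childId = counter[0]; // id the child will receive on the next call
            edges.append("n").append(id).append(" -> n").append(childId).append("\n");
            walk(c, counter, decls, edges);
        }
    }

    public static void main(String[] args) {
        System.out.print(toDot(sample()));
    }
}
```

The point of the sketch is that "IS NOT" and "NOT IN" each occupy a single node (n2 and n5), which is what the grammar has to produce.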

But my n2 and n5 nodes come out as: n2 [label="IS"]; n5 [label="NOT"];

That is, only the first word appears. How can I merge the two tokens into a single token?

My grammar is:

query
: expr EOF -> ^(QUERY expr)
;

expr
: logical_expr
;

logical_expr
: equality_expr (logical_op^ equality_expr)*
;

equality_expr
: ID equality_op+ atom -> ^(equality_op ID atom)
| '(' expr ')' -> ^('(' expr)
;

atom
: ID
| id_list
| Int
| Number
| String
| '*'
;

id_list
: '(' ID (',' ID)+ ')' -> ID+
| '(' Number (',' Number)* ')' -> Number+
| '(' String (',' String)* ')' -> String+
;

equality_op
: 'IN'
| 'IS'
| 'NOT'
| 'in'
| 'is'
| 'not'
;

logical_op
: 'AND'
| 'OR'
| 'and'
| 'or'
;

Number
: Int ('.' Digit*)?
;

ID
: ('a'..'z' | 'A'..'Z' | '_' | '.' | '-' | '*' | '/' | ':' | Digit)*
;

String
@after {
setText(getText().substring(1, getText().length()-1).replaceAll("\\\\(.)", "$1"));
}
: '"' (~('"' | '\\') | '\\' ('\\' | '"'))* '"'
| '\'' (~('\'' | '\\') | '\\' ('\\' | '\''))* '\''
;

Comment
: '//' ~('\r' | '\n')* {skip();}
| '/*' .* '*/' {skip();}
;

Space
: (' ' | '\t' | '\r' | '\n' | '\u000C') {skip();}
;

fragment Int
: '1'..'9' Digit*
| '0'
;

fragment Digit
: '0'..'9'
;

indexes
: ('[' expr ']')+ -> ^(INDEXES expr+)
;

Best answer

Do something like this instead (see the inline comments I added):

tokens {
IS_NOT; // added
NOT_IN; // added
QUERY;
INDEXES;
}

query
: expr EOF -> ^(QUERY expr)
;

expr
: logical_expr
;

logical_expr
: equality_expr (logical_op^ equality_expr)*
;

equality_expr
: ID equality_op atom -> ^(equality_op ID atom) // changed equality_op+ to equality_op
| '(' expr ')' -> ^('(' expr)
;

atom
: ID
| id_list
| Int
| Number
| String
| '*'
;

id_list
: '(' ID (',' ID)+ ')' -> ID+
| '(' Number (',' Number)* ')' -> Number+
| '(' String (',' String)* ')' -> String+
;

equality_op
: IS NOT -> IS_NOT // added
| NOT IN -> NOT_IN // added
| IN
| IS
| NOT
;

logical_op
: AND
| OR
;

IS : 'IS' | 'is'; // added
NOT : 'NOT' | 'not'; // added
IN : 'IN' | 'in'; // added
AND : 'AND' | 'and'; // added
OR : 'OR' | 'or'; // added

Number
: Int ('.' Digit*)?
;

ID
: ('a'..'z' | 'A'..'Z' | '_' | '.' | '-' | '*' | '/' | ':' | Digit)+
;

String
@after {
setText(getText().substring(1, getText().length()-1).replaceAll("\\\\(.)", "$1"));
}
: '"' (~('"' | '\\') | '\\' ('\\' | '"'))* '"'
| '\'' (~('\'' | '\\') | '\\' ('\\' | '\''))* '\''
;

Comment
: '//' ~('\r' | '\n')* {skip();}
| '/*' .* '*/' {skip();}
;

Space
: (' ' | '\t' | '\r' | '\n' | '\u000C') {skip();}
;

fragment Int
: '1'..'9' Digit*
| '0'
;

fragment Digit
: '0'..'9'
;

indexes
: ('[' expr ']')+ -> ^(INDEXES expr+)
;

This produces the following AST:

[Image: the resulting AST]
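
As a rough analogy for what the first two `equality_op` alternatives do, here is a small sketch in plain Java that collapses adjacent operator keywords into one combined token. It runs over a flat list of strings rather than inside an ANTLR parser, and `OperatorMerge` and `merge` are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration; ANTLR does this through the rewrite rules
// `IS NOT -> IS_NOT` and `NOT IN -> NOT_IN`, not through code like this.
final class OperatorMerge {
    // Replaces adjacent IS+NOT with IS_NOT and NOT+IN with NOT_IN (case-insensitive).
    static List<String> merge(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            String cur = tokens.get(i).toUpperCase(Locale.ROOT);
            String next = i + 1 < tokens.size()
                ? tokens.get(i + 1).toUpperCase(Locale.ROOT)
                : "";
            if (cur.equals("IS") && next.equals("NOT")) {
                out.add("IS_NOT");
                i += 2; // consumed two input tokens
            } else if (cur.equals("NOT") && next.equals("IN")) {
                out.add("NOT_IN");
                i += 2;
            } else {
                out.add(tokens.get(i)); // pass everything else through unchanged
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "name", "IS", "NOT", "empty", "AND", "age", "NOT", "IN", "(", "14", ",", "15", ")");
        System.out.println(merge(input));
    }
}
```

As in the grammar, the two-word alternatives are tried before the single-word ones, so a lone `IS`, `NOT` or `IN` is still passed through as its own token.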

Also, a lexer rule should always match at least one character (I have mentioned this to you before). Your lexer rule ID could match zero characters.
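
The difference between `*` and `+` in the ID rule can be illustrated with `java.util.regex`. This is only an analogy: ANTLR's lexer is not a regex engine, and `EmptyMatchDemo`, `ID_CHARS` and `matchLengthAt` are hypothetical names for this sketch. The character class mirrors the one in the ID rule.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; a regex analogy, not ANTLR's actual lexer.
final class EmptyMatchDemo {
    // Same characters the ID rule accepts.
    static final String ID_CHARS = "[a-zA-Z_.\\-*/:0-9]";

    // Length of the match starting exactly at pos, or -1 if there is no match there.
    static int matchLengthAt(Pattern p, String input, int pos) {
        Matcher m = p.matcher(input);
        m.region(pos, input.length());
        return m.lookingAt() ? m.end() - pos : -1;
    }

    public static void main(String[] args) {
        Pattern star = Pattern.compile(ID_CHARS + "*"); // like the original ID rule
        Pattern plus = Pattern.compile(ID_CHARS + "+"); // like the corrected ID rule
        String input = "(14";
        // '(' is not an ID character, so the starred pattern "succeeds" with an empty match.
        System.out.println(matchLengthAt(star, input, 0)); // 0
        System.out.println(matchLengthAt(plus, input, 0)); // -1
    }
}
```

A match of length zero consumes no input, so a tokenizer that accepted it would make no progress at that position. Requiring at least one character with `+` rules this out.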

Regarding java - ANTLR - concatenating tokens for output, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24754291/
