ANTLR4 negative lookahead in the lexer

Repost. Author: 行者123. Updated: 2023-12-04 00:36:15
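I am trying to define lexer rules for PostgreSQL SQL.

The problem is that the operator definition and line comments conflict with each other.

For example, @--- is the operator token @- followed by a -- comment, not the operator token @---.

In grako, you can define a negative lookahead for the - fragment, for example:

OP_MINUS: '-' ! ( '-' ) .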
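For readers more used to regular expressions, the grako rule above corresponds roughly to a negative lookahead assertion. The sketch below uses Python's `re` module to show the idea. The names `OP_MINUS` and `match_minus` are illustrative only and are not part of grako or ANTLR4.

```python
import re

# A '-' that is NOT followed by another '-'.
# (?!-) is a zero-width negative lookahead: it inspects the next
# character without consuming it.
OP_MINUS = re.compile(r"-(?!-)")


def match_minus(text: str, pos: int = 0) -> bool:
    """Return True if a lone '-' (not the start of '--') begins at pos."""
    return OP_MINUS.match(text, pos) is not None
```

Because the lookahead consumes nothing, a failed match leaves the input position unchanged, which is the behavior the question is looking for in ANTLR4.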

In ANTLR4, I cannot find any way to roll back a fragment that has already been consumed.

Any ideas?

Here is the original definition of PostgreSQL operators:

The operator name is a sequence of up to NAMEDATALEN-1
(63 by default) characters from the following list:

+ - * / < > = ~ ! @ # % ^ & | ` ?

There are a few restrictions on your choice of name:
-- and /* cannot appear anywhere in an operator name,
since they will be taken as the start of a comment.

A multicharacter operator name cannot end in + or -,
unless the name also contains at least one of these
characters:

~ ! @ # % ^ & | ` ?

For example, @- is an allowed operator name, but *- is not.
This restriction allows PostgreSQL to parse SQL-compliant
commands without requiring spaces between tokens.
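The restrictions quoted above can be written down as a small checker. This is a minimal sketch that assumes the default limit of 63 characters. The function name `is_valid_operator_name` and the constants are made up for illustration and are not PostgreSQL APIs.

```python
# Characters allowed in an operator name, per the quoted documentation.
OP_CHARS = set("+-*/<>=~!@#%^&|`?")
# At least one of these must be present for a multicharacter name
# to end in '+' or '-'.
SPECIAL_CHARS = set("~!@#%^&|`?")
MAX_LEN = 63  # assumption: NAMEDATALEN-1 at its default value


def is_valid_operator_name(name: str) -> bool:
    """Check a candidate operator name against the documented restrictions."""
    if not name or len(name) > MAX_LEN:
        return False
    if any(ch not in OP_CHARS for ch in name):
        return False
    # Comment openers may not appear anywhere in the name.
    if "--" in name or "/*" in name:
        return False
    # A multicharacter name may end in + or - only if it also
    # contains one of the special characters.
    if len(name) > 1 and name[-1] in "+-":
        return any(ch in SPECIAL_CHARS for ch in name)
    return True
```

With this sketch, `@-` is accepted and `*-` is rejected, matching the example in the quoted text.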

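Best answer

You can use a semantic predicate in a lexer rule to look ahead (or behind) without consuming characters. For example, the following rule covers several of the restrictions on operators.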

OPERATOR
  : ( [+*<>=~!@#%^&|`?]
    | '-' {_input.LA(1) != '-'}?
    | '/' {_input.LA(1) != '*'}?
    )+
  ;
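To make the effect of the two predicates concrete, here is a hand-written Python sketch that approximates what the rule above matches. It is not ANTLR-generated code. The helper `scan_operator` is hypothetical, and the check of the next character stands in for `_input.LA(1)`.

```python
# Characters the rule accepts unconditionally.
SIMPLE = set("+*<>=~!@#%^&|`?")


def scan_operator(text: str, pos: int = 0) -> str:
    """Return the longest operator starting at pos, or '' if there is none."""
    end = pos
    while end < len(text):
        ch = text[end]
        # One character of lookahead; '' stands for end of input.
        nxt = text[end + 1] if end + 1 < len(text) else ""
        if ch in SIMPLE:
            end += 1
        elif ch == "-" and nxt != "-":
            end += 1  # '-' is accepted only when it does not start '--'
        elif ch == "/" and nxt != "*":
            end += 1  # '/' is accepted only when it does not start '/*'
        else:
            break
    return text[pos:end]
```

With the predicates written this way, the scan stops at the first `-` that is followed by another `-`. For the input `@---` the sketch therefore returns `@`, with the comment starting immediately after it, while `@- x` yields `@-`.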

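However, the rule above does not address the restriction on an operator ending in + or -. To handle that in the simplest way possible, I would probably split the two cases into separate rules.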

// this rule does not allow + or - at the end of an operator
OPERATOR
  : ( [*<>=~!@#%^&|`?]
    | ( '+'
      | '-' {_input.LA(1) != '-'}?
      )+
      [*<>=~!@#%^&|`?]
    | '/' {_input.LA(1) != '*'}?
    )+
  ;

// this rule allows + or - at the end of an operator and sets the type to OPERATOR
// it requires a character from the special subset to appear
OPERATOR2
  : ( [*<>=+]
    | '-' {_input.LA(1) != '-'}?
    | '/' {_input.LA(1) != '*'}?
    )*
    [~!@#%^&|`?]
    OPERATOR?
    ( '+'
    | '-' {_input.LA(1) != '-'}?
    )+
    -> type(OPERATOR)
  ;
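As a way to sanity-check the intended tokenization, the documented restrictions can also be applied as post-processing after a greedy scan. The Python sketch below takes that different approach. It is not a translation of the two ANTLR rules, and the name `scan_operator_trimmed` is hypothetical.

```python
OP_CHARS = set("+-*/<>=~!@#%^&|`?")
SPECIAL_CHARS = set("~!@#%^&|`?")


def scan_operator_trimmed(text: str, pos: int = 0) -> str:
    """Scan an operator at pos, then trim it to satisfy the restrictions."""
    # 1. Take the longest run of operator characters.
    end = pos
    while end < len(text) and text[end] in OP_CHARS:
        end += 1
    run = text[pos:end]

    # 2. Cut the run before the first comment opener it contains.
    cuts = [i for i in (run.find("--"), run.find("/*")) if i != -1]
    if cuts:
        run = run[:min(cuts)]

    # 3. A multicharacter run without a special character may not end
    #    in '+' or '-': strip those, but keep at least one character.
    if len(run) > 1 and run[-1] in "+-":
        if not any(ch in SPECIAL_CHARS for ch in run):
            while len(run) > 1 and run[-1] in "+-":
                run = run[:-1]
    return run
```

Under these assumptions `*-` is reduced to `*`, leaving the `-` for the next token, while `@-` and `!=-` are kept whole because they contain a character from the special subset.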

Regarding ANTLR4 negative lookahead in the lexer, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24194110/