
error-handling - Antlr4 discards the remaining tokens instead of bailing out

Reposted. Author: 行者123. Updated: 2023-12-05 01:10:15

I am using Antlr4, and this is a simplified grammar I wrote:

grammar BooleanExpression;

/*******************************
* Parser Rules
*******************************/
booleanTerm
: booleanLiteral (KW_OR booleanLiteral)+
| booleanLiteral
;

id
: IDENTIFIER
;

booleanLiteral
: KW_TRUE
| KW_FALSE
;

/*******************************
* Lexer Rules
*******************************/
KW_TRUE
: 'true'
;

KW_FALSE
: 'false'
;

KW_OR
: 'or'
;

IDENTIFIER
: (SIMPLE_LATIN)+
;

fragment
SIMPLE_LATIN
: 'A' .. 'Z'
| 'a' .. 'z'
;

WHITESPACE
: [ \t\n\r]+ -> skip
;

I used a BailErrorStrategy and a BailLexer, as follows:
import org.antlr.v4.runtime.DefaultErrorStrategy;
import org.antlr.v4.runtime.InputMismatchException;
import org.antlr.v4.runtime.Parser;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Token;

public class BailErrorStrategy extends DefaultErrorStrategy {
    /**
     * Instead of recovering from exception e, rethrow it wrapped in a generic
     * IllegalArgumentException so it is not caught by the rule function catches.
     * Exception e is the "cause" of the IllegalArgumentException.
     */
    @Override
    public void recover(Parser recognizer, RecognitionException e) {
        throw new IllegalArgumentException(e);
    }

    /**
     * Make sure we don't attempt to recover inline; if the parser successfully
     * recovers, it won't throw an exception.
     */
    @Override
    public Token recoverInline(Parser recognizer) throws RecognitionException {
        throw new IllegalArgumentException(new InputMismatchException(recognizer));
    }

    /** Make sure we don't attempt to recover from problems in subrules. */
    @Override
    public void sync(Parser recognizer) {
    }

    /** Never conjure up a missing token; bail out instead. */
    @Override
    protected Token getMissingSymbol(Parser recognizer) {
        throw new IllegalArgumentException(new InputMismatchException(recognizer));
    }
}



import org.antlr.v4.runtime.CharStream;
import org.antlr.v4.runtime.LexerNoViableAltException;
import org.antlr.v4.runtime.RecognitionException;

public class BailLexer extends BooleanExpressionLexer {
    public BailLexer(CharStream input) {
        super(input);
        //removeErrorListeners();
        //addErrorListener(new ConsoleErrorListener());
    }

    @Override
    public void recover(LexerNoViableAltException e) {
        throw new IllegalArgumentException(e); // Bail out
    }

    @Override
    public void recover(RecognitionException re) {
        throw new IllegalArgumentException(re); // Bail out
    }
}

Everything works fine except for one case. I tried the following expression:
true OR false

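I expected this expression to be rejected with an IllegalArgumentException, because the 'or' token must be lowercase, not uppercase. Instead, Antlr4 did not reject it. The input is tokenized as "KW_TRUE IDENTIFIER KW_FALSE" (which is expected, since the uppercase 'OR' is treated as an IDENTIFIER), but the parser throws no error while processing this token stream. It parses the input into a tree containing only "true" and discards the remaining "IDENTIFIER KW_FALSE" tokens. I tried different prediction modes, but they all behave the same way. I had no idea why it works like this, so I did some debugging, which eventually led me to this code in Antlr: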
ATNConfigSet reach = computeReachSet(previous, t, false);

if ( reach==null ) {
    // if any configs in previous dipped into outer context, that
    // means that input up to t actually finished entry rule
    // at least for SLL decision. Full LL doesn't dip into outer
    // so don't need special case.
    // We will get an error no matter what so delay until after
    // decision; better error message. Also, no reachable target
    // ATN states in SLL implies LL will also get nowhere.
    // If conflict in states that dip out, choose min since we
    // will get error no matter what.
    int alt = getAltThatFinishedDecisionEntryRule(previousD.configs);
    if ( alt!=ATN.INVALID_ALT_NUMBER ) {
        // return w/o altering DFA
        return alt;
    }
    throw noViableAlt(input, outerContext, previous, startIndex);
}

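The line "int alt = getAltThatFinishedDecisionEntryRule(previousD.configs);" returns the second alternative of booleanTerm (because "true" matches the second alternative, "booleanLiteral"). Since that value is not equal to ATN.INVALID_ALT_NUMBER, noViableAlt is not thrown immediately. According to the Java comment there, "We will get an error no matter what so delay until after decision", but it seems that no error is ever thrown in the end.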

I really don't know how to make Antlr report an error in this case. Can anyone shed some light on this? Any help is appreciated, thanks.

Best answer

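If your top-level rule does not end with an explicit EOF, ANTLR is not required to parse to the end of the input sequence. Rather than throwing an exception, it simply parses the valid portion of the sequence you gave it.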
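To see why the parser can stop after "true", it may help to confirm what the lexer hands it. The following is a minimal hand-written sketch, not ANTLR-generated code: the class name TokenizeSketch and its methods are hypothetical, and it only approximates the question's lexer rules by taking the longest run of letters and then preferring a keyword rule over IDENTIFIER when the text matches exactly.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical hand-written approximation of the question's lexer rules; not ANTLR output. */
final class TokenizeSketch {

    /** Returns the token type names for the input, skipping whitespace. */
    static List<String> tokenize(String input) {
        List<String> types = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                i++; // mirrors WHITESPACE -> skip
                continue;
            }
            int start = i;
            // Consume the longest run of SIMPLE_LATIN characters.
            while (i < input.length() && isSimpleLatin(input.charAt(i))) {
                i++;
            }
            if (i == start) {
                throw new IllegalArgumentException("Unexpected character at index " + start);
            }
            types.add(classify(input.substring(start, i)));
        }
        return types;
    }

    static boolean isSimpleLatin(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Keyword rules are listed before IDENTIFIER in the grammar, so an exact match wins. */
    static String classify(String text) {
        switch (text) {
            case "true":  return "KW_TRUE";
            case "false": return "KW_FALSE";
            case "or":    return "KW_OR";   // case-sensitive: "OR" does not match
            default:      return "IDENTIFIER";
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("true OR false")); // [KW_TRUE, IDENTIFIER, KW_FALSE]
        System.out.println(tokenize("true or false")); // [KW_TRUE, KW_OR, KW_FALSE]
    }
}
```

The uppercase "OR" is a perfectly valid token, so the lexer never has a reason to bail out; any rejection has to come from the parser.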

The following start rule forces it to parse the entire input sequence as a single booleanTerm.

start : booleanTerm EOF;
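The effect of the trailing EOF can be illustrated with a small hand-written recursive-descent sketch. This is not how ANTLR's generated parser is implemented; the class PrefixParseSketch and its methods are hypothetical names, and the token stream is represented simply as a list of token type names. It only shows the difference between a rule that may stop after a valid prefix and one that must reach the end of the input.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical hand-written sketch; it mimics the grammar's rules, not ANTLR's internals. */
final class PrefixParseSketch {

    /**
     * booleanTerm, written in the equivalent form booleanLiteral (KW_OR booleanLiteral)*.
     * Returns the number of tokens consumed and stops quietly at the first
     * token that cannot continue the rule.
     */
    static int parseBooleanTerm(List<String> tokens) {
        int pos = expectLiteral(tokens, 0);
        while (pos < tokens.size() && tokens.get(pos).equals("KW_OR")) {
            pos = expectLiteral(tokens, pos + 1);
        }
        return pos;
    }

    /** start : booleanTerm EOF; anything left over after booleanTerm is an error. */
    static int parseStart(List<String> tokens) {
        int pos = parseBooleanTerm(tokens);
        if (pos != tokens.size()) {
            throw new IllegalArgumentException(
                "Expected EOF but found " + tokens.get(pos) + " at token index " + pos);
        }
        return pos;
    }

    private static int expectLiteral(List<String> tokens, int pos) {
        if (pos < tokens.size()
                && (tokens.get(pos).equals("KW_TRUE") || tokens.get(pos).equals("KW_FALSE"))) {
            return pos + 1;
        }
        throw new IllegalArgumentException("Expected a boolean literal at token index " + pos);
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("KW_TRUE", "IDENTIFIER", "KW_FALSE");
        // Without EOF: only the leading "true" is consumed, and no error is raised.
        System.out.println(parseBooleanTerm(tokens)); // 1
        try {
            parseStart(tokens);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With the question's input, the rule without EOF consumes one token and returns normally, while the EOF-terminated rule rejects the leftover IDENTIFIER.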

Also, BailErrorStrategy is provided by the ANTLR 4 runtime, and it throws a ParseCancellationException, which is more informative than the exception shown in your example.

Regarding "error-handling - Antlr4 discards the remaining tokens instead of bailing out", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/15127116/
