
perl - Incorrect tokenization with Marpa

Reposted. Author: 行者123. Last updated: 2023-12-04 14:29:34

I have a fairly large Marpa grammar (for parsing XPath) and have run into a tokenization problem. I have reduced it to the minimal failing example below:

use strict;
use warnings;
use Marpa::R2;

my $grammar = Marpa::R2::Scanless::G->new(
    {
        source => \(<<'END_OF_SOURCE'),
:default ::= action => ::array
:start ::= Start

Start ::= Child DoubleColon Token

DoubleColon ~ '::'
Child ~ 'child'
Token ~
    word
    | word ':' word
word ~ [\w]+

END_OF_SOURCE
    }
);
my $reader = Marpa::R2::Scanless::R->new(
    {
        grammar         => $grammar,
        trace_terminals => 1,
    }
);

my $input = 'child::book';
$reader->read(\$input);

This script prints the following:
Registering character U+0063 as symbol 10: [[\w]]
Registering character U+0063 as symbol 3: [[c]]
Registering character U+0068 as symbol 10: [[\w]]
Registering character U+0068 as symbol 4: [[h]]
Registering character U+0069 as symbol 10: [[\w]]
Registering character U+0069 as symbol 5: [[i]]
Registering character U+006c as symbol 10: [[\w]]
Registering character U+006c as symbol 6: [[l]]
Registering character U+0064 as symbol 10: [[\w]]
Registering character U+0064 as symbol 7: [[d]]
Registering character U+003a as symbol 1: [[\:]]
Rejected lexeme @0-5: Token; value="child"
Accepted lexeme @0-5: Child; value="child"
Registering character U+0062 as symbol 10: [[\w]]
Error in SLIF G1 read: No lexeme found at position 6
* String before error: child::
* The error was at line 1, column 8, and at character 0x0062 'b', ...
* here: book

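I expected the input to be tokenized as [Child] [DoubleColon] [word]. As the terminal trace shows, only one colon character is read and processed. The lexer appears to try to tokenize the start of the string as [word] [':'] [word] and fails partway through. If line 10 of the grammar ( | word ':' word ) is removed, the error is no longer thrown.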

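I tried giving DoubleColon a priority ( :lexeme ~ <DoubleColon> priority => 1 ), but that did not help. Can anyone tell me what to do so that this grammar parses the input string correctly? It still needs to be able to parse child::ns:book, and so on.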

Best answer

This appears to be a bug in Marpa::R2 2.058, the current release. My apologies, and thank you for documenting the problem so carefully.

I have a fix that passes the test suite, and I will release a new version soon.
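
For readers who cannot upgrade right away, here is one possible way to restructure the grammar. It is only a sketch and does not come from the author of the answer. It assumes it is acceptable to treat the single colon as a lexeme of its own and to define Token as a G1 rule (::=) instead of a lexeme (~). The names Colon and parse_step are introduced for this illustration, and the sketch has not been checked against release 2.058.

```perl
use strict;
use warnings;
use Data::Dumper;
use Marpa::R2;

# Assumption: Token becomes a structural (G1) rule, and the single colon
# gets its own lexeme, named Colon here for illustration.
my $grammar = Marpa::R2::Scanless::G->new(
    {
        source => \(<<'END_OF_SOURCE'),
:default ::= action => ::array
:start ::= Start

Start ::= Child DoubleColon Token
Token ::= word | word Colon word

DoubleColon ~ '::'
Colon ~ ':'
Child ~ 'child'
word ~ [\w]+
END_OF_SOURCE
    }
);

# parse_step is a hypothetical helper: it parses one input string with a
# fresh recognizer and returns the resulting nested array.
sub parse_step {
    my ($input) = @_;
    my $reader = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
    $reader->read( \$input );
    my $value_ref = $reader->value;
    die "No parse for '$input'" if not defined $value_ref;
    return ${$value_ref};
}

# Dump the parse result for both inputs mentioned in the question.
for my $input ( 'child::book', 'child::ns:book' ) {
    local $Data::Dumper::Indent = 0;
    local $Data::Dumper::Terse  = 1;
    print "$input => ", Dumper( parse_step($input) ), "\n";
}
```

With this layout the lexer only ever has to choose between '::' and ':' at a single position, and the decision about whether a Token has one or two parts is left to the G1 rules.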

Source: this question ("perl - Incorrect tokenization with Marpa") as found on Stack Overflow: https://stackoverflow.com/questions/17203668/
