c++ - Boost Spirit parser: getting around the greedy kleene *

Reposted by: 搜寻专家. Updated: 2023-10-31 02:07:34

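I have a grammar that should match a sequence of characters followed by a single character drawn from a subset of the first character set. For example,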

boost::spirit::qi::rule<Iterator, std::string()> grammar = *char_('a', 'z') >> char_('b', 'z');

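Since the Kleene * is a greedy operator, it gobbles up everything and leaves nothing for the second parser, so the rule fails to match strings like "abcd".

Is there any way to get around this?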
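To make the failure concrete, here is a minimal sketch in plain C++ (no Spirit) of how a greedy, non-backtracking star followed by a single-character parser behaves. The names `in_range` and `greedy_star_then_char` are hypothetical helpers invented for this illustration; they only emulate the rule above and are not part of Boost.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for char_(lo, hi): true when c lies in [lo, hi].
inline bool in_range(char c, char lo, char hi) { return lo <= c && c <= hi; }

// Emulates *char_('a','z') >> char_('b','z') with PEG-style semantics:
// the star consumes as much as it can and never gives characters back.
inline bool greedy_star_then_char(const std::string& input) {
    std::size_t pos = 0;

    // Kleene star: consume every character in [a-z]. This step cannot fail.
    while (pos < input.size() && in_range(input[pos], 'a', 'z'))
        ++pos;

    // Trailing char_('b','z'): needs one more character in [b-z], but the
    // star only stops at end of input or at a character outside [a-z].
    return pos < input.size() && in_range(input[pos], 'b', 'z');
}
```

Under these assumptions the function returns false for every input, including "abcd": whatever the star leaves behind is either nothing or a character outside [a-z], and [b-z] is a subset of [a-z].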

Best answer

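Yes, though your sample lacks the context we would need to know for sure.

We need to know what constitutes a full match, because right now "b" would be a valid match, and so would "bb" or "bbb". So when the input is "bbb", what is the match going to be? (b, bb or bbb?)

And when you (likely) answer "obviously, bbb", then what happens for "bbbb"? When do you stop accepting characters from the subset? If you want the Kleene star to be non-greedy, do you still want it to be greedy?

The dialog above is annoying, but the goal is to make you think about what you need. You do not need a non-greedy Kleene star. You probably want a validation constraint on the last character. Most likely, if the input is "bbba", you do not want to simply match "bbb" and leave "a" behind. Instead, you likely want to stop parsing because "bbba" is not a valid token.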

Assuming that...

I'd write

grammar = +char_("a-z") >> eps(px::back(_val) != 'a');

This means we accept at least one character as long as it matches, and then assert that the last character was not an 'a'.

Live On Coliru

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/phoenix_stl.hpp>

namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;

template <typename It>
struct P : qi::grammar<It, std::string()>
{
    P() : P::base_type(start) {
        using namespace qi;
        start = +char_("a-z") >> eps(px::back(_val) != 'a');
    }
  private:
    qi::rule<It, std::string()> start;
};

#include <iomanip>
#include <iostream>
#include <string>

int main() {
    using It = std::string::const_iterator;
    P<It> const p;

    for (std::string const input : { "", "b", "bb", "bbb", "aaab", "a", "bbba" }) {
        std::cout << std::quoted(input) << ": ";
        std::string out;
        It f = input.begin(), l = input.end();
        if (parse(f, l, p, out)) {
            std::cout << std::quoted(out);
        } else {
            std::cout << "(failed) ";
        }

        if (f != l)
            std::cout << " Remaining: " << std::quoted(std::string(f,l));
        std::cout << "\n";
    }
}

Prints

"": (failed) 
"b": "b"
"bb": "bb"
"bbb": "bbb"
"aaab": "aaab"
"a": (failed)  Remaining: "a"
"bbba": (failed)  Remaining: "bbba"
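For readers without Boost at hand, the same acceptance rule (one or more characters in [a-z], then a check that the last one is not 'a') can be sketched with the standard library alone. This is only an emulation under that reading of the rule; `parse_token` is a hypothetical name, not a Spirit API.

```cpp
#include <cstddef>
#include <string>

// Emulates +char_("a-z") >> eps(back(_val) != 'a').
// On success, stores the matched prefix in `out` and returns true.
inline bool parse_token(const std::string& input, std::string& out) {
    std::size_t pos = 0;

    // Greedy "+": consume the longest run of characters in [a-z].
    while (pos < input.size() && input[pos] >= 'a' && input[pos] <= 'z')
        ++pos;

    if (pos == 0)
        return false;               // "+" requires at least one character

    if (input[pos - 1] == 'a')
        return false;               // validation: last character must not be 'a'

    out = input.substr(0, pos);
    return true;
}
```

The star stays greedy here; the token is rejected as a whole when its last character is 'a', which is why "bbba" fails instead of matching "bbb" and leaving "a" unparsed.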

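BONUS

A more generic, albeit less efficient, approach is to match each leading character together with a look-ahead assertion that it is not the last character of its kind: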

start = *(char_("a-z") >> &char_("a-z")) >> char_("b-z");

A benefit here is that there is no need to use Phoenix:

Live On Coliru

#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;

template <typename It>
struct P : qi::grammar<It, std::string()>
{
    P() : P::base_type(start) {
        using namespace qi;
        start = *(char_("a-z") >> &char_("a-z")) >> char_("b-z");
    }
  private:
    qi::rule<It, std::string()> start;
};

#include <iomanip>
#include <iostream>
#include <string>

int main() {
    using It = std::string::const_iterator;
    P<It> const p;

    for (std::string const input : { "", "b", "bb", "bbb", "aaab", "a", "bbba" }) {
        std::cout << std::quoted(input) << ": ";
        std::string out;
        It f = input.begin(), l = input.end();
        if (parse(f, l, p, out)) {
            std::cout << std::quoted(out);
        } else {
            std::cout << "(failed) ";
        }

        if (f != l)
            std::cout << " Remaining: " << std::quoted(std::string(f,l));
        std::cout << "\n";
    }
}
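The look-ahead rule can also be traced by hand. The sketch below emulates it with the standard library only, assuming the rule means "consume a character in [a-z] only while another character in [a-z] follows, then require one final character in [b-z]". The names `is_lower_az` and `parse_with_lookahead` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true when c is in [a-z].
inline bool is_lower_az(char c) { return c >= 'a' && c <= 'z'; }

// Emulates *(char_("a-z") >> &char_("a-z")) >> char_("b-z").
// On success, stores the matched prefix in `out` and returns true.
inline bool parse_with_lookahead(const std::string& input, std::string& out) {
    std::size_t pos = 0;

    // Star over (char >> look-ahead): take a character only if the next
    // one is also in [a-z]; the look-ahead itself consumes nothing.
    while (pos + 1 < input.size() && is_lower_az(input[pos]) &&
           is_lower_az(input[pos + 1]))
        ++pos;

    // Final character: must exist and lie in [b-z].
    if (pos < input.size() && input[pos] >= 'b' && input[pos] <= 'z') {
        out = input.substr(0, pos + 1);
        return true;
    }
    return false;
}
```

Every character except the last is examined twice, once by the look-ahead and once when it is consumed, which is where the extra cost of this approach comes from.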

Regarding "c++ - Boost Spirit parser: getting around the greedy kleene *", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/48478423/


