
c++ - Problems with boost::spirit::lex and whitespace

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 00:08:05

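I am trying to learn boost::spirit. To that end, I wanted to create some simple lexers, combine them, and then start parsing with spirit. I tried modifying an example, but it does not run as expected (the result r is not correct).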

Here is the lexer:

#include <boost/spirit/include/lex_lexertl.hpp>
#include <string>

namespace lex = boost::spirit::lex;

template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
    lexer_identifier()
        : identifier("[a-zA-Z_][a-zA-Z0-9_]*")
        , white_space("[ \\t\\n]+")
    {
        using boost::spirit::lex::_start;
        using boost::spirit::lex::_end;

        this->self = identifier;          // default state
        this->self("WS") = white_space;   // separate lexer state "WS"
    }
    lex::token_def<> identifier;
    lex::token_def<> white_space;
    std::string identifier_name;
};

This is the example I want to run:

#include "stdafx.h"

#include <boost/spirit/include/lex_lexertl.hpp>
#include <string>
#include "my_Lexer.h"

namespace lex = boost::spirit::lex;

int _tmain(int argc, _TCHAR* argv[])
{
    typedef lex::lexertl::token<char const*,lex::omit, boost::mpl::false_> token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;

    typedef lexer_identifier<lexer_type>::iterator_type iterator_type;

    lexer_identifier<lexer_type> my_lexer;

    std::string test("adedvied das934adf dfklj_03245");

    char const* first = test.c_str();
    char const* last = &first[test.size()];

    lexer_type::iterator_type iter = my_lexer.begin(first, last);
    lexer_type::iterator_type end = my_lexer.end();

    while (iter != end && token_is_valid(*iter))
    {
        ++iter;
    }

    bool r = (iter == end);

    return 0;
}

r is true only as long as the string contains a single token. Why is that?

Regards, Tobias

Best answer

You have created a second lexer state, but never invoked it.
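To make the effect concrete: white_space is registered only in the "WS" state, so the default state has no rule that can match a space, and iteration stops at the first one. The following is a minimal sketch of that idea in plain C++. It is a hypothetical toy model, not Boost.Spirit code; the names Rule, LexResult, lex_in_state and question_states are invented for illustration, and the matching is hand-written rather than regex-driven.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model (not Boost.Spirit code): each lexer state owns
// its own set of rules, and only the rules of the active state are tried.
enum class Rule { Identifier, WhiteSpace };

// Length of the match of `rule` at `pos`, or 0 when it does not match.
std::size_t match(Rule rule, const std::string& in, std::size_t pos) {
    assert(pos <= in.size());
    std::size_t i = pos;
    if (rule == Rule::Identifier) {
        // [a-zA-Z_][a-zA-Z0-9_]*
        if (i < in.size() &&
            (std::isalpha(static_cast<unsigned char>(in[i])) || in[i] == '_')) {
            ++i;
            while (i < in.size() &&
                   (std::isalnum(static_cast<unsigned char>(in[i])) || in[i] == '_'))
                ++i;
        }
    } else {
        // [ \t\n]+
        while (i < in.size() && (in[i] == ' ' || in[i] == '\t' || in[i] == '\n'))
            ++i;
    }
    return i - pos;
}

struct LexResult {
    std::size_t tokens;  // tokens produced before stopping
    bool consumed_all;   // plays the role of `r` in the question
};

// Lex `in` using only the rules registered for `state`.
LexResult lex_in_state(const std::map<std::string, std::vector<Rule>>& states,
                       const std::string& state, const std::string& in) {
    const std::vector<Rule>& rules = states.at(state);
    std::size_t pos = 0, tokens = 0;
    while (pos < in.size()) {
        std::size_t best = 0;
        for (Rule r : rules) best = std::max(best, match(r, in, pos));
        if (best == 0) break;  // no rule of the active state matches here
        pos += best;
        ++tokens;
    }
    return {tokens, pos == in.size()};
}

// The question's layout: identifier in "INITIAL", white_space only in "WS".
std::map<std::string, std::vector<Rule>> question_states() {
    return {{"INITIAL", {Rule::Identifier}}, {"WS", {Rule::WhiteSpace}}};
}
```

In this model, lex_in_state(question_states(), "INITIAL", "adedvied") consumes everything, while the three-word test string from the question stops after the first identifier, because nothing in "INITIAL" can match the space that follows it.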

Simplify and profit:


In most cases, the easiest way to get the desired effect is single-state lexing with the pass_ignore flag on the skippable tokens:

    this->self += identifier
                | white_space [ lex::_pass = lex::pass_flags::pass_ignore ];

Note that this requires an actor_lexer to allow semantic actions:

typedef lex::lexertl::actor_lexer<token_type> lexer_type;
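The idea behind pass_ignore can be sketched without Boost: both rules live in one state, white space is still matched and consumed, but no token is handed to the caller for it. The code below is a hypothetical toy model in plain C++, not how Spirit.Lex is implemented; Lexed and lex_ignoring_space are names made up for this illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model (not Boost.Spirit code) of single-state lexing
// where white space is matched but flagged "ignore", so it is never
// handed to the caller.
struct Lexed {
    std::vector<std::string> tokens;  // only the tokens that were passed on
    bool consumed_all;                // true when no unmatched input was met
};

static bool is_ident_start(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) || c == '_';
}
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_';
}
static bool is_space(char c) { return c == ' ' || c == '\t' || c == '\n'; }

Lexed lex_ignoring_space(const std::string& in) {
    Lexed out{{}, false};
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::size_t start = pos;
        if (is_ident_start(in[pos])) {
            while (pos < in.size() && is_ident_char(in[pos])) ++pos;
            out.tokens.push_back(in.substr(start, pos - start));  // passed on
        } else if (is_space(in[pos])) {
            while (pos < in.size() && is_space(in[pos])) ++pos;
            // matched, but ignored: nothing is emitted for this match
        } else {
            break;  // neither rule matches: stop here
        }
        assert(pos > start);  // every successful match makes progress
    }
    out.consumed_all = (pos == in.size());
    return out;
}
```

Called on the test string "adedvied das934adf dfklj_03245", this model yields three identifier tokens and reports that the whole input was consumed, which mirrors the role of r in the question.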

Full sample:

#include <boost/spirit/include/lex_lexertl.hpp>
#include <iostream>
#include <string>
namespace lex = boost::spirit::lex;

template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
    lexer_identifier()
        : identifier("[a-zA-Z_][a-zA-Z0-9_]*")
        , white_space("[ \\t\\n]+")
    {
        using boost::spirit::lex::_start;
        using boost::spirit::lex::_end;

        // single state: white space is matched but never passed on
        this->self += identifier
                    | white_space [ lex::_pass = lex::pass_flags::pass_ignore ];
    }
    lex::token_def<> identifier;
    lex::token_def<> white_space;
    std::string identifier_name;
};

int main()
{
    typedef lex::lexertl::token<char const*,lex::omit, boost::mpl::false_> token_type;
    typedef lex::lexertl::actor_lexer<token_type> lexer_type;

    typedef lexer_identifier<lexer_type>::iterator_type iterator_type;

    lexer_identifier<lexer_type> my_lexer;

    std::string test("adedvied das934adf dfklj_03245");

    char const* first = test.c_str();
    char const* last = &first[test.size()];

    lexer_type::iterator_type iter = my_lexer.begin(first, last);
    lexer_type::iterator_type end = my_lexer.end();

    while (iter != end && token_is_valid(*iter))
    {
        ++iter;
    }

    bool r = (iter == end);
    std::cout << std::boolalpha << r << "\n";
}

Prints

true

"WS" as a skipper state


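It is also possible that you came across a sample that uses the second lexer state for the skipper (lex::tokenize_and_phrase_parse). Let me take a minute or 10 to create a working sample for that.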

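Update: it took me a bit over 10 minutes (waaaah) :) Here is a comparative test showing how the lexer states interact, and how to use Spirit skipper parsing to invoke the second lexer state: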

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <iostream>
#include <string>
namespace lex = boost::spirit::lex;
namespace qi = boost::spirit::qi;

template <typename Lexer>
struct lexer_identifier : lex::lexer<Lexer>
{
    lexer_identifier()
        : identifier("[a-zA-Z_][a-zA-Z0-9_]*")
        , white_space("[ \\t\\n]+")
    {
        this->self = identifier;
        this->self("WS") = white_space;
    }
    lex::token_def<> identifier;
    lex::token_def<lex::omit> white_space;
};

int main()
{
    typedef lex::lexertl::token<char const*, lex::omit, boost::mpl::true_> token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;

    typedef lexer_identifier<lexer_type>::iterator_type iterator_type;

    lexer_identifier<lexer_type> my_lexer;

    std::string test("adedvied das934adf dfklj_03245");

    {
        char const* first = test.c_str();
        char const* last = &first[test.size()];

        // cannot lex the whole input in just the WS state:
        bool ok = lex::tokenize(first, last, my_lexer, "WS");
        std::cout << "Starting state WS:\t" << std::boolalpha << ok << "\n";
    }

    {
        char const* first = test.c_str();
        char const* last = &first[test.size()];

        // cannot lex the whole input in just the default state either:
        bool ok = lex::tokenize(first, last, my_lexer, "INITIAL");
        std::cout << "Starting state INITIAL:\t" << std::boolalpha << ok << "\n";
    }

    {
        char const* first = test.c_str();
        char const* last = &first[test.size()];

        // parse in the default state, skipping with the rules of state "WS"
        bool ok = lex::tokenize_and_phrase_parse(
            first, last, my_lexer,
            *my_lexer.self, qi::in_state("WS")[my_lexer.self]);
        ok = ok && (first == last); // verify full input consumed
        std::cout << std::boolalpha << ok << "\n";
    }
}

The output is

Starting state WS:  false
Starting state INITIAL: false
true
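The three results above follow from how the two states interleave: neither state alone can consume the whole input, but a parser that asks for identifiers while a skipper consumes white space in between can. The sketch below models that interplay in plain C++. It is a hypothetical toy, not Boost.Spirit code, and it assumes the skipper runs before each token and once more after the last one; match_initial, match_ws and parse_with_skipper are invented names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical toy model (not Boost.Spirit code): two lexer "states",
// each a function that returns the length of its match at `pos`.
static std::size_t match_initial(const std::string& in, std::size_t pos) {
    // state "INITIAL": [a-zA-Z_][a-zA-Z0-9_]*
    std::size_t i = pos;
    if (i < in.size() &&
        (std::isalpha(static_cast<unsigned char>(in[i])) || in[i] == '_')) {
        ++i;
        while (i < in.size() &&
               (std::isalnum(static_cast<unsigned char>(in[i])) || in[i] == '_'))
            ++i;
    }
    return i - pos;
}

static std::size_t match_ws(const std::string& in, std::size_t pos) {
    // state "WS": [ \t\n]+
    std::size_t i = pos;
    while (i < in.size() && (in[i] == ' ' || in[i] == '\t' || in[i] == '\n'))
        ++i;
    return i - pos;
}

// Models a grammar of "zero or more INITIAL tokens" with a skipper that
// runs in the WS state before each token and once more at the end.
// Returns true when the whole input was consumed; `count` receives the
// number of identifier tokens seen.
bool parse_with_skipper(const std::string& in, std::size_t& count) {
    std::size_t pos = 0;
    count = 0;
    for (;;) {
        pos += match_ws(in, pos);                  // skipper: rules of "WS"
        std::size_t len = match_initial(in, pos);  // parser: rules of "INITIAL"
        if (len == 0) break;                       // no further identifier
        pos += len;
        ++count;
    }
    assert(pos <= in.size());
    return pos == in.size();
}
```

For the test string used throughout, this model counts three identifiers and reports full consumption, while an input such as "abc 123" stops at the digit because neither state has a rule for it.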

Regarding c++ - problems with boost::spirit::lex and whitespace, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/13361519/
