c++ - Boost Spirit: how to use custom logic when parsing a list of doubles with text specifiers

Reposted. Author: 行者123. Updated: 2023-11-28 04:03:01
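I want to parse a vector of doubles. However, this vector may also contain two kinds of statements that compress the data somewhat: FOR and RAMP.

If FOR is in the string, the format should be "<double> FOR <int>". This means: repeat <double> <int> times.

For example, "1 1.5 2 2.5 3 FOR 4 3.5" should parse to { 1, 1.5, 2, 2.5, 3, 3, 3, 3, 3.5 }

If RAMP is in the string, the format should be "<double1> RAMP <int> <double2>". This means: interpolate linearly between <double1> and <double2> over <int> periods.

For example, "1 2 3 4 RAMP 3 6 7 8" should parse to { 1, 2, 3, 4, 5, 6, 7, 8 }

I don't know how to proceed beyond defining parsers for the individual elements. How can I supply custom code that performs the expansion when one of these statements is encountered?

Thanks!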

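Best answer

The simplest approach, without semantic actions¹, is to parse into an AST, which you can then interpret.

The more tedious approach is to use semantic actions to build the result. (Keep in mind that this causes problems with backtracking grammars.)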
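
To see why backtracking is a concern, here is a small hand-written stand-in (not Spirit itself; `Toy`, `number`, `literal`, `element` and `parse_one` are hypothetical names used only for illustration). The "action" attached to a number fires as soon as the number matches. When the first alternative later fails and the parser backtracks, the input position is restored but the side effect is not:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy parser, only to illustrate side effects under backtracking.
struct Toy {
    std::string in;
    std::size_t pos;
    std::vector<double> out;

    explicit Toy(std::string text) : in(std::move(text)), pos(0) {}

    void skip_space() { while (pos < in.size() && in[pos] == ' ') ++pos; }

    // Matches a number and immediately runs its "semantic action" (push_back).
    bool number() {
        skip_space();
        char const* first = in.c_str() + pos;
        char* last = nullptr;
        double const v = std::strtod(first, &last);
        if (last == first) return false;
        pos += static_cast<std::size_t>(last - first);
        out.push_back(v); // side effect: stays in place if the caller backtracks
        return true;
    }

    bool literal(std::string const& word) {
        skip_space();
        if (in.compare(pos, word.size(), word) != 0) return false;
        pos += word.size();
        return true;
    }

    // element = (number "FOR") | number
    bool element() {
        std::size_t const saved = pos;
        if (number() && literal("FOR")) return true;
        pos = saved;      // backtracking restores the input position only
        return number();  // so the action fires a second time
    }
};

inline std::vector<double> parse_one(std::string const& text) {
    Toy p(text);
    p.element();
    return p.out;
}
```

With this sketch, `parse_one("3")` yields two copies of 3 instead of one, because the first alternative already pushed the value before failing on the missing "FOR".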

Without further ado:

Using an AST representation

An example AST:

namespace AST {
    using N = unsigned long;
    using V = double;

    struct repeat { N n; V value; };
    struct interpolate {
        N n; V start, end;
        bool is_valid() const;
    };

    using element = boost::variant<repeat, interpolate>;
    using elements = std::vector<element>;

is_valid is a good place for logic checks such as "the number of periods isn't zero" or "if the number of periods is 1, start and end must coincide".
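
The answer declares `is_valid` but does not show its body. A minimal standalone sketch, assuming exactly the two checks named above and nothing more, might look like this (the definition is an illustration, not the author's code):

```cpp
#include <cassert>

namespace AST {
    using N = unsigned long;
    using V = double;

    struct interpolate {
        N n; V start, end;
        bool is_valid() const;
    };

    // Assumed definition: reject zero periods, and require start == end
    // when there is only a single period.
    inline bool interpolate::is_valid() const {
        if (n == 0) return false;
        if (n == 1) return start == end;
        return true;
    }
}
```

The exact floating-point comparison for the single-period case matches the check used in the semantic-action variant of the answer; a tolerance could be substituted if the inputs are computed rather than parsed.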

Now, for our final result, we want to convert this into just a vector of V:

    using values = std::vector<V>;

    static inline values expand(elements const& v) {
        struct {
            values result;
            void operator()(repeat const& e) {
                result.insert(result.end(), e.n, e.value);
            }
            void operator()(interpolate const& e) {
                if (!e.is_valid()) {
                    throw std::runtime_error("bad interpolation");
                }
                if (e.n>0) { result.push_back(e.start); }
                if (e.n>2) {
                    auto const delta = (e.end-e.start)/(e.n-1);
                    for (N i=1; i<(e.n-1); ++i)
                        result.push_back(e.start + i * delta);
                }
                if (e.n>1) { result.push_back(e.end); }
            }
        } visitor;
        for (auto& el : v) {
            boost::apply_visitor(visitor, el);
        }
        return std::move(visitor.result);
    }
}
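
To make the interpolation arithmetic concrete, the same stepping logic can be written as a standalone function (`ramp` is a hypothetical name, not part of the answer). For "4 RAMP 3 6" we have n = 3, so delta = (6 - 4) / (3 - 1) = 1 and the expansion is 4, 5, 6:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mirroring the interpolate branch of expand():
// emit the start, then n-2 evenly spaced interior points, then the end.
inline std::vector<double> ramp(double start, double end, std::size_t n) {
    std::vector<double> out;
    if (n > 0) out.push_back(start);
    if (n > 2) {
        double const delta = (end - start) / static_cast<double>(n - 1);
        for (std::size_t i = 1; i < n - 1; ++i)
            out.push_back(start + static_cast<double>(i) * delta);
    }
    if (n > 1) out.push_back(end);
    return out;
}
```

Pushing `end` itself rather than `start + (n-1) * delta` means the last value is exactly the requested target, regardless of rounding in the interior steps.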

Now that we have the basics down, let's parse and test:

Parsing

First, let's adapt the AST types:

BOOST_FUSION_ADAPT_STRUCT(AST::repeat, value, n)
BOOST_FUSION_ADAPT_STRUCT(AST::interpolate, start, n, end)

Note: the "natural grammar order" of the adapted properties makes attribute propagation painless without semantic actions.

Now let's roll a grammar:

namespace qi = boost::spirit::qi;

template <typename It> struct Grammar : qi::grammar<It, AST::elements()> {
    Grammar() : Grammar::base_type(start) {
        elements_ = *element_;
        element_ = interpolate_ | repeat_;
        repeat_
            = value_ >> "FOR" >> qi::uint_
            | value_ >> qi::attr(1u)
            ;
        interpolate_
            = value_ >> "RAMP" >> qi::uint_ >> value_
            ;

        value_ = qi::auto_;

        start = qi::skip(qi::space) [ elements_ ];

        BOOST_SPIRIT_DEBUG_NODES((start)(elements_)(element_)(repeat_)(interpolate_)(value_))
    }
  private:
    qi::rule<It, AST::elements()> start;
    qi::rule<It, AST::elements(), qi::space_type> elements_;
    qi::rule<It, AST::element(), qi::space_type> element_;
    qi::rule<It, AST::repeat(), qi::space_type> repeat_;
    qi::rule<It, AST::interpolate(), qi::space_type> interpolate_;
    qi::rule<It, AST::V(), qi::space_type> value_;
};

Note:

  • BOOST_SPIRIT_DEBUG_NODES enables rule debugging.
  • The order of interpolate_ | repeat_ is important, since repeat_ also parses individual numbers (so it would prevent RAMP from being parsed in time).

A simple utility that invokes the parser and then expand()s the intermediate representation:

AST::values do_parse(std::string const& input) {
    static const Grammar<std::string::const_iterator> g;

    auto f = begin(input), l = end(input);
    AST::elements intermediate;
    if (!qi::parse(f, l, g >> qi::eoi, intermediate)) {
        throw std::runtime_error("bad input");
    }

    return expand(intermediate);
}

Testing

The proof of the pudding is in the eating:

int main() {
    std::cout << std::boolalpha;

    struct { std::string input; AST::values expected; } cases[] = {
        { "1 1.5 2 2.5 3 FOR 4 3.5", { 1, 1.5, 2, 2.5, 3, 3, 3, 3, 3.5 } },
        { "1 2 3 4 RAMP 3 6 7 8", { 1, 2, 3, 4, 5, 6, 7, 8 } },
    };

    for (auto const& test : cases) {
        try {
            std::cout << std::quoted(test.input) << " -> ";
            auto actual = Parse::do_parse(test.input);
            std::cout << (actual==test.expected? "PASSED":"FAILED") << " { ";

            // print the actual for reference
            std::cout << " {";
            for (auto& v : actual) std::cout << v << ", ";
            std::cout << "}\n";
        } catch(std::exception const& e) {
            std::cout << "ERROR " << std::quoted(e.what()) << "\n";
        }
    }
}

Prints

"1 1.5 2 2.5 3 FOR 4 3.5" -> PASSED {  {1, 1.5, 2, 2.5, 3, 3, 3, 3, 3.5, }
"1 2 3 4 RAMP 3 6 7 8" -> PASSED {  {1, 2, 3, 4, 5, 6, 7, 8, }

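Using semantic actions instead

This is probably more efficient, and I find that I actually prefer the expressiveness of this approach.

It may not scale well as the grammar grows more complex, though.

Here we "invert" the flow: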

Grammar() : Grammar::base_type(start) {
    element_ =
          qi::double_ [ px::push_back(qi::_val, qi::_1) ]
        | ("FOR" >> qi::uint_) [ handle_for(qi::_val, qi::_1) ]
        | ("RAMP" >> qi::uint_ >> qi::double_) [ handle_ramp(qi::_val, qi::_1, qi::_2) ]
        ;

    start = qi::skip(qi::space) [ *element_ ];
}

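The handle_for and handle_ramp used in the semantic actions here are lazy actors. They essentially perform the same operations as expand() in the AST-based approach, but

  • on the fly
  • with an implicit first operand (it is the last value already at the back of the vector)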
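
The on-the-fly idea with an implicit first operand can be sketched without Spirit at all. The following is a standard-library stand-in, not the grammar from the answer (`expand_inline` is a hypothetical name, and the whitespace tokenizer is an assumption made for brevity). It mirrors the same checks and error messages the handlers in this answer use:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical std-only stand-in: expand FOR/RAMP while reading tokens,
// using the last value already in the vector as the first operand.
inline std::vector<double> expand_inline(std::string const& input) {
    std::istringstream in(input);
    std::vector<double> vec;
    std::string tok;
    while (in >> tok) {
        if (tok == "FOR") {
            unsigned n = 0;
            if (!(in >> n) || vec.empty() || n < 1)
                throw std::runtime_error("bad quantifier");
            double const last = vec.back(); // copy before growing the vector
            vec.insert(vec.end(), n - 1, last);
        } else if (tok == "RAMP") {
            unsigned n = 0;
            double target = 0;
            if (!(in >> n >> target) || vec.empty())
                throw std::runtime_error("bad quantifier");
            if (n == 0 || (n == 1 && vec.back() != target))
                throw std::runtime_error("bad interpolation");
            double const start = vec.back();
            if (n > 2) {
                double const delta = (target - start) / (n - 1);
                for (unsigned i = 1; i < n - 1; ++i)
                    vec.push_back(start + i * delta);
            }
            if (n > 1) vec.push_back(target);
        } else {
            vec.push_back(std::stod(tok)); // throws std::invalid_argument on junk
        }
    }
    return vec;
}
```

Because the start value is already in the vector, "3 FOR 4" only appends three more copies, and "4 RAMP 3 6" only appends the interior point and the target.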

This calls for a few extra checks (we don't want UB when the user passes a string that starts with "FOR" or "RAMP"):

    struct handle_for_f {
        void operator()(Values& vec, unsigned n) const {
            if (vec.empty() || n<1)
                throw std::runtime_error("bad quantifier");

            vec.insert(vec.end(), n-1, vec.back());
        }
    };

    struct handle_ramp_f {
        void operator()(Values& vec, unsigned n, double target) const {
            if (vec.empty())
                throw std::runtime_error("bad quantifier");
            if ((n == 0) || (n == 1 && (vec.back() != target)))
                throw std::runtime_error("bad interpolation");

            auto start = vec.back();

            if (n>2) {
                auto const delta = (target-start)/(n-1);
                for (std::size_t i=1; i<(n-1); ++i)
                    vec.push_back(start + i * delta);
            }
            if (n>1) { vec.push_back(target); }
        }
    };

To avoid tedious boost::phoenix::bind calls in the semantic actions, let's adapt them as Phoenix functions:

    px::function<handle_for_f> handle_for;
    px::function<handle_ramp_f> handle_ramp;

Parsing

The do_parse helper becomes simpler, because there is no intermediate representation:

Values do_parse(std::string const& input) {
    static const Grammar<std::string::const_iterator> g;

    auto f = begin(input), l = end(input);
    Values values;
    if (!qi::parse(f, l, g >> qi::eoi, values)) {
        throw std::runtime_error("bad input");
    }

    return values;
}

Testing

Again, the proof of the pudding is in the eating. Here is the test program, with main() unmodified:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <iomanip>

using Values = std::vector<double>;

namespace Parse {
    namespace qi = boost::spirit::qi;
    namespace px = boost::phoenix;

    template <typename It> struct Grammar : qi::grammar<It, Values()> {
        Grammar() : Grammar::base_type(start) {
            element_ =
                  qi::double_ [ px::push_back(qi::_val, qi::_1) ]
                | ("FOR" >> qi::uint_) [ handle_for(qi::_val, qi::_1) ]
                | ("RAMP" >> qi::uint_ >> qi::double_) [ handle_ramp(qi::_val, qi::_1, qi::_2) ]
                ;

            start = qi::skip(qi::space) [ *element_ ];
        }
      private:
        qi::rule<It, Values()> start;
        qi::rule<It, Values(), qi::space_type> element_;

        struct handle_for_f {
            void operator()(Values& vec, unsigned n) const {
                if (vec.empty() || n<1)
                    throw std::runtime_error("bad quantifier");

                vec.insert(vec.end(), n-1, vec.back());
            }
        };

        struct handle_ramp_f {
            void operator()(Values& vec, unsigned n, double target) const {
                if (vec.empty())
                    throw std::runtime_error("bad quantifier");
                if ((n == 0) || (n == 1 && (vec.back() != target)))
                    throw std::runtime_error("bad interpolation");

                auto start = vec.back();

                if (n>2) {
                    auto const delta = (target-start)/(n-1);
                    for (std::size_t i=1; i<(n-1); ++i)
                        vec.push_back(start + i * delta);
                }
                if (n>1) { vec.push_back(target); }
            }
        };

        px::function<handle_for_f> handle_for;
        px::function<handle_ramp_f> handle_ramp;
    };

    Values do_parse(std::string const& input) {
        static const Grammar<std::string::const_iterator> g;

        auto f = begin(input), l = end(input);
        Values values;
        if (!qi::parse(f, l, g >> qi::eoi, values)) {
            throw std::runtime_error("bad input");
        }

        return values;
    }
}

int main() {
    std::cout << std::boolalpha;

    struct { std::string input; Values expected; } cases[] = {
        { "1 1.5 2 2.5 3 FOR 4 3.5", { 1, 1.5, 2, 2.5, 3, 3, 3, 3, 3.5 } },
        { "1 2 3 4 RAMP 3 6 7 8", { 1, 2, 3, 4, 5, 6, 7, 8 } },
    };

    for (auto const& test : cases) {
        try {
            std::cout << std::quoted(test.input) << " -> ";
            auto actual = Parse::do_parse(test.input);
            std::cout << (actual==test.expected? "PASSED":"FAILED") << " { ";

            // print the actual for reference
            std::cout << " {";
            for (auto& v : actual) std::cout << v << ", ";
            std::cout << "}\n";
        } catch(std::exception const& e) {
            std::cout << "ERROR " << std::quoted(e.what()) << "\n";
        }
    }
}

Prints the same as before:

"1 1.5 2 2.5 3 FOR 4 3.5" -> PASSED {  {1, 1.5, 2, 2.5, 3, 3, 3, 3, 3.5, }
"1 2 3 4 RAMP 3 6 7 8" -> PASSED {  {1, 2, 3, 4, 5, 6, 7, 8, }

¹ Boost Spirit: "Semantic actions are evil"?

Regarding "c++ - Boost Spirit: how to use custom logic when parsing a list of doubles with text specifiers", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/59198525/