gpt4 book ai didi

c++ - 解析表达式语法中的左分解

转载 作者:塔克拉玛干 更新时间:2023-11-03 02:04:49 24 4
gpt4 key购买 nike

我正在尝试为允许以下表达式的语言编写语法:

  1. f args 形式的函数调用(注意:没有括号!)
  2. a + b
  3. 形式的添加(和更复杂的表达式,但这不是重点)

例如:

f 42       => f(42)
42 + b => (42 + b)
f 42 + b => f(42 + b)

语法是明确的(每个表达式都可以完全以一种方式解析)但我不知道如何将此语法编写为 PEG,因为两个产品可能以相同的标记开始,id .这是我错误的PEG。我怎样才能重写它以使其有效?

expression ::= call / addition

call ::= id addition*

addition ::= unary
( ('+' unary)
/ ('-' unary) )*

unary ::= primary
/ '(' ( ('+' unary)
/ ('-' unary)
/ expression)
')'

primary ::= number / id

number ::= [1-9]+

id ::= [a-z]+

现在,当此语法尝试解析输入“a + b”时,它将“a”解析为零参数的函数调用,并在“+ b”。

我上传了一个 C++ / Boost.Spirit.Qi implementation of the grammar以防万一有人想玩它。


(请注意,unary 消除了一元运算和加法的歧义:为了以负数作为参数调用函数,您需要指定括号,例如 f (-1).)

最佳答案

chat 中所提议你可以从这样的事情开始:

expression = addition | simple;

addition = simple >>
( ('+' > expression)
| ('-' > expression)
);

simple = '(' > expression > ')' | call | unary | number;

call = id >> *expression;

unary = qi::char_("-+") > expression;

// terminals
id = qi::lexeme[+qi::char_("a-z")];
number = qi::double_;

从那时起,我在 C++ 中使用 AST 演示文稿实现了它,因此您可以通过 pretty-print 来感受这种语法实际上如何构建表达式树。

All source code is on github: https://gist.github.com/2152518

There are two versions (scroll down to 'Alternative' to read more


语法:

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type>
{
qi::rule<Iterator, std::string(), qi::space_type> id;
qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
qi::rule<Iterator, number_t(), qi::space_type> number;
qi::rule<Iterator, call_t(), qi::space_type> call;
qi::rule<Iterator, unary_t(), qi::space_type> unary;

mini_grammar() : mini_grammar::base_type(expression)
{
expression = addition | simple;

addition = simple [ qi::_val = qi::_1 ] >>
+(
(qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ]
);

simple = '(' > expression > ')' | call | unary | number;

call = id >> *expression;

unary = qi::char_("-+") > expression;

// terminals
id = qi::lexeme[+qi::char_("a-z")];
number = qi::double_;
}
};

相应的 AST 结构是使用非常强大的 Boost Variant 快速定义的:

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
number_t,
boost::recursive_wrapper<call_t>,
boost::recursive_wrapper<unary_t>,
boost::recursive_wrapper<addition_t>
> expression_t;

struct addition_t
{
expression_t lhs;
char binop;
expression_t rhs;
};

struct call_t
{
std::string id;
std::vector<expression_t> args;
};

struct unary_t
{
char unop;
expression_t operand;
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t, (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t, (char, unop)(expression_t, operand));

在完整代码中,我还为这些结构重载了运算符<<。


完整演示

//#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <iterator>
#include <string>

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/optional.hpp>

namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
number_t,
boost::recursive_wrapper<call_t>,
boost::recursive_wrapper<unary_t>,
boost::recursive_wrapper<addition_t>
> expression_t;

struct addition_t
{
expression_t lhs;
char binop;
expression_t rhs;

friend std::ostream& operator<<(std::ostream& os, const addition_t& a)
{ return os << "(" << a.lhs << ' ' << a.binop << ' ' << a.rhs << ")"; }
};

struct call_t
{
std::string id;
std::vector<expression_t> args;

friend std::ostream& operator<<(std::ostream& os, const call_t& a)
{ os << a.id << "("; for (auto& e : a.args) os << e << ", "; return os << ")"; }
};

struct unary_t
{
char unop;
expression_t operand;

friend std::ostream& operator<<(std::ostream& os, const unary_t& a)
{ return os << "(" << a.unop << ' ' << a.operand << ")"; }
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t, (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t, (char, unop)(expression_t, operand));

void append_term(expression_t& lhs, char op, expression_t operand)
{
lhs = addition_t { lhs, op, operand };
}

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type>
{
qi::rule<Iterator, std::string(), qi::space_type> id;
qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
qi::rule<Iterator, number_t(), qi::space_type> number;
qi::rule<Iterator, call_t(), qi::space_type> call;
qi::rule<Iterator, unary_t(), qi::space_type> unary;

mini_grammar() : mini_grammar::base_type(expression)
{
expression = addition | simple;

addition = simple [ qi::_val = qi::_1 ] >>
+(
(qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ]
);

simple = '(' > expression > ')' | call | unary | number;

call = id >> *expression;

unary = qi::char_("-+") > expression;

// terminals
id = qi::lexeme[+qi::char_("a-z")];
number = qi::double_;

BOOST_SPIRIT_DEBUG_NODE(expression);
BOOST_SPIRIT_DEBUG_NODE(call);
BOOST_SPIRIT_DEBUG_NODE(addition);
BOOST_SPIRIT_DEBUG_NODE(simple);
BOOST_SPIRIT_DEBUG_NODE(unary);
BOOST_SPIRIT_DEBUG_NODE(id);
BOOST_SPIRIT_DEBUG_NODE(number);
}
};

std::string read_input(std::istream& stream) {
return std::string(
std::istreambuf_iterator<char>(stream),
std::istreambuf_iterator<char>());
}

int main() {
std::cin.unsetf(std::ios::skipws);
std::string const code = read_input(std::cin);
auto begin = code.begin();
auto end = code.end();

try {
mini_grammar<decltype(end)> grammar;
qi::space_type space;

std::vector<expression_t> script;
bool ok = qi::phrase_parse(begin, end, *(grammar > ';'), space, script);

if (begin!=end)
std::cerr << "Unparsed: '" << std::string(begin,end) << "'\n";

std::cout << std::boolalpha << "Success: " << ok << "\n";

if (ok)
{
for (auto& expr : script)
std::cout << "AST: " << expr << '\n';
}
}
catch (qi::expectation_failure<decltype(end)> const& ex) {
std::cout << "Failure; parsing stopped after \""
<< std::string(ex.first, ex.last) << "\"\n";
}
}

替代方案:

我有一个替代版本,它迭代地而不是递归地构建 addition_t,可以这么说:

struct term_t
{
char binop;
expression_t rhs;
};

struct addition_t
{
expression_t lhs;
std::vector<term_t> terms;
};

这消除了使用 Phoenix 构建表达式的需要:

    addition = simple >> +term;

term = qi::char_("+-") > simple;

关于c++ - 解析表达式语法中的左分解,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/9759093/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com