c++ - Left-factoring in a parsing expression grammar

Reposted by: 塔克拉玛干 · Updated: 2023-11-03 02:04:49

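I am trying to write a grammar for a language that allows the following expressions:

function calls of the form f args (note: no parentheses!)
additions of the form a + b

For example: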

f 42       => f(42)
42 + b     => (42 + b)
f 42 + b   => f(42 + b)

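The grammar is unambiguous (every expression can be parsed in exactly one way), but I don't know how to write it as a PEG, because both productions may start with the same token, id. Here is my incorrect PEG. How can I rewrite it so that it works?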

expression ::= call / addition

call ::= id addition*

addition ::= unary
           ( ('+' unary)
           / ('-' unary) )*

unary ::= primary
        / '(' ( ('+' unary)
              / ('-' unary)
              / expression)
          ')'

primary ::= number / id

number ::= [1-9]+

id ::= [a-z]+

Now, when this grammar tries to parse the input "a + b", it parses "a" as a function call with zero arguments and then fails on "+ b".

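I have uploaded a C++ / Boost.Spirit.Qi implementation of the grammar in case anyone wants to play with it.

(Note that unary disambiguates unary operations from additions: to call a function with a negative number as its argument, you need to use parentheses, e.g. f (-1).)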

Best answer

As proposed in chat, you could start with something like this:

expression = addition | simple;

addition = simple >>
    (  ('+' > expression)
     | ('-' > expression)
    );

simple = '(' > expression > ')' | call | unary | number;

call = id >> *expression;

unary = qi::char_("-+") > expression;

// terminals
id = qi::lexeme[+qi::char_("a-z")];
number = qi::double_;

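Since then I have implemented it in C++ with an AST representation, so you can pretty-print the result and get a feel for how this grammar actually builds the expression tree.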

All source code is on github: https://gist.github.com/2152518

There are two versions (scroll down to 'Alternative' to read more).

Grammar:

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type> 
{
    qi::rule<Iterator, std::string(),  qi::space_type> id;
    qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
    qi::rule<Iterator, number_t(),     qi::space_type> number;
    qi::rule<Iterator, call_t(),       qi::space_type> call;
    qi::rule<Iterator, unary_t(),      qi::space_type> unary;

    mini_grammar() : mini_grammar::base_type(expression) 
    {
        expression = addition | simple;

        addition = simple [ qi::_val = qi::_1 ] >> 
           +(  
               (qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ] 
            );

        simple = '(' > expression > ')' | call | unary | number;

        call = id >> *expression;

        unary = qi::char_("-+") > expression;

        // terminals
        id = qi::lexeme[+qi::char_("a-z")];
        number = qi::double_;
    }
};

The corresponding AST structures are quickly defined using the very powerful Boost Variant:

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
    number_t,
    boost::recursive_wrapper<call_t>,
    boost::recursive_wrapper<unary_t>,
    boost::recursive_wrapper<addition_t>
    > expression_t;

struct addition_t
{
    expression_t lhs;
    char binop;
    expression_t rhs;
};

struct call_t
{
    std::string id;
    std::vector<expression_t> args;
};

struct unary_t
{
    char unop;
    expression_t operand;
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t,     (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t,    (char, unop)(expression_t, operand));

In the full code, I have also overloaded operator<< for these structures.

Full demo

//#define BOOST_SPIRIT_DEBUG
#include <iostream>
#include <iterator>
#include <string>
#include <vector>   // std::vector is used by call_t and in main()

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/fusion/adapted.hpp>
#include <boost/optional.hpp>

namespace qi = boost::spirit::qi;
namespace phx= boost::phoenix;

struct addition_t;
struct call_t;
struct unary_t;
typedef double number_t;

typedef boost::variant<
    number_t,
    boost::recursive_wrapper<call_t>,
    boost::recursive_wrapper<unary_t>,
    boost::recursive_wrapper<addition_t>
    > expression_t;

struct addition_t
{
    expression_t lhs;
    char binop;
    expression_t rhs;

    friend std::ostream& operator<<(std::ostream& os, const addition_t& a) 
        { return os << "(" << a.lhs << ' ' << a.binop << ' ' << a.rhs << ")"; }
};

struct call_t
{
    std::string id;
    std::vector<expression_t> args;

    friend std::ostream& operator<<(std::ostream& os, const call_t& a)
    {
        os << a.id << "(";
        const char* sep = "";  // no separator before the first argument
        for (auto& e : a.args) { os << sep << e; sep = ", "; }
        return os << ")";
    }
};

struct unary_t
{
    char unop;
    expression_t operand;

    friend std::ostream& operator<<(std::ostream& os, const unary_t& a)
        { return os << "(" << a.unop << ' ' << a.operand << ")"; }
};

BOOST_FUSION_ADAPT_STRUCT(addition_t, (expression_t, lhs)(char,binop)(expression_t, rhs));
BOOST_FUSION_ADAPT_STRUCT(call_t,     (std::string, id)(std::vector<expression_t>, args));
BOOST_FUSION_ADAPT_STRUCT(unary_t,    (char, unop)(expression_t, operand));

void append_term(expression_t& lhs, char op, expression_t operand)
{
    lhs = addition_t { lhs, op, operand };
}

template <typename Iterator>
struct mini_grammar : qi::grammar<Iterator, expression_t(), qi::space_type> 
{
    qi::rule<Iterator, std::string(),  qi::space_type> id;
    qi::rule<Iterator, expression_t(), qi::space_type> addition, expression, simple;
    qi::rule<Iterator, number_t(),     qi::space_type> number;
    qi::rule<Iterator, call_t(),       qi::space_type> call;
    qi::rule<Iterator, unary_t(),      qi::space_type> unary;

    mini_grammar() : mini_grammar::base_type(expression) 
    {
        expression = addition | simple;

        addition = simple [ qi::_val = qi::_1 ] >> 
           +(  
               (qi::char_("+-") > simple) [ phx::bind(&append_term, qi::_val, qi::_1, qi::_2) ] 
            );

        simple = '(' > expression > ')' | call | unary | number;

        call = id >> *expression;

        unary = qi::char_("-+") > expression;

        // terminals
        id = qi::lexeme[+qi::char_("a-z")];
        number = qi::double_;

        BOOST_SPIRIT_DEBUG_NODE(expression);
        BOOST_SPIRIT_DEBUG_NODE(call);
        BOOST_SPIRIT_DEBUG_NODE(addition);
        BOOST_SPIRIT_DEBUG_NODE(simple);
        BOOST_SPIRIT_DEBUG_NODE(unary);
        BOOST_SPIRIT_DEBUG_NODE(id);
        BOOST_SPIRIT_DEBUG_NODE(number);
    }
};

std::string read_input(std::istream& stream) {
    return std::string(
        std::istreambuf_iterator<char>(stream),
        std::istreambuf_iterator<char>());
}

int main() {
    std::cin.unsetf(std::ios::skipws);
    std::string const code = read_input(std::cin);
    auto begin = code.begin();
    auto end = code.end();

    try {
        mini_grammar<decltype(end)> grammar;
        qi::space_type space;

        std::vector<expression_t> script;
        bool ok = qi::phrase_parse(begin, end, *(grammar > ';'), space, script);

        if (begin!=end)
            std::cerr << "Unparsed: '" << std::string(begin,end) << "'\n";

        std::cout << std::boolalpha << "Success: " << ok << "\n";

        if (ok)
        {
            for (auto& expr : script)
                std::cout << "AST: " << expr << '\n';
        }
    }
    catch (qi::expectation_failure<decltype(end)> const& ex) {
        std::cout << "Failure; parsing stopped after \""
                  << std::string(ex.first, ex.last) << "\"\n";
    }
}

Alternative:

I have an alternative version that builds addition_t iteratively rather than recursively, so to speak:

struct term_t
{
    char binop;
    expression_t rhs;
};

struct addition_t
{
    expression_t lhs;
    std::vector<term_t> terms;
};

This removes the need to use Phoenix to build the expression:

    addition = simple >> +term;

    term = qi::char_("+-") > simple;

Regarding c++ - left-factoring in a parsing expression grammar, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/9759093/
