c++ - boost spirit istream_iterator consumes too much from the stream

Reposted by: 塔克拉玛干 · Updated: 2023-11-03 00:33:41

Consider the following example, extracted from more complex code:

#include <boost/fusion/adapted.hpp>
#include <boost/fusion/include/std_pair.hpp>
#include <boost/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <iostream>
#include <map>
#include <sstream>
#include <string>

namespace qi  = boost::spirit::qi;
namespace phx = boost::phoenix;

// The class implements a XML tag storing the name and a variable number of attributes:
struct Tag
{
    // The typedef defines the type used for a XML name:
    typedef std::string name_type;

    // The typedef defines the type used for a XML value:
    typedef std::string value_type;

    // The typedef defines the type of a XML attribute:
    typedef std::pair<
        name_type,
        value_type
    > attribute_type;

    // The type defines a list of attributes.
    // Note: We use a std::map to simplify the attribute search.
    typedef std::map<
        name_type,
        value_type
    > list_type;

    // Clear all information stored within the instance:
    void clear( )
    {
        m_name.clear( ); m_attribute.clear( );
    }

    std::size_t m_indent;    // The tag shall be / is indented by m_indent number of tabs.
    name_type   m_name;      // Name of the tag.
    list_type   m_attribute; // List of tag attributes.
};

// Define the mapping between Tools::Serialization::Archive::Type::Xml::Format::Tag and boost::fusion:
BOOST_FUSION_ADAPT_STRUCT( Tag,
                         ( std::size_t   , m_indent    )
                         ( Tag::name_type, m_name      )
                         ( Tag::list_type, m_attribute ) )

// This class implements the decoder skipper grammar:
template < typename _Iterator >
    struct skipper
    : qi::grammar< _Iterator >
    {
        skipper( ) : skipper::base_type( m_skipper )
        {
            // The rule defines the default skipper grammar:
            m_skipper = ( qi::space )  // Skip all "spaces".
                        |
                        ( qi::cntrl ); // Skip all "cntrl".
        }

        // The following variables define the rules used within this grammar:
        qi::rule< _Iterator > m_skipper;
    };

// This class implements the grammar used to parse a XML "begin tag".
// The expected format is as follows: <name a="xyz" b="xyz" ... N="xyz">
template < typename _Iterator, typename _Skipper >
    struct tag_begin : qi::grammar< _Iterator, Tag( ), _Skipper >
    {
        tag_begin( ) : tag_begin::base_type( m_tag )
        {
            // The rule for a XML name shall stop when a ' ' or '>' is detected:
            m_string = qi::lexeme[ *( qi::char_( "a-zA-Z0-9_.:" ) ) ];

            // The rule for a XML attribute shall parse the following format: 'name="value"':
            m_attribute =    m_string
                          >> "=\""
                          >> m_string
                          >> '"';

            // The rule for an XML attribute list is a sequence of attributes separated by ' ':
            m_list = *( m_attribute - '>' );

            // Finally the resulting XML tag has the following format: <name a="xyz" b="xyz" ... N="xyz">
            m_tag =     '<'
                     >> -qi::int_
                     >> m_string
                     >> m_list
                     >> '>';

            // Enable debug support for the used rules. To activate the debug output define macro BOOST_SPIRIT_DEBUG:
            BOOST_SPIRIT_DEBUG_NODES( ( m_string )( m_attribute )( m_list ) )
        }

    // The following variables define the rules used within this grammar:
    qi::rule< _Iterator, Tag::name_type( )     , _Skipper > m_string;
    qi::rule< _Iterator, Tag::attribute_type( ), _Skipper > m_attribute;
    qi::rule< _Iterator, Tag::list_type( )     , _Skipper > m_list;
    qi::rule< _Iterator, Tag( )                , _Skipper > m_tag;
};

bool beginTag( std::istream& stream, Tag& tag )
{
    // Ensure that no whitespace characters are skipped:
    stream.unsetf( std::ios::skipws );

    // Create begin and end iterator for given stream:
    boost::spirit::istream_iterator begin( stream );
    boost::spirit::istream_iterator end;

    // Define the grammar skipper type:
    typedef skipper<
        boost::spirit::istream_iterator
    > skipper_type;

    // Create an instance of the used skipper:
    skipper_type sk;

    // Create an instance of the used grammar:
    tag_begin<
        boost::spirit::istream_iterator,
        skipper_type
    > gr;

    // Try to parse the data stored within the stream according to the grammar and store the result in the tag variable:
    bool r = boost::spirit::qi::phrase_parse( begin,
                                              end,
                                              gr,
                                              sk,
                                              tag );

    char nextSym = 0;
    stream >> nextSym;

    for( auto i = tag.m_attribute.begin( ); i != tag.m_attribute.end( ); ++i )
    {
        std::cout << i->first << " : " << i->second << std::endl;
    }
    std::cout << "Next symbol: " << nextSym << std::endl;

    return r;
}

int main( )
{
    std::stringstream s;
    s << "<object cName=\"bool\" cVersion=\"1\" vName=\"bool\">       <value>0</value></object>";

    Tag t;
    beginTag( s, t );

    return 0;
}

I use the grammar to extract the contents of an XML tag. In principle this works as expected, and the result is as follows:

cName : bool
cVersion : 1
vName : bool
Next symbol: v

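The problem is that the parser consumes too much data. I expected the parser to stop at the '>' that closes the first tag. Instead, it seems the parser also consumes the whitespace that follows and the next '<' symbol, so the next symbol read from the stream is 'v'. I want to avoid this, because the subsequent parser call needs the '<' symbol. Any ideas?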

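Best answer

There is no reliable way to achieve this.

The problem is that you are not reusing the istream_iterator across parse calls. The whole purpose of boost::spirit::istream_iterator is to provide an iterator with multi_pass capability on top of an InputIterator¹.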

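Because Spirit allows arbitrary grammars with arbitrary backtracking, you cannot avoid consuming more input from the stream than was actually parsed successfully.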
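As a rough illustration of why backtracking over a single-pass source needs a buffer, here is a standard-library-only sketch. It is not Spirit code, and `match_direct` and `match_buffered` are made-up names for this example. Matching directly against the stream loses characters when an alternative fails, while matching against a buffered copy can simply start over from the same position.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: try to match `literal` by reading straight from a
// single-pass stream. On a mismatch, the characters read so far are gone.
bool match_direct(std::istream& in, const std::string& literal) {
    for (char expected : literal) {
        char c;
        if (!in.get(c) || c != expected) return false;
    }
    return true;
}

// Hypothetical helper: the same match against a buffered copy of the input.
// A failed attempt leaves `pos` untouched, so another alternative can be
// tried from the same place.
bool match_buffered(const std::string& buffer, std::size_t& pos,
                    const std::string& literal) {
    assert(pos <= buffer.size());
    if (buffer.compare(pos, literal.size(), literal) != 0) return false;
    pos += literal.size();
    return true;
}
```

With the input `<value>`, trying `"<object"` and then `"<value"` through `match_direct` fails both times, because the first attempt has already eaten `<v`. The same two attempts through `match_buffered` succeed on the second try.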

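The obvious solution here is to integrate all subsequent steps into the same grammar and/or reuse the iterators (so that the backtracking buffer stored in the iterator still contains the characters you need).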
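To make the role of that buffer concrete, here is a minimal sketch using only the standard library. `BufferedReader` and `read_tag` are hypothetical names and only stand in for the kind of buffering a multi-pass adapter does; this is not how Boost.Spirit is actually implemented. The trailing-space loop in `read_tag` plays the part of a skipper running after the match.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a multi-pass adapter: characters are pulled out
// of the stream into a private buffer and consumed from there.
class BufferedReader {
public:
    explicit BufferedReader(std::istream& in) : in_(in) {}

    // Look at the character at the cursor, fetching it from the stream if it
    // is not buffered yet. Returns false when the stream is exhausted.
    bool peek(char& out) {
        while (buffer_.size() <= pos_) {
            char c;
            if (!in_.get(c)) return false;
            buffer_.push_back(c);
        }
        out = buffer_[pos_];
        return true;
    }

    // Mark one already-peeked character as consumed.
    void advance() {
        assert(pos_ < buffer_.size());
        ++pos_;
    }

    // Characters already taken from the stream but not yet consumed.
    std::string pending() const { return buffer_.substr(pos_); }

private:
    std::istream& in_;
    std::string buffer_;
    std::size_t pos_ = 0;
};

// Consume everything up to and including the next '>', then skip spaces.
std::string read_tag(BufferedReader& r) {
    std::string tag;
    char c;
    while (r.peek(c)) {
        r.advance();
        tag.push_back(c);
        if (c == '>') break;
    }
    while (r.peek(c) && c == ' ') r.advance();  // needs one look-ahead char
    return tag;
}
```

For the input `<a>   <b>x`, the first `read_tag` call returns `<a>`, but deciding where the spaces end has already moved the following `<` out of the stream and into the reader's buffer. Reading from the stream directly at that point yields `b`, which mirrors the symptom in the question. Calling `read_tag` again on the same reader still returns `<b>`, because the `<` is held in its buffer.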

Demo / proof of concept

Here is a version that parses opening tags in a loop:

while (boost::spirit::qi::phrase_parse(begin, end, gr, sk, tag)) {
    std::cout << "============\nParsed open tag '" << tag.m_name << "'\n";
    for (auto const& p: tag.m_attribute)
        std::cout << p.first << ": " << p.second << "\n";

    count += 1;
    tag.clear();
}

std::cout << "Next symbol: ";
std::copy(begin, end, std::ostream_iterator<char>(std::cout));

It prints:

============
Parsed open tag 'object'
cName: bool
cVersion: 1
vName: bool
============
Parsed open tag 'value'
Next symbol: 0</value>
        </object>

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
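#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_istream_iterator.hpp>
#include <iostream>
#include <iterator>
#include <map>
#include <sstream>
#include <string>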

namespace qi = boost::spirit::qi;

// The class implements a XML tag storing the name and a variable number of
// attributes:
struct Tag {
    typedef std::string name_type;
    typedef std::string value_type;

    typedef std::pair<name_type, value_type> attribute_type;
    typedef std::map<name_type, value_type>  list_type;

    // Clear all information stored within the instance:
    void clear() {
        m_name.clear();
        m_attribute.clear();
    }

    std::size_t m_indent;  // The tag shall be / is indented by m_indent number of tabs.
    name_type m_name;      // Name of the tag.
    list_type m_attribute; // List of tag attributes.
};

BOOST_FUSION_ADAPT_STRUCT(Tag, m_indent, m_name, m_attribute)

// This class implements the grammar used to parse a "XML" begin tag.
// The expected format is as follows: <name a="xyz" b="xyz" ... N="xyz">
template <typename Iterator, typename Skipper> struct tag_begin : qi::grammar<Iterator, Tag(), Skipper> {
    tag_begin() : tag_begin::base_type(m_tag) {
        m_string     = *qi::char_("a-zA-Z0-9_.:");
        m_attribute  = m_string >> '=' >> qi::lexeme['"' >> m_string >> '"'];
        m_attributes = *m_attribute;
        m_tag        = '<' >> -qi::int_ >> m_string >> m_attributes >> '>';

        BOOST_SPIRIT_DEBUG_NODES((m_string)(m_attribute)(m_attributes))
    }
  private:

    // The following variables define the rules used within this grammar:
    qi::rule<Iterator, Tag::attribute_type(), Skipper> m_attribute;
    qi::rule<Iterator, Tag::list_type(), Skipper> m_attributes;
    qi::rule<Iterator, Tag(), Skipper> m_tag;
    // lexemes
    qi::rule<Iterator, Tag::name_type()> m_string;
};

bool beginTag(std::istream &stream, Tag &tag) {
    // Ensure that no whitespace characters are skipped:
    stream.unsetf(std::ios::skipws);

    typedef boost::spirit::istream_iterator It; 
    typedef qi::rule<It> skipper_type;

    skipper_type sk = qi::space | qi::cntrl;
    tag_begin<boost::spirit::istream_iterator, skipper_type> gr;

    It begin(stream), end;

    int count = 0;
    while (boost::spirit::qi::phrase_parse(begin, end, gr, sk, tag)) {
        std::cout << "============\nParsed open tag '" << tag.m_name << "'\n";
        for (auto const& p: tag.m_attribute)
            std::cout << p.first << ": " << p.second << "\n";

        count += 1;
        tag.clear();
    }

    std::cout << "Next symbol: ";
    std::copy(begin, end, std::ostream_iterator<char>(std::cout));

    return count > 0;
}

int main() {
    std::stringstream s;
    s << R"(
        <object cName="bool" cVersion="1" vName="bool">
            <value>0</value>
        </object>
    )";

    Tag t;
    beginTag(s, t);
}

¹ (strictly forward-only; cannot be dereferenced repeatedly)

Regarding "c++ - boost spirit istream_iterator consumes too much from the stream", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35256521/
