c++ - Cannot get Boost Spirit grammar to use known keys for std::map<>

Reposted. Author: 搜寻专家. Updated: 2023-10-31 02:07:55

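I seem to have hit a mental block with Boost Spirit that I just cannot get past. I have a fairly simple grammar to handle, and I would like to put the values into a struct that contains a std::map<> as one of its members. The key names for the pairs are known up front, so only those keys are allowed. There can be one or more keys in the map, in any order, with each key name validated via qi.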

As an example, the grammar looks something like this.

test .|*|<hostname> add|modify|clear ( key [value] key [value] ... ) ;

//
test . add ( a1 ex00
a2 ex01
a3 "ex02,ex03,ex04" );

//
test * modify ( m1 ex10
m2 ex11
m3 "ex12,ex13,ex14"
m4 "abc def ghi" );


//
test 10.0.0.1 clear ( c1
c2
c3 );

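In this example, the keys for "add" are a1, a2 and a3, the keys for "modify" are m1, m2, m3 and m4, and each of them must contain a value. For "clear", the keys c1, c2 and c3 may not contain a value. Also, suppose for this example that you can have up to 10 keys (a1 ... a10, m1 ... m10 and c1 ... c10), and that any combination of them can be used, in any order, for the corresponding action. That means you cannot use the known key cX for "add" or mX for "clear".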

The struct follows this simple pattern:
//
struct test
{
std::string host;
std::string action;
std::map<std::string,std::string> option;
};

So, from the examples above, I would expect the struct to contain ...
// add ...
test.host = .
test.action = add
test.option[0].first = a1
test.option[0].second = ex00
test.option[1].first = a2
test.option[1].second = ex01
test.option[2].first = a3
test.option[2].second = ex02,ex03,ex04

// modify ...
test.host = *
test.action = modify
test.option[0].first = m1
test.option[0].second = ex10
test.option[1].first = m2
test.option[1].second = ex11
test.option[2].first = m3
test.option[2].second = ex12,ex13,ex14
test.option[3].first = m4
test.option[3].second = abc def ghi

// clear ...
test.host = 10.0.0.1
test.action = clear
test.option[0].first = c1
test.option[0].second =
test.option[1].first = c2
test.option[1].second =
test.option[2].first = c3
test.option[2].second =

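I can get each individual part working on its own, but I cannot seem to get them working together. For example, I have the host and action working without the map<>.

I adapted a previously posted example from Sehe (here) to try to get this working (BTW: Sehe has some awesome examples, which I have been using as much as the documentation).

Here is an excerpt (obviously not working) that at least shows where I am trying to go.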
namespace ast {

namespace qi = boost::spirit::qi;

//
using unused = qi::unused_type;

//
using string = std::string;
using strings = std::vector<string>;
using list = strings;
using pair = std::pair<string, string>;
using map = std::map<string, string>;

//
struct test
{
using preference = std::map<string,string>;

string host;
string action;
preference option;
};
}

//
BOOST_FUSION_ADAPT_STRUCT( ast::test,
( std::string, host )
( std::string, action )
( ast::test::preference, option ) )

//
namespace grammar
{
//
template <typename It>
struct parser
{
//
struct skip : qi::grammar<It>
{
//
skip() : skip::base_type( text )
{
using namespace qi;

// handle all whitespace (" ", \t, ...)
// along with comment lines/blocks
//
// comment blocks: /* ... */
// // ...
// -- ...
// # ...
text = ascii::space
| ( "#" >> *( char_ - eol ) >> ( eoi | eol ) ) // line comment
| ( "--" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "//" >> *( char_ - eol ) >> ( eoi | eol ) ) // ...
| ( "/*" >> *( char_ - "*/" ) >> "*/" ); // block comment

//
BOOST_SPIRIT_DEBUG_NODES( ( text ) )
}

//
qi::rule<It> text;
};
//
struct token
{
//
token()
{
using namespace qi;

// common
string = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
real = double_;
integer = int_;

//
value = ( string | identity );

// ip target
any = '*';
local = ( char_('.') | fqdn );
fqdn = +char_("a-zA-Z0-9.\\-" ); // concession

ipv4 = +as_string[ octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
>> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] ];

//
target = ( any | local | fqdn | ipv4 );

//
pair = identity >> -( attr( ' ' ) >> value );
map = pair >> *( attr( ' ' ) >> pair );
list = *( value );

//
BOOST_SPIRIT_DEBUG_NODES( ( string )
( identity )
( value )
( real )
( integer )
( any )
( local )
( fqdn )
( ipv4 )
( target )
( pair )
( keyval )
( map )
( list ) )
}

//
qi::rule<It, std::string()> string;
qi::rule<It, std::string()> identity;
qi::rule<It, std::string()> value;
qi::rule<It, double()> real;
qi::rule<It, int()> integer;
qi::uint_parser<unsigned, 10, 1, 3> octet;

qi::rule<It, std::string()> any;
qi::rule<It, std::string()> local;
qi::rule<It, std::string()> fqdn;
qi::rule<It, std::string()> ipv4;
qi::rule<It, std::string()> target;

//
qi::rule<It, ast::map()> map;
qi::rule<It, ast::pair()> pair;
qi::rule<It, ast::pair()> keyval;
qi::rule<It, ast::list()> list;
};

//
struct test : token, qi::grammar<It, ast::test(), skip>
{
//
test() : test::base_type( command_ )
{
using namespace qi;
using namespace qr;

auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );

// not sure how to enforce the "key" names!
key_ = *( '(' >> *value >> ')' );
// tried using token::map ... didn't work ...

//
add_ = ( ( "add" >> attr( ' ' ) ) [ _val = "add" ] );
modify_ = ( ( "modify" >> attr( ' ' ) ) [ _val = "modify" ] );
clear_ = ( ( "clear" >> attr( ' ' ) ) [ _val = "clear" ] );

//
action_ = ( add_ | modify_ | clear_ );


/* *** can't get from A to B here ... not sure what to do *** */

//
command_ = kw[ "test" ]
>> target
>> action_
>> ';';

BOOST_SPIRIT_DEBUG_NODES( ( command_ )
( action_ )
( add_ )
( modify_ )
( clear_ ) )
}

//
private:
//
using token::value;
using token::target;
using token::map;

qi::rule<It, ast::test(), skip> command_;
qi::rule<It, std::string(), skip> action_;

//
qi::rule<It, std::string(), skip> add_;
qi::rule<It, std::string(), skip> modify_;
qi::rule<It, std::string(), skip> clear_;
};

...

};
}

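I hope this question is not too ambiguous; if you need a working example of the problem, I can certainly provide one. Any and all help is greatly appreciated, so thank you in advance!

Best answer

Notes: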

  • With this

        add_    = ( ( "add"    >> attr( ' ' ) ) [ _val = "add"    ] );
        modify_ = ( ( "modify" >> attr( ' ' ) ) [ _val = "modify" ] );
        clear_  = ( ( "clear"  >> attr( ' ' ) ) [ _val = "clear"  ] );

    did you mean to require a space? Or are you really just trying to force the struct's action field to contain a trailing space (which is what will happen)?

    If you meant the latter, I would do that outside of the parser¹.

    If you wanted the former, use the kw facility:

        add_    = kw["add"]    [ _val = "add"    ];
        modify_ = kw["modify"] [ _val = "modify" ];
        clear_  = kw["clear"]  [ _val = "clear"  ];

    In fact, you can simplify that (again, ¹):

        add_    = raw[ kw["add"] ];
        modify_ = raw[ kw["modify"] ];
        clear_  = raw[ kw["clear"] ];

    Which also means that you can simplify to

        action_ = raw[ kw[lit("add")|"modify"|"clear"] ];

    However, getting a bit closer to your question, you could also use a symbol parser:

        symbols<char> action_sym;
        action_sym += "add", "modify", "clear";
        //
        action_ = raw[ kw[action_sym] ];

    Caveat: the symbols object needs to be a member so that its lifetime extends beyond the constructor.

  • If you meant to capture the input representation of IPv4 addresses with

        ipv4 = +as_string[ octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
            >> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
            >> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] >> '.'
            >> octet[ _pass = ( _1 >= 0 && _1 <= 255 ) ] ];

    Side note: I'm assuming +as_string is a simple mistake and you meant as_string instead.

    Simplify:

        qi::uint_parser<uint8_t, 10, 1, 3> octet;

    This obviates the range checks (see ¹ again):

        ipv4 = as_string[ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];

    However, this would build a 4-character binary string representation of the address. If that is what you want, fine. I doubt it (because then you would have written std::array<uint8_t, 4> or uint64_t, right?). So if you want the string, again use raw[]:

        ipv4 = raw[ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];
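  • Same issue as with number 1:

        pair = identity >> -( attr(' ') >> value );

    This time, the problem betrays that these productions should not be in token. Conceptually, tokenizing precedes parsing, so I would keep the tokens skipper-less. kw doesn't really do much good in that context. Instead, I would move pair, map and list (unused?) into the parser:

        pair = kw[identity] >> -value;
        map  = +pair;
        list = *value;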

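  • Some examples

    I recently made an example about using symbols to parse (here), but this answer comes a lot closer to your question:

      • How to provide user with autocomplete suggestions for given boost::spirit grammar?

    It goes far beyond the scope of your parser, because it performs all kinds of actions in the grammar, but it does show generic "lookup-ish" rules that can be parameterized with a particular "symbol set". See the Identifier Lookup section of that answer: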

    Identifier Lookup

    We store "symbol tables" in Domain members _variables and _functions:

        using Domain = qi::symbols<char>;
        Domain _variables, _functions;

    Then we declare some rules that can do lookups on either of them:

        // domain identifier lookups
        qi::_r1_type _domain;
        qi::rule<It, Ast::Identifier(Domain const&)> maybe_known, known, unknown;

    The corresponding declarations will be shown shortly.

    Variables are pretty simple:

        variable = maybe_known(phx::ref(_variables));

    Calls are trickier. If a name is unknown we don't want to assume it implies a function unless it's followed by a '(' character. However, if an identifier is a known function name, we want even to imply the ( (this gives the UX the appearance of autocompletion where when the user types sqrt, it suggests the next character to be ( magically).

        // The heuristics:
        // - an unknown identifier followed by (
        // - an unclosed argument list implies )
        call %= (
                  known(phx::ref(_functions)) // known -> imply the parens
                | &(identifier >> '(') >> unknown(phx::ref(_functions))
                ) >> implied('(') >> -(expression % ',') >> implied(')');

    It all builds on known, unknown and maybe_known:

        ///////////////////////////////
        // identifier lookup, suggesting
        {
            maybe_known = known(_domain) | unknown(_domain);

            // distinct to avoid partially-matching identifiers
            using boost::spirit::repository::qi::distinct;
            auto kw = distinct(copy(alnum | '_'));

            known   = raw[kw[lazy(_domain)]];
            unknown = raw[identifier[_val=_1]] [suggest_for(_1, _domain)];
        }


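    I think you could use the same approach constructively here. One additional gimmick could be to validate that the supplied properties are, in fact, unique.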
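The uniqueness check mentioned above could also live outside the grammar, as a post-processing step. A finished std::map holds each key at most once, so a duplicate cannot be spotted in it after the fact. The sketch below therefore assumes the options were first collected in input order into a vector of pairs. It uses only the standard library, and `Options` and `to_unique_map` are hypothetical names that do not appear in the answer's code.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical intermediate type: options in input order, duplicates still visible.
using Options = std::vector<std::pair<std::string, std::string>>;

// Copies `parsed` into `out`. Returns false as soon as a key occurs twice.
bool to_unique_map(Options const& parsed, std::map<std::string, std::string>& out) {
    out.clear();
    for (auto const& kv : parsed) {
        // emplace() reports through .second whether the key was newly inserted
        if (!out.emplace(kv.first, kv.second).second)
            return false;
    }
    return true;
}
```

A caller would run this on the collected options of each command and reject the command when it returns false.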

    Demo Work

    Combining all the hints above makes it compile and "parse" the test commands:

    Live On Coliru
        #include <string>
        #include <map>
        #include <vector>

        namespace ast {

            using string  = std::string;
            using strings = std::vector<string>;
            using list    = strings;
            using pair    = std::pair<string, string>;
            using map     = std::map<string, string>;

            struct command {
                string host;
                string action;
                map    option;
            };
        }

        #include <boost/fusion/adapted.hpp>

        BOOST_FUSION_ADAPT_STRUCT(ast::command, host, action, option)

        #include <boost/spirit/include/qi.hpp>
        #include <boost/spirit/include/phoenix.hpp>
        #include <boost/spirit/repository/include/qi_distinct.hpp>

        namespace grammar
        {
            namespace qi = boost::spirit::qi;
            namespace qr = boost::spirit::repository::qi;

            template <typename It>
            struct parser
            {
                struct skip : qi::grammar<It> {

                    skip() : skip::base_type(text) {
                        using namespace qi;

                        // handle all whitespace along with line/block comments
                        text = ascii::space
                             | (lit("#")|"--"|"//") >> *(char_ - eol) >> (eoi | eol) // line comment
                             | "/*" >> *(char_ - "*/") >> "*/";                      // block comment

                        BOOST_SPIRIT_DEBUG_NODES((text))
                    }

                  private:
                    qi::rule<It> text;
                };

                // skipper-less token rules
                struct token {
                    token() {
                        using namespace qi;

                        // common
                        string   = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
                        identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
                        value    = string | identity;

                        // ip target
                        any      = '*';
                        local    = '.' | fqdn;
                        fqdn     = +char_("a-zA-Z0-9.\\-"); // concession

                        ipv4     = raw [ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];

                        target   = any | local | fqdn | ipv4;

                        BOOST_SPIRIT_DEBUG_NODES(
                            (string) (identity) (value)
                            (any) (local) (fqdn) (ipv4) (target)
                        )
                    }

                  protected:
                    qi::rule<It, std::string()> string;
                    qi::rule<It, std::string()> identity;
                    qi::rule<It, std::string()> value;
                    qi::uint_parser<uint8_t, 10, 1, 3> octet;

                    qi::rule<It, std::string()> any;
                    qi::rule<It, std::string()> local;
                    qi::rule<It, std::string()> fqdn;
                    qi::rule<It, std::string()> ipv4;
                    qi::rule<It, std::string()> target;
                };

                struct test : token, qi::grammar<It, ast::command(), skip> {
                    test() : test::base_type(command_)
                    {
                        using namespace qi;

                        // keyword helper: the match must not be followed by an identifier character
                        auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );

                        action_sym += "add", "modify", "clear";
                        action_ = raw[ kw[action_sym] ];

                        command_ =  kw["test"]
                                >> target
                                >> action_
                                >> '(' >> map >> ')'
                                >> ';';

                        pair = kw[identity] >> -value;
                        map  = +pair;
                        list = *value;

                        BOOST_SPIRIT_DEBUG_NODES(
                            (command_) (action_)
                            (pair) (map) (list)
                        )
                    }

                  private:
                    using token::target;
                    using token::identity;
                    using token::value;
                    qi::symbols<char> action_sym; // member, so it outlives the constructor

                    qi::rule<It, ast::command(), skip> command_;
                    qi::rule<It, std::string(), skip>  action_;

                    qi::rule<It, ast::map(), skip>  map;
                    qi::rule<It, ast::pair(), skip> pair;
                    qi::rule<It, ast::list(), skip> list;
                };

            };
        }

        #include <fstream>
        #include <iostream>

        int main() {
            using It = boost::spirit::istream_iterator;
            using Parser = grammar::parser<It>;

            std::ifstream input("input.txt");
            It f(input >> std::noskipws), l;

            Parser::skip const s{};
            Parser::test const p{};

            std::vector<ast::command> data;
            bool ok = phrase_parse(f, l, *p, s, data);

            if (ok) {
                std::cout << "Parsed " << data.size() << " commands\n";
            } else {
                std::cout << "Parsed failed\n";
            }

            if (f != l) {
                std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
            }
        }

    Prints

        Parsed 3 commands

    Let's restrict the keys

    As in the answer linked above, let's pass the actual key set to the map and pair rules, so that they can take their allowed values from it:

        using KeySet = qi::symbols<char>;
        using KeyRef = KeySet const*;

        KeySet add_keys, modify_keys, clear_keys;
        qi::symbols<char, KeyRef> action_sym;

        qi::rule<It, ast::pair(KeyRef), skip> pair;
        qi::rule<It, ast::map(KeyRef),  skip> map;

    Note: a key feature used here is the attribute value associated with a symbols<> lookup (in this case we associate a KeyRef with each action symbol):

        add_keys    += "a1", "a2", "a3", "a4", "a5", "a6";
        modify_keys += "m1", "m2", "m3", "m4";
        clear_keys  += "c1", "c2", "c3", "c4", "c5";

        action_sym.add
            ("add",    &add_keys)
            ("modify", &modify_keys)
            ("clear",  &clear_keys);
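For readers less familiar with `qi::symbols<char, T>`: conceptually it is a lookup table from a string to a value of type `T`. The sketch below mirrors the association above with standard containers only, as an illustration of the data layout rather than of the Spirit API. `ActionTable`, `lookup` and `key_allowed` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>

using KeySet = std::set<std::string>;
using KeyRef = KeySet const*;

// Each action name maps to a pointer to the key set that is valid for it.
struct ActionTable {
    KeySet add_keys{"a1", "a2", "a3", "a4", "a5", "a6"};
    KeySet modify_keys{"m1", "m2", "m3", "m4"};
    KeySet clear_keys{"c1", "c2", "c3", "c4", "c5"};
    std::map<std::string, KeyRef> actions;

    ActionTable() {
        actions["add"]    = &add_keys;
        actions["modify"] = &modify_keys;
        actions["clear"]  = &clear_keys;
    }

    // The table stores pointers to its own members, so copying would leave them dangling.
    ActionTable(ActionTable const&) = delete;
    ActionTable& operator=(ActionTable const&) = delete;

    // Returns the key set for `action`, or nullptr for an unknown action.
    KeyRef lookup(std::string const& action) const {
        auto it = actions.find(action);
        return it == actions.end() ? nullptr : it->second;
    }

    // True only if `key` belongs to the key set selected by `action`.
    bool key_allowed(std::string const& action, std::string const& key) const {
        KeyRef keys = lookup(action);
        return keys != nullptr && keys->count(key) > 0;
    }
};
```

The pointer indirection is the point of the design: matching the action selects one key set, and every later key check goes through that one pointer.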

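    Now the heavy lifting starts.

    Using qi::locals<> and inherited attributes

    Let's give command_ some local space to store the selected key set:

        qi::rule<It, ast::command(), skip, qi::locals<KeyRef> > command_;

    Now we can, in principle, assign to it (using the _a placeholder). However, there are some details:

        qi::_a_type selected;

    Always prefer descriptive names :) _a and _r1 get old pretty quickly. Things are confusing enough as it is.

        command_ %= kw["test"]
                >> target
                >> raw[ kw[action_sym] [ selected = _1 ] ]
                >> '(' >> map(selected) >> ')'
                >> ';';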

    Note: the subtlest detail here is %= instead of = to avoid the suppression of automatic attribute propagation when a semantic action is present (yeah, see ¹ again...)



    But all in all, that wasn't so bad, was it?

        qi::_r1_type symref;
        pair = raw[ kw[lazy(*symref)] ] >> -value;
        map  = +pair(symref);

    Now at least things parse.

    Almost there

    Live On Coliru
        //#define BOOST_SPIRIT_DEBUG
        #include <string>
        #include <map>
        #include <vector>

        namespace ast {

            using string  = std::string;
            using strings = std::vector<string>;
            using list    = strings;
            using pair    = std::pair<string, string>;
            using map     = std::map<string, string>;

            struct command {
                string host;
                string action;
                map    option;
            };
        }

        #include <boost/fusion/adapted.hpp>

        BOOST_FUSION_ADAPT_STRUCT(ast::command, host, action, option)

        #include <boost/spirit/include/qi.hpp>
        #include <boost/spirit/include/phoenix.hpp>
        #include <boost/spirit/repository/include/qi_distinct.hpp>

        namespace grammar
        {
            namespace qi = boost::spirit::qi;
            namespace qr = boost::spirit::repository::qi;

            template <typename It>
            struct parser
            {
                struct skip : qi::grammar<It> {

                    skip() : skip::base_type(rule_) {
                        using namespace qi;

                        // handle all whitespace along with line/block comments
                        rule_ = ascii::space
                              | (lit("#")|"--"|"//") >> *(char_ - eol) >> (eoi | eol) // line comment
                              | "/*" >> *(char_ - "*/") >> "*/";                      // block comment

                        //BOOST_SPIRIT_DEBUG_NODES((skipper))
                    }

                  private:
                    qi::rule<It> rule_;
                };

                // skipper-less token rules
                struct token {
                    token() {
                        using namespace qi;

                        // common
                        string   = '"' >> *("\\" >> char_ | ~char_('"')) >> '"';
                        identity = char_("a-zA-Z_") >> *char_("a-zA-Z0-9_");
                        value    = string | identity;

                        // ip target
                        any      = '*';
                        local    = '.' | fqdn;
                        fqdn     = +char_("a-zA-Z0-9.\\-"); // concession

                        ipv4     = raw [ octet >> '.' >> octet >> '.' >> octet >> '.' >> octet ];

                        target   = any | local | fqdn | ipv4;

                        BOOST_SPIRIT_DEBUG_NODES(
                            (string) (identity) (value)
                            (any) (local) (fqdn) (ipv4) (target)
                        )
                    }

                  protected:
                    qi::rule<It, std::string()> string;
                    qi::rule<It, std::string()> identity;
                    qi::rule<It, std::string()> value;
                    qi::uint_parser<uint8_t, 10, 1, 3> octet;

                    qi::rule<It, std::string()> any;
                    qi::rule<It, std::string()> local;
                    qi::rule<It, std::string()> fqdn;
                    qi::rule<It, std::string()> ipv4;
                    qi::rule<It, std::string()> target;
                };

                struct test : token, qi::grammar<It, ast::command(), skip> {
                    test() : test::base_type(start_)
                    {
                        using namespace qi;

                        // keyword helper: the match must not be followed by an identifier character
                        auto kw = qr::distinct( copy( char_( "a-zA-Z0-9_" ) ) );

                        // allowed keys per action
                        add_keys    += "a1", "a2", "a3", "a4", "a5", "a6";
                        modify_keys += "m1", "m2", "m3", "m4";
                        clear_keys  += "c1", "c2", "c3", "c4", "c5";

                        action_sym.add
                            ("add",    &add_keys)
                            ("modify", &modify_keys)
                            ("clear",  &clear_keys);

                        // local variable holding the key set selected by the action
                        qi::_a_type selected;

                        command_ %= kw["test"]
                                >> target
                                >> raw[ kw[action_sym] [ selected = _1 ] ]
                                >> '(' >> map(selected) >> ')'
                                >> ';';

                        // inherited attribute: the key set to look keys up in
                        qi::_r1_type symref;
                        pair = raw[ kw[lazy(*symref)] ] >> -value;
                        map  = +pair(symref);
                        list = *value;

                        start_ = command_;

                        BOOST_SPIRIT_DEBUG_NODES(
                            (start_) (command_)
                            (pair) (map) (list)
                        )
                    }

                  private:
                    using token::target;
                    using token::identity;
                    using token::value;

                    using KeySet = qi::symbols<char>;
                    using KeyRef = KeySet const*;

                    qi::rule<It, ast::command(), skip> start_;
                    qi::rule<It, ast::command(), skip, qi::locals<KeyRef> > command_;

                    KeySet add_keys, modify_keys, clear_keys;
                    qi::symbols<char, KeyRef> action_sym;

                    qi::rule<It, ast::pair(KeyRef), skip> pair;
                    qi::rule<It, ast::map(KeyRef),  skip> map;
                    qi::rule<It, ast::list(), skip>       list;
                };

            };
        }

        #include <fstream>
        #include <iostream>

        int main() {
            using It = boost::spirit::istream_iterator;
            using Parser = grammar::parser<It>;

            std::ifstream input("input.txt");
            It f(input >> std::noskipws), l;

            Parser::skip const s{};
            Parser::test const p{};

            std::vector<ast::command> data;
            bool ok = phrase_parse(f, l, *p, s, data);

            if (ok) {
                std::cout << "Parsed " << data.size() << " commands\n";
            } else {
                std::cout << "Parsed failed\n";
            }

            if (f != l) {
                std::cout << "Remaining unparsed input: '" << std::string(f,l) << "'\n";
            }
        }

    Prints

        Parsed 3 commands

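    Hold on, not so fast! This isn't right

    Yeah. If you enable debugging, you'll see that it parses things oddly:

        <attributes>[[[1, 0, ., 0, ., 0, ., 1], [c, l, e, a, r], [[[c, 1], [c, 2]], [[c, 3], []]]]]</attributes>

    This is actually "merely" a problem with the grammar. If the grammar cannot see the difference between a key and a value, then obviously c2 will be parsed as the value of the property with key c1.

    It's up to you to disambiguate the grammar. For now, I'll demonstrate a fix using a negative assertion: we only accept values that are not known keys. It's a bit dirty, but it may be useful to you:

        key  = raw[ kw[lazy(*symref)] ];
        pair = key(symref) >> -(!key(symref) >> value);
        map  = +pair(symref);

    Note that I factored out the key rule for readability: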

    Live On Coliru

    Parses

        <attributes>[[[1, 0, ., 0, ., 0, ., 1], [c, l, e, a, r], [[[c, 1], []], [[c, 2], []], [[c, 3], []]]]]</attributes>

    Exactly what the doctor ordered!
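The rule enforced by the negative assertion can be stated independently of Spirit: the token after a key is taken as that key's value only if it is not itself a known key. Below is a minimal sketch of that rule over an already tokenized option list, using only the standard library. The function `pair_up` and its types are hypothetical and only illustrate the logic; the real grammar works on characters, not on tokens.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using KeySet    = std::set<std::string>;
using OptionMap = std::map<std::string, std::string>;

// Pairs each known key with an optional value.
// Returns false on an unknown key or when there are no pairs at all.
bool pair_up(std::vector<std::string> const& tokens, KeySet const& keys, OptionMap& out) {
    out.clear();
    std::size_t i = 0;
    while (i < tokens.size()) {
        if (keys.count(tokens[i]) == 0)
            return false;                 // every pair must start with a known key
        std::string const& key = tokens[i++];

        std::string value;                // stays empty when no value follows
        // The "negative assertion": consume the next token as a value
        // only if it is NOT a known key.
        if (i < tokens.size() && keys.count(tokens[i]) == 0)
            value = tokens[i++];

        out[key] = value;
    }
    return !out.empty();                  // like map = +pair: at least one pair
}
```

With the key set { c1, c2, c3 } and the tokens c1 c2 c3, this produces three keys with empty values instead of treating c2 as the value of c1.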

    ¹ Boost Spirit: "Semantic actions are evil"?

    Regarding "c++ - Cannot get Boost Spirit grammar to use known keys for std::map<>", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48177919/
