bison - BNF 到 Flex/Bison

转载作者：行者123 更新时间：2023-12-05 08:17:28

我正在尝试使用 BNF 语法编写 Flex/Bison 文件。但是，当我尝试编译时出现错误，而且我不确定如何调试它们。

BNF语法:

<exp>::=<list> | head(<list>)
<list>::=<num1>::<list> | <list>@<list> | tail(<list>) | [<numlist>] 
<numlist>::=<empty> | <num1><num2>
<num2>::=<empty> | ,<num1><num2>
<num1>::=<D1><N> | <N> | head(<list>)
<D1>::=<D1><N> | <D2>
<D2>::=[1-9]
<N>::=[0-9]

我的 Flex 文件

%option noyywrap

%{
#include <stdlib.h>
#include <list>
using namespace std;
#define YYSTYPE list<int>
#include "listop.tab.h"
%}

hd head
tl tail
ap [@()\[\]]
con ::
n [0-9]
d2 [1-9]
d1 ({d2}{n}+)|{d2}
ws[ \t\n]+

%%

{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;
{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;
{hd} return HEAD;
{tl} return TAIL; 
{ap} return *yytext;
{con} return CON;
{ws} /* eat up white spaces */

%%

我的 Bison 文件

%{
#include <math.h>
#include <stdio.h>
#include <iostream>
#include <list>
using namespace std;

#define YYSTYPE list<int>

void outputlist(list<int>);
int yylex(void);
int yyerror(const char*);
list<int> a;
%}

%token N
%token D1
%token HEAD
%token TAIL
%right CON
%left '@'

%% /* Grammer rules and actions follow */

input: /* empty */ 
| input exp ;
exp: list '\n' {outputlist($1);}
| HEAD '(' list ')' '\n' {cout <<$3.front();} ;
list: num1 CON list {$3.push_front($1.front()); $$=$3;}
| list '@' list {$1.splice($1.end(),$3); $$=$1;}
| TAIL '(' list ')'{$$ = $3;} 
| '[' numlist ']' {$$ = $2;};
numlist: /* empty */
| num1 num2 {$1.splice($1.end(), $2); $$=$1;};
num2: /* empty */ 
| ',' num1 num2 {$2.splice($2.end(),$3); $$=$2;};
num1: D1 N { $1.splice($1.end(), $2); $$=$1;}
| N { $$ = $1;}
| HEAD '(' list ')' {list<int> a; a.push_front($3.front()); $$=a;};

%%

int main() { return yyparse();}
int yyerror(const char* s) { cout << "error" << endl; return 0;}

void outputlist(list<int> list1)
{
  cout << '[';
  for (int i = list1.size(); i > 1; i --)
  {
    cout << list1.front() << ',';
    list1.pop_front();
  }
  cout << list1.front();
  list1.pop_front();
  cout << ']' << endl;
}

自 YYSTYPE已定义为 list<int>在这两个文件的 C 声明部分，我使用库函数来插入/删除列表的头部、获取第一个元素等。

感谢任何帮助。谢谢。

编辑:我已经编辑了上面的 flex & bison 文件。我的程序现在运行了，但我没有得到正确的输出。这是我得到的:

输入:头([5,2,4])

输出:,错误

正确输出:5

最佳答案

使用可用的工具(例如 bison、C++ 编译器和 bison 手册)调试您的代码并不难。当您似乎正在学习如何在简单练习中使用 flex 和 bison 时，您可能需要有关如何完成调试的教程。如果您不需要教程，也许其他读者可能需要。接下来将是调试您的示例的教程。我目前正在计算机科学课上这样做，所以它会有老师的语气。

BNF

首先，我查看了您的 BNF。我注意到您似乎没有严格遵守 BNF notation而且布局有点乱。更好的版本是:

<exp>     ::= <list> 
            | head ( <list> )
<list>    ::= <num1> :: <list> 
            | <list> @ <list> 
            | tail ( <list> ) 
            | [ <numlist> ] 
<numlist> ::= <empty> 
            | <num1> <num2>
<num2>    ::= <empty> 
            | , <num1> <num2>
<num1>    ::= <D1> <N> 
            | <N> | head ( <list> )
<D1>      ::= <D1> <N> 
            | <D2>
<D2>      ::=     1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<N>       ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<empty>   ::=

了解布局如何提高可读性？您使用了 []用于两个不同目的的符号，作为它们自己和指示字符集。字符集是一种灵活的表示法。 []字符在扩展 BNF 中用作元字符，但它们用于表示选项而不是集合。您没有定义 <empty>非终结符。尽管此示例遵循严格的符号，但为了防止终端和元符号之间的混淆，将使用引号，如:

<exp>     ::= <list> 
            | head "(" <list> ")"
<list>    ::= <num1> "::" <list> 
            | <list> "@" <list> 
            | tail "(" <list> ")" 
            | "[" <numlist> "]" 
<numlist> ::= <empty> 
            | <num1> <num2>
<num2>    ::= <empty> 
            | "," <num1> <num2>
<num1>    ::= <D1> <N> 
            | <N> | head "(" <list> ")"
<D1>      ::= <D1> <N> 
            | <D2>
<D2>      ::=     1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<N>       ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<empty>   ::=

调试

现在让我们来看看词法分析。如果我们按照@rici 的建议进行修复，并添加缺失的逗号，我们就有了这个规则:

ap [@()\[\],]

现在bison has an in-built debug capability这对于理解解析是如何工作的非常有用。为您的代码启用此功能可以清楚地看到它失败的地方。首先，我们必须启用跟踪，如手册中所述。我们需要做两件事:

设置变量yydebug到非零
用 -t 构建解析器(或 --debug )选项

因此在您的 listop.y文件你应该改变main()功能是:

int main() { 
#ifdef YYDEBUG
extern int yydebug;
yydebug = 1;
#endif
return yyparse();}

重建 Bison :

bison listop.y -d -t

现在运行 flex 并重新编译:

flex listop.l
g++ -o listop.exe listop.tab.c lex.yy.c -DYYDEBUG -lfl

现在，当我们运行时，我们会得到一些诊断输出(对于您输入的 head([5,2,4]) ):

Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token D1 ()
Shifting token D1 ()
Entering state 4
Reading a token: Next token is token ',' ()
error
Error: popping token D1 ()
Stack now 0 1 5 12 7
Error: popping token '[' ()
Stack now 0 1 5 12
Error: popping token '(' ()
Stack now 0 1 5
Error: popping token HEAD ()
Stack now 0 1
Error: popping nterm input ()
Stack now 0
Cleanup: discarding lookahead token ',' ()
Stack now 0

从这一行可以看出:

Reading a token: Next token is token D1 ()

数字 5 已作为标记返回 D1由词法分析器。但是查阅BNF语法，我们可以看到它应该被视为 token N。 .作为D1和 N包含它可以选择其中任何一个的相同数字。为什么选择D1 ？这是因为模式/ Action 规则的顺序。较高的规则优先于较低的规则。要解决此问题，有必要更改规则的顺序:

{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;
{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;

到:

{n} yylval.clear(); yylval.push_front(atoi(yytext)); return N;
{d1} yylval.clear(); yylval.push_front(atoi(yytext)); return D1;

然后重新构建所有的部分。如果我们再试一次，我们会得到以下更长的轨迹:

Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7
Entering state 16
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24
Entering state 31
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24 31 24
Entering state 31
Reading a token: Next token is token ']' ()
Reducing stack by rule 11 (line 35):
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
   $1 = token ',' ()
   $2 = nterm num1 ()
   $3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
   $1 = token ',' ()
   $2 = nterm num1 ()
   $3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16
Entering state 25
Reducing stack by rule 10 (line 34):
   $1 = nterm num1 ()
   $2 = nterm num2 ()
-> $$ = nterm numlist ()
Stack now 0 1 5 12 7
Entering state 15
Next token is token ']' ()
Shifting token ']' ()
Entering state 23
Reducing stack by rule 8 (line 32):
   $1 = token '[' ()
   $2 = nterm numlist ()
   $3 = token ']' ()
-> $$ = nterm list ()
Stack now 0 1 5 12
Entering state 20
Reading a token: Next token is token ')' ()
Shifting token ')' ()
Entering state 28
Reading a token:

然而，预期的结果(5)仍然没有输出。检查操作规则我们可以看到 outputlist方法仅在 '\n' 时调用 token 匹配且跟踪不显示任何 '\n' token 。这是因为它们在词法分析器中作为空格被删除。需要更改弹性规则以解决此问题:

ap [@()\[\],\n]
ws[ \t]+

再次重建。它现在可以工作，输出正确的值:

Starting parse
Entering state 0
Reducing stack by rule 1 (line 25):
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: head([5,2,4])
Next token is token HEAD ()
Shifting token HEAD ()
Entering state 5
Reading a token: Next token is token '(' ()
Shifting token '(' ()
Entering state 12
Reading a token: Next token is token '[' ()
Shifting token '[' ()
Entering state 7
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7
Entering state 16
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24
Entering state 31
Reading a token: Next token is token ',' ()
Shifting token ',' ()
Entering state 24
Reading a token: Next token is token N ()
Shifting token N ()
Entering state 3
Reducing stack by rule 14 (line 38):
   $1 = token N ()
-> $$ = nterm num1 ()
Stack now 0 1 5 12 7 16 24 31 24
Entering state 31
Reading a token: Next token is token ']' ()
Reducing stack by rule 11 (line 35):
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
   $1 = token ',' ()
   $2 = nterm num1 ()
   $3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16 24 31
Entering state 34
Reducing stack by rule 12 (line 36):
   $1 = token ',' ()
   $2 = nterm num1 ()
   $3 = nterm num2 ()
-> $$ = nterm num2 ()
Stack now 0 1 5 12 7 16
Entering state 25
Reducing stack by rule 10 (line 34):
   $1 = nterm num1 ()
   $2 = nterm num2 ()
-> $$ = nterm numlist ()
Stack now 0 1 5 12 7
Entering state 15
Next token is token ']' ()
Shifting token ']' ()
Entering state 23
Reducing stack by rule 8 (line 32):
   $1 = token '[' ()
   $2 = nterm numlist ()
   $3 = token ']' ()
-> $$ = nterm list ()
Stack now 0 1 5 12
Entering state 20
Reading a token: Next token is token ')' ()
Shifting token ')' ()
Entering state 28
Reading a token: Next token is token '\n' ()
Shifting token '\n' ()
Entering state 32
Reducing stack by rule 4 (line 28):
   $1 = token HEAD ()
   $2 = token '(' ()
   $3 = nterm list ()
   $4 = token ')' ()
   $5 = token '\n' ()
5-> $$ = nterm exp ()
Stack now 0 1
Entering state 8
Reducing stack by rule 2 (line 26):
   $1 = nterm input ()
   $2 = nterm exp ()
-> $$ = nterm input ()
Stack now 0
Entering state 1
Reading a token: ^Z
Now at end of input.
Shifting token $end ()
Entering state 2
Stack now 0 1 2
Cleanup: popping token $end ()
Cleanup: popping nterm input ()

做得更好

实际上，尽管调试使程序可以运行，但它仍然是一种糟糕的方式。它不使用词法分析器来构建标记和解析器来匹配语法。所有的数字识别都应该在 flex 代码中完成。例如，为什么不使用这样的词法分析器:

%option noyywrap

%{
#include <stdlib.h>
#include <list>
using namespace std;
#define YYSTYPE list<int>
#include "listop.tab.h"
%}

hd head
tl tail
ap [@()\[\],\n]
con ::
n [0-9]
d2 [1-9]
num ({d2}+{n}?)|{n}
ws[ \t]+

%%

{num} return NUM; yylval.push_front(atoi(yytext));
{hd} return HEAD;
{tl} return TAIL; 
{ap} return *yytext;
{con} return CON;
{ws} /* eat up white spaces */

%%

然后你可以稍微简化解析器中的语法，并使整体实现更简单。

我会把那部分留作练习。

关于bison - BNF 到 Flex/Bison，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/33645123/

文章推荐： php - fatal error : Cannot use isset, 使用空表达式

文章推荐： python - 在模板中访问 jinja2 全局变量

文章推荐： php usort - 函数 'comp' 未找到或函数名称无效

文章推荐： php - 如何知道更新了哪些字段

bison - Yacc/Bison 单引号和双引号文字之间的区别
我有一个使用 Yacc 编写的语法。语法的相关部分摘录在这里 postfix : primary | postfix '[' expr ']' | postfix '[' ex
bison - bison/yacc 中的继承属性
如何在 bison/yacc 中使用继承的属性？说，我有这样的语法 - 程序 -> stmts 我想做的是在 Bison 身上: 程序:stmts {$$.next = newLabel(); $1
bison - 更改 Bison 错误消息的格式？
有什么方法可以更改 Bison 错误消息的格式？例如，不要说 syntax error, unexpected T_ID, expected ';' or T_IMPORT说 unexpected i
bison - 用 Bison 解决悬空其他问题
我的语法有这两个标记声明: %token RP %token ELSE 而这两条规则: Statement : IF LP Exp RP Statement; Statement : IF LP
bison - 我如何说服 Bison 解析文件的一部分？
如果有一个表单的输入文件: BEGIN stuff.... END BEGIN stuff ... END 我的 .y 文件是这样的 %token BEGIN %token END %star
bison - 了解 Bison/Jison
我希望为现有语言创建一个 JavaScript 解析器，该语言目前具有笨拙的“手工制作”C# 和 Java 解析器。我想使用 Jison，并且也一直在尝试学习 Bison 的基础知识。我不确定如何回
bison - 使 bison 使用除开始规则以外的规则开始解析
目前我正在研究一个源到源编译器，我已经编写了一个野牛解析器，可以正确地为输入创建 AST。我现在需要对语法树进行多次转换，因此我需要向树中插入许多节点。我可以手动创建所有要添加到语法树中的结构体/联
bison - 我如何说服 Bison 解析文件的一部分？
如果有一个表单的输入文件: BEGIN stuff.... END BEGIN stuff ... END 我的 .y 文件是这样的 %token BEGIN %token END %star
bison - 了解 Bison/吉森
我正在寻找为现有语言创建 JavaScript 解析器的方法，该语言目前具有笨拙的“手工制作”C# 和 Java 解析器。想用Jison，也在努力学习Bison的基础。一个我不确定如何回答的问题是
bison - flex/bison 解析器编译时出现段错误
我正在使用 flex/bison 编写解析器(我可以用 Python 编写解析器，但我总是更喜欢经典。) 当我用这个编译代码时: gcc -lfl -ly chance.tab.c lex.yy.c
bison - 警告 Bison 编译
正在使用 flex/bison 开发编译器。我的构建输出中有此警告。警告:在默认操作中键入冲突 ('s' '') 请问有什么帮助吗？最佳答案它似乎与源中的 %token 和 %type 声明有关
bison - flex/bison 中的半保留字处理
考虑这个 lex.l 文件: %{ #include "y.tab.h" %} digit [0-9] letter [a-zA-Z] %% "+"
bison - 如何在 Bison 中显示语法的抽象语法树？
我正在为 bison 中的一个简单的 Pascal 编译器编写语法，我想可视化我在 pascal.y 中指定的语法的解析树。是否可以基于我的 pascal.y 文件以图形方式显示语法树？最佳答案野
bison - Flex/bison 语法错误
我正在尝试编写一个能够使用以下输入的语法: begin #this is a example x = 56; while x > 0 do begin point 15
bison - BNF 到 Flex/Bison
我正在尝试使用 BNF 语法编写 Flex/Bison 文件。但是，当我尝试编译时出现错误，而且我不确定如何调试它们。 BNF语法: ::= | head() ::=:: | @ | tail() |
bison - 弹性 Bison : how to create Keywords?
我正在用 Flex/Bison 编写一个小程序来标记/解析我创建的查询语言。我想知道是否有任何方法可以在 Flex/Bison 中创建任何关键字。我的意思是:flex 将输入分解成一个标记列表，但
bison - flex 究竟是如何支持 Bison 定位的？
我正在尝试使用 flex 和 bison 创建过滤器，因为我想从复杂的语言中获取某些语法元素。我的计划是使用 flex + bison 来识别语法，并转储出感兴趣元素的位置。 (然后使用脚本根据转储的
bison - flex/bison 将数字解释为 float
我正在尝试实现一个可以进行浮点运算的 Flex/bison 计算器。我的 Flex 代码如下所示 %{ #include "calc.tab.h" #include void yyerror(cha
bison - 柔性/Bison : Why my rule is not regonized?
我正在尝试使用 FLEX 和 BISON 进行一些练习。这是我写的代码: calc_pol.y %{ #define YYSTYPE double #include "calc_pol.tab.h"
bison - 如何在 yacc/bison 中找到用于错误报告的不平衡分隔符？
我有一个使用括号和方括号作为分隔符的语法。当由 bison 生成的解析器输入带有不平衡分隔符的输入时，传递给 yyerror 的 YYLTYPE* 中的错误位置是输入的结尾。所以，例如，在输入 x

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

bison - BNF 到 Flex/Bison

BNF

调试

做得更好