
compiler-construction - GNU Bison: Syntax Error, unexpected <token>

Reposted. Author: 行者123. Updated: 2023-12-05 04:16:45

I am writing a parser for a toy programming language with bison, but I have hit a wall.

My grammar.y file is as follows:

%{
#include <stdio.h>
#include "util.h"
#include "errormsg.h"

#define YYDEBUG 1

int yylex(void); /* function prototype */

void yyerror(char *s)
{
EM_error(EM_tokPos, "%s", s);
}
%}

%union {
int pos;
int ival;
string sval;
}

%token <sval> TK_ID TK_STRING
%token <ival> TK_INT

%token <pos>
TK_COMMA TK_COLON TK_SEMICOLON TK_LPAREN TK_RPAREN TK_LBRACK TK_RBRACK
TK_LBRACE TK_RBRACE TK_DOT TK_ASSIGN
TK_ARRAY TK_IF TK_THEN TK_ELSE TK_WHILE TK_FOR TK_TO TK_DO TK_LET TK_IN
TK_END TK_OF TK_BREAK TK_NIL
TK_FUNCTION TK_VAR TK_TYPE

/* Precedence in Bison is weird: lower is higher. Take a look at the spec too. */
%left <pos> TK_OR
%left <pos> TK_AND
%nonassoc <pos> TK_EQ TK_NEQ TK_LT TK_LE TK_GT TK_GE
%left <pos> TK_PLUS TK_MINUS
%left <pos> TK_TIMES TK_DIVIDE
%left <pos> TK_UMINUS

%error-verbose

%start program

%%

/* According to the spec, Tiger programs are just an expression exp. */
program: exp

/* An expression can be many things; consult the spec for more info: Expressions. */
/* For the %prec rule, take a look at 5.4 Context-Dependent Precedence on bison manual */
exp:
lvalue
| TK_NIL
| exp exp_seq_aug
| TK_LPAREN TK_RPAREN
| TK_LET TK_IN TK_END
| TK_INT
| TK_STRING
| TK_MINUS exp %prec TK_UMINUS
| TK_ID TK_LPAREN TK_RPAREN
| TK_ID TK_LPAREN exp params TK_RPAREN
| exp TK_PLUS exp
| exp TK_MINUS exp
| exp TK_TIMES exp
| exp TK_DIVIDE exp
| exp TK_EQ exp
| exp TK_NEQ exp
| exp TK_GT exp
| exp TK_LT exp
| exp TK_GE exp
| exp TK_LE exp
| exp TK_AND exp
| exp TK_OR exp
| TK_ID TK_LBRACE TK_RBRACE
| TK_ID TK_LBRACE TK_ID TK_EQ exp record_exp TK_RBRACE
| TK_ID TK_LBRACK exp TK_RBRACK TK_OF exp
| lvalue TK_ASSIGN exp
| TK_IF exp TK_THEN exp TK_ELSE exp
| TK_IF exp TK_THEN exp
| TK_WHILE exp TK_DO exp
| TK_FOR TK_ID TK_ASSIGN exp TK_TO exp TK_DO exp
| TK_BREAK
| TK_LET decl_seq TK_IN exp_seq_aug TK_END
;

decl_seq:
/* empty */
| decl_seq decl
;

decl:
type_decl
| var_decl
| func_decl
;

var_decl:
TK_VAR TK_ID TK_ASSIGN exp
| TK_VAR TK_ID TK_COLON TK_ID TK_ASSIGN exp
;

func_decl:
TK_FUNCTION TK_ID TK_LPAREN type_fields TK_RPAREN TK_EQ exp
| TK_FUNCTION TK_ID TK_LPAREN type_fields TK_COLON TK_ID TK_EQ exp
;

type_decl:
TK_TYPE TK_ID TK_EQ type
;

type:
TK_TYPE
| TK_LBRACE type_fields TK_RBRACE
| TK_ARRAY TK_OF TK_ID
;

type_fields:
/* empty */
| TK_ID TK_COLON TK_ID type_fields
| TK_COMMA TK_ID TK_COLON TK_ID type_fields
;

lvalue:
TK_ID
| lvalue TK_DOT TK_ID
| lvalue TK_LBRACK exp TK_RBRACK
;

exp_seq:
/* epsilon */
| TK_SEMICOLON exp
| exp_seq TK_SEMICOLON exp
;

exp_seq_aug:
TK_LPAREN exp_seq TK_RPAREN
;

params:
/* epsilon */
| params TK_COMMA exp
;
record_exp:
/* epsilon */
| record_exp TK_COMMA TK_ID TK_EQ exp
;

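There is nothing special about it, and it has quite a few (96) shift/reduce conflicts (my guess is that most of them come from the if statements and the function call statements). I know that, to be clean, it should have none, but other implementations of the same exercise parse correctly with even more shift/reduce conflicts, so I doubt the conflicts explain the error message I am getting.

The token file is generated by bison from the %token directives (y.tab.h and y.tab.c), and the specific error message I get is: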

nlightnfotis@frodo ~/Software/tigerc $ ./a.out tests/test4.tig
tests/test4.tig:2.1: syntax error, unexpected TK_GE
Parsing failed

This is very frustrating, because the parser claims it found a greater-than-or-equal token, and the test file does not contain one:

/* define a recursive function */
let

/* calculate n! */
function nfactor(n: int): int =
if n = 0
then 1
else n * nfactor(n-1)

in
nfactor(10)
end

How can I possibly debug this?

[EDIT]: As requested, here is the source code of my flex lexer:

%{
#include <string.h>
#include "util.h"
#include "tokens.h"
#include "errormsg.h"

int charPos = 1;

int
yywrap (void)
{
charPos = 1;
return 1;
}

// Adjust the token position in the string
// Mainly used for error checking
void
adjust (void)
{
EM_tokPos = charPos;
charPos += yyleng;
}

%}

/* Will be used for conditional activation of the comment rule. */
%x C_COMMENT

digits [0-9]+
letters [_a-zA-Z]+


%%
" " {adjust(); continue;}
\n {adjust(); EM_newline(); continue;}
\t {adjust(); continue;}

"/*" {adjust(); BEGIN(C_COMMENT);}
<C_COMMENT>[^*\n] {adjust();}
<C_COMMENT>"*/" {adjust(); BEGIN(INITIAL);}

\"(\\.|[^"])*\" {adjust(); yylval.sval = String(yytext); return STRING;}

"," {adjust(); return COMMA;}
";" {adjust(); return SEMICOLON;}
":" {adjust(); return COLON;}
"." {adjust(); return DOT;}
"+" {adjust(); return PLUS;}
"-" {adjust(); return MINUS;}
"*" {adjust(); return TIMES;}
"/" {adjust(); return DIVIDE;}
"=" {adjust(); return EQ;}
"<>" {adjust(); return NEQ;}
"<" {adjust(); return LT;}
"<=" {adjust(); return LE;}
">" {adjust(); return GT;}
">=" {adjust(); return GE;}
"&" {adjust(); return AND;}
"|" {adjust(); return OR;}
":=" {adjust(); return ASSIGN;}
"(" {adjust(); return LPAREN;}
")" {adjust(); return RPAREN;}
"{" {adjust(); return LBRACE;}
"}" {adjust(); return RBRACE;}
"[" {adjust(); return LBRACK;}
"]" {adjust(); return RBRACK;}

for {adjust(); return FOR;}
if {adjust(); return IF;}
then {adjust(); return THEN;}
else {adjust(); return ELSE;}
while {adjust(); return WHILE;}
to {adjust(); return TO;}
do {adjust(); return DO;}
let {adjust(); return LET;}
in {adjust(); return IN;}
end {adjust(); return END;}
of {adjust(); return OF;}
break {adjust(); return BREAK;}
nil {adjust(); return NIL;}
function {adjust(); return FUNCTION;}
var {adjust(); return VAR;}
type {adjust(); return TYPE;}
array {adjust(); return ARRAY;}

{digits} {adjust(); yylval.ival = atoi (yytext); return INT;}
{letters}[a-zA-Z0-9_]* {adjust(); yylval.sval = String (yytext); return ID;}

. {adjust(); EM_error (EM_tokPos,"illegal token");}

Best answer

How can I possibly debug this?

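For starters, you need to learn to use the Bison debug options. They output a dump of all the parser states. Admittedly, working through those states takes a lot of patience and time, but even at a glance you can usually narrow down which rules are causing the problem.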
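As a minimal sketch of switching the run-time trace on from C: the question's prologue already defines YYDEBUG as 1, so the generated parser should provide the global `yydebug`. Setting it to a nonzero value before calling `yyparse()` makes the parser log to stderr each token it reads and each shift and reduction it performs. The helper name `parse_with_trace` and the two stand-in definitions below are assumptions, added only so the snippet compiles by itself; in a real build they come from y.tab.c.

```c
#include <stdio.h>

/* Stand-ins so this sketch compiles alone (assumption). In a real build,
   delete these two definitions and link against the Bison output
   (y.tab.c), which defines yydebug when YYDEBUG is nonzero, and yyparse(). */
int yydebug = 0;
int yyparse(void) { return 0; }

/* Hypothetical helper: enable the parser trace, then parse.
   Returns whatever yyparse() returns (0 means the parse succeeded). */
int parse_with_trace(void)
{
    yydebug = 1;        /* nonzero: trace tokens, shifts, reductions on stderr */
    return yyparse();
}
```

The token names in the trace are the parser's view of the input. If a keyword such as `let` appears in the trace under a different token name, that points at the lexer and the parser disagreeing about token numbers rather than at the grammar rules.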

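As for your actual problem, your lexer is not returning the tokens that bison defines.

For example, in Bison you have %token TK_GE, but your lexer returns GE. The Bison grammar only knows about TK_GE, and that is what it expects. If I recall correctly, Bison defines the tokens as an increasing sequence of numbers above the ASCII range, and your lexer has to return exactly those values.

Unless you are doing some redefinition in tokens.h that I cannot see, you need to rewrite the lexer rules like this: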

">="     {adjust(); return TK_GE;}

You probably have #define GE 42 somewhere, while bison is generating its token file with #define TK_GE 21 (example values).
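To illustrate how two numbering schemes can produce a misleading token name in the error message, here is a self-contained sketch. Every number in it is made up for illustration, and the real contents of tokens.h and y.tab.h are not shown in the question. The helper names `parser_name` and `agrees` are hypothetical as well.

```c
#include <stdio.h>
#include <string.h>

/* Made-up numbering from a hand-written tokens.h: what the lexer returns. */
enum lexer_token { ID = 257, EQ = 258, GE = 259, LET = 260 };

/* Made-up numbering generated by Bison: what the parser expects. */
enum parser_token { TK_ID = 258, TK_EQ = 259, TK_GE = 260, TK_LET = 261 };

/* Name the parser would report for a given token number. */
const char *parser_name(int code)
{
    switch (code) {
    case TK_ID:  return "TK_ID";
    case TK_EQ:  return "TK_EQ";
    case TK_GE:  return "TK_GE";
    case TK_LET: return "TK_LET";
    default:     return "(unknown)";
    }
}

/* Returns 1 when the parser reads `code` as the token the lexer intended. */
int agrees(int code, const char *intended)
{
    return strcmp(parser_name(code), intended) == 0;
}

/* Print what the parser makes of the lexer's LET. */
void show_mismatch(void)
{
    printf("lexer returns LET (%d), parser sees %s\n", LET, parser_name(LET));
}
```

With these made-up values the lexer's `LET` lands on the parser's `TK_GE`, so a file that starts with `let` would be reported as containing an unexpected `TK_GE`. Returning the `TK_`-prefixed names from the Bison-generated header in the lexer removes the second numbering scheme entirely.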

This question, compiler-construction - GNU Bison: Syntax Error, unexpected <token>, comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/26443728/
