java - Handling comments and line/column numbers in a COBOL grammar using JavaCC

Reposted. Author: 行者123. Updated: 2023-12-01 14:48:02

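I am developing a COBOL parser using JavaCC. A COBOL file usually has columns 1 to 6 reserved for line/column numbers. If the line/column numbers are not present, those columns contain spaces.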
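To make that layout concrete, here is a minimal Java sketch that slices one fixed-format line into the number area (columns 1 to 6), the character in column 7, and the remaining text. The class and method names (`FixedFormatLine`, `sequenceArea`, `indicator`, `mainArea`) are illustrative assumptions and are not part of JavaCC or cobol.jj.

```java
// Illustrative only: FixedFormatLine is a hypothetical helper, not part of JavaCC or cobol.jj.
final class FixedFormatLine {
    /** Columns 1-6: the line/column number, or spaces when it is absent. */
    static String sequenceArea(String line) {
        return line.substring(0, Math.min(6, line.length()));
    }

    /** Column 7: ' ' for a normal line, '*' for a comment; '\0' if the line is too short. */
    static char indicator(String line) {
        return line.length() >= 7 ? line.charAt(6) : '\0';
    }

    /** Everything after column 7, which is the part the parser should actually see. */
    static String mainArea(String line) {
        return line.length() > 7 ? line.substring(7) : "";
    }

    public static void main(String[] args) {
        String code = "000100 MOVE A TO B.";
        String comment = "      * a comment line";
        System.out.println(sequenceArea(code) + "|" + indicator(code) + "|" + mainArea(code));
        System.out.println(sequenceArea(comment) + "|" + indicator(comment) + "|" + mainArea(comment));
    }
}
```

The slicing is done per line with plain string indexes, so it only illustrates which columns matter; it does not tokenize anything.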

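I need to know how to handle comments and the sequence area in a COBOL file so that only the main area is parsed.

I have tried many expressions, but none of them work. I created a special token that checks for a newline, then for six occurrences of either spaces or any character other than a space or carriage return; the seventh character after that is "*" for a comment line and " " for a normal line.

I am using the Cobol.jj file available at http://java.net/downloads/javacc/contrib/grammars/cobol.jj

Can anyone suggest what grammar I should use?

A sample of my grammar file: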

    PARSER_END(CblParser)

////////////////////////////////////////////////////////////////////////////////
// Lexical structure
////////////////////////////////////////////////////////////////////////////////

SPECIAL_TOKEN :
{
< EOL: "\n" > : LINE_START
| < SPACECHAR: ( " " | "\t" | "\f" | ";" | "\r" )+ >
}

SPECIAL_TOKEN :
{
< COMMENT: ( ~["\n","\r"," "] ~["\n","\r"," "] ~["\n","\r"," "] ~["\n","\r"," "] ~["\n","\r"," "] ~["\n","\r"," "] ) ( "*" | "|" ) (~["\n","\r"])* >
| < PREPROC_COMMENT: "*|" (~["\n","\r"])* >
| < SPACE_SEPARATOR : ( <SPACECHAR> | <EOL> )+ >
| < COMMA_SEPARATOR : "," <SPACE_SEPARATOR> >
}

<LINE_START> SKIP :
{
< ((~[])(~[])(~[])(~[])(~[])(~[])) (" ") >
}

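Best answer

Since the parser starts at the beginning of a line, you should use the DEFAULT state to represent the start of a line. I would do something like the following [untested code below].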

// At the start of each line, the first 6 characters are ignored and the 7th is used
// to determine whether this is a code line or a comment line.
// (Continuation lines are handled elsewhere.)
// If there are fewer than 7 characters on the line, it is ignored.
// Note that there will be a TokenManagerError if a line has at least 7 characters and
// the 7th character is other than a "*", a "/", or a space.
<DEFAULT> SKIP :
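{
  // The first six characters exclude line terminators, so a match cannot run
  // past the end of a short line into the following line.
  < (~["\n","\r"]){0,6} ("\n" | "\r" | "\r\n") > : DEFAULT
|
  < (~["\n","\r"]){6} " " > : CODE
|
  < (~["\n","\r"]){6} ("*" | "/") > : COMMENT
}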

<COMMENT> SKIP :
{ // At the end of a comment line, return to the DEFAULT state.
< "\n" | "\r" | "\r\n" > : DEFAULT
| // All non-end-of-line characters on a comment line are ignored.
< ~["\n","\r"] > : COMMENT
}
<CODE> SKIP :
{ // At the end of a code line, return to the DEFAULT state.
< "\n" | "\r" | "\r\n" > : DEFAULT
| // White space is skipped, as are semicolons.
< ( " " | "\t" | "\f" | ";" )+ >
}
<CODE> TOKEN :
{
< ACCEPT: "accept" >
|
... // all rules for tokens should be in the CODE state.
}
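
To sanity-check what the lexical states above are meant to accept, the same three start-of-line rules can be written as a plain Java pre-pass. This is only a sketch for comparison, not generated JavaCC code: `LineFilter` and `codeAreas` are made-up names, and like the grammar it does not handle continuation lines.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hand-written pre-pass that mirrors the three DEFAULT-state rules.
// LineFilter and codeAreas are hypothetical names, not part of JavaCC.
final class LineFilter {
    /** Returns the text after column 7 of every code line; comment and short lines are dropped. */
    static List<String> codeAreas(String source) {
        List<String> result = new ArrayList<>();
        // Split on \r\n, \n, or \r, the same line endings the grammar accepts.
        for (String line : source.split("\r\n|\n|\r", -1)) {
            if (line.length() < 7) {
                continue;                       // rule 1: fewer than 7 characters, ignored
            }
            char indicator = line.charAt(6);
            if (indicator == ' ') {
                result.add(line.substring(7));  // rule 2: code line (CODE state)
            } else if (indicator == '*' || indicator == '/') {
                continue;                       // rule 3: comment line (COMMENT state)
            } else {
                // The grammar has no rule for this case, so the token manager would fail as well.
                throw new IllegalArgumentException("Unexpected indicator '" + indicator + "'");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String source = "000100 MOVE A TO B.\n      * comment\n\n000200 STOP RUN.\n";
        System.out.println(codeAreas(source));
    }
}
```

Running a few sample lines through a filter like this and comparing against the tokens the generated parser produces is one way to confirm that the state transitions behave as intended.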

Regarding "java - Handling comments and line/column numbers in a COBOL grammar using JavaCC", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/15204851/
