gpt4 book ai didi

java - ANTLR 没有为 Scala 语法提供正确的输出标记

转载 作者:行者123 更新时间:2023-12-03 14:59:21 30 4
gpt4 key购买 nike

关闭。这个问题需要更多focused .它目前不接受答案。












想改善这个问题吗?更新问题,使其仅关注一个问题 editing this post .

6 个月前关闭。




Improve this question




我是 Scala 的新手,我正在尝试使用 Scala Grammar 和 ANTLR 来解析 Scala 文件。下面是我从 git hub 链接获得的 Scala 语法代码:

https://github.com/antlr/grammars-v4/tree/master/scala

有机会移动 repo,所以我在这里粘贴 Scala 语法代码:

grammar Scala;

literal : '-'? IntegerLiteral
| '-'? FloatingPointLiteral
| BooleanLiteral
| CharacterLiteral
| StringLiteral
| SymbolLiteral
| 'null' ;

qualId : Id ('.' Id)* ;

ids : Id (',' Id)* ;

stableId : (Id | (Id '.')? 'this') '.' Id
| (Id '.')? 'super' classQualifier? '.' Id ;

classQualifier : '[' Id ']' ;

type : functionArgTypes '=>' type
| infixType existentialClause? ;

functionArgTypes : infixType
| '(' ( paramType (',' paramType )* )? ')' ;

existentialClause : 'forSome' '{' existentialDcl (Semi existentialDcl)* '}';

existentialDcl : 'type' typeDcl
| 'val' valDcl;

infixType : compoundType (Id Nl? compoundType)*;

compoundType : annotType ('with' annotType)* refinement?
| refinement;

annotType : simpleType annotation*;

simpleType : simpleType typeArgs
| simpleType '#' Id
| stableId
| (stableId | (Id '.')? 'this') '.' 'type'
| '(' types ')';

typeArgs : '[' types ']';

types : type (',' type)*;

refinement : Nl? '{' refineStat (Semi refineStat)* '}';

refineStat : dcl
| 'type' typeDef
| ;

typePat : type;

ascription : ':' infixType
| ':' annotation+
| ':' '_' '*';

expr : (bindings | 'implicit'? Id | '_') '=>' expr
| expr1 ;

expr1 : 'if' '(' expr ')' Nl* expr (Semi? 'else' expr)?
| 'while' '(' expr ')' Nl* expr
| 'try' ('{' block '}' | expr) ('catch' '{' caseClauses '}')? ('finally' expr)?
| 'do' expr Semi? 'while' '(' expr ')'
| 'for' ('(' enumerators ')' | '{' enumerators '}') Nl* 'yield'? expr
| 'throw' expr
| 'return' expr?
| (('new' (classTemplate | templateBody)| blockExpr | simpleExpr1 '_'?) '.') Id '=' expr
| simpleExpr1 argumentExprs '=' expr
| postfixExpr
| postfixExpr ascription
| postfixExpr 'match' '{' caseClauses '}' ;

postfixExpr : infixExpr (Id Nl?)? ;

infixExpr : prefixExpr
| infixExpr Id Nl? infixExpr ;

prefixExpr : ('-' | '+' | '~' | '!')?
('new' (classTemplate | templateBody)| blockExpr | simpleExpr1 '_'?) ;

simpleExpr1 : literal
| stableId
| (Id '.')? 'this'
| '_'
| '(' exprs? ')'
| ('new' (classTemplate | templateBody) | blockExpr ) '.' Id
| ('new' (classTemplate | templateBody) | blockExpr ) typeArgs
| simpleExpr1 argumentExprs
;

exprs : expr (',' expr)* ;

argumentExprs : '(' exprs? ')'
| '(' (exprs ',')? postfixExpr ':' '_' '*' ')'
| Nl? blockExpr ;

blockExpr : '{' caseClauses '}'
| '{' block '}' ;
block : blockStat (Semi blockStat)* resultExpr? ;

blockStat : import_
| annotation* ('implicit' | 'lazy')? def
| annotation* localModifier* tmplDef
| expr1
| ;

resultExpr : expr1
| (bindings | ('implicit'? Id | '_') ':' compoundType) '=>' block ;

enumerators : generator (Semi generator)* ;

generator : pattern1 '<-' expr (Semi? guard | Semi pattern1 '=' expr)* ;

caseClauses : caseClause+ ;

caseClause : 'case' pattern guard? '=>' block ;

guard : 'if' postfixExpr ;

pattern : pattern1 ('|' pattern1 )* ;

pattern1 : Varid ':' typePat
| '_' ':' typePat
| pattern2 ;

pattern2 : Varid ('@' pattern3)?
| pattern3 ;

pattern3 : simplePattern
| simplePattern (Id Nl? simplePattern)* ;

simplePattern : '_'
| Varid
| literal
| stableId ('(' patterns ')')?
| stableId '(' (patterns ',')? (Varid '@')? '_' '*' ')'
| '(' patterns? ')' ;

patterns : pattern (',' patterns)*
| '_' * ;

typeParamClause : '[' variantTypeParam (',' variantTypeParam)* ']' ;

funTypeParamClause: '[' typeParam (',' typeParam)* ']' ;

variantTypeParam : annotation? ('+' | '-')? typeParam ;

typeParam : (Id | '_') typeParamClause? ('>:' type)? ('<:' type)?
('<%' type)* (':' type)* ;

paramClauses : paramClause* (Nl? '(' 'implicit' params ')')? ;

paramClause : Nl? '(' params? ')' ;

params : param (',' param)* ;

param : annotation* Id (':' paramType)? ('=' expr)? ;

paramType : type
| '=>' type
| type '*';

classParamClauses : classParamClause*
(Nl? '(' 'implicit' classParams ')')? ;

classParamClause : Nl? '(' classParams? ')' ;

classParams : classParam (',' classParam)* ;

classParam : annotation* modifier* ('val' | 'var')?
Id ':' paramType ('=' expr)? ;

bindings : '(' binding (',' binding )* ')' ;

binding : (Id | '_') (':' type)? ;

modifier : localModifier
| accessModifier
| 'override' ;

localModifier : 'abstract'
| 'final'
| 'sealed'
| 'implicit'
| 'lazy' ;

accessModifier : ('private' | 'protected') accessQualifier? ;

accessQualifier : '[' (Id | 'this') ']' ;

annotation : '@' simpleType argumentExprs* ;

constrAnnotation : '@' simpleType argumentExprs ;

templateBody : Nl? '{' selfType? templateStat (Semi templateStat)* '}' ;

templateStat : import_
| (annotation Nl?)* modifier* def
| (annotation Nl?)* modifier* dcl
| expr
| ;

selfType : Id (':' type)? '=>'
| 'this' ':' type '=>' ;

import_ : 'import' importExpr (',' importExpr)* ;

importExpr : stableId '.' (Id | '_' | importSelectors) ;

importSelectors : '{' (importSelector ',')* (importSelector | '_') '}' ;

importSelector : Id ('=>' Id | '=>' '_') ;

dcl : 'val' valDcl
| 'var' varDcl
| 'def' funDcl
| 'type' Nl* typeDcl ;

valDcl : ids ':' type ;

varDcl : ids ':' type ;

funDcl : funSig (':' type)? ;

funSig : Id funTypeParamClause? paramClauses ;

typeDcl : Id typeParamClause? ('>:' type)? ('<:' type)? ;

patVarDef : 'val' patDef
| 'var' varDef ;

def : patVarDef
| 'def' funDef
| 'type' Nl* typeDef
| tmplDef ;

patDef : pattern2 (',' pattern2)* (':' type)* '=' expr ;

varDef : patDef
| ids ':' type '=' '_' ;

funDef : funSig (':' type)? '=' expr
| funSig Nl? '{' block '}'
| 'this' paramClause paramClauses
('=' constrExpr | Nl constrBlock) ;

typeDef : Id typeParamClause? '=' type ;

tmplDef : 'case'? 'class' classDef
| 'case' 'object' objectDef
| 'trait' traitDef ;

classDef : Id typeParamClause? constrAnnotation* accessModifier?
classParamClauses classTemplateOpt ;

traitDef : Id typeParamClause? traitTemplateOpt ;

objectDef : Id classTemplateOpt ;

classTemplateOpt : 'extends' classTemplate | ('extends'? templateBody)? ;

traitTemplateOpt : 'extends' traitTemplate | ('extends'? templateBody)? ;

classTemplate : earlyDefs? classParents templateBody? ;

traitTemplate : earlyDefs? traitParents templateBody? ;

classParents : constr ('with' annotType)* ;

traitParents : annotType ('with' annotType)* ;

constr : annotType argumentExprs* ;

earlyDefs : '{' (earlyDef (Semi earlyDef)*)? '}' 'with' ;

earlyDef : (annotation Nl?)* modifier* patVarDef ;

constrExpr : selfInvocation
| constrBlock ;

constrBlock : '{' selfInvocation (Semi blockStat)* '}' ;
selfInvocation : 'this' argumentExprs+ ;

topStatSeq : topStat (Semi topStat)* ;

topStat : (annotation Nl?)* modifier* tmplDef
| import_
| packaging
| packageObject
| ;

packaging : 'package' qualId Nl? '{' topStatSeq '}' ;

packageObject : 'package' 'object' objectDef ;

compilationUnit : ('package' qualId Semi)* topStatSeq ;

// Lexer
BooleanLiteral : 'true' | 'false';
CharacterLiteral : '\'' (PrintableChar | CharEscapeSeq) '\'';
StringLiteral : '"' StringElement* '"'
| '"""' MultiLineChars '"""';
SymbolLiteral : '\'' Plainid;
IntegerLiteral : (DecimalNumeral | HexNumeral) ('L' | 'l');
FloatingPointLiteral
: Digit+ '.' Digit+ ExponentPart? FloatType?
| '.' Digit+ ExponentPart? FloatType?
| Digit ExponentPart FloatType?
| Digit+ ExponentPart? FloatType;
Id : Plainid
| '`' StringLiteral '`';
Varid : Lower Idrest;
Nl : '\r'? '\n';
Semi : ';' | Nl+;

Paren : '(' | ')' | '[' | ']' | '{' | '}';
Delim : '`' | '\'' | '"' | '.' | ';' | ',' ;

Comment : '/*' .*? '*/'
| '//' .*? Nl;

// fragments
fragment UnicodeEscape : '\\' 'u' 'u'? HexDigit HexDigit HexDigit HexDigit ;
fragment WhiteSpace : '\u0020' | '\u0009' | '\u000D' | '\u000A';
fragment Opchar : PrintableChar // printableChar not matched by (whiteSpace | upper | lower |
// letter | digit | paren | delim | opchar | Unicode_Sm | Unicode_So)
;
fragment Op : Opchar+;
fragment Plainid : Upper Idrest
| Varid
| Op;
fragment Idrest : (Letter | Digit)* ('_' Op)?;

fragment StringElement : '\u0020'| '\u0021'|'\u0023' .. '\u007F' // (PrintableChar Except '"')
| CharEscapeSeq;
fragment MultiLineChars : ('"'? '"'? .*?)* '"'*;

fragment HexDigit : '0' .. '9' | 'A' .. 'Z' | 'a' .. 'z' ;
fragment FloatType : 'F' | 'f' | 'D' | 'd';
fragment Upper : 'A' .. 'Z' | '$' | '_'; // and Unicode category Lu
fragment Lower : 'a' .. 'z'; // and Unicode category Ll
fragment Letter : Upper | Lower; // and Unicode categories Lo, Lt, Nl
fragment ExponentPart : ('E' | 'e') ('+' | '-')? Digit+;
fragment PrintableChar : '\u0020' .. '\u007F' ;
fragment CharEscapeSeq : '\\' ('b' | 't' | 'n' | 'f' | 'r' | '"' | '\'' | '\\');
fragment DecimalNumeral : '0' | NonZeroDigit Digit*;
fragment HexNumeral : '0' 'x' HexDigit HexDigit+;
fragment Digit : '0' | NonZeroDigit;
fragment NonZeroDigit : '1' .. '9';

上面的Scala语法和我从Scala官网得到的一样:

http://www.scala-lang.org/files/archive/spec/2.11/13-syntax-summary.html

现在我正在尝试为名为 scala.scala 的 scala 文件生成 token 。该文件的代码如下:
object HelloWorld {
def main(args: Array[String]) {
println("Hello, world!")
}
}

我正在运行以下命令来获取 token :

grun Scala compilationUnit -tokens scala.scala

或者

grun Scala expr -tokens scala.scala

或者

grun Scala literal -tokens scala.scala

我得到的输出是:
[@0,0:18='object HelloWorld {',<68>,1:0]
[@1,19:19='\n',<70>,1:19]
[@2,20:52=' def main(args: Array[String]) {',<68>,2:0]
[@3,53:53='\n',<70>,2:33]
[@4,54:81=' println("Hello, world!")',<68>,3:0]
[@5,82:82='\n',<70>,3:28]
[@6,83:85=' }',<68>,4:0]
[@7,86:86='\n',<70>,4:3]
[@8,87:87='}',<14>,5:0]
[@9,88:88='\n',<70>,5:1]
[@10,89:88='<EOF>',<-1>,6:0]
line 1:19 no viable alternative at input 'object HelloWorld {\n'

树形输出是这样的:
(expr object HelloWorld { \n   def main(args: Array[String]) { \n     println("Hello, world!") \n   } \n } \n)

gui中的输出是这样的:

Image exported from the antlr tool

那是完全愚蠢的。代替 token ,它给了我简单的 LOC 。我针对其他语言 Java 和 C 对其进行了测试,效果很好。它为我提供了以下语法链接所需的正确输出/正确标记:

https://github.com/antlr/grammars-v4

请纠正我如果我做错了什么,因为我是 Antlr 和 Scala 的新手。

我从 token 中的意思是所有关键字、操作数和所有运算符都在那里。在我看来,它从来都不是简单的代码行。
Below is the Scala.tokens file which I got using Scala.g4(Scala Grammar with ANTLR).



T__0=1
T__1=2
T__2=3
T__3=4
T__4=5
T__5=6
T__6=7
T__7=8
T__8=9
T__9=10
T__10=11
T__11=12
T__12=13
T__13=14
T__14=15
T__15=16
T__16=17
T__17=18
T__18=19
T__19=20
T__20=21
T__21=22
T__22=23
T__23=24
T__24=25
T__25=26
T__26=27
T__27=28
T__28=29
T__29=30
T__30=31
T__31=32
T__32=33
T__33=34
T__34=35
T__35=36
T__36=37
T__37=38
T__38=39
T__39=40
T__40=41
T__41=42
T__42=43
T__43=44
T__44=45
T__45=46
T__46=47
T__47=48
T__48=49
T__49=50
T__50=51
T__51=52
T__52=53
T__53=54
T__54=55
T__55=56
T__56=57
T__57=58
T__58=59
T__59=60
T__60=61
BooleanLiteral=62
CharacterLiteral=63
StringLiteral=64
SymbolLiteral=65
IntegerLiteral=66
FloatingPointLiteral=67
Id=68
Varid=69
Nl=70
Semi=71
Paren=72
Delim=73
Comment=74
'-'=1
'null'=2
'.'=3
','=4
'this'=5
'super'=6
'['=7
']'=8
'=>'=9
'('=10
')'=11
'forSome'=12
'{'=13
'}'=14
'type'=15
'val'=16
'with'=17
'#'=18
':'=19
'_'=20
'*'=21
'implicit'=22
'if'=23
'else'=24
'while'=25
'try'=26
'catch'=27
'finally'=28
'do'=29
'for'=30
'yield'=31
'throw'=32
'return'=33
'new'=34
'='=35
'match'=36
'+'=37
'~'=38
'!'=39
'lazy'=40
'<-'=41
'case'=42
'|'=43
'@'=44
'>:'=45
'<:'=46
'<%'=47
'var'=48
'override'=49
'abstract'=50
'final'=51
'sealed'=52
'private'=53
'protected'=54
'import'=55
'def'=56
'class'=57
'object'=58
'trait'=59
'extends'=60
'package'=61

我确信这些 token 是不正确的。任何人都可以确定 Scala Gramma 或 ANTLR 有这个问题吗?

最佳答案

这个文件现在似乎解析得很好,所以语法可能已经修复

关于java - ANTLR 没有为 Scala 语法提供正确的输出标记,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40482259/

30 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com