
python - antlr4 python 3 print or dump tokens from plsql grammar

Reposted. Author: 行者123. Updated: 2023-12-01 01:32:52

I am using antlr4 in Python to read the following grammar:

https://github.com/antlr/grammars-v4/tree/master/plsql

The file grants.sql contains only "begin select 'bob' from Dual; end;"

Simple code that prints the tree in a Lisp-like form:

from antlr4 import *
from PlSqlLexer import PlSqlLexer
from PlSqlParser import PlSqlParser
from PlSqlParserListener import PlSqlParserListener

input = FileStream('grants.sql')
lexer = PlSqlLexer(input)

stream = CommonTokenStream(lexer)
parser = PlSqlParser(stream)
tree = parser.sql_script()

print("Tree " + tree.toStringTree(recog=parser))

The output is as follows:

Tree (sql_script (unit_statement (anonymous_block begin (seq_of_statements (sql_statement (data_manipulation_language_statements ... (compound_expression (concatenation (model_expression (unary_expression (atom (constant (quoted_string 'bob')))))))))))))) (from_clause FROM (table_ref_list (table_ref (table_ref_aux (table_ref_aux_internal (dml_table_expression_clause (tableview_name (identifier (id_expression (regular_id DUAL))))))))))))))))))) ;) end ;)) )

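I would like Python code that lists the above not as a Lisp-like expression, but as a list of all rules and tokens, i.e.

  1. .sql_script
  2. ..unit_statement
  3. ...anonymous_block
  4. ...begin

and so on.

Can someone provide Python code that does this, or give me some pointers? Much appreciated.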

Best answer

Here is a start:

from antlr4 import *
from antlr4.tree.Tree import TerminalNodeImpl
from PlSqlLexer import PlSqlLexer
from PlSqlParser import PlSqlParser

# Generate the lexer and parser like this:
#
# java -jar antlr-4.7.1-complete.jar -Dlanguage=Python3 *.g4
#
def main():
    lexer = PlSqlLexer(InputStream("SELECT * FROM TABLE_NAME"))
    parser = PlSqlParser(CommonTokenStream(lexer))
    tree = parser.sql_script()
    traverse(tree, parser.ruleNames)

def traverse(tree, rule_names, indent = 0):
    if tree.getText() == "<EOF>":
        return
    elif isinstance(tree, TerminalNodeImpl):
        print("{0}TOKEN='{1}'".format(" " * indent, tree.getText()))
    else:
        print("{0}{1}".format(" " * indent, rule_names[tree.getRuleIndex()]))
        for child in tree.children:
            traverse(child, rule_names, indent + 1)

if __name__ == '__main__':
    main()

This prints:

sql_script
 unit_statement
  data_manipulation_language_statements
   select_statement
    subquery
     subquery_basic_elements
      query_block
       TOKEN='SELECT'
       TOKEN='*'
       from_clause
        TOKEN='FROM'
        table_ref_list
         table_ref
          table_ref_aux
           table_ref_aux_internal
            dml_table_expression_clause
             tableview_name
              identifier
               id_expression
                regular_id
                 TOKEN='TABLE_NAME'
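The question asked for a dotted outline (.sql_script, ..unit_statement, and so on) rather than space indentation. The following is a minimal sketch of that variant of traverse. To keep it runnable without the generated PlSql classes, it uses two hypothetical stand-in classes, FakeRule and FakeToken, which are not part of ANTLR; they only mimic the getText(), getRuleIndex() and children members used above. It also assumes that only rule nodes have a getRuleIndex method, and it treats a missing children list as empty.

```python
# Hypothetical stand-ins for ANTLR parse-tree nodes, so the sketch runs
# without the generated PlSql parser. They are NOT part of the ANTLR runtime.
class FakeToken:
    def __init__(self, text):
        self._text = text

    def getText(self):
        return self._text


class FakeRule:
    def __init__(self, rule_index, children=None):
        self._rule_index = rule_index
        self.children = children  # may be None when a rule has no children

    def getRuleIndex(self):
        return self._rule_index


def outline(node, rule_names, depth=1):
    """Return one line per rule or token, prefixed with `depth` dots."""
    prefix = "." * depth
    if not hasattr(node, "getRuleIndex"):
        # Assumed to be a terminal node: emit its text, skipping the EOF marker.
        text = node.getText()
        return [] if text == "<EOF>" else [prefix + text]
    lines = [prefix + rule_names[node.getRuleIndex()]]
    for child in node.children or []:
        lines.extend(outline(child, rule_names, depth + 1))
    return lines


if __name__ == "__main__":
    names = ["sql_script", "unit_statement", "anonymous_block"]
    tree = FakeRule(0, [FakeRule(1, [FakeRule(2, [FakeToken("begin")])]),
                        FakeToken("<EOF>")])
    print("\n".join(outline(tree, names)))
```

With a real parse tree, the call would presumably be print("\n".join(outline(tree, parser.ruleNames))), using the tree and parser objects from main() above.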

Note that to get the lexer and parser to work, I added the following Python classes:

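# PlSqlBaseLexer.py
from antlr4 import *

class PlSqlBaseLexer(Lexer):

    def IsNewlineAtPos(self, pos):
        # LA() returns an integer code point (or -1 at end of input),
        # so compare against ord('\n') rather than the string '\n'.
        la = self._input.LA(pos)
        return la == -1 or la == ord('\n')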

and:

# PlSqlBaseParser.py
from antlr4 import *

class PlSqlBaseParser(Parser):

    _isVersion10 = False
    _isVersion12 = True

    def isVersion10(self):
        return self._isVersion10

    def isVersion12(self):
        return self._isVersion12

    def setVersion10(self, value):
        self._isVersion10 = value

    def setVersion12(self, value):
        self._isVersion12 = value

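I placed these in the same folder as the generated Python classes. I also needed to add the import statement "from PlSqlBaseLexer import PlSqlBaseLexer" to the generated PlSqlLexer.py, and to fix the import statement in PlSqlParser.py, changing "from ./PlSqlBaseParser import PlSqlBaseParser" to "from PlSqlBaseParser import PlSqlBaseParser".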
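Because these two import fixes have to be repeated every time the lexer and parser are regenerated, they can be scripted. The sketch below is one possible way to do that and is not part of the original answer. The helper names patch_lexer_source, patch_parser_source and patch_file are made up for illustration, and the code assumes the generated lexer contains a line starting with "from antlr4 import" (if it does not, the import is simply placed first).

```python
LEXER_IMPORT = "from PlSqlBaseLexer import PlSqlBaseLexer"
PARSER_IMPORT = "from PlSqlBaseParser import PlSqlBaseParser"
BROKEN_PARSER_IMPORT = "from ./PlSqlBaseParser import PlSqlBaseParser"


def patch_lexer_source(source):
    """Add the base-lexer import after the first antlr4 import, if missing."""
    if LEXER_IMPORT in source:
        return source  # already patched
    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("from antlr4 import"):
            if not line.endswith("\n"):
                lines[i] = line + "\n"
            lines.insert(i + 1, LEXER_IMPORT + "\n")
            return "".join(lines)
    # Assumption did not hold: no antlr4 import found, so put the import first.
    return LEXER_IMPORT + "\n" + source


def patch_parser_source(source):
    """Replace the invalid 'from ./PlSqlBaseParser ...' import."""
    return source.replace(BROKEN_PARSER_IMPORT, PARSER_IMPORT)


def patch_file(path, patch):
    """Apply a source-patching function to a file in place."""
    with open(path, encoding="utf-8") as handle:
        original = handle.read()
    patched = patch(original)
    if patched != original:
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(patched)
```

After regenerating, one would call patch_file("PlSqlLexer.py", patch_lexer_source) and patch_file("PlSqlParser.py", patch_parser_source). Both patches are written so that running them a second time leaves the files unchanged.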

Note that running the demo is rather slow. Unless you have a hard requirement to do this in Python, I recommend using the (much!) faster Java or C# target.

Regarding python - antlr4 python 3 print or dump tokens from plsql grammar, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52673751/
