gpt4 book ai didi

sql - 对 Pyparsing 中 setResultsName 的行为感到困惑

转载 作者:行者123 更新时间:2023-12-04 04:54:58 26 4
gpt4 key购买 nike

我正在尝试解析一些 SQL 语句。这是一个示例:

select
ms.member_sk a,
dd.date_sk b,
st.subscription_type,
(SELECT foo FROM zoo) e
from dim_member_subscription_all p,
dim_subs_type
where a in (select moo from t10)

我现在只对获取 table 感兴趣。所以我想看看
[动物园,dim_member_subscription_all,dim_subs_type] & [t10]

我已经编写了一个小脚本,查看 Paul McGuire 的示例
#!/usr/bin/env python
import sys
import pprint
from pyparsing import *


pp = pprint.PrettyPrinter(indent=4)
semicolon = Combine(Literal(';') + lineEnd)
comma = Literal(',')
lparen = Literal('(')
rparen = Literal(')')

update_kw, volatile_kw, create_kw, table_kw, as_kw, from_kw, \
where_kw, join_kw, left_kw, right_kw, cross_kw, outer_kw, \
on_kw , insert_kw , into_kw= \
map(lambda x: Keyword(x, caseless=True), \
['UPDATE', 'VOLATILE', 'CREATE', 'TABLE', 'AS', 'FROM',
'WHERE', 'JOIN' , 'LEFT', 'RIGHT' , \
'CROSS', 'OUTER', 'ON', 'INSERT', 'INTO'])

select_kw = Keyword('SELECT', caseless=True) | Keyword('SEL' , caseless=True)

reserved_words = (update_kw | volatile_kw | create_kw | table_kw | as_kw |
select_kw | from_kw | where_kw | join_kw |
left_kw | right_kw | cross_kw | on_kw | insert_kw |
into_kw)

ident = ~reserved_words + Word(alphas, alphanums + '_')

table = Combine(Optional(ident + Literal('.')) + ident)
column = Combine(Optional(ident + Literal('.')) + (ident | Literal('*')))

column_alias = Optional(Optional(as_kw).suppress() + ident)
table_alias = Optional(Optional(as_kw).suppress() + ident).suppress()

select_stmt = Forward()
nested_table = lparen.suppress() + select_stmt + rparen.suppress() + table_alias
table_list = delimitedList((nested_table | table) + table_alias)
column_list = delimitedList((nested_table | column) + column_alias)

txt = """
select
ms.member_sk a,
dd.date_sk b,
st.subscription_type,
(SELECT foo FROM zoo) e
from dim_member_subscription_all p,
dim_subs_type
where a in (select moo from t10)
"""

select_stmt << select_kw.suppress() + column_list + from_kw.suppress() + \
table_list.setResultsName('tables', listAllMatches=True)

print txt

for token in select_stmt.searchString(txt):
pp.pprint(token.asDict())

我得到以下嵌套输出。有人可以帮我理解我做错了什么吗?
{   'tables': ([(['zoo'], {}), (['dim_member_subscription_all', 'dim_subs_type'], {})], {})}
{ 'tables': ([(['t10'], {})], {})}

最佳答案

searchString将返回所有匹配的列表 ParseResults - 你可以看到 tables每个使用的值:

for token in select_stmt.searchString(txt):
print token.tables

给予:
[['zoo'], ['dim_member_subscription_all', 'dim_subs_type']]
[['t10']]

所以 searchString 找到了两个 SELECT 语句。

最新版本的 pyparsing 支持使用 Python 内置 sum 将此列表汇总为单个合并。 .访问 tables此合并结果的值如下所示:
print sum(select_stmt.searchString(txt)).tables

[['zoo'], ['dim_member_subscription_all', 'dim_subs_type'], ['t10']]

我认为解析器正在做你想做的一切,你只需要弄清楚如何处理返回的结果。

为了进一步调试,您应该开始使用 dump ParseResults 上的方法来查看你得到了什么,这将打印返回标记的嵌套列表,然后是所有命名结果的分层树。对于您的示例:
for token in select_stmt.searchString(txt):
print token.dump()
print

打印:
['ms.member_sk', 'a', 'dd.date_sk', 'b', 'st.subscription_type', 'foo', 'zoo', 'dim_member_subscription_all', 'dim_subs_type']
- tables: [['zoo'], ['dim_member_subscription_all', 'dim_subs_type']]

['moo', 't10']
- tables: [['t10']]

关于sql - 对 Pyparsing 中 setResultsName 的行为感到困惑,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16878897/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com