gpt4 book ai didi

python - pyparsing OneOrMore 嵌入到其他 OneOrMore 中

转载 作者:太空宇宙 更新时间:2023-11-03 11:55:18 25 4
gpt4 key购买 nike

我是第一次尝试使用 pyparsing。我的解析器没有按照我希望的方式执行,请有人检查一下,看看哪里出了问题。 我正在尝试将 OneOrMore 嵌入到 OneOrMore 中,我认为这应该可以正常工作,但事实并非如此。

完整代码如下:

import pyparsing

status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""

integer = pyparsing.Word(pyparsing.nums).setParseAction(lambda toks: int(toks[0]))
decimal = pyparsing.Word(pyparsing.nums + ".").setParseAction(lambda toks: float(toks[0]))
wordSuppress = pyparsing.Suppress(pyparsing.Word(pyparsing.alphas))
endOfLine = pyparsing.LineEnd().suppress()
colon = pyparsing.Suppress(":")

saleNumber = pyparsing.Regex("\d{2}\/\d{1}").setResultsName("saleNumber")
lineSuppress = pyparsing.Regex("NAME.*STOP") + endOfLine
saleRow = wordSuppress + wordSuppress + colon + saleNumber + endOfLine

name = pyparsing.Regex("cross-cu-\d").setResultsName("name")
id = integer.setResultsName("id")
pawn = integer.setResultsName("pawn")
price = integer.setResultsName("price") + "K"
time = pyparsing.Regex("\d{2}:\d{2}:\d{2}.\d{2}").setResultsName("time")
c = decimal.setResultsName("c") + "%"
state = pyparsing.Word(pyparsing.alphas).setResultsName("state")
startStop = pyparsing.Word(pyparsing.alphanums).setResultsName("startStop")
row = name + id + pawn + price + time + c + state + startStop + endOfLine

table = pyparsing.OneOrMore(pyparsing.Group(saleRow + lineSuppress.suppress() + (pyparsing.OneOrMore(pyparsing.Group(row) | pyparsing.SkipTo(row).suppress())) ) | pyparsing.SkipTo(saleRow).suppress())

resultDic = [x.asDict() for x in table.parseString(status)]
print resultDic

它只返回 [{'saleNumber': '11/7'}]我希望得到这样的 dic 列表:

[{ {'saleNumber': '11/7'},{ elements in cross-cu-1 line, elements in cross-cu-2 line } },
{ {'saleNumber': '11/8'},{ elements in cross-cu-3 line, elements in cross-cu-4 line } }]

感谢任何帮助!请不要建议其他实现此输出的方法!我也在努力学习 pyparsing!

最佳答案

在这种情况下,pyparsing 可能有点矫枉过正。为什么不简单地逐行读取文件,然后解析结果?

代码看起来像这样:

编辑:我更新了代码以更紧密地遵循您的示例。

从集合导入默认字典

status = """
sale number : 11/7
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-1 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-2 918 1 104708K 07:38:19.08 24.02% run 1d01h
sale number : 11/8
NAME ID PAWN PRICE TIME %C STATE START/STOP
cross-cu-3 1055 1 106284K 07:49:36.19 25.05% run 1d01h
cross-cu-4 918 1 104708K 07:38:19.08 24.02% run 1d01h
"""

sale_number = ''

sales = defaultdict(list)

for line in status.split('\n'):
line = line.strip()
if line.startswith("NAME"):
continue
elif line.startswith("sale number"):
sale_number = line.split(':')[1].strip()
elif not line or line.isspace() :
continue
else:
# you can also use a regular expression here
sales[sale_number].append(line.split())

for sale in sales:
print sale, sales[sale]

关于python - pyparsing OneOrMore 嵌入到其他 OneOrMore 中,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/12423923/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com