
Parsing out multiple lines with LPeg in Lua

Reposted. Author: 行者123. Updated: 2023-12-04 22:05:17

I have some text files containing multi-line blocks, for example:

2011/01/01 13:13:13,<AB>, Some Certain Text,=,
[
certain text
[
0: 0 0 0 0 0 0 0 0
8: 0 0 0 0 0 0 0 0
16: 0 0 0 9 343 3938 9433 8756
24: 6270 4472 3182 2503 1768 1140 836 496
32: 326 273 349 269 144 121 94 82
40: 64 80 66 59 56 47 50 46
48: 64 35 42 53 42 40 41 34
56: 35 41 39 39 47 30 30 39
Total count: 12345
]
certain text
]
some text
2011/01/01 14:14:14,<AB>, Some Certain Text,=,
[
certain text
[
0: 0 0 0 0 0 0 0 0
8: 0 0 0 0 0 0 0 0
16: 0 0 0 4 212 3079 8890 8941
24: 6177 4359 3625 2420 1639 974 594 438
32: 323 286 318 296 206 132 96 85
40: 65 73 62 53 47 55 49 52
48: 29 44 44 41 43 36 50 36
56: 40 30 29 40 35 30 25 31
64: 47 31 25 29 24 30 35 31
72: 28 31 17 37 35 30 20 33
80: 28 20 37 25 21 23 25 36
88: 27 35 22 23 15 24 34 28
Total count: 123456
]
certain text
some text
]
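These variable-length blocks appear in between other text. I want to read out all the numbers that follow each ":" and store them in separate arrays, one per block.
In this case there would be two arrays: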

array1 = { 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9 343 3938 9433 8756 6270 4472 3182 2503 1768 1140 836 496 326 273 349 269 144 121 94 82 64 80 66 59 56 47 50 46 64 35 42 53 42 40 41 34 35 41 39 39 47 30 30 39 12345 }

array2 = { 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 212 3079 8890 8941 6177 4359 3625 2420 1639 974 594 438 323 286 318 296 206 132 96 85 65 73 62 53 47 55 49 52 29 44 44 41 43 36 50 36 40 30 29 40 35 30 25 31 47 31 25 29 24 30 35 31 28 31 17 37 35 30 20 33 28 20 37 25 21 23 25 36 27 35 22 23 15 24 34 28 123456 }


I found that LPeg might be a lightweight way to do this, but I am completely new to PEG and LPeg. Please help!

Best answer

An LPeg version:

local lpeg = require "lpeg"
local lpegmatch = lpeg.match
local C, Ct, P, R, S = lpeg.C, lpeg.Ct, lpeg.P, lpeg.R, lpeg.S
local Cg = lpeg.Cg

local data_to_arrays

do
    -- Basic tokens.
    local colon  = P":"
    local lbrak  = P"["
    local rbrak  = P"]"
    local digits = R"09"^1
    local eol    = P"\n\r" + P"\r\n" + P"\n" + P"\r"
    local ws     = S" \t\v"
    local optws  = ws^0
    -- Capture a run of digits as a number, then skip trailing whitespace.
    local getnum = C(digits) / tonumber * optws
    -- A block opens with "[" on its own line and closes with "]".
    local start  = lbrak * optws * eol
    local stop   = optws * rbrak
    -- A data row: "<offset>:" followed by exactly eight integers.
    local line   = optws * digits * colon * optws
                 * getnum * getnum * getnum * getnum
                 * getnum * getnum * getnum * getnum
                 * eol
    -- Optional trailer; its number is appended to the array.
    local count  = optws * P"Total count:" * optws * getnum * eol
    local inner  = Ct(line^1 * count^-1)
    --local inner = Ct(line^1 * Cg(count, "count")^-1)
    local array  = start * inner * stop
    -- Collect every block; skip one character wherever no block starts.
    local extract = Ct((array + 1)^0)

    data_to_arrays = function (data)
        return lpegmatch (extract, data)
    end
end

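This only works if every line of a data block contains exactly eight integers.
Depending on how well-formed your input is, that may be a curse or a blessing ;-)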
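If the rows do not always hold exactly eight integers, the row rule can be relaxed to accept one or more numbers. The following is only a sketch of that idea, not part of the original answer: the names extract_any_width and sample are made up for illustration, and the short sample string is invented test input rather than real data.

```lua
local lpeg = require "lpeg"
local C, Ct, P, R, S = lpeg.C, lpeg.Ct, lpeg.P, lpeg.R, lpeg.S

local digits = R"09"^1
local eol    = P"\r\n" + P"\n\r" + P"\n" + P"\r"
local optws  = S" \t\v"^0
local getnum = C(digits) / tonumber * optws

-- A data row: "<offset>:" followed by ONE OR MORE integers (getnum^1)
-- instead of exactly eight.
local line  = optws * digits * P":" * optws * getnum^1 * eol
local count = optws * P"Total count:" * optws * getnum * eol
local inner = Ct(line^1 * count^-1)
local array = P"[" * optws * eol * inner * optws * P"]"

-- Hypothetical name: collects one table per block found in the input.
local extract_any_width = Ct((array + 1)^0)

-- Invented sample: rows of three and two integers plus a total.
local sample = "[\ntext\n[\n0: 1 2 3\n8: 4 5\nTotal count: 15\n]\ntext\n]\n"
local result = lpeg.match(extract_any_width, sample)
print(#result, table.concat(result[1], " "))
```

The trade-off is the one hinted at above: the relaxed rule tolerates ragged rows, but it no longer rejects a block whose rows have been truncated or corrupted.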
And a test file:
data = [[
some text
[
some text
[
0: 0 0 0 0 0 0 0 0
8: 0 0 0 0 0 0 0 0
16: 0 0 0 9 343 3938 9433 8756
24: 6270 4472 3182 2503 1768 1140 836 496
32: 326 273 349 269 144 121 94 82
40: 64 80 66 59 56 47 50 46
48: 64 35 42 53 42 40 41 34
56: 35 41 39 39 47 30 30 39
Total count: 12345
]
some text
]
some text
[
some text
[
0: 0 0 0 0 0 0 0 0
8: 0 0 0 0 0 0 0 0
16: 0 0 0 4 212 3079 8890 8941
24: 6177 4359 3625 2420 1639 974 594 438
32: 323 286 318 296 206 132 96 85
40: 65 73 62 53 47 55 49 52
48: 29 44 44 41 43 36 50 36
56: 40 30 29 40 35 30 25 31
64: 47 31 25 29 24 30 35 31
72: 28 31 17 37 35 30 20 33
80: 28 20 37 25 21 23 25 36
88: 27 35 22 23 15 24 34 28
]
some text
some text
]
]]

local arrays = data_to_arrays (data)

for n = 1, #arrays do
    local ar = arrays[n]
    local size = #ar
    io.write (string.format ("[%d] = { --[[size: %d items]]\n ", n, size))
    for i = 1, size do
        io.write (string.format ("%d,%s", ar[i], (i % 5 == 0) and "\n " or " "))
    end
    if ar.count ~= nil then
        io.write (string.format ("\n [\"count\"] = %d,", ar.count))
    end
    io.write (string.format ("\n}\n"))
end
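
The commented-out inner line in the grammar and the ar.count check in the loop above point at a variant where the total is stored under a named field instead of being appended to the array. A minimal sketch of that variant follows. It assumes the same block layout as the answer; the names extract_named, sample and blocks are hypothetical, and the one-row sample is invented.

```lua
local lpeg = require "lpeg"
local C, Cg, Ct, P, R, S = lpeg.C, lpeg.Cg, lpeg.Ct, lpeg.P, lpeg.R, lpeg.S

local digits = R"09"^1
local eol    = P"\r\n" + P"\n\r" + P"\n" + P"\r"
local optws  = S" \t\v"^0
local getnum = C(digits) / tonumber * optws

-- Same fixed-width row as in the answer: exactly eight integers.
local line  = optws * digits * P":" * optws
            * getnum * getnum * getnum * getnum
            * getnum * getnum * getnum * getnum
            * eol
local count = optws * P"Total count:" * optws * getnum * eol

-- Cg(count, "count") makes Ct store the total as field "count"
-- rather than as the last positional element.
local inner = Ct(line^1 * Cg(count, "count")^-1)
local array = P"[" * optws * eol * inner * optws * P"]"
local extract_named = Ct((array + 1)^0)

-- Invented sample: one row of eight integers and a total.
local sample = "[\n0: 1 2 3 4 5 6 7 8\nTotal count: 36\n]\n"
local blocks = lpeg.match(extract_named, sample)
print(#blocks[1], blocks[1].count)
```

With this variant the positional part of each table holds only the row values, so its length is always a multiple of eight, and a block without a "Total count:" line simply has no count field.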

This post is based on a similar question found on Stack Overflow about parsing out multiple lines with LPeg in Lua: https://stackoverflow.com/questions/19410454/
