
python - Looping over multiple patterns in a Python regular expression

Reposted. Author: 行者123. Last updated: 2023-12-01 05:50:09

Hello, I have an input file in the following format.

.....
......

<TABLE COLS="3">
<ROW>
<R>data</R>
<R>data</R>
</ROW>
<ROW>
<R>data</R>
<R>data</R>
<R>data</R>
</ROW>
</TABLE>
<TABLE COLS="4">
<ROW>
<R>data</R>
<R>data</R>
<R>data</R>
<R>data</R>
<R>data</R>
</ROW>
<ROW>
<R>data</R>
<R>data</R>
</ROW>
</TABLE>
.......
.....
.
...

The output file should be:

....
....
.
..

<table ct="3">
<ent="1">
<ent="2">
<ent="3">

<row>
<rvn ="1">data</rvn>
<rvn ="2">data</rvn>
</row>
<row>
<rvn ="1">data</rvn>
<rvn ="2">data</rvn>
<rvn ="3">data</rvn>
</row>
</table>
<table ct="4">
<ent="1">
<ent="2">
<ent="3">
<ent="4">
<row>
<rvn ="1">data</rvn>
<rvn ="2">data</rvn>
<rvn ="3">data</rvn>
<rvn ="4">data</rvn>
<rvn ="5">data</rvn>
</row>
<row>
<rvn ="1">data</rvn>
<rvn ="2">data</rvn>
</row>
</table>
...
...
...

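I wrote the code below. When I run it, every table's column value is replaced by the last table's column value. I am also having trouble incrementing the <rvn> value. Can anyone help me solve this?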

import re

def tblcnv( st, val ):
    Tcolspec = ''
    Endval = int(val) + 1
    for i in range(1, Endval):
        l = str(i)
        Tcolspec += "<colspec col='" + l + "' colwidth=''/>\n"
    Theader = re.sub(r"(?i)<table.*?>","<table ct='" + val +"'>\n" + Tcolspec + "\n", st)
    return Theader

in_data = open("in.txt", "r")
out_data = open("out.txt", "w")
Rdata = in_data.read()
Rval = Rdata.replace("\n", " ")

Rval = re.sub("(?i)(<TABLE.*cols=\"(\d+).*?</TABLE>)", lambda m: tblcnv(m.group(1), m.group(2)), Rval)
out_data.write(Rval)

Best answer

Here is your working code...

Note: you should not use regular expressions for this... parsing is always the better approach...

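import re
from itertools import count

def datacnv(st, counter):
    # Wrap one cell and label it with its 1-based position in the current row.
    return '<rvn="%d">%s</rvn>\n' % (next(counter), st)

def rowcnv(st):
    # A fresh counter per row restarts the numbering at 1; count() has no
    # upper bound, so rows with ten or more cells are numbered correctly too.
    counter = count(1)
    st = re.sub(r"(?is)\s*<R>(.*?)</R>\s*", lambda m: datacnv(m.group(1), counter), st)
    return "<row>\n" + st + "</row>\n"

def tblcnv(st, val):
    # One <colspec> line per column declared in the table's COLS attribute.
    Tcolspec = ''
    for i in range(1, int(val) + 1):
        Tcolspec += "<colspec col='" + str(i) + "' colwidth=''/>\n"
    # Rewrite the opening tag only (count=1).
    Theader = re.sub(r"(?is)<table.*?>\s*", "<table ct='" + val + "'>\n" + Tcolspec + "\n", st, count=1)
    Theader = re.sub(r"(?is)<ROW>(.*?)</ROW>\s*", lambda m: rowcnv(m.group(1)), Theader)
    # Lowercase the closing tag explicitly instead of lowercasing the whole
    # file, which would also alter the cell data.
    Theader = re.sub(r"(?i)</TABLE>", "</table>", Theader)
    return Theader

with open("in.txt", "r") as in_data:
    Rdata = in_data.read()

# (?s) lets "." match newlines, so the file no longer has to be flattened onto one line.
# The lazy ".*?" before cols= stops at the first table's attribute rather than the last one's.
Rval = re.sub(r'(?is)(<TABLE.*?cols="(\d+).*?</TABLE>)', lambda m: tblcnv(m.group(1), m.group(2)), Rdata)

with open("out.txt", "w") as out_data:
    out_data.write(Rval)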

Output

<table ct='3'>
<colspec col='1' colwidth=''/>
<colspec col='2' colwidth=''/>
<colspec col='3' colwidth=''/>

<row>
<rvn="1">data</rvn>
<rvn="2">data</rvn>
</row>
<row>
<rvn="1">data</rvn>
<rvn="2">data</rvn>
<rvn="3">data</rvn>
</row>
</table>
<table ct='4'>
<colspec col='1' colwidth=''/>
<colspec col='2' colwidth=''/>
<colspec col='3' colwidth=''/>
<colspec col='4' colwidth=''/>

<row>
<rvn="1">data</rvn>
<rvn="2">data</rvn>
<rvn="3">data</rvn>
<rvn="4">data</rvn>
<rvn="5">data</rvn>
</row>
<row>
<rvn="1">data</rvn>
<rvn="2">data</rvn>
</row>
</table>
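
The answer does not spell out why the question's code gave every table the last table's column count. The likely cause is the greedy `.*` in front of `cols=` in the question's pattern: it runs to the end of the text and backtracks to the last `cols=` it can find, so a single match swallows all tables. The snippet below is a minimal sketch of that effect; the shortened `text` sample is an assumption made up for illustration, not the real input file.

```python
import re

# Two tables flattened onto one line, as the question's code does with replace().
text = '<TABLE COLS="3"><ROW/></TABLE> <TABLE COLS="4"><ROW/></TABLE>'

# Greedy ".*": one match spanning both tables, capturing the LAST cols value.
greedy = re.findall(r'(?i)<TABLE.*cols="(\d+).*?</TABLE>', text)

# Lazy ".*?": one match per table, each capturing its own cols value.
lazy = re.findall(r'(?i)<TABLE.*?cols="(\d+).*?</TABLE>', text)

print(greedy)  # ['4']
print(lazy)    # ['3', '4']
```

With the greedy form, the callback runs once with the value 4 and then rewrites every `<table ...>` tag inside that single oversized match, which matches the symptom described in the question.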

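To illustrate the note that parsing is the better approach, here is a rough sketch using the standard library's `xml.etree.ElementTree`. It rests on several assumptions: each `<TABLE>...</TABLE>` fragment has already been isolated and is well-formed XML, the tag and attribute names are uppercase exactly as in the sample, and cells hold plain text only. The helper name `convert_table` is hypothetical. Because `<rvn="1">` is not valid XML, the output is assembled as strings rather than serialized by the library.

```python
import xml.etree.ElementTree as ET

def convert_table(fragment):
    """Convert one <TABLE>...</TABLE> fragment to the target layout (sketch)."""
    table = ET.fromstring(fragment)        # raises ParseError on malformed input
    cols = int(table.get("COLS"))          # attribute lookup is case-sensitive
    lines = ["<table ct='%d'>" % cols]
    lines += ["<colspec col='%d' colwidth=''/>" % i for i in range(1, cols + 1)]
    lines.append("")
    for row in table.findall("ROW"):
        lines.append("<row>")
        # enumerate() restarts at 1 for every row, so no shared counter is needed.
        for n, cell in enumerate(row.findall("R"), start=1):
            lines.append('<rvn="%d">%s</rvn>' % (n, cell.text or ""))
        lines.append("</row>")
    lines.append("</table>")
    return "\n".join(lines)

sample = """<TABLE COLS="3">
<ROW>
<R>data</R>
<R>data</R>
</ROW>
</TABLE>"""
print(convert_table(sample))
```

The free text surrounding the tables in the real file is not XML, so locating each fragment would still need a separate step that this sketch leaves out.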
Regarding "python - Looping over multiple patterns in a Python regular expression", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14584767/
