
python - Keeping the split character

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 00:51:13

I have the following data:

<http://dbpedia.org/data/Plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerInfo> _:header16125770191335188966549 <http://dbpedia.org/data/Plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#responseCode> "200"^^<http://www.w3.org/2001/XMLSchema#integer> <http://dbpedia.org/data/Plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#date> "Mon, 23 Apr 2012 13:49:27 GMT" <http://dbpedia.org/data/Plasmodium_hegneri.xml> .
_:header16125770191335188966549 <http://www.w3.org/2006/http#content-type> "application/rdf+xml; charset=UTF-8" <http://dbpedia.org/data/Plasmodium_hegneri.xml> .

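Now I want to convert this data into the following form, so that the last string enclosed in < > on each line is moved to a new line placed before it, prefixed with #@.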

#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
<http://dbpedia.org/data/Plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerInfo> _:header16125770191335188966549 .
#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#responseCode> "200"^^<http://www.w3.org/2001/XMLSchema#integer> .
#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#date> "Mon, 23 Apr 2012 13:49:27 GMT" .
#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#content-type> "application/rdf+xml; charset=UTF-8" .

To achieve this, I wrote the following Python code:

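infile = open('testnq.nq', 'r')
outfile = open('outFile.ttl', 'w')
while True:
    inFileLine1 = infile.readline()
    if not inFileLine1:
        break  # EOF
    # split(' ') discards the spaces between tokens
    splitString = inFileLine1.split(' ')
    # the second-to-last token is the final <...> term (the last token is ".")
    line1 = "#@ " + splitString[len(splitString)-2]
    outfile.write(line1)
    line2 = ""
    for num in range(0, len(splitString)-2):
        line2 = line2 + splitString[num]  # tokens are joined without any separator
    outfile.write(line2)

infile.close()
outfile.close()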

But I cannot get the spaces in the required positions. Can someone suggest how to do this in Python or with Linux commands?

Best answer

At the risk of using a regular expression and overcomplicating things, this might work:

import re

# Move the last <...> term onto its own "#@" line; \s consumes the space before it.
pattern = r'^(?P<before>.*)\s(?P<match><[^>]+>)(?P<after>[^<]*)$'
replacement = r'#@ \g<match>\n\g<before>\g<after>'

line = """<http://dbpedia.org/data/Plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerInfo> _:header16125770191335188966549 <http://dbpedia.org/data/Plasmodium_hegneri.xml> ."""
print(re.sub(pattern, replacement, line))

line = """_:header16125770191335188966549 <http://www.w3.org/2006/http#responseCode> "200"^^<http://www.w3.org/2001/XMLSchema#integer> <http://dbpedia.org/data/Plasmodium_hegneri.xml> ."""
print(re.sub(pattern, replacement, line))

Which outputs:

#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
<http://dbpedia.org/data/Plasmodium_hegneri.xml> <http://code.google.com/p/ldspider/ns#headerInfo> _:header16125770191335188966549 .
#@ <http://dbpedia.org/data/Plasmodium_hegneri.xml>
_:header16125770191335188966549 <http://www.w3.org/2006/http#responseCode> "200"^^<http://www.w3.org/2001/XMLSchema#integer> .
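
As an alternative to the regular expression, here is a minimal sketch that avoids regex by splitting only at the last space. It assumes every data line ends with " ." and that the final term before it contains no spaces, as in the sample data. The function names move_graph_to_comment and convert are hypothetical, chosen for illustration only.

```python
def move_graph_to_comment(line):
    """Return '#@ <last term>' followed by the rest of the statement.

    Assumption: the line looks like '<s> <p> <o> <g> .' with single
    spaces between terms. Lines that do not fit are returned unchanged.
    """
    body = line.rstrip()
    if not body.endswith(" ."):
        return line  # blank or unexpected line: leave it as it is
    body = body[:-2]  # drop the terminating " ."
    # rpartition splits only at the LAST space, so all other spaces
    # (including those inside quoted literals) are preserved.
    triple, sep, graph = body.rpartition(" ")
    if not sep:
        return line  # only one term on the line: nothing to move
    return "#@ " + graph + "\n" + triple + " .\n"


def convert(lines):
    """Apply move_graph_to_comment to every line of an iterable."""
    return "".join(move_graph_to_comment(line) for line in lines)
```

To process a whole file, the result of convert(infile) could be written to the output file, where infile is the opened testnq.nq from the question. Because only the last space is used as the split point, literals such as "Mon, 23 Apr 2012 13:49:27 GMT" keep their internal spacing.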

Regarding "python - Keeping the split character", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/31283355/
