
java - nio error when parsing an XML file

Reposted. Author: 太空宇宙. Updated: 2023-11-03 19:30:44

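I have a function in Jython that uses Popen to run another program. That program writes an XML file to its stdout, which is redirected to a file. Once the process finishes, I close the file and call another function to parse it. During parsing I get a bunch of error messages about accessing a closed file and/or malformed XML files (the files look fine when I inspect them). I thought output.close() might be returning before the file was actually closed, so I added a loop that waits for output.closed to be true. That seemed to work at first, but then my program printed the following: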

blasting  
blasted
parsing
parsed
Extending genes found via genemark, 10.00% done
blasting
blasted
parsing
Exception in thread "_CouplerThread-7 (stdout)" Traceback (most recent call last):
File "/Users/mbsulli/jython/Lib/subprocess.py", line 675, in run
self.write_func(buf)
IOError: java.nio.channels.AsynchronousCloseException
[Fatal Error] 17_2_corr.blastp.xml:15902:63: XML document structures must start and end within the same entity.
Retry
blasting
blasted
parsing
Exception in thread "_CouplerThread-9 (stdout)" Traceback (most recent call last):
File "/Users/mbsulli/jython/Lib/subprocess.py", line 675, in run
self.write_func(buf)
IOError: java.nio.channels.ClosedChannelException
[Fatal Error] 17_2_corr.blastp.xml:15890:30: XML document structures must start and end within the same entity.
Retry
blasting

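I'm not sure what my options are. Am I right in thinking that the XML has not been fully written before I parse it? If so, how can I make sure that it has been?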

def parseBlast(fileName):
    """
    A function for parsing XML blast output.
    """
    print "parsing"
    reader = XMLReaderFactory.createXMLReader()
    reader.entityResolver = reader.contentHandler = BlastHandler()
    reader.parse(fileName)
    print "parsed"

    return dict(map(lambda iteration: (iteration.query, iteration), reader.getContentHandler().iterations))

def cachedBlast(fileName, blastLocation, database, eValue, query, pipeline, remote = False, force = False):
    """
    Performs a blast search using the blastp executable and database in blastLocation on
    the query with the eValue. The result is an XML file saved to fileName. If fileName
    already exists the search is skipped. If remote is true then the search is done remotely.
    """
    if not os.path.isfile(fileName) or force:
        output = open(fileName, "w")
        command = [blastLocation + "/bin/blastp",
                   "-evalue", str(eValue),
                   "-outfmt", "5",
                   "-query", query]
        if remote:
            command += ["-remote",
                        "-db", database]
        else:
            command += ["-num_threads", str(Runtime.getRuntime().availableProcessors()),
                        "-db", database]
        print "blasting"
        blastProcess = subprocess.Popen(command,
                                        stdout = output)
        while blastProcess.poll() is None:
            if pipeline.exception:
                print "Stopping in blast"
                blastProcess.kill()
                output.close()
                raise pipeline.exception
        output.close()
        while not output.closed:
            pass
        print "blasted"
    try:
        return parseBlast(fileName)
    except SAXParseException:
        print 'Retry'
        return cachedBlast(fileName, blastLocation, database, eValue, query, pipeline, remote, True)
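
The "[Fatal Error] ... must start and end within the same entity" lines in the log are what a SAX parser typically reports for a document that ends before its root element is closed. As a minimal sketch of how to check a file for that condition, the snippet below uses the standard-library xml.sax module rather than the Java XMLReaderFactory used above. The name is_well_formed is hypothetical, and the snippet assumes a standard Python interpreter; it has not been tried on Jython.

```python
import xml.sax


def is_well_formed(path):
    """Return True if the file at `path` parses as well-formed XML."""
    try:
        # A bare ContentHandler ignores every event; only parse errors matter.
        xml.sax.parse(path, xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        # Raised, for example, when the file stops before the root element closes.
        return False
```

A check like this only tells you whether the file was complete at the moment it was read. It does not by itself prevent the file from being closed or parsed too early.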

Best answer

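I think this problem started when I switched from calling wait on the subprocess to using the poll method, so that I could stop the process while it was running. It is hard to say for sure: I already have cached results for many of the data sets I use, so it takes a while before a subprocess is launched again. Either way, my guess is that output was still being written when I closed the file, and my solution was to switch to a pipe and write the file myself.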

def cachedBlast(fileName, blastLocation, database, eValue, query, pipeline, remote = False, force = False):
    """
    Performs a blast search using the blastp executable and database in blastLocation on
    the query with the eValue. The result is an XML file saved to fileName. If fileName
    already exists the search is skipped. If remote is true then the search is done remotely.
    """
    if not os.path.isfile(fileName) or force:
        output = open(fileName, "w")
        command = [blastLocation + "/bin/blastp",
                   "-evalue", str(eValue),
                   "-outfmt", "5",
                   "-query", query]
        if remote:
            command += ["-remote",
                        "-db", database]
        else:
            command += ["-num_threads", str(Runtime.getRuntime().availableProcessors()),
                        "-db", database]
        # Read blastp's stdout through a pipe and write the file ourselves.
        blastProcess = subprocess.Popen(command,
                                        stdout = subprocess.PIPE)
        while blastProcess.poll() is None:
            # read() with no size argument blocks until the pipe reaches end of file.
            output.write(blastProcess.stdout.read())
            if pipeline.exception:
                # Find every process whose command line matches and kill it.
                psProcess = subprocess.Popen(["ps", "aux"], stdout = subprocess.PIPE)
                awkProcess = subprocess.Popen(["awk", "/" + " ".join(command).replace("/", "\\/") + "/"], stdin = psProcess.stdout, stdout = subprocess.PIPE)
                for line in awkProcess.stdout:
                    # The second column of "ps aux" output is the PID.
                    subprocess.Popen(["kill", "-9", re.split(r"\s+", line)[1]])
                output.close()
                raise pipeline.exception
        # Drain anything still left in the pipe before closing the file.
        remaining = blastProcess.stdout.read()
        while remaining:
            output.write(remaining)
            remaining = blastProcess.stdout.read()

        output.close()

    try:
        return parseBlast(fileName)
    except SAXParseException:
        # The file could not be parsed, so force the search to run again.
        return cachedBlast(fileName, blastLocation, database, eValue, query, pipeline, remote, True)
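
Since the answer traces the errors to the switch from wait to poll, one variation worth trying is to keep the poll loop for cancellation and still call wait() once the loop ends, before closing the file. The sketch below illustrates that idea with the standard subprocess module. run_to_file, should_stop, and poll_interval are hypothetical names, with should_stop standing in for the pipeline.exception check. Whether the final wait() is enough to avoid the exceptions on Jython is an assumption that has not been tested here.

```python
import subprocess
import time


def run_to_file(command, file_name, should_stop=lambda: False, poll_interval=0.1):
    """Run `command` with stdout sent to `file_name`; return its exit code.

    `should_stop` is a hypothetical callback standing in for the
    `pipeline.exception` check in the code above.
    """
    output = open(file_name, "wb")
    try:
        process = subprocess.Popen(command, stdout=output)
        while process.poll() is None:
            if should_stop():
                process.kill()
                break
            time.sleep(poll_interval)  # avoid spinning while the child runs
        # Block until the child has fully exited before the file is closed.
        return process.wait()
    finally:
        output.close()
```

A caller would check the returned exit code and only parse the file when it is zero. A nonzero code, including the one produced after a kill, means the file may be incomplete.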

Regarding "java - nio error when parsing an XML file", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/5998947/
