
python - Returned output not making it into xlsxwriter

Reposted. Author: 行者123. Updated: 2023-12-01 09:15:33

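I am trying to collect data from a text file. When I print the outputs, they show the correct values I am looking for. However, when I try to put these outputs into a table using xlsxwriter, the table contains only the output from the last line of the txt file, repeated as many times as there are lines in the text file. That is, with 5000 lines of text, from each of which I need 3 pieces of information, the .xlsx file has 5000 rows and 3 columns, but every row contains the information from the last line of the text file.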

EC:1 > GO:N-ethylmaleimide reductase activity ; GO:0008748

EC:1 > GO:oxidoreductase activity ; GO:0016491

EC:1 > GO:reduced coenzyme F420 dehydrogenase activity ; GO:0043738

EC:1 > GO:sulfur oxygenase reductase activity ; GO:0043826

EC:1 > GO:malolactic enzyme activity ; GO:0043883

^ what the txt file looks like

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

6.6.1.2    cobaltochelatase activity    0051116

…………

(what the table looks like, except that it has 5000 rows)

Any help would be much appreciated. Regards.

import xlsxwriter

File = 'EC_to_GO.txt'
def analysis(line, output):
    with open(File) as fp:
        lines = fp.readlines()

    for line in lines:
        output[0] = line[3:].split(' > ')[0]
        output[1] = line[:-14].split(' > GO:')[-1]
        output[2] = line[-8:]
    return output


with open(File) as fp:
    lines = fp.readlines()

for line in lines:
    if 'Generated on 2018-07-04T09:08Z' in line:
        a = lines.index(line)

for line in lines:
    if 'GO:cobaltochelatase activity ; GO:0051116' in line:
        b = lines.index(line)

req_list = lines[a:b]

rxn_end_index = []

for i in range(len(req_list)):
    if '> GO:' in req_list[i]:
        rxn_end_index.append(i)

inner_list = []

outer_list =[]

spare = [0] + rxn_end_index

for i in range(len(spare)-1):
    inner_list = req_list[spare[i]:spare[i+1]]
    outer_list.append(inner_list)



res_list=[]
for i in range(len(outer_list)):
    res_list.append(analysis(outer_list[i],['NA','NA','NA']))




# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('EC_to_GO.xlsx')
worksheet = workbook.add_worksheet('EC_to_GO')

#res_list1 = [EC, Genome name, GO]

#for i in res_list:
#res_list1.append(i)

# Some data we want to write to the worksheet.
t = tuple(res_list)

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for a,b,c in (t):
    worksheet.write(row, col, a)
    worksheet.write(row, col + 1, b)
    worksheet.write(row, col + 2, c)
    row += 1


workbook.close()

Best answer

You are basically appending the same list to res_list each time, so you end up with multiple references to one and the same output list.
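The aliasing effect described here can be reproduced in isolation. The following is a minimal sketch with made-up records (not the asker's file); the names `records`, `shared`, `aliased` and `copied` are hypothetical and exist only for illustration. It reuses one list object across iterations and appends it both directly and as a slice copy:

```python
# Hypothetical records, used only to illustrate the effect.
records = [("1", "a", "x"), ("2", "b", "y")]

shared = ['NA', 'NA', 'NA']   # one list object reused on every iteration
aliased = []
copied = []

for ec, name, go_id in records:
    shared[0], shared[1], shared[2] = ec, name, go_id
    aliased.append(shared)     # appends a reference to the same object
    copied.append(shared[:])   # appends a snapshot of the current values

print(aliased)  # every entry shows the values written last
print(copied)   # each entry keeps the values it had when appended
```

Every element of `aliased` is the same object, so later writes show up in all of them, while `copied` holds independent lists.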

Fix: instead of

res_list.append(analysis(outer_list[i],['NA','NA','NA']))

#And in the previous loop
for i in range(len(spare)-1):
    inner_list = req_list[spare[i]:spare[i+1]]
    outer_list.append(inner_list)

change it to:

res_list.append(analysis(outer_list[i],['NA','NA','NA'])[:])
for i in range(len(spare)-1):
    inner_list = req_list[spare[i]:spare[i+1]]
    outer_list.append(inner_list[:])

or

from copy import copy  # copy() is not a builtin; it must be imported

res_list.append(copy(analysis(outer_list[i],['NA','NA','NA'])))
for i in range(len(spare)-1):
    inner_list = req_list[spare[i]:spare[i+1]]
    outer_list.append(copy(inner_list))

The notation list[:] creates a (shallow) copy of the list. Technically, you are taking a slice that spans the whole list.
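Separately from copying, it may be worth checking `analysis` itself: as posted, it appears to reopen `EC_to_GO.txt` and loop over every line on each call, which would leave `output` holding the values from the file's last line no matter which argument is passed. A minimal sketch of a per-line parser that returns a fresh list on every call, assuming each record has the form `EC:<number> > GO:<name> ; GO:<id>` as in the sample above; `parse_line` and `sample` are hypothetical names, not part of the original code:

```python
def parse_line(line):
    """Split 'EC:<ec> > GO:<name> ; GO:<id>' into [ec, name, go_id]."""
    # Everything before ' > GO:' is the EC part, everything after is name + id.
    left, _, right = line.strip().partition(' > GO:')
    # The GO id follows the last ' ; GO:' separator.
    name, _, go_id = right.rpartition(' ; GO:')
    # Drop the 'EC:' prefix if present.
    ec = left[len('EC:'):] if left.startswith('EC:') else left
    return [ec, name, go_id]


# Sample lines modeled on the question's data (assumed format).
sample = [
    "EC:1 > GO:oxidoreductase activity ; GO:0016491\n",
    "EC:6.6.1.2 > GO:cobaltochelatase activity ; GO:0051116\n",
]

rows = [parse_line(line) for line in sample]
for row in rows:
    print(row)
```

Because each call builds a new list from the line it is given, `rows` could be fed to the existing `worksheet.write` loop without any copying.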

Regarding "python - Returned output not making it into xlsxwriter", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51308602/
