python - A more efficient way to perform this search algorithm?

Reposted. Author: 塔克拉玛干. Updated: 2023-11-03 05:42:42

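I am wondering whether there is a better way to implement this algorithm. I need to do this kind of operation often, and my current approach takes hours, because I believe it counts as an n^2 algorithm. The code is attached below.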

import csv

# Load both files fully into memory as lists of rows.
with open("location1", 'r', newline='') as main:
    csvMain = csv.reader(main)
    mainList = list(csvMain)

with open("location2", 'r', newline='') as anno:
    csvAnno = csv.reader(anno)
    annoList = list(csvAnno)

output = []

# Compare every row of the first file against every row of the second.
for full in mainList:
    geneName = full[2].lower()
    for annot in annoList:
        if geneName == annot[2].lower():
            # Build a fresh row per match; a single shared tempList would
            # keep growing, and "del i" in a loop does not empty a list.
            tempList = list(full)
            tempList.extend(annot[3:7])  # annotation columns 3 through 6
            output.append(tempList)

with open("location3", 'w', newline='') as final:
    a = csv.writer(final, delimiter=',')
    a.writerows(output)

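I have two CSV files, each containing 15,000 strings. I want to compare one column from each file and, where the values match, append the end of the row from the second CSV to the end of the row from the first. Any help would be greatly appreciated!

Thanks!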

Best answer

This should be considerably more efficient:

import csv
from collections import defaultdict

with open("location1", 'r', newline='') as main:
    csvMain = csv.reader(main)
    mainList = list(csvMain)

with open("location2", 'r', newline='') as anno:
    csvAnno = csv.reader(anno)
    annoList = list(csvAnno)

output = []
annoMap = defaultdict(list)

# First pass: index the annotation rows by the lower-cased column of interest.
for annot in annoList:
    tempList = annot[3:]  # adapt this to the needed columns, e.g. annot[3:7]
    annoMap[annot[2].lower()].append(tempList)

# Second pass: one dictionary lookup per main row instead of an inner loop.
for full in mainList:
    geneName = full[2].lower()
    if geneName in annoMap:  # check whether a matching key exists
        for extra in annoMap[geneName]:
            # Emit the main row followed by the matching annotation columns.
            output.append(full + extra)

with open("location3", 'w', newline='') as final:
    a = csv.writer(final, delimiter=',')
    a.writerows(output)

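It is more efficient because you only need to traverse each list once. A dictionary lookup takes O(1) time on average, so you essentially end up with a linear algorithm: about n + m steps for n main rows and m annotation rows, rather than the n * m comparisons of the nested loops.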
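To see the two passes work without touching any files, here is a minimal sketch that runs the same dictionary-based join on small in-memory CSV strings. The function name `join_on_key`, its parameters `key_col` and `first_extra_col`, and the sample rows are all made up for illustration; they are not from the question, so adjust the column indexes to your own data.

```python
import csv
import io
from collections import defaultdict


def join_on_key(main_rows, anno_rows, key_col=2, first_extra_col=3):
    """Append annotation columns to every main row with a matching key."""
    # Pass 1: index the annotation rows by lower-cased key (one pass over m rows).
    anno_map = defaultdict(list)
    for annot in anno_rows:
        anno_map[annot[key_col].lower()].append(annot[first_extra_col:])

    # Pass 2: one dictionary lookup per main row (one pass over n rows).
    joined = []
    for full in main_rows:
        # .get() avoids inserting empty entries for keys that do not match.
        for extra in anno_map.get(full[key_col].lower(), []):
            joined.append(full + extra)
    return joined


# Hypothetical sample data standing in for the two CSV files.
main_csv = "1,chr1,BRCA1\n2,chr2,TP53\n3,chr3,EGFR\n"
anno_csv = "a,b,brca1,x1,x2\nc,d,tp53,y1,y2\ne,f,TP53,z1,z2\n"

main_rows = list(csv.reader(io.StringIO(main_csv)))
anno_rows = list(csv.reader(io.StringIO(anno_csv)))

result = join_on_key(main_rows, anno_rows)
for row in result:
    print(row)
```

The key that appears twice in the annotation data produces two output rows, which matches what the nested loops in the question would emit, and the main row with no matching annotation is dropped.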

For "python - A more efficient way to perform this search algorithm?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42256304/
