python - How do I fix this multithreaded Python script?

Reposted. Author: 行者123. Last updated: 2023-12-01 06:16:27

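I am writing a Python script that reads a list of domains, looks up the rating that McAfee's SiteAdvisor service gives each one, and then writes the domains and results to a CSV file.

My script is based on this previous answer. It uses urllib to scrape the SiteAdvisor page for the domain in question (I know this is not the best approach, but SiteAdvisor does not offer an alternative). Unfortunately, it fails to produce any results. I always get this error: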

    Traceback (most recent call last):
      File "multi.py", line 55, in <module>
        main()
      File "multi.py", line 44, in main
        resolver_thread.start()
      File "/usr/lib/python2.6/threading.py", line 474, in start
        _start_new_thread(self.__bootstrap, ())
    thread.error: can't start new thread

Here is my script:

    import threading
    import urllib

    class Resolver(threading.Thread):
        def __init__(self, address, result_dict):
            threading.Thread.__init__(self)
            self.address = address
            self.result_dict = result_dict

        def run(self):
            try:
                content = urllib.urlopen("http://www.siteadvisor.com/sites/" + self.address).read(12000)
                search1 = content.find("didn't find any significant problems.")
                search2 = content.find('yellow')
                search3 = content.find('web reputation analysis found potential security')
                search4 = content.find("don't have the results yet.")

                if search1 != -1:
                    result = "safe"
                elif search2 != -1:
                    result = "caution"
                elif search3 != -1:
                    result = "warning"
                elif search4 != -1:
                    result = "unknown"
                else:
                    result = ""

                self.result_dict[self.address] = result

            except:
                pass


    def main():
        infile = open("domainslist", "r")
        intext = infile.readlines()
        threads = []
        results = {}
        for address in [address.strip() for address in intext if address.strip()]:
            resolver_thread = Resolver(address, results)
            threads.append(resolver_thread)
            resolver_thread.start()

        for thread in threads:
            thread.join()

        outfile = open('final.csv', 'w')
        outfile.write("\n".join("%s,%s" % (address, ip) for address, ip in results.iteritems()))
        outfile.close()

    if __name__ == '__main__':
        main()

Any help would be greatly appreciated.

Best answer

It looks like you are trying to start too many threads.

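You can check how many items are in the list [address.strip() for address in intext if address.strip()]. I suspect that is the problem. Basically, the resources available for starting new threads are limited, and the script starts one thread per domain without waiting for any of them to finish, so a long enough list exhausts those resources and start() fails with "can't start new thread".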
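
To make that check concrete, here is a minimal sketch that counts the non-blank entries, which is the number of threads the script would try to start. It assumes Python 3, and `count_addresses` is a hypothetical helper name that does not appear in the original script. The sample lines stand in for the contents of the `domainslist` file.

```python
def count_addresses(lines):
    """Return the number of non-blank entries, one thread per entry in the original script."""
    return sum(1 for line in lines if line.strip())


# Sample data standing in for infile.readlines(); blank lines are skipped.
sample_lines = ["example.com\n", "\n", "example.org\n", "   \n", "example.net\n"]
thread_count = count_addresses(sample_lines)
print(thread_count)
```

If this number runs into the thousands, the thread limit is a plausible cause of the error.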

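The solution is to split the list into chunks of 20 elements, process each chunk (in 20 threads), wait for those threads to finish their work, and then pick up the next chunk. Repeat this until every element in the list has been processed.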
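
The chunking approach could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in fix: it targets Python 3 rather than the Python 2.6 of the question, `run_in_batches` and `fake_lookup` are hypothetical names, and `fake_lookup` stands in for the SiteAdvisor request so the example runs without network access. Unlike the original script, it records "error" for a failed lookup instead of silently dropping it.

```python
import threading


def run_in_batches(addresses, worker, batch_size=20):
    """Call worker(address) for every address, with at most batch_size threads alive at once."""
    results = {}
    lock = threading.Lock()

    def task(address):
        try:
            value = worker(address)
        except Exception:
            value = "error"  # assumption: keep failures visible in the output
        with lock:
            results[address] = value

    for start in range(0, len(addresses), batch_size):
        batch = addresses[start:start + batch_size]
        threads = [threading.Thread(target=task, args=(address,)) for address in batch]
        for thread in threads:
            thread.start()
        # Wait for the whole batch to finish before starting the next one.
        for thread in threads:
            thread.join()
    return results


def fake_lookup(address):
    """Hypothetical stand-in for fetching and classifying a SiteAdvisor page."""
    return "safe" if address.endswith(".com") else "unknown"


demo = run_in_batches(["a.com", "b.org", "c.com"], fake_lookup, batch_size=2)
```

A drawback of fixed batches is that one slow request holds up the rest of its batch, which is one reason a thread pool tends to make better use of the threads.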

You could also use a thread pool to manage the threads better. (I recently used this implementation.)
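
As one possible illustration of the thread-pool idea, the sketch below uses `concurrent.futures.ThreadPoolExecutor` from the Python 3 standard library. This is a substitute chosen for the example, not the implementation the answer refers to, and it is not available in the Python 2.6 standard library used in the question. `resolve_all` and `fake_lookup` are hypothetical names, and `fake_lookup` replaces the real network request.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_all(addresses, worker, max_workers=20):
    """Map worker over a list of addresses using a fixed-size pool of threads."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input list.
        return dict(zip(addresses, pool.map(worker, addresses)))


def fake_lookup(address):
    """Hypothetical stand-in for fetching and classifying a SiteAdvisor page."""
    return "safe" if address.endswith(".com") else "unknown"


ratings = resolve_all(["a.com", "b.org", "c.com"], fake_lookup, max_workers=2)
```

The pool never runs more than `max_workers` threads, and each thread picks up the next address as soon as it is free. Note that an exception raised inside `worker` is re-raised when the results are collected, so a real worker would need its own error handling.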

Regarding "python - How do I fix this multithreaded Python script?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/3122209/
