
Python: can't start new thread. Can I stagger or delay some threads?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:47:08

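I'm not quite sure how to ask this, since I've only just started learning Python, but here goes:

I have a web scraper that uses threads to fetch information. I'm looking up pricing and stock for about 900 products. When I test the script with roughly half of them, there are no problems. When I try to scrape all 900 products, I get a "can't start new thread" error.

I assume this is caused by some memory limit, or because I'm sending too many requests to the server.

I'd like to know whether there is a way to slow the threads down or stagger the requests.

Error output: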

Traceback (most recent call last):
  File "C:\Python27\tests\dxpriceupdates.py", line 78, in <module>
    t.start()
error: can't start new thread
>>>
Traceback (most recent call last):Exception in thread Thread-554:
Traceback (most recent call last):
  File "C:\Python27\lib\urllib.py", line 346, in open_http
    errcode, errmsg, headers = h.getreply()
  File "C:\Python27\lib\httplib.py", line 1117, in getreply
    response = self._conn.getresponse()
  File "C:\Python27\lib\httplib.py", line 1045, in getresponse
    response.begin()
  File "C:\Python27\lib\httplib.py", line 441, in begin
    self.msg = HTTPMessage(self.fp, 0)
  File "C:\Python27\lib\mimetools.py", line 25, in __init__
    rfc822.Message.__init__(self, fp, seekable)
  File "C:\Python27\lib\rfc822.py", line 108, in __init__
    self.readheaders()
  File "C:\Python27\lib\httplib.py", line 308, in readheaders
    self.addheader(headerseen, line[len(headerseen)+1:].strip())
MemoryError

<bound method Thread.__bootstrap of <Thread(Thread-221, stopped 9512)>>Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):

Traceback (most recent call last):
Unhandled exception in thread started by Unhandled exception in thread started by ...

Here is the Python (skulist.txt is just a text file such as 12345, 23445, 5551...):

from threading import Thread
import urllib
import re
import json
import math

def th(ur):
    # Fetch live product info for one SKU, plus the USD -> AUD rate (Python 2 urllib).
    site = "http://dx.com/p/GetProductInfoRealTime?skus="+ur
    htmltext = urllib.urlopen(site)
    data = json.load(htmltext)
    htmlrates = urllib.urlopen("http://rate-exchange.appspot.com/currency?from=USD&to=AUD")
    datarates = json.load(htmlrates)
    if data['success'] == True:
        if data['data'][0]['discount'] == 0:
            # No discount: use the regular price.
            price = float(data['data'][0]['price'])
            rate = float(datarates['rate']) + 0.12
            cost = price*rate
            # Tiered markup: cheaper items get a larger multiplier.
            if cost <= 5:
                saleprice = math.ceil(cost*1.7) - .05
            elif (cost >5) and (cost <= 10):
                saleprice = math.ceil(cost*1.6) - .05
            elif (cost >10) and (cost <= 15):
                saleprice = math.ceil(cost*1.55) - .05
            else:
                saleprice = math.ceil(cost*1.5) - .05
            if data['data'][0]['issoldout']:
                soldout = "Out Of Stock"
                enabled = "Disable"
                qty = "0"
            else:
                soldout = "In Stock"
                enabled = "Enabled"
                qty = "9999"

            #print model, saleprice, soldout, qty, enabled
            myfile.write(str(ur)+","+str(saleprice)+","+str(soldout)+","+str(qty)+","+str(enabled)+"\n")
        else:
            # Discounted: use the list price instead.
            price = float(data['data'][0]['listprice'])
            rate = float(datarates['rate']) + 0.12
            cost = price*rate
            if cost <= 5:
                saleprice = math.ceil(cost*1.7) - .05
            elif (cost >5) and (cost <= 10):
                saleprice = math.ceil(cost*1.6) - .05
            elif (cost >10) and (cost <= 15):
                saleprice = math.ceil(cost*1.55) - .05
            else:
                saleprice = math.ceil(cost*1.5) - .05
            if data['data'][0]['issoldout']:
                soldout = "Out Of Stock"
                enabled = "Disable"
                qty = "0"
            else:
                soldout = "In Stock"
                enabled = "Enabled"
                qty = "9999"

            #print model, saleprice, soldout, qty, enabled
            myfile.write(str(ur)+","+str(saleprice)+","+str(soldout)+","+str(qty)+","+str(enabled)+"\n")
    else:
        # Lookup failed: record the SKU as unavailable.
        qty = "0"
        print ur, "error \n"
        myfile.write(str(ur)+","+"0.00"+","+"Out Of Stock"+","+str(qty)+","+"Disable\n")


skulist = open("skulist.txt").read()
skulist = skulist.replace(" ", "").split(",")

# Truncate the output file, then reopen it for appending.
myfile = open("prices/price_update.txt", "w+")
myfile.close()

myfile = open("prices/price_update.txt", "a")
threadlist = []

# One thread per SKU, all started at once.
for u in skulist:
    t = Thread(target=th,args=(u,))
    t.start()
    threadlist.append(t)

# Wait for every thread to finish before closing the file.
for b in threadlist:
    b.join()

myfile.close()

Best answer

Don't fire off 900 threads at once; your PC may choke! Instead, use a pool and distribute the work across a fixed number of workers. Use multiprocessing like this:

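from multiprocessing import Pool

WORKERS = 10  # number of worker processes

if __name__ == "__main__":  # needed on Windows, where workers re-import this module
    p = Pool(WORKERS)
    p.map(th, skulist)  # runs th(sku) for each SKU, at most WORKERS at a time
    p.close()
    p.join()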

Find the right value for WORKERS by experimenting.
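Because the scraping is network-bound, a pool of threads may serve as well as a pool of processes, and it avoids sharing an open file between processes. The following is only a minimal sketch under stated assumptions: `scrape_all` and `fake_fetch` are hypothetical names, and it assumes the worker function is rewritten to return its CSV line instead of writing to a shared file. `multiprocessing.pool.ThreadPool` is part of the standard library.

```python
from multiprocessing.pool import ThreadPool

WORKERS = 10  # assumed starting point; tune by experiment


def scrape_all(skus, fetch, workers=WORKERS):
    """Call fetch(sku) for every SKU using at most `workers` threads.

    `fetch` is any callable that takes one SKU string and returns one
    line of output. Results come back in the same order as `skus`.
    """
    pool = ThreadPool(workers)
    try:
        return pool.map(fetch, skus)
    finally:
        # Always release the worker threads, even if a fetch raised.
        pool.close()
        pool.join()


def fake_fetch(sku):
    # Hypothetical stand-in for the real download: a real version would
    # request the product data and build the line from the response.
    return sku + ",0.00,Out Of Stock,0,Disable"


if __name__ == "__main__":
    lines = scrape_all(["12345", "23445", "5551"], fake_fetch, workers=2)
    # Only the main thread touches the output, so no locking is needed.
    print("\n".join(lines))
```

Collecting the lines and writing them from the main thread keeps the output file out of the workers entirely. If the server objects to bursts of requests, one option is to have the fetch function sleep briefly before each request, which staggers the traffic without changing the pool.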

Regarding "Python: can't start new thread. Can I stagger or delay some threads?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/16532658/
