
python - Error when opening multiple webdrivers with Python multithreading

Reposted. Author: 行者123. Updated: 2023-11-28 19:20:01

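I am using Python for web scraping and trying to speed it up with multithreading. I also use Selenium, so each thread opens its own webdriver. With 4 threads the program runs fine, but with 5 or more threads it fails with the following error: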

WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\apogne\\appdata\\local\\temp\\tmpqzhfxq.webdriver.xpi\\platform\\WINNT_x86-msvc\\components\\webdriver-firefox-latest.dll'

The program can be reduced to the following and still produces the same error.

from selenium import webdriver
from threading import Thread

def f():
    driver = webdriver.Firefox()
    driver.close()

thread_list = []

for i in range(5):
    t = Thread(target=f)
    t.start()
    thread_list.append(t)

for t in thread_list:
    t.join()

The full traceback is below.

Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\pinterest\user_info_multiThread.py", line 21, in gettingUserInfo
driver = webdriver.Firefox()
File "C:\Python27\lib\selenium\webdriver\firefox\webdriver.py", line 59, in __init__
self.binary, timeout),
File "C:\Python27\lib\selenium\webdriver\firefox\extension_connection.py", line 45, in __init__
self.profile.add_extension()
File "C:\Python27\lib\selenium\webdriver\firefox\firefox_profile.py", line 92, in add_extension
self._install_extension(extension)
File "C:\Python27\lib\selenium\webdriver\firefox\firefox_profile.py", line 285, in _install_extension
shutil.rmtree(tmpdir)
File "C:\Python27\lib\shutil.py", line 247, in rmtree
rmtree(fullname, ignore_errors, onerror)
File "C:\Python27\lib\shutil.py", line 247, in rmtree
rmtree(fullname, ignore_errors, onerror)
File "C:\Python27\lib\shutil.py", line 247, in rmtree
rmtree(fullname, ignore_errors, onerror)
File "C:\Python27\lib\shutil.py", line 252, in rmtree
onerror(os.remove, fullname, sys.exc_info())
File "C:\Python27\lib\shutil.py", line 250, in rmtree
os.remove(fullname)
WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\apogne\\appdata\\local\\temp\\tmpadxbvj.webdriver.xpi\\platform\\WINNT_x86-msvc\\components\\webdriver-firefox-previous.dll'

Exception in thread Thread-1:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\pinterest\user_info_multiThread.py", line 21, in gettingUserInfo
driver = webdriver.Firefox()
File "C:\Python27\lib\selenium\webdriver\firefox\webdriver.py", line 61, in __init__
keep_alive=True)
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 73, in __init__
self.start_session(desired_capabilities, browser_profile)
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 121, in start_session
'desiredCapabilities': desired_capabilities,
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 173, in execute
self.error_handler.check_response(response)
File "C:\Python27\lib\selenium\webdriver\remote\errorhandler.py", line 166, in check_response
raise exception_class(message, screen, stacktrace)
WebDriverException: Message: u'c is null' ; Stacktrace:
at nsCommandProcessor.prototype.newSession (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/command-processor.js:11751:61)
at nsCommandProcessor.prototype.execute (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/command-processor.js:11646:7)
at Dispatcher.executeAs/< (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/driver-component.js:8430:5)
at Resource.prototype.handle (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/driver-component.js:8577:219)
at Dispatcher.prototype.dispatch (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/driver-component.js:8524:36)
at WebDriverServer/<.handle (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/driver-component.js:11466:5)
at createHandlerFunc/< (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:1935:41)
at ServerHandler.prototype.handleResponse (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:2261:15)
at Connection.prototype.process (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:1168:5)
at RequestReader.prototype._handleResponse (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:1616:5)
at RequestReader.prototype._processBody (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:1464:9)
at RequestReader.prototype.onInputStreamReady (file:///c:/users/apogne/appdata/local/temp/tmpr0rxvj/extensions/fxdriver@googlecode.com/components/httpd.js:1333:9)

Exception in thread Thread-4:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\pinterest\user_info_multiThread.py", line 24, in gettingUserInfo
driver.get("http://www.pinterest.com")
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 185, in get
self.execute(Command.GET, {'url': url})
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 171, in execute
response = self.command_executor.execute(driver_command, params)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 349, in execute
return self._request(command_info[0], url, body=data)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 380, in _request
resp = self._conn.getresponse()
File "C:\Python27\lib\httplib.py", line 1067, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 409, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 373, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''

Exception in thread Thread-3:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\pinterest\user_info_multiThread.py", line 24, in gettingUserInfo
driver.get("http://www.pinterest.com")
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 185, in get
self.execute(Command.GET, {'url': url})
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 171, in execute
response = self.command_executor.execute(driver_command, params)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 349, in execute
return self._request(command_info[0], url, body=data)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 380, in _request
resp = self._conn.getresponse()
File "C:\Python27\lib\httplib.py", line 1067, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 409, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 373, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''

Exception in thread Thread-5:
Traceback (most recent call last):
File "C:\Python27\lib\threading.py", line 810, in __bootstrap_inner
self.run()
File "C:\Python27\lib\threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Python27\pinterest\user_info_multiThread.py", line 24, in gettingUserInfo
driver.get("http://www.pinterest.com")
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 185, in get
self.execute(Command.GET, {'url': url})
File "C:\Python27\lib\selenium\webdriver\remote\webdriver.py", line 171, in execute
response = self.command_executor.execute(driver_command, params)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 349, in execute
return self._request(command_info[0], url, body=data)
File "C:\Python27\lib\selenium\webdriver\remote\remote_connection.py", line 380, in _request
resp = self._conn.getresponse()
File "C:\Python27\lib\httplib.py", line 1067, in getresponse
response.begin()
File "C:\Python27\lib\httplib.py", line 409, in begin
version, status, reason = self._read_status()
File "C:\Python27\lib\httplib.py", line 373, in _read_status
raise BadStatusLine(line)
BadStatusLine: ''

Does anyone know why this error occurs and how I can fix it?

Best answer

You should create a lock around

driver=webdriver.Firefox()

so that only one thread at a time is bootstrapping a driver.

Edit:

from selenium import webdriver
from threading import Thread, Lock

def f():
    # the thread either acquires the lock or waits for another thread to release it
    with my_lock:
        # only the driver start-up is serialized
        driver = webdriver.Firefox()

    # do your other work here; it runs concurrently across threads

    driver.close()

thread_list = []
my_lock = Lock()

# range works on both Python 2 and Python 3
for _ in range(5):
    t = Thread(target=f)
    t.start()
    thread_list.append(t)

for t in thread_list:
    t.join()
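
To see the effect of the lock without launching browsers, here is a minimal sketch of the same pattern. It is an illustration, not part of the original answer: `make_resource` is a hypothetical stand-in for `webdriver.Firefox()`, and the `active` / `max_active` counters exist only to record how many threads are inside the start-up step at once.

```python
import threading
import time

init_lock = threading.Lock()     # serializes the start-up step only
counter_lock = threading.Lock()  # protects the bookkeeping counters
active = 0                       # threads currently inside start-up
max_active = 0                   # highest value `active` ever reached


def make_resource():
    """Hypothetical stand-in for webdriver.Firefox(); records overlap."""
    global active, max_active
    with counter_lock:
        active += 1
        max_active = max(max_active, active)
    time.sleep(0.01)  # pretend start-up takes a while
    with counter_lock:
        active -= 1
    return object()


def worker(results, index):
    # Only one thread at a time may run the start-up step.
    with init_lock:
        resource = make_resource()
    # Anything after the lock is released runs concurrently.
    results[index] = resource


def run(n_threads=5):
    results = [None] * n_threads
    threads = [threading.Thread(target=worker, args=(results, i))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = run()
```

Because every call to `make_resource` happens while `init_lock` is held, `max_active` stays at 1 no matter how many threads are started. Only the start-up is serialized, so the scraping work done after the `with` block still runs in parallel.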

Regarding "python - Error when opening multiple webdrivers with Python multithreading", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/27677887/
