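This error has been bugging me for hours. I decided to write a separate project to see whether I could reproduce it, and I can, but only on my server. It works on my Mac.
Mac: OS X El Capitan 10.11.6
Server: CentOS 7.2.1511
PhantomJS version on both: 2.1.1
Python on Mac: 2.7.11
Python on server: 2.7.5
Selenium version on both: 2.53.0
The same code runs on both: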
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.common.exceptions import NoSuchElementException
import time
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
dcap["phantomjs.page.customHeaders.accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
dcap["phantomjs.page.customHeaders.Accept-Language"] = "en-US,en;q=0.8"
dcap["phantomjs.page.customHeaders.connection"] = "keep-alive"
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.set_window_size(1120, 700)
driver.get("https://www.instagram.com/espn/")
while True:
    print len(driver.find_elements_by_css_selector("a[href*='/p/']"))
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    try:
        loadMore = driver.find_element_by_link_text("Load more")
        loadMore.click()
    except NoSuchElementException:
        print "No such"
        driver.save_screenshot('none.png')
Mac output:
12
24
No such
24
No such
36
No such
48
No such
48
No such
60
No such
72
No such
84
# This goes until I end it
Server output:
12
24
No such
Traceback (most recent call last):
  File "junk.py", line 27, in <module>
    driver.save_screenshot('none.png')
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 790, in get_screenshot_as_file
    png = self.get_screenshot_as_png()
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 809, in get_screenshot_as_png
    return base64.b64decode(self.get_screenshot_as_base64().encode('ascii'))
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 819, in get_screenshot_as_base64
    return self.execute(Command.SCREENSHOT)['value']
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
    return self._request(command_info[0], url, body=data)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''
Server output after removing the screenshot line:
12
24
No such
24
Traceback (most recent call last):
  File "junk.py", line 23, in <module>
    loadMore = driver.find_element_by_link_text("Load more")
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
    return self.find_element(by=By.LINK_TEXT, value=link_text)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
    {'using': by, 'value': value})['value']
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
    return self._request(command_info[0], url, body=data)
  File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 431, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
    '_open', req)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
    response.begin()
  File "/usr/lib64/python2.7/httplib.py", line 444, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''
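I found a related answer here: Can't run PhantomJS in python via Selenium
So I installed Selenium 2.37, which gave the same error.
I read this answer suggesting the problem might be related to changing the headers, so I removed the headers by changing the driver to driver = webdriver.PhantomJS(), but I still get the same error.
I also installed Python 2.7.12 on the server to see whether it made any difference. The output is: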
# python2.7 junk.py
12
24
No such
24
Traceback (most recent call last):
  File "junk.py", line 29, in <module>
    loadMore = driver.find_element_by_link_text("Load more")
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
    return self.find_element(by=By.LINK_TEXT, value=link_text)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
    {'using': by, 'value': value})['value']
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
    response = self.command_executor.execute(driver_command, params)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
    return self._request(command_info[0], url, body=data)
  File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
    resp = opener.open(request, timeout=self._timeout)
  File "/usr/local/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/local/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/local/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/local/lib/python2.7/urllib2.py", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/local/lib/python2.7/urllib2.py", line 1201, in do_open
    r = h.getresponse(buffering=True)
  File "/usr/local/lib/python2.7/httplib.py", line 1136, in getresponse
    response.begin()
  File "/usr/local/lib/python2.7/httplib.py", line 453, in begin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python2.7/httplib.py", line 417, in _read_status
    raise BadStatusLine(line)
httplib.BadStatusLine: ''
Checking disk space on the system. This is a brand-new VPS, but just to confirm:
Best answer
Edit 3
Add the following:
except httplib.BadStatusLine:
    pass
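As a rough sketch of where such an except clause could sit, the helper below wraps an arbitrary call and swallows httplib.BadStatusLine. The names call_ignoring_bad_status and flaky_find are hypothetical and not part of Selenium. The import fallback is only there so the snippet also runs on Python 3, where the module is called http.client. Keep in mind that this hides the symptom and does not explain why the connection to PhantomJS dropped.

```python
try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module


def call_ignoring_bad_status(func, *args, **kwargs):
    """Call func and return its result, or None if the HTTP
    connection returned an empty or invalid status line."""
    try:
        return func(*args, **kwargs)
    except httplib.BadStatusLine:
        return None


def flaky_find(link_text):
    # Hypothetical stand-in for driver.find_element_by_link_text,
    # the call that raised BadStatusLine in the server tracebacks.
    raise httplib.BadStatusLine("")


result = call_ignoring_bad_status(flaky_find, "Load more")
print(result is None)
```

In the question's loop, the same effect comes from adding the except clause next to the existing except NoSuchElementException branch.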
Edit 2
The Python WebDriver and PhantomJS have a problem with keep_alive. This may be your issue. So add keep_alive=False as follows:
driver = webdriver.PhantomJS(desired_capabilities=dcap,keep_alive=False)
End edit
Add the following:
import httplib
import socket
from selenium.webdriver.remote.command import Command

def get_status(driver):
    try:
        driver.execute(Command.STATUS)
        return "Alive"
    except (socket.error, httplib.CannotSendRequest):
        return "Dead"
Call get_status(driver) before the save_screenshot statement and print the result. This will tell us whether the driver is shutting down prematurely.
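To see what get_status reports without launching PhantomJS, the sketch below runs the same function against a fake driver. FakeDriver is a hypothetical stand-in, not a Selenium class, and the STATUS string is an assumed placeholder for Command.STATUS so that the snippet has no Selenium dependency. The import fallback lets it run on Python 3 as well.

```python
import socket

try:
    import httplib  # Python 2, as used in the question
except ImportError:
    import http.client as httplib  # Python 3 name for the same module

STATUS = "status"  # assumed placeholder for selenium's Command.STATUS


def get_status(driver):
    try:
        driver.execute(STATUS)
        return "Alive"
    except (socket.error, httplib.CannotSendRequest):
        return "Dead"


class FakeDriver(object):
    """Hypothetical stand-in for a WebDriver; not a Selenium class."""

    def __init__(self, alive):
        self.alive = alive

    def execute(self, command):
        # A dead driver process shows up as a socket-level error.
        if not self.alive:
            raise socket.error("connection refused")
        return {"value": {}}


print(get_status(FakeDriver(alive=True)))
print(get_status(FakeDriver(alive=False)))
```

This prints Alive for the first fake driver and Dead for the second. With the real driver, a "Dead" result just before the failing call would suggest the PhantomJS process had already exited.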
Edit
Add the following after driver = webdriver.PhantomJS(desired_capabilities=dcap):
driver.implicitly_wait(10) #wait 10 seconds when doing a find_element before carrying on
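The idea behind an implicit wait is that a lookup keeps being retried for up to the given time instead of failing on the first miss. The sketch below imitates that retry-until-timeout behaviour in plain Python. It is only an illustration and not Selenium's implementation; poll_until and appears_on_third_try are hypothetical names.

```python
import time


def poll_until(find, timeout=10.0, interval=0.5):
    """Call find() repeatedly until it returns a non-None value or
    `timeout` seconds have passed; return the value or None."""
    deadline = time.time() + timeout
    while True:
        found = find()
        if found is not None:
            return found
        if time.time() >= deadline:
            return None
        time.sleep(interval)


attempts = []


def appears_on_third_try():
    # Hypothetical lookup that only succeeds on the third call,
    # like a link that shows up after the page finishes loading.
    attempts.append(1)
    return "Load more" if len(attempts) >= 3 else None


print(poll_until(appears_on_third_try, timeout=2.0, interval=0.01))
```

With a wait in place, a "Load more" link that appears shortly after scrolling would be found on a later attempt instead of raising NoSuchElementException immediately.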
This is based on a similar question found on Stack Overflow, "python - httplib.BadStatusLine: '' on Linux but not Mac": https://stackoverflow.com/questions/40799703/