- c - 在位数组中找到第一个零
- linux - Unix 显示有关匹配两种模式之一的文件的信息
- 正则表达式替换多个文件
- linux - 隐藏来自 xtrace 的命令
这个错误已经困扰我几个小时了。我决定编写一个单独的项目,看看我是否可以复制它,我可以,但只能在我的服务器上。这适用于我的 Mac。
Mac:OSX El Capitan 10.11.6
服务器:CentOS 7.2.1511
两者都有 PhantomJS 版本:2.1.1
Python Mac:Python 2.7.11
Python 服务器:2.7.5
两者都有 Selenium 版本:2.53.0
相同的代码在两者上运行:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.common.exceptions import NoSuchElementException
import time
dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36"
dcap["phantomjs.page.customHeaders.accept"] = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8"
dcap["phantomjs.page.customHeaders.Accept-Language"] = "en-US,en;q=0.8"
dcap["phantomjs.page.customHeaders.connection"] = "keep-alive"
driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.set_window_size(1120, 700)
driver.get("https://www.instagram.com/espn/")
while True:
print len(driver.find_elements_by_css_selector("a[href*='/p/']"))
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
try:
loadMore = driver.find_element_by_link_text("Load more")
loadMore.click()
except NoSuchElementException:
print "No such"
driver.save_screenshot('none.png')
Mac 输出:
12
24
No such
24
No such
36
No such
48
No such
48
No such
60
No such
72
No such
84
# This goes until I end it
服务器输出:
12
24
No such
Traceback (most recent call last):
File "junk.py", line 27, in <module>
driver.save_screenshot('none.png')
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 790, in get_screenshot_as_file
png = self.get_screenshot_as_png()
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 809, in get_screenshot_as_png
return base64.b64decode(self.get_screenshot_as_base64().encode('ascii'))
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 819, in get_screenshot_as_base64
return self.execute(Command.SCREENSHOT)['value']
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File "/usr/lib64/python2.7/httplib.py", line 444, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
删除截图行后的服务器输出:
12
24
No such
24
Traceback (most recent call last):
File "junk.py", line 23, in <module>
loadMore = driver.find_element_by_link_text("Load more")
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
{'using': by, 'value': value})['value']
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/lib64/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/usr/lib64/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/python2.7/urllib2.py", line 1244, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/lib64/python2.7/urllib2.py", line 1217, in do_open
r = h.getresponse(buffering=True)
File "/usr/lib64/python2.7/httplib.py", line 1089, in getresponse
response.begin()
File "/usr/lib64/python2.7/httplib.py", line 444, in begin
version, status, reason = self._read_status()
File "/usr/lib64/python2.7/httplib.py", line 408, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
我在这里找到了一个相关答案:Can't run PhantomJS in python via Selenium
所以我安装了 Selenium 2.37,它给出了同样的错误。
我读了this answer关于可能与更改 header 有关的问题,因此我通过将驱动程序更改为 driver = webdriver.PhantomJS()
删除了 header ,但仍然出现相同的错误。
我在服务器上也装了2.7.12,看看有没有区别。输出是:
# python2.7 junk.py
12
24
No such
24
Traceback (most recent call last):
File "junk.py", line 29, in <module>
loadMore = driver.find_element_by_link_text("Load more")
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 314, in find_element_by_link_text
return self.find_element(by=By.LINK_TEXT, value=link_text)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 744, in find_element
{'using': by, 'value': value})['value']
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/webdriver.py", line 231, in execute
response = self.command_executor.execute(driver_command, params)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 395, in execute
return self._request(command_info[0], url, body=data)
File "/usr/local/lib/python2.7/site-packages/selenium/webdriver/remote/remote_connection.py", line 463, in _request
resp = opener.open(request, timeout=self._timeout)
File "/usr/local/lib/python2.7/urllib2.py", line 429, in open
response = self._open(req, data)
File "/usr/local/lib/python2.7/urllib2.py", line 447, in _open
'_open', req)
File "/usr/local/lib/python2.7/urllib2.py", line 407, in _call_chain
result = func(*args)
File "/usr/local/lib/python2.7/urllib2.py", line 1228, in http_open
return self.do_open(httplib.HTTPConnection, req)
File "/usr/local/lib/python2.7/urllib2.py", line 1201, in do_open
r = h.getresponse(buffering=True)
File "/usr/local/lib/python2.7/httplib.py", line 1136, in getresponse
response.begin()
File "/usr/local/lib/python2.7/httplib.py", line 453, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python2.7/httplib.py", line 417, in _read_status
raise BadStatusLine(line)
httplib.BadStatusLine: ''
正在检查系统空间。这是一个全新的 VPS,但仍然要确认:
最佳答案
编辑 3
添加以下内容:
except httplib.BadStatusLine:
pass
编辑 2
Python WebDriver 和 phantomJs 有问题 keep_alive .这可能是你的问题。所以添加 keep_alive=False 如下:
driver = webdriver.PhantomJS(desired_capabilities=dcap,keep_alive=False)
结束编辑
添加以下内容
import httplib
import socket
from selenium.webdriver.remote.command import Command
def get_status(driver):
try:
driver.execute(Command.STATUS)
return "Alive"
except (socket.error, httplib.CannotSendRequest):
return "Dead"
在 save_screenshot 语句之前调用 get_status(driver) 并打印结果。这将告诉我们驱动程序是否过早关闭。
编辑
在driver = webdriver.PhantomJS(desired_capabilities=dcap)之后添加如下内容
driver.implicitly_wait(10) #wait 10 seconds when doing a find_element before carrying on
关于python - httplib.BadStatusLine : '' on Linux but not Mac,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/40799703/
我正在处理一组标记为 160 个组的 173k 点。我想通过合并最接近的(到 9 或 10 个组)来减少组/集群的数量。我搜索过 sklearn 或类似的库,但没有成功。 我猜它只是通过 knn 聚类
我有一个扁平数字列表,这些数字逻辑上以 3 为一组,其中每个三元组是 (number, __ignored, flag[0 or 1]),例如: [7,56,1, 8,0,0, 2,0,0, 6,1,
我正在使用 pipenv 来管理我的包。我想编写一个 python 脚本来调用另一个使用不同虚拟环境(VE)的 python 脚本。 如何运行使用 VE1 的 python 脚本 1 并调用另一个 p
假设我有一个文件 script.py 位于 path = "foo/bar/script.py"。我正在寻找一种在 Python 中通过函数 execute_script() 从我的主要 Python
这听起来像是谜语或笑话,但实际上我还没有找到这个问题的答案。 问题到底是什么? 我想运行 2 个脚本。在第一个脚本中,我调用另一个脚本,但我希望它们继续并行,而不是在两个单独的线程中。主要是我不希望第
我有一个带有 python 2.5.5 的软件。我想发送一个命令,该命令将在 python 2.7.5 中启动一个脚本,然后继续执行该脚本。 我试过用 #!python2.7.5 和http://re
我在 python 命令行(使用 python 2.7)中,并尝试运行 Python 脚本。我的操作系统是 Windows 7。我已将我的目录设置为包含我所有脚本的文件夹,使用: os.chdir("
剧透:部分解决(见最后)。 以下是使用 Python 嵌入的代码示例: #include int main(int argc, char** argv) { Py_SetPythonHome
假设我有以下列表,对应于及时的股票价格: prices = [1, 3, 7, 10, 9, 8, 5, 3, 6, 8, 12, 9, 6, 10, 13, 8, 4, 11] 我想确定以下总体上最
所以我试图在选择某个单选按钮时更改此框架的背景。 我的框架位于一个类中,并且单选按钮的功能位于该类之外。 (这样我就可以在所有其他框架上调用它们。) 问题是每当我选择单选按钮时都会出现以下错误: co
我正在尝试将字符串与 python 中的正则表达式进行比较,如下所示, #!/usr/bin/env python3 import re str1 = "Expecting property name
考虑以下原型(prototype) Boost.Python 模块,该模块从单独的 C++ 头文件中引入类“D”。 /* file: a/b.cpp */ BOOST_PYTHON_MODULE(c)
如何编写一个程序来“识别函数调用的行号?” python 检查模块提供了定位行号的选项,但是, def di(): return inspect.currentframe().f_back.f_l
我已经使用 macports 安装了 Python 2.7,并且由于我的 $PATH 变量,这就是我输入 $ python 时得到的变量。然而,virtualenv 默认使用 Python 2.6,除
我只想问如何加快 python 上的 re.search 速度。 我有一个很长的字符串行,长度为 176861(即带有一些符号的字母数字字符),我使用此函数测试了该行以进行研究: def getExe
list1= [u'%app%%General%%Council%', u'%people%', u'%people%%Regional%%Council%%Mandate%', u'%ppp%%Ge
这个问题在这里已经有了答案: Is it Pythonic to use list comprehensions for just side effects? (7 个答案) 关闭 4 个月前。 告
我想用 Python 将两个列表组合成一个列表,方法如下: a = [1,1,1,2,2,2,3,3,3,3] b= ["Sun", "is", "bright", "June","and" ,"Ju
我正在运行带有最新 Boost 发行版 (1.55.0) 的 Mac OS X 10.8.4 (Darwin 12.4.0)。我正在按照说明 here构建包含在我的发行版中的教程 Boost-Pyth
学习 Python,我正在尝试制作一个没有任何第 3 方库的网络抓取工具,这样过程对我来说并没有简化,而且我知道我在做什么。我浏览了一些在线资源,但所有这些都让我对某些事情感到困惑。 html 看起来
我是一名优秀的程序员,十分优秀!