
javascript - PhantomJS behaves differently from the Firefox webdriver

Reposted. Author: 行者123. Updated: 2023-11-30 16:45:05

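I'm working on some code that uses the Selenium webdriver with Firefox. Most things seem to work, but when I switch the browser to PhantomJS, it starts to behave differently.

The page I'm processing has to be scrolled slowly so that it loads more and more results, and that may be where the problem lies.

The following code works with the Firefox webdriver but not with PhantomJS: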

from time import sleep

from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait


def get_url(destination, start_date, end_date):
    # Dates are formatted as %Y-%m-%d.
    return "https://www.pelikan.sk/sk/flights/list?dfc=%s&dtc=C%s&rfc=C%s&rtc=%s&dd=%s&rd=%s&px=1000&ns=0&prc=&rng=0&rbd=0&ct=0&view=list" % ('CVIE%20BUD%20BTS', destination, destination, 'CVIE%20BUD%20BTS', start_date, end_date)


def load_whole_page(self, destination, start_date, end_date):
    deb()  # helper that is not shown in this post

    url = get_url(destination, start_date, end_date)

    self.driver.maximize_window()
    self.driver.get(url)

    # Wait until both loading indicators have disappeared.
    wait = WebDriverWait(self.driver, 60)
    wait.until(EC.invisibility_of_element_located((By.XPATH, '//img[contains(@src, "loading")]')))
    wait.until(EC.invisibility_of_element_located((By.XPATH,
        u'//div[. = "Poprosíme o trpezlivosť, hľadáme pre Vás ešte viac letov"]/preceding-sibling::img')))
    i = 0
    old_driver_html = ''
    end = False
    while not end:
        i += 1

        results = self.driver.find_elements_by_css_selector("div.flightbox")
        print(len(results))
        if len(results) >= __THRESHOLD__:  # for testing purposes. Default value: 999
            break
        try:
            # Scroll to the first result and then to the last one to trigger loading.
            self.driver.execute_script("arguments[0].scrollIntoView();", results[0])
            self.driver.execute_script("arguments[0].scrollIntoView();", results[-1])
        except Exception:
            self.driver.save_screenshot('screen_before_' + str(i) + '.png')
            sleep(2)

            print('EXCEPTION<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<')
            continue

        # Stop once the page source no longer changes between iterations.
        new_driver_html = self.driver.page_source
        if new_driver_html == old_driver_html:
            print('END OF PAGE')
            break
        old_driver_html = new_driver_html

        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, 'div.flightbox'), len(results)))
        sleep(10)

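To detect when the page has fully loaded, I compare the previous copy of the HTML with the new one. That is probably not what I should be doing, but it is good enough for Firefox.

Here is the PhantomJS screen at the point where loading stops: [screenshot not included]

With Firefox, more and more results get loaded, but with PhantomJS it gets stuck at, for example, 10 results.

Any ideas? What is the difference between these two drivers?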

Best answer

Two key things helped me solve this:

  • do not use the custom wait I helped you with earlier
  • set window.document.body.scrollTop to 0 first, and then immediately to document.body.scrollHeight

Working code:

results = []
while len(results) < 200:
    results = driver.find_elements_by_css_selector("div.flightbox")

    print(len(results))

    # scroll
    driver.execute_script("arguments[0].scrollIntoView();", results[0])
    driver.execute_script("window.document.body.scrollTop = 0;")
    driver.execute_script("window.document.body.scrollTop = document.body.scrollHeight;")
    driver.execute_script("arguments[0].scrollIntoView();", results[-1])

Version 2 (endless loop that stops once scrolling no longer loads anything):

from selenium.common.exceptions import StaleElementReferenceException, TimeoutException

results = []
while True:
    try:
        # Times out when no additional results appear, which ends the loop.
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print(len(results))

    # scroll
    for _ in range(5):
        try:
            self.driver.execute_script("""
                arguments[0].scrollIntoView();
                window.document.body.scrollTop = 0;
                window.document.body.scrollTop = document.body.scrollHeight;
                arguments[1].scrollIntoView();
            """, results[0], results[-1])
        except StaleElementReferenceException:
            break  # here it means more results were loaded

print("DONE. Result count: %d" % len(results))

Note that I have changed the comparison in the wait_for_more_than_n_elements expected condition. Replace:

return count >= self.count

with:

return count > self.count
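
The expected condition itself is not shown in this post, so the following is only a minimal sketch of what it might look like with the strict comparison applied. It assumes a callable class in the style of Selenium's built-in expected conditions that takes a locator tuple and a count; everything except the class name and the final comparison is an assumption.

```python
class wait_for_more_than_n_elements(object):
    """Hypothetical reconstruction: truthy once MORE than `count` elements match."""

    def __init__(self, locator, count):
        # locator is a (by, value) tuple such as (By.CSS_SELECTOR, "div.flightbox")
        self.locator = locator
        self.count = count

    def __call__(self, driver):
        # WebDriverWait.until() calls this repeatedly with the driver
        # until it returns a truthy value or the timeout expires.
        count = len(driver.find_elements(*self.locator))
        # With >= the condition would already hold for the current results,
        # so the wait would return immediately instead of waiting for new ones.
        return count > self.count
```

Passing len(results) as the count then makes wait.until() block until at least one additional result has been loaded, and raise TimeoutException when none arrives.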

Version 3 (scroll from the header to the footer several times):

from selenium.common.exceptions import TimeoutException

header = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'header')))
footer = wait.until(EC.visibility_of_element_located((By.TAG_NAME, 'footer')))

results = []
while True:
    try:
        # Times out when no additional results appear, which ends the loop.
        wait.until(wait_for_more_than_n_elements((By.CSS_SELECTOR, "div.flightbox"), len(results)))
    except TimeoutException:
        break

    results = self.driver.find_elements_by_css_selector("div.flightbox")
    print(len(results))

    # scroll
    for _ in range(5):
        self.driver.execute_script("""
            arguments[0].scrollIntoView();
            arguments[1].scrollIntoView();
        """, header, footer)
        sleep(1)

The original question, "PhantomJS behaves differently from the Firefox webdriver", can be found on Stack Overflow: https://stackoverflow.com/questions/31371460/
