
javascript - How can I get the final rendered result of an HTML page with PyQt?

Reposted. Author: 行者123. Updated: 2023-11-28 08:18:38

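Recently I have been trying to scrape data from Google search results. PyQt seems like a good module for this: it can execute the JavaScript in an HTML page and return the final HTML. It works fine for other websites, but it always fails for Google search. I am following this example: http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/

The code is: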

# Python 2 with PyQt4 (print statements and QString.toUtf8 are used below)
import sys
import time
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *

class Render(QWebPage):

    def __init__(self, url):
        # A QApplication must exist before any QWebPage is created
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        # Block here until _loadFinished calls quit()
        self.app.exec_()

    def _loadFinished(self, result):
        # Keep a reference to the loaded frame, then leave the event loop
        self.frame = self.mainFrame()
        self.app.quit()

url1 = 'http://www.google.com/search?start=0&client=firefox-a&q=adidas&safe=off&pws=0&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2002%2Ccd_max%3A1%2F1%2F2001&filter=0&num=10&access=a&oe=UTF-8&ie=UTF-8'
url2 = 'http://www.google.com/search?start=0&client=firefox-a&q=adidas&safe=off&pws=0&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2009%2Ccd_max%3A7%2F1%2F2009&filter=0&num=10&access=a&oe=UTF-8&ie=UTF-8'
r = Render(url1)
html = r.frame.toHtml()
print type(html)

# Save the rendered HTML as UTF-8 bytes
outfile = open('page.html', 'w')
outfile.write(html.toUtf8())
outfile.close()
print 'finished!'

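However, url1 and url2 always return the same result, and that result is also identical to what I get in Chrome with JavaScript disabled. How should we handle this? How can we get the final HTML of a Google search?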
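Before attributing the identical output to JavaScript rendering, it may help to confirm exactly how the two requests differ. The following is a minimal sketch, written for Python 3 with only the standard library (unlike the Python 2 / PyQt4 code above); the helper names `query_params` and `differing_params` are hypothetical and are not part of the original question. It decodes both query strings and reports only the parameters whose values differ.

```python
from urllib.parse import urlsplit, parse_qs

# The two URLs from the question, copied unchanged.
url1 = 'http://www.google.com/search?start=0&client=firefox-a&q=adidas&safe=off&pws=0&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2002%2Ccd_max%3A1%2F1%2F2001&filter=0&num=10&access=a&oe=UTF-8&ie=UTF-8'
url2 = 'http://www.google.com/search?start=0&client=firefox-a&q=adidas&safe=off&pws=0&tbs=cdr%3A1%2Ccd_min%3A1%2F1%2F2009%2Ccd_max%3A7%2F1%2F2009&filter=0&num=10&access=a&oe=UTF-8&ie=UTF-8'


def query_params(url):
    """Return the query string of `url` as a dict of percent-decoded value lists."""
    return parse_qs(urlsplit(url).query)


def differing_params(a, b):
    """Map each parameter name whose values differ to a (value_in_a, value_in_b) pair."""
    pa, pb = query_params(a), query_params(b)
    return {
        name: (pa.get(name), pb.get(name))
        for name in sorted(set(pa) | set(pb))
        if pa.get(name) != pb.get(name)
    }


diff = differing_params(url1, url2)
for name, (v1, v2) in diff.items():
    print(name)
    print('  url1:', v1)
    print('  url2:', v2)
```

With the URLs exactly as posted, the only differing parameter is `tbs`, which carries the `cd_min` and `cd_max` date bounds. In url1 as written, `cd_min` (1/1/2002) is later than `cd_max` (1/1/2001). Whether that affects what Google returns is not something the source confirms, so treat it as something to check rather than as an explanation.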

Best answer

import sys
from PyQt4.QtGui import *
from PyQt4.QtCore import *
from PyQt4.QtWebKit import *

class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

url = 'http://webscraping.com'
r = Render(url)
html = r.frame.toHtml()

Source: http://webscraping.com/blog/Scraping-JavaScript-webpages-with-webkit/

For "javascript - How can I get the final rendered result of an HTML page with PyQt?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23278192/
