
python - Scraping a widget

Reposted. Author: 行者123. Last updated: 2023-12-01 09:19:41

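I am scraping data. My script scrapes and prints what appears on the first page, but there is a lot more data further down. So I added code to scroll down to the bottom of the page so that everything can be scraped. The problem now is that it scrolls to the bottom, but then it just waits and never prints anything. Does anyone know how to get this to print? Also, if anyone knows how, I eventually want to save the results to an Excel file. Thanks a lot.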

import time

from selenium import webdriver

url = 'http://www.tradingview.com/screener'
driver = webdriver.Firefox()
driver.get(url)

SCROLL_PAUSE_TIME = 2

# Get scroll height
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    # Scroll down to bottom
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

    # Wait to load page
    time.sleep(SCROLL_PAUSE_TIME)

    # Calculate new scroll height and compare with last scroll height
    new_height = driver.execute_script("return document.body.scrollHeight")
    if new_height == last_height:
        break
    last_height = new_height

# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')

# will give a list of all company names
company_names = driver.find_elements_by_css_selector('span.tv-screener__description')

# will give a list of all close values
close_values = driver.find_elements_by_xpath("//td[@class = 'tv-data-table__cell tv-screener-table__cell tv-screener-table__cell--numeric']/span")

# will give a list of all percentage changes
percentage_changes = driver.find_elements_by_xpath('//tbody/tr/td[3]')

# will give a list of all value changes
value_changes = driver.find_elements_by_xpath('//tbody/tr/td[4]')

# will give a list of all ranks
ranks = driver.find_elements_by_xpath('//tbody/tr/td[5]/span')

# will give a list of all volumes
volumes = driver.find_elements_by_xpath('//tbody/tr/td[6]')

# will give a list of all market caps
market_caps = driver.find_elements_by_xpath('//tbody/tr/td[7]')

# will give a list of all PEs
pes = driver.find_elements_by_xpath('//tbody/tr/td[8]')

# will give a list of all EPSs
epss = driver.find_elements_by_xpath('//tbody/tr/td[9]')

# will give a list of all EMPs
emps = driver.find_elements_by_xpath('//tbody/tr/td[10]')

# will give a list of all sectors
sectors = driver.find_elements_by_xpath('//tbody/tr/td[11]')

for index in range(len(tickers)):
    # str() is needed because an int cannot be concatenated to a str
    print("Row " + str(index) + " " + tickers[index].text + " " + company_names[index].text + " ")

Best answer

You are trying to find the wrong element. This:

element = driver.find_elements_by_id('js-screener-container')

should be replaced with:

# will give a list of all tickers
tickers = driver.find_elements_by_css_selector('a.tv-screener__symbol')

# will give a list of all company names
company_names = driver.find_elements_by_css_selector('span.tv-screener__description')

# will give a list of all close values
close_values = driver.find_elements_by_xpath("//td[@class = 'tv-data-table__cell tv-screener-table__cell tv-screener-table__cell--numeric']/span")

# will give a list of all percentage changes
percentage_changes = driver.find_elements_by_xpath('//tbody/tr/td[3]')

# will give a list of all value changes
value_changes = driver.find_elements_by_xpath('//tbody/tr/td[4]')

# will give a list of all ranks
ranks = driver.find_elements_by_xpath('//tbody/tr/td[5]/span')

# will give a list of all volumes
volumes = driver.find_elements_by_xpath('//tbody/tr/td[6]')

# will give a list of all market caps
market_caps = driver.find_elements_by_xpath('//tbody/tr/td[7]')

# will give a list of all PEs
pes = driver.find_elements_by_xpath('//tbody/tr/td[8]')

# will give a list of all EPSs
epss = driver.find_elements_by_xpath('//tbody/tr/td[9]')

# will give a list of all EMPs
emps = driver.find_elements_by_xpath('//tbody/tr/td[10]')

# will give a list of all sectors
sectors = driver.find_elements_by_xpath('//tbody/tr/td[11]')
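
Note that the `find_elements_by_*` helpers used above were removed in Selenium 4.3; in newer versions the same lookups go through `driver.find_elements(by, value)`. Below is a minimal sketch of that form, assuming the selectors in this answer still match the page. The helper names `column_texts` and `collect_columns` are illustrative and not part of the original answer. The strings "css selector" and "xpath" are the values of `By.CSS_SELECTOR` and `By.XPATH` from `selenium.webdriver.common.by`, written out here so the snippet needs no import of its own.

```python
def column_texts(driver, by, value):
    """Return the visible text of every element matching a locator.

    Reading .text right away gives plain strings, which are easier to
    print or save than WebElement objects.
    """
    return [element.text for element in driver.find_elements(by, value)]


def collect_columns(driver):
    """Collect a few of the screener columns as lists of strings."""
    # "css selector" == By.CSS_SELECTOR and "xpath" == By.XPATH in Selenium.
    return {
        "ticker": column_texts(driver, "css selector", "a.tv-screener__symbol"),
        "company": column_texts(driver, "css selector", "span.tv-screener__description"),
        "sector": column_texts(driver, "xpath", "//tbody/tr/td[11]"),
    }
```

With a real driver this would be called as `columns = collect_columns(driver)` once scrolling has finished; the remaining columns can be added to the dictionary in the same way.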

Now you have all the data stored in lists. If you want to build a row of data, you can use something like this:

for index in range(len(tickers)):
    print("Row " + tickers[index].text + " " + company_names[index].text + " " + ....)

The output will look like this:

Row AAPL APPLE INC. 188.84 -1.03% -1.96 Neutral 61.308M 931.386B 17.40 10.98 123K Technology 
Row AMZN AMAZON.COM INC 1715.97 -0.46% -7.89 Buy 4.778M 835.516B 270.53 6.54 566K Consumer Cyclicals
...
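
The question also asks about saving the results to an Excel file. Here is a minimal sketch, assuming each list has first been reduced to plain strings (for example `[e.text for e in tickers]`). It zips the columns into rows and writes them with the standard `csv` module, which produces a CSV file that Excel can open rather than a native .xlsx workbook. The helper name `save_rows`, the header labels, and the file name are illustrative and not part of the original answer.

```python
import csv


def save_rows(path, header, *columns):
    """Write parallel column lists to a CSV file, one row per ticker.

    zip() stops at the shortest column, so a column that matched fewer
    elements shortens the output instead of raising an IndexError.
    Returns the number of data rows written.
    """
    rows = list(zip(*columns))
    # newline="" is what the csv module expects; it avoids blank lines on Windows.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    # Sample values copied from the output above.
    tickers = ["AAPL", "AMZN"]
    company_names = ["APPLE INC.", "AMAZON.COM INC"]
    close_values = ["188.84", "1715.97"]
    save_rows("screener.csv", ["Ticker", "Company", "Close"],
              tickers, company_names, close_values)
```

Because `zip()` truncates silently, it may be worth comparing the returned row count with `len(tickers)` to notice a selector that matched fewer elements than expected.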

PS:

I think

SCROLL_PAUSE_TIME = 0.5

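is too short, because loading new content by scrolling to the bottom of the page can sometimes take more than 0.5 seconds. I would increase this value to make sure all the content is loaded.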
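
One way to avoid guessing a single pause length is to poll the page height until it grows or a timeout expires. Below is a minimal sketch under that assumption. `scroll_to_bottom`, `max_wait`, and `poll` are illustrative names, and the function relies only on `driver.execute_script`, which the code in the question already uses.

```python
import time


def scroll_to_bottom(driver, max_wait=5.0, poll=0.25):
    """Scroll until the page height stops growing.

    After each scroll, the height is re-checked every `poll` seconds for
    up to `max_wait` seconds. If it never grows in that window, the page
    is assumed to be fully loaded. Returns the number of scrolls that
    loaded new content.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    loads = 0
    while True:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        deadline = time.monotonic() + max_wait
        new_height = last_height
        # Poll instead of sleeping once, so fast loads are not delayed
        # and slow loads still get up to max_wait seconds.
        while time.monotonic() < deadline:
            new_height = driver.execute_script(get_height)
            if new_height > last_height:
                break
            time.sleep(poll)
        if new_height <= last_height:
            return loads
        last_height = new_height
        loads += 1
```

The trade-off is that the final scroll always waits the full `max_wait` before the function gives up, so a larger value is safer but makes the last step slower.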

Regarding "python - Scraping a widget", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50899282/
