r - Scraping an XHR-driven dynamic page with rvest and R

Reposted. Author: 行者123. Last updated: 2023-12-05 04:09:15

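I am trying to scrape Morningstar.com, a dynamic site that loads its data through XHR requests.

The exact page I am scraping is: http://performance.morningstar.com/funds/etf/total-returns.action?t=SPY&region=USA&culture=en_US

What I want to scrape is the quarterly performance figure (1-Month). As of today, the result should be 0.64.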

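library(httr)
library(rvest)

# Request the performance endpoint; try() keeps a failed request from stopping the script
try(res <- GET(
  url = "http://performance.morningstar.com/fund/performance-return.action",
  query = list(
    t = "SPY",
    region = "usa",
    culture = "en-US"
  )
))

# Return NA if parsing fails; assigning x inside the handler would not reach the calling scope
x <- tryCatch(
  content(res) %>%
    html_nodes(xpath = '//*[@id="tab-quar-end-content"]/table/tbody/tr[1]/td[1]') %>%
    html_text() %>%
    trimws() %>%
    as.numeric(),
  error = function(e) NA
)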

However, the result is numeric(0).

Any idea what I am doing wrong?

Sudi

Update:

I was able to get the HTML data with the following code:

try(res <- GET(
  url = "http://performance.morningstar.com/fund/performance-return.action",
  query = list(
    t = "SPY",
    region = "usa",
    culture = "en-US",
    ops = "clear",
    s = "0P0000J533",
    ndec = "2",
    ep = "true",
    align = "q",
    annlz = "true",
    comparisonRemove = "false"
  )
))

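But I am still having trouble pointing at the data with CSS selectors or XPath in rvest.

What do you use to locate these data points? Is SelectorGadget still the go-to tool?

Cheers, Aaron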

Best answer

library(httr)

# Call the XHR endpoint directly, sending the headers the browser sends with it
GET(
  url = "http://performance.morningstar.com/perform/Performance/cef/trailing-total-returns.action",
  add_headers(
    Referer = "http://performance.morningstar.com/funds/etf/total-returns.action?t=SPY&region=USA&culture=en_US",
    `X-Requested-With` = "XMLHttpRequest"
  ),
  query = list(
    t = "ARCX:SPY", region = "usa", culture = "en-US",
    cur = "", ops = "clear", s = "0P00001MK8", ndec = "2", ep = "true",
    align = "q", annlz = "true", comparisonRemove = "false",
    benchmarkSecId = "", benchmarktype = ""
  ),
  verbose()  # print the request and response headers for debugging
) -> res

You have to target the XHR request directly.
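
Once the XHR response is in hand, the remaining step is pulling the number out of the returned HTML. The following is a minimal sketch, not the answerer's code. The `fragment` string is a made-up stand-in for the response body, so the real Morningstar markup, row order, and column positions may well differ. It also illustrates one common reason an XPath copied from browser dev tools returns nothing: browsers insert `tbody` into tables, while the raw response parsed by rvest may not contain it.

```r
library(rvest)

# Hypothetical fragment standing in for the XHR response body.
# The real response's structure is an assumption and may differ.
fragment <- '
<table>
  <tr><th>Name</th><th>1-Month</th><th>3-Month</th></tr>
  <tr><td>SPY (Price)</td><td> 0.64 </td><td>2.10</td></tr>
</table>'

doc <- read_html(fragment)

# This XPath requires a tbody element, which the raw markup above lacks,
# so it selects no nodes.
with_tbody <- html_nodes(doc, xpath = "//table/tbody/tr[1]/td[2]")

# Dropping tbody and selecting the first row that contains td cells
# reaches the value.
cell <- html_nodes(doc, xpath = "//table//tr[td][1]/td[2]")

# Convert the cell text to a number
one_month <- as.numeric(trimws(html_text(cell)))
```

With the response from the answer above, the document could be built with `read_html(content(res, as = "text", encoding = "UTF-8"))` instead of the inline string. The XPath would then need adjusting to whatever table layout the endpoint actually returns.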

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/46309746/