
r - How to get the date from a time element in R?

Reposted. Author: 行者123. Updated: 2023-12-01 09:42:27

I am trying to get the date from a time element on a website, but the result is NA.

library(rvest)

url <- "https://www.trustpilot.com/review/www.spotify.com"

dates <- read_html(url) %>%
html_nodes("div.review-content-header__dates") %>%
html_attr("datetime")

What is wrong?

Thanks in advance.

Best answer

If what you want is the publishedDate field inside the <script> node, you can do the following:

library(rvest)

url <- "https://www.trustpilot.com/review/www.spotify.com"
dates <- read_html(url) %>%
html_nodes("div.review-content-header__dates") %>%
html_text()

# Now do some cleaning on the obtained data
# Remove extra spaces
dates <- gsub('\\s+','',dates)
# Remove text before the date
dates <- gsub('\\{\"publishedDate\":\"','',dates)
# Remove text after the date
dates <- gsub('Z\",\".*','',dates)

dates

> [1] "2019-09-23T15:02:07" "2019-09-21T15:24:14" "2019-09-20T15:16:35"
> [4] "2019-09-20T13:45:35" "2019-09-19T14:48:44" "2019-09-18T02:56:34"
> [7] "2019-09-16T00:24:32" "2019-09-13T00:04:14" "2019-09-12T19:47:27"
> [10] "2019-09-12T12:59:54" "2019-09-12T08:00:12" "2019-09-11T13:18:01"
> [13] "2019-09-10T08:07:54" "2019-09-05T16:16:53" "2019-09-05T14:17:42"
> [16] "2019-09-04T19:49:28" "2019-09-04T18:33:04" "2019-09-02T18:45:53"
> [19] "2019-08-31T20:53:44" "2019-08-30T23:24:25"
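
The result above is still a character vector. If actual date objects are needed, the strings can be parsed with base R. The following is a minimal sketch, not part of the original answer: the `raw` vector is a hypothetical stand-in for what `html_text()` returns (the real JSON on the page may contain other fields), and the names `date_times` and `dates_only` are made up for illustration. It assumes the trailing `Z` that the cleaning step removes meant UTC.

```r
# Hypothetical sample of the text returned by html_text() for two reviews.
# The "updatedDate" field is an assumption used only to give the last
# gsub() something to strip.
raw <- c(
  '\n  {"publishedDate":"2019-09-23T15:02:07Z","updatedDate":null}\n',
  '\n  {"publishedDate":"2019-09-21T15:24:14Z","updatedDate":null}\n'
)

# Same cleaning steps as in the answer
dates <- gsub('\\s+', '', raw)                       # remove whitespace
dates <- gsub('\\{\"publishedDate\":\"', '', dates)  # remove text before the date
dates <- gsub('Z\",\".*', '', dates)                 # remove text after the date

# Parse the ISO 8601 style strings as date-times in UTC
date_times <- as.POSIXct(dates, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Keep only the calendar date
dates_only <- as.Date(date_times, tz = "UTC")

print(date_times)
print(dates_only)
```

Passing `tz = "UTC"` explicitly in both calls avoids depending on the local time zone of the machine running the script.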

Regarding "r - How to get the date from a time element in R?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58080475/
