r - Extracting a specific table from Wikipedia in R

Reposted by: 行者123 Last updated: 2023-12-04 01:05:27

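I want to extract the 20th table from the Wikipedia page https://en.wikipedia.org/wiki/...
I am currently using the code below, but it only extracts the first table on the page (the header table).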

library(rvest)

the_url <- "https://en.wikipedia.org/wiki/..."
# html_node() returns only the first matching <table>
tb <- the_url %>% read_html() %>%
  html_node("table") %>%
  html_table(fill = TRUE)

What should I do to get a specific table? Thanks!!
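For context, `html_node()` returns only the first match, which is why the code above always yields the first table. The sketch below shows the positional alternative, `html_nodes()` followed by indexing. It runs on a small inline page rather than the real article: the three tables are hypothetical, and `minimal_html()` assumes rvest 1.0.0 or later.

```r
library(rvest)

# Hypothetical inline page with three tables, standing in for the real article
page <- minimal_html('
  <table><tr><th>a</th></tr><tr><td>1</td></tr></table>
  <table><tr><th>b</th></tr><tr><td>2</td></tr></table>
  <table><tr><th>c</th></tr><tr><td>3</td></tr></table>
')

# html_node() returns only the first matching node
first_tb <- page %>% html_node("table") %>% html_table()

# html_nodes() returns every match; index into the node set to pick one
all_tables <- page %>% html_nodes("table")
third_tb <- all_tables[[3]] %>% html_table()
```

On the real page the same pattern would be `all_tables[[20]]`, provided the table of interest really is the 20th `<table>` element in the document.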

Best answer

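Rather than indexing by position, which can shift when tables are added to or removed from the page, you can anchor on the table's relationship to the element with id Prize_money. Return only a single node for efficiency. Avoid longer XPaths, as they can be fragile.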

library(rvest)

# Anchor on the element with id "Prize_money", then take the first table after its parent <h4>
prize_table <- read_html('https://en.wikipedia.org/wiki/2018_FIFA_World_Cup#Prize_money') %>%
  html_node(xpath = "//*[@id='Prize_money']/parent::h4/following-sibling::table[1]") %>%
  html_table(fill = TRUE)
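The anchoring idea can be tried offline. The sketch below applies the same XPath to a small inline page whose markup only imitates the structure that XPath assumes: an element with id Prize_money inside an `<h4>`, followed by sibling tables. The page content, the values in it, and the helper `table_after_heading()` are all hypothetical, and `minimal_html()` assumes rvest 1.0.0 or later.

```r
library(rvest)

# Hypothetical markup: one unrelated table, then the anchored heading,
# then the table we want, then another unrelated table.
page <- minimal_html('
  <table><tr><th>other</th></tr><tr><td>0</td></tr></table>
  <h4><span id="Prize_money">Prize money</span></h4>
  <p>Some text.</p>
  <table>
    <tr><th>Position</th><th>Amount</th></tr>
    <tr><td>first</td><td>100</td></tr>
  </table>
  <table><tr><th>later</th></tr><tr><td>9</td></tr></table>
')

# Hypothetical helper: first table that follows the heading containing `id`
table_after_heading <- function(doc, id, heading = "h4") {
  xp <- sprintf(
    "//*[@id='%s']/parent::%s/following-sibling::table[1]",
    id, heading
  )
  node <- html_node(doc, xpath = xp)
  # html_node() returns an "xml_missing" object when nothing matches
  if (inherits(node, "xml_missing")) {
    stop("No table found after heading with id '", id, "'")
  }
  html_table(node)
}

prize <- table_after_heading(page, "Prize_money")
```

The helper stops with an error when nothing matches, so a change in page structure is noticed instead of silently returning the wrong table. If the live page nests its headings differently from this sketch, the `parent::h4` step would need adjusting.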

Regarding "r - Extracting a specific table from Wikipedia in R", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66626843/
