
r - Scraping NBA data in R using rjson

Reposted. Author: 行者123. Updated: 2023-12-02 01:33:17

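I have spent a long time trying to scrape NBA data with R. Until now I worked by trial and error, but I eventually found this documentation. A while ago I had some trouble scraping shotchartdetail, and I worked out what was wrong when I found this.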

This works

This is what I did:

shotURLtotal <- paste0("http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2016-17&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=0&PlusMinus=N&Position=&Rank=N&RookieYear=&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&TeamID=0&VsConference=&VsDivision=&mode=Advanced&showDetails=0&showShots=1&showZones=0&PlayerPosition=")

Season <- rjson::fromJSON(file = shotURLtotal, method="C")
Names <- Season$resultSets[[1]][[2]]

Season <- data.frame(matrix(unlist(Season$resultSets[[1]][[3]]), ncol = length(Names), byrow = TRUE))

colnames(Season) <- Names
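
The conversion above only lines up if every row carries exactly one value per header. Here is a minimal offline sketch of the same steps on a hand-built list that mimics the `resultSets` shape. The `mock` object and its values are invented for illustration and are not real API output. It also adds a guard for JSON `null`s, on the assumption that the parser returns them as `NULL`: `unlist()` silently drops `NULL` elements, which would shift later values into the wrong columns.

```r
# Hypothetical stand-in for the parsed JSON; all values are invented.
mock <- list(resultSets = list(list(
  headers = list("PLAYER_NAME", "SHOT_DISTANCE", "SHOT_MADE_FLAG"),
  rowSet  = list(
    list("Player A", 12, 1),
    list("Player B", NULL, 0)   # a JSON null, assumed to be parsed as NULL
  )
)))

rs    <- mock$resultSets[[1]]
Names <- unlist(rs$headers)

# Replace NULL with NA so every row keeps one value per column.
rows <- lapply(rs$rowSet, function(row) {
  lapply(row, function(x) if (is.null(x)) NA else x)
})

# Same matrix-based conversion as in the question.
Season <- data.frame(
  matrix(unlist(rows), ncol = length(Names), byrow = TRUE),
  stringsAsFactors = FALSE
)
colnames(Season) <- Names
```

Because a matrix holds a single type, every column comes out as character here; something like `type.convert()` could restore numeric columns afterwards.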

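But this doesn't

When I try the same thing for shotchartlineupdetail, it does not work. I suspect it has something to do with CFID, but I don't know what that parameter means. This is what I tried: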

shoturl <- "http://stats.nba.com/stats/shotchartlineupdetail/?leagueId=00&season=2016-17&seasonType=Regular+Season&teamId=0&outcome=&location=&month=0&seasonSegment=&dateFrom=&dateTo=&opponentTeamId=0&vsConference=&vsDivision=&gameSegment=&period=0&lastNGames=0&gameId=&group_id=0&contextFilter=&contextMeasure=FGA"


Season <- rjson::fromJSON(file = shoturl, method="C")
Names <- Season$resultSets[[1]][[2]]

Season <- data.frame(matrix(unlist(Season$resultSets[[1]][[3]]), ncol = length(Names), byrow = TRUE))

colnames(Season) <- Names

Expected result

The expected result is a data frame with the following columns:

c("GRID_TYPE", "GAME_ID", "GAME_EVENT_ID", "GROUP_ID", "GROUP_NAME", "PLAYER_ID", "PLAYER_NAME", "TEAM_ID", "TEAM_NAME", "PERIOD", "MINUTES_REMAINING", "SECONDS_REMAINING", "EVENT_TYPE", "ACTION_TYPE", "SHOT_TYPE", "SHOT_ZONE_BASIC", "SHOT_ZONE_AREA", "SHOT_ZONE_RANGE", "SHOT_DISTANCE", "LOC_X", "LOC_Y", "SHOT_ATTEMPTED_FLAG", "SHOT_MADE_FLAG", "GAME_DATE", "HTM", "VTM")

You can get them with:

shoturl <- "http://stats.nba.com/stats/shotchartlineupdetail/?leagueId=00&season=2016-17&seasonType=Regular+Season&teamId=0&outcome=&location=&month=0&seasonSegment=&dateFrom=&dateTo=&opponentTeamId=0&vsConference=&vsDivision=&gameSegment=&period=0&lastNGames=0&gameId=&group_id=0&contextFilter=&contextMeasure=FGA"


Season <- rjson::fromJSON(file = shoturl, method="C")
Names <- Season$resultSets[[1]][[2]]

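So Names would be the columns of the data frame. The problem is that without CFID you get a list in which the data for those columns is empty. The answer @be_green gave returns league averages, and I need the team-specific data.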
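
One way to see why the conversion fails in the empty case: when `rowSet` has no rows, `unlist()` returns `NULL`, and `matrix()` cannot build a matrix from that. The sketch below uses an invented response with headers but no rows, and a hypothetical helper `result_set_to_df()` (not part of rjson or any package), to show a guard that returns a zero-row data frame instead. It assumes the parsed result set exposes `headers` and `rowSet` by name.

```r
# Hypothetical parsed response: headers are present, but no rows came back.
empty <- list(resultSets = list(list(
  headers = c("GRID_TYPE", "GAME_ID", "GROUP_ID"),
  rowSet  = list()
)))

# Hypothetical helper: build a data frame from one result set,
# returning a zero-row frame instead of failing when rowSet is empty.
result_set_to_df <- function(result_set) {
  headers <- unlist(result_set$headers)
  if (length(result_set$rowSet) == 0) {
    # No data: keep the column names, with zero rows
    out <- data.frame(matrix(nrow = 0, ncol = length(headers)))
  } else {
    # Same matrix-based conversion as in the question
    out <- data.frame(
      matrix(unlist(result_set$rowSet), ncol = length(headers), byrow = TRUE),
      stringsAsFactors = FALSE
    )
  }
  colnames(out) <- headers
  out
}

Season <- result_set_to_df(empty$resultSets[[1]])
```

A zero-row result makes it easy to tell "the request succeeded but matched nothing" apart from a parsing problem.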

Best answer

I think the problem here is that you need to pass PlayerID and TeamID to the API. Using PlayerID = 2544 and TeamID = 1610612739 as an example, as below, seems to work:

library(tidyverse)
res <- jsonlite::read_json("https://stats.nba.com/stats/shotchartdetail?AheadBehind=&ClutchTime=&ContextFilter=&ContextMeasure=PTS&DateFrom=&DateTo=&EndPeriod=&EndRange=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&Month=0&OpponentTeamID=0&Outcome=&Period=0&PlayerID=2544&PlayerPosition=&PointDiff=&Position=&RangeType=&RookieYear=&Season=&SeasonSegment=&SeasonType=Regular+Season&StartPeriod=&StartRange=&TeamID=1610612739&VsConference=&VsDivision=")
# res %>% str(max.level = 3)

header_names <- flatten_chr(res$resultSets[[1]]$headers)
header_names
#> [1] "GRID_TYPE" "GAME_ID" "GAME_EVENT_ID"
#> [4] "PLAYER_ID" "PLAYER_NAME" "TEAM_ID"
#> [7] "TEAM_NAME" "PERIOD" "MINUTES_REMAINING"
#> [10] "SECONDS_REMAINING" "EVENT_TYPE" "ACTION_TYPE"
#> [13] "SHOT_TYPE" "SHOT_ZONE_BASIC" "SHOT_ZONE_AREA"
#> [16] "SHOT_ZONE_RANGE" "SHOT_DISTANCE" "LOC_X"
#> [19] "LOC_Y" "SHOT_ATTEMPTED_FLAG" "SHOT_MADE_FLAG"
#> [22] "GAME_DATE" "HTM" "VTM"

res$resultSets[[1]]$rowSet %>%
  map(`[`, 1:24) %>%
  map(~ set_names(., header_names)) %>%
  bind_rows()
#> # A tibble: 8,369 x 24
#>    GRID_TYPE GAME_ID GAME_EVENT_ID PLAYER_ID PLAYER_NAME TEAM_ID TEAM_NAME
#>    <chr>     <chr>           <int>     <int> <chr>         <int> <chr>
#>  1 Shot Cha~ 002030~            20      2544 LeBron Jam~  1.61e9 Clevelan~
#>  2 Shot Cha~ 002030~            28      2544 LeBron Jam~  1.61e9 Clevelan~
#>  3 Shot Cha~ 002030~            35      2544 LeBron Jam~  1.61e9 Clevelan~
#>  4 Shot Cha~ 002030~            54      2544 LeBron Jam~  1.61e9 Clevelan~
#>  5 Shot Cha~ 002030~            67      2544 LeBron Jam~  1.61e9 Clevelan~
#>  6 Shot Cha~ 002030~            76      2544 LeBron Jam~  1.61e9 Clevelan~
#>  7 Shot Cha~ 002030~           224      2544 LeBron Jam~  1.61e9 Clevelan~
#>  8 Shot Cha~ 002030~           233      2544 LeBron Jam~  1.61e9 Clevelan~
#>  9 Shot Cha~ 002030~           235      2544 LeBron Jam~  1.61e9 Clevelan~
#> 10 Shot Cha~ 002030~           322      2544 LeBron Jam~  1.61e9 Clevelan~
#> # ... with 8,359 more rows, and 17 more variables: PERIOD <int>,
#> # MINUTES_REMAINING <int>, SECONDS_REMAINING <int>, EVENT_TYPE <chr>,
#> # ACTION_TYPE <chr>, SHOT_TYPE <chr>, SHOT_ZONE_BASIC <chr>,
#> # SHOT_ZONE_AREA <chr>, SHOT_ZONE_RANGE <chr>, SHOT_DISTANCE <int>,
#> # LOC_X <int>, LOC_Y <int>, SHOT_ATTEMPTED_FLAG <int>,
#> # SHOT_MADE_FLAG <int>, GAME_DATE <chr>, HTM <chr>, VTM <chr>

Created on 2019-03-26 by the reprex package (v0.2.1)
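
The fix in this answer comes down to putting PlayerID and TeamID into the query string. As a rough sketch, the URL can be assembled from a named list so those two values are easy to change. `build_query()` is a hypothetical helper written for this illustration, and only a few of the answer's parameters are shown; the endpoint may well expect the full set, including the empty ones, so treat this as a starting point rather than a working request.

```r
# Hypothetical helper: join a named list of parameters into a query string.
build_query <- function(base, params) {
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  pairs <- paste0(names(params), "=", values)
  paste0(base, "?", paste(pairs, collapse = "&"))
}

# Parameter names and values are copied from the answer's URL.
url <- build_query(
  "https://stats.nba.com/stats/shotchartdetail",
  list(
    ContextMeasure = "PTS",
    LeagueID       = "00",
    PlayerID       = 2544,
    TeamID         = 1610612739
  )
)
```

Keeping the parameters in a list also makes it simple to loop over several team or player IDs and build one URL per combination.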

Regarding "r - Scraping NBA data in R using rjson", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/47745164/
