
python - Nested list elements to data frames in Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 01:47:28

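Fair warning: this question requires a non-standard Python package, nba_api. I have a list of 3 elements, and each element in turn holds a list of 2 elements: a player data frame and a team data frame. What is the recommended way to get the desired result: 1 combined player data frame and 1 combined team data frame?

Coming from an R background, I would solve this by:
1. adding the player data frames and the team data frames to a joined_list, then
2. using do.call(rbind, joined_list) to row-bind the results into one data frame.

I know this is probably very basic for many experienced Python users, but after several searches here I am struggling to find the right approach.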

import nba_api
import requests
import pandas as pd

from nba_api.stats.endpoints import boxscoreadvancedv2

# vector of game ids (test purposes)
gameids = ['0021900001','0021900002','0021900012']

headers1 = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

# store player and team results for each gameids as elements of list temp
temp = list()
for i in range(len(gameids)):
    temp.append(boxscoreadvancedv2.BoxScoreAdvancedV2(game_id = gameids[i], headers=headers1))

# manually access elements of list and output to data frame
## there has to be an easier way to access list elements and rowbind the results!!!
df_out0 = temp[0].get_data_frames()
df_player0 = df_out0[0]
df_team0 = df_out0[1]

df_out1 = temp[1].get_data_frames()
df_player1 = df_out1[0]
df_team1 = df_out1[1]

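Best answer

First of all, congrats on sticking with it and finding a solution on your own! :D

Comments and tips

You can iterate over a list directly, no indexing needed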

lst_1 = [1, 2, 3, 4]

for i in range(len(lst_1)):
    print(lst_1[i])

can be written as

lst_1 = [1, 2, 3, 4]

for item in lst_1:
    print(item)

List comprehensions and generator expressions are great
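As a small illustration of the difference between the two (the names `squares_list` and `squares_gen` are made up for this sketch): a list comprehension builds the whole list right away, while a generator expression produces its items lazily and can only be consumed once.

```python
numbers = [1, 2, 3, 4]

# list comprehension: evaluated eagerly, the result is a real list
squares_list = [n * n for n in numbers]

# generator expression: evaluated lazily, one item at a time
squares_gen = (n * n for n in numbers)

first_pass = list(squares_gen)   # consumes the generator
second_pass = list(squares_gen)  # already exhausted, so this is empty

print(squares_list)
print(first_pass)
print(second_pass)
```

If you need to go over the results more than once, a list is usually the safer choice; a generator is handy when each item is only needed once.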

Bonus: note the changes I made to the variable names. See PEP 8 for a general reference on Python style.

gameids = ['0021900001','0021900002','0021900012']

headers1 = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

# store player and team results for each gameids as elements of list temp
temp = list()
for i in range(len(gameids)):
    temp.append(boxscoreadvancedv2.BoxScoreAdvancedV2(game_id = gameids[i], headers=headers1))

can be written as

game_ids = ['0021900001','0021900002','0021900012']

api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

api_results = [boxscoreadvancedv2.BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids]

You iterate over the same thing twice

# output player frames
i=0
df_out=[]
df_players=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_players.append(df_out[0]) # index 0 will always contain player frame

df_players = pd.concat(df_players)
print(df_players)

# output team frames
i=0
df_out=[]
df_team=[]
for i in range(len(temp)):
    df_out = temp[i].get_data_frames()
    df_team.append(df_out[1]) # index 1 will always contain team frame

df_team = pd.concat(df_team)
print(df_team)

Using the previous two tips, we end up with:

players_lst = []
team_lst = []

for curr_res in api_results:
    curr_dfs = curr_res.get_data_frames()
    players_lst.append(curr_dfs[0])
    team_lst.append(curr_dfs[1])

players_df = pd.concat(players_lst)
team_df = pd.concat(team_lst)
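If you want to try the single-loop pattern without hitting the API, here is a rough sketch. `FakeResult` and its columns are invented stand-ins for this example, not part of nba_api; they only mimic the [player frame, team frame] shape. It also passes `ignore_index=True` to `pd.concat`, an optional tweak that renumbers the rows instead of repeating each frame's own index.

```python
import pandas as pd


class FakeResult:
    """Hypothetical stand-in for an API result; not part of nba_api."""

    def __init__(self, game_id):
        self.game_id = game_id

    def get_data_frames(self):
        # made-up columns, only to mimic the [player frame, team frame] shape
        players = pd.DataFrame({"GAME_ID": [self.game_id] * 2, "PLAYER": ["A", "B"]})
        team = pd.DataFrame({"GAME_ID": [self.game_id], "TEAM": ["X"]})
        return [players, team]


api_results = [FakeResult(game_id) for game_id in ["g1", "g2", "g3"]]

players_lst = []
team_lst = []

# one pass over the results fills both lists
for curr_res in api_results:
    curr_dfs = curr_res.get_data_frames()
    players_lst.append(curr_dfs[0])
    team_lst.append(curr_dfs[1])

# ignore_index=True renumbers rows 0..n-1 across the combined frame
players_df = pd.concat(players_lst, ignore_index=True)
team_df = pd.concat(team_lst, ignore_index=True)

print(players_df.shape)
print(team_df.shape)
```

Without `ignore_index=True` the combined frame keeps each original frame's row labels, so labels repeat once per game. Whether that matters depends on what you do with the result afterwards.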

My solution

Here it is, broken down a bit for clarity.

import pandas as pd
from nba_api.stats.endpoints.boxscoreadvancedv2 import BoxScoreAdvancedV2

game_ids = ['0021900001', '0021900002', '0021900012']

api_headers = {
    'Host': 'stats.nba.com',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0',
    'Accept': 'application/json, text/plain, */*',
    'Accept-Language': 'en-US,en;q=0.5',
    'Referer': 'https://stats.nba.com/',
    'Accept-Encoding': 'gzip, deflate, br',
    'Connection': 'keep-alive',
}

# generator of results from the API
api_results = (BoxScoreAdvancedV2(game_id=curr_game_id, headers=api_headers) for curr_game_id in game_ids)

# generator of lists of DataFrames from the API results
# think of it like: [[Player DF, Team DF], [Player DF, Team DF], ...]
api_res_dfs = (curr_res.get_data_frames() for curr_res in api_results)

# unpacking the size 2 lists of DataFrames into 2 flat lists
# [[Player DF, Team DF], [Player DF, Team DF], ...] -> [Player DF, Player DF, ...], [Team DF, Team DF, ...]
# see https://stackoverflow.com/q/2921847/11301900 for more on the use of the asterisk (*)
players_tupe, team_tupe = zip(*api_res_dfs)

# concatenating the various DataFrames, exactly the same as in your original code
players_df = pd.concat(players_tupe)
team_df = pd.concat(team_tupe)

print(players_df)
print(team_df)

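It relies on the fact that, as you pointed out, the player DataFrame always comes first in the list and the team DataFrame always comes second, and that these are the only two items in each result list.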
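To see why that assumption matters, here is a toy sketch with plain strings standing in for the DataFrames (all the values are made up). `zip(*...)` transposes the nested list, and unpacking into exactly two names only works when every inner list has exactly two items.

```python
# each inner list mimics one get_data_frames() result: [player item, team item]
pairs = [["player_1", "team_1"], ["player_2", "team_2"], ["player_3", "team_3"]]

# zip(*pairs) groups all first items together and all second items together
players, teams = zip(*pairs)
print(players)
print(teams)

# with a third item per result, unpacking into two names no longer works
triples = [["player_1", "team_1", "extra_1"], ["player_2", "team_2", "extra_2"]]
error_message = None
try:
    players_3, teams_3 = zip(*triples)
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

If an endpoint ever returned more than two frames, indexing each result explicitly (as in the loop version) would be the more forgiving option.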


Let me know if you have any questions :)

Regarding "python - Nested list elements to data frames in Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58785030/
