
python - How do I create new columns by combining data from existing columns?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:22:16

I have a dataset with 5 columns (please excuse the formatting):

id         Price  Service  Rater Name  Cleanliness
401013357      5        3           A            1
401014972      2        1           A            5
401022510      3        4           B            2
401022510      5        1           C            9
401022510      3        1           D            4
401022510      2        2           E            2

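I want exactly one row per ID. To get there, I need a separate column for every combination of rater name and rating category (for example rater name + Price, rater name + Service, rater name + Cleanliness).

I have explored groupby but could not figure out how to reshape the values into new columns. Thanks!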

Here's the code and data I'm actually using:

import requests
from pandas import DataFrame
import pandas as pd


linesinfo_url = 'https://api.collegefootballdata.com/lines?year=2018&seasonType=regular'
linesresp = requests.get(linesinfo_url)

dflines = DataFrame(linesresp.json())
# nested data in lines like game info
# setting game ID as index
dflines.set_index('id', inplace=True)

a = linesresp.json()
# defining a as the response to our get request for this data, in JSON format
buf = []
# i believe this creates a receptacle for nested data I'm extracting from json
for game in a:
    for line in game['lines']:
        game_dict = dict(id=game['id'])
        for cat in ('provider', 'spread', 'formattedSpread', 'overUnder'):
            game_dict[cat] = line[cat]
        buf.append(game_dict)

dflinestable = pd.DataFrame(buf)
dflinestable.set_index(['id', 'provider'])

From this I get:

                             formattedSpread  overUnder  spread
id        provider
401013357 consensus                David UMass -21       68.0   -21.0

The result I'm looking for has three columns for each of the 4 different providers, i.e. caesars_formattedSpread, caesars_overUnder, caesars_spread, numberfire_formattedSpread, numberfire_overUnder, numberfire_spread, and so on.

When I run unstack as suggested, I don't get what I expected. Instead I get:

formattedSpread  0                UMass -21
                 1               Rice -22.5
                 2     Colorado State -17.5
                 3       Colorado State -17
                 4       Colorado State -17
                 5       Colorado State -17
                 6               Wyoming -5
                 7               Wyoming -5
                 8         Ball State -19.5
                 9                UCF -22.5
                 10               UCF -22.5
                 11                 UCF -24
                 12                 UCF -24

Best answer

*Edited in response to the edited question*

Assuming your DataFrame is df:

df = df.set_index(['id', 'Rater Name']) # Make it a Multi Index
df_unstacked = df.unstack()
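
To make the two lines above concrete, here is a minimal sketch that runs them on the toy table from the question. The variable name wide and the "<rater>_<category>" column naming are illustrative choices, not something the question prescribes.

```python
import pandas as pd

# Toy data copied from the first table in the question.
df = pd.DataFrame({
    "id": [401013357, 401014972, 401022510, 401022510, 401022510, 401022510],
    "Price": [5, 2, 3, 5, 3, 2],
    "Service": [3, 1, 4, 1, 1, 2],
    "Rater Name": ["A", "A", "B", "C", "D", "E"],
    "Cleanliness": [1, 5, 2, 9, 4, 2],
})

# Move the rater into the index, then pivot that index level into the columns.
wide = df.set_index(["id", "Rater Name"]).unstack()

# Columns are now (category, rater) pairs; join them as "<rater>_<category>".
wide.columns = [f"{rater}_{category}" for category, rater in wide.columns]

print(wide.shape)                             # one row per id
print(wide.loc[401022510, "C_Cleanliness"])   # rater C's cleanliness score
```

Raters who did not score a given id show up as NaN in that id's row, which is why the integer scores are converted to floats.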

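The problem with your edited code is that the result of dflinestable.set_index(['id', 'provider']) is never assigned to anything. By default set_index returns a new DataFrame and leaves the original untouched, so when you call dflinestable.unstack() you are unstacking the original dflinestable, which still has its default integer index.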
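
The difference is easy to reproduce offline. The sketch below uses a hypothetical three-row stand-in for dflinestable (the names table and indexed are made up for illustration) and shows both the discarded result and the fixed version.

```python
import pandas as pd

# Hypothetical three-row stand-in for dflinestable.
table = pd.DataFrame({
    "id": [401013357, 401022510, 401022510],
    "provider": ["consensus", "Caesars", "consensus"],
    "spread": [-21.0, -17.5, -17.0],
})

table.set_index(["id", "provider"])    # the returned frame is thrown away
print(table.index.names)               # still the default, unnamed index

# With only a single-level index there is nothing to pivot into columns,
# so unstack() returns a long Series keyed by (column name, row number).
print(type(table.unstack()).__name__)

indexed = table.set_index(["id", "provider"])   # keep the returned frame
print(indexed.unstack().shape)                  # rows = ids, columns = providers
```

Reassigning the result, as done here, and passing inplace=True are both ways to keep the new index.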

So your full code should be:

import requests
import pandas as pd


linesinfo_url = 'https://api.collegefootballdata.com/lines?year=2018&seasonType=regular'
linesresp = requests.get(linesinfo_url)

dflines = pd.DataFrame(linesresp.json())
# nested data in lines like game info
# setting game ID as index
dflines.set_index('id', inplace=True)

a = linesresp.json()
# defining a as the response to our get request for this data, in JSON format
buf = []
# buf collects one flat dict per (game, betting line) pair
for game in a:
    for line in game['lines']:
        game_dict = dict(id=game['id'])
        for cat in ('provider', 'spread', 'formattedSpread', 'overUnder'):
            game_dict[cat] = line[cat]
        buf.append(game_dict)

dflinestable = pd.DataFrame(buf)
dflinestable.set_index(['id', 'provider'], inplace=True)  # Add inplace=True
dflinestable_unstacked = dflinestable.unstack()  # unstack (you could also reassign to the same df)

# Flatten columns to a single level, named <provider>_<field> as described
dflinestable_unstacked.columns = [f'{j}_{i}' for i, j in dflinestable_unstacked.columns]

This gives you a DataFrame like this (abbreviated):

          Caesars_formattedSpread  ...  teamrankings_spread
id                                 ...
401012246             Alabama -24  ...                -23.5
401012247            Arkansas -34  ...                  NaN
401012248               Auburn -1  ...                 -1.5
401012249                     NaN  ...                  NaN
401012250             Georgia -44  ...                  NaN
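
As a possible alternative to the nested loop, pandas can flatten the nested lines list itself with pd.json_normalize. The sketch below is not part of the original answer: it runs on a hypothetical in-memory payload (games) shaped the way the loop assumes the API response is, with values borrowed from the output printed in the question, so it does not need network access.

```python
import pandas as pd

# Hypothetical payload; the real API response may contain more fields.
games = [
    {"id": 401013357, "lines": [
        {"provider": "consensus", "spread": -21.0,
         "formattedSpread": "UMass -21", "overUnder": 68.0},
    ]},
    {"id": 401022510, "lines": [
        {"provider": "Caesars", "spread": -17.5,
         "formattedSpread": "Colorado State -17.5", "overUnder": 57.5},
        {"provider": "numberfire", "spread": -17.0,
         "formattedSpread": "Colorado State -17", "overUnder": 58.5},
    ]},
]

# One row per (game, line): record_path walks into each game's "lines" list,
# and meta copies the parent game's id onto every resulting row.
flat = pd.json_normalize(games, record_path="lines", meta=["id"])
flat["id"] = flat["id"].astype("int64")  # give the id column an integer dtype

# Same reshape as in the answer: index by (id, provider), pivot, flatten names.
wide = flat.set_index(["id", "provider"]).unstack()
wide.columns = [f"{provider}_{field}" for field, provider in wide.columns]

print(sorted(wide.columns))
```

With the real response, linesresp.json() would take the place of games, assuming every game carries a lines list with these four keys.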

Regarding "python - How do I create new columns by combining data from existing columns?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58082099/
