gpt4 book ai didi

python - Pandas:列上的多重索引丢失后从宽变为长

转载 作者:太空宇宙 更新时间:2023-11-03 15:41:07 24 4
gpt4 key购买 nike

考虑常见的交易数据框:

trades = pd.DataFrame({
'time': pd.to_datetime(['20160525 13:30:00.023',
'20160525 13:30:00.038',
'20160525 13:30:00.048',
'20160525 13:30:00.048',
'20160525 13:30:00.048']),
'ticker': ['MSFT', 'MSFT','GOOG', 'BOOB', 'AAPL'],
'price': [51.95, 51.95, 720.77, 720.92, 98.00],
'quantity': [75, 155, 100, 100, 100]},
columns=['time', 'ticker', 'price', 'quantity'])

trades
Out[42]:
time ticker price quantity
0 2016-05-25 13:30:00.023 MSFT 51.95 75
1 2016-05-25 13:30:00.038 MSFT 51.95 155
2 2016-05-25 13:30:00.048 GOOG 720.77 100
3 2016-05-25 13:30:00.048 BOOB 720.92 100
4 2016-05-25 13:30:00.048 AAPL 98.00 100

在这里,我想从长到宽 reshape ,做一些事情,然后从宽 reshape 回长

从长到宽很容易

trades.set_index(['time','ticker'], inplace = True)
trades = trades.unstack()

trades
Out[44]:
price quantity \
ticker AAPL BOOB GOOG MSFT AAPL BOOB GOOG
time
2016-05-25 13:30:00.023 NaN NaN NaN 51.95 NaN NaN NaN
2016-05-25 13:30:00.038 NaN NaN NaN 51.95 NaN NaN NaN
2016-05-25 13:30:00.048 98.0 720.92 720.77 NaN 100.0 100.0 100.0


ticker MSFT
time
2016-05-25 13:30:00.023 75.0
2016-05-25 13:30:00.038 155.0
2016-05-25 13:30:00.048 NaN

但现在由于多种原因,我现在确实想在列上使用此多重索引,因此我有以下代码:

trades.columns=['_'.join(t) for t in trades.columns]

这基本上摆脱了多重索引并允许我使用普通列。数据现在如下所示:

trades
Out[47]:
price_AAPL price_BOOB price_GOOG price_MSFT \
time
2016-05-25 13:30:00.023 NaN NaN NaN 51.95
2016-05-25 13:30:00.038 NaN NaN NaN 51.95
2016-05-25 13:30:00.048 98.0 720.92 720.77 NaN

quantity_AAPL quantity_BOOB quantity_GOOG \
time
2016-05-25 13:30:00.023 NaN NaN NaN
2016-05-25 13:30:00.038 NaN NaN NaN
2016-05-25 13:30:00.048 100.0 100.0 100.0

quantity_MSFT
time
2016-05-25 13:30:00.023 75.0
2016-05-25 13:30:00.038 155.0
2016-05-25 13:30:00.048 NaN

问题是:我现在如何恢复长格式?

最佳答案

您可以使用:

#create MultiIndex from columns 
trades.columns = trades.columns.str.split('_', expand=True)
#stack and set index names for new column names
trades = trades.stack().rename_axis(['time','ticker']).reset_index()
#convert to int
trades.quantity = trades.quantity.astype(int)
print (trades)
time ticker price quantity
0 2016-05-25 13:30:00.023 MSFT 51.95 75
1 2016-05-25 13:30:00.038 MSFT 51.95 155
2 2016-05-25 13:30:00.048 AAPL 98.00 100
3 2016-05-25 13:30:00.048 BOOB 720.92 100
4 2016-05-25 13:30:00.048 GOOG 720.77 100

关于python - Pandas:列上的多重索引丢失后从宽变为长,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/42114760/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com