
python - How to sort a MultiIndex DataFrame by the second level

Reposted. Author: 行者123. Updated: 2023-12-01 03:21:10

I have a DataFrame with a MultiIndex. The index levels are OptionSymbol (level 0) and QuoteDatetime (level 1). I sorted and indexed the DataFrame as follows:

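# Sort symbols descending and quote dates ascending within each symbol.
# (Named sorted_df to avoid shadowing the built-in sorted().)
sorted_df = df.sort_values(
    ['OptionSymbol', 'QuoteDatetime'],
    ascending=[False, True]
)

# Move the two sort columns into a two-level index.
indexed = sorted_df.set_index(
    ['OptionSymbol', 'QuoteDatetime'],
    drop=True
)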

This produces the following result:

                                Id  Strike  Expiration OptionType
OptionSymbol QuoteDatetime
ZBYMZ        2013-09-02     234669   170.0  2011-01-22        put
             2013-09-03     234901   170.0  2011-01-22        put
             2013-09-04     235133   170.0  2011-01-22        put
...          ...               ...     ...         ...        ...
YBWNA        2010-02-12     262202    95.0  2010-02-20       call
             2010-02-16     262454    95.0  2010-02-20       call
             2010-02-17     262707    95.0  2010-02-20       call
...          ...               ...     ...         ...        ...
XWNAX        2012-07-12     262201    90.0  2010-02-20       call
             2012-07-16     262453    90.0  2010-02-20       call
             2012-07-17     262706    90.0  2010-02-20       call
...          ...               ...     ...         ...        ...
WWWAX        2012-04-12     262201    90.0  2010-02-20       call
             2012-04-16     262453    90.0  2010-02-20       call
             2012-04-17     262706    90.0  2010-02-20       call
...          ...               ...     ...         ...        ...

As expected, the frame is sorted first by OptionSymbol in descending order, and then by QuoteDatetime in ascending order within each OptionSymbol group.

What I need now is to order the OptionSymbol groups by the first QuoteDatetime value in each group, so that the result looks like this:

                                Id  Strike  Expiration OptionType
OptionSymbol QuoteDatetime
XBWNA        2010-02-12     262202    95.0  2010-02-20       call
             2010-02-16     262454    95.0  2010-02-20       call
             2010-02-17     262707    95.0  2010-02-20       call
...          ...               ...     ...         ...        ...
NWWAX        2012-04-12     262201    90.0  2010-02-20       call
             2012-04-16     262453    90.0  2010-02-20       call
             2012-04-17     262706    90.0  2010-02-20       call
...          ...               ...     ...         ...        ...
BWNAX        2012-07-12     262201    90.0  2010-02-20       call
             2012-07-16     262453    90.0  2010-02-20       call
             2012-07-17     262706    90.0  2010-02-20       call
...          ...               ...     ...         ...        ...
XBYMZ        2013-09-02     234669   170.0  2011-01-22        put
             2013-09-03     234901   170.0  2011-01-22        put
             2013-09-04     235133   170.0  2011-01-22        put
...          ...               ...     ...         ...        ...

I have tried various ways of sorting by index level 1, but then I lose the OptionSymbol grouping. How can I do this kind of sort?

Edit: code to recreate the data

from collections import OrderedDict

import numpy as np
import pandas as pd

# Build the DataFrame from an ordered mapping of column name -> Series.
# The four value columns are random placeholders, not real option data.
df = pd.DataFrame(OrderedDict((
    ('OptionSymbol', pd.Series(['ZBYMZ', 'ZBYMZ', 'ZBYMZ', 'YBWNA', 'YBWNA', 'YBWNA', 'XWNAX', 'XWNAX', 'XWNAX', 'WWWAX', 'WWWAX', 'WWWAX'])),
    ('QuoteDatetime', pd.Series(['2013-09-02', '2013-09-03', '2013-09-04', '2010-02-12', '2010-02-16', '2010-02-17', '2012-07-12', '2012-07-16', '2012-07-17', '2012-04-12', '2012-04-16', '2012-04-17'])),
    ('Id', pd.Series(np.random.randn(12,))),
    ('Strike', pd.Series(np.random.randn(12,))),
    ('Expiration', pd.Series(np.random.randn(12,))),
    ('OptionType', pd.Series(np.random.randn(12,)))
)))

Oddly, df.sort_index(level=1) works in this case, but on my full dataset (20+ columns) I lose the OptionSymbol grouping.
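One way the grouping can get lost: if the QuoteDatetime ranges of different symbols overlap, sorting on level 1 interleaves their rows. The following is a minimal sketch of that effect, assuming overlapping ranges are the cause in the full dataset. The symbols AAA and BBB, their dates, and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the date ranges of the two symbols overlap.
toy = pd.DataFrame({
    "OptionSymbol": ["AAA", "AAA", "BBB", "BBB"],
    "QuoteDatetime": pd.to_datetime(
        ["2012-01-01", "2012-01-03", "2012-01-02", "2012-01-04"]
    ),
    "Strike": [90.0, 90.0, 95.0, 95.0],
}).set_index(["OptionSymbol", "QuoteDatetime"])

# Sorting on level 1 orders rows purely by date, so the symbols alternate.
by_date = toy.sort_index(level=1)
symbols_in_order = by_date.index.get_level_values("OptionSymbol").tolist()
print(symbols_in_order)
```

In the sample data from the question the date ranges of the four symbols do not overlap, which would explain why the grouping survives there.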

Best answer

IIUC, you can simply sort the index by the second level:

In [27]: df.sort_index(level=1)
Out[27]:
                                Id  Strike  Expiration OptionType
OptionSymbol QuoteDatetime
YBWNA        2010-02-12     262202    95.0  2010-02-20       call
             2010-02-16     262454    95.0  2010-02-20       call
             2010-02-17     262707    95.0  2010-02-20       call
WWWAX        2012-04-12     262201    90.0  2010-02-20       call
             2012-04-16     262453    90.0  2010-02-20       call
             2012-04-17     262706    90.0  2010-02-20       call
XWNAX        2012-07-12     262201    90.0  2010-02-20       call
             2012-07-16     262453    90.0  2010-02-20       call
             2012-07-17     262706    90.0  2010-02-20       call
ZBYMZ        2013-09-02     234669   170.0  2011-01-22        put
             2013-09-03     234901   170.0  2011-01-22        put
             2013-09-04     235133   170.0  2011-01-22        put
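When the QuoteDatetime ranges of different symbols overlap, sorting on level 1 alone will not keep each symbol's rows together. A possible workaround, sketched here under the assumption that the goal is to order whole OptionSymbol groups by their earliest QuoteDatetime, is to compute that earliest value per group and sort on it first. The helper name sort_groups_by_first_quote, the temporary column _first_quote, and the toy data are hypothetical and not part of the original answer.

```python
import pandas as pd


def sort_groups_by_first_quote(frame):
    """Order OptionSymbol groups by their earliest QuoteDatetime.

    Rows of the same symbol stay contiguous and are sorted by date.
    Assumes `frame` has a MultiIndex named (OptionSymbol, QuoteDatetime).
    """
    flat = frame.reset_index()
    # Earliest quote date of each symbol, repeated on every row of that symbol.
    flat["_first_quote"] = (
        flat.groupby("OptionSymbol")["QuoteDatetime"].transform("min")
    )
    # Group key first, then symbol (tie-breaker), then date within the group.
    flat = flat.sort_values(["_first_quote", "OptionSymbol", "QuoteDatetime"])
    return (
        flat.drop(columns="_first_quote")
            .set_index(["OptionSymbol", "QuoteDatetime"])
    )


# Hypothetical data with overlapping date ranges.
toy = pd.DataFrame({
    "OptionSymbol": ["AAA", "AAA", "BBB", "BBB"],
    "QuoteDatetime": pd.to_datetime(
        ["2012-01-02", "2012-01-04", "2012-01-01", "2012-01-03"]
    ),
    "Strike": [90.0, 90.0, 95.0, 95.0],
}).set_index(["OptionSymbol", "QuoteDatetime"])

result = sort_groups_by_first_quote(toy)
print(result)
```

Here BBB comes first because its earliest quote (2012-01-01) precedes that of AAA, and each symbol's rows remain together in date order.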

Regarding "python - How to sort a MultiIndex DataFrame by the second level", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41907210/
