I have a DataFrame with dates as the second level of its index. How can I filter it between two dates?
Here is the code that generates the DataFrame:
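import numpy as np
import pandas as pd

# freq='M' yields month-end dates; newer pandas releases spell this alias 'ME'
dates = pd.date_range(start='2015-01-01', end='2018-12-01', freq='M')
persons = ['John', 'Paul', 'Susan', 'Steve', 'Anne', 'Carol']
miindex = pd.MultiIndex.from_product([persons, dates],
                                     names=['persons', 'dates'])
# 6 persons x 47 month-ends = 282 rows
df = pd.DataFrame(np.random.randn(282, 4), columns=list('ABCD'), index=miindex)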
                           A         B         C         D
persons dates
John    2015-01-31 -1.381854  0.438590 -1.838329  0.085944
        2015-02-28 -1.870273  0.040513  1.116906  0.473218
        2015-03-31  0.522960 -0.190412 -0.650339 -0.532672
        2015-04-30  0.147605 -0.045129  1.209839  1.831272
        2015-05-31 -0.331290 -0.413971 -2.418138  0.149583
...            ...       ...       ...       ...       ...
Carol   2018-07-31 -0.344657  0.871752 -0.040436  0.132283
        2018-08-31  0.168781  0.776657 -0.103212 -0.082286
        2018-09-30  0.019738  0.151568 -0.794741 -1.316847
        2018-10-31 -1.047699  0.913352  1.009840  0.070882
        2018-11-30 -1.360346 -0.850818 -0.824563  0.305373
How can I filter the rows by date?
For example, filtering for dates from 01-01-2018 onward, I should get:
                           A         B         C         D
persons dates
John    2018-01-31  1.092697 -0.534817  1.498770 -0.746335
        2018-02-28  0.141443  0.286186 -0.652946 -0.331205
        2018-03-31 -0.547728  0.942533 -0.315792 -1.564275
        2018-04-30  2.383790  1.117817 -0.419611  1.603313
        2018-05-31  0.405304 -1.468452 -0.713453  0.605490
                         ...       ...       ...       ...
Carol   2018-07-31  0.711990  0.615596  1.198836  2.283507
        2018-08-31 -0.071486 -0.102290 -1.855148  0.284160
        2018-09-30  1.461128 -1.163214  1.142434  0.183197
        2018-10-31 -1.994097 -0.275098  0.877738 -1.094145
        2018-11-30  0.225581  2.194110  0.160663  1.582566
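Note that the values in columns A, B, C and D should be ignored, because the DataFrame is generated randomly; the output above is only meant to show the expected index.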
Best answer
Use partial string indexing with the MultiIndex, but first sort the index with DataFrame.sort_index:
df = df.sort_index()
idx = pd.IndexSlice
print (df.loc[idx[:, "2016"], :])
                           A         B         C         D
persons dates
Anne    2016-01-31  1.189332  1.240492  1.948487  1.049944
        2016-02-29  0.155651  0.172096 -1.315934  2.447474
        2016-03-31  0.258901  1.052156  0.194412  0.551807
        2016-04-30  0.817727 -0.039305  0.196576 -1.163072
        2016-05-31 -0.379003 -0.640898 -0.412814 -0.507134
                         ...       ...       ...       ...
Susan   2016-08-31  0.944875  0.655981 -1.167568  1.087909
        2016-09-30 -0.533770  0.271889  0.743089 -1.021702
        2016-10-31 -0.548632  0.980111  1.288285 -1.130429
        2016-11-30  0.843035 -1.019152  0.394127  0.375720
        2016-12-31  0.789154  0.660676 -0.097020 -0.392890
[72 rows x 4 columns]
print (df.loc[idx[:, "2015":"2017"], :])
                           A         B         C         D
persons dates
Anne    2015-01-31  0.340056 -0.084973 -0.160449  0.476274
        2015-02-28  1.521403  2.075643 -0.089913 -3.556345
        2015-03-31  1.871844 -1.933054  0.360196 -1.184768
        2015-04-30  1.996072 -0.671001  1.001818  0.787014
        2015-05-31  0.642655 -0.685923 -0.854484 -0.311828
                         ...       ...       ...       ...
Susan   2017-08-31 -0.349868  1.095051  0.950181  1.365780
        2017-09-30  0.937602  0.456578  0.169026 -0.559212
        2017-10-31 -0.404749  0.595979 -0.434110  2.312148
        2017-11-30  1.381366 -1.470635  0.773891 -0.686727
        2017-12-31 -0.611788  0.963277  0.564169 -0.647526
[216 rows x 4 columns]
print (df.loc[idx[:, "01-02-2016":], :])
                           A         B         C         D
persons dates
Anne    2016-01-31  1.189332  1.240492  1.948487  1.049944
        2016-02-29  0.155651  0.172096 -1.315934  2.447474
        2016-03-31  0.258901  1.052156  0.194412  0.551807
        2016-04-30  0.817727 -0.039305  0.196576 -1.163072
        2016-05-31 -0.379003 -0.640898 -0.412814 -0.507134
                         ...       ...       ...       ...
Susan   2018-07-31 -0.180213 -0.613854 -0.143997  0.938364
        2018-08-31 -1.232334 -1.066170  2.074717 -0.219996
        2018-09-30 -0.014457  0.350130 -0.920580  0.040339
        2018-10-31  1.651722 -0.399346 -1.647574  0.323075
        2018-11-30  1.465342  0.182188  0.039446 -1.155651
[210 rows x 4 columns]
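A note on the date string used in this slice: "01-02-2016" is ambiguous, and as far as I know pandas reads such a string month-first by default (January 2, not February 1), which is consistent with the 210 rows shown. The sketch below is not part of the original answer; it only checks that parsing behavior and shows the unambiguous ISO spelling, which may be the safer choice in slices.

```python
import pandas as pd

# Ambiguous "NN-NN-YYYY" strings are parsed month-first by default.
ambiguous = pd.Timestamp("01-02-2016")

# The ISO form (YYYY-MM-DD) leaves no room for interpretation.
iso = pd.Timestamp("2016-01-02")

# Day-first parsing has to be requested explicitly.
day_first = pd.to_datetime("01-02-2016", dayfirst=True)

print(ambiguous == iso)
print(day_first == pd.Timestamp("2016-02-01"))
```

With ISO strings the slice from the answer would read df.loc[idx[:, "2016-01-02":], :].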
print (df.loc[idx[:, "01-01-2018":], :])
                           A         B         C         D
persons dates
Anne    2018-01-31  0.072784 -0.093604 -0.896780 -0.336099
        2018-02-28 -0.591907 -0.439462 -0.189500  0.172523
        2018-03-31  0.027810 -0.932447  0.547707 -0.148938
        2018-04-30 -0.114616  0.116554 -0.840459 -1.807368
        2018-05-31 -0.017403  0.562685  0.157102  1.739236
                         ...       ...       ...       ...
Susan   2018-07-31 -0.180213 -0.613854 -0.143997  0.938364
        2018-08-31 -1.232334 -1.066170  2.074717 -0.219996
        2018-09-30 -0.014457  0.350130 -0.920580  0.040339
        2018-10-31  1.651722 -0.399346 -1.647574  0.323075
        2018-11-30  1.465342  0.182188  0.039446 -1.155651
[66 rows x 4 columns]
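Since the question asks about filtering between two dates, here is a minimal sketch with both bounds given. It rebuilds a frame shaped like the one in the question (using a MonthEnd offset object instead of the 'M' alias, which is my own substitution) and compares the sorted-index slice with a boolean mask on the "dates" level. The date bounds and the names by_slice and by_mask are illustrative choices, not from the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (values are random).
dates = pd.date_range(start="2015-01-31", periods=47, freq=pd.offsets.MonthEnd())
persons = ["John", "Paul", "Susan", "Steve", "Anne", "Carol"]
miindex = pd.MultiIndex.from_product([persons, dates], names=["persons", "dates"])
df = pd.DataFrame(
    np.random.randn(len(miindex), 4), columns=list("ABCD"), index=miindex
).sort_index()

idx = pd.IndexSlice

# Option 1: bounded slice on the second level (relies on the sorted index).
by_slice = df.loc[idx[:, "2017-06-01":"2018-03-31"], :]

# Option 2: boolean mask on the "dates" level (does not need a sorted index).
level = df.index.get_level_values("dates")
mask = (level >= pd.Timestamp("2017-06-01")) & (level <= pd.Timestamp("2018-03-31"))
by_mask = df[mask]

print(by_slice.equals(by_mask))
print(len(by_slice))
```

The slice is shorter to write, while the mask may be easier to combine with other conditions; both bounds are inclusive in either form.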
Source: a similar question, "python - How to filter by a date slice on the second level of a MultiIndex DataFrame", can be found on Stack Overflow: https://stackoverflow.com/questions/66681726/