
python-2.7 - Pandas 0.18.1 groupby and resample multi-level aggregation error

Reposted. Author: 行者123. Updated: 2023-12-04 16:06:44

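I just updated Pandas from 0.17.1 to 0.18.1 and, while changing some pre-existing code, I think I found an issue with the new resample approach outlined below. According to the documentation linked below, df3_resample and df4_resample in my example should return the same DataFrame, but df4_resample raises an exception. This tripped me up for a while, so I thought I would share.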

Exception: Column(s) A already selected

http://pandas.pydata.org/pandas-docs/version/0.18.0/whatsnew.html#whatsnew-0180-breaking-resample

http://pandas.pydata.org/pandas-docs/version/0.18.1/whatsnew.html#groupby-syntax-with-window-and-resample-operations
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 4),
                  columns=list('ABCD'),
                  index=pd.date_range('2010-01-01 09:00:00', periods=10, freq='s'))
df['item'] = 'item_a'  # add a column to group by

# THIS WORKS: groupby + resample with a flat dict
df1_resample = df.groupby('item').resample('2s').agg({'A': np.mean, 'B': np.max}).reset_index()
print(df1_resample)

# THIS WORKS: plain resample with a nested dict
df2_resample = df.resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}}).reset_index()
print(df2_resample)

# THIS WORKS: resample inside groupby.apply with a nested dict
df3_resample = df.groupby('item').apply(lambda x: x.resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})).reset_index()
print(df3_resample)

# THIS DOES NOT WORK: raises "Exception: Column(s) A already selected"
df4_resample = df.groupby('item').resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})
print(df4_resample)

Output:
     item             level_1         A         B
0  item_a 2010-01-01 09:00:00  0.611660  0.739640
1  item_a 2010-01-01 09:00:02  0.615876  0.880113
2  item_a 2010-01-01 09:00:04  0.218292  0.441504
3  item_a 2010-01-01 09:00:06  0.753698  0.637787
4  item_a 2010-01-01 09:00:08  0.471272  0.474738

                index         A
                         A_mean     A_max
0 2010-01-01 09:00:00  0.611660  0.813038
1 2010-01-01 09:00:02  0.615876  0.994657
2 2010-01-01 09:00:04  0.218292  0.233478
3 2010-01-01 09:00:06  0.753698  0.848107
4 2010-01-01 09:00:08  0.471272  0.610592

     item             level_1         A
                                 A_mean     A_max
0  item_a 2010-01-01 09:00:00  0.611660  0.813038
1  item_a 2010-01-01 09:00:02  0.615876  0.994657
2  item_a 2010-01-01 09:00:04  0.218292  0.233478
3  item_a 2010-01-01 09:00:06  0.753698  0.848107
4  item_a 2010-01-01 09:00:08  0.471272  0.610592


File "<some_file.py>", line 29, in <module>
df4_resample = df.groupby('item').resample('2s').agg({'A': {'A_mean': np.mean, 'A_max': np.max}})

File "C:\Anaconda2\lib\site-packages\pandas\tseries\resample.py", line 293, in aggregate
result, how = self._aggregate(arg, *args, **kwargs)

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 505, in _aggregate
result = list(_agg(arg, _agg_1dim).values())

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 496, in _agg
result[fname] = func(fname, agg_how)

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 479, in _agg_1dim
return colg.aggregate(how, _level=(_level or 0) + 1)

File "C:\Anaconda2\lib\site-packages\pandas\tseries\resample.py", line 293, in aggregate
result, how = self._aggregate(arg, *args, **kwargs)

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 528, in _aggregate
result = _agg(arg, lambda fname,

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 496, in _agg
result[fname] = func(fname, agg_how)

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 529, in <lambda>
agg_how: _agg_1dim(self._selection, agg_how))

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 475, in _agg_1dim
colg = self._gotitem(name, ndim=1, subset=subset)

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 680, in _gotitem
groupby=self._groupby[key],

File "C:\Anaconda2\lib\site-packages\pandas\core\base.py", line 326, in __getitem__
raise Exception('Column(s) %s already selected' % self._selection)

Exception: Column(s) A already selected

Best answer

I don't know why resample doesn't work for this, but there is a convenient workaround that does not need a lambda. Try this:

df.groupby([
    'item', pd.Grouper(freq='2s')
]).agg({
    'A': ['mean', 'max']
}).rename(columns={
    'mean': 'A_mean', 'max': 'A_max'
}, level=1).reset_index()


Instead of using .resample('2S'), you can add pd.Grouper(freq='2s') to your groupby(). It does the same thing in your case. Here is the documentation: http://pandas.pydata.org/pandas-docs/version/0.18/generated/pandas.Grouper.html
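To make the "does the same thing in your case" claim concrete, here is a minimal sketch that compares the two routes on data shaped like the question's frame. The fixed random seed, the variable names, and the `[["A"]]` column selection are assumptions added for this illustration; they are not part of the original question or answer. The two routes are only expected to agree when every 2-second bucket holds data, as it does here.

```python
import numpy as np
import pandas as pd

# Data shaped like the question's frame; the seed is an assumption added
# here so that repeated runs produce the same numbers.
rng = np.random.RandomState(0)
df = pd.DataFrame(
    rng.rand(10, 4),
    columns=list("ABCD"),
    index=pd.date_range("2010-01-01 09:00:00", periods=10, freq="s"),
)
df["item"] = "item_a"

# Route 1: resample inside each group through apply (the pattern that
# worked in the question). Selecting [["A"]] keeps the grouping column
# out of the frame passed to the lambda.
via_apply = df.groupby("item")[["A"]].apply(
    lambda g: g.resample("2s")["A"].agg(["mean", "max"])
)

# Route 2: move the time bucketing into the groupby itself with pd.Grouper.
via_grouper = df.groupby(["item", pd.Grouper(freq="2s")])["A"].agg(["mean", "max"])

# Compare the aggregated values of the two routes.
same_values = np.allclose(via_apply.values, via_grouper.values)
print(same_values)
```

Both results are indexed by item and by the start of each 2-second bucket, so a value comparison is enough to check the equivalence for this data.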

As a side note, you should avoid renaming columns with a nested dictionary (it is deprecated) and use the actual .rename() method instead.
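Two other ways to get flat, descriptive column names without a nested dictionary are sketched below. Neither comes from the original answer: the first flattens the two-level header after a list aggregation, and the second uses named aggregation, which assumes pandas 0.25 or later and so would not run on the 0.18.1 install from the question. The seed and variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Data shaped like the question's frame; the seed is an assumption.
rng = np.random.RandomState(0)
df = pd.DataFrame(
    rng.rand(10, 4),
    columns=list("ABCD"),
    index=pd.date_range("2010-01-01 09:00:00", periods=10, freq="s"),
)
df["item"] = "item_a"

grouped = df.groupby(["item", pd.Grouper(freq="2s")])

# Option A: aggregate with a list, then join the two header levels
# ("A", "mean") -> "A_mean".
flat = grouped.agg({"A": ["mean", "max"]})
flat.columns = ["_".join(pair) for pair in flat.columns]

# Option B (assumes pandas >= 0.25): named aggregation sets the output
# column names directly as keyword arguments.
named = grouped.agg(A_mean=("A", "mean"), A_max=("A", "max"))

print(list(flat.columns))
print(list(named.columns))
```

Either result can be followed by .reset_index() to turn the group keys back into ordinary columns, as the answer's snippet does.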

Regarding python-2.7 - Pandas 0.18.1 groupby and resample multi-level aggregation error, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/38861244/
