
python - Pandas - Groupby multiple columns

Reposted. Author: 太空宇宙. Updated: 2023-11-03 13:39:43

I am trying to group by multiple columns and aggregate the remaining values so that, after grouping, they end up as a list.

At the moment, the DataFrame looks like this:

[image]

I tried this:

grouped = DataFrame.groupby(['jobname', 'block'], axis=0)
DataFrame= grouped.aggregate(lambda x: list(x))

However, when I run it in IPython, it gives me this error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-221-97113b757fa1> in <module>()
----> 1 cassandraFrame_2 = grouped.aggregate(lambda x: list(x))
2 cassandraFrame_2

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in aggregate(self, arg, *args, **kwargs)
2867
2868 if self.grouper.nkeys > 1:
-> 2869 return self._python_agg_general(arg, *args, **kwargs)
2870 else:
2871

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in _python_agg_general(self, func, *args, **kwargs)
1166 for name, obj in self._iterate_slices():
1167 try:
-> 1168 result, counts = self.grouper.agg_series(obj, f)
1169 output[name] = self._try_cast(result, obj)
1170 except TypeError:

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in agg_series(self, obj, func)
1633 return self._aggregate_series_fast(obj, func)
1634 except Exception:
-> 1635 return self._aggregate_series_pure_python(obj, func)
1636
1637 def _aggregate_series_fast(self, obj, func):

/usr/local/lib/python2.7/dist-packages/pandas/core/groupby.pyc in _aggregate_series_pure_python(self, obj, func)
1667 if (isinstance(res, (Series, Index, np.ndarray)) or
1668 isinstance(res, list)):
-> 1669 raise ValueError('Function does not reduce')
1670 result = np.empty(ngroups, dtype='O')
1671

ValueError: Function does not reduce

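Ultimately, I want to group rows that share the same jobname and block, with data becoming a list of tuples. Right now each data value is a single 3-item tuple.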

For example:

jobname       block         data
Complete-Test Simple_buff (tuple_1)
Complete-Test Simple_buff (tuple_2)

Aggregated:

jobname       block         data
Complete-Test Simple_buff [(tuple_1),(tuple_2)]

I can group by jobname alone, but that aggregates the blocks together, and I want to keep the blocks separate.

Can someone point me in the right direction?

Thanks

Best answer

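It looks like there is an explicit check that the value returned by the aggregating function is not a Series, Index, np.ndarray, or list. That check is what raises "Function does not reduce" in the traceback above.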

So the following should work, because a tuple passes that check:

grouped = df.groupby(['jobname', 'block'])
aggregated = grouped.aggregate(lambda x: tuple(x))
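
As a rough illustration of the tuple workaround, here is a minimal self-contained sketch. The sample rows and the name sample_df are made up to mirror the example in the question, and selecting the data column plus converting each tuple back to a list afterwards are assumptions about what the asker wants, not part of the original answer. Behaviour may differ between pandas versions.

```python
import pandas as pd

# Hypothetical sample data mirroring the layout shown in the question
sample_df = pd.DataFrame({
    "jobname": ["Complete-Test", "Complete-Test", "Other-Test"],
    "block": ["Simple_buff", "Simple_buff", "Simple_buff"],
    "data": [(1, 2, 3), (4, 5, 6), (7, 8, 9)],
})

# Group on both key columns and collect each group's data values into a tuple
grouped = sample_df.groupby(["jobname", "block"])
aggregated = grouped["data"].aggregate(lambda x: tuple(x))

# Turn each aggregated tuple into a list, then restore the keys as columns
aggregated = aggregated.map(list).reset_index()
print(aggregated)
```

Each (jobname, block) pair ends up as one row whose data cell holds a list of the original tuples, so the blocks stay separate.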

Regarding python - Pandas - Groupby multiple columns, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/33702855/
