
python - Arithmetic operations on a groupby pandas DataFrame

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:10:58

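I have a Pandas DataFrame with 40 columns and 400,000 rows. I created a summarized dataset by grouping on 3 columns.

Now I need to compute a percentage metric from two of the columns. Python throws this error: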

unsupported operand type(s) for /: 'SeriesGroupBy' and 'SeriesGroupBy'

Here is the sample code:

print(sample_data)
   date  part  receipt  bad_dollars  total_dollars  bad_percent
0     1   123       22           40            100          NaN
1     2   456       44           80            120          NaN
2     3   134       33           30            150          NaN
3     1   123       22           80            100          NaN
4     5   456       45           40             90          NaN
5     3   134       33           85            150          NaN
6     7   123       24           70            120          NaN
7     5   456       45           20             85          NaN
8     9   134       35           50            300          NaN
9     7   123       24          300            600          NaN

sample_data_group = sample_data.groupby(['date','part','receipt'])

sample_data_group['bad_percents']=sample_data_group['bad_dollars']/sample_data_group['total_dollars']

TypeError: unsupported operand type(s) for /: 'SeriesGroupBy' and 'SeriesGroupBy'

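Please help!

Best answer

You can do this by calling apply on the groupby object, so that the division runs on each group's ordinary DataFrame columns rather than on the SeriesGroupBy objects themselves: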

import pandas as pd
import numpy as np

# Rebuild the sample data from the question
cols = ['index', 'date', 'part', 'receipt', 'bad_dollars', 'total_dollars',
        'bad_percent']
sample_data = pd.DataFrame([
    [0, 1, 123, 22, 40, 100, np.nan],
    [1, 2, 456, 44, 80, 120, np.nan],
    [2, 3, 134, 33, 30, 150, np.nan],
    [3, 1, 123, 22, 80, 100, np.nan],
    [4, 5, 456, 45, 40, 90, np.nan],
    [5, 3, 134, 33, 85, 150, np.nan],
    [6, 7, 123, 24, 70, 120, np.nan],
    [7, 5, 456, 45, 20, 85, np.nan],
    [8, 9, 134, 35, 50, 300, np.nan],
    [9, 7, 123, 24, 300, 600, np.nan]],
    columns=cols).set_index('index', drop=True)

sample_data_group = sample_data.groupby(['date', 'part', 'receipt'])

# Compute the ratio inside each group; reset_index then drops the
# group-key levels from the index of the result
xx = sample_data_group.apply(
    lambda x: x.assign(bad_percent=x.bad_dollars / x.total_dollars))\
    .reset_index(['date', 'part', 'receipt'], drop=True)
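
As an alternative sketch (not part of the original answer), the division can also be done without apply. This assumes the intended per-group metric is the sum of bad_dollars divided by the sum of total_dollars, which the question does not state explicitly. The names keys, summary and row_level are made up for illustration. Aggregating first turns each SeriesGroupBy into ordinary columns, and ordinary columns support the / operator.

```python
import pandas as pd

# Same ten rows as the sample data above (the all-NaN bad_percent column is omitted).
sample_data = pd.DataFrame({
    "date":          [1, 2, 3, 1, 5, 3, 7, 5, 9, 7],
    "part":          [123, 456, 134, 123, 456, 134, 123, 456, 134, 123],
    "receipt":       [22, 44, 33, 22, 45, 33, 24, 45, 35, 24],
    "bad_dollars":   [40, 80, 30, 80, 40, 85, 70, 20, 50, 300],
    "total_dollars": [100, 120, 150, 100, 90, 150, 120, 85, 300, 600],
})

keys = ["date", "part", "receipt"]  # hypothetical helper name

# Assumption: the per-group metric is sum(bad_dollars) / sum(total_dollars).
# Summing the grouped columns returns a regular DataFrame, so "/" works.
summary = sample_data.groupby(keys)[["bad_dollars", "total_dollars"]].sum()
summary["bad_percent"] = summary["bad_dollars"] / summary["total_dollars"]
summary = summary.reset_index()

# If a row-level ratio is what is needed, no groupby is required at all:
# dividing two columns is already element-wise.
row_level = sample_data.assign(
    bad_percent=sample_data["bad_dollars"] / sample_data["total_dollars"]
)
```

Here summary holds one row per (date, part, receipt) combination, while row_level keeps all ten original rows. Which of the two matches the "% metric" in the question depends on whether the percentage should be computed per group or per row.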

Regarding "python - Arithmetic operations on a groupby pandas DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38342528/
