
python - Averaging column values over several intervals in Python

Reposted. Author: 行者123. Updated: 2023-12-05 07:24:30

I have a DataFrame containing a Depth column and several other value columns:

import pandas as pd

data = {'Depth': [1.0, 1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.0, 5.0, 5.5, 6.0],
        'Value1': [44, 46, 221, 12, 47, 44, 67, 90, 100, 111, 112, 120, 122],
        'Value2': [55, 65, 76, 45, 55, 58, 23, 12, 32, 20, 22, 26, 36]}

df = pd.DataFrame(data)

As you can see, Depth sometimes contains duplicate values.
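As a quick check of that observation, the sketch below lists which Depth values repeat. It rebuilds only the Depth column from the data above; the name `dup_depths` is introduced here purely for illustration.

```python
import pandas as pd

# Depth column copied from the question's data.
df = pd.DataFrame({'Depth': [1.0, 1.0, 1.5, 2.0, 2.5, 2.5, 3.0,
                             3.5, 4.0, 4.0, 5.0, 5.5, 6.0]})

# duplicated() flags every occurrence after the first, so each
# repeated depth shows up once per extra occurrence.
dup_depths = sorted(df.loc[df['Depth'].duplicated(), 'Depth'].unique())
print(dup_depths)
```

Duplicates matter for averaging because every row counts separately, so a depth that appears twice contributes two values to its interval's mean.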

I would like to group the rows by interval in some way and average the values within each group. For example, with:

intervals = [1.0, 2.0]

Take the list of intervals, split the dataset along each interval, and average every value column (Value1, Value2) within it, to get:

    Depth  Value1  Value2  Avg1_1  Avg2_1  Avg1_2  Avg2_2
0     1.0      44      55   80.75   60.25    78.2       .
1     1.0      46      65   80.75   60.25    78.2       .
2     1.5     221      76   80.75   60.25    78.2       .
3     2.0      12      45   80.75   60.25    78.2
4     2.5      47      55   52.67       .    78.2
5     2.5      44      58   52.67       .    78.2
6     3.0      67      23   52.67       .    78.2
7     3.5      90      12  100.33            78.2
8     4.0     100      32  100.33            78.2
9     4.0     111      20  100.33            78.2
10    5.0     112      22     112       .
11    5.5     120      26     121       .
12    6.0     122      36     121       .

where Avg1_1 is the average of Value1 within each interval of 1.0 (endpoints inclusive: 1.0 - 2.0, 2.5 - 3.0, and so on).

Is there a simple way to do this with groupby in a loop?
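One possible reading of this request can be sketched with `pd.cut` and `groupby(...).transform`. This is only an illustration under stated assumptions, not the accepted approach: each entry of the interval list is treated as a bin width, bins start at the smallest Depth, every bin is closed on the right, and the first bin also includes its left edge. The helper `add_interval_means` and its parameters are hypothetical names. Under these assumptions the width-1.0 columns match the Avg1_1 and Avg2_1 values shown above; the width-2.0 columns depend on where the bin edges are placed and may differ from the hand-written table.

```python
import numpy as np
import pandas as pd

data = {'Depth': [1.0, 1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.0, 5.0, 5.5, 6.0],
        'Value1': [44, 46, 221, 12, 47, 44, 67, 90, 100, 111, 112, 120, 122],
        'Value2': [55, 65, 76, 45, 55, 58, 23, 12, 32, 20, 22, 26, 36]}
df = pd.DataFrame(data)


def add_interval_means(frame, widths, value_cols, depth_col='Depth'):
    """Hypothetical helper: add one per-bin mean column per (width, value column)."""
    out = frame.copy()
    start, stop = out[depth_col].min(), out[depth_col].max()
    for width in widths:
        # Assumed edges: start, start + width, ... far enough to cover the max depth.
        n_bins = max(int(np.ceil((stop - start) / width)), 1)
        edges = start + width * np.arange(n_bins + 1)
        # include_lowest=True makes the first bin contain its left edge.
        bins = pd.cut(out[depth_col], bins=edges, include_lowest=True)
        for i, col in enumerate(value_cols, start=1):
            # transform('mean') broadcasts each bin's mean back to its rows.
            out[f'Avg{i}_{width:g}'] = (
                out.groupby(bins, observed=True)[col].transform('mean')
            )
    return out


result = add_interval_means(df, [1.0, 2.0], ['Value1', 'Value2'])
print(result)
```

Using `transform` rather than `agg` keeps the original row count, which is what the desired output table requires.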

Best answer

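You can do this with the DataFrame's apply method, using boolean indexing to select, for each row, the rows whose Depth is <= that row's depth + 1.0 or <= that row's depth + 2.0: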

# For each row, average Value1 over all rows whose Depth is at most this row's Depth + 1.0
df['avg1_1'] = df.apply(lambda x: df.loc[df['Depth'] <= x['Depth'] + 1.0, 'Value1'].mean(),
                        axis=1)

# Same threshold, averaging Value2
df['avg2_1'] = df.apply(lambda x: df.loc[df['Depth'] <= x['Depth'] + 1.0, 'Value2'].mean(),
                        axis=1)

# Threshold widened to this row's Depth + 2.0, averaging Value1
df['avg1_2'] = df.apply(lambda x: df.loc[df['Depth'] <= x['Depth'] + 2.0, 'Value1'].mean(),
                        axis=1)

# Threshold widened to this row's Depth + 2.0, averaging Value2
df['avg2_2'] = df.apply(lambda x: df.loc[df['Depth'] <= x['Depth'] + 2.0, 'Value2'].mean(),
                        axis=1)

This returns the following (the newval column is not created by the code shown here):

    Depth  Value1  Value2  newval     avg1_1     avg2_1     avg1_2     avg2_2
0     1.0      44      55    66.0  80.750000  60.250000  68.714286  53.857143
1     1.0      46      65   241.0  80.750000  60.250000  68.714286  53.857143
2     1.5     221      76    32.0  69.000000  59.000000  71.375000  48.625000
3     2.0      12      45    67.0  68.714286  53.857143  78.200000  44.100000
4     2.5      47      55    64.0  71.375000  48.625000  78.200000  44.100000
5     2.5      44      58    87.0  71.375000  48.625000  78.200000  44.100000
6     3.0      67      23   110.0  78.200000  44.100000  81.272727  42.090909
7     3.5      90      12   120.0  78.200000  44.100000  84.500000  40.750000
8     4.0     100      32   131.0  81.272727  42.090909  87.384615  40.384615
9     4.0     111      20   132.0  81.272727  42.090909  87.384615  40.384615
10    5.0     112      22   140.0  87.384615  40.384615  87.384615  40.384615
11    5.5     120      26   142.0  87.384615  40.384615  87.384615  40.384615
12    6.0     122      36     NaN  87.384615  40.384615  87.384615  40.384615
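Note that these columns are running means: each row averages every row from the shallowest depth up to its own Depth plus the offset, so the values change row by row instead of staying constant within a bin. The per-row apply also rescans the whole frame once for every row. A vectorized sketch of the same calculation, using a sort, a cumulative sum, and `np.searchsorted`, is shown below. It assumes non-negative offsets and no missing values; the helper name `running_mean_up_to` is hypothetical, and the generated column names (`avg1_1` and so on) simply mirror the ones in the answer.

```python
import numpy as np
import pandas as pd

data = {'Depth': [1.0, 1.0, 1.5, 2.0, 2.5, 2.5, 3.0, 3.5, 4.0, 4.0, 5.0, 5.5, 6.0],
        'Value1': [44, 46, 221, 12, 47, 44, 67, 90, 100, 111, 112, 120, 122],
        'Value2': [55, 65, 76, 45, 55, 58, 23, 12, 32, 20, 22, 26, 36]}
df = pd.DataFrame(data)


def running_mean_up_to(frame, col, width, depth_col='Depth'):
    """Hypothetical helper: per row, mean of `col` over rows with depth <= row depth + width."""
    depth = frame[depth_col].to_numpy()
    order = np.argsort(depth, kind='stable')
    sorted_depth = depth[order]
    # Cumulative sum of the values in depth order.
    csum = np.cumsum(frame[col].to_numpy()[order])
    # Number of rows whose depth is <= this row's depth + width (at least 1 when width >= 0).
    counts = np.searchsorted(sorted_depth, depth + width, side='right')
    return pd.Series(csum[counts - 1] / counts, index=frame.index)


for width in (1.0, 2.0):
    for i, col in enumerate(['Value1', 'Value2'], start=1):
        df[f'avg{i}_{width:g}'] = running_mean_up_to(df, col, width)

print(df)
```

On the sample data this should agree with the apply-based columns above, while doing one sort per column instead of one full scan per row.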

For more on python - Averaging column values over several intervals in Python, see the similar question we found on Stack Overflow: https://stackoverflow.com/questions/55407624/
