
python - How to average a column based on groups using python pandas?

Reposted. Author: 行者123. Updated: 2023-11-29 05:06:35

I have input like this:

NAME            Geoid   Year    QTR  Index
'Abilene, TX    10180   1978    3    0
'Abilene, TX    10180   1978    4    0
'Abilene, TX    10180   1979    1    0
'Abilene, TX    10180   1979    2    0
'Decatur, IL    19500   1998    1    110.51
'Decatur, IL    19500   1998    2    110.48
'Decatur, IL    19500   1998    3    113.01
'Decatur, IL    19500   1998    4    114.16
'Fairbanks, AK  21820   1990    1    63.74
'Fairbanks, AK  21820   1990    2    70.68
'Fairbanks, AK  21820   1990    3    83.56
'Fairbanks, AK  21820   1990    4    83.95

The query that I want to convert from MySQL to Python is this:

SELECT geoid, name, YEAR, AVG(index)
FROM table_1
WHERE geoid>0
GROUP BY geoid, metro_name, YEAR;

From what I have read online, the pythonic equivalent of AVG is mean, but when I use mean it gives me a single value.

pandas get column average/mean

But I want the output grouped by year, with the quarters averaged, as follows:

Name            Geoid   YEAR    AVG(index)
'Abilene, TX    10180   1978    0
'Abilene, TX    10180   1979    0
'Decatur, IL    19500   1998    111.75
'Fairbanks, AK  21820   1990    74.9875

How can I achieve this?

Best answer

Filter first with query or boolean indexing, then use groupby and aggregate with mean:

df1 = df.query('Geoid > 0').groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825

df1 = df[df['Geoid'] > 0].groupby(['NAME','Geoid','Year'], as_index=False)['Index'].mean()
print(df1)
             NAME  Geoid  Year     Index
0    'Abilene, TX  10180  1978    0.0000
1    'Abilene, TX  10180  1979    0.0000
2    'Decatur, IL  19500  1998  112.0400
3  'Fairbanks, AK  21820  1990   75.4825
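
To try the approach without the original data file, here is a minimal self-contained sketch. It rebuilds the question's sample rows by hand and contrasts a plain column mean (one number, as described in the question) with the grouped mean. The variable names rows, overall_mean and grouped are illustrative and do not come from the original post, and it assumes the leading apostrophe is part of each NAME value, as the printed output above suggests.

```python
import pandas as pd

# Sample rows copied from the question's input table.
# Assumption: the leading apostrophe belongs to the NAME value.
rows = [
    ("'Abilene, TX", 10180, 1978, 3, 0.0),
    ("'Abilene, TX", 10180, 1978, 4, 0.0),
    ("'Abilene, TX", 10180, 1979, 1, 0.0),
    ("'Abilene, TX", 10180, 1979, 2, 0.0),
    ("'Decatur, IL", 19500, 1998, 1, 110.51),
    ("'Decatur, IL", 19500, 1998, 2, 110.48),
    ("'Decatur, IL", 19500, 1998, 3, 113.01),
    ("'Decatur, IL", 19500, 1998, 4, 114.16),
    ("'Fairbanks, AK", 21820, 1990, 1, 63.74),
    ("'Fairbanks, AK", 21820, 1990, 2, 70.68),
    ("'Fairbanks, AK", 21820, 1990, 3, 83.56),
    ("'Fairbanks, AK", 21820, 1990, 4, 83.95),
]
df = pd.DataFrame(rows, columns=["NAME", "Geoid", "Year", "QTR", "Index"])

# Calling mean on the whole column collapses it to one number.
overall_mean = df["Index"].mean()

# Filter first (the WHERE clause), then group (GROUP BY) and average (AVG).
grouped = (
    df[df["Geoid"] > 0]
    .groupby(["NAME", "Geoid", "Year"], as_index=False)["Index"]
    .mean()
)
print(grouped)
```

With as_index=False the group keys stay as ordinary columns, which is closer to the shape of the SQL result than the default, where they would become the index.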

Regarding python - How to average a column based on groups using python pandas?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46648183/
