
python - How do I set up an aggregation after groupby?

Reposted. Author: 行者123. Updated: 2023-12-04 10:46:00

Given that I have the following dataset:

import pandas as pd

# One row per visit; patient_ID identifies the patient within a facility.
dt = {
    "facility": ["Ann Arbor", "Ann Arbor", "Detriot", "Detriot", "Detriot"],
    "patient_ID": [4388, 4388, 9086, 9086, 9086],
    "year": [2004, 2007, 2007, 2008, 2011],
    "month": [8, 9, 9, 6, 2],
    "Nr_Small": [0, 0, 5, 12, 10],
    "Nr_Medium": [3, 1, 1, 4, 3],
    "Nr_Large": [2, 0, 0, 0, 0],
    "PeriodBetween2Visits": [10, 0, 12, 3, 1],
    "NumberOfVisits": [2, 2, 3, 3, 3],
}

dt = pd.DataFrame(dt)

I need to group by patient_ID and keep facility, patient_ID, and NumberOfVisits, together with the maximum and minimum of PeriodBetween2Visits.

This is what I tried:
dt = dt.groupby(['patient_ID'],as_index=False)["facility","patient_ID","PeriodBetween2Visits","NumberOfVisits"].agg({'PeriodBetween2Visits': ['min', 'max']})


dt.head()

However, this is not what I need!

The output I want looks like this:

[image: expected output table]

Best answer

I use named aggregation here, which has been built into groupby and agg since pandas 0.25:

dt.groupby(['facility', 'patient_ID']).agg(
    Min_PeriodBetween2Visits=('PeriodBetween2Visits', 'min'),
    Max_PeriodBetween2Visits=('PeriodBetween2Visits', 'max'),
    # 'first' keeps the per-patient value; 'nunique' would return 1 for every group here
    NumberOfVisits=('NumberOfVisits', 'first')).reset_index()

    facility  patient_ID  Min_PeriodBetween2Visits  Max_PeriodBetween2Visits  NumberOfVisits
0  Ann Arbor        4388                         0                        10               2
1    Detriot        9086                         1                        12               3
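
For comparison, here is a minimal sketch of the older dict-style aggregation, which is closer to the attempt in the question. Passing a list of functions per column yields MultiIndex columns, so the sketch flattens them by hand. The variable names wide and result are illustrative, only the columns needed for the result are rebuilt, and taking the 'first' NumberOfVisits per group assumes that value is constant within a patient, as it is in the sample data.

```python
import pandas as pd

# Subset of the sample data from the question (only the columns used below).
dt = pd.DataFrame({
    "facility": ["Ann Arbor", "Ann Arbor", "Detriot", "Detriot", "Detriot"],
    "patient_ID": [4388, 4388, 9086, 9086, 9086],
    "PeriodBetween2Visits": [10, 0, 12, 3, 1],
    "NumberOfVisits": [2, 2, 3, 3, 3],
})

# Dict-style aggregation: a list of functions per column produces
# MultiIndex columns such as ('PeriodBetween2Visits', 'min').
wide = dt.groupby(["facility", "patient_ID"]).agg(
    {"PeriodBetween2Visits": ["min", "max"], "NumberOfVisits": ["first"]}
)

# Flatten the two column levels into single names:
# ('PeriodBetween2Visits', 'min') -> 'Min_PeriodBetween2Visits',
# ('NumberOfVisits', 'first')     -> 'NumberOfVisits'.
wide.columns = [
    col if func == "first" else f"{func.capitalize()}_{col}"
    for col, func in wide.columns
]

# Move the group keys back into ordinary columns.
result = wide.reset_index()
print(result)
```

Named aggregation avoids the flattening step because each output column is named directly in the call.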

Regarding "python - How do I set up an aggregation after groupby?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59705580/
