
python - Pandas - count the number of days per group

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:54:21

I want to count the number of days after grouping by 2 columns:

groups = df.groupby([df.col1,df.col2])

Now I want to count the distinct days that belong to each group:

result = groups['date_time'].dt.date.nunique()

I use something similar when I want to group by day, but here I get an error:

AttributeError: Cannot access attribute 'dt' of 'SeriesGroupBy' objects, try using the 'apply' method

What is the correct way to get the number of days?

Best answer

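You need a different variant of groupby: either extract the dates first and then group them by the key columns, call apply on the grouped column, or create a helper column. The three options, in that order: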

df['date_time'].dt.date.groupby([df.col1,df.col2]).nunique()

df.groupby(['col1','col2'])['date_time'].apply(lambda x: x.dt.date.nunique())

df['date_time1'] = df['date_time'].dt.date
a = df.groupby([df.col1,df.col2]).date_time1.nunique()
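
The reason the original attempt fails is that .dt is an accessor of a datetime Series, while groups['date_time'] is a SeriesGroupBy object that does not expose it. The following is only a minimal sketch of that difference; the three-row frame is hypothetical sample data, not taken from the question, and the exact error message may differ between pandas versions.

```python
import pandas as pd

# Hypothetical sample data, not from the original post.
df = pd.DataFrame({
    'col1': [0, 0, 1],
    'col2': [2, 2, 3],
    'date_time': pd.to_datetime(['2015-02-24 00:00',
                                 '2015-02-25 06:00',
                                 '2015-02-27 03:00']),
})

# .dt is available on a Series of datetimes ...
dates = df['date_time'].dt.date

# ... but not on the grouped object, so accessing it raises AttributeError.
try:
    df.groupby(['col1', 'col2'])['date_time'].dt
    dt_on_groupby_failed = False
except AttributeError:
    dt_on_groupby_failed = True

# Extracting the dates first and grouping afterwards avoids the problem.
days_per_group = dates.groupby([df['col1'], df['col2']]).nunique()
print(days_per_group)
```

Grouping a Series by other Series works here because all of them share the index of df, so the keys line up row by row.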

Example:

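import pandas as pd

start = pd.to_datetime('2015-02-24')
# lowercase 'h' is the hour alias; the uppercase 'H' is deprecated in recent pandas
rng = pd.date_range(start, periods=10, freq='15h')

# keys are listed in the order in which the columns are printed below
df = pd.DataFrame({'col1': [0]*5 + [1]*5, 'col2': [2]*3 + [3]*4 + [4]*3, 'date_time': rng})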
print (df)
   col1  col2           date_time
0     0     2 2015-02-24 00:00:00
1     0     2 2015-02-24 15:00:00
2     0     2 2015-02-25 06:00:00
3     0     3 2015-02-25 21:00:00
4     0     3 2015-02-26 12:00:00
5     1     3 2015-02-27 03:00:00
6     1     3 2015-02-27 18:00:00
7     1     4 2015-02-28 09:00:00
8     1     4 2015-03-01 00:00:00
9     1     4 2015-03-01 15:00:00

#solution with apply
df1 = df.groupby(['col1','col2'])['date_time'].apply(lambda x: x.dt.date.nunique())
print (df1)
col1  col2
0     2       2
      3       2
1     3       1
      4       2
Name: date_time, dtype: int64

#create new helper column
df['date_time1'] = df['date_time'].dt.date
df2 = df.groupby([df.col1,df.col2]).date_time1.nunique()
print (df2)
col1  col2
0     2       2
      3       2
1     3       1
      4       2
Name: date_time1, dtype: int64

df3 = df['date_time'].dt.date.groupby([df.col1,df.col2]).nunique()
print (df3)
col1  col2
0     2       2
      3       2
1     3       1
      4       2
Name: date_time, dtype: int64
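
As a quick cross-check, the sketch below rebuilds the example frame and compares the variants. The last one, based on dt.normalize(), is not part of the original answer; it is an assumed alternative that truncates each timestamp to midnight and therefore keeps the datetime64 dtype instead of producing Python date objects. The variable names are illustrative only.

```python
import pandas as pd

# Same data as the example above, rebuilt so the sketch runs on its own.
rng = pd.date_range(pd.to_datetime('2015-02-24'), periods=10, freq='15h')
df = pd.DataFrame({'col1': [0]*5 + [1]*5,
                   'col2': [2]*3 + [3]*4 + [4]*3,
                   'date_time': rng})

keys = [df['col1'], df['col2']]

# Variant from the answer: extract dates, then group.
by_date = df['date_time'].dt.date.groupby(keys).nunique()

# Variant from the answer: group, then apply per group.
by_apply = df.groupby(['col1', 'col2'])['date_time'].apply(
    lambda x: x.dt.date.nunique())

# Assumed alternative (not in the answer): normalize to midnight, then group.
by_normalize = df['date_time'].dt.normalize().groupby(keys).nunique()

# All three should report the same number of distinct days per group.
same = by_date.tolist() == by_apply.tolist() == by_normalize.tolist()
print(same)
```

The apply variant calls a Python function once per group, so on large frames the two variants that group an already converted Series are likely to be faster.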

Regarding python - Pandas - count the number of days per group, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42898678/
