
Python Pandas: Group by date and access each group by timestamp

Reposted. Author: 太空狗. Updated: 2023-10-30 01:02:08

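I want to group by timestamp (date) and then access each group by its timestamp, but this does not seem to work correctly. The group keys appear to be indexed strangely, in a different format.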

df= pd.DataFrame({'DATE' : ['10-Oct-2013', '10-Oct-2013', '10-Oct-2013', '11-Oct-2013', '11-Oct-2013', '11-Oct-2013'],'VAL' : [1,2,3,4,5,6]})

>>> df
DATE VAL
0 10-Oct-2013 1
1 10-Oct-2013 2
2 10-Oct-2013 3
3 11-Oct-2013 4
4 11-Oct-2013 5
5 11-Oct-2013 6


dfg=df.groupby(df['DATE'].apply(lambda x: pd.to_datetime(x)))

>>> dfg.groups.keys()
[numpy.datetime64('NaT'), numpy.datetime64('2013-10-10T17:00:00.000000000-0700'), numpy.datetime64('2013-10-09T17:00:00.000000000-0700')]

for d in dfg.groups.keys():
    try:
        print(d, dfg.get_group(d).describe())
    except Exception:
        print('err')
>>
NaT err
2013-10-10T17:00:00.000000000-0700 err
2013-10-09T17:00:00.000000000-0700 err

rng = pd.to_datetime(pd.date_range('10/10/2013', periods=3, freq='D'))

for d in rng:
    try:
        print(d, dfg.get_group(d).describe())
    except Exception:
        print('err')

2013-10-10 00:00:00 err
2013-10-11 00:00:00 err
2013-10-12 00:00:00 err

Best answer

Here is your frame

In [40]: df = pd.DataFrame({'DATE' : ['10-Oct-2013', '10-Oct-2013', '10-Oct-2013', '11-Oct-2013', '11-Oct-2013', '11-Oct-2013'],'VAL' : [1,2,3,4,5,6]})

It is much faster to convert the date-like column directly

In [41]: df['DATE']= pd.to_datetime(df['DATE'])

In [42]: df.dtypes
Out[42]:
DATE datetime64[ns]
VAL int64
dtype: object

In [43]: df
Out[43]:
DATE VAL
0 2013-10-10 00:00:00 1
1 2013-10-10 00:00:00 2
2 2013-10-10 00:00:00 3
3 2013-10-11 00:00:00 4
4 2013-10-11 00:00:00 5
5 2013-10-11 00:00:00 6
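
The direct conversion can also be given an explicit format so pandas does not have to infer it. The following is a minimal sketch, not part of the original answer; the format string '%d-%b-%Y' is an assumption based on the sample strings such as '10-Oct-2013'.

```python
import pandas as pd

df = pd.DataFrame({
    'DATE': ['10-Oct-2013', '10-Oct-2013', '10-Oct-2013',
             '11-Oct-2013', '11-Oct-2013', '11-Oct-2013'],
    'VAL': [1, 2, 3, 4, 5, 6],
})

# Vectorized conversion of the whole column in one call.
# Assumption: every string follows day-abbreviated month-year ('%d-%b-%Y').
df['DATE'] = pd.to_datetime(df['DATE'], format='%d-%b-%Y')

# The column now holds datetime values rather than strings.
print(df.dtypes)
```

If some strings might not match the format, passing errors='coerce' turns them into NaT instead of raising, which may explain a stray NaT key like the one seen in the question.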

This looks like what you want

In [44]: df.groupby('DATE').describe()
Out[44]:
VAL
DATE
2013-10-10 count 3.0
mean 2.0
std 1.0
min 1.0
25% 1.5
50% 2.0
75% 2.5
max 3.0
2013-10-11 count 3.0
mean 5.0
std 1.0
min 4.0
25% 4.5
50% 5.0
75% 5.5
max 6.0

If you really want to pull out a single group on its own

In [45]: g = df.groupby('DATE')

In [46]: key = list(g.groups.keys())[0]

In [47]: key
Out[47]: numpy.datetime64('2013-10-09T20:00:00.000000000-0400')

In [48]: g.get_group(key.astype('i8'))
Out[48]:
DATE VAL
0 2013-10-10 00:00:00 1
1 2013-10-10 00:00:00 2
2 2013-10-10 00:00:00 3
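
On recent pandas releases the group keys of a datetime column are pd.Timestamp objects, so a group can usually be fetched with a Timestamp directly and no integer cast. A minimal sketch under that assumption (the variable names first_day and missing are illustrative, not from the original answer):

```python
import pandas as pd

df = pd.DataFrame({
    'DATE': ['10-Oct-2013', '10-Oct-2013', '10-Oct-2013',
             '11-Oct-2013', '11-Oct-2013', '11-Oct-2013'],
    'VAL': [1, 2, 3, 4, 5, 6],
})
df['DATE'] = pd.to_datetime(df['DATE'], format='%d-%b-%Y')

g = df.groupby('DATE')

# Look up one group by building a Timestamp that equals the group key.
first_day = g.get_group(pd.Timestamp('2013-10-10'))
print(first_day)

# get_group raises KeyError for a date with no rows, so check membership first.
missing = pd.Timestamp('2013-10-12')
if missing in g.groups:
    print(g.get_group(missing))
else:
    print('no rows for', missing.date())
```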

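datetime64[ns] values are stored internally as long integers (nanoseconds since the epoch), which is why the key.astype('i8') cast is used to look them up. You usually have no real reason to do this, because you can do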

df.groupby('DATE').apply(lambda x: .....)

Or, if you really want to iterate

for g, grp in df.groupby('DATE'):
    ......
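
The loop body is elided in the answer. As one possible illustration, the sketch below sums VAL per date while iterating and then computes the same result without a loop. The names totals and totals_agg and the choice of sum() are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'DATE': ['10-Oct-2013', '10-Oct-2013', '10-Oct-2013',
             '11-Oct-2013', '11-Oct-2013', '11-Oct-2013'],
    'VAL': [1, 2, 3, 4, 5, 6],
})
df['DATE'] = pd.to_datetime(df['DATE'], format='%d-%b-%Y')

# Iterating a GroupBy yields (key, sub-DataFrame) pairs, one per distinct date.
totals = {}
for day, grp in df.groupby('DATE'):
    totals[day] = grp['VAL'].sum()

# The same per-date totals as a Series, without an explicit loop.
totals_agg = df.groupby('DATE')['VAL'].sum()

print(totals)
print(totals_agg)
```

Iterating this way avoids dealing with the key representation at all, since each key arrives together with its own group.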

Regarding Python Pandas: Group by date and access each group by timestamp, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19458361/
