
python - pandas: detecting the first/last record number from weekday labels only

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:10:41

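I have a DataFrame with one column of record numbers (in ascending order) and one column of weekdays. The goal is to extract the first and last record number for each day. For example: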

import pandas as pd

df = pd.DataFrame({'records': [1, 2, 3, 4, 6, 7, 8, 12, 14, 15, 16, 19, 23, 26, 29, 38, 43, 59, 61],
                   'weekday': ['Monday', 'Monday', 'Monday', 'Tuesday', 'Tuesday', 'Wednesday', 'Thursday',
                               'Thursday', 'Thursday', 'Friday', 'Friday', 'Friday', 'Saturday', 'Sunday',
                               'Monday', 'Monday', 'Tuesday', 'Wednesday', 'Wednesday']})
>>> df

    records    weekday
0         1     Monday
1         2     Monday
2         3     Monday
3         4    Tuesday
4         6    Tuesday
5         7  Wednesday
6         8   Thursday
7        12   Thursday
8        14   Thursday
9        15     Friday
10       16     Friday
11       19     Friday
12       23   Saturday
13       26     Sunday
14       29     Monday
15       38     Monday
16       43    Tuesday
17       59  Wednesday
18       61  Wednesday

I am trying to get something like this:

    first  last  records    weekday
0       1     3        1     Monday
1       1     3        2     Monday
2       1     3        3     Monday
3       4     6        4    Tuesday
4       4     6        6    Tuesday
5       7     7        7  Wednesday
6       8    14        8   Thursday
7       8    14       12   Thursday
8       8    14       14   Thursday
9      15    19       15     Friday
10     15    19       16     Friday
11     15    19       19     Friday
12     23    23       23   Saturday
13     26    26       26     Sunday
14     29    38       29     Monday
15     29    38       38     Monday
16     43    43       43    Tuesday
17     59    61       59  Wednesday
18     59    61       61  Wednesday

So where should I start? Is iterating over the weekday column from top to bottom while watching for changes the right direction?

Best answer

Use the compare-cumsum-groupby pattern:

# The grouping key gets a new value every time weekday differs from the row above.
df['first'] = (df
               .groupby((df.weekday != df.weekday.shift()).cumsum())
               .records
               .transform('first'))

df['last'] = (df
              .groupby((df.weekday != df.weekday.shift()).cumsum())
              .records
              .transform('last'))
>>> df
    records    weekday  first  last
0         1     Monday      1     3
1         2     Monday      1     3
2         3     Monday      1     3
3         4    Tuesday      4     6
4         6    Tuesday      4     6
5         7  Wednesday      7     7
6         8   Thursday      8    14
7        12   Thursday      8    14
8        14   Thursday      8    14
9        15     Friday     15    19
10       16     Friday     15    19
11       19     Friday     15    19
12       23   Saturday     23    23
13       26     Sunday     26    26
14       29     Monday     29    38
15       38     Monday     29    38
16       43    Tuesday     43    43
17       59  Wednesday     59    61
18       61  Wednesday     59    61

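The trick is to get a unique index for each consecutive run of the same weekday (not just 1-7, but a counter that increases by 1 every time a new weekday appears).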

df['week_counter'] = (df.weekday != df.weekday.shift()).cumsum()
>>> df
    records    weekday  week_counter
0         1     Monday             1
1         2     Monday             1
2         3     Monday             1
3         4    Tuesday             2
4         6    Tuesday             2
5         7  Wednesday             3
6         8   Thursday             4
7        12   Thursday             4
8        14   Thursday             4
...
16       43    Tuesday             9
17       59  Wednesday            10
18       61  Wednesday            10
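
To see why comparing against the shifted column and taking a cumulative sum yields one label per run, it can help to look at the intermediate values. The snippet below is only an illustrative sketch on a made-up five-row Series; the names changed, run_id and steps are not from the original answer.

```python
import pandas as pd

# Made-up sample: Monday appears in two separate runs.
weekday = pd.Series(['Monday', 'Monday', 'Tuesday', 'Monday', 'Monday'])

# True wherever the value differs from the row above. The first row is
# compared against the NaN introduced by shift(), so it is always True.
changed = weekday != weekday.shift()

# Each True adds 1 to the running total, so every consecutive run of the
# same weekday ends up with its own label.
run_id = changed.cumsum()

steps = pd.DataFrame({'weekday': weekday, 'changed': changed, 'run_id': run_id})
print(steps)
```

The second run of Monday gets label 3 rather than reusing label 1, which is what keeps the two Mondays apart in the groupby.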

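These week_counter values are then used in the groupby to form groups of records, and transform (which keeps the result the same shape as the original DataFrame) is used to get the first and last record of each group.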
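
As a possible variation on the same idea, the run label can be computed once and reused for both columns. The helper name add_run_bounds and its parameters are hypothetical, introduced here only for illustration, and the sample data is a shortened version of the question's frame.

```python
import pandas as pd

def add_run_bounds(df, key='weekday', value='records'):
    """Hypothetical helper: add 'first' and 'last' columns holding the first
    and last `value` within each consecutive run of equal `key` values."""
    # One label per consecutive run (compare with the row above, then cumsum).
    run_id = (df[key] != df[key].shift()).cumsum()
    grouped = df.groupby(run_id)[value]
    # transform returns a result aligned to the original index, so it can be
    # assigned straight back as new columns. assign returns a new DataFrame.
    return df.assign(first=grouped.transform('first'),
                     last=grouped.transform('last'))

sample = pd.DataFrame({'records': [1, 2, 3, 4, 6, 29, 38],
                       'weekday': ['Monday', 'Monday', 'Monday', 'Tuesday',
                                   'Tuesday', 'Monday', 'Monday']})
out = add_run_bounds(sample)
print(out)
```

Note that the groupby 'first' and 'last' aggregations pick the first and last non-null value in each group, which only matters if the records column can contain missing values.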

Source: a similar question about "python - pandas: detecting the first/last record number from weekday labels only" on Stack Overflow: https://stackoverflow.com/questions/34124733/
