Python DataFrame: Separate rows based on custom condition?

Reposted. Author: 行者123. Updated: 2023-12-04 03:30:18

My DataFrame contains three columns: name, content, and day.

df

    content    day         name
0   first_day  01-01-2017  marcus
1   present    10-01-2017  marcus
2   first_day  01-02-2017  marcus
3   first_day  01-03-2017  marcus
4   absent     05-03-2017  marcus
5   present    20-03-2017  marcus
6   first_day  01-04-2017  bruno
7   present    11-04-2017  bruno
8   first_day  01-05-2017  bruno
9   absent     02-05-2017  bruno
10  first_day  01-06-2017  bruno
11  absent     02-06-2017  bruno
12  payment    09-06-2017  bruno

I am trying to find, month-wise, the users whose rows contain first_day, absent, and present consecutively.

Sample output:

   content    day         name    absent_after_present
0  first_day  01-01-2017  marcus  False
1  first_day  01-02-2017  marcus  False
2  first_day  01-03-2017  marcus  True
3  first_day  01-04-2017  bruno   False
4  first_day  01-05-2017  bruno   False
5  first_day  01-06-2017  bruno   True

For example, marcus has first_day, absent, and present consecutively on 01-03-2017, 05-03-2017, and 20-03-2017, all in the same month, so marcus's status for that month should be True.
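
The dates in the table are written day-first (DD-MM-YYYY), so they need an explicit format when parsed before any month-wise grouping. A minimal sketch, using only the three marcus rows from the example; the `month` column is a hypothetical helper that is not part of the original data:

```python
import pandas as pd

# Three rows copied from the table, kept in the original DD-MM-YYYY form.
df = pd.DataFrame({
    "content": ["first_day", "absent", "present"],
    "day": ["01-03-2017", "05-03-2017", "20-03-2017"],
    "name": ["marcus", "marcus", "marcus"],
})

# An explicit format avoids 05-03-2017 being read as May 3rd.
df["day"] = pd.to_datetime(df["day"], format="%d-%m-%Y")

# Hypothetical helper column: a year-month key for month-wise grouping.
df["month"] = df["day"].dt.strftime("%Y-%m")

print(df)
```

All three rows end up with the same month key, which is what lets them be treated as one group.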

Best answer

Maybe you could try extracting the month from each date and then grouping by name and month, as shown below.

import pandas as pd

# Sample data (dates written in ISO format so they parse unambiguously)
data = pd.DataFrame({
    'content': ['first_day', 'present', 'first_day', 'first_day', 'absent',
                'present', 'first_day', 'present', 'first_day', 'absent',
                'first_day', 'absent', 'present'],
    'day': ['2017-01-01', '2017-01-10', '2017-02-01', '2017-03-01',
            '2017-03-05', '2017-03-20', '2017-04-01', '2017-04-11',
            '2017-05-01', '2017-05-02', '2017-06-01', '2017-06-02',
            '2017-06-09'],
    'name': ['marcus', 'marcus', 'marcus', 'marcus', 'marcus', 'marcus',
             'bruno', 'bruno', 'bruno', 'bruno', 'bruno', 'bruno', 'bruno'],
})

# Parse the dates and extract the month number
data['day'] = pd.to_datetime(data['day'])
data['month'] = data['day'].dt.month

# Collect the unique contents and days for each (name, month) group
grouped = data.groupby(['name', 'month'])
data_new = pd.concat([grouped['content'].unique(), grouped['day'].unique()], axis=1)

# True when a group has exactly three distinct content values
# (this does not check which values they are or their order)
data_new['absent_after_present'] = data_new['content'].apply(lambda x: len(x) == 3)

# Keep only the first day and first content of each group
data_new['day'] = data_new['day'].apply(lambda x: x[0])
data_new['content'] = data_new['content'].apply(lambda x: x[0])

# Drop the month level from the index, leaving name
data_new = data_new.droplevel(1)

data_new


name    content    day         absent_after_present
bruno   first_day  2017-04-01  False
bruno   first_day  2017-05-01  False
bruno   first_day  2017-06-01  True
marcus  first_day  2017-01-01  False
marcus  first_day  2017-02-01  False
marcus  first_day  2017-03-01  True
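
The check in the answer only counts three distinct content values per group, so it would also accept other combinations or orders. If the exact run first_day, absent, present is what matters, a stricter variant might look like the sketch below. It assumes that "consecutive" means adjacent rows within one name and month once sorted by day. `TARGET`, `has_target_run`, and the year-month key are illustrative names, not part of the original answer. The data is the same as in the answer, so the last row is present rather than payment.

```python
import pandas as pd

data = pd.DataFrame({
    "content": ["first_day", "present", "first_day", "first_day", "absent",
                "present", "first_day", "present", "first_day", "absent",
                "first_day", "absent", "present"],
    "day": ["2017-01-01", "2017-01-10", "2017-02-01", "2017-03-01",
            "2017-03-05", "2017-03-20", "2017-04-01", "2017-04-11",
            "2017-05-01", "2017-05-02", "2017-06-01", "2017-06-02",
            "2017-06-09"],
    "name": ["marcus"] * 6 + ["bruno"] * 7,
})
data["day"] = pd.to_datetime(data["day"])

# Year-month key, so the same month in different years is not merged.
data["month"] = data["day"].dt.strftime("%Y-%m")

# Assumed required order of consecutive rows.
TARGET = ["first_day", "absent", "present"]


def has_target_run(contents):
    """Return True if TARGET appears as adjacent values in `contents`."""
    values = list(contents)
    n = len(TARGET)
    return any(values[i:i + n] == TARGET for i in range(len(values) - n + 1))


# Sort so that rows inside each group are in chronological order.
data = data.sort_values(["name", "day"])

# One boolean flag per (name, month) group.
flags = (
    data.groupby(["name", "month"], sort=False)["content"]
        .apply(has_target_run)
        .rename("absent_after_present")
        .reset_index()
)

# Keep the first first_day row of each group and attach its flag.
first_days = data[data["content"] == "first_day"].drop_duplicates(["name", "month"])
result = first_days.merge(flags, on=["name", "month"]).sort_values("day")
result = result[["content", "day", "name", "absent_after_present"]].reset_index(drop=True)

print(result)
```

On this data the flags agree with the answer's output. With the question's original last row (payment instead of present), this stricter check would mark bruno's June as False, which differs from the sample output in the question, so the intended rule may be looser than assumed here.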

Regarding "Python DataFrame: Separate rows based on custom condition?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67018896/
