
python - Can you split out datetime months in Pandas?

Reposted. Author: 太空狗. Updated: 2023-10-30 02:28:31

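Is there a way to create new columns that represent each of the months covered by the delta between two datetimes? The output could be a binary value in each new monthly column. I was thinking of something like this (which does not work):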

for i in [1, 2, 3, 4, 5]:
    i_name = str(i)
    values = example['end'] - example['start']  # example line: the per-month values need to be exposed here
    example[i_name] = values

Starting from this:

          end         name       start
0  28/02/2012   joe bloggs  01/01/2012
1  15/03/2012  jane bloggs  01/02/2012
2  17/05/2012   jim bloggs  01/04/2012
3  18/04/2012  john bloggs  01/02/2012
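
For anyone who wants to reproduce this, the input frame can be built roughly as follows. This is only a sketch: it assumes the dates are stored as day-first strings, exactly as they appear in the table, and it reuses the name example from the loop above.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: the dates are plain day-first strings (dd/mm/yyyy), not datetimes yet.
example = pd.DataFrame({
    "end": ["28/02/2012", "15/03/2012", "17/05/2012", "18/04/2012"],
    "name": ["joe bloggs", "jane bloggs", "jim bloggs", "john bloggs"],
    "start": ["01/01/2012", "01/02/2012", "01/04/2012", "01/02/2012"],
})
print(example)
```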

To this:

          end  1  2  3  4  5         name       start
0  28/02/2012  1  1  0  0  0   joe bloggs  01/01/2012
1  15/03/2012  0  1  1  0  0  jane bloggs  01/02/2012
2  17/05/2012  0  0  0  1  1   jim bloggs  01/04/2012
3  18/04/2012  0  1  1  1  0  john bloggs  01/02/2012

Best answer

I think you can mainly use get_dummies and stack:

import pandas as pd

# df is the DataFrame from the question (called `example` there)

# convert columns to datetime
df['end'] = pd.to_datetime(df['end'], dayfirst=True)
df['start'] = pd.to_datetime(df['start'], dayfirst=True)

# get the month numbers as Series
end = df['end'].dt.month
start = df['start'].dt.month

# build one row per record listing every month from start to end
df1 = (pd.DataFrame({'end': end, 'start': start})
         .apply(lambda x: pd.Series(range(x['start'], x['end'] + 1)), axis=1))
print(df1)
0 1 2
0 1.0 2.0 NaN
1 2.0 3.0 NaN
2 4.0 5.0 NaN
3 2.0 3.0 4.0

# create indicator variables, then sum them per original row index
df1 = (pd.get_dummies(df1.stack().reset_index(level=1, drop=True))
         .groupby(level=0).sum().astype(int))

# convert the float column names to int
df1.columns = df1.columns.to_series().astype(int)
print(df1)
1 2 3 4 5
0 1 1 0 0 0
1 0 1 1 0 0
2 0 0 0 1 1
3 0 1 1 1 0
# append to the original DataFrame
print(pd.concat([df, df1], axis=1))
         end         name      start  1  2  3  4  5
0 2012-02-28   joe bloggs 2012-01-01  1  1  0  0  0
1 2012-03-15  jane bloggs 2012-02-01  0  1  1  0  0
2 2012-05-17   jim bloggs 2012-04-01  0  0  0  1  1
3 2012-04-18  john bloggs 2012-02-01  0  1  1  1  0
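
As a cross-check, the same flags can probably be produced with plain comparisons instead of get_dummies and stack. The sketch below is not the approach used in the answer; add_month_flags is a hypothetical helper name, and it assumes start <= end and that both dates fall in the same calendar year (month numbers alone cannot tell 2012 from 2013).

```python
import pandas as pd


def add_month_flags(df, start_col="start", end_col="end"):
    """Return a copy of df with one 0/1 column per month number covered.

    Hypothetical helper for illustration. Assumes start <= end and that
    both dates lie in the same calendar year.
    """
    out = df.copy()
    out[start_col] = pd.to_datetime(out[start_col], dayfirst=True)
    out[end_col] = pd.to_datetime(out[end_col], dayfirst=True)

    start_month = out[start_col].dt.month
    end_month = out[end_col].dt.month

    # Only create columns for months that occur somewhere in the data.
    first, last = int(start_month.min()), int(end_month.max())
    for month in range(first, last + 1):
        covered = (start_month <= month) & (month <= end_month)
        out[month] = covered.astype(int)
    return out


# Sample data copied from the question.
df = pd.DataFrame({
    "end": ["28/02/2012", "15/03/2012", "17/05/2012", "18/04/2012"],
    "name": ["joe bloggs", "jane bloggs", "jim bloggs", "john bloggs"],
    "start": ["01/01/2012", "01/02/2012", "01/04/2012", "01/02/2012"],
})
result = add_month_flags(df)
print(result)
```

The new columns are labelled with integer month numbers (1 to 5 for this data), matching the column names in the answer's final output. For ranges that cross a year boundary, a month-number comparison like this would give wrong results, so a period-based key (year plus month) would be needed instead.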

Regarding "python - Can you split out datetime months in Pandas?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36573051/
