
python - How to split the days between two dates by month

Reposted. Author: 行者123. Updated: 2023-12-04 10:14:10

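I have a Pandas DataFrame with a "start_date" column and an "end_date" column. I want to count the business days between start and end, split them by month, and add the per-month counts to the DataFrame as additional columns.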

This is what I have coded so far.
Is there any way to make it faster?

import pandas as pd

data = pd.DataFrame([
    {'start_date': '2020-03-03', 'end_date': '2020-06-18'},
    {'start_date': '2020-06-03', 'end_date': '2020-09-18'},
])


def days_month(row):
    # Business days (Mon-Fri) from start_date to end_date, both inclusive
    bdays = pd.bdate_range(row['start_date'], row['end_date'])
    # Count days per month number, fill months without any days with 0,
    # and return the counts in calendar order (1..12)
    return bdays.month.value_counts().reindex(range(1, 13), fill_value=0)

data[['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']] = data.apply(days_month, axis=1, result_type="expand")

print(data)
     end_date  start_date  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
0  2020-06-18  2020-03-03    0    0   21   22   21   14    0    0    0    0    0    0
1  2020-09-18  2020-06-03    0    0    0    0    0   20   23   21   14    0    0    0


Update: I tried to implement Ethan's suggestion:

def countWeekDays(df):

    fromDate = df['PO Creation Date']
    toDate = df['PO Expected Delivery Date']

    d = np.arange(fromDate, toDate, dtype=np.datetime64)

    weekdays = d[np.is_busday(d, busdaycal=calendar())]

    workDays = {m: np.array([i for i in weekdays if i.item().month == m]).size for m in range(1, 13)}

    return workDays

def calendar():
    # set work week mask and optional holidays array
    return np.busdaycalendar(weekmask='1111100', holidays=['2020-01-01','2020-01-20','2020-02-17','2020-05-25','2020-07-03','2020-09-07','2020-10-12','2020-11-11','2020-11-26','2020-12-25'])


data[['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']] = data.apply(countWeekDays, axis=1, result_type="expand")

However, it gives me a MemoryError, and I am not sure why...
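One plausible cause of the MemoryError, offered as an assumption because the question does not show the dtype of the two "PO ..." columns: if they hold datetime values rather than 'YYYY-MM-DD' strings, NumPy converts them at microsecond (or finer) resolution, so np.arange steps one microsecond at a time instead of one day. The sketch below uses made-up variables (start, end) to compare the two array sizes without actually allocating the large one.

```python
import datetime as dt

import numpy as np

# Hypothetical values standing in for one row of the real data
start = dt.datetime(2020, 3, 3)
end = dt.datetime(2020, 6, 18)

# A datetime.datetime is converted at microsecond resolution
fine_start = np.datetime64(start)
fine_end = np.datetime64(end)
print(fine_start.dtype)  # datetime64[us]

# Number of elements np.arange would have to allocate at that resolution
n_fine = int((fine_end - fine_start).astype(np.int64))

# Casting to day resolution first keeps the range small
days = np.arange(fine_start.astype('datetime64[D]'),
                 fine_end.astype('datetime64[D]'))

print(n_fine, days.size)
```

If this is indeed the cause, casting both endpoints to 'datetime64[D]' before calling np.arange should be enough to avoid the error.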

Best answer

You can use the numpy datetime functions with a bit of wrapping:

import numpy as np

def countWeekDays(fromDate='2020-03-03', toDate='2020-06-03'):
    # One entry per calendar day; toDate itself is excluded
    d = np.arange(fromDate, toDate, dtype='datetime64[D]')

    # Keep only working days according to the calendar below
    weekdays = d[np.is_busday(d, busdaycal=calendar())]

    # (month number, number of working days in that month)
    workDays = [(m, np.array([i for i in weekdays if i.item().month == m]).size) for m in range(1, 13)]

    return workDays

def calendar():
    # set work week mask and optional holidays array
    return np.busdaycalendar(weekmask='1111100', holidays=['2020-01-01','2020-01-20','2020-02-17','2020-05-25','2020-07-03','2020-09-07','2020-10-12','2020-11-11','2020-11-26','2020-12-25'])

Result:
>>> countWeekDays()
[(1, 0), (2, 0), (3, 21), (4, 22), (5, 20), (6, 2), (7, 0), (8, 0), (9, 0), (10, 0), (11, 0), (12, 0)]
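The per-month Python loop can probably be replaced by array operations. The following is only a sketch of that idea, not part of the original answer: the function name count_weekdays_by_month and the HOLIDAYS constant are made up for illustration, and the holiday list is copied from the calendar above. It truncates each working day to its month and tallies the months with np.bincount.

```python
import numpy as np

# Holiday list copied from the answer's calendar(); adjust as needed
HOLIDAYS = ['2020-01-01', '2020-01-20', '2020-02-17', '2020-05-25',
            '2020-07-03', '2020-09-07', '2020-10-12', '2020-11-11',
            '2020-11-26', '2020-12-25']


def count_weekdays_by_month(from_date, to_date, holidays=HOLIDAYS):
    """Return 12 counts (Jan..Dec) of working days in [from_date, to_date)."""
    cal = np.busdaycalendar(weekmask='1111100', holidays=holidays)
    days = np.arange(from_date, to_date, dtype='datetime64[D]')
    workdays = days[np.is_busday(days, busdaycal=cal)]
    # datetime64[M] counts months since 1970-01, so % 12 gives 0 for January
    months = workdays.astype('datetime64[M]').astype(np.int64) % 12 + 1
    # Index 0 is unused because month numbers start at 1
    return np.bincount(months, minlength=13)[1:]


counts = count_weekdays_by_month('2020-03-03', '2020-06-03')
print(counts.tolist())
```

As in the function above, the end date is excluded, and days from different years that fall in the same month are added together.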

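Here is a modification of your code that builds a DataFrame using my function to get the working days. I think the error you ran into comes from the way the DataFrame is built and modified. In my experience, modifying a DataFrame is a bit of a hassle; it is better to put the data together first and then create the DataFrame from the complete data set: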
import pandas as pd

def applyDays():
    data = [{'start_date': '2020-03-03', 'end_date': '2020-06-18'},
            {'start_date': '2020-06-03', 'end_date': '2020-09-18'}]
    return countWeekDays(data)


# Note: this redefines countWeekDays from above; calendar() is reused as is
def countWeekDays(lst):  # , result_type):
    months = ['start_date', 'end_date', 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    data = []
    for row in lst:
        fromDate = row['start_date']  # df['PO Creation Date']
        toDate = row['end_date']  # df['PO Expected Delivery Date']
        # One entry per calendar day; toDate itself is excluded
        d = np.arange(fromDate, toDate, dtype='datetime64[D]')
        weekdays = d[np.is_busday(d, busdaycal=calendar())]

        data.append([fromDate, toDate] + [np.array([i for i in weekdays if i.item().month == m]).size for m in range(1, 13)])
    return pd.DataFrame(data, columns=months)

Resulting DataFrame:
applyDays()
Out[6]:
   start_date    end_date  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
0  2020-03-03  2020-06-18    0    0   21   22   20   13    0    0    0    0    0    0
1  2020-06-03  2020-09-18    0    0    0    0    0   20   22   21   12    0    0    0
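These counts differ from the ones in the question for two reasons: np.arange excludes the end date, and the calendar removes holidays. Since the question asks for speed, here is a hedged sketch of an alternative that avoids enumerating days altogether: clip every date range to each month's boundaries and let np.busday_count do the counting for all rows at once. The function monthly_busdays and its parameters (year, holidays, inclusive_end) are assumptions made for illustration, and the sketch only covers a single calendar year; days outside that year are ignored.

```python
import numpy as np
import pandas as pd

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def monthly_busdays(df, year=2020, holidays=(), inclusive_end=True):
    """Append one column per month holding the working days in that month."""
    cal = np.busdaycalendar(weekmask='1111100', holidays=list(holidays))
    start = pd.to_datetime(df['start_date']).to_numpy().astype('datetime64[D]')
    end = pd.to_datetime(df['end_date']).to_numpy().astype('datetime64[D]')
    if inclusive_end:
        # busday_count excludes its end date, so shift it by one day
        end = end + np.timedelta64(1, 'D')
    # 13 month boundaries: first day of Jan of `year` .. first day of next Jan
    edges = np.arange(f'{year:04d}-01', f'{year + 1:04d}-02',
                      dtype='datetime64[M]').astype('datetime64[D]')
    # Clip each row's range to each month; shape (n_rows, 12)
    lo = np.maximum(start[:, None], edges[:-1])
    hi = np.minimum(end[:, None], edges[1:])
    hi = np.maximum(hi, lo)  # empty overlap -> zero-length range
    counts = np.busday_count(lo, hi, busdaycal=cal)
    return df.join(pd.DataFrame(counts, columns=MONTHS, index=df.index))


demo = pd.DataFrame([
    {'start_date': '2020-03-03', 'end_date': '2020-06-18'},
    {'start_date': '2020-06-03', 'end_date': '2020-09-18'},
])
result = monthly_busdays(demo)
print(result)
```

With the default arguments (no holidays, end date included) this should reproduce the counts from the question; passing the holiday list and inclusive_end=False should reproduce the table above.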

Regarding "python - How to split the days between two dates by month", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61163308/
