gpt4 book ai didi

python - 在多组的多个日期范围内查找缺失日期

转载 作者:行者123 更新时间:2023-12-05 02:37:59 24 4
gpt4 key购买 nike

我正在尝试为 CURRENCY 列中的多个组提取 DATE FROM 列和 DATE TO 列中日期范围的缺失日期列表,对于每个组列 CURRENCY :

范围被分成多行

例如:货币 EUR 在第 0、1 和 2 行有三个日期范围,整个组中缺失的范围是 2021-10-06 到 2021-10-10。

缺少的范围在下面使用 * 符号突出显示,这个缺少的日期范围是我期望的预期输出。

df = pd.DataFrame({"CURRENCY":{"0":"EUR","1":"EUR","2":"EUR","3":"GBP","4":"GBP","5":"GBP","6":"USD","7":"USD","8":"SAR","9":"SAR"},
"DATE FROM":{"0":"2021-10-01","1":"2021-10-11","2":"2021-10-19","3":"2021-10-01","4":"2021-10-05",
"5":"2021-10-11","6":"2021-10-01","7":"2021-10-05","8":"2021-10-01","9":"2021-10-05"},
"DATE TO":{"0":"2021-10-05","1":"2021-10-18","2":"2021-10-23","3":"2021-10-04","4":"2021-10-07",
"5":"2021-10-18","6":"2021-10-02","7":"2021-10-10","8":"2021-10-01","9":"2021-10-10"}})

CURRENCY DATE FROM DATE TO
0 EUR 2021-10-01 2021-10-05*
1 EUR 2021-10-11* 2021-10-18
2 EUR 2021-10-19 2021-10-23
3 GBP 2021-10-01 2021-10-04
4 GBP 2021-10-05 2021-10-07*
5 GBP 2021-10-11* 2021-10-18
6 USD 2021-10-01 2021-10-02*
7 USD 2021-10-05* 2021-10-10
8 SAR 2021-10-01 2021-10-01*
9 SAR 2021-10-05* 2021-10-10

预期输出:

    CURRENCY    MISSING
0 EUR 2021-10-06
1 EUR 2021-10-07
2 EUR 2021-10-08
3 EUR 2021-10-09
4 EUR 2021-10-10
5 GBP 2021-10-08
6 GBP 2021-10-09
7 GBP 2021-10-10
8 USD 2021-10-03
9 USD 2021-10-04
10 SAR 2021-10-02
11 SAR 2021-10-03
12 SAR 2021-10-04

以下是我尝试过的方法,但这似乎不是解决问题的有效方法,我也不确定如何对这些输出范围进行分组以找到每个组(欧元、英镑等)的缺失日期:

date_from_list = df['DATE FROM'].to_list()
date_to_list = df['DATE TO'].to_list()
curr_list = df['CURRENCY'].to_list()

for date_from, date_to, curr in zip(date_from_list, date_to_list, curr_list):
print(curr_list, pd.date_range(date_from, date_to))

最佳答案

pd.date_range 和pd.period_range 都可以。我使用的日期范围仅在右侧或左侧接近,因此您可能需要进行一些过滤。代码如下

df= df.assign(end=df['DATE FROM'].shift(-1),start=df['DATE TO']).iloc[:-1 , :]#Define the start and end for date range

df=df.assign(Missing=df.apply(lambda x: pd.date_range(start=x['start'], end=x['end'], closed='right').tolist(), axis = 1)).explode('Missing').drop_duplicates('Missing').drop(['start','end'],axis=1)

CURRENCY DATE FROM DATE TO Missing
0 EUR 2021-10-01 2021-10-05 2021-10-06
0 EUR 2021-10-01 2021-10-05 2021-10-07
0 EUR 2021-10-01 2021-10-05 2021-10-08
0 EUR 2021-10-01 2021-10-05 2021-10-09
0 EUR 2021-10-01 2021-10-05 2021-10-10
0 EUR 2021-10-01 2021-10-05 2021-10-11
1 EUR 2021-10-11 2021-10-18 2021-10-19
2 EUR 2021-10-19 2021-10-23 NaT
3 GBP 2021-10-01 2021-10-04 2021-10-05
6 USD 2021-10-01 2021-10-02 2021-10-03
6 USD 2021-10-01 2021-10-02 2021-10-04
8 SAR 2021-10-01 2021-10-01 2021-10-02

关于python - 在多组的多个日期范围内查找缺失日期,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/69824372/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com