gpt4 book ai didi

python - Julian 与 OutOfBoundsDatetime 公历日期

转载 作者:太空宇宙 更新时间:2023-11-04 02:38:19 24 4
gpt4 key购买 nike

我正在使用 pandas 数据框,并且有一个包含儒略日期的 DATE 列。我想将该列的每个值都转换为公历日期。

为此,我使用了以下代码:

df[['DATE']] = df[['DATE']].apply(lambda x: pd.to_datetime(x - pd.Timestamp(0).to_julian_date(), unit='D'))

不幸的是,我收到如下所示的错误:

OutOfBoundsDatetime: ("cannot convert input 381088.5 with the unit 'D'", u'occurred at index MXPLD_DATE')

当我在数据框中查找导致问题的输入值时,它根本不存在,我不知道 381088.5 来自哪里。

你能告诉我哪里做错了吗?

谢谢。

编辑 1

我尝试了@jezrael 的解决方案,但我仍然遇到类似的错误。

df['DATE'] = pd.to_datetime(df['DATE'], unit='D', origin='julian')

错误:

---------------------------------------------------------------------------
OutOfBoundsDatetime Traceback (most recent call last)
<ipython-input-18-4353e2be1ced> in <module>()
----> 1 df['DATE'] = pd.to_datetime(df['DATE'], unit='D', origin='julian')

/opt/anaconda2/lib/python2.7/site-packages/pandas/core/tools/datetimes.pyc in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
469 raise tslib.OutOfBoundsDatetime(
470 "{original} is Out of Bounds for "
--> 471 "origin='julian'".format(original=original))
472
473 elif origin not in ['unix', 'julian']:

OutOfBoundsDatetime: 0 2457184
1 2457155
2 2457155
3 2457155
4 2457155
5 2457155
6 2457155
7 2457155
8 2457155
9 2457155
10 2457155
11 2457155
12 2457155
13 2457155
14 2457155
15 2457155
16 2457155
17 2457155
18 2457155
19 2457155
20 2457155
21 2457155
22 2457155
23 2457155
24 2457155
25 2457155
26 2457155
27 2457155
28 2457701
29 2457701
...
4597928 2457724
4597929 2457724
4597930 2457724
4597931 2457724
4597932 2457724
4597933 2457724
4597934 2457724
4597935 2457724
4597936 2457724
4597937 2457724
4597938 2457724
4597939 2457724
4597940 2457724
4597941 2457724
4597942 2457724
4597943 2457724
4597944 2457724
4597945 2457724
4597946 2457724
4597947 2457724
4597948 2457724
4597949 2457724
4597950 2457724
4597951 2457724
4597952 2457724
4597953 2457724
4597954 2457724
4597955 2457724
4597956 2457724
4597957 2457724
Name: DATE, Length: 4597958, dtype: int64 is Out of Bounds for origin='julian'

最佳答案

我相信你需要to_datetime带参数 origin:

df = pd.DataFrame({'julian':[2458072.5, 2458073.5]})

df['date'] = pd.to_datetime(df['julian'], unit='D', origin='julian')
print (df)
julian date
0 2458072.5 2017-11-15
1 2458073.5 2017-11-16

编辑:

有些日期时间 OutOfBounds 有问题。

所以先查了timestamp limitations :

In [66]: pd.Timestamp.min
Out[66]: Timestamp('1677-09-21 00:12:43.145225')

In [67]: pd.Timestamp.max
Out[67]: Timestamp('2262-04-11 23:47:16.854775807')

然后获得最小的朱利安日期时间(通过在线转换,例如 here ):

maxdate = 2547338
mindate = 2333836

然后为超出范围的日期添加 NaN,例如通过 where :

 df = pd.DataFrame({'julian':[2821676, 2547338, 1, 2333836]})
maxdate = 2547338
mindate = 2333836

clean_dates = df['julian'].where(df['julian'].between(mindate, maxdate))
print (clean_dates)
0 NaN
1 2547338.0
2 NaN
3 2333836.0

df['date'] = pd.to_datetime(clean_dates, unit='D', origin='julian')
print (df)
julian date
0 2821676 NaT
1 2547338 2262-04-10 12:00:00
2 1 NaT
3 2333836 1677-09-21 12:00:00

最后将解决方案应用于您的数据 - 有 2 个值已转换为 NaT:

print (df['MXPLD_DATE'][~df['MXPLD_DATE'].between(mindate, maxdate)])

1217806 2821676
3167148 2821676
Name: MXPLD_DATE, dtype: int64

clean_dates = df['MXPLD_DATE'].where(df['MXPLD_DATE'].between(mindate, maxdate))
df['MXPLD_DATE'] = pd.to_datetime(clean_dates, unit='D', origin='julian')
print (df['MXPLD_DATE'])
0 2015-06-10 12:00:00
1 2015-05-12 12:00:00
2 2015-05-12 12:00:00
3 2015-05-12 12:00:00
4 2015-05-12 12:00:00
5 2015-05-12 12:00:00
6 2015-05-12 12:00:00
7 2015-05-12 12:00:00
8 2015-05-12 12:00:00
9 2015-05-12 12:00:00
10 2015-05-12 12:00:00

关于python - Julian 与 OutOfBoundsDatetime 公历日期,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/47305631/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com