
python-3.x - How do I strip the time from a non-datetime string in Python?

Reposted. Author: 行者123. Updated: 2023-12-04 07:47:00

My string looks like this:

"Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V."

I tried using the parser from dateutil:

from dateutil.parser import parse
s = "Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V."
dt = parse(s, fuzzy=True)
print(dt)

However, I get the following error:

Traceback (most recent call last):
  File "<string>", line 3, in <module>
  File "/usr/local/lib/python3.8/dist-packages/dateutil/parser/_parser.py", line 1374, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/dateutil/parser/_parser.py", line 649, in parse
    raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V.

How can I extract the time and date from this string? Is there a regular expression that would let me do it in one line?

Edit: Ideally, I'm looking for a solution that can easily be applied to an entire column of a pandas DataFrame.

Best answer

Use re to pull out the timestamp with a regular expression, then parse only that substring:

from dateutil.parser import parse
import re

s = "Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V."

# Raw string so that \d is not treated as a string escape; "/" needs no escaping.
t = re.search(r' (\d{2}:\d{2}:\d{2} \d{2}/\d{2}/\d{4}) ', s).group(1)
dt = parse(t, fuzzy=True)
print(dt)

Output:

2019-02-07 21:50:00
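Once the regular expression has isolated the timestamp, its layout is fixed, so the standard library can parse it without dateutil. The following is a minimal sketch under that assumption; the helper name `extract_timestamp` is hypothetical and not part of the original answer, and the month/day order (`%m/%d`) follows the format string the answer uses for pandas.

```python
import re
from datetime import datetime

# Same shape as the answer's pattern: HH:MM:SS followed by MM/DD/YYYY.
TIMESTAMP_RE = re.compile(r"(\d{2}:\d{2}:\d{2} \d{2}/\d{2}/\d{4})")


def extract_timestamp(text):
    """Return the first embedded timestamp as a datetime, or None if absent."""
    match = TIMESTAMP_RE.search(text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%H:%M:%S %m/%d/%Y")


s = "Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V."
print(extract_timestamp(s))                    # 2019-02-07 21:50:00
print(extract_timestamp("no timestamp here"))  # None
```

Returning None instead of calling `.group(1)` directly avoids an AttributeError on strings that contain no timestamp.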

Applied to a DataFrame column:

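# S is a pandas Series of such strings (pandas imported as pd); expand=False returns a Series even for a single row
pd.to_datetime(S.str.extract(r' (\d{2}:\d{2}:\d{2} \d{2}/\d{2}/\d{4}) ', expand=False), format='%H:%M:%S %m/%d/%Y')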
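For a self-contained picture of the column-wise approach, here is a small sketch on a made-up DataFrame. The column names `comment` and `recorded_at` and the second row are assumptions for illustration only. It uses `errors="coerce"` so that rows without a timestamp become NaT rather than raising.

```python
import pandas as pd

# Same shape as the answer's pattern: HH:MM:SS followed by MM/DD/YYYY.
PATTERN = r"(\d{2}:\d{2}:\d{2} \d{2}/\d{2}/\d{4})"

# Hypothetical data; the column name "comment" is an assumption.
df = pd.DataFrame({
    "comment": [
        "Audio was recorded at 21:50:00 02/07/2019 (UTC) by device 243B1F05 at gain setting 2 while battery state was 3.6V.",
        "no timestamp in this row",
    ]
})

# With one capture group, expand=False yields a Series (NaN where nothing matches).
extracted = df["comment"].str.extract(PATTERN, expand=False)

# An explicit format avoids per-row guessing; errors="coerce" maps NaN/unparseable values to NaT.
df["recorded_at"] = pd.to_datetime(extracted, format="%H:%M:%S %m/%d/%Y", errors="coerce")

print(df["recorded_at"])
```

The first row should parse to 2019-02-07 21:50:00 and the second should be NaT, which makes unmatched rows easy to find with `df["recorded_at"].isna()`.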

Regarding "python-3.x - How do I strip the time from a non-datetime string in Python?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/67155475/
