
Python Pandas: check whether a string is only a "Date", only a "Time", or a "Datetime"

Reposted. Author: 行者123. Updated: 2023-12-04 02:42:17

I am reading a CSV file with pandas:

str,date,float,time,datetime
a,10/11/19,1.1,10:30:00,10/11/19 10:30
b,10/11/19,1.2,10:00:00,10/11/19 10:30
c,10/11/19,1.3,11:10:11,10/11/19 10:30

import pandas as pd

df = pd.read_csv(file)

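My business requirement is to tell which columns are pure date fields, which are pure time fields, and which are full datetimes. For a given column, my code is: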

try:
    dt = pd.to_datetime(df[col])
    dates = [obj.date() for obj in dt]
    times = [obj.time() for obj in dt]

    if dates and (set(times) == set([datetime.time(0, 0)])):
        # It's a pure date field
    elif <something>:
        # It's a pure time field
    else:
        # It's a datetime field
except:
    # It's not a date field

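The problem with my code is that when a column contains only times, pd.to_datetime fills in today's date by default, so I cannot tell it apart from a datetime. Is there a simple solution? Please help me fill in the "something" in the code above.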

Best answer

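When a column holds only times, pandas fills in today's date by default, so one possible solution is to compare Series.dt.date against Timestamp.date for the current day and use Series.all to check that every value in the column matches.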

For the date test, another approach is added as well: use Series.dt.floor to strip the time part and check whether the values stay the same:

import pandas as pd

df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02 12:23:10'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b']})
print (df)
                     a           b         c  d
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a
1  2019-01-02 12:23:10  2019-01-02  15:23:10  b

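def check(col):
    try:
        dt = pd.to_datetime(df[col])

        # Flooring to the day changes nothing, so there is no time part.
        if (dt.dt.floor('d') == dt).all():
            return ('Its a pure date field')
        # Every value carries today's date, which pandas filled in by default.
        elif (dt.dt.date == pd.Timestamp('now').date()).all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    except Exception:
        # The column could not be parsed as datetimes at all.
        return ('its not a datefield')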


print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
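
One limitation of the today's-date test is worth seeing concretely: a full datetime column whose values all happen to fall on the current day passes it too. The following is a minimal sketch and not part of the original answer; the names now, full, is_pure_date and looks_like_time are made up for illustration.

```python
import pandas as pd

# Take "now" once so both sides of the comparison use the same day.
now = pd.Timestamp("now")
today = now.normalize()  # midnight of the current day

# A genuine datetime column whose values all fall on today's date.
full = pd.Series([today + pd.Timedelta(hours=12, minutes=23),
                  today + pd.Timedelta(hours=15, minutes=23)])

# The date test correctly fails: flooring to the day changes the values.
is_pure_date = (full.dt.floor("d") == full).all()

# The time test from check() passes, although this is not a time-only column.
looks_like_time = (full.dt.date == now.date()).all()

print(is_pure_date, looks_like_time)
```

Here is_pure_date comes out false and looks_like_time comes out true, so check() would report such a column as a pure time field.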

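Another idea is to test first whether the column is numeric and, if so, report it as not a date field right away, which prevents numbers from being converted to datetimes. In addition, a column may legitimately contain datetimes that all fall on today's date (column f), so the time test is changed to use Series.str.contains with a pattern that matches HH:MM:SS or H:MM:SS: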

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['2019-01-01 12:23:10',
                         '2019-01-02'],
                   'b': ['2019-01-01',
                         '2019-01-02'],
                   'c': ['12:23:10',
                         '15:23:10'],
                   'd': ['a', 'b'],
                   'e': [1, 2],
                   'f': ['2019-11-13 12:23:10',
                         '2019-11-13']})
print (df)
                     a           b         c  d  e                    f
0  2019-01-01 12:23:10  2019-01-01  12:23:10  a  1  2019-11-13 12:23:10
1           2019-01-02  2019-01-02  15:23:10  b  2           2019-11-13

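def check(col):
    # Reject numeric columns before pandas can convert them to datetimes.
    if np.issubdtype(df[col].dtype, np.number):
        return ('its not a datefield')

    try:
        # Note: pandas 2.0+ infers a single format from the first value, so
        # columns that mix formats (such as 'a' and 'f') may raise here
        # unless format='mixed' is passed.
        dt = pd.to_datetime(df[col])
        if (dt.dt.floor('d') == dt).all():
            return ('Its a pure date field')
        # Match the raw strings against HH:MM:SS or H:MM:SS.
        elif df[col].str.contains(r"^\d{1,2}:\d{2}:\d{2}$").all():
            return ('Its a pure time field')
        else:
            return ('Its a Datetime field')
    except Exception:
        return ('its not a datefield')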


print (check('a'))
print (check('b'))
print (check('c'))
print (check('d'))
print (check('e'))
print (check('f'))
Its a Datetime field
Its a pure date field
Its a pure time field
its not a datefield
its not a datefield
Its a Datetime field
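
A variation on the same idea, offered only as a sketch and not as the answer's own method, is to test for times by parsing with an explicit format instead of a regular expression. It assumes the time strings follow HH:MM:SS; the helper name classify, its time_format parameter, and the return labels are hypothetical. Because the time check never looks at the date pandas fills in, it does not depend on the day the code runs.

```python
import warnings

import pandas as pd


def classify(series, time_format="%H:%M:%S"):
    """Return 'time', 'date', 'datetime', or 'not a date field' for a column."""
    # Numeric columns are rejected up front, as in the answer above.
    if pd.api.types.is_numeric_dtype(series):
        return "not a date field"

    # Pure time: every value matches the explicit time format.
    # errors="coerce" turns non-matching values into NaT instead of raising.
    as_time = pd.to_datetime(series, format=time_format, errors="coerce")
    if as_time.notna().all():
        return "time"

    # General parse; silence the format-inference warning some versions emit.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        parsed = pd.to_datetime(series, errors="coerce")
    if parsed.isna().any():
        return "not a date field"

    # Pure date: dropping the time part leaves every value unchanged.
    if (parsed.dt.normalize() == parsed).all():
        return "date"
    return "datetime"


df = pd.DataFrame({"a": ["2019-01-01 12:23:10", "2019-01-02 12:23:10"],
                   "b": ["2019-01-01", "2019-01-02"],
                   "c": ["12:23:10", "15:23:10"],
                   "d": ["a", "b"],
                   "e": [1, 2]})
result = {col: classify(df[col]) for col in df.columns}
print(result)
```

As with the floor test in the answer, a datetime column whose times are all exactly midnight would still be reported as a date.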

Regarding "Python Pandas: check whether a string is only a "Date", only a "Time", or a "Datetime"", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58831943/
