
python - Converting strings to datetime values using pandas

Reposted. Author: 行者123. Updated: 2023-12-01 04:04:21

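I have a Twitter dataset that I am trying to analyze with pandas, but I can't figure out how to convert relative time strings (for example "2 days", "24 hours", "2 months", or "5 years") into datetime values.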

I used the following code:

for i in df_merge['last_tweet']:
    # Each value is a string such as "2 months" or "5 years".
    n, d = i.split(" ")[:2]
    n = int(n)  # the count arrives as a string
    if d in ["years", "year"]:
        n_days = n * 365
    elif d in ["months", "month"]:
        n_days = n * 30

Best answer

You may want to write a helper function...

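import numpy as np
import pandas as pd

def ym2nptimedelta(delta):
    # Map unit words to NumPy timedelta64 unit codes.
    delta_cfg = {
        'month': 'M',
        'months': 'M',
        'year': 'Y',
        'years': 'Y'
    }
    n, item = delta.lower().split()
    # split() yields strings, so the count needs int(). Casting to seconds
    # turns 'Y'/'M' into their average length, a fixed duration that
    # pandas can subtract from a Timestamp.
    return np.timedelta64(int(n), delta_cfg[item]).astype('timedelta64[s]')

# pd.datetime is gone from current pandas releases; pd.Timestamp replaces it.
print(pd.Timestamp.today() - pd.Timedelta('2 days'))
print(pd.Timestamp.today() - pd.Timedelta('24 hours'))
print(pd.Timestamp.now() - ym2nptimedelta('2 years'))
print(pd.Timestamp.now() - ym2nptimedelta('5 years'))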

Output:

2016-03-08 20:39:34.315969
2016-03-09 20:39:34.315969
2014-03-11 09:01:10.316969
2011-03-11 15:33:34.317969
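
The year-based results in the output above land at a different time of day than the day- and hour-based ones, which is consistent with the 'Y' unit being treated as an average year length rather than a calendar step. If calendar-aware arithmetic is wanted instead, `pd.DateOffset` is one option. This is a different technique from the helper in the answer, shown here only as a minimal sketch; the fixed reference time `ref` is an assumption made so the example is reproducible.

```python
import pandas as pd

# Fixed reference time (an assumption for reproducibility); real code
# would typically use pd.Timestamp.now() instead.
ref = pd.Timestamp("2016-03-10 20:39:34")

# DateOffset steps back by whole calendar years or months and keeps
# the time of day unchanged.
two_years_ago = ref - pd.DateOffset(years=2)
two_months_ago = ref - pd.DateOffset(months=2)

print(two_years_ago)
print(two_months_ago)
```

With this reference time, `two_years_ago` is 2014-03-10 20:39:34 and `two_months_ago` is 2016-01-10 20:39:34.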

UPDATE1 (this helper function handles all acceptable numpy timedeltas):

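import numpy as np
import pandas as pd

def deltastr2date(delta):
    # Map unit words to NumPy timedelta64 unit codes.
    delta_cfg = {
        'year': 'Y',
        'years': 'Y',
        'month': 'M',
        'months': 'M',
        'week': 'W',
        'weeks': 'W',
        'day': 'D',
        'days': 'D',
        'hour': 'h',
        'hours': 'h',
        'min': 'm',
        'minute': 'm',
        'minutes': 'm',
        'sec': 's',
        'second': 's',
        'seconds': 's',
    }
    n, item = delta.lower().split()
    # int(n) because split() yields strings; the cast to seconds gives a
    # fixed duration ('Y'/'M' become their average length).
    td = np.timedelta64(int(n), delta_cfg[item]).astype('timedelta64[s]')
    # pd.Timestamp.now() replaces the removed pd.datetime.now().
    return pd.Timestamp.now() - td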

print(deltastr2date('2 days'))
print(deltastr2date('24 hours'))
print(deltastr2date('2 years'))
print(deltastr2date('5 years'))
print(deltastr2date('1 week'))
print(deltastr2date('10 hours'))
print(deltastr2date('45 minutes'))

Output:

2016-03-08 20:50:01.701853
2016-03-09 20:50:01.702853
2014-03-11 09:11:37.702853
2011-03-11 15:44:01.703853
2016-03-03 20:50:01.704854
2016-03-10 10:50:01.705854
2016-03-10 20:05:01.705854
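
For units with a fixed length (days, hours, minutes), pandas can parse the strings directly, as the `pd.Timedelta('2 days')` calls earlier in the answer suggest. The sketch below assumes a whole column of such strings and converts it in one call with `pd.to_timedelta`. It does not attempt years or months, and the reference time `ref` is a fixed value chosen only for reproducibility.

```python
import pandas as pd

# Fixed reference time (an assumption); swap in pd.Timestamp.now() as needed.
ref = pd.Timestamp("2016-03-10 20:50:01")

deltas = pd.Series(["2 days", "24 hours", "45 minutes"])

# to_timedelta parses the whole Series at once, so no per-row Python
# function is needed for these fixed-length units.
result = ref - pd.to_timedelta(deltas)

print(result)
```

The subtraction returns a datetime Series with one value per input string.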

UPDATE2 (showing how to apply the helper function to a DataFrame column):

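import numpy as np
import pandas as pd

def deltastr2date(delta):
    # Map unit words to NumPy timedelta64 unit codes.
    delta_cfg = {
        'year': 'Y',
        'years': 'Y',
        'month': 'M',
        'months': 'M',
        'week': 'W',
        'weeks': 'W',
        'day': 'D',
        'days': 'D',
        'hour': 'h',
        'hours': 'h',
        'min': 'm',
        'minute': 'm',
        'minutes': 'm',
        'sec': 's',
        'second': 's',
        'seconds': 's',
    }
    n, item = delta.lower().split()
    # int(n) because split() yields strings; the cast to seconds gives a
    # fixed duration ('Y'/'M' become their average length).
    td = np.timedelta64(int(n), delta_cfg[item]).astype('timedelta64[s]')
    # pd.Timestamp.now() replaces the removed pd.datetime.now().
    return pd.Timestamp.now() - td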

N = 20

dt_units = ['seconds','minutes','hours','days','weeks','months','years']

# generate random list of deltas
deltas = ['{0[0]} {0[1]}'.format(tup) for tup in zip(np.random.randint(1,5,N), np.random.choice(dt_units, N))]

df = pd.DataFrame({'delta': pd.Series(deltas)})

# add new column
df['last_tweet_dt'] = df['delta'].apply(deltastr2date)
print(df)

Output:

        delta              last_tweet_dt
0     3 hours 2016-03-10 20:32:49.252525
1      4 days 2016-03-06 23:32:49.252525
2   3 seconds 2016-03-10 23:32:46.253525
3     1 weeks 2016-03-03 23:32:49.253525
4   1 minutes 2016-03-10 23:31:49.253525
5   2 minutes 2016-03-10 23:30:49.253525
6      4 days 2016-03-06 23:32:49.254525
7     1 years 2015-03-11 17:43:37.254525
8   2 seconds 2016-03-10 23:32:47.254525
9   3 minutes 2016-03-10 23:29:49.254525
10    1 hours 2016-03-10 22:32:49.255525
11  2 seconds 2016-03-10 23:32:47.255525
12  3 minutes 2016-03-10 23:29:49.255525
13   3 months 2015-12-10 16:05:31.255525
14    4 weeks 2016-02-11 23:32:49.256526
15   3 months 2015-12-10 16:05:31.256526
16    4 hours 2016-03-10 19:32:49.256526
17    1 years 2015-03-11 17:43:37.256526
18    2 years 2014-03-11 11:54:25.257526
19  1 minutes 2016-03-10 23:31:49.257526
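
In the output above, the microsecond field drifts slightly from row to row, because the helper reads the current time once per row. A possible variation, sketched below, passes one shared reference time into the helper so every row is measured from the same instant. The names `deltastr2date_at` and `UNIT_TO_KWARG` are hypothetical and not part of the original answer. The sketch also swaps `np.timedelta64` for `pd.DateOffset`, so months and years become calendar steps rather than average lengths.

```python
import pandas as pd

# Hypothetical lookup: unit word -> pd.DateOffset keyword argument.
UNIT_TO_KWARG = {
    "year": "years", "years": "years",
    "month": "months", "months": "months",
    "week": "weeks", "weeks": "weeks",
    "day": "days", "days": "days",
    "hour": "hours", "hours": "hours",
    "min": "minutes", "minute": "minutes", "minutes": "minutes",
    "sec": "seconds", "second": "seconds", "seconds": "seconds",
}

def deltastr2date_at(delta, now):
    """Subtract a string such as '3 months' from the timestamp `now`."""
    n, unit = delta.lower().split()
    # An unknown unit word raises KeyError here instead of passing silently.
    return now - pd.DateOffset(**{UNIT_TO_KWARG[unit]: int(n)})

# One shared reference time (fixed here for reproducibility).
now = pd.Timestamp("2016-03-10 23:32:49")

df = pd.DataFrame(
    {"delta": ["3 hours", "4 days", "1 weeks", "3 months", "1 years"]}
)
# Extra keyword arguments to Series.apply are forwarded to the helper.
df["last_tweet_dt"] = df["delta"].apply(deltastr2date_at, now=now)
print(df)
```

Because `now` is computed once, rows with the same delta string get identical timestamps, which makes later grouping or comparison more predictable.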

Regarding "python - Converting strings to datetime values using pandas", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/35924630/
