
Python/Pandas: binning Timedelta data

Reposted. Author: 太空狗. Updated: 2023-10-30 00:18:47

I have a DataFrame with two columns:

    userID     duration
0 DSm7ysk 03:08:49
1 no51CdJ 00:35:50
2 ...

The "duration" column is of type timedelta. I tried using:

bins = [dt.timedelta(minutes=0), dt.timedelta(minutes=5),
        dt.timedelta(minutes=10), dt.timedelta(minutes=20),
        dt.timedelta(minutes=30), dt.timedelta(hours=4)]

labels = ['0-5min','5-10min','10-20min','20-30min','30min+']

df['bins'] = pd.cut(df['duration'], bins, labels = labels)

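However, the binned data does not use the specified bins; instead, a separate bin is created for every duration in the frame.

What is the simplest way to bin timedelta objects into irregular bins? Or am I just missing something obvious here?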

Best answer

This works with pandas 0.23.4:

import pandas as pd

# Sample data with the durations stored as Timedelta values
df = pd.DataFrame({
    'userID': ['DSm7ysk', 'no51CdJ', 'foo', 'bar'],
    'duration': [
        pd.Timedelta('3 hours 8 minutes 49 seconds'),
        pd.Timedelta('35 minutes 50 seconds'),
        pd.Timedelta('1 minutes 13 seconds'),
        pd.Timedelta('6 minutes 43 seconds'),
    ],
})

# Irregular bin edges, expressed as Timedelta objects
bins = [
    pd.Timedelta(minutes=0),
    pd.Timedelta(minutes=5),
    pd.Timedelta(minutes=10),
    pd.Timedelta(minutes=20),
    pd.Timedelta(minutes=30),
    pd.Timedelta(hours=4),
]

# One label per interval between consecutive edges
labels = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

df['bins'] = pd.cut(df['duration'], bins, labels=labels)

Result:

[Image: result]
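As a quick check of what the answer's code assigns, the sketch below rebuilds the same four sample rows and collects the label given to each one. The variable name `assigned` and the use of `pd.to_timedelta` to build the column are choices made here for brevity, not part of the original answer.

```python
import pandas as pd

# Same sample durations as in the answer, built from HH:MM:SS strings
df = pd.DataFrame({
    'userID': ['DSm7ysk', 'no51CdJ', 'foo', 'bar'],
    'duration': pd.to_timedelta(['03:08:49', '00:35:50', '00:01:13', '00:06:43']),
})

# Bin edges at 0, 5, 10, 20 and 30 minutes, plus a 4-hour upper edge
bins = [pd.Timedelta(minutes=m) for m in (0, 5, 10, 20, 30)]
bins.append(pd.Timedelta(hours=4))
labels = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

df['bins'] = pd.cut(df['duration'], bins, labels=labels)

# Labels as plain strings, in row order
assigned = df['bins'].astype(str).tolist()
print(assigned)  # ['30min+', '30min+', '0-5min', '5-10min']
```

Note that pd.cut uses right-closed intervals by default, so a duration of exactly zero falls outside the first bin unless include_lowest=True is passed, and any duration above the last edge (4 hours here) is left as NaN.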

Regarding Python/Pandas binning of Timedelta data, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46930291/
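If pd.cut does not honor timedelta bin edges in a given pandas version, as the question describes, one possible workaround (not part of the original answer) is to convert the durations to plain numbers and bin those instead. The sketch below assumes minutes as the unit; the names `minutes` and `minute_bins` are introduced here for illustration, and 240 stands for the 4-hour upper edge.

```python
import pandas as pd

df = pd.DataFrame({
    'userID': ['DSm7ysk', 'no51CdJ', 'foo', 'bar'],
    'duration': pd.to_timedelta(['03:08:49', '00:35:50', '00:01:13', '00:06:43']),
})

# Convert each Timedelta to a float number of minutes
minutes = df['duration'].dt.total_seconds() / 60

# Same irregular edges as before, expressed in minutes (240 min = 4 hours)
minute_bins = [0, 5, 10, 20, 30, 240]
labels = ['0-5min', '5-10min', '10-20min', '20-30min', '30min+']

# Bin the numeric values; the labels keep the output readable
df['bins'] = pd.cut(minutes, bins=minute_bins, labels=labels)

numeric_assigned = df['bins'].astype(str).tolist()
print(numeric_assigned)
```

Because the edges are ordinary numbers, this route avoids depending on timedelta support in pd.cut, at the cost of keeping the unit conversion consistent by hand.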
