python - Pandas: finding a percentile while ignoring missing dates

Reposted. Author: 行者123. Updated: 2023-12-01 06:48:40

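I have a DataFrame and I am trying to find the percentile of a datetime within one of its columns. The data and the call I am using are shown below.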

DataFrame:

student, attempts, time
student 1,14, 9/3/2019 12:32:32 AM
student 2,2, 9/3/2019 9:37:14 PM
student 3, 5
student 4, 16, 9/5/2019 8:58:14 PM

student2Info = [14, 4, Timestamp('2019-09-04 00:26:36')]
data['time'] = pd.to_datetime(data['time_0001'], errors='coerce')
perc1_first = stats.percentileofscore(data['time'].notnull(), student2Info[2], 'rank')

Here student2Info[2] holds the datetime for a particular student. When I try to run this, I get the error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Any ideas on how to compute the percentile correctly even though some times are missing from the column?

Best answer

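You need to convert the timestamps into a numeric unit that percentileofscore can work with. Also, notnull() returns a boolean mask (one True/False value per row) that you can use to filter the DataFrame; it does not return the filtered values themselves, so the question's call was ranking the score against a column of booleans. I have updated that for you. Here is a working example: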

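import pandas as pd
import scipy.stats as stats

data = pd.DataFrame.from_dict({
    "student": [1, 2, 3, 4],
    "attempts": [14, 2, 5, 16],
    "time_0001": [
        "9/3/2019 12:32:32 AM",
        "9/3/2019 9:37:14 PM",
        "",  # missing time for student 3
        "9/5/2019 8:58:14 PM"
    ]
})

student2Info = [14, 4, pd.Timestamp('2019-09-04 00:26:36')]

# Unparseable or empty strings become NaT instead of raising.
data['time'] = pd.to_datetime(data['time_0001'], errors='coerce')

# Use the boolean mask to keep only the rows that have a real timestamp.
valid_times = data.loc[data['time'].notnull(), 'time']

# Convert to day ordinals (integers). Note: this drops the time of day.
ordinals = valid_times.transform(pd.Timestamp.toordinal)

perc1_first = stats.percentileofscore(ordinals, student2Info[2].toordinal(), 'rank')
print(perc1_first) #-> 66.66666666666667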
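
One caveat with the example above: toordinal() keeps only the calendar day, so two timestamps on the same date are treated as equal. The following is a minimal sketch, not part of the original answer, of a variant that ranks at sub-day resolution by converting each timestamp to seconds since the epoch. The helper names percentile_of_timestamp and percentile_of_day are hypothetical and exist only for this illustration; the sample times are the ones from the question.

```python
import pandas as pd
from scipy import stats

raw = pd.Series([
    "9/3/2019 12:32:32 AM",
    "9/3/2019 9:37:14 PM",
    "",  # missing time
    "9/5/2019 8:58:14 PM",
])
times = pd.to_datetime(raw, errors="coerce")  # the empty string becomes NaT

mask = times.notnull()  # boolean Series, one flag per row (not the values)
valid = times[mask]     # only the rows that hold a real timestamp


def percentile_of_timestamp(valid_times, score):
    """Hypothetical helper: rank `score` using seconds since the epoch."""
    seconds = valid_times.map(lambda t: t.timestamp())
    return stats.percentileofscore(seconds, score.timestamp(), kind="rank")


def percentile_of_day(valid_times, score):
    """Hypothetical helper: rank `score` by calendar day only, as in the answer."""
    ordinals = valid_times.map(lambda t: t.toordinal())
    return stats.percentileofscore(ordinals, score.toordinal(), kind="rank")


# A score that falls between the two 9/3 entries.
noon = pd.Timestamp("2019-09-03 12:00:00")
print(percentile_of_timestamp(valid, noon))  # sees one earlier entry
print(percentile_of_day(valid, noon))        # sees two entries tied on the same day
```

For the timestamp used in the question (2019-09-04 00:26:36) both helpers agree, because no other entry shares its date. They diverge only when the score and some entries fall on the same day, which is when the choice of unit matters.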

This post on python - Pandas: finding a percentile while ignoring missing dates is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/59120904/
