
python - How to count consecutive ordered values on a pandas DataFrame

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:58:01

I am trying to get the maximum count of consecutive 0 values for each id in a pandas DataFrame with the columns id, date and value. The data looks like this:

id    date       value
354 2019-03-01 0
354 2019-03-02 0
354 2019-03-03 0
354 2019-03-04 5
354 2019-03-05 5
354 2019-03-09 7
354 2019-03-10 0
357 2019-03-01 5
357 2019-03-02 5
357 2019-03-03 8
357 2019-03-04 0
357 2019-03-05 0
357 2019-03-06 7
357 2019-03-07 7
540 2019-03-02 7
540 2019-03-03 8
540 2019-03-04 9
540 2019-03-05 8
540 2019-03-06 7
540 2019-03-07 5
540 2019-03-08 2
540 2019-03-09 3
540 2019-03-10 2

The desired result is grouped by id and looks like this:

id   max_consecutive_zeros
354 3
357 2
540 0
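
To reproduce the sample above, it can be loaded into a DataFrame roughly as follows. This is only a sketch: the name `df` is chosen to match the code in the answer, and parsing `date` as a datetime column is an assumption, since the question does not state the dtypes.

```python
import io

import pandas as pd

# Rows copied verbatim from the sample in the question.
raw = """id date value
354 2019-03-01 0
354 2019-03-02 0
354 2019-03-03 0
354 2019-03-04 5
354 2019-03-05 5
354 2019-03-09 7
354 2019-03-10 0
357 2019-03-01 5
357 2019-03-02 5
357 2019-03-03 8
357 2019-03-04 0
357 2019-03-05 0
357 2019-03-06 7
357 2019-03-07 7
540 2019-03-02 7
540 2019-03-03 8
540 2019-03-04 9
540 2019-03-05 8
540 2019-03-06 7
540 2019-03-07 5
540 2019-03-08 2
540 2019-03-09 3
540 2019-03-10 2
"""

# Whitespace-separated text; `date` parsed as datetime (assumption).
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", parse_dates=["date"])
print(df.shape)  # (23, 3)
```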

I have already achieved this with a for loop, but it becomes very slow on huge pandas DataFrames. I found some similar solutions, but none of them solved my problem.

Best answer

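Create a groupID m for consecutive rows that have the same value. Next, call groupby on id and m, then value_counts, and use .loc on the resulting MultiIndex to slice out only the 0 values on the rightmost index level. Finally, filter out the duplicate index entries per id with duplicated, and reindex to create a 0 value for any id that has no 0 counts.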
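
The run-labelling step may be easier to follow on a toy example. The sketch below uses a made-up Series `s` (not data from the question) to show how `diff().ne(0).cumsum()` assigns one number per run of equal consecutive values.

```python
import pandas as pd

# Hypothetical toy values: runs are [0, 0], [5, 5], [0], [7].
s = pd.Series([0, 0, 5, 5, 0, 7])

# diff() is NaN on the first row and non-zero wherever the value changes;
# ne(0) marks those rows as True, and cumsum() turns the marks into run numbers.
run_id = s.diff().ne(0).cumsum()
print(run_id.tolist())  # [1, 1, 2, 2, 3, 4]
```

Rows that share a run number belong to the same run, so grouping by that number and counting rows gives each run's length.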

m = df.value.diff().ne(0).cumsum().rename('gid')
# Consecutive rows that have the same value are assigned the same number by this command.
# That number identifies a run of consecutive equal values, hence the name groupID (gid).

df1 = df.groupby(['id', m]).value.value_counts().loc[:,:,0].droplevel(-1)
# This groupby splits the rows of each id into separate groups of consecutive rows with the same value.
# Within each group, value_counts counts each value, and `.loc` picks only `0`, because only the count of value `0` matters here.

df1 = df1.sort_values(ascending=False)
df1[~df1.index.duplicated()].reindex(df.id.unique(), fill_value=0)
# An `id` can have several runs of `0`, and only the longest one is wanted.
# `value_counts` sorts counts only within each (id, gid) group, not across the runs of an `id`,
# so the run lengths are sorted in descending order first; the True/False mask from
# `duplicated` then keeps the first (longest) run per `id`.
# Finally, `reindex` adds back any `id` that has no value 0 in the original `df`.
# Note: `id` is the column `id` in `df`. It is different from the groupID `m` created for the groupby.

Out[315]:
id
354 3
357 2
540 0
Name: value, dtype: int64
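
As a cross-check, the same result can be reached by filtering the zero rows, measuring each run with `size()`, and taking the maximum per `id`. This is a sketch of an alternative, not the answer's own code: the helper name `max_consecutive_zeros` and the toy frame are hypothetical, and the toy data deliberately puts a short run of zeros before a longer one.

```python
import pandas as pd


def max_consecutive_zeros(df: pd.DataFrame) -> pd.Series:
    """Longest run of consecutive 0s in `value` per `id` (hypothetical helper)."""
    # Number each run of equal consecutive values.
    gid = df["value"].ne(df["value"].shift()).cumsum()
    runs = df.assign(gid=gid)
    # Keep only the zero rows and count the rows in each (id, gid) run.
    run_lengths = runs[runs["value"].eq(0)].groupby(["id", "gid"]).size()
    # Longest run per id; ids without any 0 are filled in with 0.
    longest = run_lengths.groupby(level="id").max()
    return longest.reindex(df["id"].unique(), fill_value=0)


# Made-up data: id 1 has a run of one 0 followed later by a run of three 0s.
toy = pd.DataFrame({
    "id":    [1, 1, 1, 1, 1, 2, 2, 2, 3, 3],
    "value": [0, 5, 0, 0, 0, 0, 0, 4, 7, 8],
})
result = max_consecutive_zeros(toy)
print(result.tolist())  # [3, 2, 0]
```

Grouping on both `id` and the run number keeps a run from spilling across two ids, which matters here because id 1 ends with zeros and id 2 starts with zeros.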

Regarding "python - How to count consecutive ordered values on a pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57363649/
