
python - DataFrameGroupBy diff() with a condition

Reposted. Author: 太空狗. Updated: 2023-10-30 02:39:54

Suppose I have a DataFrame:

import numpy as np
import pandas as pd
df = pd.DataFrame({'CATEGORY': ['a', 'b', 'c', 'b', 'b', 'a', 'b'],
                   'VALUE': [np.nan, 1, 0, 0, 5, 0, 4]})

It looks like this:

  CATEGORY  VALUE
0        a    NaN
1        b      1
2        c      0
3        b      0
4        b      5
5        a      0
6        b      4

I group it:

df = df.groupby(by='CATEGORY')

Now let me show what I want, using group 'b' as an example:

df.get_group('b')

Group b:

  CATEGORY  VALUE
1        b      1
3        b      0
4        b      5
6        b      4

What I need: within each group, compute the diff() between consecutive VALUE entries, skipping all NaNs and 0s. So the result should be:

  CATEGORY  VALUE  DIFF
1        b      1     -
3        b      0     -
4        b      5     4
6        b      4    -1
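
To make the requirement concrete, here is a minimal sketch on the values of group b alone. The variable names (b_values, plain, skipped) are illustrative and not part of the question; the sketch only contrasts a plain diff() with a diff() taken after zeros and NaNs are dropped.

```python
import numpy as np
import pandas as pd

# VALUE column of group 'b', keeping the row labels from the question
b_values = pd.Series([1.0, 0.0, 5.0, 4.0], index=[1, 3, 4, 6])

# A plain diff() treats the 0 at label 3 as a real observation
plain = b_values.diff()

# Turning 0 into NaN and dropping NaN first lets diff() compare 5 with 1
skipped = b_values.replace(0, np.nan).dropna().diff()

print(plain.tolist())    # [nan, -1.0, 5.0, -1.0]
print(skipped.tolist())  # [nan, 4.0, -1.0]
```

The skipped result has no entry for label 3, which corresponds to the "-" shown for that row in the desired output.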

Best answer

You can use diff to subtract the values after removing the 0 and NaN values:

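import numpy as np
import pandas as pd

df = pd.DataFrame({'CATEGORY': ['a', 'b', 'c', 'b', 'b', 'a', 'b'],
                   'VALUE': [np.nan, 1, 0, 0, 5, 0, 4]})

grouped = df.groupby("CATEGORY")

# per group: turn 0 into NaN, drop the NaNs, then diff what remains
diff = lambda x: x["VALUE"].replace(0, np.nan).dropna().diff()
# drop the CATEGORY index level so the result aligns with df's row labels
df["DIFF"] = grouped.apply(diff).reset_index(level=0, drop=True)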

print(df)

  CATEGORY  VALUE  DIFF
0        a    NaN   NaN
1        b    1.0   NaN
2        c    0.0   NaN
3        b    0.0   NaN
4        b    5.0   4.0
5        a    0.0   NaN
6        b    4.0  -1.0
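
As a possible alternative (a sketch, not the method from the answer above), the same result can be obtained without groupby.apply by filtering the rows first and then calling the built-in groupby diff(). The names mask and kept are illustrative assumptions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'CATEGORY': ['a', 'b', 'c', 'b', 'b', 'a', 'b'],
                   'VALUE': [np.nan, 1, 0, 0, 5, 0, 4]})

# Keep only the rows whose VALUE is neither NaN nor 0
mask = df["VALUE"].notna() & (df["VALUE"] != 0)
kept = df.loc[mask]

# diff() within each CATEGORY; the result keeps the row labels of `kept`,
# so assigning it back aligns on the index and leaves other rows as NaN
df["DIFF"] = kept.groupby("CATEGORY")["VALUE"].diff()

print(df)
```

Because the assignment aligns on the original index, rows that were filtered out (and the first kept row of each group) end up with NaN in DIFF.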

Regarding python - DataFrameGroupBy diff() with a condition, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43140444/
