
Python: Creating a group column in a Pandas DataFrame based on the values of an integer range

Reposted. Author: 行者123. Updated: 2023-12-01 03:40:50

For each range [0, 150] in the diff column, I want to create a group column that increases by 1 every time the range resets. The range resets when diff is negative.

import pandas as pd
df = pd.DataFrame({'year': [2016, 2016, 2016, 2016, 2016, 2016, 2016],
'month' : [1, 1, 2, 3, 3, 3, 3],
'day': [23, 25, 1, 1, 7, 20, 30]})
df = pd.to_datetime(df)
df = pd.concat([df, pd.Series(data=[15, 35, 80, 5, 20, 45, 90])], axis=1)
df.columns = ['date', 'percentworn']
col_shift = ['percentworn']
df_shift = df.shift(1).loc[:, col_shift]
df_combined = df.join(df_shift, how='inner', rsuffix='_2')
df_combined.fillna(value=0,inplace=True)
df_combined['diff'] = df_combined['percentworn'] - df_combined['percentworn_2']


The grp column should be 0, 0, 0, 1, 1, 1, 1. The code I tried is:

def grping(df):
    df_ = df.copy(deep=True)
    i = 0
    if df_['diff'] >= 0:
        df_['grp'] = i
    else:
        i += 1
        df_['grp'] = i
    return df_
df_combined.apply(grping, axis=1)

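I need i to keep its value after i += 1 increments it, but it is reset to 0 on every row. How can I achieve this? Or is there a better way to get the desired result?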


Best answer

IIUC, you can test whether the 'diff' column is negative, which produces a boolean array, then convert it to int and call cumsum:

In [313]:
df_combined['group'] = (df_combined['diff'] < 0).astype(int).cumsum()
df_combined

Out[313]:
        date  percentworn  percentworn_2  diff  group
0 2016-01-23           15            0.0  15.0      0
1 2016-01-25           35           15.0  20.0      0
2 2016-02-01           80           35.0  45.0      0
3 2016-03-01            5           80.0 -75.0      1
4 2016-03-07           20            5.0  15.0      1
5 2016-03-20           45           20.0  25.0      1
6 2016-03-30           90           45.0  45.0      1
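
For comparison, the running counter that the question was trying to keep can be written as an explicit loop. This is only a sketch in plain Python: the helper name running_groups is hypothetical, and the diff values are copied from the output above. The counter lives outside the per-row logic, which is why it persists, whereas in the question's function i is set back to 0 on every call.

```python
# diff values copied from the output above
diffs = [15.0, 20.0, 45.0, -75.0, 15.0, 25.0, 45.0]


def running_groups(values):
    """Return a group id per value; the id goes up by 1 at each negative value.

    Hypothetical helper, written to mirror what cumsum does on the 0/1 flags.
    """
    groups = []
    counter = 0  # kept across iterations, unlike `i` in the question
    for value in values:
        if value < 0:
            counter += 1  # a negative diff marks the start of a new group
        groups.append(counter)
    return groups


print(running_groups(diffs))  # [0, 0, 0, 1, 1, 1, 1]
```

The vectorized cumsum version gives the same result without a Python-level loop, which generally matters more as the frame grows.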

Breaking the cumsum expression down step by step:

In [314]:
df_combined['diff'] < 0

Out[314]:
0 False
1 False
2 False
3 True
4 False
5 False
6 False
Name: diff, dtype: bool

In [316]:
(df_combined['diff'] < 0).astype(int)

Out[316]:
0 0
1 0
2 0
3 1
4 0
5 0
6 0
Name: diff, dtype: int32
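
The remaining step, cumsum, turns those 0/1 flags into a running count of resets. The sketch below puts the three steps together in a self-contained form. It assumes the same percentworn values as the question, and it uses Series.diff() as a shortcut in place of the shift/join/fillna sequence, which is a substitution and not the code from the question. With diff() the first row is NaN instead of 15.0, but NaN < 0 evaluates to False, so the grouping comes out the same.

```python
import pandas as pd

# Same readings as in the question
percentworn = pd.Series([15, 35, 80, 5, 20, 45, 90], name='percentworn')

# Change from the previous row; the first entry is NaN because it has no predecessor
diff = percentworn.diff()

# Step 1: flag the rows where the value dropped (NaN < 0 is False)
is_reset = diff < 0

# Step 2: booleans to 0/1 integers
flags = is_reset.astype(int)

# Step 3: the running total of flags is the group id
group = flags.cumsum()

print(group.tolist())  # [0, 0, 0, 1, 1, 1, 1]
```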

This post is based on a similar question found on Stack Overflow about creating a group column in a Pandas DataFrame based on the values of an integer range: https://stackoverflow.com/questions/39662016/
