
python - Pandas backfill with a specific value

Reposted. Author: 太空狗. Updated: 2023-10-30 01:11:25

I have a DataFrame like this:

import numpy as np
import pandas as pd
df = pd.DataFrame({'val': [np.nan, np.nan, np.nan, np.nan, 15, 1, 5, 2, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 2, 23, 5, 12, np.nan, np.nan, 3, 4, 5]})
df['name'] = ['a']*8 + ['b']*15

df

>>>
val name
0 NaN a
1 NaN a
2 NaN a
3 NaN a
4 15.0 a
5 1.0 a
6 5.0 a
7 2.0 a
8 NaN b
9 NaN b
10 NaN b
11 NaN b
12 NaN b
13 NaN b
14 2.0 b
15 23.0 b
16 5.0 b
17 12.0 b
18 NaN b
19 NaN b
20 3.0 b
21 4.0 b
22 5.0 b

For each name, I want to backfill with -1 the (up to) 3 NaN points immediately preceding a valid value, so that I end up with:

>>>
val name
0 NaN a
1 -1.0 a
2 -1.0 a
3 -1.0 a
4 15.0 a
5 1.0 a
6 5.0 a
7 2.0 a
8 NaN b
9 NaN b
10 NaN b
11 -1.0 b
12 -1.0 b
13 -1.0 b
14 2.0 b
15 23.0 b
16 5.0 b
17 12.0 b
18 -1.0 b
19 -1.0 b
20 3.0 b
21 4.0 b
22 5.0 b

Note that there can be multiple runs of NaN. If a run contains fewer than 3 NaNs, all of them are filled (at most 3 are backfilled per run).
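The rule above can be pinned down with a naive reference implementation in plain Python. This is only a sketch for the values of a single group; the function name `backfill_marker` and its parameters `limit` and `marker` are hypothetical and are not part of pandas or the original question.

```python
import math


def backfill_marker(values, limit=3, marker=-1.0):
    """Replace up to `limit` NaNs directly before each valid value with `marker`."""
    out = list(values)
    budget = 0  # how many more NaNs may be filled while walking right to left
    for i in range(len(out) - 1, -1, -1):
        if math.isnan(out[i]):
            if budget > 0:
                out[i] = marker
                budget -= 1
        else:
            # A valid value resets the budget for the NaN run to its left.
            budget = limit
    return out


nan = float('nan')
print(backfill_marker([nan, nan, nan, nan, 15.0, 1.0]))
# [nan, -1.0, -1.0, -1.0, 15.0, 1.0]
print(backfill_marker([2.0, nan, nan, 3.0]))
# [2.0, -1.0, -1.0, 3.0]
```

Trailing NaNs with no valid value after them are left untouched, because the budget starts at zero.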

Best answer

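You can use first_valid_index, which returns the index label of the first non-null value in each group, and then use loc to assign -1. Note that this relies on the default integer index and only covers the NaN run before each group's first valid value.

# index label of the first non-null 'val' in each group
idx = df.groupby('name').val.apply(lambda x: x.first_valid_index())
for x in idx:
    # .loc slices are inclusive: the three rows just before the first valid value
    df.loc[x - 3:x - 1, 'val'] = -1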

df
Out[51]:
val name
0 NaN a
1 -1.0 a
2 -1.0 a
3 -1.0 a
4 15.0 a
5 1.0 a
6 5.0 a
7 2.0 a
8 NaN b
9 NaN b
10 NaN b
11 -1.0 b
12 -1.0 b
13 -1.0 b
14 2.0 b
15 23.0 b
16 5.0 b
17 12.0 b
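The loop above does label arithmetic on the whole frame, so a group with fewer than three leading NaNs could in principle reach back into the previous group's rows. A minimal sketch of a group-bounded variant follows, assuming unique index labels; the names `out`, `pos`, and `leading` are illustrative and do not come from the original answer.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({'val': [nan, nan, nan, nan, 15, 1, 5, 2,
                           nan, nan, nan, nan, nan, nan, 2, 23, 5, 12,
                           nan, nan, 3, 4, 5]})
df['name'] = ['a'] * 8 + ['b'] * 15

out = df.copy()
for name, grp in df.groupby('name'):
    first = grp['val'].first_valid_index()
    if first is None:
        # The whole group is NaN, so there is nothing to backfill from.
        continue
    # Position of the first valid value inside this group only.
    pos = grp.index.get_loc(first)
    # Up to three labels directly before it, never leaving the group.
    leading = grp.index[max(pos - 3, 0):pos]
    if len(leading):
        out.loc[leading, 'val'] = -1

print(out)
```

Like the original loop, this only handles the NaN run before each group's first valid value; later runs (rows 18 and 19 here) stay NaN.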

Update

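# backfill at most 3 NaNs per run, separately within each group
s = df.groupby('name').val.bfill(limit=3)
# positions that were NaN before and are non-null now were filled by bfill: set them to -1
s.loc[s.notnull() & df.val.isnull()] = -1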
s
Out[59]:
0 NaN
1 -1.0
2 -1.0
3 -1.0
4 15.0
5 1.0
6 5.0
7 2.0
8 NaN
9 NaN
10 NaN
11 -1.0
12 -1.0
13 -1.0
14 2.0
15 23.0
16 5.0
17 12.0
18 NaN
19 -1.0
20 -1.0
21 -1.0
22 3.0
23 4.0
24 5.0
Name: val, dtype: float64
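For reference, here is a self-contained sketch of the same bfill-and-mask idea run on the 23-row frame from the question. The column name `val_marked` and the variables `filled` and `was_filled` are assumptions made for illustration; the result is written to a new column so the original values stay available.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({'val': [nan, nan, nan, nan, 15, 1, 5, 2,
                           nan, nan, nan, nan, nan, nan, 2, 23, 5, 12,
                           nan, nan, 3, 4, 5]})
df['name'] = ['a'] * 8 + ['b'] * 15

# Backfill at most 3 NaNs per run, separately for each name.
filled = df.groupby('name')['val'].bfill(limit=3)
# True exactly where bfill replaced a NaN.
was_filled = filled.notna() & df['val'].isna()
# Put -1 at those positions and keep every other value as it was.
df['val_marked'] = df['val'].mask(was_filled, -1)

print(df)
```

On this data, rows 1 to 3, 11 to 13, and 18 to 19 become -1.0, while rows 0 and 8 to 10 remain NaN, which matches the desired output in the question.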

Regarding python - Pandas backfill with a specific value, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/50750030/
