
python - Pandas: select the DataFrame rows of the first occurrence within a range

Reposted. Author: 太空宇宙. Updated: 2023-11-03 23:55:13

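I have a DataFrame from which I want to select the rows whose values fall within a range, but only for the first time the data passes through that range.

DataFrame: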

import pandas as pd

data = {'x': [1, 2, 3, 4, 5, 6, 7, 6.5, 5.5, 4.5, 3.5, 2.5, 1],
        'y': [1, 4, 3, 3, 52, 3, 74, 64, 15, 41, 31, 12, 11]}
df = pd.DataFrame(data)

For example, select x between 2 and 6 (both bounds excluded), first occurrence only:

      x   y
0   1.0   1  # out of range
1   2.0   4  # out of range
2   3.0   3  # this, first occurrence
3   4.0   3  # this, first occurrence
4   5.0  52  # this, first occurrence
5   6.0   3  # out of range
6   7.0  74  # out of range
7   6.5  64  # out of range
8   5.5  15  # not this, the range is repeating
9   4.5  41  # not this, the range is repeating
10  3.5  31  # not this, the range is repeating
11  2.5  12  # not this, the range is repeating
12  1.0  11  # out of range

Output:

     x   y
2  3.0   3  # this, first occurrence
3  4.0   3  # this, first occurrence
4  5.0  52  # this, first occurrence

I am trying to adapt this example, "Select DataFrame rows between two dates", so that it selects the data between two values for the first occurrence only:

xlim=[2,6]
mask = (df['x'] > xlim[0]) & (df['x'] <= xlim[1])
df=df.loc[mask] #need to make it the first occurrence here

Best answer

Here is one approach:

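# Boolean mask: True wherever x lies strictly between 2 and 6
# (inclusive='neither' needs pandas >= 1.3; older versions used inclusive=False)
m = df.x.between(2, 6, inclusive='neither')
# XOR with the previous row flags every change of the mask; cumsum numbers the runs.
# The first in-range run is labeled 1, so keeping only the 1s gives the rows of interest.
df.loc[(m ^ m.shift(fill_value=False)).cumsum().eq(1)]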

     x   y
2  3.0   3
3  4.0   3
4  5.0  52
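
The approach above can be wrapped in a small helper, shown here as a minimal sketch. The function name first_run_in_range and its parameters are hypothetical and not part of the original answer. It assumes exclusive bounds, as in the question's expected output, and treats the (nonexistent) row before the first one as out of range via shift(fill_value=False).

```python
import pandas as pd


def first_run_in_range(df, column, low, high):
    """Return the rows of the first consecutive run where low < df[column] < high.

    Hypothetical helper for illustration; bounds are exclusive.
    """
    # True wherever the value lies strictly inside (low, high)
    in_range = df[column].gt(low) & df[column].lt(high)
    # True wherever the flag differs from the previous row;
    # the row before the first one is treated as out of range
    changed = in_range ^ in_range.shift(fill_value=False)
    # Running count of flips: label 1 marks the first in-range run
    block = changed.cumsum()
    return df.loc[block.eq(1)]


data = {'x': [1, 2, 3, 4, 5, 6, 7, 6.5, 5.5, 4.5, 3.5, 2.5, 1],
        'y': [1, 4, 3, 3, 52, 3, 74, 64, 15, 41, 31, 12, 11]}
df = pd.DataFrame(data)

result = first_run_in_range(df, 'x', 2, 6)
print(result)
```

Because the row before the first one counts as out of range, a DataFrame that already starts inside the range still gets label 1 for its opening run.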

Details:

df.assign(in_range=m, is_next_different=(m ^ m.shift(fill_value=False)).cumsum())

      x   y  in_range  is_next_different
0   1.0   1     False                  0
1   2.0   4     False                  0
2   3.0   3      True                  1
3   4.0   3      True                  1
4   5.0  52      True                  1
5   6.0   3     False                  2
6   7.0  74     False                  2
7   6.5  64     False                  2
8   5.5  15      True                  3
9   4.5  41      True                  3
10  3.5  31      True                  3
11  2.5  12      True                  3
12  1.0  11     False                  4
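
As the table shows, the in-range runs carry the odd labels (1, 3, ...) and the out-of-range runs the even ones. A rough sketch of how that could be used to pick the n-th pass through the range follows. The helper nth_run_in_range and its parameter n are hypothetical and go beyond the original answer. The odd-label rule assumes the mask is shifted with fill_value=False, so that counting always starts from an out-of-range state.

```python
import pandas as pd


def nth_run_in_range(df, column, low, high, n=1):
    """Return the rows of the n-th consecutive run where low < df[column] < high.

    Hypothetical generalization for illustration; bounds are exclusive.
    """
    in_range = df[column].gt(low) & df[column].lt(high)
    # Count mask flips, starting from an assumed out-of-range state
    block = (in_range ^ in_range.shift(fill_value=False)).cumsum()
    # In-range runs are labeled 1, 3, 5, ..., so the n-th run has label 2n - 1
    return df.loc[block.eq(2 * n - 1)]


data = {'x': [1, 2, 3, 4, 5, 6, 7, 6.5, 5.5, 4.5, 3.5, 2.5, 1],
        'y': [1, 4, 3, 3, 52, 3, 74, 64, 15, 41, 31, 12, 11]}
df = pd.DataFrame(data)

first_pass = nth_run_in_range(df, 'x', 2, 6, n=1)
second_pass = nth_run_in_range(df, 'x', 2, 6, n=2)
print(second_pass)
```

With n=1 this reduces to the .eq(1) selection from the answer; asking for a run that does not exist simply returns an empty DataFrame.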

Regarding "python - Pandas: select the DataFrame rows of the first occurrence within a range", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58007870/
