python - Cumsum a column while skipping rows or setting a fixed value on a condition based on the result of the actual cumsum

Repost. Author: 行者123. Updated: 2023-12-04 17:09:02

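I'm trying to find a vectorized solution in pandas for something that is common in spreadsheets: a cumsum that skips rows or sets a fixed value depending on the result of the actual cumsum. I have the following: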

    A
1 0
2 -1
3 2
4 3
5 -2
6 -3
7 1
8 -1
9 1
10 -2
11 1
12 2
13 -1
14 -2

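I need to add a second column holding the cumulative sum of 'A'. If one of the sums is positive, it should be replaced with 0, and the cumsum should continue from that 0. At the same time, if the sum gives a negative value lower than the lowest value in column A recorded since the last 0 in column B, I need to replace it with that lowest value from column A. I know this is a big ask, but is there a vectorized solution? Maybe using helper columns. The result should look like this: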

    A   B
1 0 0
2 -1 -1 # -1+0 = -1
3 2 0 # -1 + 2 = 1 but 1>0 so this is 0
4 3 0 # same as previous row
5 -2 -2 # -2+0 = -2
6 -3 -3 # -2-3 = -5 but the lowest value in column A since last 0 is -3 so this is replaced by -3
7 1 -2 # 1-3 = -2
8 -1 -3 # -1-2 = -3
9 1 -2 # -3 + 1 = -2
10 -2 -3 # -2-2 = -4 but the lowest value in column A since last 0 is -3 so this is replaced by -3
11 1 -2 # -3 +1 = -2
12 2 0 # -2+2 = 0
13 -1 -1 # 0-1 = -1
14 -2 -2 # -1-2 = -3 but the lowest value in column A since last cap is -2 so this is -2 instead of -3

Currently I've done this, but it doesn't work 100% and it isn't very efficient:

df['B'] = 0
df['B'][0] = 0
for x in range(len(df)-1):
    A = df['A'][x + 1]
    B = df['B'][x] + A
    if B >= 0:
        df['B'][x+1] = 0
    elif B < 0 and A < 0 and B < A:
        df['B'][x+1] = A
    else:
        df['B'][x + 1] = B

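Best answer

Using df['A'].expanding(1).apply(function) I can run my own function, which first gets only one row, then the first 2 rows, then the first 3 rows, and so on. It is not given the previously computed result, so it has to redo all the calculations again and again, but it doesn't need global variables or a hard-coded df['A'].

Documentation: Series.expanding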
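To make the expanding-window behaviour concrete, here is a small sketch on a made-up three-row Series (the data and the variable names are only for illustration). It shows that each call receives all rows from the first one up to the current one, and that the results come back as floats, which is why a cast with astype(int) can be useful afterwards:

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Each call receives the window from the first row up to the current row.
window_sizes = s.expanding(1).apply(lambda window: len(window))
window_last = s.expanding(1).apply(lambda window: window.iloc[-1])

print(window_sizes.tolist())  # [1.0, 2.0, 3.0]
print(window_last.tolist())   # [10.0, 20.0, 30.0]
```

Because every window starts again at the first row, a function that loops over its window does work proportional to the window length on each call.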

import pandas as pd

A = [0, -1, 2, 3, -2, -3, 1, -1, 1, -2, 1, 2, -1, -2]

df = pd.DataFrame({"A": A})

def function(values):
    #print(values)
    #print(type(values))
    #print(len(values))

    result = 0

    last_zero = 0

    for index, value in enumerate(values):
        result += value

        if result >= 0:
            result = 0
            last_zero = index
        else:
            # lowest value since the last zero, up to and including the current row
            minimal = min(values[last_zero:index + 1])
            #print(index, last_zero, minimal)

            #if result < minimal:
            #    result = minimal
            result = max(result, minimal)

    #print('result:', result)
    return result

df['B'] = df['A'].expanding(1).apply(function)

# expanding().apply() returns floats, so convert back to integers
df['B'] = df['B'].astype(int)

print(df)

Result:

    A  B
0 0 0
1 -1 -1
2 2 0
3 3 0
4 -2 -2
5 -3 -3
6 1 -2
7 -1 -3
8 1 -2
9 -2 -3
10 1 -2
11 2 0
12 -1 -1
13 -2 -2

The same, but using a plain apply(). It needs global variables and a hard-coded df['A'].

import pandas as pd

A = [0, -1, 2, 3, -2, -3, 1, -1, 1, -2, 1, 2, -1, -2]

df = pd.DataFrame({"A": A})

# state shared between calls of function()
result = 0
last_zero = 0
index = 0

def function(value):
    global result
    global last_zero
    global index

    result += value

    if result >= 0:
        result = 0
        last_zero = index
    else:
        # lowest value in A since the last zero, up to and including the current row
        # (slicing to the end of the column would also look at rows not processed yet)
        minimal = min(df['A'][last_zero:index + 1])
        #print(index, last_zero, minimal)

        #if result < minimal:
        #    result = minimal
        result = max(result, minimal)

    index += 1

    #print('result:', result)
    return result

df['B'] = df['A'].apply(function)
df['B'] = df['B'].astype(int)

print(df)

The same using a plain for-loop

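import pandas as pd

A = [0, -1, 2, 3, -2, -3, 1, -1, 1, -2, 1, 2, -1, -2]

df = pd.DataFrame({"A": A})

all_values = []

result = 0
last_zero = 0

# Series.iteritems() was removed in pandas 2.0; items() is the replacement
for index, value in df['A'].items():

    result += value

    if result >= 0:
        result = 0
        last_zero = index
    else:
        # lowest value in A since the last zero, up to and including the current row
        # (slicing to the end of the column would also look at rows not processed yet)
        minimal = min(df['A'][last_zero:index + 1])
        #print(index, last_zero, minimal)

        #if result < minimal:
        #    result = minimal
        result = max(result, minimal)

    all_values.append(result)

df['B'] = all_values

print(df)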
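The loop can also carry the running minimum along instead of re-scanning a slice of the column on every row. The following is only a sketch, assuming the rule is "cap the running sum at 0 from above, and from below at the lowest value of A seen since the sum was last reset to 0". The name capped_cumsum is hypothetical and not part of pandas, and this is still a Python loop rather than a vectorized solution:

```python
import pandas as pd


def capped_cumsum(values):
    """Running sum capped at 0 from above and, from below, at the
    lowest input seen since the sum was last reset to 0."""
    out = []
    total = 0
    lowest = 0  # lowest input seen since the last reset to 0
    for value in values:
        total += value
        if total >= 0:
            # positive (or zero) sums are replaced by 0 and the tracking restarts
            total = 0
            lowest = 0
        else:
            lowest = min(lowest, value)
            total = max(total, lowest)
        out.append(total)
    return out


df = pd.DataFrame({"A": [0, -1, 2, 3, -2, -3, 1, -1, 1, -2, 1, 2, -1, -2]})
df["B"] = capped_cumsum(df["A"].tolist())

print(df["B"].tolist())
# [0, -1, 0, 0, -2, -3, -2, -3, -2, -3, -2, 0, -1, -2]
```

Each row is visited once, so the work grows linearly with the number of rows, and no global variables or hard-coded column references are needed.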

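Regarding python - Cumsum a column while skipping rows or setting a fixed value on a condition based on the result of the actual cumsum, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/69876970/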
