python - How to recalculate values row by row in a pandas dataframe when multiple groupby() and shift() calls are needed?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:56:49

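I have a multi-index dataframe. One of its columns, Shares, should be calculated row by row from the value of the Equity column at the previous index.

I tried defining a function that I could apply() to the dataframe row by row, but I realized that I can use neither groupby() nor shift() with this approach.

I created the dataframe: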

import pandas as pd
import numpy as np

date_index = pd.date_range(start='1/1/2019', end='1/3/2019')  # 3 dates x 4 symbols = 12 rows, matching price_list
symbol_index = ['AAPL','BOA','GE','MSFT']

idx = pd.MultiIndex.from_product([date_index, symbol_index], names=['Date', 'Symbol'])
col = ['Price', 'Shares', 'Profit','Total_Profit', 'Equity']

data = pd.DataFrame(index=idx,columns=col)

price_list = [46, 17, 56, 66, 54, 79, 33, 63, 60, 63, 39, 26]
data['Price'] = price_list

My initial dataframe looks like this:

                   Price  Shares  Profit  Total_Profit   Equity
Date       Symbol
2019-01-01 AAPL       46     NaN     NaN           NaN      NaN
           BOA        17     NaN     NaN           NaN      NaN
           GE         56     NaN     NaN           NaN      NaN
           MSFT       66     NaN     NaN           NaN      NaN
2019-01-02 AAPL       54     NaN     NaN           NaN      NaN
           BOA        79     NaN     NaN           NaN      NaN
           GE         33     NaN     NaN           NaN      NaN
           MSFT       63     NaN     NaN           NaN      NaN
2019-01-03 AAPL       60     NaN     NaN           NaN      NaN
           BOA        63     NaN     NaN           NaN      NaN
           GE         39     NaN     NaN           NaN      NaN
           MSFT       26     NaN     NaN           NaN      NaN

I need these variables:

starting_capital = 5000
risk_per_position = 0.1

I defined the columns:

data['Shares'] = data.groupby('Symbol')['Equity'].shift(1).fillna(starting_capital) * risk_per_position / data['Price']
data['Shares'] = round(data['Shares'],0)
data['Profit'] = data['Shares'] * data['Price']
data['Total_Profit'] = data.groupby(by=['Date','Symbol'])['Profit'].sum().groupby('Date').cumsum().groupby('Date').tail(1).cumsum()
data['Total_Profit'] = data['Total_Profit'].bfill()
data['Equity'] = starting_capital + data['Total_Profit']
data['previous equity'] = data.groupby('Symbol')['Equity'].shift(1).fillna(starting_capital)

Shares at date_index (and therefore Profit, Total_Profit, and Equity as well) should be calculated from the Equity value at previous_date_index. Right now, however, it is always calculated from starting_capital, and the output is:

                   Price  Shares  Profit  Total_Profit   Equity
Date       Symbol
2019-01-01 AAPL       46    11.0   506.0        2031.0   7031.0
           BOA        17    29.0   493.0        2031.0   7031.0
           GE         56     9.0   504.0        2031.0   7031.0
           MSFT       66     8.0   528.0        2031.0   7031.0
2019-01-02 AAPL       54     9.0   486.0        3990.0   8990.0
           BOA        79     6.0   474.0        3990.0   8990.0
           GE         33    15.0   495.0        3990.0   8990.0
           MSFT       63     8.0   504.0        3990.0   8990.0
2019-01-03 AAPL       60     8.0   480.0        5975.0  10975.0
           BOA        63     8.0   504.0        5975.0  10975.0
           GE         39    13.0   507.0        5975.0  10975.0
           MSFT       26    19.0   494.0        5975.0  10975.0
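
A likely reason every row falls back to starting_capital: Equity is still all NaN at the moment Shares is computed, so the shifted column contains only NaN and fillna() replaces every entry. The sketch below illustrates this on a small frame; the name `demo` and its two-date, one-symbol layout are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

starting_capital = 5000

# Hypothetical two-date, one-symbol frame whose Equity is not yet computed.
idx = pd.MultiIndex.from_product(
    [pd.date_range('2019-01-01', periods=2), ['AAPL']],
    names=['Date', 'Symbol'],
)
demo = pd.DataFrame({'Equity': [np.nan, np.nan]}, index=idx)

# Shifting an all-NaN column yields only NaN, so fillna() puts
# starting_capital into every row, not just the first date.
previous_equity = demo.groupby('Symbol')['Equity'].shift(1).fillna(starting_capital)
print(previous_equity.tolist())
```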

The output should be:

                   Price  Shares  Profit  Total_Profit   Equity
Date       Symbol
2019-01-01 AAPL       46    11.0   506.0        2031.0   7031.0
           BOA        17    29.0   493.0        2031.0   7031.0
           GE         56     9.0   504.0        2031.0   7031.0
           MSFT       66     8.0   528.0        2031.0   7031.0
2019-01-02 AAPL       54    13.0   702.0        4830.0   9830.0
           BOA        79     9.0   711.0        4830.0   9830.0
           GE         33    21.0   693.0        4830.0   9830.0
           MSFT       63    11.0   693.0        4830.0   9830.0
2019-01-03 AAPL       60    16.0   960.0        8761.0  13761.0
           BOA        63    16.0  1008.0        8761.0  13761.0
           GE         39    25.0   975.0        8761.0  13761.0
           MSFT       26    38.0   988.0        8761.0  13761.0

Thank you very much for your help. What is the correct formula for the Shares column in this case?

Best answer

data['Shares'] = (data['Equity'].shift(-1).groupby('Symbol').fillna(starting_capital)
                  * risk_per_position / data['Price'])

Try shifting the 'Equity' column by -1 and then doing the groupby.
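
Since each date's Equity depends on the Shares computed from the previous date's Equity, a single vectorized pass may not be enough to break that circular dependency. As an alternative, here is a minimal sketch that walks through the dates in order and carries the equity forward. It is not the answer's method; the loop and the helper names `previous_equity`, `total_profit`, and `rows` are assumptions introduced here, and the three-date range is chosen to match the 12 prices in the question.

```python
import pandas as pd

starting_capital = 5000
risk_per_position = 0.1

date_index = pd.date_range(start='1/1/2019', end='1/3/2019')
symbol_index = ['AAPL', 'BOA', 'GE', 'MSFT']
idx = pd.MultiIndex.from_product([date_index, symbol_index], names=['Date', 'Symbol'])
data = pd.DataFrame(
    {'Price': [46, 17, 56, 66, 54, 79, 33, 63, 60, 63, 39, 26]},
    index=idx,
)
for col in ['Shares', 'Profit', 'Total_Profit', 'Equity']:
    data[col] = float('nan')

previous_equity = starting_capital  # equity available before the first date
total_profit = 0.0                  # running sum of Profit over all dates so far

for date in date_index:
    # Boolean mask selecting the rows of all symbols on this date.
    rows = data.index.get_level_values('Date') == date
    price = data.loc[rows, 'Price']

    # Size each position from the equity known at the end of the previous date.
    shares = (previous_equity * risk_per_position / price).round(0)
    profit = shares * price
    total_profit += profit.sum()

    data.loc[rows, 'Shares'] = shares
    data.loc[rows, 'Profit'] = profit
    data.loc[rows, 'Total_Profit'] = total_profit
    data.loc[rows, 'Equity'] = starting_capital + total_profit

    # Carry this date's equity forward to the next iteration.
    previous_equity = starting_capital + total_profit

print(data)
```

With the question's prices, this loop yields the Shares, Profit, Total_Profit, and Equity values listed in the expected output. The cost is one Python-level iteration per date rather than per row, which is usually acceptable for daily data.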

Regarding python - How to recalculate values row by row in a pandas dataframe when multiple groupby() and shift() calls are needed?, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/55969499/
