
Python: Looping over one variable at a time for a simple OLS

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:15:29

I would like to build a function in Python that fits a simple OLS regression based on the following equation:

 Y_i - Y_{i-1} = A + B(X_i - X_{i-1}) + E

In other words, Y_Lag = alpha + beta(X_Lag) + Error term

At the moment I have the following dataset (this is a shortened version).

Note: Y = Historic_Rate

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)), columns=['Historic_Rate', 'Overnight', '1M', '3M', '6M'])

So what I want to build takes each X variable in turn and puts it into a simple linear regression. The code I have written so far looks like this:

#Start the iteration process for the regression to in turn fit 1 parameter

#Import required packages

import pandas as pd
import numpy as np
import statsmodels.formula.api as sm

#Import dataset

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)), columns=['Historic_Rate', 'Overnight', '1M', '3M', '6M'])
#Y_Lag is always 1 time period only

df['Y_Lag'] = df['Historic_Rate'].shift(1)

#Begin the process with 1 lag, taking one x variable in turn

array = df[0:0]
array.drop(array.columns[[0,5]], axis=1, inplace=True)
for X in array:
    df['X_Lag'] = df['X'].shift(1)
    Model = df[df.columns[4:5]]
    Y = Model['Y_Lag']
    X = Model['X_Lag']

    Reg_model = sm.OLS(Y,X).fit()
    predictions = model.predict(X)
    # make the predictions by the model

    # Print out the statistics
    model.summary()

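So, in essence, I would like to create a list of column headers and work through it systematically in my loop: each variable is lagged in turn and then regressed against the lagged Y variable.

I would also like to know how to output model.X, where X is the X-th iteration of the array, so that the fitted models are named dynamically.

Best answer

You're close. I think you just mixed up your variable X with the string 'X' inside your loop. I also think you aren't computing Y_i - Y_{i-1}, but simply regressing Y_{i-1} on X_{i-1}.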
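Both points can be seen on a tiny frame. The following is only a sketch with made-up values (not the question's data): it indexes the frame with the loop variable rather than the literal string 'X', and contrasts `shift(1)`, which gives the previous level, with `diff()`, which gives the period-over-period change that the equation calls for.

```python
import pandas as pd

# Illustrative values only; not the question's data.
df = pd.DataFrame({
    "Historic_Rate": [3, 5, 4, 8],
    "Overnight": [1, 2, 2, 5],
    "1M": [2, 2, 3, 7],
})

for col in ["Overnight", "1M"]:
    # df['X'] would look for a column literally named 'X' and raise KeyError;
    # df[col] uses the loop variable's current value instead.
    lagged = df[col].shift(1)        # X_{i-1}: the previous period's level
    differenced = df[col].diff()     # X_i - X_{i-1}: the change between periods
    print(col, lagged.tolist(), differenced.tolist())
```

The first entry of both series is NaN because there is no earlier period to look back to.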

Here is how to loop over the regressions. We will also use a dictionary to store the regression results, keyed by column name.

import pandas as pd
import numpy as np
import statsmodels.api as sm

df = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                  columns=['Historic_Rate', 'Overnight', '1M', '3M', '6M'])

fit_d = {}  # This will hold all of the fit results and summaries
for col in [x for x in df.columns if x != 'Historic_Rate']:
    Y = df['Historic_Rate'] - df['Historic_Rate'].shift(1)
    # Need to remove the NaN for fit
    Y = Y[Y.notnull()]

    X = df[col] - df[col].shift(1)
    X = X[X.notnull()]

    X = sm.add_constant(X)  # Add a constant to the fit

    fit_d[col] = sm.OLS(Y,X).fit()

Now, if you want to make some predictions, say with your last model, you can do:

fit_d['6M'].predict(sm.add_constant(df['6M']-df['6M'].shift(1)))
#0 NaN
#1 0.5
#2 -2.0
#3 -1.0
#4 -0.5
#dtype: float64

You can get the summary with fit_d['6M'].summary():

                            OLS Regression Results                            
==============================================================================
Dep. Variable:          Historic_Rate   R-squared:                       0.101
Model:                            OLS   Adj. R-squared:                 -0.348
Method:                 Least Squares   F-statistic:                    0.2254
Date:                Thu, 27 Sep 2018   Prob (F-statistic):              0.682
Time:                        11:27:33   Log-Likelihood:                -9.6826
No. Observations:                   4   AIC:                             23.37
Df Residuals:                       2   BIC:                             22.14
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.4332      1.931     -0.224      0.843      -8.740       7.873
6M            -0.2674      0.563     -0.475      0.682      -2.691       2.156
==============================================================================
Omnibus:                          nan   Durbin-Watson:                   2.301
Prob(Omnibus):                    nan   Jarque-Bera (JB):                0.254
Skew:                          -0.099   Prob(JB):                        0.881
Kurtosis:                       1.781   Cond. No.                         3.44
==============================================================================

This question, "Python: Looping over one variable at a time for a simple OLS", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/52539749/
