Linear regression value errors with SciPy

Reposted by: bug小助手 | Updated: 2023-10-25 11:49:13

I have two data arrays: xdata of shape (40,) and ydata of shape (40, 721, 1440) (time, lat, lon). My final goal is to compute the regression slope between these two datasets and obtain the error of the slope at each grid point so that I can show the error distribution. Here x holds average sea surface temperatures for 40 years and y is an atmospheric variable. I have already done this with one method: I calculated the covariance and took the square root of that array, which, if I understand correctly, leaves me with the standard error. To verify this approach, I found the scipy.stats.linregress function and wanted to use it because it returns the standard error along with the slope. However, I am running into errors when coding this.
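
For a single grid point, the square-root-of-covariance idea can be checked against linregress directly. The sketch below is an illustration on synthetic data, not the asker's actual method (the post does not show it). It assumes the covariance in question is the parameter covariance matrix of an ordinary least-squares fit, whose diagonal holds the variances of the slope and intercept. All variable names and the synthetic series are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_years = 40
x = rng.normal(size=n_years)                          # synthetic stand-in for the SST series
y = 2.0 * x + rng.normal(scale=0.5, size=n_years)     # synthetic stand-in for one grid point

# Ordinary least squares with design matrix [x, 1].
A = np.column_stack([x, np.ones(n_years)])
coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

# Residual variance with n - 2 degrees of freedom (two fitted parameters).
rss = np.sum((y - A @ coef) ** 2)
resid_var = rss / (n_years - 2)

# Covariance matrix of (slope, intercept); its diagonal holds the variances.
param_cov = resid_var * np.linalg.inv(A.T @ A)
slope_se = np.sqrt(param_cov[0, 0])

res = stats.linregress(x, y)
print("sqrt of parameter covariance:", slope_se)
print("linregress stderr:           ", res.stderr)
```

Under that assumption the two numbers agree. The square root of the plain covariance between x and y is a different quantity and is not, in general, a standard error.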

To get my arrays the same size:

Y = ydata.stack(allpoints = ['lat','lon'])
X = xdata.values[:, None] * np.ones(Y.shape)

Here I have stacked ydata into a 2D array rather than a 3D one, leaving me with dimensions (40, 1038240). I then create a temporary array from xdata by giving it an extra dimension and multiplying by an array of ones, so that the two arrays have the same shape. After this I pass both arrays to the function:

test = stats.linregress(X,Y)

And I am left with a value error saying:

ValueError: too many values to unpack (expected 4)
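
linregress is documented for two sets of measurements of the same length, that is, one 1-D x and one 1-D y per call, so passing two (40, 1038240) arrays is outside what it supports. That is the likely source of the unpacking error. One alternative to calling it once per grid point is to apply the closed-form least-squares formulas to all columns at once. The following is a minimal sketch under that assumption: the function name gridwise_linregress is hypothetical, and the small random grid stands in for the real data.

```python
import numpy as np

def gridwise_linregress(x, Y):
    """Hypothetical helper: regress every column of Y (n_time, n_points)
    on the 1-D predictor x (n_time,).

    Returns slope, intercept and slope standard error, each of shape (n_points,).
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = x.shape[0]

    xm = x - x.mean()                 # centred predictor
    Ym = Y - Y.mean(axis=0)           # each column centred on its own mean
    sxx = np.sum(xm ** 2)

    slope = xm @ Ym / sxx
    intercept = Y.mean(axis=0) - slope * x.mean()

    # Residuals of the fitted lines, then the usual n - 2 degrees of freedom.
    resid = Ym - xm[:, None] * slope
    rss = np.sum(resid ** 2, axis=0)
    stderr = np.sqrt(rss / (n - 2) / sxx)
    return slope, intercept, stderr

# Small synthetic stand-in for the (40, 721, 1440) field.
rng = np.random.default_rng(0)
n_time, ny, nx = 40, 6, 8
x = rng.normal(size=n_time)
field = 0.5 * x[:, None, None] + rng.normal(size=(n_time, ny, nx))

Y = field.reshape(n_time, ny * nx)            # same idea as stacking lat/lon
slope, intercept, stderr = gridwise_linregress(x, Y)
stderr_map = stderr.reshape(ny, nx)           # back to (lat, lon)
```

There is no need to tile x to the shape of Y here: the 1-D predictor is shared by every column. On the full grid the intermediate arrays are as large as Y itself, so processing the columns in chunks may be necessary.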

UPDATE:

I was able to get the function I was interested in (scipy.stats.linregress) working by looping over the grid points:

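import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

ny = 721
nx = 1440
n = nx * ny

# cape_ds is not defined in the post; presumably the (40, 721, 1440) DataArray (ydata).
# Flatten (time, lat, lon) to (time, points) so each column is one grid point's series.
cape_2d = cape_ds.values.reshape(40, n)
reg_results = np.empty((5, n))

# b is not defined in the post; presumably the 1-D predictor of length 40 (xdata).
for i in range(n):
    # linregress returns (slope, intercept, rvalue, pvalue, stderr).
    reg_results[:, i] = stats.linregress(b, cape_2d[:, i])

slope, intercept, r_val, p_val, std_err = reg_results.reshape((5, ny, nx))
err_shape = std_err.reshape(ny * nx)

plt.plot(err_shape)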

At this point I think I may have the correct answer, but I'm worried that my standard errors are too high. I'm not sure what the plot should look like; I was just hoping to see a normal distribution.
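
Standard errors are non-negative by construction, so their values across grid points cannot follow a normal distribution centred on zero, and a skewed shape is not by itself a sign of a mistake. One way to judge whether they are "too high" is to look at them relative to the slope. The sketch below reproduces the per-point loop on a tiny synthetic grid (all sizes, names and data are assumptions) and summarises the standard errors with a histogram and with the ratio slope / std_err.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_time, ny, nx = 40, 4, 5                      # tiny stand-in for (40, 721, 1440)
x = rng.normal(size=n_time)                    # synthetic predictor
field = 0.8 * x[:, None, None] + rng.normal(size=(n_time, ny, nx))  # synthetic response

field_2d = field.reshape(n_time, ny * nx)
results = np.empty((5, ny * nx))
for i in range(ny * nx):
    res = stats.linregress(x, field_2d[:, i])
    results[:, i] = (res.slope, res.intercept, res.rvalue, res.pvalue, res.stderr)

slope, intercept, r_val, p_val, std_err = results.reshape(5, ny, nx)

# Distribution of the standard errors themselves (binned counts rather than a line plot).
counts, edges = np.histogram(std_err.ravel(), bins=10)

# Slope measured in units of its own standard error.
t_stat = slope / std_err

print("std_err percentiles (5, 50, 95):", np.percentile(std_err, [5, 50, 95]))
print("fraction of points with |slope / std_err| > 2:", np.mean(np.abs(t_stat) > 2))
```

With matplotlib, plt.hist(std_err.ravel(), bins=...) would draw the same histogram; plt.plot on the flattened array instead shows the values in grid order, which is harder to read as a distribution.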

Comments

Won't this regression produce the same value for every Y point at the same time step? It seems like you could simplify the problem by taking the mean of Y across all latitudes and longitudes, giving you a Y array of shape (40,).
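
What averaging over the grid does to the result can be checked on synthetic data. Because the least-squares slope is linear in y, the slope fitted to the unweighted spatial-mean series equals the mean of the per-point slopes, while the per-point slopes themselves can still differ. The averaged regression therefore yields one number rather than a map. A small sketch, with all names and data assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n_time, n_points = 40, 12
x = rng.normal(size=n_time)                          # shared 1-D predictor
true_slopes = np.linspace(-1.0, 2.0, n_points)       # each point responds differently
Y = true_slopes * x[:, None] + 0.3 * rng.normal(size=(n_time, n_points))

# np.polyfit accepts a 2-D y and fits each column separately.
per_point_slope = np.polyfit(x, Y, 1)[0]             # shape (n_points,)

# Regression on the unweighted mean over all points.
mean_series_slope = np.polyfit(x, Y.mean(axis=1), 1)[0]

print("spread of per-point slopes:", per_point_slope.std())
print("mean of per-point slopes:  ", per_point_slope.mean())
print("slope of the mean series:  ", mean_series_slope)
```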

I'm not sure, but I don't think so. I have successfully completed the regression part of this problem using another method, and I am now just trying to see my error distribution. My initial method did not calculate the error directly, so I took sqrt(cov) instead, and to check that result I wanted to use this function for the standard error.

If you have an already-working way of doing this regression, you could run your linear regression, get Y_pred, and calculate Y - Y_pred. You could then flatten that and plot it as a histogram. That would show you the magnitude of the residuals, as well as their distribution.

I would get Y_pred from y_pred = slope*X + intercept, right? And then just subtract it from the Y data array I passed into the regression function? I have done this, but the histogram does not look like what I expected.
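
The residual calculation described in this exchange can be sketched as follows, using NumPy broadcasting so that each grid point's own slope and intercept are applied to the shared predictor. The data are synthetic and all names are assumptions. Each point is given its own noise level to illustrate one reason a pooled histogram need not look normal: mixing residuals with different variances gives heavier tails, so the sketch also scales each column by its own residual standard deviation.

```python
import numpy as np

rng = np.random.default_rng(2)
n_time, n_points = 40, 30
x = rng.normal(size=n_time)                            # synthetic predictor, shape (40,)
noise_scale = rng.uniform(0.2, 2.0, size=n_points)     # a different noise level per point
Y = 1.5 * x[:, None] + rng.normal(size=(n_time, n_points)) * noise_scale

# Per-column fit; polyfit returns the highest power first.
slope, intercept = np.polyfit(x, Y, 1)                 # each has shape (n_points,)

# Broadcast (n_points,) coefficients against the (n_time, 1) predictor.
y_pred = slope * x[:, None] + intercept                # shape (n_time, n_points)
resid = Y - y_pred

# Scale each column by its residual standard deviation (n - 2 degrees of freedom)
# so that points with different noise levels are comparable when pooled.
resid_scaled = resid / resid.std(axis=0, ddof=2)

raw_counts, raw_edges = np.histogram(resid.ravel(), bins=20)
scaled_counts, scaled_edges = np.histogram(resid_scaled.ravel(), bins=20)
```

If slope and intercept are stored as (lat, lon) maps, the same broadcasting works with x[:, None, None] against the 3-D field.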

Most linear regression packages give some .fit() or .predict() method for getting predictions from X values. Are you using a linear regression package, or did you write your own code for fitting it?
