
pandas - ValueError when using curve_fit(): operands could not be broadcast together with shapes (38563, 54) (38563,)

Reposted. Author: 行者123. Last updated: 2023-11-30 09:42:27

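Note: this is not a question about multiplication, and please ignore some of the import statements. Here are the details: I am using curve_fit() to fit a periodic pandas dataset. Code: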

import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np
import datetime as dt
from sklearn.linear_model import LinearRegression
from sklearn import linear_model
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from sklearn import metrics
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import PolynomialFeatures
from scipy.optimize import leastsq
#import matplotlib.pyplot as plt
import pylab as plt
from scipy.optimize import curve_fit

df = pd.read_csv("Metro_Interstate_Traffic_Volume.csv")
df['holiday'].replace(to_replace = 'None', value = '0', inplace=True)
df.loc[df['holiday'] != '0', 'holiday'] = 1
print(df.shape)

df['date_time'] = pd.to_datetime(df['date_time'], format='%m/%d/%Y %H:%M')
df['date_time'] = (df['date_time']- dt.datetime(1970,1,1)).dt.total_seconds()

#print(df['date_time'].head())

non_dummy_cols = ['holiday','temp','rain_1h', 'snow_1h', 'clouds_all','date_time', 'traffic_volume']

dummy_cols = list(set(df.columns) - set(non_dummy_cols))
df = pd.get_dummies(df, columns=dummy_cols)
print(df.shape)

x = df[df.columns.values]
x = x.drop(['traffic_volume'], axis=1)
x = x.drop(['clouds_all'], axis = 1)
y = df['traffic_volume']
print(x.shape)
print(y.shape)

#plt.figure(figsize=(6,4))
#plt.scatter(df.date_time[0:100], df.traffic_volume[0:100], color = 'blue')
#plt.xlabel("Date Time")
#plt.ylabel("Traffic volume")
#plt.show()

x = StandardScaler().fit_transform(x)

x_train, x_test, y_train, y_test = train_test_split(x,y, test_size = 0.2, random_state= 4)

def my_sin(x, freq, amplitude, phase, offset):
    return np.sin(x * freq + phase) * amplitude + offset

#x_train = np.array(x_train)
#y_train = np.array(y_train)

print(x_train)

popt, pcov = curve_fit(my_sin, x_train, y_train)
y_hat = my_sin(x_test, *popt)

Error:

ValueError: operands could not be broadcast together with shapes (38563,54) (38563,) 

Download link for the dataset.

[Image: the dataset before any programmatic changes]

So how can I get past this error? Is it impossible to use curve_fit with an m*n x_train?

I also tried reshaping y_train to m*1, that is, to a nested form like [[2], [2], ...], but that did not work either. Please help me solve this.

Best answer

The full error message tells the story in the lines above the last one:

Traceback (most recent call last):
  File "temp.py", line 50, in <module>
    popt, pcov = curve_fit(my_sin, x_train, y_train)
  File "/usr/lib/python3/dist-packages/scipy/optimize/minpack.py", line 736, in curve_fit
    res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
  File "/usr/lib/python3/dist-packages/scipy/optimize/minpack.py", line 377, in leastsq
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
  File "/usr/lib/python3/dist-packages/scipy/optimize/minpack.py", line 26, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
  File "/usr/lib/python3/dist-packages/scipy/optimize/minpack.py", line 454, in func_wrapped
    return func(xdata, *params) - ydata
ValueError: operands could not be broadcast together with shapes (38563,54) (38563,)
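
The last frame of the traceback, `func(xdata, *params) - ydata`, is an ordinary NumPy subtraction, so the failure can be reproduced without curve_fit at all. The sketch below uses made-up shapes (5, 3) and (5,) as small stand-ins for (38563, 54) and (38563,); the parameter values passed to my_sin are arbitrary.

```python
import numpy as np

# Small stand-ins for x_train (rows, features) and y_train (rows,).
# The shapes (5, 3) and (5,) are illustrative, not taken from the dataset.
xdata = np.zeros((5, 3))
ydata = np.zeros(5)

def my_sin(x, freq, amplitude, phase, offset):
    return np.sin(x * freq + phase) * amplitude + offset

# The model is applied element-wise, so the output keeps the 2-D input shape.
model_out = my_sin(xdata, 1.0, 1.0, 0.0, 0.0)
print(model_out.shape)  # (5, 3)

# This mirrors the subtraction curve_fit performs internally.
try:
    model_out - ydata
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

NumPy aligns shapes from the trailing axis, so it compares 3 with 5 here (54 with 38563 in the question) and gives up.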

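curve_fit() passes my_sin() data of shape (38563, 54), which is the value of x_train.shape, and the function returns data of that same shape. curve_fit needs the model function to return data with the same shape as y_train so that it can subtract the two and compute the error. Because the function's return value does not have the same shape as y_train, the subtraction raises the exception.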
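
For contrast, here is a minimal sketch of a call where the shapes do line up: the model receives a 1-D x and returns a 1-D array of the same length as y. The data is synthetic, and the names x_1d, y_1d and the starting guess p0 are assumptions for illustration. In the question's code, the equivalent would be passing a single column of x_train rather than all 54.

```python
import numpy as np
from scipy.optimize import curve_fit

def my_sin(x, freq, amplitude, phase, offset):
    return np.sin(x * freq + phase) * amplitude + offset

# Synthetic 1-D data standing in for a single feature column (assumption).
rng = np.random.default_rng(0)
x_1d = np.linspace(0.0, 10.0, 200)
y_1d = my_sin(x_1d, 1.5, 3.0, 0.5, 2.0) + rng.normal(scale=0.1, size=x_1d.size)

# Sinusoid fits are sensitive to the starting point, so a rough guess is supplied.
p0 = [1.5, 2.5, 0.4, 1.5]
popt, pcov = curve_fit(my_sin, x_1d, y_1d, p0=p0)

# The model output now has the same shape as y, so the residual is well defined.
y_hat = my_sin(x_1d, *popt)
print(y_hat.shape, y_1d.shape)  # (200,) (200,)
```

Whether a single sinusoid is a sensible model for the traffic data is a separate question; the sketch only shows the shape contract that curve_fit relies on.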

I suspect you should be using linear regression from sklearn rather than the curve_fit routine.
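
As a rough sketch of that suggestion, LinearRegression accepts a 2-D feature matrix directly, so an (m, n) x_train needs no reshaping. The data below is synthetic: the array X, the coefficients in true_coef and the noise level are assumptions for illustration, not values from the traffic dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic (m, n) feature matrix and a target that is linear in the features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
true_coef = np.array([2.0, -1.0, 0.5, 0.0])  # hypothetical coefficients
y = X @ true_coef + 3.0 + rng.normal(scale=0.1, size=500)

x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4
)

# fit() takes the 2-D x_train and the 1-D y_train as they are.
model = LinearRegression().fit(x_train, y_train)
y_hat = model.predict(x_test)

# R^2 on the held-out split.
r2 = model.score(x_test, y_test)
print(y_hat.shape, y_test.shape)
```

A linear model will not capture periodic structure on its own, so this addresses the shape problem rather than guaranteeing a good fit to traffic volume.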

Regarding "pandas - ValueError when using curve_fit(): operands could not be broadcast together with shapes (38563, 54) (38563,)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57017268/
