
pandas - How do I use scipy optimize curve fit with panda df

Reposted. Author: 行者123. Updated: 2023-12-05 05:11:46

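This is my first post here. I have spent hours looking for an answer, but I can't seem to figure it out. I use pandas to load a .csv into NumPy arrays. From there I try to apply a simple curve fit, but the output I get is always wrong. The code plots an incorrect fit and does not plot the data.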

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

df = pd.read_csv("Results.csv")
xdata = df['Frame'].to_numpy()  # as_matrix() was removed in pandas 1.0
ydata = df['Area'].to_numpy()

def func(x, a, b, c):
    return (a*np.sin(b*x))+(c * np.exp(x))
popt, pcov = curve_fit(func, xdata, ydata)

plt.plot(xdata, func(xdata, *popt), 'r-',
         label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
popt, pcov = curve_fit(func, xdata, ydata)

plt.plot(xdata, func(xdata, *popt), 'g--',
         label='fit: a=%5.3f, b=%5.3f, c=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()

This is what the data looks like: [image not preserved]

Thanks in advance for your help.

Best answer

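Your model contains "exp(x)" and the data file contains x values of 1000, which yields a math overflow regardless of the starting values: exp(1000) is far beyond what a double-precision float can hold. The optimizer cannot find a way around that, so you must change the equation used to fit this data set. I can suggest other equations, but this data set cannot be fit to the posted equation.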
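To make the overflow concrete, here is a small sketch that is not part of the original answer. It evaluates the exponential term at x = 1000 and again after a rescaling, where the divisor of 100 is used purely as an illustration. The variable names are hypothetical.

```python
import math
import warnings

import numpy as np

# The largest finite float64 is about 1.8e308, so exp(x) stops being
# representable once x exceeds ln(max), roughly 709.78.
max_exponent = math.log(np.finfo(np.float64).max)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence NumPy's overflow RuntimeWarning
    raw = np.exp(np.array([1000.0]))             # x as stored in the file
    scaled = np.exp(np.array([1000.0]) / 100.0)  # x after dividing by 100

print("largest usable exponent:", max_exponent)
print("exp(1000)      :", raw[0])     # inf
print("exp(1000 / 100):", scaled[0])  # finite
```

Once any model value is `inf`, the residuals are no longer finite and a least-squares routine has nothing meaningful to minimize, which is consistent with the behavior described in the question.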

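EDIT: Following your comment about dividing by 100, here is code that uses scipy's differential_evolution genetic algorithm module to find initial parameter estimates. That module uses the Latin Hypercube algorithm to ensure a thorough search of parameter space. The algorithm requires bounds within which to search, and ranges for the parameters are much easier to come up with than exact initial parameter values. Here I tried a few ranges and got what looks like a reasonable fit from what I can see.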
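As a rough illustration of what Latin Hypercube sampling does with search bounds, the sketch below draws a few candidate parameter sets using `scipy.stats.qmc` (available in SciPy 1.7 and later). This is not the internal code path of `differential_evolution`. It only demonstrates the stratification idea, and the bounds are copied from the answer's code purely for illustration.

```python
import numpy as np
from scipy.stats import qmc

# Search bounds for (a, b, c), mirroring the ranges used in the answer
lower = [0.0, 0.0, 0.0]
upper = [100.0, 1.0, 1.0]

n_points = 5
sampler = qmc.LatinHypercube(d=3)
unit_sample = sampler.random(n=n_points)           # points in the unit cube
candidates = qmc.scale(unit_sample, lower, upper)  # rescaled to the bounds

# For each parameter, split its range into n_points equal strata and
# record which stratum each sampled point falls into.
strata = np.floor(unit_sample * n_points).astype(int)

print(candidates)
print(np.sort(strata, axis=0))  # every stratum appears once per column
```

Each parameter's range is covered evenly with only a handful of points, which is why ranges are enough to get a search started without exact initial values.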

[image: plot]

import pandas as pd
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings


df = pd.read_csv("Results.csv")
xData = df['Frame'].to_numpy() / 100.0  # as_matrix() was removed in pandas 1.0
yData = df['Area'].to_numpy()

def func(x, a, b, c):
    return (a*numpy.sin(b*x))+(c * numpy.exp(x))


# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)


def generate_Initial_Parameters():

    parameterBounds = []
    parameterBounds.append([0.0, 100.0]) # search bounds for a
    parameterBounds.append([0.0, 1.0]) # search bounds for b
    parameterBounds.append([0.0, 1.0]) # search bounds for c

    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# by default, differential_evolution finishes by polishing its best result
# with a local minimizer (L-BFGS-B), staying within the parameter bounds
geneticParameters = generate_Initial_Parameters()

# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are outside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()

modelPredictions = func(xData, *fittedParameters)

absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))

print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
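Since Results.csv is not included here, the following sketch runs the same two-stage idea (a bounded `differential_evolution` search followed by an unbounded `curve_fit`) on synthetic data. The "true" parameters, the noise level and the x range are assumptions made up for illustration and say nothing about the asker's actual data.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution


def func(x, a, b, c):
    return a * np.sin(b * x) + c * np.exp(x)


# Synthetic stand-in for Results.csv (hypothetical values, not the real data)
rng = np.random.default_rng(0)
true_params = (50.0, 0.8, 0.01)
x_syn = np.linspace(0.0, 10.0, 200)  # same scale as Frame / 100
y_syn = func(x_syn, *true_params) + rng.normal(scale=1.0, size=x_syn.size)


def sum_of_squared_error(params):
    return np.sum((y_syn - func(x_syn, *params)) ** 2)


# Stage 1: global search inside the bounds to get starting values
bounds = [(0.0, 100.0), (0.0, 1.0), (0.0, 1.0)]
de_result = differential_evolution(sum_of_squared_error, bounds, seed=3)

# Stage 2: unbounded least-squares refinement starting from the DE estimate
fitted, pcov = curve_fit(func, x_syn, y_syn, p0=de_result.x)

rmse = np.sqrt(np.mean((func(x_syn, *fitted) - y_syn) ** 2))
print("DE estimate :", de_result.x)
print("curve_fit   :", fitted)
print("RMSE        :", rmse)
```

Swapping `x_syn` and `y_syn` for the arrays read from the CSV should give the workflow of the answer above, provided the bounds are adjusted to the real data.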

Regarding "pandas - How do I use scipy optimize curve fit with panda df", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/55212002/
