
python - Fitting a lognormal distribution to already binned data

Reposted. Author: 太空宇宙. Updated: 2023-11-04 02:50:07

I would like to fit a lognormal distribution to data that is already binned. The bar plot looks like this: [image]

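Unfortunately, when I try to use the standard lognorm.pdf(), the shape of the fitted distribution is very different. I guess that is because my data is already binned. Here is the code: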

times, data, bin_points = ReadHistogramFile(filename)

xmin = 200
xmax = 800
x = np.linspace(xmin, xmax, 1000)
shape, loc, scale = stats.lognorm.fit(data, floc=0)
pdf = stats.lognorm.pdf(x, shape, loc=loc, scale=scale)

area=data.sum()
plt.bar(bars, data, width=10, color='b')
plt.plot(x*area, pdf, 'k' )

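This is what the fitted distribution looks like: [image] Obviously, something is also wrong with the scaling, but I am less concerned about that. My main problem is the shape of the distribution. This might be a duplicate of this question, but I could not find the correct solution there. I tried it and still got a shape very similar to the one produced by the code above. Thanks for any help!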

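Update: By using curve_fit() I was able to get something of a fit, but I am not satisfied yet. I would like to use the original bins rather than unity bins. I am also not sure what exactly is happening, or whether a better fit is possible. Here is the code: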

def normalize_integral(data, bin_size):
    """Scale the counts so that they integrate to 1 for the given bin size."""
    normalized_data = np.zeros(np.size(data))
    print(bin_size)
    sum = data.sum()
    integral = bin_size*sum
    # loop over every bin, including the last one
    for i in range(np.size(data)):
        normalized_data[i] = data[i]/integral

    print('integral:', normalized_data.sum()*bin_size)
    return normalized_data



def pdf(x, mu, sigma):
    """pdf of lognormal distribution"""

    return (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)))


bin_points=np.linspace(280.5, 1099.55994, len(bin_points))
data=[9.78200000e+03 1.15120000e+04 1.18000000e+04 1.79620000e+04 2.76980000e+04 2.78260000e+04 3.35460000e+04 3.24260000e+04 3.16500000e+04 3.30820000e+04 4.84560000e+04 5.86500000e+04 6.34220000e+04 5.11880000e+04 5.13180000e+04 4.74320000e+04 4.35420000e+04 4.13400000e+04 3.60880000e+04 2.96900000e+04 2.66640000e+04 2.58720000e+04 2.57560000e+04 2.20960000e+04 1.46880000e+04 9.97200000e+03 5.74200000e+03 3.52000000e+03 2.74600000e+03 2.61800000e+03 1.50000000e+03 7.96000000e+02 5.40000000e+02 2.98000000e+02 2.90000000e+02 2.22000000e+02 2.26000000e+02 1.88000000e+02 1.20000000e+02 5.00000000e+01 5.40000000e+01 5.80000000e+01 5.20000000e+01 2.00000000e+01 2.80000000e+01 6.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00]
normalized_data_unitybins = normalize_integral(data,1)


plt.figure(figsize=(9,4))
ax1=plt.subplot(121)
ax2=plt.subplot(122)
ax2.bar(unity_bins, normalized_data_unitybins, width=1, color='b')
fitParams, fitCov = curve_fit(pdf, unity_bins, normalized_data_unitybins, p0=[1,1],maxfev = 1000000)
fitData=pdf(unity_bins, *fitParams)
ax2.plot(unity_bins, fitData,'g-')

ax1.bar(bin_points, normalized_data_unitybins, width=10, color='b')
fitParams, fitCov = curve_fit(pdf, bin_points, normalized_data_unitybins, p0=[1,1],maxfev = 1000000)
fitData=pdf(bin_points, *fitParams)
ax1.plot(bin_points, fitData,'g-')


Best answer

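As you stated, you cannot use lognorm.fit on binned data. So all you need to do is restore the raw data from the histogram. Obviously this is not "lossless": the more bins, the better.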

Example code with some generated data:

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt


# generate some data
ln = stats.lognorm(0.4,scale=100)
data = ln.rvs(size=2000)

counts, bins, _ = plt.hist(data, bins=50)
# note that the len of bins is 51, since it contains upper and lower limit of every bin

# restore data from histogram: repeat each bin center as many times as its count
restored = [[d]*int(counts[n]) for n,d in enumerate((bins[1:]+bins[:-1])/2)]
# flatten the result
restored = [item for sublist in restored for item in sublist]

# fitted parameters as (shape, loc, scale); loc is fixed to 0
print(stats.lognorm.fit(restored, floc=0))

dist = stats.lognorm(*stats.lognorm.fit(restored, floc=0))
x = np.arange(1,400)
y = dist.pdf(x)

# the pdf is normalized, so we need to scale it to match the histogram
y = y/y.max()
y = y*counts.max()

plt.plot(x,y,'r',linewidth=2)
plt.show()
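
The rescaling by `y.max()` above only matches the peak heights. As a rough sketch of an alternative, assuming equal-width bins: a histogram of N samples with bin width w has an expected count of about N * w * pdf(x) per bin, so the fitted density can be placed on the count axis without reference to its maximum. The helper name `pdf_on_count_scale` is made up for this illustration, and the seeded random generator is only there to make the example repeatable.

```python
import numpy as np
import scipy.stats as stats


def pdf_on_count_scale(dist, x, counts, bins):
    """Scale a fitted pdf to histogram counts (hypothetical helper).

    Assumes equal-width bins: expected count per bin ~ N * width * pdf(x).
    """
    counts = np.asarray(counts, dtype=float)
    width = bins[1] - bins[0]
    return dist.pdf(x) * counts.sum() * width


# same kind of generated data as in the answer
rng = np.random.default_rng(0)
data = stats.lognorm(0.4, scale=100).rvs(size=2000, random_state=rng)
counts, bins = np.histogram(data, bins=50)

# restore approximate samples: each bin center repeated `count` times
centers = (bins[1:] + bins[:-1]) / 2
restored = np.repeat(centers, counts)
dist = stats.lognorm(*stats.lognorm.fit(restored, floc=0))

x = np.linspace(bins[0], bins[-1], 400)
y = pdf_on_count_scale(dist, x, counts, bins)
```

Plotting `y` against `x` over `plt.bar(centers, counts, width=bins[1] - bins[0])` should then line up with the bars in area, not only at the peak.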

[Figure: fitted histogram]
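
An alternative that avoids reconstructing samples, and that works directly with the original (possibly unequal) bin edges, is to maximize the likelihood of the counts themselves, where each bin's probability is the difference of the lognormal CDF at its two edges. This is not the method used in the answer above, only a sketch under stated assumptions: `fit_lognorm_binned` is a hypothetical helper, loc is fixed to 0, all bin edges are assumed positive, and the choice of Nelder-Mead as optimizer is arbitrary.

```python
import numpy as np
import scipy.stats as stats
from scipy.optimize import minimize


def fit_lognorm_binned(counts, edges):
    """Fit a lognormal (loc=0) to binned counts by maximum likelihood.

    Hypothetical helper: returns (sigma, scale), i.e. scipy's shape and
    scale parameters, with mu = log(scale).
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(edges, dtype=float)

    def neg_log_likelihood(params):
        log_sigma, mu = params  # optimize log(sigma) so sigma stays positive
        cdf = stats.lognorm.cdf(edges, np.exp(log_sigma), scale=np.exp(mu))
        p = np.clip(np.diff(cdf), 1e-300, None)  # probability mass per bin, no log(0)
        return -np.sum(counts * np.log(p))

    # starting values from count-weighted moments of log(bin centers)
    log_c = np.log((edges[1:] + edges[:-1]) / 2)
    mu0 = np.average(log_c, weights=counts)
    sigma0 = np.sqrt(np.average((log_c - mu0) ** 2, weights=counts))
    res = minimize(neg_log_likelihood, [np.log(sigma0), mu0], method="Nelder-Mead")
    return np.exp(res.x[0]), np.exp(res.x[1])


# generated data with known parameters (sigma=0.4, scale=100)
rng = np.random.default_rng(0)
data = stats.lognorm(0.4, scale=100).rvs(size=2000, random_state=rng)
counts, edges = np.histogram(data, bins=50)
sigma, scale = fit_lognorm_binned(counts, edges)
print(sigma, scale)
```

For the data in the question, `counts` would be the `data` array and `edges` the original bin boundaries. The `mu` in the question's `pdf(x, mu, sigma)` corresponds to `np.log(scale)` here.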

Regarding python - fitting a lognormal distribution to already binned data, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44137933/
