
python - Rendering the y-axis correctly when overlaying a Pandas KDE and a histogram

Reposted. Author: 行者123. Updated: 2023-12-04 07:44:08

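Similar questions have been asked before, but none that use both of these plotting functions together, so here we are:
I have a column from a Pandas DataFrame for which I am plotting a histogram and a KDE. However, when I plot them, the y-axis uses the raw data value range instead of the discrete number of samples per bin (which is what I want). How can I fix this? The plot itself is perfect, but the y-axis is wrong.
Data: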

t2 = [140547476703.0, 113395471484.0, 158360225172.0, 105497674121.0, 186457736557.0, 153705359063.0, 36826568371.0, 200653068740.0, 190761317478.0, 126529980843.0, 98776029557.0, 132773701862.0, 14780432449.0, 167507656251.0, 121353262386.0, 136377019007.0, 134190768743.0, 218619462126.0, 07912778721.0, 215628911255.0, 147024833865.0, 94136343562.0, 135685803096.0, 165901502129.0, 45476074790.0, 125195690010.0, 113910844263.0, 123134290987.0, 112028565305.0, 93448218430.0, 07341012378.0, 93146854494.0, 132958913610.0, 102326700019.0, 196826471714.0, 122045354980.0, 76591131961.0, 134694468251.0, 120212625727.0, 108456858852.0, 106363042112.0, 193367024628.0, 39578667378.0, 178075400604.0, 155513974664.0, 132834624567.0, 137336282646.0, 125379267464.0]
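The data is shown as a plain list, but the code in this post indexes `t2` as a DataFrame (`t2[t2.columns[0]]`). A minimal sketch of wrapping the values in a single-column DataFrame, assuming a hypothetical column name `"value"` (the original column name is not given) and using only the first few values for brevity:

```python
import pandas as pd

# First few values from the list above. The column name "value" is a
# hypothetical placeholder; the original post does not state the real name.
values = [140547476703.0, 113395471484.0, 158360225172.0, 105497674121.0]
t2 = pd.DataFrame({"value": values})

# t2.columns[0] is the column label, so t2[t2.columns[0]] is the Series
# that the plotting calls operate on.
series = t2[t2.columns[0]]
print(series.describe())
```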
Code:
import matplotlib.pyplot as plt

fig = plt.figure()
# plot hist + kde (both on a density scale, which causes the y-axis issue)
t2[t2.columns[0]].plot.kde(color = "maroon", label = "_nolegend_")
t2[t2.columns[0]].plot.hist(density = True, edgecolor = "grey", color = "tomato", title = t2.columns[0])

# plot mean/stdev
m = t2[t2.columns[0]].mean()
stdev = t2[t2.columns[0]].std()
plt.axvline(m, color = "black", ymax = 0.05, label = "mean")
plt.axvline(m-2*stdev, color = "black", ymax = 0.05, linestyle = ":", label = "+/- 2*Stdev")
plt.axvline(m+2*stdev, color = "black", ymax = 0.05, linestyle = ":")
plt.legend()
What it looks like now:
[image: the histogram with the KDE overlaid]

Best answer

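If you want actual counts, you need to scale the KDE up by the bin width multiplied by the number of observations. The trickiest part is getting at the data Pandas uses to draw the KDE. (I have removed the legend-related parts to simplify the problem at hand.)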
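The reason for that factor: in a density histogram each bar has height count / (N * bin width), and the KDE is drawn on that same density scale, so multiplying by N * bin width converts density back to counts. A small numerical check of this relation, using made-up sample data rather than the data from the question:

```python
import numpy as np

# Made-up sample data, used only to illustrate the scaling.
rng = np.random.default_rng(0)
sample = rng.normal(loc=100.0, scale=15.0, size=48)

# Same bins, once as raw counts and once normalized to a density.
counts, edges = np.histogram(sample, bins=10)
density, _ = np.histogram(sample, bins=10, density=True)

bin_width = np.diff(edges)[0]      # bins are equal-width
scale = bin_width * len(sample)    # bin width * number of observations

# Density heights times the scale recover the raw counts.
rescaled = density * scale
print(np.allclose(rescaled, counts))
```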

import matplotlib.pyplot as plt
import numpy as np

# Let pandas compute the KDE on a throwaway plot, then read the curve back
axis = t2[t2.columns[0]].plot.kde(color="maroon", label="_nolegend_")
kde_line = axis.get_lines()[0]
xdata, ydata = kde_line.get_data()
plt.close(axis.figure)  # discard the throwaway figure so it is not shown

# Real figure
fig, ax = plt.subplots(figsize=(7, 5))
# Plot the histogram as raw counts (no density normalization)
counts, bin_edges, _ = ax.hist(t2[t2.columns[0]], edgecolor="grey", color="tomato")

# Bin width * number of observations
scale = np.diff(bin_edges)[0] * len(t2)

# Plot the KDE rescaled from density to counts
ax.plot(xdata, ydata * scale, color="blue")
ax.set_ylabel("N observations")

plt.show()
[image: the count histogram with the rescaled KDE overlaid]
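A possible variation that avoids reading the curve back from a throwaway plot is to compute the KDE directly. The sketch below assumes SciPy is installed and uses `scipy.stats.gaussian_kde` with its default bandwidth. The helper name `hist_with_count_scaled_kde`, the evaluation grid, and the column name `"value"` are illustrative assumptions, so the curve may not match the pandas output exactly:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde


def hist_with_count_scaled_kde(series, bins=10):
    """Plot a count histogram and a KDE rescaled to the same y-axis."""
    values = series.dropna().to_numpy()

    fig, ax = plt.subplots(figsize=(7, 5))
    counts, edges, _ = ax.hist(values, bins=bins, edgecolor="grey", color="tomato")

    # Evaluate the KDE on a grid somewhat wider than the data.
    # The padding and the number of points are assumptions of this sketch.
    pad = 0.5 * (values.max() - values.min())
    grid = np.linspace(values.min() - pad, values.max() + pad, 1000)
    density = gaussian_kde(values)(grid)

    # Bin width * number of observations converts density to counts.
    scale = np.diff(edges)[0] * len(values)
    ax.plot(grid, density * scale, color="maroon")
    ax.set_ylabel("N observations")
    return fig, ax, scale


# Example with a few values from the question; "value" is a placeholder name.
t2 = pd.DataFrame({"value": [140547476703.0, 113395471484.0, 158360225172.0,
                             105497674121.0, 186457736557.0, 153705359063.0,
                             36826568371.0, 200653068740.0]})
fig, ax, scale = hist_with_count_scaled_kde(t2[t2.columns[0]])
```

Calling `plt.show()` afterwards displays the figure. The scaling step is the same as in the answer; only the way the KDE values are obtained differs.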

On the topic of "python - Rendering the y-axis correctly when overlaying a Pandas KDE and a histogram", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/67288172/
