python - How does pandas calculate sem()?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 15:25:02

First, import pandas and create a Series with a perfectly normal distribution:

import pandas as pd

lst = [[5 for x in range(5)], [4 for x in range(4)], [3 for x in range(3)],
       [2 for x in range(2)], [1 for x in range(1)], [2 for x in range(2)],
       [3 for x in range(3)], [4 for x in range(4)], [5 for x in range(5)]]

lst = [item for sublists in lst for item in sublists]

series = pd.Series(lst)

Let's check whether the distribution is normal:

print(round(sum(series - series.mean()) / series.count(), 1) == 0)
# if distribution is normal we'll see True

Now let's print sem() for the population:

print(series.sem(ddof=0))
# 0.21619987017

And now for a sample:

print(series.sem()) # ddof=1
# 0.220026713637

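But I can't understand how pandas calculates the standard error of the mean when it is working with the whole population. Does it use

se_x = sd_x / sqrt(len(x))

or does it draw samples? If it draws samples, how many does it take, and how can I set that number?

How does pandas calculate sem for a sample if count < 30?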

Best answer

pandas generates the sem method dynamically:

    cls.sem = _make_stat_function_ddof(
        cls, 'sem', name, name2, axis_descr,
        "Return unbiased standard error of the mean over requested "
        "axis.\n\nNormalized by N-1 by default. This can be changed "
        "using the ddof argument",
        nanops.nansem)

where nanops.nansem() is:

    @disallow('M8', 'm8')
    def nansem(values, axis=None, skipna=True, ddof=1):
        var = nanvar(values, axis, skipna, ddof=ddof)

        mask = isnull(values)
        if not is_float_dtype(values.dtype):
            values = values.astype('f8')
        count, _ = _get_counts_nanvar(mask, axis, ddof, values.dtype)
        var = nanvar(values, axis, skipna, ddof=ddof)

        return np.sqrt(var) / np.sqrt(count)
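
In other words, no resampling takes place: the result is the square root of the variance (with the requested ddof) divided by the square root of the number of non-null values, so the same formula applies whether count is above or below 30. As a rough check, assuming a reasonably recent pandas and NumPy, the sketch below rebuilds the series from the question and compares Series.sem with that formula. The helper name manual_sem is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Rebuild the series from the question: n copies of each n in 5, 4, ..., 1, ..., 4, 5.
values = [n for n in (5, 4, 3, 2, 1, 2, 3, 4, 5) for _ in range(n)]
series = pd.Series(values)


def manual_sem(s, ddof=1):
    """Hypothetical helper mirroring nansem: sqrt(var) / sqrt(count)."""
    count = s.count()  # number of non-null observations
    return np.sqrt(s.var(ddof=ddof)) / np.sqrt(count)


sem_population = series.sem(ddof=0)  # variance normalized by N
sem_sample = series.sem()            # ddof=1 by default, normalized by N - 1

print(sem_population, manual_sem(series, ddof=0))
print(sem_sample, manual_sem(series, ddof=1))
```

Each printed pair should agree to floating-point precision and match the two values quoted in the question.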

You may also want to check the methods available in the scipy.stats module.
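
For instance, scipy.stats.sem takes a ddof argument as well, so it can be used to cross-check the pandas result. This is a minimal sketch, assuming SciPy is installed alongside pandas; the variable names are only illustrative.

```python
import pandas as pd
from scipy import stats

# Same data as in the question.
values = [n for n in (5, 4, 3, 2, 1, 2, 3, 4, 5) for _ in range(n)]
series = pd.Series(values)

# scipy.stats.sem also defaults to ddof=1.
scipy_sem_sample = stats.sem(series.to_numpy(), ddof=1)
scipy_sem_population = stats.sem(series.to_numpy(), ddof=0)

print(scipy_sem_sample, series.sem())
print(scipy_sem_population, series.sem(ddof=0))
```

One difference to keep in mind: pandas skips NaN values by default (skipna=True), whereas scipy.stats.sem propagates them unless nan_policy is changed.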

Regarding "python - How does pandas calculate sem()?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43233963/
