
r - Correlation in R: "standard deviation is 0" error when using "pairwise.complete.obs"

Reposted. Author: 行者123. Updated: 2023-12-01 14:59:12

I am trying to compute some correlations by group, and have been using this very helpful thread:

spearman correlation by group in R

However, my two variables and the grouping variable contain some NA values, so I get NA as the result for every group.

So I tried this:

j <- lapply(split(HTNPS, HTNPS$callcat), function(HTNPS) {
  cor(HTNPS$NPS_int, HTNPS$holdtime_int,
      use = "pairwise.complete.obs", method = "spearman")
})

However, although I now get more plausible numbers, I receive this warning: In cor(HTNPS$NPS_int, HTNPS$holdtime_int, use = "pairwise.complete.obs", : the standard deviation is zero

As requested, here is the output of dput(head(HTNPS, 40)) for the relevant columns:

> dput(head(HTNPS[,20:24], 40))
structure(list(holdtime_int = structure(c(6, 11, 7, 7, 5, 7,
6, 5, 3, 6, 3, 5, 6, 105, 7, 6, 353, 5, 6, 9, 6, 6, 12, 5, 5,
5, 249, 5, 7, 11, 5, 7, 5, 290, 6, 6, 6, 6, 5, 6), .Dim = c(40L,
1L)), NPS_int = structure(c(1, NA, NA, 3, NA, 1, 1, 2, NA, NA,
NA, NA, 3, 2, 1, NA, 2, 4, 1, 2, NA, 3, 1, 1, 1, 1, 1, 1, 1,
2, 1, 3, 1, 1, 1, 2, 4, 2, 1, 1), .Dim = c(40L, 1L)), HTnot0 = structure(c(6,
11, 7, 7, 5, 7, 6, 5, 3, 6, 3, 5, 6, 105, 7, 6, 353, 5, 6, 9,
6, 6, 12, 5, 5, 5, 249, 5, 7, 11, 5, 7, 5, 290, 6, 6, 6, 6, 5,
6), .Dim = c(40L, 1L)), callcat = structure(c(NA, NA, "CARD",
"CARD", "GENERAL", "LOAN", "CHANGE DETAILS", "GENERAL", "LOAN",
"CHANGE DETAILS", "LOAN", "CARD", "FUNDS TRANSFER", "FEE", "BALANCE",
NA, "CARD", NA, NA, "STATEMENT", "CARD", "CARD", "GENERAL", "CARD",
"CARD", "TERM DEPOSIT", "CARD", "GENERAL", "CARD", "CARD", "GENERAL",
NA, NA, NA, NA, "CARD", "CARD", "FUNDS TRANSFER", "GENERAL",
"MyBusinessOverride"), .Dim = c(40L, 1L), .Dimnames = list(NULL,
"callcat")), HTcat = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 1L, 1L, 12L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 9L, 1L, 1L, 1L, 1L, 1L, 1L, 10L, 1L, 1L,
1L, 1L, 1L, 1L), .Dim = c(40L, 1L), .Dimnames = list(NULL, "HTcat"))), .Names = c("holdtime_int",
"NPS_int", "HTnot0", "callcat", "HTcat"), row.names = c(NA, 40L
), class = "data.frame")

Best answer

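If you split the data this way, many of your samples contain only a single observation (after the NAs are removed). Obviously, there is no correlation to compute there.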
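To see how many usable observations each group really has, one can count the complete pairs per group before correlating. This is a minimal sketch, assuming a small made-up data frame `toy` that reuses the column names from the question; its values are illustrative and are not the asker's full data.

```r
# Hypothetical toy data: column names follow the question, values are illustrative.
toy <- data.frame(
  callcat      = c("CARD", "CARD", "CARD", "FUNDS TRANSFER", "FUNDS TRANSFER",
                   "FEE", "LOAN", "LOAN"),
  holdtime_int = c(7, 353, 5, 6, 6, 105, 7, 3),
  NPS_int      = c(3, 2, NA, 3, 2, 2, 1, NA)
)

# TRUE for rows where both variables are observed.
ok <- complete.cases(toy[c("holdtime_int", "NPS_int")])

# Number of complete (holdtime_int, NPS_int) pairs per call category.
n_complete <- tapply(ok, toy$callcat, sum)
print(n_complete)
```

Groups with fewer than two complete pairs cannot produce a correlation at all, whichever `use` option is chosen.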

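The warning appears when one of the two variables takes only a single distinct value within a group. In your example this happens, for instance, in the data frame for callcat == "FUNDS TRANSFER": holdtime_int has only one value there (6), so its standard deviation is 0 (hence the warning) and the resulting correlation is NA.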
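The FUNDS TRANSFER case can be reproduced in isolation. The sketch below uses the two FUNDS TRANSFER rows visible in the question's dput output; the variable names `holdtime`, `nps`, `warn_msg` and `res` are made up for the illustration. It captures the warning rather than printing it, since the exact wording depends on the R locale.

```r
# Values from the two FUNDS TRANSFER rows in the question's dput output.
holdtime <- c(6, 6)   # constant within the group
nps      <- c(3, 2)

# The standard deviation of a constant vector is 0.
print(sd(holdtime))

# Run cor() while recording (and muffling) any warning it raises.
warn_msg <- NULL
res <- withCallingHandlers(
  cor(nps, holdtime, use = "pairwise.complete.obs", method = "spearman"),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(res)       # the correlation is undefined, so NA
print(warn_msg)  # the captured warning text
```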

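I don't know why you want to look at these correlations, but given the data you provided they make little sense to me. If you want to get rid of the warning, you can build in a check like this: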

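lapply(split(HTNPS, HTNPS$callcat), function(x) {
  # Keep only the two variables and drop rows with an NA in either one
  x <- na.exclude(x[c("holdtime_int", "NPS_int")])
  # A correlation needs at least two distinct values in each variable
  if (any(sapply(x, function(i) length(unique(i))) < 2)) {
    NA
  } else {
    cor(x[, 1], x[, 2], method = "spearman")
  }
})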

This should give you the same results, but without the warning. Note the use of na.exclude to remove the NAs.
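The "same results" claim can be checked on a small example. The sketch below is an assumption-laden variant, not the answer's exact code: `safe_spearman` is a hypothetical helper that applies the same guard using `complete.cases`, and `toy` is made-up data with the question's column names. It compares the guarded result with the unguarded pairwise call, whose warnings are suppressed for the comparison.

```r
# Hypothetical toy data: column names follow the question, values are illustrative.
toy <- data.frame(
  callcat      = c("CARD", "CARD", "CARD", "FUNDS TRANSFER", "FUNDS TRANSFER",
                   "FEE", "LOAN", "LOAN"),
  holdtime_int = c(7, 353, 5, 6, 6, 105, 7, 3),
  NPS_int      = c(3, 2, NA, 3, 2, 2, 1, NA)
)

# Hypothetical helper: return NA when either variable has fewer than
# two distinct values among the complete pairs, otherwise Spearman's rho.
safe_spearman <- function(x, y) {
  keep <- complete.cases(x, y)
  x <- x[keep]
  y <- y[keep]
  if (length(unique(x)) < 2 || length(unique(y)) < 2) {
    return(NA_real_)
  }
  cor(x, y, method = "spearman")
}

groups <- split(toy, toy$callcat)

# Guarded version: no warnings are raised.
guarded <- sapply(groups, function(g) safe_spearman(g$holdtime_int, g$NPS_int))

# Unguarded version from the question, with warnings suppressed.
unguarded <- suppressWarnings(sapply(groups, function(g) {
  cor(g$holdtime_int, g$NPS_int,
      use = "pairwise.complete.obs", method = "spearman")
}))

print(guarded)
print(all.equal(guarded, unguarded))
```

Returning `NA_real_` rather than a bare `NA` keeps the result numeric, so `sapply` yields a plain named numeric vector.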

Regarding "r - Correlation in R: "standard deviation is 0" error when using "pairwise.complete.obs"", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/25498925/
