
r - strata() from "sampling" returns error: arguments imply differing number of rows

Reposted. Author: 行者123. Updated: 2023-12-02 22:14:17

I have a data frame that looks like this:

'data.frame':   1090 obs. of  8 variables:
$ id : chr "INC000000209241" "INC000000218488" "INC000000218982" "INC000000225646" ...
$ service.type : chr "Incident" "Incident" "Incident" "Incident" ...
$ priority : chr "Critical" "Critical" "Critical" "Critical" ...

I sorted the data as follows:

data <- data[order(data$priority),]

I have tried converting priority to a factor and similar changes, but whatever I try, when I run the following command:

s = strata(data,c("priority"),size=c(0,0,1,5))

I always get the following error:

Error in data.frame(..., check.names = FALSE) : 
arguments imply differing number of rows: 0, 1

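I tried debugging the function to see whether I could work out why this error occurs, but I could not follow the code. The error is raised at this step of strata():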

debug: r = cbind(r, i)

Any help is much appreciated!

Best answer

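The problem is that you are trying to set the sample size for some groups to zero. Instead, subset the original data before sampling so that only the strata you want to sample from remain.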
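Applied to a data frame shaped like the one in the question, the suggestion might look like the sketch below. The data here is a made-up stand-in: the `id` values and every priority level other than "Critical" are assumptions, since the question does not list them. Note that `strata()` expects `size` in the order in which the strata appear in the data, which is why the sort matters.

```r
library(sampling)

set.seed(1)

# Hypothetical stand-in for the asker's data; level names other than
# "Critical" are assumed for illustration.
data <- data.frame(
  id = sprintf("INC%04d", 1:40),
  priority = rep(c("Critical", "High", "Low", "Medium"), each = 10),
  stringsAsFactors = FALSE
)
data <- data[order(data$priority), ]

# Drop the strata that should contribute no rows, instead of passing size 0.
sub <- data[data$priority %in% c("Low", "Medium"), ]

# After sorting, "Low" appears before "Medium", so size = c(1, 5)
# means 1 row from "Low" and 5 rows from "Medium".
s <- strata(sub, stratanames = "priority", size = c(1, 5), method = "srswor")

# ID_unit holds row positions within the data frame passed to strata().
sampled <- sub[s$ID_unit, ]
table(sampled$priority)
```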

Here, we reproduce your problem.

library(sampling)
data(swissmunicipalities)
length(table(swissmunicipalities$REG)) # We have seven strata
# [1] 7

# Let's take two from each group
strata(swissmunicipalities,
       stratanames = c("REG"),
       size = rep(2, 7),
       method = "srswor")
# REG ID_unit Prob Stratum
# 93 4 93 0.011695906 1
# 145 4 145 0.011695906 1
# 2574 1 2574 0.003395586 2
# 2631 1 2631 0.003395586 2
# 826 3 826 0.006230530 3
# 1614 3 1614 0.006230530 3
# 583 2 583 0.002190581 4
# 1017 2 1017 0.002190581 4
# 1297 5 1297 0.004246285 5
# 2535 5 2535 0.004246285 5
# 342 6 342 0.010752688 6
# 347 6 347 0.010752688 6
# 651 7 651 0.008163265 7
# 2471 7 2471 0.008163265 7

# Let's try to drop the first two groups. Oops...
strata(swissmunicipalities,
       stratanames = c("REG"),
       size = c(0, 0, 2, 2, 2, 2, 2),
       method = "srswor")
# Error in data.frame(..., check.names = FALSE) :
# arguments imply differing number of rows: 0, 1

Let's subset the data and try again.

swiss2 <- swissmunicipalities[!swissmunicipalities$REG %in% c(1, 2), ]
table(swiss2$REG)
strata(swiss2,
       stratanames = c("REG"),
       size = c(2, 2, 2, 2, 2),
       method = "srswor")
# REG ID_unit Prob Stratum
# 58 4 58 0.011695906 1
# 115 4 115 0.011695906 1
# 432 3 432 0.006230530 2
# 986 3 986 0.006230530 2
# 1007 5 1007 0.004246285 3
# 1150 5 1150 0.004246285 3
# 190 6 190 0.010752688 4
# 497 6 497 0.010752688 4
# 1049 7 1049 0.008163265 5
# 1327 7 1327 0.008163265 5
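If subsetting by hand is inconvenient, one alternative is a small base-R helper that tolerates zero sizes. This is not part of the sampling package and does not reproduce the `Prob` or `Stratum` columns of `strata()`; `sample_by_stratum` and its arguments are hypothetical names used only for this sketch, and the toy data is invented.

```r
# Draw a simple random sample without replacement within each stratum.
# `sizes` is a named vector: names are stratum values, values are sample
# sizes. A size of 0 simply contributes no rows.
sample_by_stratum <- function(df, stratum_col, sizes) {
  idx <- unlist(lapply(names(sizes), function(level) {
    rows <- which(df[[stratum_col]] == level)
    n <- sizes[[level]]
    stopifnot(n <= length(rows))
    rows[sample.int(length(rows), n)]
  }))
  df[idx, , drop = FALSE]
}

set.seed(42)
toy <- data.frame(
  id = 1:40,
  priority = rep(c("Critical", "High", "Low", "Medium"), each = 10),
  stringsAsFactors = FALSE
)

picked <- sample_by_stratum(
  toy, "priority",
  sizes = c(Critical = 0, High = 0, Low = 1, Medium = 5)
)
table(picked$priority)
```

Naming the sizes by stratum value avoids having to match the order in which strata appear in the data.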

Regarding "r - strata() from "sampling" returns error: arguments imply differing number of rows", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14735411/
