
r - Does ifelse really calculate both of its vectors every time? Is it slow?

Reposted. Author: 行者123. Updated: 2023-12-03 05:37:34

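Does ifelse really calculate both the yes and no vectors, as in, the entirety of each vector? Or does it just calculate some values from each vector?

Also, is ifelse really that slow?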

Best answer

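Yes. (With one exception.)

ifelse calculates both its yes value and its no value, except when the test condition is either all TRUE or all FALSE.

We can see this by generating random numbers and observing how many numbers are actually generated (by resetting the seed and searching for the next number drawn).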

# TEST CONDITION, ALL TRUE
set.seed(1)
dump <- ifelse(rep(TRUE, 200), rnorm(200), rnorm(200))
next.random.number.after.all.true <- rnorm(1)

# TEST CONDITION, ALL FALSE
set.seed(1)
dump <- ifelse(rep(FALSE, 200), rnorm(200), rnorm(200))
next.random.number.after.all.false <- rnorm(1)

# TEST CONDITION, MIXED
set.seed(1)
dump <- ifelse(c(FALSE, rep(TRUE, 199)), rnorm(200), rnorm(200))
next.random.number.after.some.TRUE.some.FALSE <- rnorm(1)

# RESET THE SEED, GENERATE SEVERAL RANDOM NUMBERS TO SEARCH FOR A MATCH
set.seed(1)
r.1000 <- rnorm(1000)


cat("Quantity of random numbers generated during the `ifelse` statement when:",
"\n\tAll True ", which(r.1000 == next.random.number.after.all.true) - 1,
"\n\tAll False ", which(r.1000 == next.random.number.after.all.false) - 1,
"\n\tMixed T/F ", which(r.1000 == next.random.number.after.some.TRUE.some.FALSE) - 1
)

This gives the following output:

Quantity of random numbers generated during the `ifelse` statement when: 
	All True   200
	All False  200
	Mixed T/F  400   <~~ Notice TWICE AS MANY numbers were
	                     generated when `test` had both
	                     T & F values present

We can also see it in the source code itself:

.
.
    if (any(test[!nas]))
        ans[test & !nas] <- rep(yes, length.out = length(ans))[test &    # <~~~~ This line and the one below
            !nas]
    if (any(!test[!nas]))
        ans[!test & !nas] <- rep(no, length.out = length(ans))[!test &   # <~~~~ ... are the culprits
            !nas]
.
.

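Notice that yes and no are computed only if there is some non-NA value of test that is TRUE or FALSE, respectively.
At that point (and this is the important part when it comes to efficiency) the entirety of each vector is computed.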
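The same all-or-nothing behaviour can be observed without random numbers by counting how many elements each argument is asked to produce. This is only a small sketch: `evaluated`, `make_vec`, and `count_evals` are hypothetical names introduced here for illustration and are not part of base R.

```r
# Hypothetical counter: how many elements were built for each argument.
evaluated <- c(yes = 0, no = 0)

# Hypothetical helper: records the requested length, then returns a vector.
make_vec <- function(which_arg, n) {
  evaluated[which_arg] <<- evaluated[which_arg] + n
  seq_len(n)
}

# Reset the counter, run ifelse on `test`, and report what was built.
count_evals <- function(test) {
  evaluated[] <<- 0
  ifelse(test,
         make_vec("yes", length(test)),
         make_vec("no",  length(test)))
  evaluated
}

all_true  <- count_evals(rep(TRUE, 5))           # only `yes` is built
all_false <- count_evals(rep(FALSE, 5))          # only `no` is built
mixed     <- count_evals(c(TRUE, rep(FALSE, 4))) # both are built in full
print(rbind(all_true, all_false, mixed))
```

A single TRUE among the FALSE values is enough to make `yes` be built at full length, even though only one of its elements ends up in the result.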


OK, but is it slower?

Let's see if we can test it:

library(microbenchmark)

# Create some sample data
N <- 1e4
set.seed(1)
X <- sample(c(seq(100), rep(NA, 100)), N, TRUE)
Y <- ifelse(is.na(X), rnorm(X), NA) # Y has reverse NA/not-NA setup than X

These two statements generate the same results

yesifelse <- quote(sort(ifelse(is.na(X), Y+17, X-17 ) ))
noiflese <- quote(sort(c(Y[is.na(X)]+17, X[is.na(Y)]-17)))

identical(eval(yesifelse), eval(noiflese))
# [1] TRUE

But one is more than twice as fast as the other

microbenchmark(eval(yesifelse), eval(noiflese), times=50L)

N = 1,000
Unit: milliseconds
            expr      min       lq   median       uq      max neval
 eval(yesifelse) 2.286621 2.348590 2.411776 2.537604 10.05973    50
  eval(noiflese) 1.088669 1.093864 1.122075 1.149558 61.23110    50

N = 10,000
Unit: milliseconds
            expr      min       lq   median       uq      max neval
 eval(yesifelse) 30.32039 36.19569 38.50461 40.84996 98.77294    50
  eval(noiflese) 12.70274 13.58295 14.38579 20.03587 21.68665    50
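When the element order has to be preserved (the comparison above sorts both results), one common alternative to ifelse is to compute one branch and then overwrite the other positions by logical subsetting. The following is a minimal sketch that rebuilds the same X and Y sample data; the names `na_x` and `out` are assumptions introduced for illustration, and no timing is claimed for it here.

```r
# Recreate the sample data used in the benchmark.
N <- 1e4
set.seed(1)
X <- sample(c(seq(100), rep(NA, 100)), N, TRUE)
Y <- ifelse(is.na(X), rnorm(X), NA)

# Compute the test once and reuse it.
na_x <- is.na(X)

# Start from the `no` branch, then overwrite the positions where test is TRUE.
out <- X - 17
out[na_x] <- Y[na_x] + 17

# Same values, in the same order, as the ifelse version.
identical(out, ifelse(na_x, Y + 17, X - 17))
```

This still evaluates `X - 17` for every element, so it is not fully lazy; it only computes `Y + 17` for the positions that need it.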

Regarding "r - Does ifelse really calculate both of its vectors every time? Is it slow?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/16275149/
