r - Mean by group in data.table gives different results with mean() and sum()/.N

Reposted. Author: 行者123. Updated: 2023-12-05 00:42:56

I get different results when computing a mean by group in data.table:

library(data.table)

qty <- 1:6
name <- c("a", "b", "a", "a", "c", "b")
type <- c("i", "i", "i", "f", "f", "f")

DT <- data.table(qty,name,type)

DT[, avg_mean := mean(qty) , by = .(name, type)]
DT[, avg_sum_N := sum(qty)/.N , by = .(name, type)]

> DT
     qty   name   type avg_mean avg_sum_N
   <int> <char> <char>    <num>     <num>
1:     1      a      i        2         2
2:     2      b      i        4         2
3:     3      a      i        2         2
4:     4      a      f        2         4
5:     5      c      f        6         5
6:     6      b      f        5         6

I expected avg_mean and avg_sum_N to be identical, with the values shown in avg_sum_N. Why do they differ? Thanks.

Session info is below.

> packageVersion('data.table')
[1] ‘1.14.3’
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=Portuguese_Brazil.1252 LC_CTYPE=Portuguese_Brazil.1252 LC_MONETARY=Portuguese_Brazil.1252
[4] LC_NUMERIC=C LC_TIME=Portuguese_Brazil.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] zoo_1.8-10 lubridate_1.8.0 RPostgres_1.4.3 DBI_1.1.2 stringi_1.7.6 readxl_1.4.0
[7] gsubfn_0.7 proto_1.0.0 stringr_1.4.0 magrittr_2.0.3 stringdist_0.9.8 fuzzyjoin_0.1.6
[13] data.table_1.14.3

loaded via a namespace (and not attached):
[1] Rcpp_1.0.8.3 pillar_1.7.0 compiler_4.1.0 cellranger_1.1.0 tools_4.1.0 bit_4.0.4
[7] lattice_0.20-44 lifecycle_1.0.1 tibble_3.1.6 pkgconfig_2.0.3 rlang_1.0.2 cli_3.2.0
[13] rstudioapi_0.13 writexl_1.4.0 parallel_4.1.0 dplyr_1.0.8 hms_1.1.1 generics_0.1.2
[19] vctrs_0.4.1 grid_4.1.0 bit64_4.0.5 tidyselect_1.1.2 glue_1.6.2 R6_2.5.1
[25] fansi_1.0.3 tcltk_4.1.0 blob_1.2.3 purrr_0.3.4 ellipsis_0.3.2 assertthat_0.2.1
[31] utf8_1.2.2 crayon_1.5.1

Best answer

The problem was a bug in the development version of data.table. Running data.table::update.dev.pkg() fixed it.
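To confirm that an installed build is not affected, one option is to rerun the question's example and compare both columns against group means worked out by hand. The sketch below assumes the six rows from the question. The `expected` vector is not from the original post: it is computed by hand from those rows (group a/i holds 1 and 3, so its mean is 2, and every other group has a single row).

```r
library(data.table)

# Same data as in the question
DT <- data.table(
  qty  = 1:6,
  name = c("a", "b", "a", "a", "c", "b"),
  type = c("i", "i", "i", "f", "f", "f")
)

DT[, avg_mean  := mean(qty),     by = .(name, type)]
DT[, avg_sum_N := sum(qty) / .N, by = .(name, type)]

# Hand-computed group means in row order (an assumption derived from the
# six rows above, not taken from the original post)
expected <- c(2, 2, 2, 4, 5, 6)

# Both checks should pass on a build without the bug;
# stopifnot() raises an error if either column disagrees
stopifnot(isTRUE(all.equal(DT$avg_sum_N, expected)))
stopifnot(isTRUE(all.equal(DT$avg_mean, DT$avg_sum_N)))
```

In the output posted in the question, avg_sum_N matches these hand-computed values and avg_mean does not, so the second check is the one that would be expected to fail on the affected build.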

Regarding "r - Mean by group in data.table gives different results with mean() and sum()/.N", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71983259/