
r - How to sort data within subgroups in data.table R

Reposted. Author: 行者123. Updated: 2023-12-04 13:26:54

Consider the following:
DT = data.table(a=sample(1:2), b=sample(1:1000,20))
How can I display, say, the 5 highest values of b for each a?

I'm stuck at DT[,b,by=a][order(a,-b)].

Thanks!

Best answer

The most elegant approach is probably:

DT[order(-b),head(b,5),by=a]
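
To make the behavior of this one-liner concrete, here is a minimal sketch on a small hand-written table. The data values and the names top5 and top5_named are made up for illustration and are not part of the original question.

```r
library(data.table)

# Small deterministic table: two groups, seven values each (illustrative data).
DT <- data.table(a = rep(1:2, each = 7),
                 b = c(10, 70, 30, 50, 20, 60, 40,
                       15, 75, 35, 55, 25, 65, 45))

# Sort all rows by b descending, then keep the first 5 values of b per group.
top5 <- DT[order(-b), head(b, 5), by = a]

# The unnamed j result is called V1; wrapping it in .() lets you pick a name.
top5_named <- DT[order(-b), .(b = head(b, 5)), by = a]
print(top5_named)
```

Groups appear in the result in the order they are first encountered after sorting, so here group 2 (which holds the overall maximum) comes before group 1.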

In terms of pure performance:
DT[order(-b), indx := seq_len(.N), "a"][indx <= 5][,indx:=NULL][]

Or the one suggested by @Frank:
DT[DT[order(-b),.I[1:.N<=5],"a"]$V1]
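
As a rough check that the three variants select the same rows, the sketch below runs them side by side on a small random table. The seed, the table size, and the names r1, r2, r3, tmp and idx are assumptions chosen for illustration; the first variant uses head(.SD, 5) so that all three return whole rows.

```r
library(data.table)

set.seed(42)  # fixed seed so the sketch is reproducible
DT <- data.table(a = rep(1:2, times = 10), b = sample(1:1000, 20))

# 1) head() per group, keeping whole rows via .SD
r1 <- DT[order(-b), head(.SD, 5), by = a]

# 2) Number the rows within each group by reference, then filter.
#    := modifies its target in place, so work on a copy to leave DT untouched.
tmp <- copy(DT)
r2 <- tmp[order(-b), indx := seq_len(.N), by = a][indx <= 5][, indx := NULL][]

# 3) Collect the row numbers (.I) of the top 5 per group, then subset DT.
idx <- DT[order(-b), .I[1:.N <= 5], by = a]$V1
r3 <- DT[idx]

# The variants may return rows in different orders, so sort before comparing.
setorder(r1, a, -b)
setorder(r2, a, -b)
setorder(r3, a, -b)
print(identical(r1$b, r2$b) && identical(r1$b, r3$b))
```

Note that when the second variant is run directly on DT, as in the answer, the helper column indx stays in DT afterwards: the final indx := NULL only removes it from the filtered result.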

Below is a benchmark of all three of the above:
# devtools::install_github("jangorecki/dwtools")
library(dwtools) # to populate complex dataset
N <- 5e6
DT <- dw.populate(N, scenario="fact")
str(DT)
#Classes ‘data.table’ and 'data.frame': 5000000 obs. of 8 variables:
# $ cust_code: chr "id010" "id076" "id024" "id081" ...
# $ prod_code: int 8234 5689 31198 35479 39140 37589 8184 39489 35266 3596 ...
# $ geog_code: chr "OH" "NH" "TN" "MI" ...
# $ time_code: Date, format: "2012-03-11" "2014-02-10" "2012-11-05" "2013-01-30" ...
# $ curr_code: chr "XRP" "HRK" "CAD" "BRL" ...
# $ amount : num 486 382 695 470 749 ...
# $ value : num 193454 33694 351418 84888 20673 ...

Grouping by the cust_code column, where uniqueN is 100:
system.time(DT[order(-time_code),head(.SD,5),"cust_code"])
# user system elapsed
# 1.804 0.084 1.890
system.time(DT[order(-time_code), indx := seq_len(.N),"cust_code"][indx <= 5][,indx:=NULL][])
# user system elapsed
# 1.414 0.092 1.508
system.time(DT[DT[order(-time_code),.I[1:.N<=5],"cust_code"]$V1])
# user system elapsed
# 1.405 0.096 1.502

With many more groups (the prod_code column, where uniqueN is 50000), the impact on performance becomes visible:
system.time(DT[order(time_code),head(.SD,5),"prod_code"])
# user system elapsed
# 10.177 0.109 10.322
system.time(DT[order(time_code), indx := seq_len(.N),"prod_code"][indx <= 5][,indx:=NULL][])
# user system elapsed
# 1.555 0.099 1.665
system.time(DT[DT[order(time_code),.I[1:.N<=5],"prod_code"]$V1])
# user system elapsed
# 1.697 0.064 1.764
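
For readers who do not have dwtools installed, the comparison can be approximated with a synthetic table built from base R. This is only a sketch: the column contents, the reduced row count and group counts, and the helper bench() are assumptions made for illustration and do not reproduce the dw.populate() dataset, so absolute timings will differ from those above.

```r
library(data.table)

set.seed(1)
N <- 1e5  # far fewer rows than the 5e6 used above, so this runs quickly

# Hypothetical stand-in for dw.populate(): only the columns used in the timings.
DT <- data.table(
  cust_code = sprintf("id%03d", sample(100L, N, replace = TRUE)),
  prod_code = sample(5000L, N, replace = TRUE),
  time_code = as.Date("2012-01-01") + sample(0:1000, N, replace = TRUE),
  amount    = round(runif(N, 0, 1000))
)

# Time the three variants for one grouping column; returns elapsed seconds.
bench <- function(by_col) {
  d <- copy(DT)  # the := variant adds a column by reference, so use a copy
  c(
    head_SD = system.time(
      d[order(-time_code), head(.SD, 5), by = by_col]
    )[["elapsed"]],
    indx = system.time(
      d[order(-time_code), indx := seq_len(.N), by = by_col][indx <= 5][, indx := NULL][]
    )[["elapsed"]],
    dot_I = system.time(
      d[d[order(-time_code), .I[1:.N <= 5], by = by_col]$V1]
    )[["elapsed"]]
  )
}

print(bench("cust_code"))  # few groups
print(bench("prod_code"))  # many groups
```

Increasing N and the number of distinct prod_code values toward the sizes used above should bring the comparison closer to the original benchmark.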

Update 2015-11-09:

With today's commit e615532 by Arun, head and tail should be optimized under the hood.
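
One way to see how an installed data.table version evaluates such a query is the verbose argument of `[`. The sketch below only shows how to switch it on; the toy table is made up for illustration, and the diagnostic messages are not reproduced here because they depend on the version.

```r
library(data.table)

DT <- data.table(a = rep(1:3, each = 4), b = 12:1)

# verbose = TRUE makes data.table report how it evaluated the query,
# including any internal optimization applied to j.
res <- DT[order(-b), head(.SD, 2), by = a, verbose = TRUE]
print(res)
```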

Regarding r - How to sort data within subgroups in data.table R, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/28683712/
